Execution Tracers

S2E provides plugins to record execution traces. The main plugin is ExecutionTracer. All other tracing plugins, such as MemoryTracer or TranslationBlockTracer, use ExecutionTracer in order to append various trace entries to the trace file (e.g., a memory access or a translation block execution). ExecutionTracer gathers trace data from the various plugins and outputs them to an ExecutionTracer.dat file in protobuf format.

Note

This document is a reference. For a more hands-on explanation of how to generate traces, please read this tutorial first.

ExecutionTracer

An execution trace is a sequence of (magic, header_size, header, item_size, item) entries. The magic number identifies the start of a trace entry. It is followed by the size of the header, the header, the item size, then finally the actual item.

The header is defined by the PbTraceItemHeader protobuf type. For more details, see the TraceEntries.proto file.

message PbTraceItemHeader {
    required uint32 state_id = 1;
    required uint64 timestamp = 2;
    required uint64 address_space = 3;
    required uint64 pid = 4;
    required uint64 pc = 5;
    required PbTraceItemHeaderType type = 6;
}

The header entry contains data that is common to all trace items. This includes the program id, the program counter, and the address space (i.e., cr3 register on x86) at the time a trace entry is created. The type field specifies the type of the item that follows the header. The header also stores the state identifier to help reconstruct the execution tree offline.

The execution tracer ensures that it is possible to reconstruct the execution tree and extract individual paths when S2E terminates. The tree can be decoded even when the trace recording is partial (e.g., in case S2E is forcefully terminated). In order to do this, the execution tracer plugin monitors forks and records the id of the forked states in the trace (see PbTraceItemFork). Any subsequent trace item has the corresponding state id in its header. If two items in the trace have the same state identifier, the item that happened earlier in the path is the one that comes first in the trace.

ModuleTracer

This plugin traces process and module events. The following snippet is an example of a module load event. This indicates that the OS loaded the shared library /lib64/ld-linux-x86-64.so.2 into a process whose program identifier is 1245. The event also records the location of each loaded section of the binary. The runtime_load_base of a section is its start address in virtual memory as decided by the OS. The native_load_base is the address of the section as specified by the linker. The two are often different because of ASLR or relocations.

{
    "address_space": 255299584,
    "name": "ld-linux-x86-64.so.2",
    "path": "/lib64/ld-linux-x86-64.so.2",
    "pc": 18446744071580784396,
    "pid": 1245,
    "sections": [
        {
            "executable": true,
            "name": "",
            "native_load_base": 0,
            "readable": true,
            "runtime_load_base": 140472821178368,
            "size": 142304,
            "writable": false
        },
        {
            "executable": false,
            "name": "",
            "native_load_base": 2243520,
            "readable": true,
            "runtime_load_base": 140472823421888,
            "size": 5120,
            "writable": true
        }
    ],
    "state_id": 0,
    "timestamp": 630430186029900,
    "type": "TRACE_MOD_LOAD"
}

ModuleTracer is a fundamental plugin, as it helps resolve the raw program counters in the trace to actual binary names. s2e-env uses this extensively to get file and line numbers when binaries have debug information.

TestCaseGenerator

The TestCaseGenerator has several functions:

Write test cases to debug.txt
Store test case files in s2e-last
Write test cases to ExecutionTracer.dat.

In this document, we will examine the latter. A test case entry has the following format:

{
    "address_space": 225800192,
    "items": [
        {
            "key": "v0___symfile____tmp_input___0_1_symfile___0",
            "value": "AQEBAQEBAQEBAQEBAQE...."
        }
    ],
    "pc": 134518939,
    "pid": 1295,
    "state_id": 1,
    "timestamp": 630185690261719,
    "type": "TRACE_TESTCASE"
}

The most important field is items: it records a list of (key, value) pairs. The key is the name of the symbolic variable and the value is the concrete assignment to that variable. The variable name has the following format:

vXXX_variablename_YYY

XXX is the relative order of the variable within the execution path. When a new variable is created, this number is incremented by one. So the 10th variable of a state will have an id of 9.

variablename is the string passed to s2e_make_symbolic.

YYY is the absolute order of the variable within a given S2E run. This number can be non-deterministic and influenced by the sequence of state switches.

These two identifiers ensure that each variable is globally unique and can be easily ordered when generating a test case.

MemoryTracer

This plugin traces memory accesses, page faults, and TLB misses in the specified processes. It requires the following configuration in s2e-config.lua:

add_plugin("MemoryTracer")

pluginsConfig.MemoryTracer = {
    -- You can selectively enable/disable tracing various events
    traceMemory = true,
    tracePageFaults = true,
    traceTlbMisses = true,

    -- The filterPlugin parameter specifies which plugin to use to restrict memory tracing.
    -- If the parameter is missing, the memory tracer will trace all accesses in the system.
    filterPlugin = "ThreadExecutionDetector"
}

Note

MemoryTracer may produce large amounts of data (on the order of gigabytes), so make sure to restrict tracing to the modules of interest.

Here is an example of a memory access:

{
    "address": 140720832808700,
    "address_space": 255279104,
    "concrete_buffer": 0,
    "flags": 1,
    "host_address": 0,
    "pc": 94739530127322,
    "pid": 1251,
    "size": 4,
    "state_id": 0,
    "timestamp": 630430187009925,
    "type": "TRACE_MEMORY",
    "value": 3735928559
}

address: the virtual address of the memory access
value: the concrete value written/read by the memory access
size: the size in bytes of the memory access
flags: this is a bitmask that indicates the type of the access
- bit 0: set to 1 if the access is a write, read otherwise
- bit 1: set to 1 if the access is memory-mapped I/O. Note that due to how CPU emulation works, normal RAM accesses may sometimes appear as MMIO.
- bit 2: indicates that the value is symbolic. In this case , the trace entry stores the concrete version of the data.
- bit 3: the address is symbolic. In this case, the trace entry stores the concrete version of the symbolic address.
Several other bits are available, please check TraceEntries.proto for more details.

Several other fields are stored when traceHostAddresses is set in the configuration. These are useful under rare circumstances when debugging the execution engine, e.g., to make sure that memory accesses get translated to the right memory location on the host machine.

host_address: this is the address of the access as mapped by QEMU when initializing KVM memory regions.
concrete_buffer: this is the final address of the access, after S2E translated the host address to the actual per-state location.

A TLB miss looks as follows. It only stores the address and whether the access was a read or a write.

{
    "address": 140720832808704,
    "address_space": 255279104,
    "is_write": false,
    "pc": 94739530127511,
    "pid": 1251,
    "state_id": 1,
    "timestamp": 630430187323794,
    "type": "TRACE_TLBMISS"
}

TranslationBlockTracer

This plugin records the state of the CPU registers before and/or after a translation block is executed. A translation block is a sequence of guest instructions that ends in a control flow change.

add_plugin("TranslationBlockTracer")

pluginsConfig.TranslationBlockTracer = {
    -- In general, the CPU state at the beginning of a translation block is equal
    -- to the state at the end of the previous translation block.
    -- In cases where a translation block is interrupted because of an exception, the end state
    -- may not be recorded.
    traceTbStart = true,
    traceTbEnd = false,

    -- The filterPlugin parameter specifies which plugin to use to restrict translation block tracing.
    -- If the parameter is missing, the translation block tracer will trace all translation
    -- blocks in the system.
    filterPlugin = "ThreadExecutionDetector"
}

Here is an example of a trace entry for translation blocks for 64-bit x86 code:

{
    "address_space": 255324160,
    "data": {
        "first_pc": 93907771242192,
        "last_pc": 93907771242228,
        "size": 42,
        "tb_type": "TB_CALL_IND"
    },
    "pc": 93907771242192,
    "pid": 1253,
    "regs": {
        "values": [
            28,
            140723069759552,
            139806606502816,
            0,
            140723069759520,
            0,
            2,
            139806608687472,
            139806608688896,
            0,
            8,
            140723070026140,
            93907771242192,
            140723069759520,
            0,
            0
        ],
        "symb_mask": 0
    },
    "state_id": 0,
    "timestamp": 631227908208916,
    "type": "TRACE_TB_START"
}

There are several fields specific to this type of trace entry:

tb_type: the type of the translation block. It is defined by the last instruction of the block (e.g., direct call/jump, indirect call/jump, system call, etc.).
regs: the register data:
- values: concrete content of the CPU registers.
- symb_mask: this is a bitmask that indicates which register contains symbolic data. In case a register is symbolic, the values field contains a concrete value that satisfies path constraints at the time of the recording. Note that symbolic data is not recorded by the plugin.
size: the size in bytes of the guest instructions contained in this translation block.
first_pc: the program counter of the first instruction in the translation block.
last_pc: the program counter of the last instruction in the translation block.

InstructionCounter

This plugin counts how many instructions have been executed in the configured processes or modules and writes the count to the execution trace. The plugin keeps a per-path count and writes it when the path terminates.

add_plugin("InstructionCounter")
pluginsConfig.InstructionCounter = {
    -- The filterPlugin parameter specifies which plugin to use to restrict instruction counting.
    -- If the parameter is missing, the instruction counter will count all instructions in the system.
    filterPlugin = "ThreadExecutionDetector"
}

Here is a sample trace entry:

{
    "address_space": 232222720,
    "count": 191951636,
    "pc": 134518939,
    "pid": 1255,
    "state_id": 0,
    "timestamp": 631227908821908,
    "type": "TRACE_ICOUNT"
}

Filtering Plugins

Execution tracers record by default every event in the system. This may be very expensive and result in enormous trace files that are hard to work with.

Execution tracers allow specifying a filtering plugin to restrict the trace to code of interest. A filtering plugin may, e.g., limit the trace to a single process, thread, or even a small code section.

The following section describes available filtering plugins.

ProcessExecutionDetector

This plugin filters tracing to a process of interest. The process can be either specified by name inside the lua configuration file or set at run time by guest code using an API from process_execution_detector.h.

add_plugin("ProcessExecutionDetector")
pluginsConfig.ProcessExecutionDetector = {
    moduleNames = {
        "myprogram.exe",
    },
    trackKernel = false
}

ThreadExecutionDetector

This plugins filters tracing to one or more thread of interest. The plugin is configured from the guest using APIs in thread_execution_detector.h. The plugin supports filtering of kernel code as well.

add_plugin("ThreadExecutionDetector")
pluginsConfig.ThreadExecutionDetector = {

}

ModuleExecutionDetector

This plugin filters tracing to specific binary names. It can be configured in the lua file. This plugin does not take into account threads or processes. For example, when using this filter to trace memory accesses in a shared library, accesses will be traced in every process where that library is loaded.

add_plugin("ModuleExecutionDetector")
pluginsConfig.ModuleExecutionDetector = {
    mod_0 = {
        moduleName = "myprogram.exe",
    },
}