Execution Tracers
S2E provides plugins to record execution traces. The main plugin is ExecutionTracer
. All other tracing plugins, such
as MemoryTracer
or TranslationBlockTracer
, use ExecutionTracer
in order to append various trace entries to
the trace file (e.g., a memory access or a translation block execution). ExecutionTracer
gathers trace data from the
various plugins and outputs them to an ExecutionTracer.dat
file in protobuf
format.
Note
This document is a reference. For a more hands-on explanation of how to generate traces, please read this tutorial first.
ExecutionTracer
An execution trace is a sequence of (magic, header_size, header, item_size, item)
entries.
The magic number identifies the start of a trace entry. It is followed by the size of the header, the header, the
item size, then finally the actual item.
The header is defined by the PbTraceItemHeader
protobuf type. For more details, see the
TraceEntries.proto
file.
message PbTraceItemHeader {
required uint32 state_id = 1;
required uint64 timestamp = 2;
required uint64 address_space = 3;
required uint64 pid = 4;
required uint64 pc = 5;
required PbTraceItemHeaderType type = 6;
}
The header entry contains data that is common to all trace items. This includes the program id, the program counter,
and the address space (i.e., cr3
register on x86) at the time a trace entry is created. The type field specifies
the type of the item that follows the header. The header also stores the state identifier to help reconstruct the
execution tree offline.
The execution tracer ensures that it is possible to reconstruct the execution tree and extract individual paths when
S2E terminates. The tree can be decoded even when the trace recording is partial (e.g., in case S2E is forcefully
terminated). In order to do this, the execution tracer plugin monitors forks and records the id of the forked states in
the trace (see PbTraceItemFork
). Any subsequent trace item has the corresponding state id in its header. If two
items in the trace have the same state identifier, the item that happened earlier in the path is the one that comes
first in the trace.
ModuleTracer
This plugin traces process and module events. The following snippet is an example of a module load event. This indicates
that the OS loaded the shared library /lib64/ld-linux-x86-64.so.2
into a process whose program identifier is 1245.
The event also records the location of each loaded section of the binary. The runtime_load_base
of a section
is its start address in virtual memory as decided by the OS. The native_load_base
is the address of the section
as specified by the linker. The two are often different because of ASLR or relocations.
{
"address_space": 255299584,
"name": "ld-linux-x86-64.so.2",
"path": "/lib64/ld-linux-x86-64.so.2",
"pc": 18446744071580784396,
"pid": 1245,
"sections": [
{
"executable": true,
"name": "",
"native_load_base": 0,
"readable": true,
"runtime_load_base": 140472821178368,
"size": 142304,
"writable": false
},
{
"executable": false,
"name": "",
"native_load_base": 2243520,
"readable": true,
"runtime_load_base": 140472823421888,
"size": 5120,
"writable": true
}
],
"state_id": 0,
"timestamp": 630430186029900,
"type": "TRACE_MOD_LOAD"
}
ModuleTracer
is a fundamental plugin, as it helps resolve the raw program counters in the trace to actual
binary names. s2e-env
uses this extensively to get file and line numbers when binaries have debug information.
TestCaseGenerator
The TestCaseGenerator
has several functions:
Write test cases to
debug.txt
Store test case files in
s2e-last
Write test cases to
ExecutionTracer.dat
.
In this document, we will examine the latter. A test case entry has the following format:
{
"address_space": 225800192,
"items": [
{
"key": "v0___symfile____tmp_input___0_1_symfile___0",
"value": "AQEBAQEBAQEBAQEBAQE...."
}
],
"pc": 134518939,
"pid": 1295,
"state_id": 1,
"timestamp": 630185690261719,
"type": "TRACE_TESTCASE"
}
The most important field is items
: it records a list of (key, value) pairs. The key is the name of the symbolic
variable and the value is the concrete assignment to that variable. The variable name has the following format:
vXXX_variablename_YYY
XXX
is the relative order of the variable within the execution path. When a new variable is created, this
number is incremented by one. So the 10th variable of a state will have an id of 9.
variablename
is the string passed to s2e_make_symbolic
.
YYY
is the absolute order of the variable within a given S2E run. This number can be non-deterministic and
influenced by the sequence of state switches.
These two identifiers ensure that each variable is globally unique and can be easily ordered when generating a test case.
MemoryTracer
This plugin traces memory accesses, page faults, and TLB misses in the specified processes. It requires the following
configuration in s2e-config.lua
:
add_plugin("MemoryTracer")
pluginsConfig.MemoryTracer = {
-- You can selectively enable/disable tracing various events
traceMemory = true,
tracePageFaults = true,
traceTlbMisses = true,
-- The filterPlugin parameter specifies which plugin to use to restrict memory tracing.
-- If the parameter is missing, the memory tracer will trace all accesses in the system.
filterPlugin = "ThreadExecutionDetector"
}
Note
MemoryTracer
may produce large amounts of data (on the order of gigabytes), so make sure to restrict tracing
to the modules of interest.
Here is an example of a memory access:
{
"address": 140720832808700,
"address_space": 255279104,
"concrete_buffer": 0,
"flags": 1,
"host_address": 0,
"pc": 94739530127322,
"pid": 1251,
"size": 4,
"state_id": 0,
"timestamp": 630430187009925,
"type": "TRACE_MEMORY",
"value": 3735928559
}
address
: the virtual address of the memory accessvalue
: the concrete value written/read by the memory accesssize
: the size in bytes of the memory accessflags
: this is a bitmask that indicates the type of the accessbit 0: set to 1 if the access is a write, read otherwise
bit 1: set to 1 if the access is memory-mapped I/O. Note that due to how CPU emulation works, normal RAM accesses may sometimes appear as MMIO.
bit 2: indicates that the value is symbolic. In this case , the trace entry stores the concrete version of the data.
bit 3: the address is symbolic. In this case, the trace entry stores the concrete version of the symbolic address.
Several other bits are available, please check
TraceEntries.proto
for more details.
Several other fields are stored when traceHostAddresses
is set in the configuration. These are useful under
rare circumstances when debugging the execution engine, e.g., to make sure that memory accesses get translated to
the right memory location on the host machine.
host_address
: this is the address of the access as mapped by QEMU when initializing KVM memory regions.concrete_buffer
: this is the final address of the access, after S2E translated the host address to the actual per-state location.
A TLB miss looks as follows. It only stores the address and whether the access was a read or a write.
{
"address": 140720832808704,
"address_space": 255279104,
"is_write": false,
"pc": 94739530127511,
"pid": 1251,
"state_id": 1,
"timestamp": 630430187323794,
"type": "TRACE_TLBMISS"
}
TranslationBlockTracer
This plugin records the state of the CPU registers before and/or after a translation block is executed. A translation block is a sequence of guest instructions that ends in a control flow change.
add_plugin("TranslationBlockTracer")
pluginsConfig.TranslationBlockTracer = {
-- In general, the CPU state at the beginning of a translation block is equal
-- to the state at the end of the previous translation block.
-- In cases where a translation block is interrupted because of an exception, the end state
-- may not be recorded.
traceTbStart = true,
traceTbEnd = false,
-- The filterPlugin parameter specifies which plugin to use to restrict translation block tracing.
-- If the parameter is missing, the translation block tracer will trace all translation
-- blocks in the system.
filterPlugin = "ThreadExecutionDetector"
}
Here is an example of a trace entry for translation blocks for 64-bit x86 code:
{
"address_space": 255324160,
"data": {
"first_pc": 93907771242192,
"last_pc": 93907771242228,
"size": 42,
"tb_type": "TB_CALL_IND"
},
"pc": 93907771242192,
"pid": 1253,
"regs": {
"values": [
28,
140723069759552,
139806606502816,
0,
140723069759520,
0,
2,
139806608687472,
139806608688896,
0,
8,
140723070026140,
93907771242192,
140723069759520,
0,
0
],
"symb_mask": 0
},
"state_id": 0,
"timestamp": 631227908208916,
"type": "TRACE_TB_START"
}
There are several fields specific to this type of trace entry:
tb_type
: the type of the translation block. It is defined by the last instruction of the block (e.g., direct call/jump, indirect call/jump, system call, etc.).regs
: the register data:values
: concrete content of the CPU registers.symb_mask
: this is a bitmask that indicates which register contains symbolic data. In case a register is symbolic, thevalues
field contains a concrete value that satisfies path constraints at the time of the recording. Note that symbolic data is not recorded by the plugin.
size
: the size in bytes of the guest instructions contained in this translation block.first_pc
: the program counter of the first instruction in the translation block.last_pc
: the program counter of the last instruction in the translation block.
InstructionCounter
This plugin counts how many instructions have been executed in the configured processes or modules and writes the count to the execution trace. The plugin keeps a per-path count and writes it when the path terminates.
add_plugin("InstructionCounter")
pluginsConfig.InstructionCounter = {
-- The filterPlugin parameter specifies which plugin to use to restrict instruction counting.
-- If the parameter is missing, the instruction counter will count all instructions in the system.
filterPlugin = "ThreadExecutionDetector"
}
Here is a sample trace entry:
{
"address_space": 232222720,
"count": 191951636,
"pc": 134518939,
"pid": 1255,
"state_id": 0,
"timestamp": 631227908821908,
"type": "TRACE_ICOUNT"
}
Filtering Plugins
Execution tracers record by default every event in the system. This may be very expensive and result in enormous trace files that are hard to work with.
Execution tracers allow specifying a filtering plugin to restrict the trace to code of interest. A filtering plugin may, e.g., limit the trace to a single process, thread, or even a small code section.
The following section describes available filtering plugins.
ProcessExecutionDetector
This plugin filters tracing to a process of interest. The process can be either specified by name
inside the lua configuration file or set at run time by guest code using an API from
process_execution_detector.h
.
add_plugin("ProcessExecutionDetector")
pluginsConfig.ProcessExecutionDetector = {
moduleNames = {
"myprogram.exe",
},
trackKernel = false
}
ThreadExecutionDetector
This plugins filters tracing to one or more thread of interest. The plugin is configured from the guest
using APIs in thread_execution_detector.h
. The plugin supports filtering of kernel code as well.
add_plugin("ThreadExecutionDetector")
pluginsConfig.ThreadExecutionDetector = {
}
ModuleExecutionDetector
This plugin filters tracing to specific binary names. It can be configured in the lua file. This plugin does not take into account threads or processes. For example, when using this filter to trace memory accesses in a shared library, accesses will be traced in every process where that library is loaded.
add_plugin("ModuleExecutionDetector")
pluginsConfig.ModuleExecutionDetector = {
mod_0 = {
moduleName = "myprogram.exe",
},
}