Moved unicorn_mode to unicornafl

This commit is contained in:
Dominik Maier
2019-12-15 05:03:32 +01:00
parent d40b670388
commit 49c9b68e4e
9 changed files with 273 additions and 172 deletions

3
.gitignore vendored
View File

@@ -34,6 +34,3 @@ afl-whatsup.8
qemu_mode/libcompcov/compcovtest
as
qemu_mode/qemu-*
unicorn_mode/unicorn
unicorn_mode/unicorn-*
unicorn_mode/*.tar.gz

3
.gitmodules vendored Normal file
View File

@@ -0,0 +1,3 @@
[submodule "unicorn_mode/unicorn"]
path = unicorn_mode/unicorn
url = https://github.com/vanhauser-thc/unicorn.git

View File

@@ -20,7 +20,7 @@ but at least we're able to use AFL on these binaries, right?
## 2) How to use
Requirements: you need an installed python2 environment.
Requirements: you need an installed python environment.
### Building AFL's Unicorn Mode
@@ -31,11 +31,8 @@ features:
$ cd unicorn_mode
$ ./build_unicorn_support.sh
NOTE: This script downloads a Unicorn Engine commit that has been tested
and is stable-ish from the Unicorn github page. If you are offline, you'll need
to hack up this script a little bit and supply your own copy of Unicorn's latest
stable release. It's not very hard, just check out the beginning of the
build_unicorn_support.sh script and adjust as necessary.
NOTE: This script checks out a Unicorn Engine fork as a submodule. The fork has been
tested, is stable-ish, and is based on the Unicorn Engine master branch.
Building Unicorn will take a little bit (~5-10 minutes). Once it completes
it automatically compiles a sample application and verifies that it works.
@@ -51,11 +48,10 @@ To really use unicorn-mode effectively you need to prepare the following:
+ Quality/speed of results will depend greatly on quality of starting
samples
+ See AFL's guidance on how to create a sample corpus
* Unicorn-based test harness which:
* Unicornafl-based test harness which:
+ Adds memory map regions
+ Loads binary code into memory
+ Emulates at least one instruction*
+ Yeah, this is lame. See 'Gotchas' section below for more info
+ Calls uc.afl_fuzz() / uc.afl_forkserver_start()
+ Loads and verifies data to fuzz from a command-line specified file
+ AFL will provide mutated inputs by changing the file passed to
the test harness
@@ -103,16 +99,20 @@ for the x86, x86_64 and ARM targets.
## 4) Gotchas, feedback, bugs
To make sure that AFL's fork server starts up correctly the Unicorn test
harness script must emulate at least one instruction before loading the
data that will be fuzzed from the input file. It doesn't matter what the
instruction is, nor if it is valid. This is an artifact of how the fork-server
is started and could likely be fixed with some clever re-arranging of the
patches applied to Unicorn.
Running the build script builds Unicornafl and its python bindings and installs
them on your system.
This installation will leave any existing Unicorn installations untouched.
If you want to use unicornafl instead of unicorn in a script,
replace all `unicorn` imports with `unicornafl` imports; everything else should "just work".
If you use 3rd party code depending on unicorn, you can use unicornafl monkeypatching:
Before importing anything that depends on unicorn, do:
Running the build script builds Unicorn and its python bindings and installs
them on your system. This installation will supersede any existing Unicorn
installation with the patched afl-unicorn version.
```python
import unicornafl
unicornafl.monkeypatch()
```
This will replace all unicorn imports with unicornafl imports.
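For intuition, import replacement of this kind can be done by registering one module under another name in `sys.modules`. The sketch below shows only that general technique and is not necessarily how `unicornafl.monkeypatch()` is implemented; `patched_engine` and `stock_engine` are hypothetical names made up for the illustration.

```python
import sys
import types


def alias_module(alias, replacement):
    """Make a later `import <alias>` resolve to `replacement`."""
    sys.modules[alias] = replacement


# Hypothetical stand-in for the patched engine package.
patched = types.ModuleType("patched_engine")
patched.ENGINE = "patched"

# This has to happen before anything imports the aliased name.
alias_module("stock_engine", patched)

import stock_engine  # found in sys.modules, so no file lookup takes place

print(stock_engine.ENGINE)
```

With this approach, submodules (such as a constants module) would need their own `sys.modules` entries, which is one reason to call the patch before any dependent import.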
Refer to the unicorn_mode/samples/arm_example/arm_tester.c for an example
of how to do this properly! If you don't get this right, AFL will not

View File

@@ -33,9 +33,6 @@
# You must make sure that Unicorn Engine is not already installed before
# running this script. If it is, please uninstall it first.
UNICORN_URL="https://github.com/unicorn-engine/unicorn/archive/24f55a7973278f20f0de21b904851d99d4716263.tar.gz"
UNICORN_SHA384="7180d47ca52c99b4c073a343a2ead91da1a829fdc3809f3ceada5d872e162962eab98873a8bc7971449d5f34f41fdb93"
echo "================================================="
echo "Unicorn-AFL build script"
echo "================================================="
@@ -52,7 +49,7 @@ if [ ! "$PLT" = "Linux" ] && [ ! "$PLT" = "Darwin" ] && [ ! "$PLT" = "FreeBSD" ]
fi
if [ ! -f "patches/afl-unicorn-cpu-inl.h" -o ! -f "../config.h" ]; then
if [ ! -f "../config.h" ]; then
echo "[-] Error: key files not found - wrong working directory?"
exit 1
@@ -66,40 +63,30 @@ if [ ! -f "../afl-showmap" ]; then
fi
PYTHONBIN=python
MAKECMD=make
EASY_INSTALL='easy_install'
TARCMD=tar
if [ "$PLT" = "Linux" ]; then
CKSUMCMD='sha384sum --'
PYTHONBIN=python2
MAKECMD=make
CORES=`nproc`
TARCMD=tar
EASY_INSTALL=easy_install
fi
if [ "$PLT" = "Darwin" ]; then
CKSUMCMD="shasum -a 384"
PYTHONBIN=python2.7
MAKECMD=make
CORES=`sysctl hw.ncpu | cut -d' ' -f2`
TARCMD=tar
EASY_INSTALL=easy_install-2.7
fi
if [ "$PLT" = "FreeBSD" ]; then
CKSUMCMD="sha384 -q"
PYTHONBIN=python2.7
MAKECMD=gmake
CORES=`sysctl hw.ncpu | cut -d' ' -f2`
TARCMD=gtar
EASY_INSTALL=easy_install-2.7
fi
if [ "$PLT" = "NetBSD" ] || [ "$PLT" = "OpenBSD" ]; then
CKSUMCMD="cksum -a sha384 -q"
PYTHONBIN=python2.7
MAKECMD=gmake
CORES=`sysctl hw.ncpu | cut -d'=' -f2`
TARCMD=gtar
EASY_INSTALL=easy_install-2.7
fi
for i in wget $PYTHONBIN automake autoconf $MAKECMD $TARCMD; do
@@ -108,7 +95,7 @@ for i in wget $PYTHONBIN automake autoconf $MAKECMD $TARCMD; do
if [ "$T" = "" ]; then
echo "[-] Error: '$i' not found. Run 'sudo apt-get install $i'."
echo "[-] Error: '$i' not found. Run 'sudo apt-get install $i' or similar."
exit 1
fi
@@ -136,51 +123,13 @@ fi
echo "[+] All checks passed!"
ARCHIVE="`basename -- "$UNICORN_URL"`"
echo "[*] Making sure unicornafl is checked out"
git submodule init || exit 1
git submodule update || exit 1
echo "[+] Got unicornafl."
CKSUM=`$CKSUMCMD "$ARCHIVE" 2>/dev/null | cut -d' ' -f1`
if [ ! "$CKSUM" = "$UNICORN_SHA384" ]; then
echo "[*] Downloading Unicorn v1.0.1 from the web..."
rm -f "$ARCHIVE"
OK=
while [ -z "$OK" ]; do
wget -c -O "$ARCHIVE" -- "$UNICORN_URL" && OK=1
done
CKSUM=`$CKSUMCMD "$ARCHIVE" 2>/dev/null | cut -d' ' -f1`
fi
if [ "$CKSUM" = "$UNICORN_SHA384" ]; then
echo "[+] Cryptographic signature on $ARCHIVE checks out."
else
echo "[-] Error: signature mismatch on $ARCHIVE (perhaps download error?)."
exit 1
fi
echo "[*] Uncompressing archive (this will take a while)..."
rm -rf "unicorn" || exit 1
mkdir "unicorn" || exit 1
$TARCMD xzf "$ARCHIVE" -C ./unicorn --strip-components=1 || exit 1
echo "[+] Unpacking successful."
#rm -rf "$ARCHIVE" || exit 1
echo "[*] Applying patches..."
cp patches/*.h unicorn || exit 1
patch -p1 --directory unicorn < patches/patches.diff || exit 1
patch -p1 --directory unicorn < patches/compcov.diff || exit 1
echo "[+] Patching done."
echo "[*] making sure config.h matches"
cp "../config.h" "./unicorn/" || exit 1
echo "[*] Configuring Unicorn build..."
@@ -188,8 +137,9 @@ cd "unicorn" || exit 1
echo "[+] Configuration complete."
echo "[*] Attempting to build Unicorn (fingers crossed!)..."
echo "[*] Attempting to build unicornafl (fingers crossed!)..."
$MAKECMD clean # make doesn't seem to work for unicorn
UNICORN_QEMU_FLAGS="--python=$PYTHONBIN" $MAKECMD -j$CORES || exit 1
echo "[+] Build process successful!"
@@ -197,20 +147,21 @@ echo "[+] Build process successful!"
echo "[*] Installing Unicorn python bindings..."
cd bindings/python || exit 1
if [ -z "$VIRTUAL_ENV" ]; then
echo "[*] Info: Installing python unicorn using --user"
$PYTHONBIN setup.py install --user --prefix=|| exit 1
echo "[*] Info: Installing python unicornafl using --user"
$PYTHONBIN setup.py install --user --force --prefix=|| exit 1
else
echo "[*] Info: Installing python unicorn to virtualenv: $VIRTUAL_ENV"
$PYTHONBIN setup.py install || exit 1
echo "[*] Info: Installing python unicornafl to virtualenv: $VIRTUAL_ENV"
$PYTHONBIN setup.py install --force || exit 1
fi
export LIBUNICORN_PATH='$(pwd)' # in theory, this allows to switch between afl-unicorn and unicorn so files.
# export LIBUNICORN_PATH='$(pwd)' # in theory, this allows to switch between afl-unicorn and unicorn so files.
echo '[*] If needed, you can (re)install the bindings from `./unicorn/bindings/python` using `python setup.py install`'
cd ../../ || exit 1
echo "[+] Unicorn bindings installed successfully."
echo "[*] Unicornafl bindings installed successfully."
# Compile the sample, run it, verify that it works!
echo "[*] Testing unicorn-mode functionality by running a sample test harness under afl-unicorn"
echo "[*] Testing unicornafl python functionality by running a sample test harness"
cd ../samples/simple || exit 1
@@ -222,6 +173,8 @@ if [ -s .test-instr0 ]
then
echo "[+] Instrumentation tests passed. "
echo '[+] Make sure to adapt older scripts to `import unicornafl` and use `uc.afl_forkserver_start`'
echo ' or `uc.afl_fuzz` to kick off fuzzing.'
echo "[+] All set, you can now use Unicorn mode (-U) in afl-fuzz!"
RETVAL=0

View File

@@ -1,3 +1,4 @@
#!/usr/bin/env python
"""
Simple test harness for AFL's Unicorn Mode.
@@ -17,8 +18,8 @@ import argparse
import os
import signal
from unicorn import *
from unicorn.x86_const import *
from unicornafl import *
from unicornafl.x86_const import *
# Path to the file containing the binary to emulate
BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'compcov_target.bin')
@@ -120,51 +121,39 @@ def main():
uc.mem_map(STACK_ADDRESS, STACK_SIZE)
uc.reg_write(UC_X86_REG_RSP, STACK_ADDRESS + STACK_SIZE)
#-----------------------------------------------------
# Emulate 1 instruction to kick off AFL's fork server
# THIS MUST BE DONE BEFORE LOADING USER DATA!
# If this isn't done every single run, the AFL fork server
# will not be started appropriately and you'll get erratic results!
# It doesn't matter what this returns with, it just has to execute at
# least one instruction in order to get the fork server started.
# Mapping a location to write our buffer to
uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
# Execute 1 instruction just to startup the forkserver
print("Starting the AFL forkserver by executing 1 instruction")
try:
uc.emu_start(uc.reg_read(UC_X86_REG_RIP), 0, 0, count=1)
except UcError as e:
print("ERROR: Failed to execute a single instruction (error: {})!".format(e))
return
#-----------------------------------------------
# Load the mutated input and map it into memory
# Load the mutated input from disk
print("Loading data input from {}".format(args.input_file))
input_file = open(args.input_file, 'rb')
input = input_file.read()
input_file.close()
def place_input_callback(uc, input, _, data):
"""
Callback that loads the mutated input into memory.
"""
# Load the mutated input from disk
input_file = open(args.input_file, 'rb')
input = input_file.read()
input_file.close()
# Apply constraints to the mutated input
if len(input) > DATA_SIZE_MAX:
print("Test input is too long (> {} bytes)".format(DATA_SIZE_MAX))
return
# Apply constraints to the mutated input
if len(input) > DATA_SIZE_MAX:
return
# Write the mutated command into the data buffer
uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
uc.mem_write(DATA_ADDRESS, input)
# Write the mutated command into the data buffer
uc.mem_write(DATA_ADDRESS, input)
#------------------------------------------------------------
# Emulate the code, allowing it to process the mutated input
print("Executing until a crash or execution reaches 0x{0:016x}".format(end_address))
try:
result = uc.emu_start(uc.reg_read(UC_X86_REG_RIP), end_address, timeout=0, count=0)
except UcError as e:
print("Execution failed with error: {}".format(e))
force_crash(e)
print("Done.")
print("Starting the AFL fuzz")
uc.afl_fuzz(
input_file=args.input_file,
place_input_callback=place_input_callback,
exits=[end_address],
persistent_iters=1
)
if __name__ == "__main__":
main()

View File

@@ -1,3 +1,4 @@
#!/usr/bin/env python
"""
Simple test harness for AFL's Unicorn Mode.
@@ -17,8 +18,8 @@ import argparse
import os
import signal
from unicorn import *
from unicorn.mips_const import *
from unicornafl import *
from unicornafl.mips_const import *
# Path to the file containing the binary to emulate
BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'simple_target.bin')
@@ -120,51 +121,29 @@ def main():
uc.mem_map(STACK_ADDRESS, STACK_SIZE)
uc.reg_write(UC_MIPS_REG_SP, STACK_ADDRESS + STACK_SIZE)
#-----------------------------------------------------
# Emulate 1 instruction to kick off AFL's fork server
# THIS MUST BE DONE BEFORE LOADING USER DATA!
# If this isn't done every single run, the AFL fork server
# will not be started appropriately and you'll get erratic results!
# It doesn't matter what this returns with, it just has to execute at
# least one instruction in order to get the fork server started.
# Execute 1 instruction just to startup the forkserver
print("Starting the AFL forkserver by executing 1 instruction")
try:
uc.emu_start(uc.reg_read(UC_MIPS_REG_PC), 0, 0, count=1)
except UcError as e:
print("ERROR: Failed to execute a single instruction (error: {})!".format(e))
return
#-----------------------------------------------
# Load the mutated input and map it into memory
# Load the mutated input from disk
print("Loading data input from {}".format(args.input_file))
input_file = open(args.input_file, 'rb')
input = input_file.read()
input_file.close()
# Apply constraints to the mutated input
if len(input) > DATA_SIZE_MAX:
print("Test input is too long (> {} bytes)".format(DATA_SIZE_MAX))
return
# Write the mutated command into the data buffer
# reserve some space for data
uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
uc.mem_write(DATA_ADDRESS, input)
#------------------------------------------------------------
# Emulate the code, allowing it to process the mutated input
#-----------------------------------------------------
# Set up a callback to place input data (do little work here, it's called for every single iteration)
# We did not pass in any data and don't use persistent mode, so we can ignore these params.
# Be sure to check out the docstrings for the uc.afl_* functions.
def place_input_callback(uc, input, persistent_round, data):
# Load the mutated input from disk
input_file = open(args.input_file, 'rb')
input = input_file.read()
input_file.close()
print("Executing until a crash or execution reaches 0x{0:016x}".format(end_address))
try:
result = uc.emu_start(uc.reg_read(UC_MIPS_REG_PC), end_address, timeout=0, count=0)
except UcError as e:
print("Execution failed with error: {}".format(e))
force_crash(e)
# Apply constraints to the mutated input
if len(input) > DATA_SIZE_MAX:
#print("Test input is too long (> {} bytes)")
return False
print("Done.")
# Write the mutated command into the data buffer
uc.mem_write(DATA_ADDRESS, input)
# Start the fuzzer.
uc.afl_fuzz(args.input_file, place_input_callback, [end_address])
if __name__ == "__main__":
main()
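The input-placing callback can be exercised without AFL or an emulator. The following is a minimal sketch, assuming the callback signature `(uc, input, persistent_round, data)` seen in the harness; `make_place_input_callback` and `RecordingUc` are hypothetical helpers introduced for this illustration and are not part of unicornafl.

```python
import os
import tempfile


def make_place_input_callback(input_path, data_address, data_size_max):
    """Build a callback that copies the input file into emulated memory."""

    def place_input_callback(uc, input, persistent_round, data):
        # Read the mutated input from disk, as the harness does.
        with open(input_path, "rb") as input_file:
            mutated = input_file.read()
        # Reject inputs that would not fit into the mapped data region.
        if len(mutated) > data_size_max:
            return False
        uc.mem_write(data_address, mutated)

    return place_input_callback


class RecordingUc:
    """Hypothetical stand-in for a Uc instance; it only records writes."""

    def __init__(self):
        self.writes = []

    def mem_write(self, address, data):
        self.writes.append((address, data))


DATA_ADDRESS = 0x00300000
DATA_SIZE_MAX = 0x00010000

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")

fake_uc = RecordingUc()
make_place_input_callback(tmp.name, DATA_ADDRESS, DATA_SIZE_MAX)(fake_uc, None, 0, None)

# Same file, but with a 2-byte limit: the callback refuses and writes nothing.
strict_uc = RecordingUc()
rejected = make_place_input_callback(tmp.name, DATA_ADDRESS, 2)(strict_uc, None, 0, None)

os.remove(tmp.name)
```

Keeping the callback small matters because it runs once per fuzzing iteration.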

View File

@@ -0,0 +1,179 @@
#!/usr/bin/env python
"""
Alternative simple test harness for Unicornafl.
It is slower but compatible with anything that uses unicorn.
Have a look at `unicornafl.monkeypatch()` for an easy way to fuzz unicorn projects.
This loads the simple_target.bin binary (precompiled as MIPS code) into
Unicorn's memory map for emulation, places the specified input into
simple_target's buffer (hardcoded to be at 0x300000), and executes 'main()'.
If any crashes occur during emulation, this script throws a matching signal
to tell AFL that a crash occurred.
Run under AFL as follows:
$ cd <afl_path>/unicorn_mode/samples/simple/
$ ../../../afl-fuzz -U -m none -i ./sample_inputs -o ./output -- python simple_test_harness_alt.py @@
"""
import argparse
import os
import signal
from unicornafl import *
from unicornafl.mips_const import *
# Path to the file containing the binary to emulate
BINARY_FILE = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'simple_target.bin')
# Memory map for the code to be tested
CODE_ADDRESS = 0x00100000 # Arbitrary address where code to test will be loaded
CODE_SIZE_MAX = 0x00010000 # Max size for the code (64kb)
STACK_ADDRESS = 0x00200000 # Address of the stack (arbitrarily chosen)
STACK_SIZE = 0x00010000 # Size of the stack (arbitrarily chosen)
DATA_ADDRESS = 0x00300000 # Address where mutated data will be placed
DATA_SIZE_MAX = 0x00010000 # Maximum allowable size of mutated data
try:
# If Capstone is installed then we'll dump disassembly, otherwise just dump the binary.
from capstone import *
cs = Cs(CS_ARCH_MIPS, CS_MODE_MIPS32 + CS_MODE_BIG_ENDIAN)
def unicorn_debug_instruction(uc, address, size, user_data):
mem = uc.mem_read(address, size)
for (cs_address, cs_size, cs_mnemonic, cs_opstr) in cs.disasm_lite(bytes(mem), size):
print(" Instr: {:#016x}:\t{}\t{}".format(address, cs_mnemonic, cs_opstr))
except ImportError:
def unicorn_debug_instruction(uc, address, size, user_data):
print(" Instr: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
def unicorn_debug_block(uc, address, size, user_data):
print("Basic Block: addr=0x{0:016x}, size=0x{1:016x}".format(address, size))
def unicorn_debug_mem_access(uc, access, address, size, value, user_data):
if access == UC_MEM_WRITE:
print(" >>> Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
else:
print(" >>> Read: addr=0x{0:016x} size={1}".format(address, size))
def unicorn_debug_mem_invalid_access(uc, access, address, size, value, user_data):
if access == UC_MEM_WRITE_UNMAPPED:
print(" >>> INVALID Write: addr=0x{0:016x} size={1} data=0x{2:016x}".format(address, size, value))
else:
print(" >>> INVALID Read: addr=0x{0:016x} size={1}".format(address, size))
def force_crash(uc_error):
# This function should be called to indicate to AFL that a crash occurred during emulation.
# Pass in the exception received from Uc.emu_start()
mem_errors = [
UC_ERR_READ_UNMAPPED, UC_ERR_READ_PROT, UC_ERR_READ_UNALIGNED,
UC_ERR_WRITE_UNMAPPED, UC_ERR_WRITE_PROT, UC_ERR_WRITE_UNALIGNED,
UC_ERR_FETCH_UNMAPPED, UC_ERR_FETCH_PROT, UC_ERR_FETCH_UNALIGNED,
]
if uc_error.errno in mem_errors:
# Memory error - throw SIGSEGV
os.kill(os.getpid(), signal.SIGSEGV)
elif uc_error.errno == UC_ERR_INSN_INVALID:
# Invalid instruction - throw SIGILL
os.kill(os.getpid(), signal.SIGILL)
else:
# Not sure what happened - throw SIGABRT
os.kill(os.getpid(), signal.SIGABRT)
def main():
parser = argparse.ArgumentParser(description="Test harness for simple_target.bin")
parser.add_argument('input_file', type=str, help="Path to the file containing the mutated input to load")
parser.add_argument('-d', '--debug', default=False, action="store_true", help="Enables debug tracing")
args = parser.parse_args()
# Instantiate a MIPS32 big endian Unicorn Engine instance
uc = Uc(UC_ARCH_MIPS, UC_MODE_MIPS32 + UC_MODE_BIG_ENDIAN)
if args.debug:
uc.hook_add(UC_HOOK_BLOCK, unicorn_debug_block)
uc.hook_add(UC_HOOK_CODE, unicorn_debug_instruction)
uc.hook_add(UC_HOOK_MEM_WRITE | UC_HOOK_MEM_READ, unicorn_debug_mem_access)
uc.hook_add(UC_HOOK_MEM_WRITE_UNMAPPED | UC_HOOK_MEM_READ_INVALID, unicorn_debug_mem_invalid_access)
#---------------------------------------------------
# Load the binary to emulate and map it into memory
print("Loading binary code from {}".format(BINARY_FILE))
binary_file = open(BINARY_FILE, 'rb')
binary_code = binary_file.read()
binary_file.close()
# Make sure the binary fits into the code region
if len(binary_code) > CODE_SIZE_MAX:
print("Binary code is too large (> {} bytes)".format(CODE_SIZE_MAX))
return
# Map the code region and write the binary into it
uc.mem_map(CODE_ADDRESS, CODE_SIZE_MAX)
uc.mem_write(CODE_ADDRESS, binary_code)
# Set the program counter to the start of the code
start_address = CODE_ADDRESS # Address of entry point of main()
end_address = CODE_ADDRESS + 0xf4 # Address of last instruction in main()
uc.reg_write(UC_MIPS_REG_PC, start_address)
#-----------------
# Setup the stack
uc.mem_map(STACK_ADDRESS, STACK_SIZE)
uc.reg_write(UC_MIPS_REG_SP, STACK_ADDRESS + STACK_SIZE)
# reserve some space for data
uc.mem_map(DATA_ADDRESS, DATA_SIZE_MAX)
#-----------------------------------------------------
# Kick off AFL's fork server
# THIS MUST BE DONE BEFORE LOADING USER DATA!
# If this isn't done every single run, the AFL fork server
# will not be started appropriately and you'll get erratic results!
print("Starting the AFL forkserver")
afl_mode = uc.afl_forkserver_start([end_address])
if afl_mode != UC_AFL_RET_NO_AFL:
# Disable prints for speed
out = lambda x, y: None
else:
out = lambda x, y: print(x.format(y))
#-----------------------------------------------
# Load the mutated input and map it into memory
# Load the mutated input from disk
out("Loading data input from {}", args.input_file)
input_file = open(args.input_file, 'rb')
input = input_file.read()
input_file.close()
# Apply constraints to the mutated input
if len(input) > DATA_SIZE_MAX:
out("Test input is too long (> {} bytes)", DATA_SIZE_MAX)
return
# Write the mutated command into the data buffer
uc.mem_write(DATA_ADDRESS, input)
#------------------------------------------------------------
# Emulate the code, allowing it to process the mutated input
out("Executing until a crash or execution reaches 0x{0:016x}", end_address)
try:
uc.emu_start(uc.reg_read(UC_MIPS_REG_PC), end_address, timeout=0, count=0)
except UcError as e:
out("Execution failed with error: {}", e)
force_crash(e)
# UC_AFL_RET_ERROR = 0
# UC_AFL_RET_CHILD = 1
# UC_AFL_RET_NO_AFL = 2
# UC_AFL_RET_FINISHED = 3
out("Done. AFL Mode is {}", afl_mode)
if __name__ == "__main__":
main()
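The memory map constants used by this harness can be sanity-checked before any region is mapped. Below is a minimal sketch, assuming the code, stack and data addresses and sizes from the harness; `check_layout` is a hypothetical helper, and the 4 KiB alignment requirement reflects how Unicorn's `mem_map` expects addresses and sizes.

```python
PAGE_SIZE = 0x1000  # mem_map expects 4 KiB-aligned addresses and sizes

# (base address, size) pairs copied from the harness constants.
REGIONS = {
    "code": (0x00100000, 0x00010000),
    "stack": (0x00200000, 0x00010000),
    "data": (0x00300000, 0x00010000),
}


def check_layout(regions, page_size=PAGE_SIZE):
    """Return a list of problems; an empty list means the layout looks sane."""
    problems = []
    for name, (base, size) in regions.items():
        if base % page_size or size % page_size:
            problems.append("{} is not page-aligned".format(name))
    # Sort by base address and compare each region with its successor.
    ordered = sorted(regions.items(), key=lambda item: item[1][0])
    for (name_a, (base_a, size_a)), (name_b, (base_b, _)) in zip(ordered, ordered[1:]):
        if base_a + size_a > base_b:
            problems.append("{} overlaps {}".format(name_a, name_b))
    return problems


print(check_layout(REGIONS))
```

A check like this is cheap to run once at startup and catches layout mistakes before they show up as confusing mapping errors.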

1
unicorn_mode/unicorn Submodule

Submodule unicorn_mode/unicorn added at c15508a373