Note

Starting at 3-Introduction to Testing.pdf, slide 41

Random testing

AFL vs LibFuzzer

AFL runs the target as a separate process and mutates inputs externally (grey-box: it still uses coverage feedback from instrumentation)
LibFuzzer is in-process and uses compiler instrumentation to guide input mutations more directly

AFL	LibFuzzer
standalone tool	library that can be integrated into a larger testing framework
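
The idea of using coverage feedback to guide mutations can be sketched as a toy loop. This is not how AFL or LibFuzzer are implemented: `target`, `mutate`, and `fuzz` are hypothetical names, and the "instrumentation" is simulated by having the target return the ids of the branches it executed.

```python
import random

def target(data: bytes) -> set:
    """Hypothetical instrumented target: returns ids of the branches it executed."""
    hit = {0}
    if len(data) > 0 and data[0] == ord("F"):
        hit.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            hit.add(2)
            if len(data) > 2 and data[2] == ord("Z"):
                hit.add(3)  # deepest branch, e.g. where a bug might hide
    return hit

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte and sometimes append a new one."""
    buf = bytearray(data) or bytearray(b"\x00")
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    if rng.random() < 0.3:
        buf.append(rng.randrange(256))
    return bytes(buf)

def fuzz(iterations: int, seed: int = 0):
    """Keep a mutated input only if it reaches a branch not seen before."""
    rng = random.Random(seed)
    corpus = [b""]
    seen = set(target(b""))
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = target(candidate)
        if not coverage <= seen:      # new branch reached
            seen |= coverage
            corpus.append(candidate)  # becomes a base for later mutations
    return corpus, seen
```

Inputs that reach new branches are kept and mutated further, so the fuzzer can get through the nested checks one byte at a time instead of having to guess all the bytes at once.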

OSS-Fuzz

a Google service that uses ClusterFuzz (+ sanitizers and fuzzers like AFL/LibFuzzer) to continuously fuzz open-source projects

has discovered over 17400 bugs from 2016 to 2019 in many large projects (e.g. OpenSSL, LLVM, PostgreSQL, Git, Firefox)

ClusterFuzz

Google’s scalable fuzzing infrastructure

used in OSS-Fuzz & to fuzz the Chrome browser
as of January 2019, it has found ~16000 bugs in Chrome and ~11000 bugs in 160+ open-source projects
highly scalable (1000+ machines)
accurate deduplication of crashes
fully automatic bug filing for issue trackers
analytics & web interface

Grammar-based fuzzing

there are different ways to formally describe a language:

Regular expressions: simplest class of languages
- example: [a-z]* denotes a (possibly empty) sequence of lowercase letters
Context-free grammars: can express a wide range of properties of an input language (e.g. syntactical structure of an input format)
- e.g.
  - expression → term | expression + term
  - term → factor | term * factor
  - factor → integer | (expression)
  - integer → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
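
A grammar like the one above can drive a fuzzer directly: start from `expression` and expand nonterminals at random. The sketch below is a minimal illustration, assuming a dictionary encoding of the grammar and a depth limit to force termination (`GRAMMAR`, `generate`, and `max_depth` are names chosen here, not from the slides).

```python
import random

# The context-free grammar from the notes: nonterminal -> list of alternatives
GRAMMAR = {
    "expression": [["term"], ["expression", "+", "term"]],
    "term": [["factor"], ["term", "*", "factor"]],
    "factor": [["integer"], ["(", "expression", ")"]],
    "integer": [[d] for d in "0123456789"],
}

def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a random string of the language."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, drop the recursive alternatives so expansion terminates
        alternatives = [alt for alt in alternatives
                        if symbol not in alt and "expression" not in alt]
    production = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in production)

# Usage: a few random, syntactically valid inputs
samples = [generate("expression", random.Random(seed)) for seed in range(5)]
```

Every generated string is syntactically valid by construction, which lets the fuzzer get past the parser and exercise deeper program logic.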

Testing concurrent programs

uncovering bugs in concurrent programs requires not only discovering specific program inputs, but also specific thread schedules

Solution: add random delays using sleep(x); each set of delays makes the program attempt a different thread schedule, and hopefully one of them triggers a bug
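
A minimal sketch of the random-delay idea, assuming two threads that each perform an unprotected read-then-write on a shared counter (`racy_run` and `max_delay` are illustrative names). Depending on the delays drawn, the threads either run one after the other (final value 2) or interleave and lose an update (final value 1).

```python
import random
import threading
import time

def racy_run(seed, max_delay=0.005):
    """Run two racy increments under one randomly perturbed schedule."""
    rng = random.Random(seed)
    # One (before_read, between_read_and_write) delay pair per thread
    delays = [(rng.uniform(0, max_delay), rng.uniform(0, max_delay)) for _ in range(2)]
    state = {"counter": 0}

    def worker(before, between):
        time.sleep(before)            # random delay: shifts when this thread runs
        value = state["counter"]      # read
        time.sleep(between)           # random delay: widens the race window
        state["counter"] = value + 1  # write (may overwrite the other thread's update)

    threads = [threading.Thread(target=worker, args=d) for d in delays]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]

# Try several schedules; any run returning 1 exposed the lost-update bug
outcomes = {racy_run(seed) for seed in range(10)}
```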

Depth of a concurrency bug

Bug depth: the number of ordering constraints a schedule has to satisfy to trigger the bug

the more constraints a bug needs, the less likely a random schedule is to satisfy all of them, so depth measures how hard the bug is to find
Ordering constraints: requirements that an operation in one thread executes before an operation in a different thread

Case Studies

Google Monkey (Android testing)

generates TOUCH(x,y), where x and y are randomly generated
generates MOVE(x2,y2), where x2 and y2 are randomly generated
generates MOTION(..), which consists of a DOWN() event somewhere on the screen, a sequence of MOVE(..) events, and an UP() event

Grammar of Monkey events

test_case -> event *
event     -> action ( x, y ) | ...
action    -> DOWN | MOVE | UP
x         -> 0 | 1 | ... | x_limit
y         -> 0 | 1 | ... | y_limit

Input: DOWN(0,0); MOVE(1,1); UP(2,2);
Expected output: a DOWN event at coordinate (0,0), a MOVE event at (1,1), and an UP event at (2,2)
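
The event grammar above can be turned into a small random generator. This is only a sketch of the grammar, not Monkey's actual code: the function names are made up here, the screen limits are passed in as parameters, and the `...` alternative of `event` is left out.

```python
import random

ACTIONS = ("DOWN", "MOVE", "UP")  # action -> DOWN | MOVE | UP

def random_event(rng, x_limit, y_limit):
    """event -> action ( x, y )"""
    action = rng.choice(ACTIONS)
    x = rng.randint(0, x_limit)  # x -> 0 | 1 | ... | x_limit
    y = rng.randint(0, y_limit)  # y -> 0 | 1 | ... | y_limit
    return f"{action}({x},{y});"

def random_test_case(rng, num_events, x_limit, y_limit):
    """test_case -> event *"""
    return " ".join(random_event(rng, x_limit, y_limit) for _ in range(num_events))

def random_motion(rng, num_moves, x_limit, y_limit):
    """A MOTION gesture: one DOWN, a sequence of MOVEs, then one UP."""
    actions = ["DOWN"] + ["MOVE"] * num_moves + ["UP"]
    return " ".join(
        f"{a}({rng.randint(0, x_limit)},{rng.randint(0, y_limit)});" for a in actions
    )
```

For example, `random_test_case(random.Random(0), 3, 1079, 1919)` yields three events in the same format as the sample input (the 1079 and 1919 limits are assumed values for a 1080x1920 screen).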

Microsoft Cuzz (concurrent testing)

works by running a test under randomized thread schedules to exercise the application's concurrent behaviour
- reports the concurrency bugs (e.g. race conditions, deadlocks) that these schedules expose during testing
Main idea: systematically automate the insertion of sleep() calls
gives a worst-case probabilistic guarantee on finding bugs

Probabilistic guarantee

Given a program with:

n threads (~tens)
k steps (~millions)
a bug of depth d (typically 1 or 2)

Cuzz will find the bug with a probability of at least $\frac{1}{n \cdot k^{d-1}}$
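
To get a feel for the bound, it can be evaluated for the typical magnitudes listed above. The helper name `cuzz_lower_bound` and the concrete values n = 10 and k = 1,000,000 are assumptions picked for illustration.

```python
def cuzz_lower_bound(n, k, d):
    """Worst-case probability 1 / (n * k^(d-1)) of hitting a depth-d bug in one run."""
    return 1.0 / (n * k ** (d - 1))

# Depth 1: the bound does not depend on k at all
p1 = cuzz_lower_bound(n=10, k=1_000_000, d=1)   # 0.1
# Depth 2: each extra constraint costs another factor of k
p2 = cuzz_lower_bound(n=10, k=1_000_000, d=2)   # 1e-07
```

Since this is a lower bound per run, repeating the test many times with fresh random schedules raises the overall chance of hitting the bug.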

Example

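import threading

counter = 0  # shared state modified by every thread

# function that each thread will run
# (not thread safe: counter += 1 is a read-modify-write, so updates can be lost)
def increment_counter():
    global counter
    for _ in range(1000000):
        counter += 1

# fixed version (thread safe): the lock makes each increment atomic
counter_lock = threading.Lock()
def increment_counter_safe():
    global counter
    for _ in range(1000000):
        with counter_lock:
            counter += 1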
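
A small harness for the thread-safe version, as a sketch: `run_locked_counter` is a name invented here, and the thread and iteration counts are reduced so it finishes quickly. With the lock in place the final count is always `num_threads * increments`.

```python
import threading

def run_locked_counter(num_threads=4, increments=10000):
    """Start num_threads workers that each add `increments` to a shared counter."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:        # only one thread at a time may update the counter
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()              # wait for all workers before reading the result
    return counter
```

Removing the `with lock:` line gives the unsafe variant, whose result may fall short of the expected total depending on the schedule.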

Quiz: Concurrency bug depth

// thread 1:
lock(a);     // 1
lock(b);     // 2
g = g + 1;   // 3
unlock(b);   // 4
unlock(a);   // 5

// thread 2:
lock(b);     // 6
lock(a);     // 7
g = 0;       // 8
unlock(a);   // 9
unlock(b);   // 10

specify the depth of the concurrency bug, and specify all ordering constraints needed to trigger the bug (use the notation (x,y) to mean statement x comes before statement y, and separate multiple constraints by a space)

Solution

Depth of the concurrency bug: 2
Ordering constraints: (1,7) (6,2)
Result: a deadlock, since thread 1 holds a and waits for b while thread 2 holds b and waits for a
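
The schedule from the solution can be forced in Python to confirm that it deadlocks. This is a sketch under stated assumptions: statements are counted from 1 across both threads (so 1 and 2 are thread 1's `lock(a)` and `lock(b)`, 6 and 7 are thread 2's `lock(b)` and `lock(a)`), events enforce the two ordering constraints, and a timeout on the second acquire stands in for "blocked forever" so the demo terminates.

```python
import threading

def run_with_forced_schedule(timeout=0.5):
    """Force the orderings (1,7) and (6,2) and report which threads get stuck."""
    a, b = threading.Lock(), threading.Lock()
    t1_has_a = threading.Event()    # set once statement 1 has executed
    t2_has_b = threading.Event()    # set once statement 6 has executed
    both_tried = threading.Barrier(2)
    blocked = []

    def thread1():
        a.acquire()                       # 1: lock(a)
        t1_has_a.set()
        t2_has_b.wait()                   # constraint (6,2): 6 before 2
        if b.acquire(timeout=timeout):    # 2: lock(b)
            b.release()
        else:
            blocked.append("thread 1")    # would wait forever without the timeout
        both_tried.wait()                 # keep holding a until both attempts are over
        a.release()

    def thread2():
        b.acquire()                       # 6: lock(b)
        t2_has_b.set()
        t1_has_a.wait()                   # constraint (1,7): 1 before 7
        if a.acquire(timeout=timeout):    # 7: lock(a)
            a.release()
        else:
            blocked.append("thread 2")
        both_tried.wait()
        b.release()

    threads = [threading.Thread(target=thread1), threading.Thread(target=thread2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(blocked)
```

Under this schedule both threads fail to take their second lock, which is exactly the deadlock. Acquiring the locks in the same order in both threads would remove it.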

Note

Slides up to slide 76/77

Connor's Notes

COSC 3P95: Lecture 7 Notes
