Note

Starting at 3-Introduction to Testing.pdf, slide 41

Random testing

AFL vs LibFuzzer

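  • AFL runs the target program as a separate process and mutates inputs externally (it is still coverage-guided: the target is normally compiled with AFL's instrumentation)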
  • LibFuzzer is in-process and uses compiler instrumentation to guide input mutations more directly
  • AFL: standalone tool
  • LibFuzzer: library that can be integrated into a larger testing framework
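
To make "in-process" and "coverage-guided" concrete, here is a toy sketch of the loop a LibFuzzer-style fuzzer runs. It is not how AFL or LibFuzzer are implemented: the names target, mutate and fuzz are hypothetical, and the toy target reports its own coverage as a set of branch ids, whereas the real tools obtain coverage from compiler instrumentation.

```python
import random

def target(data: bytes) -> set:
    """Toy program under test; returns the ids of the branches it executed."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("b"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("u"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("g"):
                raise RuntimeError("crash")  # the planted bug
    return covered

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte and sometimes append another."""
    buf = bytearray(data) or bytearray(b"\x00")
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    if rng.random() < 0.3:
        buf.append(rng.randrange(256))
    return bytes(buf)

def fuzz(iterations: int, seed: int = 0):
    """Return (crashing_input_or_None, corpus)."""
    rng = random.Random(seed)
    corpus = [b"a"]   # initial seed input (an assumption)
    seen = set()      # branch ids covered so far
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            covered = target(candidate)   # called in-process, no fork/exec
        except RuntimeError:
            return candidate, corpus
        if not covered <= seen:           # new coverage: keep this input
            seen |= covered
            corpus.append(candidate)
    return None, corpus

crash, corpus = fuzz(20000)
print("crashing input:", crash, "| corpus size:", len(corpus))
```

Keeping only inputs that reach new branches is what lets the fuzzer build the bytes "bug" one at a time instead of having to guess all three at once.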

OSS-Fuzz

a Google service that uses ClusterFuzz (+ sanitizers and fuzzers like AFL/LibFuzzer) to continuously fuzz open-source projects

  • discovered over 17400 bugs between 2016 and 2019 in many large projects (e.g. openssl, llvm, postgresql, git, firefox)

ClusterFuzz

Google’s scalable fuzzing infrastructure

  • used in OSS-Fuzz & to fuzz the Chrome browser
  • as of January 2019, it has found ~16000 bugs in Chrome and ~11000 bugs in 160+ open-source projects
  • highly scalable (1000+ machines)
  • accurate deduplication of crashes
  • fully automatic bug filing for issue trackers
  • analytics & web interface

Grammar-based fuzzing

there are different ways to formally describe a language:

  • Regular expressions: simplest class of languages
    • example: [a-z]* denotes a (possibly empty) sequence of lowercase letters
  • Context-free grammars: can express a wide range of properties of an input language (e.g. syntactical structure of an input format)
    • e.g.
      • expression -> term | expression + term
      • term -> factor | term * factor
      • factor -> integer | (expression)
      • integer -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
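
A grammar like the one above can drive a fuzzer directly: start from a nonterminal and keep picking random alternatives until only terminals remain. The sketch below is a minimal illustration. The GRAMMAR dictionary encoding, the generate helper and the max_depth cutoff are assumptions made for this example, not part of the slides.

```python
import random

# Each nonterminal maps to its list of alternatives; anything else is a terminal.
GRAMMAR = {
    "expression": [["term"], ["expression", "+", "term"]],
    "term": [["factor"], ["term", "*", "factor"]],
    "factor": [["integer"], ["(", "expression", ")"]],
    "integer": [[d] for d in "0123456789"],
}

def generate(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a random string of the language."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth budget, allow only single-symbol alternatives,
        # which rules out the recursive ones and forces termination.
        alternatives = [alt for alt in alternatives if len(alt) == 1]
    choice = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in choice)

rng = random.Random(0)
samples = [generate("expression", rng) for _ in range(5)]
for sample in samples:
    print(sample)
```

Every generated string is syntactically valid, so it gets past the parser and exercises the code behind it, which purely random byte strings rarely do.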

Testing concurrent programs

uncovering bugs in concurrent programs requires not only discovering specific program inputs, but also specific thread schedules

  • Solution: add random delays using sleep(x); each delay perturbs the thread schedule, so repeated runs try different schedules in the hope that one of them triggers a bug
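
A minimal sketch of the random-delay idea in Python: each thread sleeps for a random duration between reading and writing a shared counter, which widens the window in which another thread can interleave. The helper maybe_sleep and the constants N_THREADS, ITERATIONS and the 1 ms maximum delay are assumptions chosen for illustration.

```python
import random
import threading
import time

N_THREADS = 4
ITERATIONS = 50
counter = 0  # shared, deliberately unprotected

def maybe_sleep(rng, max_delay=0.001):
    """Hypothetical helper: pause for a random time to perturb the schedule."""
    time.sleep(rng.uniform(0, max_delay))

def worker(seed):
    global counter
    rng = random.Random(seed)  # per-thread generator, so each thread's delays are reproducible
    for _ in range(ITERATIONS):
        value = counter      # read
        maybe_sleep(rng)     # another thread may update counter here
        counter = value + 1  # write back a possibly stale value

threads = [threading.Thread(target=worker, args=(seed,)) for seed in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("expected", N_THREADS * ITERATIONS, "got", counter)
```

If the printed count is lower than expected, one of the injected delays exposed a lost update. The exact number varies between runs because the schedule is random.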

Depth of a concurrency bug

Bug depth: the minimum number of ordering constraints a schedule has to satisfy to trigger the bug

  • intuitively, how deeply the bug is hidden: the more constraints must hold at once, the less likely a random schedule is to hit it
  • Ordering constraints: requirements that an operation in one thread executes before a given operation in another thread

Case Studies

Google Monkey (android testing)

  • generates TOUCH(x,y), where x and y are randomly generated
  • generates MOVE(x2,y2), where x2 and y2 are randomly generated
  • generates MOTION(..), which consists of a DOWN() event somewhere on the screen, a sequence of MOVE(..) events, and an UP() event

Grammar of Monkey events

test_case -> event *
event     -> action ( x, y ) | ...
action    -> DOWN | MOVE | UP
x         -> 0 | 1 | ... | x_limit
y         -> 0 | 1 | ... | y_limit
  • Input: DOWN(0,0); MOVE(1,1); UP(2,2);
  • Expected output: a DOWN event occurs at coordinate (0,0), a MOVE event at (1,1), and an UP event at (2,2)
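
The event grammar can be turned into a random generator almost line by line. The following is a sketch, not Monkey's actual implementation: random_event and random_test_case are hypothetical names, and the 1080x1920 screen size and the cap of 5 events are assumed values.

```python
import random

ACTIONS = ["DOWN", "MOVE", "UP"]

def random_event(rng, x_limit, y_limit):
    """event -> action ( x, y ), with every part chosen uniformly at random."""
    action = rng.choice(ACTIONS)
    x = rng.randint(0, x_limit)  # randint includes both endpoints
    y = rng.randint(0, y_limit)
    return f"{action}({x},{y})"

def random_test_case(rng, max_events, x_limit, y_limit):
    """test_case -> event *: zero or more events."""
    count = rng.randint(0, max_events)
    return [random_event(rng, x_limit, y_limit) for _ in range(count)]

rng = random.Random(42)
test_case = random_test_case(rng, max_events=5, x_limit=1079, y_limit=1919)
print("; ".join(test_case))
```

Because each event is drawn independently, a generated sequence can contain an UP with no preceding DOWN. The grammar as written allows this.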

Microsoft Cuzz (concurrent testing)

  • works by generating random inputs and concurrent schedules of threads to test the application’s behaviour
    • detects and reports any race conditions or other concurrency-related bugs that are found during testing
  • Main idea: systematically automate the insertion of sleep() calls
  • gives a worst-case probabilistic guarantee on finding bugs

Probabilistic guarantee

Given a program with:

  • n threads (~tens)
  • k steps (~millions)
  • a bug of depth d (1 or 2)

Cuzz will find the bug with a probability of at least 1 / (n * k^(d-1)) in each run
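
To get a feel for the numbers, the sketch below evaluates the per-run lower bound 1 / (n * k^(d-1)), the bound published for the PCT scheduling algorithm behind Cuzz. The helper name pct_lower_bound is hypothetical, and n = 10, k = 10^6 are illustrative values in the ranges listed above.

```python
from fractions import Fraction

def pct_lower_bound(n, k, d):
    """Worst-case per-run probability of hitting a depth-d bug
    with n threads and k steps: 1 / (n * k**(d - 1))."""
    return Fraction(1, n * k ** (d - 1))

print(pct_lower_bound(10, 10**6, 1))  # 1/10
print(pct_lower_bound(10, 10**6, 2))  # 1/10000000
```

For depth 1 the bound does not depend on k at all. Each extra unit of depth divides the guarantee by another factor of k.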

Example

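import threading

counter = 0  # shared state mutated by every thread

# function that each thread will run (NOT thread safe)
def increment_counter():
    global counter
    for _ in range(1000000):
        counter += 1  # read-modify-write; concurrent updates can be lost

# fixed version (thread safe), renamed so it does not shadow the buggy one
counter_lock = threading.Lock()
def increment_counter_safe():
    global counter
    for _ in range(1000000):
        with counter_lock:  # only one thread at a time performs the update
            counter += 1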

Quiz: Concurrency bug depth

// thread 1:
lock(a);
lock(b);
g = g + 1;
unlock(b);
unlock(a);

// thread 2:
lock(b);
lock(a);
g = 0;
unlock(a);
unlock(b);

Specify the depth of the concurrency bug and all ordering constraints needed to trigger it (use the notation x(y) to mean statement x comes before statement y, and separate multiple constraints by a space)
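
One way to check an answer is to enumerate every interleaving of the two threads and see which ones get stuck. The sketch below models only the lock and unlock statements (the updates to g are ignored); the encoding of the threads as T1 and T2 and the explore helper are assumptions made for this illustration.

```python
# Each thread is the sequence of lock operations from the quiz.
T1 = [("lock", "a"), ("lock", "b"), ("unlock", "b"), ("unlock", "a")]
T2 = [("lock", "b"), ("lock", "a"), ("unlock", "a"), ("unlock", "b")]
PROGRAMS = (T1, T2)

def explore(pcs=(0, 0), held=None, trace=()):
    """Yield (outcome, trace) for every maximal schedule of the two threads."""
    held = held or {}  # lock name -> id of the thread holding it
    if all(pc == len(prog) for pc, prog in zip(pcs, PROGRAMS)):
        yield "ok", trace
        return
    progressed = False
    for tid, program in enumerate(PROGRAMS):
        pc = pcs[tid]
        if pc == len(program):
            continue  # this thread has finished
        op, name = program[pc]
        if op == "lock" and name in held:
            continue  # blocked: the other thread holds this lock
        new_held = dict(held)
        if op == "lock":
            new_held[name] = tid
        else:
            del new_held[name]
        new_pcs = tuple(pc + 1 if i == tid else p for i, p in enumerate(pcs))
        progressed = True
        yield from explore(new_pcs, new_held, trace + ((tid, op, name),))
    if not progressed:
        yield "deadlock", trace  # unfinished threads, none can move

results = list(explore())
deadlocks = [trace for outcome, trace in results if outcome == "deadlock"]
completed = [trace for outcome, trace in results if outcome == "ok"]
for trace in deadlocks:
    print("deadlock after:", trace)
print(len(completed), "schedules complete,", len(deadlocks), "deadlock")
```

In every deadlocking trace each thread has taken its first lock before the other thread asks for it, so two ordering constraints must hold at the same time.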

Note

Slides up to slide 76/77