Announcements

  • midterm:
    • up to chapter 4
    • might be moved to after midterm
  • assignment 1 released

Delta Debugging

Note

Started chapter 4 slides, slide 1

A real-world scenario

In July 1999, Bugzilla listed more than 370 open bug reports for Mozilla's browser:

  • these were not even simplified
  • Mozilla engineers were overwhelmed with work
  • created the Mozilla BugAThon: a call for volunteers to simplify bug reports

Simplification

Once we have reproduced a program failure, we must find out what’s relevant:

  • does the failure really depend on 10000 lines of code?
  • does the failure really require this exact schedule of events?
  • does the failure need this sequence of function calls?

Why Simplify?

  • a simplified test case is easier to explain
  • smaller test cases result in smaller states and shorter executions
  • simplified test cases subsume several duplicates

Example

<table>
  <tr>
    <td align="left" valign="top">
      <select name="op sys" multiple size="7">
        <option value="All">All</option>
        <option value="Windows 3.1">Windows 3.1</option>
        <option value="Windows 95">Windows 95</option>
        <option value="Windows 98">Windows 98</option>
        <option value="Windows ME">Windows ME</option>
        <option value="Windows 2000">Windows 2000</option>
        <option value="Windows NT">Windows NT</option>
        <option value="Mac System 7">Mac System 7</option>
        <option value="Mac System 7.5">Mac System 7.5</option>
        <option value="Mac System 7.6.1">Mac System 7.6.1</option>
        <option value="Mac System 8.0">Mac System 8.0</option>
        <option value="Mac System 8.5">Mac System 8.5</option>
        <option value="Mac System 8.6">Mac System 8.6</option>
        <option value="Mac System 9.x">Mac System 9.x</option>
        <option value="MacOS X">MacOS X</option>
        <option value="Linux">Linux</option>
        <option value="BSDI">BSDI</option>
        <option value="FreeBSD">FreeBSD</option>
        <option value="NetBSD">NetBSD</option>
        <option value="OpenBSD">OpenBSD</option>
        <option value="AIX">AIX</option>
        <option value="BeOS">BeOS</option>
        <option value="HP-UX">HP-UX</option>
        <option value="IRIX">IRIX</option>
        <option value="Neutrino">Neutrino</option>
        <option value="OpenVMS">OpenVMS</option>
        <option value="OS/2">OS/2</option>
        <option value="OSF/1">OSF/1</option>
        <option value="Solaris">Solaris</option>
        <option value="SunOS">SunOS</option>
        <option value="other">other</option>
      </select>
    </td>
 
    <td align="left" valign="top">
      <select name="priority" multiple size="7">
        <option value="--">--</option>
        <option value="P1">P1</option>
        <option value="P2">P2</option>
        <option value="P3">P3</option>
        <option value="P4">P4</option>
        <option value="P5">P5</option>
      </select>
    </td>
 
    <td align="left" valign="top">
      <select name="bug severity" multiple size="7">
        <option value="blocker">blocker</option>
        <option value="critical">critical</option>
        <option value="major">major</option>
        <option value="normal">normal</option>
        <option value="minor">minor</option>
        <option value="trivial">trivial</option>
        <option value="enhancement">enhancement</option>
      </select>
    </td>
  </tr>
</table>

File → Print → Segmentation Fault

Basic idea

Delta Debugging: an algorithm used for automated debugging, where the main goal is to minimize a test case that produces a bug

  • works in iterations
  • set up an automated test that checks whether the failure occurs or not
  • implement a strategy that uses divide and conquer (e.g. binary search)
    • divides the data set (test input) into smaller subsets and tests each one
    • if a subset still causes the failure, it becomes the new focus
    • repeat the process until the minimal failing input is found
  • Delta debugging is not limited to binary search
    • we just need a method for dividing a set of test cases into smaller subsets
  • run binary search on the lines of code to find the single line/block that is causing the issue
  • if we want to go further, we can run binary search on the individual line of code we got at the end
    • if both halves of the line pass, but the full line fails, what do we do?
<select name="priority" multiple size="7"> <!-- ❌ -->
<select name="priori                       <!-- ✅ -->
                    ty" multiple size="7"> <!-- ✅ -->
  • at this point, binary search says we’re stuck, since neither half of the input causes a failure on its own
    • divide into more and more parts (4, 8, …)
<select name="priority" multiple size="7"> <!-- ❌ -->
<select name="priori                       <!-- ✅ -->
                    ty" multiple size="7"> <!-- ✅ -->
          me="priority" multiple size="7"> <!-- ✅ -->
<select na          ty" multiple size="7"> <!-- ❌ -->
<select na                    le size="7"> <!-- ❌ -->
<select na                                 <!-- -->
...
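
The strategy above (test each part, and divide into more parts when stuck) can be sketched in Python as follows. This is a minimal sketch in the spirit of delta debugging, not necessarily the exact algorithm from the slides. The names `split`, `minimize` and `crashes` are made up for illustration, and `crashes` is a hypothetical stand-in for actually running the browser: it assumes the failure needs both `<select` and `>` to be present, which is consistent with the ✅/❌ results shown above.

```python
from typing import Callable, List


def split(data: str, n: int) -> List[str]:
    """Split data into n contiguous, non-empty chunks (requires n <= len(data))."""
    size, extra = divmod(len(data), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def minimize(data: str, fails: Callable[[str], bool]) -> str:
    """Shrink data while fails(data) stays True."""
    assert fails(data), "the starting input must reproduce the failure"
    n = 2  # granularity: number of parts the input is divided into
    while len(data) >= 2:
        chunks = split(data, n)
        # Step 1: does a single part still fail? Then it becomes the new focus.
        failing = next((c for c in chunks if fails(c)), None)
        if failing is not None:
            data, n = failing, 2
            continue
        # Step 2: does the input still fail with one part removed?
        rests = ("".join(chunks[:i] + chunks[i + 1:]) for i in range(n))
        failing = next((r for r in rests if fails(r)), None)
        if failing is not None:
            data, n = failing, max(n - 1, 2)
            continue
        # Step 3: stuck at this granularity, so divide into more parts.
        if n >= len(data):
            break  # already one character per part: nothing left to remove
        n = min(2 * n, len(data))
    return data


def crashes(html: str) -> bool:
    # Hypothetical failure condition, assumed only for this example.
    return "<select" in html and ">" in html


print(minimize('<select name="priority" multiple size="7">', crashes))  # prints <select>
```

Step 2 is what gets the search unstuck in the example: neither half fails alone, but at a finer granularity some parts can be removed while the failure remains.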

QUIZ: Impact of Input Granularity

Fill in the blanks:

  1. Slower
  2. Higher
  3. Faster
  4. Lower

Input granularity                             Finer    Coarser
Chance of finding the failing input subset    ____     ____
Progress of the search                        ____     ____

Test case

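  • we have an input without failure: $r_P$
  • we have an input with failure: $r_F$
    • the goal is to minimize $r_F$
  • we have a set of changes $c_F = \{\delta_1, \delta_2, \ldots, \delta_n\}$ such that $r_F = (\delta_1 \circ \delta_2 \circ \cdots \circ \delta_n)(r_P)$
  • each subset $c$ of $c_F$ is a test case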

Example: Browser crash

  • $r_P$: an empty HTML file (always safe)
  • $r_F$: a full HTML file that crashes the browser
  • $c_F$: the differences between them (HTML elements, tag attributes, etc.)

Now:

  • if you apply the subset $\{\delta_1\}$ to $r_P$ (the empty file), you get a test HTML file with just a table
  • if you apply another subset $\{\delta_1, \delta_2\}$, you get a test HTML file with a table and a select element
  • if you apply $c_F$ (all changes), you get the original failing input $r_F$
  • each subset like $\{\delta_1, \delta_2\}$ gives you a new HTML test case
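
To make the idea of "a subset of changes is a test case" concrete, here is a small Python sketch. The fragment list, the index numbering and the helper `apply_changes` are all hypothetical: they assume each change simply inserts one HTML fragment at its place in the file, which is only one possible way to model the differences between the empty file and the failing file.

```python
from typing import Iterable, List

# Hypothetical change set: change i inserts FRAGMENTS[i] at its place in the file.
FRAGMENTS: List[str] = [
    "<table>",
    "<select>",
    "<option>P1</option>",
    "</select>",
    "</table>",
]
PASSING = ""                  # the empty HTML file
FAILING = "".join(FRAGMENTS)  # the file with all changes applied


def apply_changes(subset: Iterable[int]) -> str:
    """Apply the chosen changes to the passing input, keeping document order."""
    return PASSING + "".join(FRAGMENTS[i] for i in sorted(set(subset)))


just_table = apply_changes({0, 4})              # a table only
table_and_select = apply_changes({0, 1, 3, 4})  # a table with a select element
original = apply_changes(range(len(FRAGMENTS))) # all changes: the failing input

print(just_table)
print(table_and_select)
print(original == FAILING)
```

With 5 changes there are already 2^5 = 32 possible subsets, each of which is a distinct test case.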

Test Cases and Minimization

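  • given a test case $c$, we want to know if the input generated by applying the changes in $c$ to the passing input $r_P$ causes the same failure as the failing input $r_F$

  • define the function (over subsets of the full change set $c_F$, with outcomes pass, fail, or unresolved):

    $test : 2^{c_F} \to \{P, F, ?\}$

    such that, for $c = \{\delta_1, \ldots, \delta_k\} \subseteq c_F$:

    $test(c) = F \iff (\delta_1 \circ \cdots \circ \delta_k)(r_P)$ is a failing input

  • Goal: find the smallest test case $c$ such that $test(c) = F$

  • a failing test case $c \subseteq c_F$ is called the global minimum of $c_F$ if:

    $\forall c' \subseteq c_F : |c'| < |c| \Rightarrow test(c') \neq F$

  • among all subsets of $c_F$, there is no smaller failing case than $c$, i.e. $c$ is absolutely minimal

    • the global minimum is the smallest set of changes which makes the program fail
  • finding the global minimum may require performing an exponential number of tests

    • if $c_F$ has size $n$, in the worst case we need $2^n$ tests to find the global minimum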
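
The exponential cost can be seen in a brute-force search that tries subsets of changes from smallest to largest. The sketch below is an illustration only: `global_minimum` and the two example test functions are hypothetical names, changes are represented by their indices, and the test function is assumed to return "F" for fail and "P" for pass.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Optional, Tuple


def global_minimum(
    n_changes: int, test: Callable[[FrozenSet[int]], str]
) -> Tuple[Optional[FrozenSet[int]], int]:
    """Return the smallest failing subset of changes and the number of tests run."""
    tests_run = 0
    for size in range(n_changes + 1):  # smallest subsets first
        for combo in combinations(range(n_changes), size):
            tests_run += 1
            if test(frozenset(combo)) == "F":
                return frozenset(combo), tests_run
    return None, tests_run


def needs_1_and_3(c: FrozenSet[int]) -> str:
    # Hypothetical failure: occurs whenever changes 1 and 3 are both applied.
    return "F" if {1, 3} <= c else "P"


def needs_all_four(c: FrozenSet[int]) -> str:
    # Hypothetical worst case: only the full set of 4 changes fails.
    return "F" if len(c) == 4 else "P"


print(global_minimum(4, needs_1_and_3))
print(global_minimum(4, needs_all_four))  # needs 2**4 = 16 tests
```

Because every smaller subset is tried before a larger one, the first failing subset found is guaranteed to be a global minimum, but the number of tests grows as 2^n in the worst case.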

Note

Ended slide 43