Announcements
- midterm:
- up to chapter 4
- might be moved to after midterm
- assignment 1 released
Delta Debugging
Note
Started chapter 4 slides, slide 1
A real-world scenario
In July 1999, Bugzilla listed more than 370 open bug reports for Mozilla's web browser:
- these were not even simplified
- Mozilla engineers were overwhelmed with work
- created the Mozilla BugAThon: a call for volunteers to simplify bug reports
Simplification
Once we have reproduced a program failure, we must find out what’s relevant:
- does the failure really depend on 10000 lines of code?
- does the failure really require this exact schedule of events?
- does the failure need this sequence of function calls?
Why Simplify?
- a simplified test case is easier to explain
- smaller test cases result in smaller states and shorter executions
- simplified test cases subsume several duplicates
Example
<table>
<tr>
<td align="left" valign="top">
<select name="op sys" multiple size="7">
<option value="All">All</option>
<option value="Windows 3.1">Windows 3.1</option>
<option value="Windows 95">Windows 95</option>
<option value="Windows 98">Windows 98</option>
<option value="Windows ME">Windows ME</option>
<option value="Windows 2000">Windows 2000</option>
<option value="Windows NT">Windows NT</option>
<option value="Mac System 7">Mac System 7</option>
<option value="Mac System 7.5">Mac System 7.5</option>
<option value="Mac System 7.6.1">Mac System 7.6.1</option>
<option value="Mac System 8.0">Mac System 8.0</option>
<option value="Mac System 8.5">Mac System 8.5</option>
<option value="Mac System 8.6">Mac System 8.6</option>
<option value="Mac System 9.x">Mac System 9.x</option>
<option value="MacOS X">MacOS X</option>
<option value="Linux">Linux</option>
<option value="BSDI">BSDI</option>
<option value="FreeBSD">FreeBSD</option>
<option value="NetBSD">NetBSD</option>
<option value="OpenBSD">OpenBSD</option>
<option value="AIX">AIX</option>
<option value="BeOS">BeOS</option>
<option value="HP-UX">HP-UX</option>
<option value="IRIX">IRIX</option>
<option value="Neutrino">Neutrino</option>
<option value="OpenVMS">OpenVMS</option>
<option value="OS/2">OS/2</option>
<option value="OSF/1">OSF/1</option>
<option value="Solaris">Solaris</option>
<option value="SunOS">SunOS</option>
<option value="other">other</option>
</select>
</td>
<td align="left" valign="top">
<select name="priority" multiple size="7">
<option value="--">--</option>
<option value="P1">P1</option>
<option value="P2">P2</option>
<option value="P3">P3</option>
<option value="P4">P4</option>
<option value="P5">P5</option>
</select>
</td>
<td align="left" valign="top">
<select name="bug severity" multiple size="7">
<option value="blocker">blocker</option>
<option value="critical">critical</option>
<option value="major">major</option>
<option value="normal">normal</option>
<option value="minor">minor</option>
<option value="trivial">trivial</option>
<option value="enhancement">enhancement</option>
</select>
</td>
</tr>
</table>
File → Print → Segmentation Fault
Solution
The problem ended up being the
<select name="priority" multiple size="7">
tag!
Algorithm: Reduce the failing input (or situation) step by step until you find the smallest subset that still triggers the problem
Step by step:
- download the web page to your machine
- using a text editor, start removing HTML from the page. every few minutes/changes, make sure it still reproduces the bug
- code not required to reproduce the bug can be safely removed
- when you’ve cut away as much as you can, you’re done
Better solution
checking line by line or item by item may take a long time! use binary search to find the failure-inducing part of the test case faster → can be automated for large test cases!
Basic idea
Delta Debugging: an algorithm used for automated debugging, where the main goal is to minimize a test case that produces a bug
- works in iterations
- set up an automated test that checks whether the failure occurs or not
- implement a strategy that uses divide and conquer (e.g. binary search)
- divides the data set (test input) into smaller subsets and tests each one
- if a subset still causes the failure, it becomes the new focus
- repeat the process until the minimal failing input is found
- Delta debugging is not limited to binary search
- just need to use a method for dividing a set of test cases into smaller subsets
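The iteration described above can be sketched in a few lines of Python. This is only an illustration, not the exact algorithm from the slides: `reduce_by_halving` and the `fails` check are hypothetical names, and the check simply pretends the bug triggers whenever a `<select` tag is present.

```python
def reduce_by_halving(lines, fails):
    """Repeatedly keep whichever half of `lines` still triggers the failure."""
    current = list(lines)
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if fails(first):
            current = first       # the failure is in the first half
        elif fails(second):
            current = second      # the failure is in the second half
        else:
            break                 # neither half fails alone: plain halving is stuck
    return current


# Hypothetical automated test: "fails" whenever a <select tag is present.
def fails(lines):
    return any("<select" in line for line in lines)


page = [
    "<table>", "<tr>", "<td>",
    '<select name="priority" multiple size="7">',
    "</td>", "</tr>", "</table>",
]
reduced = reduce_by_halving(page, fails)
print(reduced)  # ['<select name="priority" multiple size="7">']
```

When neither half fails on its own, the loop gives up and returns the current input unchanged, which is the situation a finer split has to handle.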
Binary search
- run binary search on the lines of code → find the single code line/block that is causing the issue
- if we want to go further, we can run binary search on the individual line of code we got at the end
- if both halves of the line pass, but the full line fails, what do we do?
<select name="priority" multiple size="7"> <!-- ❌ -->
<select name="priori <!-- ✅ -->
ty" multiple size="7"> <!-- ✅ -->
- at this point, binary search says we’re stuck, since neither half of the input causes a failure on its own
- divide into more and more parts (4, 8, …)
<select name="priority" multiple size="7"> <!-- ❌ -->
<select name="priori <!-- ✅ -->
ty" multiple size="7"> <!-- ✅ -->
me="priority" multiple size="7"> <!-- ✅ -->
<select na ty" multiple size="7"> <!-- ❌ -->
<select na le size="7"> <!-- ❌ -->
<select na <!-- ✅ -->
...
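A minimal Python sketch of this "split into more parts when stuck" idea, in the spirit of Zeller's ddmin algorithm. It is an assumption-laden illustration: `split`, `minimize`, and `fails` are hypothetical names, there is no caching of repeated tests, and the failure check (the text must still contain `<select` and `>`) is invented so that it agrees with the ✅/❌ marks above.

```python
def split(items, n):
    """Split `items` into n contiguous, nearly equal, non-empty parts (needs n <= len(items))."""
    k, r = divmod(len(items), n)
    parts, start = [], 0
    for i in range(n):
        end = start + k + (1 if i < r else 0)
        parts.append(items[start:end])
        start = end
    return parts


def minimize(items, fails):
    """Shrink `items` while `fails` stays True; double the number of parts when stuck."""
    n = 2
    while len(items) >= 2:
        parts = split(items, n)
        # The input with one part removed, for each part.
        complements = [sum(parts[:i] + parts[i + 1:], []) for i in range(n)]
        candidates = [(p, 2) for p in parts] + [(c, max(n - 1, 2)) for c in complements]
        for candidate, next_n in candidates:
            if fails(candidate):
                items, n = candidate, next_n   # smaller failing input becomes the new focus
                break
        else:
            if n >= len(items):
                break                          # single-element granularity reached: done
            n = min(2 * n, len(items))         # stuck: use finer granularity (4, 8, ...)
    return items


# Hypothetical failure check chosen to match the marks in the example above.
def fails(chars):
    text = "".join(chars)
    return "<select" in text and ">" in text


line = '<select name="priority" multiple size="7">'
result = "".join(minimize(list(line), fails))
print(result)  # <select>
```

Testing the complements (the input with one part removed) is what lets the search make progress when no single part fails alone.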
QUIZ: Impact of Input Granularity
Fill in the blanks:
1. Slower
2. Higher
3. Faster
4. Lower
| Input granularity | Finer | Coarser |
| --- | --- | --- |
| chance of finding the failing input subset | | |
| progress of the search | | |
Solution
| Input granularity | Finer | Coarser |
| --- | --- | --- |
| chance of finding the failing input subset | 2. Higher | 4. Lower |
| progress of the search | 1. Slower | 3. Faster |
Test case
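- we have an input without failure: $r_P$
- we have an input with failure: $r_F$
- the goal is to minimize $r_F$
- we have a set of changes $c_F = \{\delta_1, \delta_2, \dots, \delta_n\}$ such that $r_F = (\delta_1 \circ \delta_2 \circ \dots \circ \delta_n)(r_P)$
- each subset $c$ of $c_F$ is a test case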
Example: Browser crash
- $r_P$: an empty HTML file (always safe)
- $r_F$: a full HTML file that crashes
- $c_F$: the differences between them (HTML elements, tag attributes, etc.)
Now:
- if you apply the subset $\{\delta_1\}$ to $r_P$ (the empty file), you get a test HTML file with just a table
- if you apply another subset $\{\delta_1, \delta_2\}$, you get a test HTML file with a table and a select element
- if you apply $c_F$ (all changes), you get the original failing input $r_F$
- each subset like $\{\delta_1, \delta_2\}$ gives you a new HTML test case
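The idea that every subset of the changes yields its own test input can be made concrete with a short Python sketch. All names here (`passing_input`, `changes`, `apply_changes`) are hypothetical, and each "change" is simplified to adding one HTML element to an empty page.

```python
from itertools import combinations

# Hypothetical stand-ins: the passing input is an empty page,
# and each elementary change adds one element to it.
passing_input = []
changes = ["<table>", "<select>", "<option>"]


def apply_changes(base, subset):
    """Build a test input by applying the chosen changes to `base`, in their original order."""
    return base + [change for change in changes if change in subset]


# Every subset of the changes is one test case.
test_cases = [set(s) for k in range(len(changes) + 1) for s in combinations(changes, k)]

print(apply_changes(passing_input, {"<table>"}))              # ['<table>']
print(apply_changes(passing_input, {"<table>", "<select>"}))  # ['<table>', '<select>']
print(apply_changes(passing_input, set(changes)) == changes)  # True
print(len(test_cases))                                        # 8
```

With 3 changes there are already 8 possible test cases, which hints at why exhaustively trying all subsets does not scale.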
Test Cases and Minimization
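- given a test case $c$, we want to know if the input generated by applying the changes in $c$ to the passing input $r_P$ causes the same failure as the failing input $r_F$
- define the function $\text{test}: 2^{c_F} \to \{P, F, ?\}$ such that, for $c = \{\delta_1, \dots, \delta_k\} \subseteq c_F$: $\text{test}(c) = F$ if and only if $(\delta_1 \circ \dots \circ \delta_k)(r_P)$ is a failing input
- Goal: find the smallest test case $c$ such that $\text{test}(c) = F$
- a failing test case $c \subseteq c_F$ is called the global minimum of $c_F$ if: $\forall c' \subseteq c_F: |c'| < |c| \Rightarrow \text{test}(c') \neq F$
- among all subsets of $c_F$, there is no smaller failing case than $c$ → $c$ is absolutely minimal
- the global minimum is the smallest set of changes which makes the program fail
- finding the global minimum may require performing an exponential number of tests
- if $c_F$ has size $n$, in the worst case we need $2^n$ tests to find the global minimum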
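To see where the exponential cost comes from, here is a brute-force Python sketch that tries subsets from smallest to largest and counts the tests it runs. The names `global_minimum` and `test`, and the `"P"`/`"F"` return values, are assumptions made for this illustration.

```python
from itertools import combinations


def global_minimum(changes, test):
    """Try subsets from smallest to largest; return the first failing one and the number of tests run."""
    runs = 0
    for k in range(len(changes) + 1):
        for subset in combinations(changes, k):
            runs += 1
            if test(subset) == "F":
                return set(subset), runs
    return None, runs


changes = list(range(1, 9))  # 8 hypothetical changes, numbered 1..8

# Hypothetical failure that needs only changes 3 and 6 together.
smallest, _ = global_minimum(changes, lambda c: "F" if {3, 6} <= set(c) else "P")
print(smallest)  # {3, 6}

# Worst case: the failure needs every change, so all 2**8 subsets get tested.
_, worst_runs = global_minimum(changes, lambda c: "F" if len(c) == len(changes) else "P")
print(worst_runs)  # 256
```

Because subsets are visited in order of increasing size, the first failing subset found is a global minimum, but reaching it can take up to $2^n$ test runs.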
Note
Ended slide 43