How good is your test strategy?
- test suites can be evaluated by:
- Code coverage metrics: measure how much code is exercised
- Mutation analysis (mutation testing): evaluate test effectiveness by introducing faults
Code coverage
- metric: fraction of program code reached by the test suite
- tools: EclEmma, Cobertura, Hansel, NoUnit, CoView
- types of coverage:
- function coverage: which functions were called?
- statement coverage: which statements executed?
- branch coverage: which branches taken?
- others: line, condition, path, basic block
Quiz: code coverage metrics
What are the statement coverage and branch coverage of the test suite below? Give arguments for another call that would improve coverage.
Test Suite: { foo(1, 0) }
int foo(int x, int y) {
    int z = 0;
    if (x <= y) {
        z = x;
    } else {
        z = y;
    }
    return z;
}
Solution
statement coverage: 80% (4 of 5 statements executed; z = x is never reached)
branch coverage: 50% (only the else branch is taken)
add test foo(0, 1) to cover the other branch → now 100% coverage
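The coverage numbers in this solution can be reproduced with a minimal sketch. It assumes a Python port of foo that is instrumented by hand; the names STATEMENTS, BRANCHES, hit_statements, hit_branches and coverage are illustrative only, and real coverage tools instrument the code automatically.

```python
# Python port of the quiz function, instrumented by hand (illustrative only).
# STATEMENTS and BRANCHES list everything that could be covered.
STATEMENTS = {"z = 0", "if x <= y", "z = x", "z = y", "return z"}
BRANCHES = {"x <= y is true", "x <= y is false"}

hit_statements = set()
hit_branches = set()


def foo(x, y):
    hit_statements.add("z = 0")
    z = 0
    hit_statements.add("if x <= y")
    if x <= y:
        hit_branches.add("x <= y is true")
        hit_statements.add("z = x")
        z = x
    else:
        hit_branches.add("x <= y is false")
        hit_statements.add("z = y")
        z = y
    hit_statements.add("return z")
    return z


def coverage():
    """Return (statement coverage, branch coverage) as fractions."""
    return (len(hit_statements) / len(STATEMENTS),
            len(hit_branches) / len(BRANCHES))


foo(1, 0)
first = coverage()   # after the original test suite: (0.8, 0.5)
foo(0, 1)
second = coverage()  # after adding foo(0, 1): (1.0, 1.0)
print(first, second)
```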
Mutation testing
- introduce small changes (“mutants”) into source code
- run test suite to see if changes are detected
- if output differs → mutant is killed
- if output same → mutant survives (weak test suite)
- also called fault-based testing (white-box)
Steps
- introduce faults → generate mutants
- run test suite on original + mutants
- compare outputs
- mutant killed if different outputs
- surviving mutants indicate weak test cases
Example of mutant
Original: if (x > y) print "Hello" else print "Hi"
Mutant: if (x < y) print "Hello" else print "Hi"
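The steps and the example mutant above can be sketched in Python. This is an illustration under assumptions: the functions return the string instead of printing it so outputs can be compared, and the names original, mutant and is_killed are hypothetical.

```python
def original(x, y):
    return "Hello" if x > y else "Hi"


def mutant(x, y):
    # Relational operator mutated: > became <
    return "Hello" if x < y else "Hi"


def is_killed(mutant_fn, original_fn, inputs):
    """A mutant is killed if any test input makes its output differ."""
    return any(mutant_fn(*args) != original_fn(*args) for args in inputs)


weak_inputs = [(1, 1)]            # x == y: both versions return "Hi"
strong_inputs = [(1, 1), (2, 1)]  # (2, 1) separates > from <

print(is_killed(mutant, original, weak_inputs))    # False: mutant survives
print(is_killed(mutant, original, strong_inputs))  # True: mutant is killed
```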
Types of mutations
- Arithmetic mutations: change math ops (+ → -, * → /)
- Boolean mutations: change logical ops (&& → ||, == → !=)
- Statement mutations: remove/replace statements
- Variable mutations: rename/replace variables
- Method call mutations: modify calls or parameters
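As one way to generate an arithmetic mutation automatically, the sketch below uses Python's standard ast module to replace every + with -. The class name SwapAddSub and the sample function add are assumptions made for illustration; dedicated mutation tools apply many such operators.

```python
import ast


class SwapAddSub(ast.NodeTransformer):
    """Arithmetic mutation: replace every + with -."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # mutate nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


source = "def add(a, b):\n    return a + b\n"

# Parse the original source, mutate the tree, and compile the mutant.
tree = ast.parse(source)
mutated_tree = ast.fix_missing_locations(SwapAddSub().visit(tree))

namespace = {}
exec(compile(mutated_tree, "<mutant>", "exec"), namespace)
mutant_add = namespace["add"]

print(mutant_add(2, 3))  # -1, because the mutant computes 2 - 3
```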
Quiz: mutation type
The following mutation is an example of:
- Arithmetic mutations
- Boolean mutations
- Variable mutations
- Method call mutations
# original
print(x.upper())
# mutated
print(x.lower())
Solution
method call mutation (different method invoked)
Mutation score
- mutation score = killed_mutants / total_mutants
- higher score → better test suite
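A small worked example of the formula, with made-up counts (3 of 4 mutants killed); the function name mutation_score is hypothetical.

```python
def mutation_score(killed_mutants, total_mutants):
    """Fraction of mutants detected by the test suite."""
    if total_mutants == 0:
        raise ValueError("no mutants were generated")
    return killed_mutants / total_mutants


print(mutation_score(3, 4))  # 0.75
```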
Quiz: mutation analysis
Which test suite is better?
int foo(int x, int y) {
    int z = 0;
    if (x <= y) { z = x; }
    else { z = y; }
    return z;
}
- test suite 1: foo(0,1)=0; foo(0,0)=0
- test suite 2: foo(0,1)=0; foo(0,0)=0; foo(1,0)=0
part 1: mutants
Solution
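Check the boxes indicating a passed test.
- Test 1 assert: foo(0, 1) == 0
- Test 2 assert: foo(0, 0) == 0
- Mutant 1 (x <= y → x > y): Test 1 ❌, Test 2 ✅ → killed
- Mutant 2 (x <= y → x != y): Test 1 ✅, Test 2 ✅ → survives
- these two tests kill only mutant 1 (mutation score 1/2) → not adequate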
part 2: give failing case for mutant 2
What is a failing test case that kills mutant 2 but passes the original?
Solution
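foo(1,0) == 0
original returns y = 0 (passes); mutant 2 takes the x != y branch and returns x = 1 (fails) → mutant 2 is killed
note: foo(1,1) == 1 would not work, because original and mutant 2 both return 1 when x == y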
Solution
test suite 2 is better: its extra test foo(1,0)=0 kills mutant 2 (x <= y → x != y), which passes every test in test suite 1
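The comparison can be checked mechanically. The sketch below ports foo and the two mutants from part 1 to Python and computes the mutation score of each suite; the helper names passes and score are hypothetical.

```python
def foo(x, y):
    return x if x <= y else y


def mutant_1(x, y):  # x <= y  ->  x > y
    return x if x > y else y


def mutant_2(x, y):  # x <= y  ->  x != y
    return x if x != y else y


# Each test is ((x, y), expected result).
suite_1 = [((0, 1), 0), ((0, 0), 0)]
suite_2 = suite_1 + [((1, 0), 0)]


def passes(fn, suite):
    """True if fn satisfies every assertion in the suite."""
    return all(fn(*args) == expected for args, expected in suite)


def score(suite, mutants):
    """Mutation score: killed mutants / total mutants."""
    killed = sum(not passes(m, suite) for m in mutants)
    return killed / len(mutants)


mutants = [mutant_1, mutant_2]
assert passes(foo, suite_1) and passes(foo, suite_2)  # original passes both

print(score(suite_1, mutants))  # 0.5: only mutant 1 is killed
print(score(suite_2, mutants))  # 1.0: foo(1, 0) also kills mutant 2
```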
Quiz: mutation score (factorial)
Explain how mutation testing could be applied to the following code:
def factorial(n):
    if n == 0:
        return 1
    else:
        return n * factorial(n-1)
Solution
mutate operators, base case, recursive call → run tests
mutation score = proportion of mutants killed by tests
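One possible way to carry this out is sketched below. The three mutants (base case, operator, recursive call) are hand-written examples chosen for illustration, and the test inputs are assumptions rather than part of the quiz.

```python
def factorial(n):
    if n == 0:
        return 1
    return n * factorial(n - 1)


def mutant_base(n):      # base case: return 1 -> return 0
    if n == 0:
        return 0
    return n * mutant_base(n - 1)


def mutant_operator(n):  # operator: * -> +
    if n == 0:
        return 1
    return n + mutant_operator(n - 1)


def mutant_call(n):      # recursive call replaced by the constant 1
    if n == 0:
        return 1
    return n * 1


def killed(mutant, inputs):
    """A mutant is killed if some input gives a result unlike factorial's."""
    return any(mutant(n) != factorial(n) for n in inputs)


def mutation_score(inputs, mutants):
    return sum(killed(m, inputs) for m in mutants) / len(mutants)


mutants = [mutant_base, mutant_operator, mutant_call]
weak = mutation_score([0, 1, 2], mutants)       # mutant_call survives
strong = mutation_score([0, 1, 2, 4], mutants)  # n = 4 kills it as well
print(weak, strong)
```

With inputs 0, 1 and 2 the suite kills two of the three mutants, because mutant_call agrees with factorial on all of them; adding n = 4 raises the score to 1.0.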
Types of software testing
- unit testing: each module individually (white-box, by developers)
- integration testing: focus on errors at module interfaces
- system testing: entire system tested
- alpha: in-house test team
- beta: selected external users
- acceptance: by customer for delivery decision
- performance: system under load/stress
Unit testing in python
import unittest

# Calculator is assumed to be defined or imported elsewhere; it is not shown in these notes.
class CalculatorTest(unittest.TestCase):
    def setUp(self):
        self.calculator = Calculator()

    def test_addition(self):
        self.assertEqual(self.calculator.add(2, 3), 5)

    def test_subtraction(self):
        self.assertEqual(self.calculator.subtract(5, 3), 2)

    def test_multiplication(self):
        self.assertEqual(self.calculator.multiply(2, 3), 6)

    def test_division(self):
        self.assertEqual(self.calculator.divide(6, 3), 2)

if __name__ == '__main__':
    unittest.main()
- setUp: creates the test object before each test
- assertEqual(actual, expected): ensures correctness
- other assertions: assertTrue, assertFalse, assertIs, assertIsNone, assertIn, assertIsInstance
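A short sketch of the other assertions listed above, using only the standard unittest module. The test class AssertionExamples and its sample data are made up for illustration.

```python
import unittest


class AssertionExamples(unittest.TestCase):
    def test_assertions(self):
        values = [1, 2, 3]
        alias = values
        self.assertTrue(len(values) == 3)
        self.assertFalse(4 in values)
        self.assertIs(alias, values)          # same object, not just equal
        self.assertIsNone({}.get("missing"))  # dict.get returns None here
        self.assertIn(2, values)
        self.assertIsInstance(values, list)


# Run the test case programmatically and keep the result object.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionExamples)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```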
Unit test example: BankAccount
import unittest

class BankAccount:
    def __init__(self, balance=0): self.balance = balance
    def deposit(self, amount): self.balance += amount; return self.balance
    def withdraw(self, amount):
        # Refuse the withdrawal if it would overdraw the account
        if amount > self.balance: return "Insufficient funds"
        self.balance -= amount; return self.balance
    def check_balance(self): return self.balance

class BankAccountTest(unittest.TestCase):
    def setUp(self): self.account = BankAccount()
    def test_deposit(self):
        self.account.deposit(100)
        self.assertEqual(self.account.check_balance(), 100)
    def test_withdraw(self):
        self.account.deposit(100)
        self.account.withdraw(50)
        self.assertEqual(self.account.check_balance(), 50)
    def test_withdraw_insufficient(self):
        self.account.deposit(50)
        self.assertEqual(self.account.withdraw(100), "Insufficient funds")
- code coverage: 100% (all branches covered)
Who tests the software?
- developers: understand system, but biased to deliver
- independent testers: external perspective, try to break system