[Translation] Mutation Testing: Testing Tests

Writing tests should give us confidence that the code works correctly. We often rely on code coverage as the measure, and once it reaches 100% we conclude that the solution is correct. But can we be sure? Perhaps there is a tool that gives more accurate feedback.

Mutation Testing


Mutation testing means making small changes to the code and seeing how they affect the tests. If the tests still pass after a change, the affected code is not tested well enough. Of course, it depends on what exactly we change: there is no point in testing every trivial change, such as indentation or variable names, because the tests should still pass after those. That is why mutation testing uses so-called mutators, which replace code fragments with others in a way that makes sense. We will look at them in more detail below.

Sometimes we do this kind of testing by hand, checking whether the tests break when we change something in the code. If we refactor "half of the system" and the tests are still green, we can say right away that they are bad. And if someone did that and the tests turned out to be good, congratulations!

The Infection Framework


Today the most popular mutation testing framework for PHP is Infection. It supports PHPUnit and PHPSpec and requires PHP 7.1+ together with Xdebug or phpdbg.

First launch and configuration


On first launch, an interactive configurator appears and creates a settings file, infection.json.dist. It looks like this:

    {
        "timeout": 10,
        "source": {
            "directories": [
                "src"
            ]
        },
        "logs": {
            "text": "infection.log",
            "perMutator": "per-mutator.md"
        },
        "mutators": {
            "@default": true
        }
    }

The timeout option should be set to the maximum time a single test is allowed to run. In source we list the directories whose code will be mutated; exclusions can be set there as well. Under logs, the text option names the file that collects details only about the mutants our tests failed to handle, which is what interests us most. The perMutator option saves a report on the mutators that were used. See the documentation for more details.

Example


    final class Calculator
    {
        public function add(int $a, int $b): int
        {
            return $a + $b;
        }
    }

Suppose we have the above class. Let's write a test in PHPUnit:

    use PHPUnit\Framework\TestCase;

    final class CalculatorTest extends TestCase
    {
        /**
         * @var Calculator
         */
        private $calculator;

        public function setUp(): void
        {
            $this->calculator = new Calculator();
        }

        /**
         * @dataProvider additionProvider
         */
        public function testAdd(int $a, int $b, int $expected): void
        {
            $this->assertEquals($expected, $this->calculator->add($a, $b));
        }

        public function additionProvider(): array
        {
            return [
                [0, 0, 0],
                [6, 4, 10],
                [-1, -2, -3],
                [-2, 2, 0]
            ];
        }
    }

Of course, this test should be written before we implement the add() method. Running ./vendor/bin/phpunit gives:

  PHPUnit 8.2.2 by Sebastian Bergmann and contributors.

 .... 4/4 (100%)

 Time: 39 ms, Memory: 4.00 MB

 OK (4 tests, 4 assertions)  

Now let's run ./vendor/bin/infection:

 You are running Infection with Xdebug enabled.

     ____      ____          __  _
    /  _/___  / __/__  _____/ /_(_)___  ____
    / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
  _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
 /___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/

 Running initial test suite ...

 PHPUnit version: 8.2.2

  9 [============================] 1 sec

 Generate mutants ...

 Processing source code files: 1/1
 Creating mutated files and processes: 0/2
 Creating mutated files and processes: 2/2
 .: killed, M: escaped, S: uncovered, E: fatal error, T: timed out

 .. (2/2)

 2 mutations were generated:
  2 mutants were killed
  0 mutants were not covered by tests
  0 covered mutants were not detected
  0 errors were encountered
  0 time outs were encountered

 Metrics:
  Mutation Score Indicator (MSI): 100%
  Mutation Code Coverage: 100%
  Covered Code MSI: 100%

 Please note that some mutants will inevitably be harmless (i.e. false positives).

 Time: 1s.  Memory: 10.00MB  

According to Infection, our tests are accurate. In the file per-mutator.md we can see which mutations were used:

  # Effects per Mutator

 |  Mutator |  Mutations |  Killed |  Escaped |  Errors |  Timed Out |  MSI |  Covered MSI |
 |  ------- |  --------- |  ------ |  ------- | ------- |  --------- |  --- |  ----------- |
 |  Plus |  1 |  1 |  0 |  0 |  0 |  100 |  100 |
 |  PublicVisibility |  1 |  1 |  0 |  0 |  0 |  100 |  100 |  

The Plus mutator simply changes the plus sign to a minus, which should break the tests. The PublicVisibility mutator changes the access modifier of the method, which should also break the tests, and in this case it does.

Now add a more complicated method.

    /**
     * @param int[] $numbers
     */
    public function findGreaterThan(array $numbers, int $threshold): array
    {
        return \array_values(\array_filter($numbers, static function (int $number) use ($threshold) {
            return $number > $threshold;
        }));
    }


    /**
     * @dataProvider findGreaterThanProvider
     */
    public function testFindGreaterThan(array $numbers, int $threshold, array $expected): void
    {
        $this->assertEquals($expected, $this->calculator->findGreaterThan($numbers, $threshold));
    }

    public function findGreaterThanProvider(): array
    {
        return [
            [[1, 2, 3], -1, [1, 2, 3]],
            [[-2, -3, -4], 0, []]
        ];
    }

After execution, we will see the following result:

 You are running Infection with Xdebug enabled.

     ____      ____          __  _
    /  _/___  / __/__  _____/ /_(_)___  ____
    / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
  _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
 /___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/

 Running initial test suite ...

 PHPUnit version: 8.2.2

  11 [============================] < 1 sec

 Generate mutants ...

 Processing source code files: 1/1
 Creating mutated files and processes: 0/7
 Creating mutated files and processes: 7/7
 .: killed, M: escaped, S: uncovered, E: fatal error, T: timed out

 ..M..M.  (7/7)

 7 mutations were generated:
  5 mutants were killed
  0 mutants were not covered by tests
  2 covered mutants were not detected
  0 errors were encountered
  0 time outs were encountered

 Metrics:
  Mutation Score Indicator (MSI): 71%
  Mutation Code Coverage: 100%
  Covered Code MSI: 71%

 Please note that some mutants will inevitably be harmless (i.e. false positives).

 Time: 1s.  Memory: 10.00MB  

Something is wrong with our tests. First, let's check the infection.log file:

    Escaped mutants:
    ================


    1) /home/sarven/projects/infection-playground/infection-playground/src/Calculator.php:19 [M] UnwrapArrayValues

    --- Original
    +++ New
    @@ @@
          */
         public function findGreaterThan(array $numbers, int $threshold): array
         {
    -        return \array_values(\array_filter($numbers, static function (int $number) use ($threshold) {
    +        return \array_filter($numbers, static function (int $number) use ($threshold) {
                 return $number > $threshold;
    -        }));
    +        });
         }


    2) /home/sarven/projects/infection-playground/infection-playground/src/Calculator.php:20 [M] GreaterThan

    --- Original
    +++ New
    @@ @@
         public function findGreaterThan(array $numbers, int $threshold): array
         {
             return \array_values(\array_filter($numbers, static function (int $number) use ($threshold) {
    -            return $number > $threshold;
    +            return $number >= $threshold;
             }));
         }

    Timed Out mutants:
    ==================

    Not Covered mutants:
    ====================

The first escaped mutant concerns the array_values call. It is there to reset the keys, because array_filter returns the values with the keys they had in the original array. Our tests contain no case where array_values matters: in both data sets either every element passes the filter or none does, so the result has the same keys with or without it.
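
The key behavior is easy to check in isolation. The following is a minimal sketch with made-up input values (not the project's test code), showing that array_filter keeps the original keys and that array_values reindexes them:

```php
<?php

declare(strict_types=1);

// Keep only the numbers greater than 4, mirroring the filter in findGreaterThan().
$filtered = array_filter([4, 5, 6], static function (int $number): bool {
    return $number > 4;
});

// array_filter() preserves the original keys, so the result uses keys 1 and 2.
$keysAfterFilter = array_keys($filtered);

// array_values() reindexes the result starting from 0.
$reindexed = array_values($filtered);

// Arrays compare equal with == only when their key/value pairs match.
$equalWithoutReindex = ($filtered == [5, 6]);
$equalWithReindex = ($reindexed == [5, 6]);
```

With input like this, removing array_values changes the result, so a data set in which only some elements pass the filter would catch the UnwrapArrayValues mutation.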

The second problem concerns boundary cases. The comparison uses the > sign, but we do not test any boundary values, so replacing it with >= does not break the tests. Adding just one data set is enough:

    public function findGreaterThanProvider(): array
    {
        return [
            [[1, 2, 3], -1, [1, 2, 3]],
            [[-2, -3, -4], 0, []],
            [[4, 5, 6], 4, [5, 6]]
        ];
    }

And now Infection is happy with everything:

 You are running Infection with Xdebug enabled.

     ____      ____          __  _
    /  _/___  / __/__  _____/ /_(_)___  ____
    / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
  _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
 /___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/

 Running initial test suite ...

 PHPUnit version: 8.2.2

  12 [============================] < 1 sec

 Generate mutants ...

 Processing source code files: 1/1
 Creating mutated files and processes: 0/7
 Creating mutated files and processes: 7/7
 .: killed, M: escaped, S: uncovered, E: fatal error, T: timed out

 ....... (7/7)

 7 mutations were generated:
  7 mutants were killed
  0 mutants were not covered by tests
  0 covered mutants were not detected
  0 errors were encountered
  0 time outs were encountered

 Metrics:
  Mutation Score Indicator (MSI): 100%
  Mutation Code Coverage: 100%
  Covered Code MSI: 100%

 Please note that some mutants will inevitably be harmless (i.e. false positives).

 Time: 1s.  Memory: 10.00MB  

Add a subtract method to the Calculator class, but without a separate test in PHPUnit:

    public function subtract(int $a, int $b): int
    {
        return $a - $b;
    }

After running Infection, we see:

 You are running Infection with Xdebug enabled.

     ____      ____          __  _
    /  _/___  / __/__  _____/ /_(_)___  ____
    / // __ \/ /_/ _ \/ ___/ __/ / __ \/ __ \
  _/ // / / / __/  __/ /__/ /_/ / /_/ / / / /
 /___/_/ /_/_/  \___/\___/\__/_/\____/_/ /_/

 Running initial test suite ...

 PHPUnit version: 8.2.2

  11 [============================] < 1 sec

 Generate mutants ...

 Processing source code files: 1/1
 Creating mutated files and processes: 0/9
 Creating mutated files and processes: 9/9
 .: killed, M: escaped, S: uncovered, E: fatal error, T: timed out

 .......SS (9/9)

 9 mutations were generated:
  7 mutants were killed
  2 mutants were not covered by tests
  0 covered mutants were not detected
  0 errors were encountered
  0 time outs were encountered

 Metrics:
  Mutation Score Indicator (MSI): 77%
  Mutation Code Coverage: 77%
  Covered Code MSI: 100%

 Please note that some mutants will inevitably be harmless (i.e. false positives).

 Time: 1s.  Memory: 10.00MB  

This time the tool reported two mutations that are not covered by tests.

    Escaped mutants:
    ================

    Timed Out mutants:
    ==================

    Not Covered mutants:
    ====================


    1) /home/sarven/projects/infection-playground/infection-playground/src/Calculator.php:24 [M] PublicVisibility

    --- Original
    +++ New
    @@ @@
                 return $number > $threshold;
             }));
         }
    -    public function subtract(int $a, int $b): int
    +    protected function subtract(int $a, int $b): int
         {
             return $a - $b;
         }
     }


    2) /home/sarven/projects/infection-playground/infection-playground/src/Calculator.php:26 [M] Minus

    --- Original
    +++ New
    @@ @@
         }
         public function subtract(int $a, int $b): int
         {
    -        return $a - $b;
    +        return $a + $b;
         }

Metrics


After each execution, the tool returns three metrics:

  Metrics:
  Mutation Score Indicator (MSI): 47%
  Mutation Code Coverage: 67%
  Covered Code MSI: 70%
  

Mutation Score Indicator - the proportion of mutations detected by tests.

The metric is calculated like this:

  TotalDefeatedMutants = KilledCount + TimedOutCount + ErrorCount;

 MSI = (TotalDefeatedMutants/TotalMutantsCount) * 100;
  

Mutation Code Coverage - the proportion of mutations that are covered by tests.

The metric is calculated like this:

  TotalCoveredByTestsMutants = TotalMutantsCount - NotCoveredByTestsCount;

 CoveredRate = (TotalCoveredByTestsMutants/TotalMutantsCount) * 100;
  

Covered Code Mutation Score Indicator - determines the effectiveness of tests only for code that is covered by tests.

The metric is calculated like this:

  TotalCoveredByTestsMutants = TotalMutantsCount - NotCoveredByTestsCount;
 TotalDefeatedMutants = KilledCount + TimedOutCount + ErrorCount;

 CoveredCodeMSI = (TotalDefeatedMutants/TotalCoveredByTestsMutants) * 100;
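
To make the formulas concrete, here is a small sketch that computes all three metrics from raw counts. The function name mutationMetrics, its parameters, and the rounding down to whole percent are assumptions made for illustration (the rounding is chosen to match the 77% reported for the run with the untested subtract method); this is not Infection's own code. It also assumes that the total number of mutants is the sum of the five categories.

```php
<?php

declare(strict_types=1);

/**
 * Computes MSI, Mutation Code Coverage and Covered Code MSI from mutant counts.
 * Hypothetical helper: rounding down to whole percent is an assumption.
 *
 * @return array<string, int>
 */
function mutationMetrics(int $killed, int $timedOut, int $errors, int $escaped, int $notCovered): array
{
    // Assumption: every mutant falls into exactly one of the five categories.
    $total = $killed + $timedOut + $errors + $escaped + $notCovered;
    $defeated = $killed + $timedOut + $errors;
    $covered = $total - $notCovered;

    // Returns 0 instead of dividing by zero when there is nothing to measure.
    $percent = static function (int $part, int $whole): int {
        return $whole === 0 ? 0 : (int) floor($part / $whole * 100);
    };

    return [
        'msi' => $percent($defeated, $total),
        'coverage' => $percent($covered, $total),
        'coveredMsi' => $percent($defeated, $covered),
    ];
}

// Counts from the run with the untested subtract() method: 7 killed, 2 not covered.
$uncoveredRun = mutationMetrics(7, 0, 0, 0, 2);

// Counts from the run with two escaped mutants: 5 killed, 2 escaped.
$escapedRun = mutationMetrics(5, 0, 0, 2, 0);
```

Comparing the two results shows how the metrics separate the problems: uncovered code lowers MSI and Mutation Code Coverage but leaves Covered Code MSI untouched, while escaped mutants lower both MSI values and leave coverage at 100%.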
  

Use in more complex projects


The example above has only one class, so we ran Infection without any parameters. In everyday work on real projects, the --filter parameter is useful: it lets you specify the set of files to which mutations should be applied.

    ./vendor/bin/infection --filter=Calculator.php
  

False positives


Some mutations do not change the behavior of the code, yet Infection still reports an MSI below 100%. We cannot always do anything about it, so we have to live with such cases. Here is an example:

    public function calcNumber(int $a): int
    {
        return $a / $this->getRatio();
    }

    private function getRatio(): int
    {
        return 1;
    }

Of course, the getRatio method makes little sense here; in a real project it would probably contain some calculation, but its result could still be 1. Infection reports:

    Escaped mutants:
    ================


    1) /home/sarven/projects/infection-playground/infection-playground/src/Calculator.php:26 [M] Division

    --- Original
    +++ New
    @@ @@
         }
         public function calcNumber(int $a): int
         {
    -        return $a / $this->getRatio();
    +        return $a * $this->getRatio();
         }
         private function getRatio(): int

As we know, multiplying by 1 and dividing by 1 both return the original number. So this mutation should not break the tests, and although Infection is unhappy with the accuracy of our tests, everything is in order.
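
A quick way to convince yourself that a mutant is harmless is to compare the original and the mutated expression on a few inputs. The sketch below is only an illustration: the closures and the sample values are made up, and $ratio stands in for the result of getRatio().

```php
<?php

declare(strict_types=1);

// Stand-in for the value returned by getRatio().
$ratio = 1;

// The original expression from calcNumber().
$original = static function (int $a) use ($ratio): int {
    return $a / $ratio;
};

// The expression produced by the Division mutation.
$mutant = static function (int $a) use ($ratio): int {
    return $a * $ratio;
};

// Count the inputs for which the two versions disagree.
$differences = 0;
foreach ([-7, 0, 1, 42, 1000] as $input) {
    if ($original($input) !== $mutant($input)) {
        $differences++;
    }
}
```

Since both versions agree on every input while the ratio is 1, no test can tell them apart, which is exactly why this mutant escapes.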

Optimization for large projects


On large projects, running Infection can take a very long time. You can speed it up in CI by processing only the modified files. For more information, see the documentation: https://infection.github.io/guide/how-to.html

In addition, the tests for the mutated code can be run in parallel. This is only possible if all tests are independent of each other, which good tests should be anyway. To enable it, use the --threads parameter:

    ./vendor/bin/infection --threads=4

How does it work?


Infection works on an AST (abstract syntax tree), which represents the code as an abstract data structure. To build it, Infection uses php-parser, a parser written by one of the PHP developers.

In simplified form, the tool works like this:

  1. Generate an AST from the code.
  2. Apply the appropriate mutations (the full list is in the Infection documentation).
  3. Generate modified code from the AST.
  4. Run the tests against the modified code.
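
The four steps can be imitated in a few lines of plain PHP. This is a toy sketch, not how Infection is implemented: it uses the built-in tokenizer (token_get_all) in place of a real AST, and the function names mutatePlusToMinus and testsPass, as well as the sample source string, are hypothetical.

```php
<?php

declare(strict_types=1);

// Hypothetical code under test, kept as a string so that it can be mutated.
$source = '<?php return static function (int $a, int $b): int { return $a + $b; };';

/**
 * Steps 1-3 in miniature: split the code into tokens, swap every "+" for "-",
 * and join the tokens back into source code.
 */
function mutatePlusToMinus(string $code): string
{
    $mutated = '';
    foreach (token_get_all($code) as $token) {
        // Multi-character tokens are arrays [id, text, line]; single characters are strings.
        $text = is_array($token) ? $token[1] : $token;
        $mutated .= $text === '+' ? '-' : $text;
    }

    return $mutated;
}

/** Step 4: load the given code and run a tiny "test suite" against it. */
function testsPass(string $code): bool
{
    // eval() expects code without the opening tag.
    $add = eval(substr($code, strlen('<?php')));

    return $add(6, 4) === 10 && $add(-1, -2) === -3;
}

$mutatedSource = mutatePlusToMinus($source);
$originalPasses = testsPass($source);
$mutantKilled = !testsPass($mutatedSource);
```

The tests pass on the original code and fail on the mutated code, so this mutant counts as killed. A real tool needs the AST rather than raw tokens to decide where a replacement is valid.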

For example, here is the mutator that replaces plus with minus:

    <?php

    declare(strict_types=1);

    namespace Infection\Mutator\Arithmetic;

    use Infection\Mutator\Util\Mutator;
    use PhpParser\Node;
    use PhpParser\Node\Expr\Array_;

    /**
     * @internal
     */
    final class Plus extends Mutator
    {
        /**
         * Replaces "+" with "-"
         *
         * @param Node&Node\Expr\BinaryOp\Plus $node
         *
         * @return Node\Expr\BinaryOp\Minus
         */
        public function mutate(Node $node)
        {
            return new Node\Expr\BinaryOp\Minus($node->left, $node->right, $node->getAttributes());
        }

        protected function mutatesNode(Node $node): bool
        {
            if (!($node instanceof Node\Expr\BinaryOp\Plus)) {
                return false;
            }

            if ($node->left instanceof Array_ || $node->right instanceof Array_) {
                return false;
            }

            return true;
        }
    }

The mutate() method creates the new node that replaces the plus. The Node class comes from the php-parser package; it is used to work with the AST and to modify PHP code. However, this change cannot be applied everywhere, so the mutatesNode() method contains additional conditions. If the operand to the left or to the right of the plus is an array, the mutation is not applied. This condition exists because of code like this:

    // This is valid:
    $tab = [0] + [1];
    // but the following is not:
    $tab = [0] - [1];
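
The difference between the two operators can be observed directly. The following is a small sketch with hypothetical variable names; the operands are stored in variables so that the invalid subtraction fails at runtime, where it can be caught.

```php
<?php

declare(strict_types=1);

$left = [0];
$right = [1];

// "+" on arrays is the union operator: for keys present in both, the left value wins.
$union = $left + $right;

// "-" is not defined for arrays, so the mutated expression throws an Error.
$minusFailed = false;
try {
    $difference = $left - $right;
} catch (\Error $error) {
    $minusFailed = true;
}
```

Because both arrays use key 0, the union is simply [0], while the subtraction never produces a value. A mutant like that would fail regardless of the tests, which is why the mutator skips array operands.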

Summary


Mutation testing is a great tool that complements the CI process and lets you evaluate the quality of your tests. Green tests alone do not guarantee that everything is written well. Mutation testing, or testing the tests, improves their accuracy and increases our confidence that the solution works. Of course, there is no need to strive for 100% on every metric, because that is not always achievable. What matters is analyzing the logs and adjusting the tests accordingly.
