Mutation testing: How to test your tests

Michael Fritzius, President, Arch DevOps, LLC

How good are your tests, really?

You probably spend a lot of time writing, running, and maintaining them. And having your tests all pass is a great feeling. But when I ask test authors, “How do you know your tests are robust enough?” or, “What other tests are there that you didn't think of?” they freak out.

What if there really is something missing, and the developers and testers are blissfully unaware that they're shipping time bombs to the customer? How would they know? Who is testing the tests?

What if you could figure out how good your tests were just by modifying the code so it behaves differently, to see if any tests fail? What if you could make a tiny change, run your tests, and see if they were good enough to notice that difference?

That's the concept behind mutation testing. Here's how it can help.


Welcome to the world of code mutation

Mutation testing, which was first proposed in the early 1970s, injects faults into your code to see how well a battery of tests can detect the change. Code changed in this way becomes a “mutant,” and the goal is to make sure that your suite of tests can “kill the mutant” by having one or more tests fail.

A code mutation is a minor change that affects the overall behavior of that code. Examples include:

  • Replacing arithmetic operators, such as swapping a plus for a minus, a plus for an increment operator, or an asterisk for a double asterisk
  • Changing relational operators, such as replacing a “>” with a “<” or a “>=”
  • Removing a line of code
  • Setting an assignment to a hard-coded value instead of a variable
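
To make these categories concrete, here is a small sketch that applies one mutation of each kind to a single function. The function `shipping_cost`, its parameters, and the surcharge rule are hypothetical and exist only for illustration; a real tool would generate such variants automatically rather than have you write them by hand.

```python
# Hypothetical function under test; the name and pricing rule are made up.
def shipping_cost(weight_kg, rate_per_kg):
    if weight_kg > 20:
        return weight_kg * rate_per_kg + 5  # heavy parcels pay a surcharge
    return weight_kg * rate_per_kg


# Mutant 1: arithmetic operator replaced ("+" becomes "-").
def mutant_arithmetic(weight_kg, rate_per_kg):
    if weight_kg > 20:
        return weight_kg * rate_per_kg - 5
    return weight_kg * rate_per_kg


# Mutant 2: relational operator replaced (">" becomes ">=").
def mutant_relational(weight_kg, rate_per_kg):
    if weight_kg >= 20:
        return weight_kg * rate_per_kg + 5
    return weight_kg * rate_per_kg


# Mutant 3: a line of code removed (the surcharge branch is gone).
def mutant_removed_line(weight_kg, rate_per_kg):
    return weight_kg * rate_per_kg


# Mutant 4: a variable replaced with a hard-coded value.
def mutant_hard_coded(weight_kg, rate_per_kg):
    if weight_kg > 20:
        return weight_kg * 2 + 5
    return weight_kg * rate_per_kg
```

A test that checks only `shipping_cost(30, 2) == 65` would fail for mutants 1 and 3, but mutants 2 and 4 return 65 for that input as well. Catching them takes a test at the boundary (a weight of exactly 20) and a test with a rate other than 2.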

Because this type of testing requires fast feedback, it's best suited for unit tests. Each time you create a mutant, you run the battery of unit tests against it. If even one test fails, then you have caught, or “killed,” the mutant. But if all the tests pass, the mutant has survived; you can set the mutant code aside, or perhaps print it out for the user, so that you can build a unit test for it later.
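
The kill-or-survive check can be sketched in a few lines. Everything here is an illustrative assumption: `is_adult`, its mutant, the two tests, and the `is_killed` helper are invented for this example and do not come from any particular tool.

```python
def is_adult(age):
    return age >= 18


def mutant_is_adult(age):
    return age > 18  # ">=" mutated to ">"


def weak_test(fn):
    # Checks only values far away from the boundary.
    assert fn(30) is True
    assert fn(5) is False


def strong_test(fn):
    # Also checks the boundary value, where the mutant behaves differently.
    assert fn(18) is True


def is_killed(mutant, tests):
    """Return True as soon as any test fails against the mutant."""
    for test in tests:
        try:
            test(mutant)
        except AssertionError:
            return True
    return False
```

With only `weak_test` in the suite, `is_killed(mutant_is_adult, [weak_test])` returns False, so the mutant survives and points at a missing test. Adding `strong_test` kills it.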

Choose your faults carefully

The kinds of faults you inject should highlight where you make assumptions in the code, and then in the tests. Think about a recent bug that you or a colleague fixed. There's a good chance that a symptom of that bug showed up elsewhere, even though the bug itself was further upstream in the code. Some bad value probably kept getting passed along until code somewhere else didn't know how to handle it, resulting in a failure.

For example, let's say you have a method that takes in some information and then generates a small set of JSON data, such as a single array of values. If the developers are testing out a new feature, they'd likely write a unit test to confirm that the array has a certain value.

They probably wouldn't do further testing to verify that the data has the correct values or that the size of the array is correct. But these are the types of tests that will make your code bulletproof and help you detect unwanted changes sooner. Building unit tests to catch mutants is less about functionality and more about sanity. Such tests look for things such as:

  • No invalid or garbage data results from an operation.
  • The right data type is being returned.
  • All other pieces of data in the output are correct.
  • The correct number of items is being returned.
  • There are no rounding errors.
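
As a rough illustration of what such sanity checks might look like, the sketch below tests a made-up method that returns a JSON array. The function `build_scores_json`, its rounding rule, and the sample inputs are all assumptions for the sake of the example.

```python
import json
import math
import unittest


def build_scores_json(raw_scores):
    """Hypothetical method under test: returns a JSON array of scores
    rounded to two decimal places."""
    return json.dumps([round(float(score), 2) for score in raw_scores])


class BuildScoresJsonTest(unittest.TestCase):
    def setUp(self):
        self.result = json.loads(build_scores_json([0.1 + 0.2, 2, 3.14159]))

    def test_contains_expected_value(self):
        # The single check a feature-focused test would typically make.
        self.assertIn(2.0, self.result)

    def test_returns_right_data_types(self):
        self.assertIsInstance(self.result, list)
        self.assertTrue(all(isinstance(x, float) for x in self.result))

    def test_no_garbage_values(self):
        self.assertTrue(all(math.isfinite(x) for x in self.result))

    def test_correct_number_of_items(self):
        self.assertEqual(len(self.result), 3)

    def test_all_values_correct_without_rounding_errors(self):
        # 0.1 + 0.2 is 0.30000000000000004 before rounding.
        self.assertEqual(self.result, [0.3, 2.0, 3.14])
```

Saved in a test module, this can be run with `python -m unittest`. Only the first test checks the new feature; the other four are the kind that tend to catch mutants such as a removed rounding call or a dropped list element.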

The mutations that you put into the code really are the kinds of things that result in actual bugs. People get in a hurry and check in code that still contains debugging artifacts, or they forget to uncomment a line of code, or they just get distracted and forget what they were doing. That's why it makes sense to inject these faults into your code. These changes will alter the overall behavior of the program in a way that can be hard to debug later.

At this point you might be scared away by the thought that you'll have to do this all manually. Actually, there are tools that automate modifying the code and running the tests: Jester for Java (JUnit), Pester for Python, and Heckle for Ruby, to name just a few. Depending on the tool, the mutation is applied to the source code or to the compiled or parsed code in memory. But it's important to understand the concept behind those tools so that you know what they are doing when you use them.


Why resurrect mutation testing now?

Because computing power was scarce and expensive in the technique's early days, mutation testing fell by the wayside. Many companies couldn't do it in a way that was cost-effective; the tests would either take too long, or they required expensive equipment to perform. But today we have more computing power in our phones than in the computers of that era that took up entire floors in an office building. Not only do we have cheap computing power, but we also have orders of magnitude more complexity in software. You need this type of testing.

Picture a terribly complex system of 1,000 interconnected web services, each with 1,000 classes that each contain 1,000 methods. A single mistake anywhere in that code can result in a bug. Coming up with test cases that cover all of it with decent coverage would be impossible for an army of people, let alone one person.

But that's the challenge developers and testers face today. As software's footprint continues to grow, thinking up every test case is impossible. You need something to help with that. Mutation testing can shine a light on where you need more robust unit tests, so that you can develop quality code faster.

Mutant tips and tricks

Unit testing is the best type of test to use for this because of the speed with which such tests execute. But if you mutate one method in one class in your code base, do you need to run every unit test you have?

No. Because they're unit tests, and because they target specific chunks of code, you can just run the subset of tests that touch that particular code. Mutation testing tools allow for this. Or, if you're feeling brave and want to write your own, you can include that selection logic in the solution.

And in an age when cloud computing, continuous integration, and parallel processing exist, you can mutate many such methods within classes, all at the same time, and get very fast feedback.
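
As a small-scale sketch of that idea, the example below checks several mutants concurrently with Python's standard `concurrent.futures` module. The `clamp` function, its three mutants, and the tiny test suite are invented for illustration, and a thread pool stands in for what would more likely be separate processes or CI workers in a real setup.

```python
from concurrent.futures import ThreadPoolExecutor


def clamp(x):
    """Hypothetical function under test: limit x to the range 0..10."""
    return max(0, min(x, 10))


# Hand-written mutants of clamp, keyed by a descriptive name.
MUTANTS = {
    "min_to_max": lambda x: max(0, max(x, 10)),
    "upper_bound_changed": lambda x: max(0, min(x, 11)),
    "lower_bound_removed": lambda x: min(x, 10),
}


def suite(fn):
    # Deliberately incomplete: no test uses a negative input.
    assert fn(5) == 5
    assert fn(50) == 10


def is_killed(fn):
    try:
        suite(fn)
    except AssertionError:
        return True
    return False


def run_all(mutants):
    """Check every mutant concurrently; map each name to True if killed."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(is_killed, mutants.values())
        return dict(zip(mutants.keys(), results))
```

Calling `run_all(MUTANTS)` reports the first two mutants as killed and `lower_bound_removed` as a survivor, which shows that the suite needs a test with a negative input.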

How to start mutating

With the technology available today, you can start a mutation-testing effort faster and cheaper than ever before. This type of testing is poised for a big comeback. Few people know about this testing method yet, but now you do. Will you implement it before your competition does?

Are you doing mutation testing, or just getting started? Share your experiences and suggestions below.