Making Better Unit Tests: part 1, the AAA pattern
From Unit Testing: Principles, Practices, and Patterns by Vladimir Khorikov
In this article I’ll give you a refresher on some basic topics. I’ll go over the structure of a typical unit test, which is usually represented by the Arrange-Act-Assert pattern. I’ll also show the unit testing framework of my choice — xUnit — and explain why I’m using it and not one of its competitors.
In part 2, we’ll talk about naming unit tests. For now, let’s dive into unit test structuring.
How to structure a unit test: the Arrange-Act-Assert pattern
This simple pattern gives every unit test the same structure. The following is a Calculator class with a single method that calculates the sum of two numbers:
public class Calculator
{
    public double Sum(double first, double second)
    {
        return first + second;
    }
}
Listing 1 shows a test that verifies the class’s behavior. This test exhibits the Arrange-Act-Assert pattern. It’s also called the 3A pattern because there are three “A”s: Arrange, Act, and Assert.
Listing 1. A test covering the Sum method in Calculator.
using Xunit;

public class CalculatorTests                            // (1)
{
    [Fact]                                              // (2)
    public void Sum_of_two_numbers()                    // (3)
    {
        // Arrange
        double first = 10;                              // (4)
        double second = 20;                             // (4)
        var calculator = new Calculator();              // (4)

        // Act
        double result = calculator.Sum(first, second);  // (5)

        // Assert
        Assert.Equal(30, result);                       // (6)
    }
}
(1) Class-container for a cohesive set of tests
(2) xUnit’s attribute indicating a test
(3) Name of the unit test
(4) Arrange section
(5) Act section
(6) Assert section
The 3A pattern is simple and provides a uniform structure for all tests in the suite. This uniform structure is one of its biggest advantages: once you get used to this pattern, you can read and understand the tests more easily. That, in turn, reduces the maintenance cost for your entire test suite.
- The arrange section is where you set up the objects to be tested. You bring the system under test to a desired state and configure the dependencies: either instantiate them directly or prepare their test doubles.
- The act section is where you act upon the system under test. You call one of its methods: pass the dependencies and capture the output value if any.
- The assert section allows you to make the claims about the outcome. This may include the return value, the final state of the SUT and its collaborators, or the methods the SUT called on them.
The natural inclination is to start writing the test with the Arrange section. After all, it comes before the other two. And this approach works well in the vast majority of cases. But starting with the Assert section is a viable option too. If you practice Test-Driven Development, i.e. when you write a test before the production code, you don’t necessarily know enough about a given feature’s behavior, which can lead to the creation of a faulty test. It is advantageous to first outline what you expect from the behavior and then figure out how to develop the system to meet this expectation.
Such a technique might look counter-intuitive but it’s how we approach problem solving. We start by thinking about the objective: what a particular behavior should do for us. The solving of the problem comes after that. Writing down these assertions before everything else is merely a formalization of this thinking process. But again, this guideline is only applicable when you follow Test-Driven Development. If you write the production code before the test, by the time you move on to the test, you already know what to expect from the behavior.
Avoid multiple Arrange, Act, Assert sections
Occasionally, you can encounter a test with multiple Arrange, Act, or Assert sections. An example of how this might appear is in figure 1.
When you see multiple Act sections separated by Assert and, possibly, Arrange sections, it means that the test verifies multiple units of behavior. Such a test isn’t a unit test anymore but rather an integration one.
It’s best to avoid such a test structure. A single action ensures that your tests remain within the realm of unit testing, which means that they’re simple, fast, and easy to understand. If you see a test containing a sequence of actions and assertions, refactor it: extract each act into a test of its own.
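As a rough illustration of this refactoring, the sketch below shows one multi-step test and the two single-action tests it could be split into. The Store internals are an assumption made only so the example compiles (the article shows just its call sites), and the test names are hypothetical.

```csharp
using System.Collections.Generic;
using Xunit;

public enum Product { Shampoo }

// Hypothetical minimal Store, just enough to run the tests below.
public class Store
{
    private readonly Dictionary<Product, int> _inventory = new Dictionary<Product, int>();

    public void AddInventory(Product product, int quantity)
    {
        _inventory[product] = GetInventory(product) + quantity;
    }

    public int GetInventory(Product product)
    {
        return _inventory.TryGetValue(product, out int quantity) ? quantity : 0;
    }
}

public class StoreTestsMultiStep
{
    // Anti-pattern: two Act sections, each followed by its own Assert.
    [Fact]
    public void Adding_inventory_in_several_steps()
    {
        var sut = new Store();

        sut.AddInventory(Product.Shampoo, 10);                 // Act 1
        Assert.Equal(10, sut.GetInventory(Product.Shampoo));   // Assert 1

        sut.AddInventory(Product.Shampoo, 5);                  // Act 2
        Assert.Equal(15, sut.GetInventory(Product.Shampoo));   // Assert 2
    }
}

public class StoreTests
{
    [Fact]
    public void Adding_inventory_to_an_empty_store_sets_the_quantity()
    {
        var sut = new Store();

        sut.AddInventory(Product.Shampoo, 10);

        Assert.Equal(10, sut.GetInventory(Product.Shampoo));
    }

    [Fact]
    public void Adding_inventory_to_a_stocked_store_increases_the_quantity()
    {
        var sut = new Store();
        sut.AddInventory(Product.Shampoo, 10);   // now part of Arrange

        sut.AddInventory(Product.Shampoo, 5);

        Assert.Equal(15, sut.GetInventory(Product.Shampoo));
    }
}
```

Note how the first action of the multi-step test becomes part of the Arrange section of the second extracted test, so each test still has exactly one Act.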
It’s sometimes fine to have multiple Act sections in integration tests. Integration tests can be slow and you may optimize their execution speed this way. If bringing the system under test to a desired state is expensive, it makes sense to save precious time with a series of multiple acts and assertions. This is particularly true if all the states naturally flow from each other.
This optimization technique is only applicable to integration tests, and not to all of them: only to those that are already slow and that you don’t want to become even slower. Such optimization is unnecessary in unit tests or in integration tests that are fast enough. A multi-step unit test is always better off being split into several tests.
Avoid if statements in tests
Similar to multiple occurrences of the Arrange, Act, or Assert sections, you can sometimes encounter a unit test with an if statement. This is also an anti-pattern. A test, be it a unit or integration test, should be a simple sequence of steps with no branching.
An if statement indicates that the test verifies too many things at once. Such a test, therefore, should be split into several tests, but unlike the situation with multiple AAA sections, there’s no exception for integration tests. Branching within a test brings you no benefits, only additional maintenance costs: if statements make the tests harder to read and understand.
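The following sketch contrasts a branching test with the two straight-line tests that could replace it. It is only an illustration: the Store internals and the HasEnoughInventory method are assumptions that do not appear in the article's listings, and the test names are hypothetical.

```csharp
using System.Collections.Generic;
using Xunit;

public enum Product { Shampoo }

// Hypothetical minimal Store; HasEnoughInventory is an assumed method.
public class Store
{
    private readonly Dictionary<Product, int> _inventory = new Dictionary<Product, int>();

    public void AddInventory(Product product, int quantity)
    {
        _inventory.TryGetValue(product, out int current);   // current is 0 if absent
        _inventory[product] = current + quantity;
    }

    public bool HasEnoughInventory(Product product, int quantity)
    {
        _inventory.TryGetValue(product, out int current);
        return current >= quantity;
    }
}

public class StoreTestsWithBranching
{
    // Anti-pattern: the expected outcome is decided by an if inside the test.
    [Fact]
    public void Inventory_check()
    {
        var sut = new Store();
        sut.AddInventory(Product.Shampoo, 10);

        foreach (int requested in new[] { 5, 15 })
        {
            bool result = sut.HasEnoughInventory(Product.Shampoo, requested);

            if (requested <= 10)
                Assert.True(result);
            else
                Assert.False(result);
        }
    }
}

public class StoreTests
{
    [Fact]
    public void Has_enough_inventory_when_request_is_within_stock()
    {
        var sut = new Store();
        sut.AddInventory(Product.Shampoo, 10);

        bool result = sut.HasEnoughInventory(Product.Shampoo, 5);

        Assert.True(result);
    }

    [Fact]
    public void Does_not_have_enough_inventory_when_request_exceeds_stock()
    {
        var sut = new Store();
        sut.AddInventory(Product.Shampoo, 10);

        bool result = sut.HasEnoughInventory(Product.Shampoo, 15);

        Assert.False(result);
    }
}
```

Each of the split tests states its scenario in its name and reads top to bottom without any decision points.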
How large should each section be?
A common question people ask when starting out with the Arrange-Act-Assert pattern is: how large should each section be? And what about the Teardown section — the section that cleans up after the test? Let’s see.
The Arrange section is the largest one
The Arrange section is usually the largest among the three. It can be as large as the Act and Assert sections combined, but if it becomes significantly larger than that, it’s better to extract the arrangements either into private methods within the same test class, or to a separate factory class. Two popular patterns that can help you reuse the code in the Arrange sections are Object Mother and Test Data Builder.
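To make the two patterns more concrete, here is a minimal sketch of each. Everything in it is an assumption for illustration: the Store internals, the Book product, and the StoreMother and StoreBuilder classes are hypothetical names, not code from the article.

```csharp
using System.Collections.Generic;
using Xunit;

public enum Product { Shampoo, Book }

// Hypothetical minimal Store, just enough to run the tests below.
public class Store
{
    private readonly Dictionary<Product, int> _inventory = new Dictionary<Product, int>();

    public void AddInventory(Product product, int quantity)
    {
        _inventory[product] = GetInventory(product) + quantity;
    }

    public int GetInventory(Product product)
    {
        return _inventory.TryGetValue(product, out int quantity) ? quantity : 0;
    }
}

// Object Mother: named factory methods that return ready-made fixtures.
public static class StoreMother
{
    public static Store WithTenShampoos()
    {
        var store = new Store();
        store.AddInventory(Product.Shampoo, 10);
        return store;
    }
}

// Test Data Builder: a fluent builder; each test states only what it needs.
public class StoreBuilder
{
    private readonly List<(Product Product, int Quantity)> _items =
        new List<(Product Product, int Quantity)>();

    public StoreBuilder With(Product product, int quantity)
    {
        _items.Add((product, quantity));
        return this;
    }

    public Store Build()
    {
        var store = new Store();
        foreach (var (product, quantity) in _items)
            store.AddInventory(product, quantity);
        return store;
    }
}

public class StoreFixtureTests
{
    [Fact]
    public void Object_mother_returns_a_stocked_store()
    {
        Store store = StoreMother.WithTenShampoos();

        Assert.Equal(10, store.GetInventory(Product.Shampoo));
    }

    [Fact]
    public void Builder_creates_what_the_test_asks_for()
    {
        Store store = new StoreBuilder()
            .With(Product.Shampoo, 10)
            .With(Product.Book, 3)
            .Build();

        Assert.Equal(10, store.GetInventory(Product.Shampoo));
        Assert.Equal(3, store.GetInventory(Product.Book));
    }
}
```

The Object Mother keeps call sites short but hides the fixture's details behind a name, while the builder keeps those details visible in the test at the cost of a slightly longer Arrange section.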
Watch out for Act sections that are larger than a single line
The Act section is normally a single line of code. If the Act section consists of two or more lines, it could be an indication of a problem with the SUT’s public API.
It’s best to express this point with an example. In this example, the customer makes a purchase from a store.
Listing 2. The Act section consists of a single line.
[Fact]
public void Purchase_succeeds_when_enough_inventory()
{
    // Arrange
    var store = new Store();
    store.AddInventory(Product.Shampoo, 10);
    var customer = new Customer();

    // Act
    bool success = customer.Purchase(store, Product.Shampoo, 5);

    // Assert
    Assert.True(success);
    Assert.Equal(5, store.GetInventory(Product.Shampoo));
}
Notice that the Act section in this test is a single method call, which is a sign of a well-designed class’s API. Now compare it to the version in listing 3. This version’s Act section contains two lines, which is a sign of a problem with the SUT’s API: it requires the client to remember to make the second method call to finish the purchase and therefore lacks encapsulation.
Listing 3. The Act section consists of two lines.
[Fact]
public void Purchase_succeeds_when_enough_inventory()
{
    // Arrange
    var store = new Store();
    store.AddInventory(Product.Shampoo, 10);
    var customer = new Customer();

    // Act
    bool success = customer.Purchase(store, Product.Shampoo, 5);
    store.RemoveInventory(success, Product.Shampoo, 5);

    // Assert
    Assert.True(success);
    Assert.Equal(5, store.GetInventory(Product.Shampoo));
}
Here is what you can read from listing 3’s Act section:
- In the first action line, the customer tries to acquire five units of shampoo from the store.
- In the second line, the inventory is removed from the store. The removal takes place only if the preceding call to Purchase returns a success.
The issue with the new version is that it now requires two method calls to perform a single operation. Note that this is not an issue with the test itself. It still verifies the same unit of behavior: the process of making a purchase. The issue lies in the API surface of the Customer class. It shouldn’t require the client to make an additional method call.
From the business perspective, a successful purchase has two outcomes: the acquisition of a product by the customer and the reduction of the inventory in the store. Both of these outcomes must be achieved together, which means there should be a single public method that does both things. Otherwise, there’s room for inconsistency where the client code calls the first method but not the second, in which case the customer acquires the product but its available amount isn’t reduced in the store.
This kind of inconsistency is called an invariant violation. And the act of protecting your code against potential inconsistencies is called encapsulation. When an inconsistency penetrates into the database, it becomes a big problem. Now it’s impossible to reset the state of your application by restarting it. You need to deal with the corrupted data in the database, and, potentially, even contact the customers and handle this situation on a case-by-case basis. Imagine what would happen if the application generates confirmation receipts without reserving the inventory. It may issue claims to, and even charge for, more inventory than you can feasibly get a hold of in the near future.
The remedy is to maintain code encapsulation at all times. In the above example, the customer should remove the acquired inventory from the store as part of its Purchase method and not rely on the client code to do that. When it comes to maintaining invariants, you should eliminate any potential course of action that could lead to their violation.
Remember, you can’t trust yourself to do the right thing all the time. So eliminate the possibility to do the wrong thing!
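One way the encapsulated design could look is sketched below. The bodies of Store and Customer are assumptions, since the article shows only how they are called; HasEnoughInventory and the two-argument RemoveInventory are hypothetical members introduced for this sketch.

```csharp
using System;
using System.Collections.Generic;

public enum Product { Shampoo }

// Hypothetical Store implementation.
public class Store
{
    private readonly Dictionary<Product, int> _inventory = new Dictionary<Product, int>();

    public void AddInventory(Product product, int quantity)
    {
        _inventory[product] = GetInventory(product) + quantity;
    }

    public int GetInventory(Product product)
    {
        return _inventory.TryGetValue(product, out int quantity) ? quantity : 0;
    }

    public bool HasEnoughInventory(Product product, int quantity)
    {
        return GetInventory(product) >= quantity;
    }

    public void RemoveInventory(Product product, int quantity)
    {
        // Guard the invariant: inventory can never go negative.
        if (!HasEnoughInventory(product, quantity))
            throw new InvalidOperationException("Not enough inventory");

        _inventory[product] = GetInventory(product) - quantity;
    }
}

// Hypothetical Customer implementation.
public class Customer
{
    // Single entry point: the stock check and the inventory reduction
    // happen together, so client code cannot perform one without the other.
    public bool Purchase(Store store, Product product, int quantity)
    {
        if (!store.HasEnoughInventory(product, quantity))
            return false;

        store.RemoveInventory(product, quantity);
        return true;
    }
}
```

Under these assumptions, the test in listing 2 would pass as written: its single call to Purchase both reports success and leaves five units of shampoo in the store.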
This guideline of keeping the Act section down to a single line holds true for the vast majority of code that contains business logic. This is less true for utility or infrastructure code. I won’t say “never do it,” but be sure to examine each such case for a potential breach in encapsulation.
How many assertions should the Assert section hold?
Finally, there’s the assertion part. You might have heard about the guideline of having one Assert per test. It takes root in the premise of targeting the smallest piece of code possible.
As you may know, this premise is incorrect. A unit in unit testing is a unit of behavior, not a unit of code. There could be multiple outcomes a single unit of behavior exhibits, and it’s perfectly fine to evaluate them all in one test.
That said, you need to watch out for Assert sections that grow too large. It could be a sign of a missing abstraction in the production code. For example, instead of asserting all properties inside an object returned by the SUT, it could be better to define proper equality members in the object’s class. You can then compare the object to an expected value using a single assertion.
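As a small sketch of the equality-members idea, the example below compares a returned object to an expected value with one assertion. Receipt and Cashier are hypothetical classes invented for this illustration; they are not part of the article's code.

```csharp
using System;
using Xunit;

public enum Product { Shampoo }

// Hypothetical value object returned by the SUT.
public sealed class Receipt : IEquatable<Receipt>
{
    public Product Product { get; }
    public int Quantity { get; }

    public Receipt(Product product, int quantity)
    {
        Product = product;
        Quantity = quantity;
    }

    // Two receipts are equal when all of their properties match.
    public bool Equals(Receipt other)
    {
        return !(other is null)
            && Product == other.Product
            && Quantity == other.Quantity;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Receipt);
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(Product, Quantity);
    }
}

// Hypothetical SUT that produces a receipt.
public class Cashier
{
    public Receipt IssueReceipt(Product product, int quantity)
    {
        return new Receipt(product, quantity);
    }
}

public class CashierTests
{
    [Fact]
    public void Issued_receipt_describes_the_purchase()
    {
        var sut = new Cashier();

        Receipt receipt = sut.IssueReceipt(Product.Shampoo, 5);

        // One assertion instead of one per property.
        Assert.Equal(new Receipt(Product.Shampoo, 5), receipt);
    }
}
```

In C# 9 and later, declaring Receipt as a record would generate comparable value-based equality members without writing them by hand.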
What about the Teardown phase?
Some people also distinguish a fourth section, Teardown, which comes after Arrange, Act, and Assert. For example, you can use this section to remove any files created by the test or close a database connection. The teardown is usually represented by a separate method, which is reused across all tests in the class. I don’t include this phase in the 3A pattern.
Note that most unit tests don’t need teardown. Unit tests don’t talk to out-of-process dependencies and therefore don’t leave side effects that need to be disposed of; that’s the realm of integration testing.
Differentiating the system under test
The system under test plays a significant role in tests. It provides an entry point for the behavior you want to invoke in the application. The behavior can span across as much as several classes or as little as a single method, but there could be only one entry point; one class which triggers that behavior.
It’s important to differentiate the system under test from its dependencies, especially when there are quite a few of them, so that you don’t need to spend too much time figuring out who’s who in the test. To do that, always name the SUT in tests as such: sut. Listing 4 shows how CalculatorTests looks after the renaming.
Listing 4. The calculator test after differentiating the SUT from its dependencies.
public class CalculatorTests
{
    [Fact]
    public void Sum_of_two_numbers()
    {
        // Arrange
        double first = 10;
        double second = 20;
        var sut = new Calculator();                    // (1)

        // Act
        double result = sut.Sum(first, second);

        // Assert
        Assert.Equal(30, result);
    }
}
(1) The calculator is now called sut.
Dropping the Arrange, Act, Assert comments from tests
Just as it’s important to set the system under test apart from its dependencies, it’s also important to differentiate the three sections from each other, so that you don’t spend too much time figuring out which section a particular line in the test belongs to. Putting // Arrange, // Act, and // Assert comments before the beginning of each section is one way to do that. Another way is to separate the sections with empty lines, as shown in listing 5.
Listing 5. The calculator with sections separated by empty lines.
public class CalculatorTests
{
    [Fact]
    public void Sum_of_two_numbers()
    {
        double first = 10;                             // (1)
        double second = 20;                            // (1)
        var sut = new Calculator();                    // (1)

        double result = sut.Sum(first, second);        // (2)

        Assert.Equal(30, result);                      // (3)
    }
}
(1) Arrange
(2) Act
(3) Assert
Separating sections with empty lines works great in most unit tests. It allows you to keep a balance between brevity and readability. It doesn’t work as well in large tests, though, where you may want to put additional empty lines inside the Arrange section to differentiate between configuration stages. This is often the case in integration tests — they frequently contain complicated setup logic. Therefore, drop the section comments in tests that follow the 3A pattern and where you can avoid additional empty lines inside the Arrange and Assert sections. Keep the section comments otherwise.
Exploring the xUnit testing framework
I’m using xUnit (https://github.com/xunit/xunit) as the unit testing framework. Although this framework works in .NET only, there are numerous unit testing frameworks in every object-oriented language (Java, C++, JavaScript, etc.) and they all look similar to each other. If you have worked with one of them, you won’t have any issues working with another.
In .NET alone, there are several alternatives to choose from, such as NUnit (https://github.com/nunit/nunit) and the built-in MSTest. I personally prefer xUnit for the reasons I’ll describe shortly, but you can use NUnit; these two frameworks are on par functionality-wise. I don’t recommend MSTest, though. It doesn’t provide the same level of flexibility as xUnit and NUnit. Don’t just take my word for it; even people inside Microsoft avoid MSTest. For example, the ASP.NET Core team uses xUnit.
I prefer xUnit because it’s a cleaner, more concise version of NUnit. For example, you may have noticed that in the tests I’ve shown thus far, there are no framework-related attributes other than [Fact], which marks the method as a unit test so that the framework knows to run it.
Any public class can contain a unit test: there are no [TestFixture] attributes, nor [SetUp] or [TearDown]. If you need to share configuration logic between tests, you can put it inside the constructor. And if you need to clean something up, you can implement the IDisposable interface, as shown in listing 6.
Listing 6. The arrangement and teardown logic is defined separately and shared by all tests.
using System;
using Xunit;

public class CalculatorTests : IDisposable
{
    private readonly Calculator _sut;

    public CalculatorTests()             // (1)
    {
        _sut = new Calculator();
    }

    [Fact]
    public void Sum_of_two_numbers()
    {
        /* ... */
    }

    public void Dispose()                // (2)
    {
        // Illustrative only: assumes the SUT exposes a CleanUp method,
        // which the Calculator shown earlier does not define.
        _sut.CleanUp();
    }
}
(1) Called before each test in the class
(2) Called after each test in the class
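For comparison, here is roughly how the same shape of test class might look in NUnit, using the attributes mentioned above. This is only a sketch for contrast: the Calculator is repeated so the example is self-contained, and the empty TearDown body stands in for whatever cleanup a real test would need.

```csharp
using NUnit.Framework;

public class Calculator
{
    public double Sum(double first, double second)
    {
        return first + second;
    }
}

[TestFixture]                       // marks the class as a container of tests
public class CalculatorTests
{
    private Calculator _sut;

    [SetUp]                         // runs before each test in the class
    public void SetUp()
    {
        _sut = new Calculator();
    }

    [Test]                          // NUnit's counterpart of [Fact]
    public void Sum_of_two_numbers()
    {
        double result = _sut.Sum(10, 20);

        Assert.That(result, Is.EqualTo(30));
    }

    [TearDown]                      // runs after each test in the class
    public void TearDown()
    {
        // Release any resources acquired in SetUp here.
    }
}
```

The xUnit version expresses the same lifecycle with plain language constructs, a constructor and Dispose, which is what makes it feel more concise.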
I like the [Fact] attribute, specifically because it’s called Fact and not Test. It emphasizes the following rule of thumb: each test should tell a story. This story is an individual, atomic fact about the problem domain, and the passing test is proof that this fact is correct. If the test fails, it means the fact isn’t true anymore. Either the story is no longer valid and you need to rewrite it, or the system itself has to be fixed.
I encourage you to adopt this way of thinking when you write unit tests. Your tests shouldn’t be a dull enumeration of what the production code does. Rather, they should provide a higher level description of the application behavior. Ideally, this description should be meaningful to not only programmers but business people too.
Reusing test fixtures between tests
I mentioned earlier that it’s fine to extract test fixture arrangements that take up too much space into separate methods or classes. You can even reuse them between tests. There are two ways to perform such reuse, and only one of them is beneficial; the other leads to increased maintenance costs.
The term test fixture has two common uses.
- Per one of them, a test fixture is an object the test runs against. This object can be a regular dependency: an argument that is passed to the SUT. It can also be data in the database or a file on the hard disk. Such an object needs to remain in a known, fixed state before each test run so that it produces the same result. Hence the word fixture.
- The other definition comes from the NUnit testing framework. In NUnit, TestFixture is an attribute that marks a class containing tests.
The first way to reuse test fixtures is to initialize them in the constructor (or the method marked with a [SetUp] attribute if you are using NUnit).
Listing 7. The common initialization code is extracted into the test class’s constructor.
public class CustomerTests
{
    private readonly Store _store;                     // (1)
    private readonly Customer _sut;

    public CustomerTests()                             // (2)
    {
        _store = new Store();
        _store.AddInventory(Product.Shampoo, 10);
        _sut = new Customer();
    }

    [Fact]
    public void Purchase_succeeds_when_enough_inventory()
    {
        bool success = _sut.Purchase(_store, Product.Shampoo, 5);

        Assert.True(success);
        Assert.Equal(5, _store.GetInventory(Product.Shampoo));
    }

    [Fact]
    public void Purchase_fails_when_not_enough_inventory()
    {
        bool success = _sut.Purchase(_store, Product.Shampoo, 15);

        Assert.False(success);
        Assert.Equal(10, _store.GetInventory(Product.Shampoo));
    }
}
(1) Common test fixture
(2) Runs before each test in the class
With this approach, you can significantly reduce the amount of test code — you can get rid of most or even all test fixture configurations in tests, but it has two significant drawbacks:
- It introduces high coupling between tests.
- It diminishes test readability.
Let’s discuss them in more detail.
High coupling between tests is an anti-pattern
In the new version, shown in listing 7, all tests are coupled to each other: a modification of one test’s arrangement logic will affect all tests in the class. For example, changing this line
_store.AddInventory(Product.Shampoo, 10);
to this
_store.AddInventory(Product.Shampoo, 15);
would invalidate the assumption the tests make about the store’s initial state and therefore would lead to unnecessary test failures.
That’s a violation of an important guideline: a modification of one test should not affect other tests. This guideline is similar to what we discussed in chapter 2 — that tests should run in isolation from each other. It’s not the same, though. Here, we are talking about independent modification of tests, not independent execution. Both are important attributes of a well-designed test.
To follow this guideline, you need to avoid introducing shared state in test classes. These two private fields are examples of such state:
private readonly Store _store;
private readonly Customer _sut;
The use of constructors in tests diminishes test readability
The other drawback to extracting the arrangement code into the constructor is diminished test readability. You no longer see the full picture by looking at the test. You have to examine different places in the class to understand what the test method does.
Even if there’s not much arrangement logic, say, only instantiation of the fixtures, you’re still better off moving it directly to the test method. Otherwise, you’ll be wondering if it’s instantiation or there’s something else being configured there too. A self-contained test doesn’t leave you with such uncertainties.
A better way to reuse test fixtures
The use of the constructor isn’t the best approach when it comes to reusing test fixtures. A better way is to introduce private factory methods in the test class, as shown in listing 8.
Listing 8. The common initialization code is extracted into private factory methods.
public class CustomerTests
{
    [Fact]
    public void Purchase_succeeds_when_enough_inventory()
    {
        Store store = CreateStoreWithInventory(Product.Shampoo, 10);
        Customer sut = CreateCustomer();

        bool success = sut.Purchase(store, Product.Shampoo, 5);

        Assert.True(success);
        Assert.Equal(5, store.GetInventory(Product.Shampoo));
    }

    [Fact]
    public void Purchase_fails_when_not_enough_inventory()
    {
        Store store = CreateStoreWithInventory(Product.Shampoo, 10);
        Customer sut = CreateCustomer();

        bool success = sut.Purchase(store, Product.Shampoo, 15);

        Assert.False(success);
        Assert.Equal(10, store.GetInventory(Product.Shampoo));
    }

    private static Store CreateStoreWithInventory(
        Product product, int quantity)
    {
        Store store = new Store();
        store.AddInventory(product, quantity);
        return store;
    }

    private static Customer CreateCustomer()
    {
        return new Customer();
    }
}
By extracting the common initialization code into private factory methods, you can also shorten the test code, but at the same time keep the full context of what’s going on in the tests. Moreover, the private methods don’t couple tests to each other as long as you make them generic enough: allow the tests to specify how they want the fixtures to be created.
Look at this line for example:
Store store = CreateStoreWithInventory(Product.Shampoo, 10);
The test explicitly states that it wants the factory method to add ten units of shampoo to the store. This is both highly readable and reusable. Readable — because you don’t need to examine the internals of the factory method to understand the attributes of the created store. Reusable — because you can use this method in other tests too.
Note that in this particular sample, there’s no need to introduce factory methods because the arrangement logic is simple. View it merely as a demonstration.
One exception to this rule of reusing test fixtures is that you can instantiate a fixture in the constructor if it’s used by all or almost all tests. This is often the case for integration tests that work with a database. All such tests require a database connection, which you can initialize once and then reuse everywhere.
Even then, it makes more sense to introduce a base class and initialize the database connection in that class’s constructor, not in individual test classes (listing 9).
Listing 9. The common initialization code is extracted into a base class.
using System;
using Xunit;

public class CustomerTests : IntegrationTests
{
    [Fact]
    public void Purchase_succeeds_when_enough_inventory()
    {
        /* use _database here */
    }
}

public abstract class IntegrationTests : IDisposable
{
    protected readonly Database _database;

    protected IntegrationTests()
    {
        _database = new Database();
    }

    public void Dispose()
    {
        _database.Dispose();
    }
}
Notice how CustomerTests remains constructor-less. It gets access to the _database instance by inheriting from the IntegrationTests base class.
That’s all for this article.