Tests and Testability

It’s not exactly traditional to use automated testing in the game development process, but at Rare we’ve been developing our title Sea of Thieves with a continuous delivery method. This means we try to have a stable, production-ready build at all times, and this relies on the automated tests running round the clock. Every time we check in a change to the build, it has to run against the current build and pass all the tests. If you want to learn more about our games-as-a-service based development process, there’s a really interesting series of blog posts about it, written by my colleague Jafar Soltani from the infrastructure team.

So what’s it like working at the pointy end of this continuous delivery model? It can feel like it’s all fun and games until you have to write the tests, or at least it did when I joined. One year, and several silly mistakes caught by simple unit tests later, I’m decidedly pro-testing. When your unit tests are part of the implementation and debugging process, they become a tool rather than an annoyance.

This is how integrating unit tests into your dev flow can pick up the bugs you’d otherwise miss:

Idiot check your first pass
In the same way that self-driving cars may be vulnerable to errors but are definitely safer than a human, automated tests are not infallible but are much better at catching silly mistakes in my code than I am. Before I ever try a manual run-through of the feature I’m writing, I write tests and get them all to pass. It saves a lot of awkward stepping through and examining my variables to figure out exactly when that null pointer got dereferenced.
Prevent breaks in other parts of the codebase
A butterfly’s wing flap in Tokyo could cause a tsunami on the other side of the world. Similarly, a member added to the melee weapon class could cause the services backend to spring a memory leak. When you can automatically run all the tests in the codebase on your seemingly solid commit, you can catch bugs that would otherwise be incredibly tricky to track down.

Benefits aside, tests can be a pain to write if the production code is awkward to reproduce in a test environment. I’ve picked up these techniques to make it as painless as possible:

Orthogonal code design
There are a lot of reasons to make your code a nice modular set of building blocks that can be rearranged or reused. Writing your classes like isolated mini-APIs that depend on other classes as little as possible makes testing a whole lot easier than having a tangled web of functionality strewn across the codebase. This way, your tests can cover that one block of functionality and isolate bugs to individual classes.
Test behaviour, not implementation
When I’m writing a new class, or implementing new functionality, I think of it as a set of inputs mapped to expected outputs. I used to find myself making some dubious “testable” classes that inherited from the class I was testing, to make sure such and such member was set to this or a particular object had been cached. More recently I’ve been taking the approach that I should be testing that the actual use cases give the right output for the input: if such and such member is set wrong, the output should not be as expected. Then if the internals of the class are tweaked, your tests don’t need to be adapted. Design changes will mean having to fix the tests, but this has the side effect of making tests that act as free documentation, given that they demonstrate how the class should be used and the expected result of each case.
Mock your dependencies
Any time you create an instance of one of your class’s dependencies in a test you’re now testing more than just your class. This is okay in terms of data structures of course, but if that dependency contains logic your tests no longer isolate issues to this class. It does usually require the dependency to be handled via some interface, but this way you can take out the real dependency and replace it with a mock class that will spit out the inputs you need for your test.
Inject your dependencies
Every time I see a call like GetWorld() in the Sea of Thieves codebase I get really sweaty thinking about what the test must look like. Setting up a unit test environment in which this kind of call will work would be almost impossible. Setup functions that take the class’s dependencies, process, check, and if necessary cache them are much handier. They allow you to dunk whatever mocks you have in there and provide that bit of self-documentation of what dependencies it requires.
Write testable code responsibly and in moderation
Pragmatism of course has to take precedence sometimes, and while good testing practices can save you a lot of time and effort down the line, sometimes it does make sense just to create a testable class or call GetWorld() to grab a global data asset. How you balance long-term risk with short-term reward depends on your time scale, the size of the codebase, the worst-case scenario of failure, and many more factors.

Code Examples

I’ll go through a few concrete examples of these techniques, in a well known language called “C++ But With Syntax Errors”. I’ve also assumed magical garbage collection. Of course I’ve had to simplify this to include only the parts I’m trying to demonstrate, but if you find yourself thinking “well, I’d never write code like that”, try and think about how it could happen in the context of a more complex system with a lot more lines of code obfuscating what’s going on.

For these examples I’ve used various implementations of a class that represents a storage container in a game, like a chest that could be used by the player to store items in.

Orthogonality

In this case, imagine our storage container is already set up with several items inside and a player can retrieve them until it is empty. The RetrieveItem() function takes in a pointer to the player interacting with it and adds the first item in storage to that player’s inventory.

class Container 
{
public:
    void RetrieveItem(Character* Interactor)
    {
        if(ContainedItems.Count > 0)
        {
            Item* ItemToRetrieve = ContainedItems.Pop();
            Interactor->Inventory.AddItem(ItemToRetrieve);
        }
    } 

    Array ContainedItems;
}

To test this, we’ll have to create a test character to pass into that function, and check that character’s inventory to see if the item they’ve retrieved is there.

[Test]
GivenStorageContainerWithItem_WhenItemIsRetrieved_ThenCharacterInventoryContainsItem()
{
    Container TestContainer;
    Character TestCharacter;
    Item* TestItem = new Item();

    TestContainer.ContainedItems.Add(TestItem);

    TestContainer.RetrieveItem(&TestCharacter);

    TestTrue(TestCharacter.Inventory.Contains(TestItem));
}

This is testing a little more than just the item retrieval logic: it’s also tied in to how adding items to player inventories works, which could be very complicated under the hood. You’re also required to create a Character instance, which in most games will be a very heavyweight class, using up a lot of CPU time and memory.

Let’s try instead making it so that the container class no longer cares about what it is interacting with. By having it return a pointer to the item it’s retrieving from storage, not only do you no longer have to care about instantiating a character in the test, but the container also becomes a more standalone piece of logic. It’s no longer inherently tied to characters, or anything that has an inventory.

class Container
{
public:
    Item* RetrieveItem()
    {
        return ContainedItems.Pop();
    }

    Array ContainedItems;
}

[Test]
GivenStorageContainerWithItem_WhenRetrieveItemIsCalled_ThenThatItemIsReturned()
{
    Container TestContainer;
    Item* TestItem = new Item();
    
    TestContainer.ContainedItems.Add(TestItem);

    Item* RetrievedItem = TestContainer.RetrieveItem();

    TestTrue(RetrievedItem == TestItem);
}

I’ve mutated the ContainedItems array directly there for brevity, which is not something I would normally do, although it is more permissible in a test. The next example will cover alternatives to this.
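With the container returning the item, handing it to an inventory becomes the caller’s job. Here is a rough sketch of that glue in compilable C++ rather than the pseudocode above. The Inventory struct and the TransferItem() helper are hypothetical names made up for illustration, std::vector stands in for Array, and taking the item from the back of the vector is only an assumption about what Pop() does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Item {};

// Same shape as the orthogonal Container above: it hands an item back and
// knows nothing about characters or inventories.
class Container
{
public:
    // Returns nullptr when there is nothing left to retrieve.
    Item* RetrieveItem()
    {
        if (ContainedItems.empty())
        {
            return nullptr;
        }
        Item* Retrieved = ContainedItems.back(); // assumption: Pop() takes the last item
        ContainedItems.pop_back();
        return Retrieved;
    }

    std::vector<Item*> ContainedItems; // public only for brevity, as in the article
};

// Hypothetical lightweight inventory, standing in for whatever a Character owns.
struct Inventory
{
    void AddItem(Item* NewItem) { Items.push_back(NewItem); }

    bool Contains(const Item* Candidate) const
    {
        return std::find(Items.begin(), Items.end(), Candidate) != Items.end();
    }

    std::vector<Item*> Items;
};

// Hypothetical glue: moves one item from a container into an inventory.
// Returns false if the container had nothing to give.
bool TransferItem(Container& Source, Inventory& Destination)
{
    Item* Retrieved = Source.RetrieveItem();
    if (Retrieved == nullptr)
    {
        return false;
    }
    Destination.AddItem(Retrieved);
    return true;
}

void GivenContainerWithItem_WhenItemIsTransferred_ThenInventoryContainsItem()
{
    Container TestContainer;
    Inventory TestInventory;
    Item TestItem;
    TestContainer.ContainedItems.push_back(&TestItem);

    assert(TransferItem(TestContainer, TestInventory));
    assert(TestInventory.Contains(&TestItem));
}

void GivenEmptyContainer_WhenItemIsTransferred_ThenInventoryStaysEmpty()
{
    Container TestContainer;
    Inventory TestInventory;

    assert(!TransferItem(TestContainer, TestInventory));
    assert(TestInventory.Items.empty());
}
```

Neither test needs a Character, and the container’s own tests are untouched by whatever the glue does with the item afterwards.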

Behaviour, Not Implementation

In this example, the container has the ability to store new items as well as retrieve them. The array of stored items is not meant to be directly manipulated, so it’s been made a private member of the class.

class Container
{
public:
    Item* RetrieveItem()
    {
        return ContainedItems.Pop();
    }

    void StoreItem(Item* StoredItem)
    {
        ContainedItems.Add(StoredItem);
    }

private:
    Array ContainedItems;
}

Say you want to test that using the RetrieveItem() function not only returns the item pointer but also removes that item from storage. That array is private, so you can’t check it directly, or set it up to contain a test item so that there is something to retrieve in the test. You might be tempted to make a testable container, and include a function that can get a reference to that array so you can manipulate it directly.

class TestableContainer: public Container
{
public:
    Array& GetContainedItems()
    {
        return ContainedItems;
    }
}

[Test]
GivenStorageContainerWithItem_WhenItemIsRetrieved_ThenItemIsRemovedFromContainer()
{
    TestableContainer TestContainer;
    Item* TestItem = new Item();

    Array& ItemArray = TestContainer.GetContainedItems();
    ItemArray.Add(TestItem);

    TestContainer.RetrieveItem();
    TestFalse(ItemArray.Contains(TestItem));
}

This works fine, but say we want to change the implementation. Maybe we want to store the items in a map, or relegate the storage management to another class. The test would then need rewriting even if the container still behaved correctly. It’s also worth thinking about whether this test tells us what we want to know. We really want to know whether the container is going to work as expected during its actual use cases.

Let’s instead test the cases it’s actually going to be used for. When we use the RetrieveItem() function we expect to be returned a pointer to the first item that is stored in the container, and a null pointer if it’s empty. We also only expect there to be an item stored if we stored it in there in the first place. We’ll alter the tests to use that flow, and we can make the TestableContainer class obsolete:

[Test]
GivenItemStoredInContainer_WhenRetrieveItemIsCalled_ThenThatItemIsReturned()
{
    Container TestContainer;
    Item* TestItem = new Item();

    TestContainer.StoreItem(TestItem);

    TestTrue(TestContainer.RetrieveItem() == TestItem);
}

[Test]
GivenNoItemsStoredInContainer_WhenRetrieveItemIsCalled_ThenNullptrIsReturned()
{
    Container TestContainer;

    TestTrue(TestContainer.RetrieveItem() == nullptr);
}
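To illustrate why this pays off, here is a hedged sketch in compilable C++ where the same behaviour tests run unchanged against two different storage implementations. VectorContainer, DequeContainer and RunContainerBehaviourTests() are hypothetical names, and returning the oldest item first is an assumption about the intended order. The first test also covers the original goal of checking that a retrieved item is removed, using only the public interface.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Item {};

// Version A: items held in a std::vector.
class VectorContainer
{
public:
    void StoreItem(Item* StoredItem) { ContainedItems.push_back(StoredItem); }

    Item* RetrieveItem()
    {
        if (ContainedItems.empty())
        {
            return nullptr;
        }
        Item* Retrieved = ContainedItems.front(); // assumption: oldest item first
        ContainedItems.erase(ContainedItems.begin());
        return Retrieved;
    }

private:
    std::vector<Item*> ContainedItems;
};

// Version B: same behaviour, different internals.
class DequeContainer
{
public:
    void StoreItem(Item* StoredItem) { ContainedItems.push_back(StoredItem); }

    Item* RetrieveItem()
    {
        if (ContainedItems.empty())
        {
            return nullptr;
        }
        Item* Retrieved = ContainedItems.front();
        ContainedItems.pop_front();
        return Retrieved;
    }

private:
    std::deque<Item*> ContainedItems;
};

// The tests only touch StoreItem() and RetrieveItem(), so they compile and
// pass against either version without modification.
template <typename ContainerType>
void GivenItemStoredInContainer_WhenItemIsRetrieved_ThenItemIsRemovedFromContainer()
{
    ContainerType TestContainer;
    Item TestItem;

    TestContainer.StoreItem(&TestItem);

    assert(TestContainer.RetrieveItem() == &TestItem);
    // A second retrieval finds nothing, so the item must have been removed.
    assert(TestContainer.RetrieveItem() == nullptr);
}

template <typename ContainerType>
void GivenNoItemsStoredInContainer_WhenRetrieveItemIsCalled_ThenNullptrIsReturned()
{
    ContainerType TestContainer;

    assert(TestContainer.RetrieveItem() == nullptr);
}

void RunContainerBehaviourTests()
{
    GivenItemStoredInContainer_WhenItemIsRetrieved_ThenItemIsRemovedFromContainer<VectorContainer>();
    GivenItemStoredInContainer_WhenItemIsRetrieved_ThenItemIsRemovedFromContainer<DequeContainer>();
    GivenNoItemsStoredInContainer_WhenRetrieveItemIsCalled_ThenNullptrIsReturned<VectorContainer>();
    GivenNoItemsStoredInContainer_WhenRetrieveItemIsCalled_ThenNullptrIsReturned<DequeContainer>();
}
```

A test that reached into the private array would have needed rewriting for the second version; these did not.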

Mock Your Dependencies

In this example we’ll only worry about storing items, but this time the items keep track of how long they’ve been stored, for whatever fancy gameplay mechanic you’re going for.

class Container
{
public:
    void StoreItem(Item* ItemToStore)
    {
        ContainedItems.Add(ItemToStore);
        ItemToStore->StartStoredTimer();
    }

private:
    Array ContainedItems;
}

We want to test that the timer has been started when the item is stored but this could be very awkward to do in a test environment, especially if the timer is a private member of the item class. We can avoid worrying about the internals of the item class in a test on the container if we create a mock item which is a shell of the usual item class.

One way we can do this is by putting an interface on the item class and having the container manipulate it only via that interface. Then we can pass in our mock item which also inherits from that interface and only tracks whether or not that function has been called.

class Container
{
public:
    void StoreItem(ItemInterface* ItemToStore)
    {
        ContainedItems.Add(ItemToStore);
        ItemToStore->StartStoredTimer();
    }

private:
    Array ContainedItems;
}

interface ItemInterface
{
    virtual void StartStoredTimer() = 0;
}

class MockItem : public ItemInterface
{
public:
    virtual void StartStoredTimer() override
    {
        TimerStarted = true;
    }

    bool TimerStarted = false;
}

[Test]
GivenContainer_WhenItemIsStored_ThenStorageTimerIsStarted()
{
    Container TestContainer;
    MockItem* TestItem = new MockItem();
    
    TestContainer.StoreItem(Cast(TestItem));

    TestTrue(TestItem->TimerStarted);
}

There are a few other ways to do this. The interface may be more appropriate for a heavyweight class that requires a lot of dependencies on construction. Otherwise you may want to inherit from the original class directly and override the function you’re interested in. Instead of a general item interface, you may also want to make it a storable item interface that includes only the functions the container specifically wants to access.
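The inherit-and-override alternative could look something like this in compilable C++. This is only a sketch: it assumes StartStoredTimer() can be made virtual on the real Item class, and the timer members and the call counter are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the production Item. The timer state is private, so a test
// on the container has no direct way to inspect it.
class Item
{
public:
    virtual ~Item() = default;

    // Assumption: the function is virtual so a test double can replace it.
    virtual void StartStoredTimer()
    {
        SecondsStored = 0.0f;
        TimerRunning = true;
    }

private:
    float SecondsStored = 0.0f;
    bool TimerRunning = false;
};

class Container
{
public:
    void StoreItem(Item* ItemToStore)
    {
        ContainedItems.push_back(ItemToStore);
        ItemToStore->StartStoredTimer();
    }

private:
    std::vector<Item*> ContainedItems;
};

// The mock derives from the real class and overrides only the call the
// container makes, recording how many times it happened.
class MockItem : public Item
{
public:
    void StartStoredTimer() override { ++StartStoredTimerCalls; }

    int StartStoredTimerCalls = 0;
};

void GivenContainer_WhenItemIsStored_ThenStorageTimerIsStartedOnce()
{
    Container TestContainer;
    MockItem TestItem;

    TestContainer.StoreItem(&TestItem);

    assert(TestItem.StartStoredTimerCalls == 1);
}
```

The trade-off is that constructing the mock still constructs the real Item underneath, which is why the interface route suits heavyweight classes better.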

Inject Your Dependencies

This time I’ll omit the store and retrieve functions for brevity, but they’ll still function the same way as in the other examples. Here each container in the game is set up on construction with a default set of items in it. These items are specified in a data asset that can be acquired from a global function.

class Container
{
public:
    Container()
         : ContainedItems()
    {
        ContainedItems = GetGlobalData().DefaultContainerContents;
    }

    //store and retrieve functions

private:
    Array ContainedItems;
}

In a test environment, you’d have to set up the whole application for that global call to be available. Data is very awkward in testing as well: to test that the container is working as expected, you rely on production data that could be changed at any time. The alternative is to inject this dependency in the constructor, so that you can instead sling any data you want in there.

In this case you could also just pass in the DefaultContainerContents unless you needed anything else from the data asset. This is an example where optimising for testability can improve your production code too.

class Container
{
public:
    Container(DataAsset& GlobalData)
        : ContainedItems()
    {
        ContainedItems = GlobalData.DefaultContainerContents;
    }

    //store and retrieve functions

private:
    Array ContainedItems;
}

[Test]
GivenDefaultItemStoredInContainer_WhenRetrieveItemIsCalled_ThenThatItemIsReturned()
{
    DataAsset MockData;
    Item* TestItem = new Item;

    MockData.DefaultContainerContents.Add(TestItem);

    Container TestContainer(MockData);

    TestTrue(TestContainer.RetrieveItem() == TestItem);
}
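The narrower option mentioned earlier, passing in just the DefaultContainerContents, might be sketched like this in compilable C++. The MakeDefaultContainer() helper is a hypothetical production call site, std::vector stands in for Array, and popping from the back is an assumption about the retrieval order.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Item {};

// The container asks only for what it actually uses: the starting items.
class Container
{
public:
    explicit Container(std::vector<Item*> DefaultContents)
        : ContainedItems(std::move(DefaultContents))
    {
    }

    Item* RetrieveItem()
    {
        if (ContainedItems.empty())
        {
            return nullptr;
        }
        Item* Retrieved = ContainedItems.back(); // assumption: last item first
        ContainedItems.pop_back();
        return Retrieved;
    }

private:
    std::vector<Item*> ContainedItems;
};

// Stand-in for the article's data asset.
struct DataAsset
{
    std::vector<Item*> DefaultContainerContents;
};

// Hypothetical production call site: the only place that knows about the
// data asset. Tests never need to go through it.
Container MakeDefaultContainer(const DataAsset& GlobalData)
{
    return Container(GlobalData.DefaultContainerContents);
}

void GivenDefaultItemStoredInContainer_WhenRetrieveItemIsCalled_ThenThatItemIsReturned()
{
    Item TestItem;

    // No DataAsset needed: the test passes the contents directly.
    Container TestContainer({&TestItem});

    assert(TestContainer.RetrieveItem() == &TestItem);
}
```

The container no longer depends on the DataAsset type at all, which is the production-code improvement that the narrower dependency buys.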

Happy Testing!

Acquiring these techniques for my toolkit has made tests a much more pleasant and reliable experience for me. It’s certainly becoming less of a hoop to jump through, and more of a chance to reap the rewards of how neatly I’ve written my code. Automated testing itself can be unpopular, but I recommend learning to live with it – when it’s 3 weeks from “Wouldn’t it be cool if…” to having a feature in players’ hands, it’s hard not to appreciate it.

Thanks to Eelke Schipper, Chantelle Porritt and Topher Winward for proofreading.

Jessica Baker

Software Engineer