Testable Rust

One thing I have been struggling with is making Rust code more testable. Part of the problem is my own lack of experience writing well-tested code (getting better by the day, but I have to start somewhere), and part of it is that examples in Rust are harder to find since it is a newer language. I am writing this to document a pattern I have found useful.

I will explain how to implement a Repository(-ish) pattern in Rust. I found this especially useful when writing web servers (although there isn’t any reason it can’t be used elsewhere). It is something I saw in a coworker’s project, and it ended up helping me. I want to share a small example so that others can experiment with it and see if it is useful for them.

Tests

Let’s start with the tests that show what our intentions are.

Let’s set up our test module:

#[cfg(test)]
mod should {
    use super::*;
    use mockall::predicate::*;
    use anyhow::Result;

    // ... tests will go here ...
}

This brings into scope everything our tests need. Naming the test module should (or anything besides ‘tests’…) makes the test output read nicely:

running 3 tests
test user_cache::should::return_default_nickname_when_no_user_found ... ok
test user_cache::should::return_user_nickname_when_user_found ... ok
test user_cache::should::store_new_user_with_consistent_hash ... ok

test result: ok. 3 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Let’s deal with the first two tests: getting data from the cache.

#[test]
fn return_default_nickname_when_no_user_found() -> Result<()> {
    // Setup
    let test_user_id = 42;

    let user_cache_mock = {
        // automocked traits will be named `MockNameOfTrait`
        let mut mock = MockCache::new();
        // automocked trait methods are `expect_method_name` on the mock object
        mock.expect_retrieve_user()
            // we want to ensure the correct id is passed in.
            // `eq` comes from `mockall::predicate`
            .with(eq(test_user_id))
            // Return our empty result (signifies no errors, but nothing found)
            .returning(|_user_id| Ok(None));
        // Wrap the mock in an Arc to make it easy to pass around
        // (this will be explained in the implementation)
        Arc::new(mock)
    };

    // Call the function under test
    let user_nickname = get_user_nickname(user_cache_mock, test_user_id)?;

    // Assert that the name matches the default name
    // (which is defined as a constant elsewhere)
    assert_eq!(DEFAULT_NICKNAME, user_nickname);

    Ok(())
}

#[test]
fn return_user_nickname_when_user_found() -> Result<()> {
    let test_user_id = 9;
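    let expected_user_name = "User Name".to_string();

    let user_cache_mock = {
        let mut mock = MockCache::new();
        // `returning` needs a `'static` closure, so move a clone into it
        // and keep the original around for the assertion below
        let nickname = expected_user_name.clone();
        mock.expect_retrieve_user()
            .with(eq(test_user_id))
            // This time returning actual data
            .returning(move |_user_id| {
                Ok(Some(UserData {
                    nickname: nickname.clone(),
                    some_important_number: 100,
                }))
            });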
        Arc::new(mock)
    };

    let user_name = get_user_nickname(user_cache_mock, test_user_id)?;

    assert_eq!(expected_user_name, user_name);

    Ok(())
}

This sums up some really basic requirements, but also leaves some big unknowns (these tests are using things that we have not defined yet).

Before we dive into how Cache and MockCache work, I’ll show a basic implementation that satisfies these tests.

pub fn get_user_nickname(cache: SharedCache, user_id: u64) -> anyhow::Result<String> {
    match cache.retrieve_user(user_id) {
        Ok(Some(data)) => Ok(data.nickname),
        Ok(None) => Ok(DEFAULT_NICKNAME.to_string()),
        Err(error) => Err(error),
    }
}

This is a simple match expression that returns the nickname if a user is found, returns a default if not, and propagates any error. The actual business logic here is easy to see, but we still don’t know exactly what cache.retrieve_user() is all about.
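
The same logic can also be written more compactly with the `?` operator. The sketch below is only meant as an equivalent shape, not a replacement for the version above. To stay dependency-free it swaps `anyhow::Result` for a plain `Result<_, String>`, assumes a value for `DEFAULT_NICKNAME`, and uses a hypothetical `FailingCache` stub to exercise the error branch. `Cache`, `UserData`, and `SharedCache` are trimmed copies of the definitions explained in the sections that follow.

```rust
use std::sync::Arc;

// Stand-in for `anyhow::Result`, so the sketch compiles with std alone.
type Result<T> = std::result::Result<T, String>;

const DEFAULT_NICKNAME: &str = "Anonymous"; // assumed value for illustration

pub struct UserData {
    nickname: String,
}

pub trait Cache {
    fn retrieve_user(&self, user_id: u64) -> Result<Option<UserData>>;
}

pub type SharedCache = Arc<dyn Cache + Send + Sync>;

/// `?` returns early on `Err`; `map_or_else` picks the default on `None`.
pub fn get_user_nickname(cache: SharedCache, user_id: u64) -> Result<String> {
    let user = cache.retrieve_user(user_id)?;
    Ok(user.map_or_else(|| DEFAULT_NICKNAME.to_string(), |data| data.nickname))
}

/// Hypothetical stub whose lookups always fail.
pub struct FailingCache;

impl Cache for FailingCache {
    fn retrieve_user(&self, _user_id: u64) -> Result<Option<UserData>> {
        Err("cache unavailable".to_string())
    }
}
```

Passing `Arc::new(FailingCache)` to `get_user_nickname` should hand the stub’s error back to the caller unchanged, which is the third branch of the original match.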

So what are Cache and SharedCache?

Cache Trait

First the definition for Cache and UserData (minus boilerplate derives):

pub trait Cache {
    fn retrieve_user(&self, user_id: u64) -> anyhow::Result<Option<UserData>>;
    fn store_user(&self, user_id: u64, user_data: UserData) -> anyhow::Result<()>;
}

pub struct UserData {
    nickname: String,
    some_important_number: i32,
}

(Note: if the trait will be abstracting over different data then an associated or generic type can be used, but I wanted to keep this simple so here we are using a concrete type and anyhow::Result instead of custom error types)
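
As a rough illustration of that note, the trait could be generalized with associated types. Everything in this sketch is hypothetical (`KeyedCache`, `Key`, `Value`, and `InMemoryCache` are not part of the example project), and it uses `Result<_, String>` in place of `anyhow::Result` so that it compiles with std alone.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Stand-in for `anyhow::Result`, so the sketch compiles with std alone.
type Result<T> = std::result::Result<T, String>;

/// Hypothetical generalization of `Cache`: each implementor picks its own
/// key and value types instead of the hard-coded `u64` and `UserData`.
pub trait KeyedCache {
    type Key;
    type Value;

    fn retrieve(&self, key: &Self::Key) -> Result<Option<Self::Value>>;
    fn store(&self, key: Self::Key, value: Self::Value) -> Result<()>;
}

/// Toy implementation mapping user ids to nicknames.
#[derive(Default)]
pub struct InMemoryCache {
    data: Mutex<HashMap<u64, String>>,
}

impl KeyedCache for InMemoryCache {
    type Key = u64;
    type Value = String;

    fn retrieve(&self, key: &u64) -> Result<Option<String>> {
        // A poisoned lock is reported as an error instead of panicking.
        let data = self.data.lock().map_err(|e| e.to_string())?;
        Ok(data.get(key).cloned())
    }

    fn store(&self, key: u64, value: String) -> Result<()> {
        let mut data = self.data.lock().map_err(|e| e.to_string())?;
        data.insert(key, value);
        Ok(())
    }
}
```

The trade-off is that a trait object now has to name both types, as in `dyn KeyedCache<Key = u64, Value = String>`, which is part of why the concrete version is easier to follow here.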

This Cache trait is the seam between our app and an external dependency.

In this case it is the get/set for our cache (when the dependency is a data store, this is often called the Repository pattern). This allows the actual cache implementation to be hidden while we work on the business logic around it. It can also make the code show clearer intentions: accepting something that is impl Cache says a lot more than passing a redis connection around, which could be used to get/set any arbitrary data. The cost is one more layer of abstraction.

We have a trait that is our contract with the cache, but we still need some way to use it.

To satisfy our mocking use case we will use mockall. If the attribute macro automock is added to a trait declaration (not at the implementation) then we will automatically get a MockCache type that has all the mocking helper methods. That is all that is needed for simple cases like this. See mockall’s documentation for more options if the trait is not as simple (Note: mockall also works with the async_trait crate). To keep the mock from existing in production code we can use conditional compilation to only compile the mock when compiling tests.

// Only need this imported when compiling for tests
#[cfg(test)]
use mockall::automock;

// The `cfg_attr` will apply the `automock` attribute when compiling for `test`
#[cfg_attr(test, automock)]
pub trait Cache {...}
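
To make it clearer what automock saves us from writing, here is roughly what a hand-rolled test double could look like. This is not what mockall generates. `FakeCache`, its fields, and `requested_ids` are hypothetical names, and `Result<_, String>` stands in for `anyhow::Result` so the sketch needs no dependencies.

```rust
use std::sync::Mutex;

// Stand-in for `anyhow::Result`, so the sketch compiles with std alone.
type Result<T> = std::result::Result<T, String>;

#[derive(Clone, Debug, PartialEq)]
pub struct UserData {
    nickname: String,
    some_important_number: i32,
}

pub trait Cache {
    fn retrieve_user(&self, user_id: u64) -> Result<Option<UserData>>;
    fn store_user(&self, user_id: u64, user_data: UserData) -> Result<()>;
}

/// Hand-written stand-in: returns a canned answer and records which ids
/// were requested, loosely covering what `returning` and `with` do.
pub struct FakeCache {
    canned: Option<UserData>,
    calls: Mutex<Vec<u64>>,
}

impl FakeCache {
    pub fn new(canned: Option<UserData>) -> Self {
        Self { canned, calls: Mutex::new(Vec::new()) }
    }

    /// Ids passed to `retrieve_user` so far, in call order.
    pub fn requested_ids(&self) -> Vec<u64> {
        self.calls.lock().unwrap().clone()
    }
}

impl Cache for FakeCache {
    fn retrieve_user(&self, user_id: u64) -> Result<Option<UserData>> {
        self.calls.lock().unwrap().push(user_id);
        Ok(self.canned.clone())
    }

    fn store_user(&self, _user_id: u64, _user_data: UserData) -> Result<()> {
        Ok(()) // stores are ignored by this fake
    }
}
```

A test would build the fake, call the code under test, and then assert on `requested_ids()`. Every new expectation means more hand-written bookkeeping like this, which is the part the macro takes over.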

SharedCache Type Alias

We now have a trait to define our cache, and a way to mock that… but we still can’t use it with our get_user_nickname function since that is expecting a SharedCache type. This is simply a type alias, but it will need some explanation of what it means and why it is useful.

pub type SharedCache = Arc<dyn Cache + Send + Sync>;

This type alias is not needed in many cases. There are times when a generic or impl Cache will work as a function argument, but I am showing the alias because it is a fairly clean way of getting an object that can be passed around freely, which makes it easy to use in many different contexts (for example in multithreaded or async code).

Let’s go through each part individually:

dyn Cache

We want something that implements Cache, and it will be called through dynamic dispatch (the method is looked up at runtime, as opposed to being filled in at compile time like generics are).

... + Send + Sync

This object should also be safe to send to and share with other threads (these bounds can be removed if that isn’t actually a requirement).

Arc<...>

Wrap that entire object in an Arc so it is cheap to clone and easy to move around. (This is especially useful when a single object is constructed at server startup and a clone is passed to all worker threads.)

pub type SharedCache = ...

Now assign it a name that we can use elsewhere instead of the verbose type annotation. Even after understanding what the type does, an alias reduces noise in function signatures so we can see intentions rather than just details.
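
As a small illustration of why the Arc and the Send + Sync bounds matter, the sketch below builds one cache and hands a clone of the handle to several threads. It is std-only and simplified: the one-method trait (`retrieve_number`), `CountingCache`, and `query_from_threads` are made up for the example. A real server would pass the clones to its worker threads or async tasks instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Simplified, hypothetical one-method version of the trait.
pub trait Cache {
    fn retrieve_number(&self, user_id: u64) -> Option<i32>;
}

pub type SharedCache = Arc<dyn Cache + Send + Sync>;

/// Hypothetical cache that counts lookups; the atomic keeps it `Sync`.
#[derive(Default)]
pub struct CountingCache {
    lookups: AtomicUsize,
}

impl Cache for CountingCache {
    fn retrieve_number(&self, user_id: u64) -> Option<i32> {
        self.lookups.fetch_add(1, Ordering::SeqCst);
        Some(user_id as i32 * 2)
    }
}

/// Spawns `workers` threads, each holding its own clone of the handle.
pub fn query_from_threads(cache: SharedCache, workers: u64) -> Vec<Option<i32>> {
    let handles: Vec<_> = (0..workers)
        .map(|id| {
            // Cloning an Arc copies a pointer and bumps a reference count;
            // the cache itself is not duplicated.
            let cache = Arc::clone(&cache);
            thread::spawn(move || cache.retrieve_number(id))
        })
        .collect();

    // Join in spawn order so the results line up with the worker ids.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Without `Send + Sync` in the alias, the `thread::spawn` call would be rejected at compile time, because the closure captures the handle and has to be `Send`.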

Trait Implementation

Now that SharedCache and Cache are defined, we can use them in function signatures or in struct/enum definitions, depending on the use case. In some cases an impl Cache is enough; in others we need to be able to clone the cache (say, to pass it to an async function that runs in a background thread), and that is where SharedCache comes in. One thing to note is that SharedCache itself does not implement Cache by default. If you find yourself needing that, Cache can be implemented on Arc<T> where T: Cache + ?Sized (the ?Sized bound is what lets T be the trait object), with each method just calling .as_ref() or similar to reach the inner type’s implementation.
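
One possible shape for that forwarding implementation is sketched below, next to a function that takes `impl Cache` so the effect is visible. It is std-only and simplified: the trait has a single made-up method (`retrieve_number`), and `FixedCache` and `lookup` are hypothetical. The sketch dereferences with `(**self)`, which is one of the "or similar" options.

```rust
use std::sync::Arc;

// Simplified, hypothetical one-method version of the trait.
pub trait Cache {
    fn retrieve_number(&self, user_id: u64) -> Option<i32>;
}

pub type SharedCache = Arc<dyn Cache + Send + Sync>;

/// Forward every method to the value inside the `Arc`.
/// `?Sized` allows `T` to be a trait object such as `dyn Cache + Send + Sync`.
impl<T: Cache + ?Sized> Cache for Arc<T> {
    fn retrieve_number(&self, user_id: u64) -> Option<i32> {
        // `self` is `&Arc<T>`, so `**self` is the inner `T`.
        (**self).retrieve_number(user_id)
    }
}

/// Accepts anything that implements `Cache`, by reference.
pub fn lookup(cache: &impl Cache, user_id: u64) -> Option<i32> {
    cache.retrieve_number(user_id)
}

/// Hypothetical implementation that knows exactly one user.
pub struct FixedCache;

impl Cache for FixedCache {
    fn retrieve_number(&self, user_id: u64) -> Option<i32> {
        if user_id == 1 { Some(100) } else { None }
    }
}
```

With the forwarding impl in place, `lookup(&FixedCache, 1)` and `lookup(&shared, 1)` (where `shared` is a `SharedCache`) both compile. Without it, only the first one would.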

One last thing is to implement the trait on a struct. In this case I will use a very bad implementation just to show the structure and one helpful method when dealing with the type alias SharedCache.

pub struct TotallyRealCache {
    data: Arc<Mutex<HashMap<u64, UserData>>>,
}

impl TotallyRealCache {
    pub fn new() -> Self {
        Self {
            data: Arc::new(Mutex::new(HashMap::new())),
        }
    }

    /// This method will upcast this type automatically on construction.
    /// This is useful if the type is just going to act as a trait object.
    /// If it ends up being used for other things as well, then
    /// upcasting at the call site is a better way to go about it.
    pub fn new_shared() -> SharedCache {
        Arc::new(Self::new())
    }
}

/// Just enough to make this compile. I do not recommend actually using this!
/// Real implementations would use a redis crate or similar. This layer should
/// be as thin as possible and can be covered in a full integration/system test
impl Cache for TotallyRealCache {
    fn retrieve_user(&self, user_id: u64) -> Result<Option<UserData>> {
        let data = self.data.lock().unwrap();
        Ok(data.get(&user_id).cloned())
    }

    fn store_user(&self, user_id: u64, user_data: UserData) -> Result<()> {
        let mut data = self.data.lock().unwrap();
        // `user_data` is already owned here, so no clone is needed
        data.insert(user_id, user_data);
        Ok(())
    }
}

And that is all there is to it. Depending on the structure of the app, different aspects of this pattern can be used on their own, and the pattern can be extended for more complex needs (say, if it were packaged as a crate for use in other projects). Either way, I hope this explains things well enough to be helpful.
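
To see the pieces working together without a mock, the sketch below stores a user through a SharedCache handle and reads the nickname back. It repeats simplified copies of the definitions above so that it compiles on its own, swaps `anyhow::Result` for `Result<_, String>` to avoid dependencies, and assumes a value for `DEFAULT_NICKNAME`. The `demo` function is a hypothetical name used only for this walkthrough.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Stand-in for `anyhow::Result`, so the sketch compiles with std alone.
type Result<T> = std::result::Result<T, String>;

const DEFAULT_NICKNAME: &str = "Anonymous"; // assumed value for illustration

#[derive(Clone)]
pub struct UserData {
    nickname: String,
    some_important_number: i32,
}

pub trait Cache {
    fn retrieve_user(&self, user_id: u64) -> Result<Option<UserData>>;
    fn store_user(&self, user_id: u64, user_data: UserData) -> Result<()>;
}

pub type SharedCache = Arc<dyn Cache + Send + Sync>;

#[derive(Default)]
pub struct TotallyRealCache {
    data: Mutex<HashMap<u64, UserData>>,
}

impl TotallyRealCache {
    /// Builds the cache and upcasts it to the trait-object alias.
    pub fn new_shared() -> SharedCache {
        Arc::new(Self::default())
    }
}

impl Cache for TotallyRealCache {
    fn retrieve_user(&self, user_id: u64) -> Result<Option<UserData>> {
        let data = self.data.lock().map_err(|e| e.to_string())?;
        Ok(data.get(&user_id).cloned())
    }

    fn store_user(&self, user_id: u64, user_data: UserData) -> Result<()> {
        let mut data = self.data.lock().map_err(|e| e.to_string())?;
        data.insert(user_id, user_data);
        Ok(())
    }
}

pub fn get_user_nickname(cache: SharedCache, user_id: u64) -> Result<String> {
    match cache.retrieve_user(user_id) {
        Ok(Some(data)) => Ok(data.nickname),
        Ok(None) => Ok(DEFAULT_NICKNAME.to_string()),
        Err(error) => Err(error),
    }
}

/// Stores one user, then looks up that user and an unknown one.
pub fn demo() -> Result<(String, String)> {
    let cache = TotallyRealCache::new_shared();
    let user = UserData { nickname: "User Name".to_string(), some_important_number: 100 };
    cache.store_user(9, user)?;
    let known = get_user_nickname(Arc::clone(&cache), 9)?;
    let unknown = get_user_nickname(cache, 42)?;
    Ok((known, unknown))
}
```

The stored user comes back under its own nickname and the unknown id falls back to the default, so the business logic behaves the same whether it is handed the mock or this implementation.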

Conclusion

I think the main benefit of this pattern is that it keeps a clear seam right where the external dependencies are. I see this being most useful when interacting with things like redis, a db, kafka, another api, etc. The key is to create a trait right at that boundary. The trait defines the interface you want to work with inside your app. The implementation can be very short (and tested in a full integration/system test), while all the business logic around the external dependency becomes easy to cover with unit tests. This does not replace integration tests: the code inside the trait implementation is still hard to unit test and needs to be exercised against the real thing.

This still needs some kinks worked out because I have a feeling there are better ways to do this. But as it is I think this can be useful.