Monday, July 1, 2013

Unit testing Azure Table Storage Queries

I was thinking about how to unit test queries against Azure Table Storage. My first thought was to create a shim on the TableServiceContext object that intercepts queries and returns whatever I want instead. In some cases this is fine, but in others I'd really like to test whether the LINQ query I wrote is actually correct. For that, what I wanted was the ability to create an IEnumerable object holding the "contents" of my "table", then run my query against it so that the IEnumerable is filtered the same way the Azure Table Storage API would filter my real entities. Most of the solutions available online assume that you will create the query's return value before calling into the shim. That is fine for testing the code leading up to the query and the code that runs after it, but it does little to make sure the query itself is correct.

I set out to create a shim that would take an IEnumerable and filter it as though it were the contents of an Azure table. This is not a perfect reproduction of the Azure environment, but it works for many purposes. The main problem is that the CreateQuery<T> method in TableServiceContext returns a DataServiceQuery<T> object, which is not easily shimmed. However, if you are using a CloudTableQuery<T> object in your queries, you can shim its Execute method to get the outcome you want.

The really tricky part was how to run the query on your IEnumerable instead of the actual table. It turns out that one of the properties exposed by the IQueryable interface is Expression, which returns the expression tree that describes the query. For a CloudTableQuery or DataServiceQuery object with a where clause, the Expression is of type MethodCallExpression. A little digging in the tree (by checking the Arguments property of the Expression) and you will find that somewhere in there is an Expression whose specific type is UnaryExpression. Its Operand is the actual lambda that will be used to filter the results. In Azure, that means it will be converted to the filter string that's included in the REST query, but there's no reason we can't apply it to our own IEnumerable instead.

How to do this? Easy. First convert the IEnumerable to an IQueryable with AsQueryable(). Then take the Operand of the UnaryExpression and cast it to a lambda expression (Expression<Func<T, bool>>), pass that to the Where method on the IQueryable object, and you're done.
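The extraction step can be tried without Azure or Fakes at all, since any LINQ-to-objects IQueryable builds the same Where(source, Quote(lambda)) shape. The following is only a minimal sketch under that assumption: PredicateReplayDemo, ExtractPredicate and Run are hypothetical names made up for illustration, and a plain string array stands in for the table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class PredicateReplayDemo
{
    // Hypothetical helper: assumes the outermost node of the query is a
    // single Where(...) call and returns the lambda that was passed to it.
    public static Expression<Func<T, bool>> ExtractPredicate<T>(IQueryable<T> query)
    {
        // Where(source, predicate) shows up as a MethodCallExpression.
        var call = (MethodCallExpression)query.Expression;

        // Arguments[1] is a Quote node (a UnaryExpression) wrapping the lambda.
        var quote = (UnaryExpression)call.Arguments[1];

        return (Expression<Func<T, bool>>)quote.Operand;
    }

    public static List<string> Run()
    {
        // Stand-in for the "real" query: an empty source plus a filter.
        IQueryable<string> original =
            new string[0].AsQueryable().Where(s => s.Length > 3);

        Expression<Func<string, bool>> predicate = ExtractPredicate(original);

        // Replay the same filter over the fake table contents.
        var fakeTable = new[] { "a", "abcd", "xyz", "hello" };
        return fakeTable.AsQueryable().Where(predicate).ToList();
    }
}
```

Calling PredicateReplayDemo.Run() keeps only the strings longer than three characters, even though the filter was originally written against a different (empty) source.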

Additionally, if you don't call AsTableServiceQuery() on your queries, you're not entirely out of luck. You can pull a similar trick by putting a shim on DataServiceQuery<T>.GetEnumerator.

So after figuring out these two things, with a little bit of extra magic for how to actually set which objects are being used, here is the code I came up with.



[TestMethod]
public void here_is_my_test()
{
    IEnumerable<MyEntityType> fakeTableEntries = GenerateFakeTableEntries();

    using (ShimsContext.Create())
    {
        TableContextSpy<MyEntityType> spy = new TableContextSpy<MyEntityType>();
        spy.AddRange(fakeTableEntries);

        DoQuery();

        AssertStuff();
    }
}

public class TableContextSpy<T> where T : TableServiceEntity
{
    // In-memory stand-in for the table. EntityComparer<T> is not shown here;
    // it is whatever IComparer<T> you use to order the entities.
    private readonly SortedSet<T> FakeTable;

    public TableContextSpy()
    {
        IComparer<T> comparer = new EntityComparer<T>();
        FakeTable = new SortedSet<T>(comparer);

        // Used when the query went through AsTableServiceQuery().
        ShimCloudTableQuery<T>.AllInstances.Execute =
            (instance) => Filter(instance.Expression);

        // Used when the DataServiceQuery<T> is enumerated directly.
        ShimDataServiceQuery<T>.AllInstances.GetEnumerator =
            (instance) => Filter(instance.Expression).GetEnumerator();
    }

    // Applies the Where predicate found in the query's expression tree
    // to the fake table instead of the real one.
    private IQueryable<T> Filter(Expression expression)
    {
        // Get the expression evaluator.
        MethodCallExpression ex = (MethodCallExpression)expression;

        // Depending on how I called CreateQuery, sometimes the objects
        // I need are nested one level deep.
        if (ex.Arguments[0] is MethodCallExpression)
        {
            ex = (MethodCallExpression)ex.Arguments[0];
        }

        // The second argument of Where is a UnaryExpression wrapping the lambda.
        // Hard casts make an unexpected tree shape fail here, not later on a null.
        UnaryExpression ue = (UnaryExpression)ex.Arguments[1];

        // Get the lambda expression
        Expression<Func<T, bool>> le = (Expression<Func<T, bool>>)ue.Operand;

        return FakeTable.AsQueryable().Where(le);
    }

    public void Add(T entity)
    {
        FakeTable.Add(entity);
    }

    public void AddRange(IEnumerable<T> items)
    {
        FakeTable.UnionWith(items);
    }
}

There are a couple of issues with this. First, it only handles queries; it does not handle adding, updating or deleting. There are fairly simple ways to do that, however, by putting shims onto AddObject, UpdateObject, DeleteObject and SaveChanges.

Second, I have not tested this with queries that are built up in several steps rather than written as a single LINQ expression. For example, if I were to do something like:

var query = from obj in CreateQuery<MyEntityType>(tableName)
            where obj.RowKey.CompareTo("foo") > 0
            select obj;
query = query.Where(obj => obj.PartitionKey == "pk");
query = query.Where(obj => obj.SomeOtherProperty == "someProp");

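This might work with the code I posted above, but it also may fail terribly. The shim only unwraps one level of nesting and applies a single predicate, so at best one of the three filters here would be used. In any case, this at least works for some simple queries.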
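One possible way to cope with chained filters is to walk down the nested Where calls and replay every predicate, not just one. This is an untested idea as far as Azure goes: the sketch below runs against plain LINQ-to-objects queries only, and ChainedWhereDemo, ReplayAllFilters and Run are hypothetical names made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class ChainedWhereDemo
{
    // Hypothetical helper: walks nested Where(...) calls from the outside in
    // and applies each predicate it finds to the fake table.
    public static IQueryable<T> ReplayAllFilters<T>(
        Expression expression, IQueryable<T> fakeTable)
    {
        IQueryable<T> result = fakeTable;
        var call = expression as MethodCallExpression;

        while (call != null && call.Method.Name == "Where" && call.Arguments.Count == 2)
        {
            var quote = call.Arguments[1] as UnaryExpression;
            var predicate = quote == null
                ? null
                : quote.Operand as Expression<Func<T, bool>>;

            // Stop on anything that is not a plain Func<T, bool> filter.
            if (predicate == null) break;

            // The filters are ANDed together, so the order does not matter.
            result = result.Where(predicate);

            // Arguments[0] is the source, which may be another Where call.
            call = call.Arguments[0] as MethodCallExpression;
        }

        return result;
    }

    public static List<int> Run()
    {
        // Build the query in several steps, like the example above.
        IQueryable<int> query = new int[0].AsQueryable();
        query = query.Where(n => n > 2);
        query = query.Where(n => n % 2 == 0);
        query = query.Where(n => n < 9);

        var fakeTable = Enumerable.Range(1, 10).AsQueryable();
        return ReplayAllFilters(query.Expression, fakeTable).ToList();
    }
}
```

The walk stops at the first node that is not a Where call, so any other operator sitting in the chain would hide the filters beneath it. Whether the expression trees produced by CloudTableQuery and DataServiceQuery nest in exactly this way is an assumption that would need checking.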