Windows Azure Table Storage Service can be queried using various approaches. I use the Windows Azure Storage NuGet package and reusable queries.
A Query on Windows Azure Table Storage Service usually completes in 200 milliseconds. On busy systems this is an eternity! To help my services perform better, and to reduce costs of operation, I built the TableStorageReader to executes queries with cache, in turn reducing the number of transactions and reducing the latencies created by communications over the network.
The reader executes reusable queries. Reusable queries greatly simplify the application’s design by encapsulating query logic in named concepts. They are like Stored Procedures in SQL Server, where team members can discover functionality by reading file names from the query folders. More details about reusable queries in the reference section at the bottom of this post.
The TableStorageReader is a fluent api that allows you to create tables, query tables and apply cache over queries.
The code from this Post is part of the Brisebois.WindowsAzure NuGet Package
To install Brisebois.WindowsAzure, run the following command in the Package Manager Console
PM> Install-Package Brisebois.WindowsAzure
Querying Table Storage For The Top 4 Entities From a Partition
var q = new GetTopEntriesForPartition<DynamicTableEntity>("partitionKey", 4); var entities = await TableStorageReader .Table("birthdays") .Execute(q);
Querying Table Storage Service With Query Caching
Caching can be extremely beneficial in situation where stale data is acceptable. If your application or system is able to work with eventual consistency, meaning that it is able to work with data that is more than a second old, then caching is for you. It will speedup your application’s response time by reducing the number of storage transactions. In busy systems using a 2 seconds cache can yield surprising results!
Query With a Cache Miss Executed in 1.7452408 seconds
Query With a Cache Hit Executed in 0.0004883 seconds
Using a 1 minute cache
var q = new GetTopEntriesForPartition<DynamicTableEntity>("partitionKey", 4); var entities = await TableStorageReader .Table("birthdays") .WithCache() .Execute(q);
Using a 30 minute cache
var absoluteExpiration = new DateTimeOffset(DateTime.UtcNow, TimeSpan.FromMinutes(30)); var policy = new CacheItemPolicy { AbsoluteExpiration = absoluteExpiration }; var q = new GetTopEntriesForPartition<DynamicTableEntity>("partitionKey", 4); var entities = await TableStorageReader .Table("birthdays") .WithCache(policy) .Execute(q);
Using a 30 minute cache with an additional cache hint
var absoluteExpiration = new DateTimeOffset(DateTime.UtcNow, TimeSpan.FromMinutes(30)); var policy = new CacheItemPolicy { AbsoluteExpiration = absoluteExpiration }; var q = new GetTopEntriesForPartition<DynamicTableEntity>("partitionKey", 4); var entities = await TableStorageReader .Table("birthdays") .WithCache(policy,"special cache") .Execute(q);
TableStorageReader
public class TableStorageReader { private readonly CloudStorageAccount storageAccount; private readonly CloudTableClient tableClient; private readonly CloudTable tableReference; private CacheItemPolicy cachePolicy; private string cacheHint = string.Empty; private TableStorageReader(string tableName) { var cs = CloudConfigurationManager.GetSetting("StorageConnectionString"); storageAccount = CloudStorageAccount.Parse(cs); tableClient = storageAccount.CreateCloudTableClient(); tableReference = tableClient.GetTableReference(tableName); var tableServicePoint = ServicePointManager .FindServicePoint(storageAccount.TableEndpoint); tableServicePoint.UseNagleAlgorithm = false; } public TableStorageReader CreateIfNotExist() { tableReference.CreateIfNotExists(); return this; } public async Task<ICollection<TEntity>> Execute<TEntity>(CloudTableQuery<TEntity> query) where TEntity : ITableEntity { if (query == null) throw new ArgumentNullException("query"); return await Task.Run(() => { if (cachePolicy == null) return query.Execute(tableReference); var cached = new CloudTableQueryCache<TEntity>(query, cachePolicy, cacheHint); return cached.Execute(tableReference); }); } public static TableStorageReader Table(string tableName) { return new TableStorageReader(tableName); } //Cache for 1 minute public TableStorageReader WithCache() { var policy = new CacheItemPolicy { AbsoluteExpiration = DateTime.UtcNow.AddMinutes(1d) }; return WithCache(policy); } public TableStorageReader WithCache(CacheItemPolicy policy) { return WithCache(policy, string.Empty); } public TableStorageReader WithCache(CacheItemPolicy policy, string cacheKey) { if (policy == null) throw new ArgumentNullException("policy"); if (cacheKey == null) throw new ArgumentNullException("cacheKey"); cachePolicy = policy; cacheHint = cacheKey; return this; } }
Reusable Query Example
The reusable query implements a CloudTableQuery, which contains logic for computing the hash for the cache key. Implement the Execute method and the CacheKey property. The CacheKey property should return a unique value for each permutation of the query object. The Execute method will be executed by the TableStorageReader, if WithCache is used, the query result will be cached. Furthermore, queries are immutable and parameters are passed through the queries’ constructor. This means that you can execute the same query on many TableStorageReader instances in parallel. Once all the parallel queries have executed, you can aggregate the results for the consumer.
public class GetTopEntriesForPartition<TEntity> : CloudTableQuery<TEntity> where TEntity : ITableEntity, new() { private readonly string tablePartition; private readonly int take; private readonly string cacheKey; public GetTopEntriesForPartition(string partition) : this(partition, 100) { } public GetTopEntriesForPartition(string partition, int take) { if (partition == null) throw new ArgumentNullException("partition"); tablePartition = partition; this.take = take; var queryCacheHint = "GetTopEntriesForPartition" + tablePartition + take; cacheKey = queryCacheHint; } public override Task<ICollection<TEntity>> Execute(CloudTable model) { if (model == null) throw new ArgumentNullException("model"); return Task.Run(() => { var condition = MakePartitionKeyCondition(); var tableQuery = new TableQuery<TEntity>(); tableQuery = tableQuery.Where(condition).Take(take); return (ICollection<TEntity>)model.ExecuteQuery(tableQuery) .ToList(); }); } public override string GenerateCacheKey(CloudTable model) { return cacheKey; } private string MakePartitionKeyCondition() { var value = tablePartition.ToUpperInvariant(); return TableQuery.GenerateFilterCondition("PartitionKey", QueryComparisons.Equal, value); } }
CloudTableQuery
public abstract class CloudTableQuery<TEntity> : IModelQuery<Task<ICollection<TEntity>>, CloudTable> { public abstract Task<ICollection<TEntity>> Execute(CloudTable model); public abstract string GenerateCacheKey(CloudTable model); protected string MakeCacheKeyHash(string queryCacheHint) { if(string.IsNullOrWhiteSpace(queryCacheHint)) throw new ArgumentNullException("queryCacheHint"); using (var unmanaged = new SHA1CryptoServiceProvider()) { var bytes = Encoding.UTF8.GetBytes(queryCacheHint); var computeHash = unmanaged.ComputeHash(bytes); if (computeHash.Length == 0) return string.Empty; return Convert.ToBase64String(computeHash); } } }
References
- Build Reusable, Testable Queries Part 1
- Build Reusable, Testable Queries Part 2
- How to use the Table Storage Service
- Microsoft.WindowsAzure.Storage.Table Namespace
- Querying Tables and Entities
- TableQuery Methods
Filed under: Microsoft Azure, Table Storage Tagged: Best Practices, C#, Caching, Microsoft Azure, Queries, Table Storage, TableQuery
