CosmosDB Query with Partition

ComsosDB is the first database system that promises <10ms latency on reads at the 99 percentile, and backs it up by an SLA. The way to achieve that performance though is though smart partitioning of your data. Here are a couple of notes that you can use when designing your queries that will definitely help improve performance - especially in an unlimited collection (one with multiple physical partitions)

If you know your Partition, send it in the query

Really, the best way to squeeze as much performance as you can from your query is to know your document id and PartitionKey. Most of the time you don’t know the id, but knowing the PartitionKey is super important.

If you send a partition key either as a ‘WHERE’ for SQL or ‘has(‘PartitonKey’, xyz) for graph the SDK will automatically forward that request to the specified partition.
For identical results, you can also specify the partition key as a FeedOption regardless of API used:
new FeedOptions() { PartitionKey = new PartitionKey(xyz) }

Depending on codebase and how you’ve managed your query creation, you might find it beneficial to use one way or the other - but in the end they are executed in the exact same way and yield the same results and same performance and usage metrics.

If you need to retrieve data from a small number of partitions, break the query into separate single partition calls

When you run a query that does not resolve to a single PartitionKey, the query engine has no options but to actually try and search for your data in every physical partition. By default, when you create an unlimited collection, CosmosDB creates 10 such physical partitions, but the number can increase as you add more data.
This is accomplished by the DocumentClient SDK, which actually “fans-out” your query to all partitions, joins the results and returns them back to you. This is obviously not very efficient, and it gets worse as your data size increases. So, in the rare occations you do need a cross partition query, evaluate and test the option of actually sending them as individual calls with a specified partition that will hit only one physical partition and are executed in one call.

To be able to actually run a query that goes cross-partition, you need to set the EnableCrossPartitionQuery feed option to true, otherwise your query will throw an exception.

Share Comments