The fundamental pricing unit in Cosmos is the Request Unit (RU). A Request Unit is a measure of throughput: an abstraction over the compute, memory, and IOPS required to serve a request. The idea behind the model is that the number of Request Units an operation requires is deterministic, so the same query against the same data will always cost the same. The cost of every operation is returned to the client through the SDK or the response headers, so you can always track and manage your costs.
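To make the tracking idea concrete, here is a minimal sketch that accumulates the charge reported with each response. It assumes you have access to the raw response headers as a dictionary and reads the `x-ms-request-charge` header; the `RequestChargeTracker` class and the sample charge values are hypothetical and only for illustration.

```python
from typing import Mapping

# Response header that carries the RU cost of a single operation.
REQUEST_CHARGE_HEADER = "x-ms-request-charge"


class RequestChargeTracker:
    """Hypothetical helper that sums the RU charge of each response."""

    def __init__(self) -> None:
        self.total_rus = 0.0
        self.operations = 0

    def record(self, headers: Mapping[str, str]) -> float:
        """Read the charge from one response's headers and add it to the total."""
        # Header names are case-insensitive, so normalize before the lookup.
        normalized = {key.lower(): value for key, value in headers.items()}
        charge = float(normalized.get(REQUEST_CHARGE_HEADER, 0.0))
        self.total_rus += charge
        self.operations += 1
        return charge


# Made-up header values standing in for two real responses.
tracker = RequestChargeTracker()
tracker.record({"x-ms-request-charge": "2.83"})
tracker.record({"X-MS-Request-Charge": "1.0"})
print(f"{tracker.operations} operations, {tracker.total_rus:.2f} RUs")
```

Logging this running total per feature or per query type is one simple way to see where your provisioned RUs are actually going.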
So, what happens if you’re trying to use more than you have provisioned? Every time you issue a request that would consume more resources than are available, Cosmos rejects the call and returns details about how long you should wait before issuing the request again for the best chance of execution.
If you’re using the Cosmos SDK, you benefit from a built-in automatic retry policy that seamlessly reruns your request according to the wait times recommended by Cosmos. If Cosmos still can’t fulfill the request, the SDK will eventually give up (default policy is to retry 5 times) and you will get a “Request rate too large” exception that you will need to handle manually.
With this functionality, you often don’t even have to worry about exceeding the provisioned resources, as long as you are not seeing long-running or very high spikes in load.
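The retry behavior described above can be sketched roughly as follows. This is not the SDK's actual implementation: `ThrottledError`, `call_with_retries`, and `flaky_query` are hypothetical names, and the sketch simply assumes that a throttled call raises an error carrying the suggested wait in milliseconds.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical stand-in for a "Request rate too large" rejection."""

    def __init__(self, retry_after_ms: int) -> None:
        super().__init__("Request rate too large")
        self.retry_after_ms = retry_after_ms


def call_with_retries(
    operation: Callable[[], T],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, waiting the suggested delay after each throttled attempt."""
    attempt = 0
    while True:
        try:
            return operation()
        except ThrottledError as err:
            if attempt >= max_retries:
                # Retries exhausted: let the caller handle the error.
                raise
            attempt += 1
            sleep(err.retry_after_ms / 1000.0)


# Demo: an operation that is throttled twice and then succeeds.
calls = {"count": 0}
waits = []  # records the delays instead of really sleeping


def flaky_query() -> str:
    calls["count"] += 1
    if calls["count"] <= 2:
        raise ThrottledError(retry_after_ms=50)
    return "ok"


result = call_with_retries(flaky_query, sleep=waits.append)
print(result, calls["count"], waits)  # ok 3 [0.05, 0.05]
```

Passing `sleep` as a parameter keeps the demo instant and makes the waiting behavior easy to check; in real code you would leave the default `time.sleep` in place.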
Here are some valuable insights to help you manage your costs on Cosmos DB:
- You pay for what you provision, not what you use
This means that the RUs you provision in the portal are what you pay for, even if you don’t use any of them.
- RUs scale with regions
When you enable geo-replication on your collection, you are essentially provisioning the same RUs in every region. So to accurately calculate your provisioned RUs, you need to multiply that number by the number of regions you have enabled. For example, if you provision 10.000 RUs for a collection that is replicated across 3 regions, you will pay for 30.000 RUs for that collection.
- RUs are divided among the physical partitions
In an unlimited collection, the RUs are divided equally among the physical partitions. By default, you start with 10 partitions, but the number can grow with your data. For example, if you provision 10.000 RUs across 10 partitions, each partition will get 1.000 RUs. This is why it’s important to choose a partition key that distributes requests well, to ensure even usage of your resources.
- RUs are charged by the hour
Cosmos DB does not support per-minute billing at this time. This means that every hour, you pay for the highest number of RUs you had provisioned at any point during that hour. For example, if your regular provisioned number is 10.000 and, for even the briefest moment during that hour, you scaled your collection to 100.000 RUs, you will incur the cost of 100.000 RUs for that entire hour.
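The arithmetic behind the last three points can be written down as a small sketch. The function names are hypothetical and the numbers are the ones from the examples above; actual prices per RU are deliberately left out.

```python
from typing import Sequence


def billed_rus(provisioned_rus: int, regions: int) -> int:
    """The provisioned RUs are replicated to every enabled region."""
    return provisioned_rus * regions


def rus_per_partition(provisioned_rus: int, physical_partitions: int) -> float:
    """Throughput is split equally among the physical partitions."""
    return provisioned_rus / physical_partitions


def hourly_billed_rus(settings_during_hour: Sequence[int]) -> int:
    """Each hour is billed at the highest value provisioned during that hour."""
    return max(settings_during_hour)


print(billed_rus(10_000, regions=3))                       # 30000
print(rus_per_partition(10_000, physical_partitions=10))   # 1000.0
print(hourly_billed_rus([10_000, 100_000, 10_000]))        # 100000
```

Note that the per-partition figure is a ceiling for each partition: a hot partition key can hit its 1.000 RUs and get throttled even while the collection as a whole is far below 10.000.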