CosmosDB Graph using Gremlin.NET

Until recently, the only way to communicate with the CosmosDB Graph API was through the DocDB Graph client, Microsoft.Azure.Graphs. As I mentioned in some of my older posts, that was not an optimal implementation.
Luckily, we can now use a native Gremlin driver to communicate with the CosmosDB graph database. Gremlin.NET is an open-source, generic Gremlin driver that can be used to communicate with a CosmosDB Graph account.

The what

First thing to notice is that there is a new endpoint exposed for our CosmosDB account in the portal:

The new endpoint is what you need for your connection string when using the native driver. The access keys are the same.

The why

If you’ve been on the Cosmos Graph journey from the beginning, like I have, you are probably wondering whether it’s worth porting your code over. The answer is definitely yes, for two reasons: the native driver is usually faster (because the queries are compiled on the server), and the code change is small. See "The how" below for exactly how it’s done.

Here are some initial performance results I was able to measure comparing the two APIs against the same collection:

g.V().count()

                Cost      Time    Size
DocDB Client    461 RU    1.4 s   1.7 MB
Native Driver   113 RU    0.3 s   1 KB

g.V().limit(10)

                Cost      Time    Size
DocDB Client    7.88 RU   0.5 s   21 KB
Native Driver   10 RU     0.4 s   19 KB

Simple Traversal

                Cost      Time    Size
DocDB Client    127 RU    0.9 s   2 KB
Native Driver   127 RU    0.4 s   1 KB

The how

Finally, some code. There isn’t much of it, because it’s very simple.
First, install the Gremlin.Net NuGet package (nuget install Gremlin.Net), then use a couple of snippets to connect to your Cosmos instance and start querying.

Connecting:

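// The first argument is the host name of the new Gremlin endpoint (no scheme or port).
var server = new GremlinServer(
    connectionString,
    443,
    enableSsl: true,
    // Cosmos expects the database and collection in the user name, and the access key as the password.
    username: $"/dbs/{databaseName}/colls/{collectionName}",
    password: key);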

Querying:

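// "server" is the GremlinServer instance created in the connecting snippet.
using (var client = new GremlinClient(server))
{
    // T is the type each result is deserialized to, for example dynamic.
    var result = await client.SubmitAsync<T>(queryString);
}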
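
Put together, the two snippets above could look roughly like the following minimal sketch. The class name, the method names and the g.V().limit(10) query are only illustrative, and all four parameters are placeholders for your own account values. Depending on the Gremlin.Net version you install, you may also need to configure the client for GraphSON 2, which is not shown here.

```csharp
using System;
using System.Threading.Tasks;
using Gremlin.Net.Driver;

// Illustrative sketch only: class and method names are made up for this example.
public static class CosmosGraphSketch
{
    // Builds the user name in the /dbs/<database>/colls/<collection> form used above.
    public static string BuildUsername(string databaseName, string collectionName) =>
        $"/dbs/{databaseName}/colls/{collectionName}";

    // hostname, databaseName, collectionName and key are placeholders supplied by the caller.
    public static async Task PrintFirstVerticesAsync(
        string hostname, string databaseName, string collectionName, string key)
    {
        var server = new GremlinServer(
            hostname,
            443,
            enableSsl: true,
            username: BuildUsername(databaseName, collectionName),
            password: key);

        using (var client = new GremlinClient(server))
        {
            // Each result is deserialized as dynamic; the query is just an example.
            var results = await client.SubmitAsync<dynamic>("g.V().limit(10)");
            foreach (var vertex in results)
            {
                // Prints each result's default string representation.
                Console.WriteLine((object)vertex);
            }
        }
    }
}
```

The client is disposed as soon as the query finishes here; in a real application you would more likely keep one client around and reuse it across queries.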

The bad

Lastly, some downsides. Because this is an open-source project that is meant to work with any Gremlin server, the driver is missing some Cosmos-specific features.

  • The library does not expose the RU usage, although Cosmos sends it back with the result.
  • The library does not automatically handle [Request rate is large] exceptions like the Graph library does. You can easily implement that yourself, since all the information about the retry timeout is included in the exception being thrown.
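
To illustrate the second point, here is a minimal sketch of a retry wrapper you could write yourself. It is deliberately independent of Gremlin.Net: ThrottleRetry and the tryGetRetryDelay callback are hypothetical names, and how you extract the retry delay from the exception depends on the driver version you use, so that part is left to the caller.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of Gremlin.Net or any Cosmos library.
public static class ThrottleRetry
{
    // operation:        the call to retry, e.g. a query submission.
    // tryGetRetryDelay: caller-supplied function that returns the delay suggested
    //                   by the server, or null if the exception is not a throttling error.
    // maxAttempts:      total number of attempts, including the first one.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, TimeSpan?> tryGetRetryDelay,
        int maxAttempts = 5)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts)
            {
                var delay = tryGetRetryDelay(ex);
                if (delay == null)
                {
                    // Not a throttling error: rethrow without retrying.
                    throw;
                }

                // Wait for the suggested time, then loop for the next attempt.
                await Task.Delay(delay.Value);
            }
        }
    }
}
```

You would pass the query submission as operation and a small function that reads the retry timeout out of the thrown exception as tryGetRetryDelay. Once the last attempt fails, the exception propagates to the caller unchanged.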