CosmosDB Graph and PowerBI - Connecting Datasets

In a previous article I showed how you can extract a clean dataset from a CosmosDB Graph database based on one node type (label). It’s likely however that your database contains multiple nodes that are connected with edges to represent relationships. In our reporting we might need the ability to navigate some of those relationships to allow for proper filtering or more complex visualizations.
In this article we’ll explore two techniques to extract those relationships and make them available to PowerBI.

The data

For this post, I’ll be working with a sample database representing an overly simplified transaction database stored in a graph format as seen below.

Graph schema

The graph enables the main application to perform more complex traversals and recommendations when needed, but in this case we need to construct a more simple relational connection.
In some cases, data is de-normalized in the database, and one such example is the relationship between Order and OrderItem which is represented not only via the containsItem edge, but at the same time we store an OrderId property on the OrderItem node. In other cases, we need to rely on the edge to extract the relationship, as is the case with Client and Order.

Below I’ll describe ways to be able to expose both these relationships in PowerBI.

Relationship with internal property

First we’ll work with the OrderItem nodes that have a property connecting them to the Order node. We’ll create a query and expand the data just like we did in the previous article.

Order items

Once we’ve expanded the OrderItem dataset we can Close and Apply the changes and get back to the main workbook where we’ll select the relationships tab.

Order - OrderItem relationship pending

There’s no connection between the two so we need to create one by just dragging the OrderId field from OrderItems onto the Orders table. Double click the line that gets created to verify that the relationship is created properly ( in this case it was because the system detected which fields match together).

Order - OrderItem relationship

Note the source and destination fields are properly selected as well as the relationship type detected correctly as 1 -> * from Order to OrderItem. And with this the relationship is created and fully usable in reporting.

Relationship with edge

For the second case, the operation is a bit trickier. This time we’ll work with the Client nodes that don’t have a direct connection with the ‘Order’ node other than a connecting edge.

First, retrieve the Clients dataset

Clients dataset

Next, we need to grab a dataset representing the relationships between clients and orders. In a graph database that is represented by edges, and in CosmosDB, edges are also stored as documents with a few special properties that store the details for relationships that the edge describes. Of particular interest in our case are _sink, representing the id of the node the edge goes into and _vertexId representing the id of the node the edge goes out of. So we are going to create a SQL query to retrieve those edges and the properties.
Create a new query named Order-Client, and use the Advance Editor to update the query to the one below:

Source = DocumentDB.Contents("", "dev", "test", [Query = "select c._vertexId as ClientId, c._sink as OrderId from c where c._isEdge = true and c.label = 'hasOrder'"])

Once we expand the result we can Close and apply and go back to the workspace’s relationship tab. Once there, use the same technique as above to create the 1->* relationship between Clients and Order-Client and the 1-1 relationship between Orders and Order-Client. When done, your screen should look like below:

Orders - Clients relationship

Share Comments