Introduction
Azure Cosmos DB is a globally distributed, multi-model database service from Microsoft. It exposes several APIs, each aligned with a particular data model, so developers can build scalable, high-performance applications on the model that best fits their data. From C#, you interact with Cosmos DB through the client library for whichever API you choose.
A data model describes how data is structured and stored within the database. The concept matters here because Cosmos DB is a multi-model database: the data model you pick determines which API, query language, and client library you work with.
Here’s an overview of the data models and a breakdown of the APIs available in Cosmos DB.
Data Models with Usage and Supported APIs
1. Document Model
This model is built to store and manage data in a flexible, schema-agnostic manner, often utilizing JSON-like documents.
1.1. Usage of Data Model
It is particularly well-suited for applications that deal with hierarchical or semi-structured data. Each document is independent, supporting nested structures and varying fields, which removes the necessity for fixed schemas. Common use cases include storing data with flexible schemas, such as user profiles or product catalogs.
1.2. Supported API
The document model is supported primarily through the SQL API (named API for NoSQL in Microsoft's current documentation), which lets developers run rich queries, use indexing, and perform transactions on JSON documents for efficient data retrieval and manipulation.
The document model is also available through the MongoDB API, which is compatible with the MongoDB wire protocol, so you can keep using MongoDB drivers, tools, and query syntax. Familiarity with MongoDB queries and operators is assumed.
The MongoDB API is the better fit if you have an existing MongoDB application and want to migrate it to Cosmos DB with minimal changes.
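As a rough illustration of that migration path, the sketch below uses the official MongoDB .NET driver (the MongoDB.Driver NuGet package) against a Cosmos DB account created with the MongoDB API. The connection string placeholder, the class name MongoApiSketch, and the database and collection names are assumptions chosen for illustration, and the collection is assumed to be unsharded.

```csharp
using System;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

public static class MongoApiSketch
{
    // Placeholder (assumption): use the connection string of your own MongoDB API account.
    private const string ConnectionString = "<your-cosmosdb-mongodb-connection-string>";

    // Builds a product document with nested attributes, as a plain BSON document.
    public static BsonDocument BuildProduct() => new BsonDocument
    {
        { "name", "Wireless Mouse" },
        { "category", "Electronics" },
        { "price", 29.99 },
        { "attributes", new BsonDocument
            {
                { "brand", "Logitech" },
                { "color", "Black" },
                { "wireless", true }
            }
        }
    };

    public static async Task Main()
    {
        // The same driver calls used against MongoDB itself.
        var client = new MongoClient(ConnectionString);
        IMongoCollection<BsonDocument> products =
            client.GetDatabase("ProductCatalog").GetCollection<BsonDocument>("Products");

        await products.InsertOneAsync(BuildProduct());

        // MongoDB-style filter instead of a SQL query string.
        var filter = Builders<BsonDocument>.Filter.Eq("category", "Electronics");
        var matches = await products.Find(filter).ToListAsync();

        foreach (var document in matches)
        {
            Console.WriteLine(document.ToJson());
        }
    }
}
```

Nothing in this code is specific to Cosmos DB apart from the connection string, which is the point of the MongoDB API: existing driver code can usually stay as it is.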
Below is a C# example that illustrates the document model with the SQL API:
/*
Creates a CosmosClient, a database, and a container,
then inserts a product and queries products by category.
*/
using Microsoft.Azure.Cosmos;
using System;
using System.Threading.Tasks;

public class Program
{
    private static readonly string EndpointUri = "<your-cosmosdb-endpoint>";
    private static readonly string PrimaryKey = "<your-cosmosdb-key>";
    private static readonly string DatabaseId = "ProductCatalog";
    private static readonly string ContainerId = "Products";

    public static async Task Main(string[] args)
    {
        // CosmosClient is disposable; the using declaration releases it when Main exits.
        using CosmosClient cosmosClient = new CosmosClient(EndpointUri, PrimaryKey);
        Database database = await cosmosClient.CreateDatabaseIfNotExistsAsync(DatabaseId);
        // "/category" is the partition key path of the container.
        Container container = await database.CreateContainerIfNotExistsAsync(ContainerId, "/category");
        Console.WriteLine("Database and container created successfully.");

        // A schema-free document with a nested "attributes" object.
        var product = new
        {
            id = Guid.NewGuid().ToString(),
            name = "Wireless Mouse",
            category = "Electronics",
            price = 29.99,
            attributes = new
            {
                brand = "Logitech",
                color = "Black",
                wireless = true
            }
        };

        // 'var' is needed here: the item is an anonymous type, so the result is not an ItemResponse<dynamic>.
        var response = await container.CreateItemAsync(product, new PartitionKey(product.category));
        Console.WriteLine($"Inserted product with ID: {response.Resource.id}");

        // A parameterized query avoids building SQL text from raw values.
        QueryDefinition queryDefinition = new QueryDefinition("SELECT * FROM c WHERE c.category = @category")
            .WithParameter("@category", "Electronics");
        FeedIterator<dynamic> queryResultSetIterator = container.GetItemQueryIterator<dynamic>(queryDefinition);

        // Results arrive in pages; keep reading until the iterator is exhausted.
        while (queryResultSetIterator.HasMoreResults)
        {
            FeedResponse<dynamic> currentResultSet = await queryResultSetIterator.ReadNextAsync();
            foreach (var item in currentResultSet)
            {
                Console.WriteLine($"Product ID: {item.id}");
                Console.WriteLine($"Name: {item.name}");
                Console.WriteLine($"Category: {item.category}");
                Console.WriteLine($"Price: {item.price}");
                Console.WriteLine("Attributes:");
                Console.WriteLine($" Brand: {item.attributes.brand}");
                Console.WriteLine($" Color: {item.attributes.color}");
                Console.WriteLine($" Wireless: {item.attributes.wireless}");
                Console.WriteLine();
            }
        }
    }
}
2. Key-Value Model
The key-value model is one of the simplest and most efficient data models: data is stored as a collection of key-value pairs, and each key is unique and maps directly to a single value.
2.1. Usage of Key-Value Model
This model is ideal for scenarios requiring fast lookups, caching, or storing simple data structures.
2.2. Supported API
This model is supported through its Table API, which is compatible with Azure Table Storage.
C# Example. Key-Value Model with Cosmos DB Table API
/*
Demonstrates how to create a CloudStorageAccount and a CloudTableClient for the Table API,
get a reference to a table (creating it if it doesn't exist),
insert a key-value pair, and retrieve a value by key.
Note: this uses the Microsoft.Azure.Cosmos.Table package; Microsoft now recommends Azure.Data.Tables for new projects.
*/
using Microsoft.Azure.Cosmos.Table;
using System;
using System.Threading.Tasks;

public class Program
{
    private static readonly string ConnectionString = "<your-cosmosdb-table-api-connection-string>";
    private static readonly string TableName = "MyKeyValueTable";

    public static async Task Main(string[] args)
    {
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(ConnectionString);
        CloudTableClient tableClient = storageAccount.CreateCloudTableClient();
        CloudTable table = tableClient.GetTableReference(TableName);
        await table.CreateIfNotExistsAsync();

        // The key is the (partition key, row key) pair; the value is the set of properties.
        var entity = new DynamicTableEntity("partitionKey", "rowKey");
        entity.Properties.Add("Name", new EntityProperty("M Imran Ansari"));
        entity.Properties.Add("Age", new EntityProperty(35));

        // InsertOrReplace overwrites any existing entity stored under the same key.
        TableOperation insertOperation = TableOperation.InsertOrReplace(entity);
        await table.ExecuteAsync(insertOperation);
        Console.WriteLine("Key-value pair inserted successfully.");

        // Point lookup by key.
        TableOperation retrieveOperation = TableOperation.Retrieve<DynamicTableEntity>("partitionKey", "rowKey");
        TableResult result = await table.ExecuteAsync(retrieveOperation);
        if (result.Result != null)
        {
            var retrievedEntity = (DynamicTableEntity)result.Result;
            Console.WriteLine($"Retrieved Name: {retrievedEntity.Properties["Name"].StringValue}");
            Console.WriteLine($"Retrieved Age: {retrievedEntity.Properties["Age"].Int32Value}");
        }
        else
        {
            Console.WriteLine("Key-value pair not found.");
        }
    }
}
3. Column-Family Model
This data model organizes data into columns and rows, but unlike traditional relational databases, it groups columns into families. Each column family contains a set of related columns, and each row can have a different set of columns within a family.
3.1. Usage of Column-Family Model
This model is highly efficient for read and write operations, especially when dealing with large-scale, sparse datasets. It is commonly used in applications like time-series data, recommendation systems, and big data analytics.
3.2. Supported API
This model is supported through its Cassandra API, which is compatible with Apache Cassandra. Here is an explanation and a C# example of how to use the Column-Family Model in Cosmos DB.
C# Example. Column-Family Model with Cosmos DB Cassandra API
/*
Demonstrates how to create a Cassandra cluster instance and a session for the Cassandra API,
create a keyspace and a table (if they don't exist),
insert data into the table, and retrieve it.
*/
using Cassandra;
using System;
using System.Collections.Generic;
using System.Net.Security;
using System.Security.Authentication;

public class Program
{
    private static readonly string ContactPoint = "<your-cosmosdb-cassandra-endpoint>";
    private static readonly string Username = "<your-username>";
    private static readonly string Password = "<your-password>";
    private static readonly string Keyspace = "mykeyspace";
    private static readonly string TableName = "mycolumnfamilytable";

    public static void Main(string[] args)
    {
        // The Cosmos DB Cassandra endpoint requires TLS, so SSL options must be supplied.
        var sslOptions = new SSLOptions(
            SslProtocols.Tls12,
            true, // check certificate revocation
            (sender, certificate, chain, errors) => errors == SslPolicyErrors.None);
        sslOptions.SetHostNameResolver(ipAddress => ContactPoint);

        Cluster cluster = Cluster.Builder()
            .WithCredentials(Username, Password)
            .WithPort(10350) // Default port for Cosmos DB Cassandra API
            .AddContactPoint(ContactPoint)
            .WithSSL(sslOptions)
            .Build();
        ISession session = cluster.Connect();

        session.Execute($"CREATE KEYSPACE IF NOT EXISTS {Keyspace} WITH REPLICATION = {{ 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 1 }};");
        session.Execute($"USE {Keyspace};");

        // Each MAP column groups related attributes; rows may hold different keys in each map.
        session.Execute($@"
            CREATE TABLE IF NOT EXISTS {TableName} (
                user_id UUID PRIMARY KEY,
                profile MAP<TEXT, TEXT>,
                preferences MAP<TEXT, TEXT>
            );");
        Console.WriteLine("Table created successfully.");

        var userId = Guid.NewGuid();
        var insertQuery = $@"
            INSERT INTO {TableName} (user_id, profile, preferences)
            VALUES ({userId}, {{ 'name': 'M Imran Ansari', 'age': '35' }}, {{ 'theme': 'dark', 'notifications': 'enabled' }});";
        session.Execute(insertQuery);
        Console.WriteLine("Data inserted successfully.");

        var selectQuery = $"SELECT * FROM {TableName} WHERE user_id = {userId};";
        var result = session.Execute(selectQuery);
        foreach (var row in result)
        {
            Console.WriteLine($"User ID: {row["user_id"]}");
            Console.WriteLine("Profile:");
            // GetValue<T> returns the map with its element types, avoiding an unchecked 'as' cast.
            foreach (var entry in row.GetValue<IDictionary<string, string>>("profile"))
            {
                Console.WriteLine($"{entry.Key}: {entry.Value}");
            }
            Console.WriteLine("Preferences:");
            foreach (var entry in row.GetValue<IDictionary<string, string>>("preferences"))
            {
                Console.WriteLine($"{entry.Key}: {entry.Value}");
            }
        }
    }
}
4. Graph Model
The Graph Model in Azure Cosmos DB is designed to represent and store data as a collection of nodes (vertices) and edges (relationships).
4.1. Usage of Graph Model
This model is ideal for applications that require managing complex relationships, such as social networks, recommendation engines, fraud detection systems, or knowledge graphs.
4.2. Supported API
This model is supported through the Gremlin API, which lets you perform graph traversals and queries using the Gremlin query language.
Here is an explanation and a C# example of how to use the Graph Model in Cosmos DB.
C# Example. Graph Model with Cosmos DB Gremlin API (Social Network)
/*
Demonstrates how to create a Gremlin server connection and client,
add vertices (nodes) and edges (relationships), and query the graph.
*/
using Gremlin.Net.Driver;
using Gremlin.Net.Structure.IO.GraphSON;
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Program
{
    private static readonly string Hostname = "<your-cosmosdb-gremlin-endpoint>";
    private static readonly int Port = 443;
    private static readonly string AuthKey = "<your-cosmosdb-key>";
    private static readonly string Database = "socialnetwork";
    private static readonly string Collection = "users";

    public static async Task Main(string[] args)
    {
        var gremlinServer = new GremlinServer(Hostname, Port, enableSsl: true, username: $"/dbs/{Database}/colls/{Collection}", password: AuthKey);
        // This constructor overload matches Gremlin.Net 3.4.x; later versions take a message serializer instead.
        using (var gremlinClient = new GremlinClient(gremlinServer, new GraphSON2Reader(), new GraphSON2Writer(), GremlinClient.GraphSON2MimeType))
        {
            // If the graph container is partitioned, each vertex must also set the partition key property.
            await AddVertexAsync(gremlinClient, "person", new Dictionary<string, object> { { "id", "1" }, { "name", "M Imran Ansari" }, { "age", 35 } });
            await AddVertexAsync(gremlinClient, "person", new Dictionary<string, object> { { "id", "2" }, { "name", "M Saqlain" }, { "age", 25 } });
            await AddVertexAsync(gremlinClient, "person", new Dictionary<string, object> { { "id", "3" }, { "name", "Kashif Shahzad" }, { "age", 28 } });
            await AddEdgeAsync(gremlinClient, "1", "2", "knows");
            await AddEdgeAsync(gremlinClient, "2", "3", "knows");
            Console.WriteLine("Vertices and edges added successfully.");

            // Follow outgoing 'knows' edges from vertex 1 and return the neighbours' names.
            var query = "g.V('1').out('knows').values('name')";
            var result = await gremlinClient.SubmitAsync<dynamic>(query);
            Console.WriteLine("People M Imran Ansari knows:");
            foreach (var name in result)
            {
                Console.WriteLine(name);
            }
        }
    }

    private static async Task AddVertexAsync(GremlinClient gremlinClient, string label, Dictionary<string, object> properties)
    {
        var query = $"g.addV('{label}')";
        foreach (var property in properties)
        {
            // Quote strings; emit integers unquoted so 'age' is stored as a number.
            // Handles only the string and integer values used in this example.
            var value = property.Value is string text ? $"'{text}'" : property.Value.ToString();
            query += $".property('{property.Key}', {value})";
        }
        await gremlinClient.SubmitAsync<dynamic>(query);
    }

    private static async Task AddEdgeAsync(GremlinClient gremlinClient, string fromId, string toId, string label)
    {
        var query = $"g.V('{fromId}').addE('{label}').to(g.V('{toId}'))";
        await gremlinClient.SubmitAsync<dynamic>(query);
    }
}
5. Relational-Like Model
Cosmos DB is a NoSQL database, but it supports relational-like modeling through its Core (SQL) API and techniques such as embedded documents, denormalization, and partitioning. Unlike SQL Server, it does not support traditional relational joins across containers, so you need to model data differently to optimize queries.
5.1. Methods to Implement a Relational-Like Model in Cosmos DB:
5.1.1. Embedded Documents (Denormalization)
- Store related data within the same document instead of separate tables (like in SQL).
- Reduces the need for costly cross-document queries.
Here is an example of an order object:
{
    "id": "order123",
    "customerId": "cust001",
    "orderDate": "2024-01-30",
    "items": [
        {
            "productId": "prod101",
            "quantity": 2,
            "price": 25.99
        },
        {
            "productId": "prod102",
            "quantity": 1,
            "price": 12.5
        }
    ]
}
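To make the embedded shape concrete from the C# side, here is a minimal sketch that models the order above as nested types and serializes it with System.Text.Json. The type and member names (Order, OrderItem, EmbeddedOrderSketch, Total) are hypothetical and chosen only for this illustration; no Cosmos DB call is involved.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical types mirroring the order document: the items live inside the order.
public record OrderItem(string ProductId, int Quantity, decimal Price);
public record Order(string Id, string CustomerId, string OrderDate, List<OrderItem> Items);

public static class EmbeddedOrderSketch
{
    // Serializes the order and its embedded items as one JSON document.
    public static string ToJson(Order order)
    {
        var options = new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase, // id, customerId, orderDate, items
            WriteIndented = true
        };
        return JsonSerializer.Serialize(order, options);
    }

    // Because the items are embedded, the order total needs no second lookup.
    public static decimal Total(Order order)
    {
        decimal total = 0m;
        foreach (var item in order.Items)
        {
            total += item.Quantity * item.Price;
        }
        return total;
    }

    public static void Main()
    {
        var order = new Order("order123", "cust001", "2024-01-30", new List<OrderItem>
        {
            new OrderItem("prod101", 2, 25.99m),
            new OrderItem("prod102", 1, 12.5m)
        });

        Console.WriteLine(ToJson(order));
        Console.WriteLine(Total(order)); // 2 * 25.99 + 1 * 12.5 = 64.48
    }
}
```

The trade-off is that the whole order, items included, is read and written as a single unit, which suits data that is usually accessed together.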
5.1.2. Reference Model (Normalization)
- Similar to foreign keys in SQL but managed at the application level.
- Useful when data is highly reusable (e.g., a product catalog referenced by multiple orders).
Here is an example of the orders and items collections:
// Orders collection:
{
    "id": "order123",
    "customerId": "cust001",
    "orderDate": "2024-01-30",
    "itemIds": ["item001", "item002"]
}
// Items collection:
{
    "id": "item001",
    "productId": "prod101",
    "quantity": 2,
    "price": 25.99
}
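Since the references are resolved by the application rather than by the database, the lookup logic ends up in your own code. The sketch below shows one way this might look, using in-memory collections in place of real containers; the names OrderRef, Item, and ReferenceModelSketch are hypothetical, and in a real application each dictionary lookup would be a separate read or query against the items container.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types mirroring the two documents above.
public record OrderRef(string Id, string CustomerId, string OrderDate, List<string> ItemIds);
public record Item(string Id, string ProductId, int Quantity, decimal Price);

public static class ReferenceModelSketch
{
    // Resolves an order's itemIds against the items store.
    // Returns the items that were found and the ids that had no matching document.
    public static (List<Item> Found, List<string> Missing) Resolve(
        OrderRef order, IReadOnlyDictionary<string, Item> itemsById)
    {
        var found = new List<Item>();
        var missing = new List<string>();
        foreach (var itemId in order.ItemIds)
        {
            if (itemsById.TryGetValue(itemId, out var item))
            {
                found.Add(item);
            }
            else
            {
                // Nothing enforces the reference, so a dangling id is possible.
                missing.Add(itemId);
            }
        }
        return (found, missing);
    }

    public static void Main()
    {
        var order = new OrderRef("order123", "cust001", "2024-01-30",
            new List<string> { "item001", "item002" });

        // Only item001 is listed in the example, so it is the only item stored here.
        var itemsById = new Dictionary<string, Item>
        {
            ["item001"] = new Item("item001", "prod101", 2, 25.99m)
        };

        var (found, missing) = Resolve(order, itemsById);
        Console.WriteLine($"Resolved items: {found.Count}");
        Console.WriteLine($"Unresolved ids: {string.Join(", ", missing)}");
    }
}
```

The unresolved id shows the cost of this model: referential integrity is the application's responsibility, so the code has to decide what a missing reference means.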
5.1.3. Using Cosmos DB's JOINs in Queries
- Cosmos DB supports intra-document joins (joins within a single document) but not across collections.
Example. Querying embedded items using a JOIN
SELECT o.id, i.productId, i.quantity, i.price
FROM Orders o
JOIN i IN o.items
WHERE o.id = "order123"
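An intra-document JOIN pairs each document with the elements of its own array, which flattens the nested items into one row per item. For readers more at home in C#, the sketch below reproduces that behavior in memory with LINQ's SelectMany. It illustrates the semantics only and is not how Cosmos DB executes the query; the type names (Order, OrderItem, OrderLine, IntraDocumentJoinSketch) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types: an order with embedded items, and the flattened result row.
public record OrderItem(string ProductId, int Quantity, decimal Price);
public record Order(string Id, List<OrderItem> Items);
public record OrderLine(string Id, string ProductId, int Quantity, decimal Price);

public static class IntraDocumentJoinSketch
{
    // In-memory equivalent of:
    //   SELECT o.id, i.productId, i.quantity, i.price
    //   FROM Orders o JOIN i IN o.items WHERE o.id = <orderId>
    public static List<OrderLine> Flatten(IEnumerable<Order> orders, string orderId) =>
        orders
            .Where(o => o.Id == orderId)
            // Pair each order with each of its own items: one output row per item.
            .SelectMany(o => o.Items, (o, i) => new OrderLine(o.Id, i.ProductId, i.Quantity, i.Price))
            .ToList();

    public static void Main()
    {
        var orders = new List<Order>
        {
            new Order("order123", new List<OrderItem>
            {
                new OrderItem("prod101", 2, 25.99m),
                new OrderItem("prod102", 1, 12.5m)
            })
        };

        foreach (var line in Flatten(orders, "order123"))
        {
            Console.WriteLine($"{line.Id} {line.ProductId} x{line.Quantity}");
        }
    }
}
```

For the order shown earlier, the result has two rows, one per embedded item, and each row carries the parent order's id.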
5.1.4. Partitioning for Scalability
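- Unlike a traditional single-server SQL database, Cosmos DB spreads a container's data across partitions based on a partition key, which is how it scales horizontally.
- Choose a partition key with many distinct, evenly accessed values so that data and requests are distributed evenly.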
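The effect of the partition key choice can be seen with a small simulation. The sketch below hashes partition key values into a fixed number of buckets and counts how many documents land in each. The FNV-1a hash, the bucket count, and all names (PartitionSketch, BucketFor, Distribute) are assumptions made for illustration; Cosmos DB uses its own hashing and partition management, so this shows the principle rather than the actual placement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class PartitionSketch
{
    // Maps a partition key value to a bucket with a deterministic FNV-1a hash.
    // Illustrative stand-in only, not Cosmos DB's real hash function.
    public static int BucketFor(string partitionKeyValue, int partitions)
    {
        uint hash = 2166136261u;
        foreach (byte b in Encoding.UTF8.GetBytes(partitionKeyValue))
        {
            unchecked
            {
                hash ^= b;
                hash *= 16777619u;
            }
        }
        return (int)(hash % (uint)partitions);
    }

    // Counts how many documents fall into each bucket for the given key values.
    public static Dictionary<int, int> Distribute(IEnumerable<string> keyValues, int partitions)
    {
        var counts = new Dictionary<int, int>();
        foreach (var value in keyValues)
        {
            int bucket = BucketFor(value, partitions);
            counts[bucket] = counts.TryGetValue(bucket, out var current) ? current + 1 : 1;
        }
        return counts;
    }

    public static void Main()
    {
        const int partitions = 8;

        // Low cardinality: 1,000 documents sharing 3 categories can occupy at most 3 buckets.
        var byCategory = Enumerable.Range(0, 1000)
            .Select(i => new[] { "Electronics", "Books", "Toys" }[i % 3]);

        // High cardinality: 1,000 distinct customer ids can spread over all buckets.
        var byCustomer = Enumerable.Range(0, 1000).Select(i => $"cust{i:D4}");

        Console.WriteLine($"Buckets used by category:   {Distribute(byCategory, partitions).Count}");
        Console.WriteLine($"Buckets used by customerId: {Distribute(byCustomer, partitions).Count}");
    }
}
```

A key with only a few distinct values concentrates documents in a few buckets no matter how many partitions exist, which is why high-cardinality keys are generally preferred.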
Comparison of APIs
| API | Primary Use Case | Query Language | Compatible With |
| --- | --- | --- | --- |
| SQL API | Document-based workloads | SQL-like queries | Native Cosmos DB SDKs (e.g., .NET) |
| MongoDB API | MongoDB-compatible applications | MongoDB query syntax | MongoDB drivers and tools |
| Gremlin API | Graph-based workloads | Gremlin traversal language | Gremlin.NET |
| Table API | Key-value workloads | Table Storage API | Azure Table SDKs |
| Cassandra API | Column-family workloads | Cassandra Query Language | Cassandra drivers and tools |
Mapping Data Models to APIs
| Data Model | Supported API | Primary Use Case |
| --- | --- | --- |
| Document Model | SQL API, MongoDB API | Web apps, e-commerce, content management. |
| Key-Value Model | Table API | Caching, session state, user settings. |
| Column-Family Model | Cassandra API | IoT telemetry, time-series data. |
| Graph Model | Gremlin API | Social networks, recommendations. |
| Relational-Like Model | SQL API | Relational database migration. |
Conclusion
Azure Cosmos DB is a globally distributed database service that supports multiple data models and APIs, which makes it a versatile option for modern application development. It handles diverse data structures (documents, key-value pairs, graphs, and column-family data) while maintaining high performance, scalability, and consistency. By choosing the API and data model that match the workload, C# developers can manage and query data efficiently and keep their applications ready for both current and future demands. Whether you use the SQL API for document storage, the Gremlin API for graph data, or another supported API, Cosmos DB provides the tools and flexibility needed to deliver robust, scalable solutions.