Introduction To Azure Cosmos DB Table API

Bassam Alugili
Jun 20, 2019


Azure Cosmos DB

Introduction

Azure Cosmos DB is Microsoft's new globally distributed database service. It scales out easily through horizontal partitioning, and it is available in Azure regions around the world.

In this article series, I'll introduce Cosmos DB with code samples for each of its APIs:

Table API
SQL API
API for MongoDB
Cassandra API
Gremlin API

Azure Cosmos DB is inspired by Dr. Leslie Lamport's work on distributed systems. I want to thank Microsoft for this excellent choice and for the fantastic new distributed database system.

I highly recommend that you watch Dr. Leslie Lamport's videos on YouTube and read about TLA+ and his contributions to distributed systems.

Before we begin, I want to make one thing very clear: Azure Cosmos DB is not meant to replace MS SQL Server. MS SQL Server is a relational DBMS, whereas Cosmos DB provides native support for NoSQL workloads through APIs such as Cassandra, MongoDB, Gremlin, Spark, and SQL. The two are complementary and do not replace each other.

As you can see in Figure 1, relational databases dominated from 1990 to 2000. The main problem with the old-style software architectures (Layered Architecture, Service-Oriented Architecture) backed by a relational database was vertical scalability, that is, scaling up (the fixed-schema problem was less important). Scaling an application up was very expensive and sometimes hard. That was the main reason that helped ignite the NoSQL and microservices revolution. Meanwhile, the picture has become clearer, and as Martin Fowler has said, polyglot persistence is the future of database design.

Figure -1- Evolution of Database Systems

Note

NewSQL is as important as NoSQL. I recommend that you read about it or search for Michael Stonebraker on Google and YouTube.

What does it mean that Azure Cosmos DB is primarily NoSQL?

To answer the question, we first have to know what NoSQL is.

NoSQL is a class of database management systems (DBMS) that do not follow all of the rules of a relational DBMS. Typical characteristics are:

Not using the relational model
Running well on clusters
Mostly open-source
Schema-less
Different Data Models

The item above, "Running well on clusters", is easy to say but hard to achieve. To understand the cluster problem, we have to understand distributed computing and distributed data stores. So, let us take a look at Eric Brewer's CAP theorem.

CAP theorem

CAP stands for Consistency (C), Availability (A), and Partition Tolerance (P). When you design an application with a distributed database, you must choose among these three guarantees, and the theorem says you can have at most two of the three.

The CAP theorem states that no distributed system can guarantee C, A, and P at the same time; instead, there are always trade-offs between C, A, and P.

In the next post, I will write about the BASE Concept.

Cosmos DB

If you are comfortable with the CAP theorem, you will know that there are always trade-offs. Azure Cosmos DB offers five consistency levels so that you can decide for yourself what you deem most important and what you are willing to sacrifice.

The currently available consistency levels are:

(I have quoted some text from www.sqlshack.com/8-things-know-azure-cosmos-db-formerly-documentdb/)

Strong

With strong consistency, you are always guaranteed to read the latest version of an item, similar to the read committed isolation level in SQL Server. You can only ever see data that is durably committed. According to the quoted source, strong consistency was scoped to a single region at the time it was written.

Bounded-staleness

With bounded-staleness consistency, reads may lag behind writes, but only by a bounded amount (a number of versions or a time interval). It guarantees global order and is not scoped to a single region.

Session

Session is the most popular consistency level, since it provides consistency guarantees within a client session (a client always reads its own writes) while also offering better throughput.

Consistent Prefix

The global order is preserved, and the prefix order is guaranteed. A reader will never see writes in an order different from the one in which they were written.

Eventual

Eventual consistency is like asynchronous replication. It guarantees that all changes will be replicated eventually, and as such, it also has the lowest latency because it does not need to wait on any commits.

Cosmos Table API

The Table API belongs to the family of key-value databases, with a schema-less design for rapid development and auto-scaling. It stores structured NoSQL data in the cloud and is well suited for global distribution scenarios.

Scenarios to use Cosmos Table API

User data, devices, IoT, and other structured data.

Figure -2- Key-value database. In this case, the key is an integer, and the value is a sequence of bytes.

Table Structure

Account

An account allows you to access Azure Cosmos DB and the Table API.

Table

A table is a collection of entities. You can compare it to a table in a relational database.

Entities

An entity is a set of properties, similar to a row in a relational database.

Properties

A property is a name-value pair. It is like a dictionary; the property name is the dictionary key. Each entity has three system properties that specify a partition key, a row key, and a timestamp.
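To make the dictionary analogy concrete, here is a minimal sketch that models an entity as a plain dictionary carrying the three system properties. It does not use the Table SDK; the class name EntitySketch, the method CreateEntity, and the sample e-mail address are hypothetical, and in the real service the timestamp is maintained by the server rather than by the caller.

```csharp
using System;
using System.Collections.Generic;

public static class EntitySketch
{
    // Builds a dictionary that mimics the shape of a table entity:
    // three system properties, to which custom properties can be added.
    public static Dictionary<string, object> CreateEntity(string partitionKey, string rowKey)
    {
        return new Dictionary<string, object>
        {
            ["PartitionKey"] = partitionKey,
            ["RowKey"] = rowKey,
            // Assumption: set locally here; the real service assigns the timestamp.
            ["Timestamp"] = DateTimeOffset.UtcNow
        };
    }

    public static void Main()
    {
        var entity = CreateEntity("Karlsbad", "Admin");

        // Custom properties are just additional name-value pairs (schema-less).
        entity["EMail"] = "admin@example.com"; // hypothetical sample value
        entity["LastLogin"] = DateTimeOffset.UtcNow;

        foreach (var property in entity)
        {
            Console.WriteLine($"{property.Key} = {property.Value}");
        }
    }
}
```

Because the entity is only a bag of name-value pairs, two entities in the same table may carry different custom properties, which is what "schema-less" means here.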

Features

No limits on the number of tables, the number of rows, or the table size
Dynamic load balancing
NoSQL: schema-less entities with strong consistency
Best for key/value lookups on partition key and row key
Entity group transactions for atomic batching
Guaranteed high availability
Automatic secondary indexing

Entity Group Transaction

An entity group transaction groups entity changes in a batch operation and then commits the changes together. Either all changes are committed successfully, or all of them fail. This operation can be executed under one condition: the entities must belong to the same partition. I have demonstrated this below with a code sample.
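Independent of the SDK, the same-partition rule can be sketched as a small planning step: group the entities by partition key first, then cut each group into batches. The names BatchPlanner, PlanBatches, and MaxBatchSize are hypothetical, and the limit of 100 operations per batch is an assumption that you should check against the limits documented for your service.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchPlanner
{
    // Assumed per-batch operation limit; verify it for your service and SDK version.
    public const int MaxBatchSize = 100;

    // Splits entities into batches so that every batch contains a single
    // partition key and at most MaxBatchSize entities.
    public static List<List<(string PartitionKey, string RowKey)>> PlanBatches(
        IEnumerable<(string PartitionKey, string RowKey)> entities)
    {
        var batches = new List<List<(string PartitionKey, string RowKey)>>();

        foreach (var group in entities.GroupBy(e => e.PartitionKey))
        {
            var items = group.ToList();
            for (var i = 0; i < items.Count; i += MaxBatchSize)
            {
                batches.Add(items.GetRange(i, Math.Min(MaxBatchSize, items.Count - i)));
            }
        }

        return batches;
    }

    public static void Main()
    {
        // 150 entities in one partition and 20 in another.
        var entities = Enumerable.Range(0, 150).Select(i => ("Karlsbad", "Admin_" + i))
            .Concat(Enumerable.Range(0, 20).Select(i => ("Karlsruhe", "Operator_" + i)));

        foreach (var batch in PlanBatches(entities))
        {
            Console.WriteLine($"{batch[0].PartitionKey}: {batch.Count} entities");
        }
    }
}
```

With the sample input, the planner produces three batches (100 and 50 entities for "Karlsbad", 20 for "Karlsruhe"). Each batch would then be atomic on its own, but there is no atomicity across batches.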

Concurrencies

Pessimistic Concurrency

The entity is locked so that one caller can write, and all other callers are blocked until the write has finished.

Optimistic Concurrency

The caller is notified when the entity has been changed concurrently and can then decide how to resolve the conflict. Optimistic concurrency is the default in Azure. I have simulated an optimistic concurrency conflict in the code sample below.

Last write wins

Whoever writes the data last wins: the most recent write ends up in the row, without any conflict check.
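The difference between optimistic concurrency and last write wins can be sketched with a tiny in-memory store that keeps a version tag next to each value. This is only an illustration: InMemoryTable, ConcurrencyConflictException, and the integer tags are hypothetical, and real ETags are opaque strings issued by the service.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConcurrencyConflictException : Exception
{
    public ConcurrencyConflictException(string message) : base(message) { }
}

public sealed class InMemoryTable
{
    private readonly Dictionary<string, (string Value, int ETag)> _rows =
        new Dictionary<string, (string Value, int ETag)>();

    // Last write wins: the write is unconditional and always succeeds.
    public int Upsert(string key, string value)
    {
        var next = _rows.TryGetValue(key, out var row) ? row.ETag + 1 : 1;
        _rows[key] = (value, next);
        return next;
    }

    // Optimistic concurrency: the write succeeds only if the caller's tag
    // still matches the stored one, otherwise the caller is told about it.
    public int Update(string key, string value, int expectedETag)
    {
        if (!_rows.TryGetValue(key, out var row) || row.ETag != expectedETag)
        {
            throw new ConcurrencyConflictException($"Tag {expectedETag} is stale for '{key}'.");
        }

        _rows[key] = (value, row.ETag + 1);
        return row.ETag + 1;
    }

    public string Read(string key) => _rows[key].Value;
}

public static class ConcurrencySketch
{
    public static void Main()
    {
        var table = new InMemoryTable();
        var myTag = table.Upsert("Karlsruhe|Operator", "first version");

        // Someone else overwrites the row; the tag we hold is now stale.
        table.Upsert("Karlsruhe|Operator", "changed by someone else");

        try
        {
            table.Update("Karlsruhe|Operator", "my stale update", myTag);
        }
        catch (ConcurrencyConflictException ex)
        {
            Console.WriteLine($"Conflict detected: {ex.Message}");
        }
    }
}
```

The stale update is rejected and the row keeps the other writer's value. Had the last call been an unconditional Upsert instead, it would have silently replaced that value.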

Cosmos API Code Sample

Pre-Installation

Download and install the Azure Cosmos DB Emulator.

As I said before, each entity has a partition key and a row key. Together, the partition key and the row key form the clustered index, so please choose them carefully. Those keys are the glue for good design and excellent performance, and I highly recommend that you follow Microsoft's design guidelines.

Entities with the same partition key are placed on the same partition server, and the row key identifies the entity within that partition.

I have defined a domain entity class, "User", which needs to be persisted.
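Why the key choice matters can be sketched with a two-level lookup structure: the partition key selects a partition, and the row key selects the entity inside it. The class PartitionedStore and its members are hypothetical and only mimic the addressing scheme; they say nothing about how the service stores data internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class PartitionedStore
{
    // Partition key -> (row key -> value), with row keys kept in sorted order.
    private readonly Dictionary<string, SortedDictionary<string, string>> _partitions =
        new Dictionary<string, SortedDictionary<string, string>>();

    public void Put(string partitionKey, string rowKey, string value)
    {
        if (!_partitions.TryGetValue(partitionKey, out var rows))
        {
            rows = new SortedDictionary<string, string>(StringComparer.Ordinal);
            _partitions[partitionKey] = rows;
        }

        rows[rowKey] = value;
    }

    // Point lookup: both keys are known, so exactly one partition is touched.
    public bool TryGet(string partitionKey, string rowKey, out string value)
    {
        value = null;
        return _partitions.TryGetValue(partitionKey, out var rows)
            && rows.TryGetValue(rowKey, out value);
    }

    // Partition scan: all row keys of one partition, in sorted order.
    public IEnumerable<string> RowKeysIn(string partitionKey) =>
        _partitions.TryGetValue(partitionKey, out var rows)
            ? rows.Keys.ToList()
            : Enumerable.Empty<string>();
}

public static class PartitionedStoreSketch
{
    public static void Main()
    {
        var store = new PartitionedStore();
        store.Put("Karlsbad", "Operator", "second user");
        store.Put("Karlsbad", "Admin", "first user");
        store.Put("Karlsruhe", "Admin", "third user");

        Console.WriteLine(store.TryGet("Karlsbad", "Admin", out var found) ? found : "not found");
        Console.WriteLine(string.Join(", ", store.RowKeysIn("Karlsbad")));
    }
}
```

A query that knows both keys is the cheapest one, a query that knows only the partition key stays inside one partition, and anything else has to look at every partition.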

public class User : TableEntity
{
public string EMail { get; set; }
public DateTimeOffset LastLogin { get; set; }
public User()
{
}
public User(string locationId, string type)
{
PartitionKey = locationId;
RowKey = type;
}
}

I have used the location ID as the partition key and the user type as the row key.

The database context.

I have added a database context class, which is used to create a cloud storage account from the connection string and to create or retrieve the users' cloud table.

public class CosmosTableApiDbContext
{
public CloudStorageAccount CreateCloudStorageAccount(string connectionString)
{
return CloudStorageAccount.Parse(connectionString);
}
public CloudTable GetTableClient(string tableName, CloudStorageAccount storageAccount)
{
var tableClient = storageAccount.CreateCloudTableClient();
var table = tableClient.GetTableReference(tableName);
// Create the table if it does not exist yet and report what happened.
if (table.CreateIfNotExists())
{
Console.WriteLine("Created Table named: {0}", tableName);
}
else
{
Console.WriteLine("Table {0} already exists", tableName);
}
return table;
}
}

The connection string.

<appSettings>
<add key="cloud:StorageConnectionString" value="..." />
<add key="cloud:Tablename" value="…" />
<add key="cloud:TableThroughput" value="400" />
<add key="emulator:StorageConnectionString" value="DefaultEndpointsProtocol=https;AccountName=MyFirstCosmosDB;AccountKey=C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==;TableEndpoint=https://localhost:8081/;" />
<add key="emulator:Tablename" value="Users" />
<add key="emulator:TableThroughput" value="400" />
</appSettings>

I have modified the App.config, as shown above.

I have copied the key from the emulator, as shown below.

The CRUD Operations

I have added a demo class for the very basic CRUD operations:

// Assumption: the table types come from the Microsoft.Azure.Cosmos.Table NuGet package;
// adjust the namespace if your project uses a different Table SDK package.
using System;
using System.Diagnostics;
using Microsoft.Azure.Cosmos.Table;

public class CrudOperationsSample
{
    public void CrudOperations(CloudTable cloudTable)
    {
        // Create a user.
        var user = new User("Karlsbad", "Admin")
        {
            EMail = "[email protected]",
            LastLogin = DateTimeOffset.UtcNow
        };

        Console.WriteLine("******************** Create a user ********************");
        // Create the insert-or-replace operation.
        var insertOrReplaceOperation = TableOperation.InsertOrReplace(user);
        // Execute the operation. The result is ignored here just for the demo.
        // In production code, evaluate the result to check that the operation succeeded!
        _ = cloudTable.Execute(insertOrReplaceOperation);
        Console.WriteLine($"user is created {user.ETag}");

        Console.WriteLine("******************** Update a user ********************");
        // Change the entity first, then create the insert-or-merge operation for it.
        user.EMail = "[email protected]";
        var insertOrMergeOperation = TableOperation.InsertOrMerge(user);
        _ = cloudTable.Execute(insertOrMergeOperation);
        Console.WriteLine($"user is updated {user.ETag}");

        Console.WriteLine("******************** Find a user ********************");
        // Find the entity with PartitionKey = "Karlsbad" and RowKey = "Admin".
        var retrieveOperation = TableOperation.Retrieve<User>("Karlsbad", "Admin");
        var retrieveResult = cloudTable.Execute(retrieveOperation).Result;
        var retrievedUser = retrieveResult as User;
        Console.WriteLine($"user is found {retrievedUser?.ETag}");
        Debug.Assert(retrievedUser?.EMail == "[email protected]");

        // Nothing to delete if the lookup did not return a user.
        if (retrievedUser == null)
        {
            return;
        }

        Console.WriteLine("******************** Delete a user ********************");
        // Delete the retrieved user.
        var deleteOperation = TableOperation.Delete(retrievedUser);
        _ = cloudTable.Execute(deleteOperation);
        Console.WriteLine($"user is deleted {retrievedUser.ETag}");
    }
}

I start the demos as follows:

public class Program
{
public static void Main(string[] args)
{
//const string environmentName = "cloud";
const string environmentName = "emulator";
var connectionString = ConfigurationManager.AppSettings[$"{environmentName}:StorageConnectionString"];
var tableName = ConfigurationManager.AppSettings[$"{environmentName}:TableName"];
var azureTableContext = new CosmosTableApiDbContext();
var sa = azureTableContext.CreateCloudStorageAccount(connectionString);
var cloudTable = azureTableContext.GetTableClient(tableName, sa);
Console.WriteLine("Starting Demos!");
// Demo for the basic CRUD operations.
var crudOperationsSample = new CrudOperationsSample();
crudOperationsSample.CrudOperations(cloudTable);
// Demo for the batch operation.
var batchOperationSample = new BatchOperationSample();
batchOperationSample.BatchOperation(cloudTable);
// Demo for the default optimistic concurrency (the method name still says "Pessimistic").
var concurrencyDemo = new ConcurrencyDemo();
concurrencyDemo.ConcurrencyDemoDefaultPessimistic(cloudTable);
Console.WriteLine("Done!");
Console.ReadKey();
}
}

Here is the result for the CRUD Operations demo,

So, let us do it step by step and see the results,

// Create the insert replace operation.
var insertOrReplaceOperation = TableOperation.InsertOrReplace(user);
// Execute the operation. I have ignored the result just for demo.
_ = cloudTable.Execute(insertOrReplaceOperation);
Console.WriteLine($"user is created {user.ETag}");

I have opened the Azure Cosmos DB Emulator Explorer Window to see the data as shown below,

As you see, the user is created and added.

Then the second code part,

Console.WriteLine("******************** Update a user ********************");
// Create the insert merge operation.
var insertOrMergeOperation = TableOperation.InsertOrMerge(user);
user.EMail = "[email protected]";
_ = cloudTable.Execute(insertOrMergeOperation);

When I refresh the query in the emulator, the EMail value has changed.

Finally, the code below will remove the created user.

// Delete the retrieved user.
var deleteOperation = TableOperation.Delete(retrievedUser);
_ = cloudTable.Execute(deleteOperation);

Batch Operation Example

public class BatchOperationSample
{
public void BatchOperation(CloudTable cloudTable)
{
Console.WriteLine("******************** Start Batch Operation ********************");
var batchOperation = new TableBatchOperation();
for (var i = 2; i < 52; i++)
{
// I will create and add users and send them as one batch to the table.
var batchUser = new User("Karlsbad", "Admin_" + i)
{
EMail = "[email protected]",
LastLogin = DateTimeOffset.UtcNow
};
batchOperation.Add(TableOperation.InsertOrMerge(batchUser));
}
// Executing the operations or adding the users.
cloudTable.ExecuteBatch(batchOperation);
Console.WriteLine("******************** End Batch Operation ********************");
}
}

Emulator

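Optimistic Concurrency Demo

The code below does the following (the method name says "Pessimistic", but what it demonstrates is the default optimistic concurrency check based on the ETag):

I create firstUser and save it.
I simulate a third party changing the same user ("updatedFirstUser").
Finally, I try to change firstUser and save the changes with its stale ETag.
BOOM! Concurrency exception (HTTP 412 Precondition Failed)!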

// Assumption: the table types come from the Microsoft.Azure.Cosmos.Table NuGet package;
// adjust the namespace if your project uses a different Table SDK package.
using System;
using System.Net;
using Microsoft.Azure.Cosmos.Table;

public class ConcurrencyDemo
{
    // The method name is kept from the original sample; the behavior shown is optimistic concurrency.
    public void ConcurrencyDemoDefaultPessimistic(CloudTable cloudTable)
    {
        Console.WriteLine("**************************** Start Demonstrate optimistic concurrency ****************************");

        // Add a new user to the table.
        var firstUser = new User("Karlsruhe", "Operator")
        {
            EMail = "[email protected]",
            LastLogin = DateTimeOffset.UtcNow
        };
        var insertOrReplaceOperation = TableOperation.InsertOrReplace(firstUser);
        cloudTable.Execute(insertOrReplaceOperation);
        Console.WriteLine("Entity added. Original ETag = {0}", firstUser.ETag);

        // Someone else has changed the first user!
        var updatedFirstUser = new User("Karlsruhe", "Operator")
        {
            EMail = "[email protected]",
            LastLogin = DateTimeOffset.UtcNow
        };
        insertOrReplaceOperation = TableOperation.InsertOrReplace(updatedFirstUser);
        cloudTable.Execute(insertOrReplaceOperation);
        Console.WriteLine("Entity updated. Updated ETag = {0}", updatedFirstUser.ETag);

        // Try updating the first user. Its ETag is cached within firstUser and sent by default,
        // so the service compares it with the current ETag of the stored entity.
        firstUser.LastLogin = DateTimeOffset.UtcNow;
        var mergeOperation = TableOperation.Merge(firstUser);
        try
        {
            Console.WriteLine("Trying to update Original entity");
            cloudTable.Execute(mergeOperation);
        }
        catch (StorageException ex)
        {
            // 412 Precondition Failed means the stored ETag no longer matches ours.
            if (ex.RequestInformation.HttpStatusCode == (int)HttpStatusCode.PreconditionFailed)
            {
                Console.WriteLine("Error: Entity Tag is changed!");
            }
            else
            {
                throw;
            }
        }

        Console.WriteLine("**************************** End Demonstrate optimistic concurrency ****************************");
    }
}

Exception

Summary

You have read a short introduction to database systems and Azure Cosmos DB. I hope you now understand what the CAP theorem is and why we need it. You have also read about the Azure Cosmos DB Table API, with code samples for CRUD, batch, and concurrency. Now you can create an Azure Cosmos DB Table API database with the Cosmos DB Emulator. In the next article, I will write about the SQL API. Thank you for reading my article; I hope you have enjoyed reading it as much as I have enjoyed writing it. You can follow me on C# Corner/Twitter or my webpage.

Source Code
