Road To AZ-204 - Developing Solutions That Use Cosmos DB Storage

Introduction

This article explains the main skills measured in this sub-topic of the AZ-204 certification. Cosmos DB is the main component covered here, with its fundamentals explained alongside a practical example.

This certification is very extensive, and this article covers only the main topics; make sure you know these components in depth before taking the exam. Another great tip is to take practice exams before the official one in order to validate your knowledge.

What is the Certification AZ-204 - Developing Solutions for Microsoft Azure?

The AZ-204 - Developing Solutions for Microsoft Azure certification measures your skills in designing, building, testing, and maintaining applications and services in the Microsoft Azure cloud environment. Among others, it covers the following components,

  • Azure Virtual Machines
  • Docker
  • Azure Containers
  • Azure App Service Web Apps
  • Azure Functions
  • Cosmos DB
  • Azure Storage
  • Azure AD
  • Azure Key Vault
  • Azure Managed Identities
  • Azure Redis Cache
  • Azure Logic App
  • Azure Event Grid
  • Azure Event Hub
  • Azure Notification Hub
  • Azure Service Bus
  • Azure Queue Storage

You can find more information on the official AZ-204 Developing Solutions for Microsoft Azure website.

Target Audience

Any IT professional willing to improve their knowledge of Microsoft Azure is encouraged to take this certification; it is a great way to measure your skills with trending technologies. However, some groups of professionals are better positioned to take maximum advantage of it.

  • Azure Developers, with at least 1 year of experience with Microsoft Azure;
  • Experienced Software Developers, looking for an Architect position in a hybrid environment;
  • Software Developers, working to move applications to the cloud environment.

Skills Measured

As of this writing, the skills measured in the exam are split as follows.

  • Develop Azure compute solutions (25-30%)
  • Develop for Azure storage (10-15%)
    • Develop solutions that use Cosmos DB storage
    • Develop solutions that use blob storage
  • Implement Azure security (15-20%)
    • Implement user authentication and authorization
    • Implement secure cloud solutions
  • Monitor, troubleshoot, and optimize Azure solutions (10-15%)
    • Integrate caching and content delivery within solutions
    • Instrument solutions to support monitoring and logging
  • Connect to and consume Azure services and third-party services (25-30%)
    • Develop an App Service Logic App
    • Implement API Management
    • Develop event-based solutions
    • Develop message-based solutions

Updated skills can be found on the official AZ-204 Measured Skills website.

Benefits of Getting Certified

The main benefit is having a worldwide-recognized certification that proves your knowledge of this topic. Among the intrinsic and extrinsic benefits, we have,

  • Higher growth potential, as certifications are a big plus.
  • Discounts and deals on Microsoft products and partner services, like Pluralsight and UpWork.
  • MCP Newsletters, with trending technologies.
  • Higher exposure on LinkedIn, as recruiters usually search for specific certifications.
  • Being more valuable to your company, which can translate into a higher salary.
  • The unique happiness of seeing an approved result, knowing that all your efforts were worth it.

Official Microsoft Certification Program Benefits Website.

Main Skills Measured by this Topic
 

What is Cosmos DB?

Cosmos DB is a NoSQL database with an incredibly fast response time and strong support for scalability. Azure Cosmos DB offers Cosmos DB as a fully managed service, so you do not need to worry about its administration: Azure handles management, updates, and patching automatically.

Azure Cosmos DB also offers serverless, cost-effective capacity management and automatic scaling options. Its main benefits are as follows,

  • Integrates with many Azure Services, like Azure Functions, Azure Kubernetes Services, and Azure App Services.
  • Integrates with many database APIs, like the native Core SQL, MongoDB, Cassandra, and Gremlin.
  • Integrates with many development SDKs, like .NET, Java, Python, and Node.js.
  • Has a schema-less service that automatically applies indexes to your data, resulting in fast queries.
  • Guaranteed uptime SLA of 99.999% availability.
  • Automatic data replication across Azure regions.
  • Data is protected with encryption at rest and role-based access control.
  • Fully managed database, with updates, maintenance, and patches applied automatically.
  • Autoscale is provided in order to serve workloads of different sizes.

Read more about Cosmos DB.

Cosmos DB APIs

Azure Cosmos DB is very flexible and is offered through different APIs in order to support a wider range of applications. All these APIs are supported thanks to its multi-model approach, which can deliver data as documents, key-value pairs, wide columns, or graph data.

It is strongly recommended to use the Core SQL API for new projects, whereas the API matching the existing database engine is recommended for existing databases. The APIs are as follows.

  • Core SQL API, the default API for Azure Cosmos DB, enables querying your data with a language very close to SQL.
  • MongoDB API, used to communicate with MongoDB databases, storing data as documents.
  • Cassandra API, used to communicate with Cassandra through the Cassandra Query Language, storing data as a partitioned row store.
  • Azure Table API, used to communicate with Azure Table Storage, allowing indexes on the partition and row keys. To query data you can use OData, LINQ in code, or the REST API for GET operations.
  • Gremlin API, used to provide a graph-based view of the data, queried with the Gremlin graph traversal language.
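To illustrate how close the Core SQL API's syntax is to standard SQL, here is a sketch using the .NET SDK; the `container` object and the Person class from the practical sample later in this article are assumed, and the city filter is just an illustrative value.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Sketch: parameterized Core SQL query over JSON documents.
// Assumes an existing Container ("container") holding Person items.
static async Task QueryPeopleAsync(Container container)
{
    QueryDefinition query = new QueryDefinition(
            "SELECT * FROM p WHERE p.Address.City = @city")
        .WithParameter("@city", "Lisbon");

    using FeedIterator<Person> iterator = container.GetItemQueryIterator<Person>(query);
    while (iterator.HasMoreResults)
    {
        foreach (Person person in await iterator.ReadNextAsync())
        {
            Console.WriteLine(person.Name);
        }
    }
}
```

Note that the query language addresses the JSON documents themselves, so nested properties such as `Address.City` can be filtered directly.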

Partitioning Schemas in Cosmos DB

Azure Cosmos DB distributes data among partitions; in order to achieve better performance, items are grouped by their partition keys. To better understand partitioning schemas in Azure Cosmos DB, some basic concepts are explained as follows.

  • Partition Keys are the keys used to group items; together with the item id, they uniquely identify an item.
  • Logical Partitions consist of the set of items sharing the same partition key value.
  • Physical Partitions consist of a set of logical partitions. A physical partition may hold one or more logical partitions and is managed entirely by Azure Cosmos DB.
  • Replica Sets consist of a group of physical partitions that are materialized as a self-managed, dynamically load-balanced group of replicas spread across multiple fault domains.
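Conceptually, a logical partition is nothing more than the set of items sharing one partition key value. A minimal, self-contained sketch of that grouping, using City as a hypothetical partition key:

```csharp
using System;
using System.Linq;

class PartitionSketch
{
    record Item(string Id, string City);

    static void Main()
    {
        var items = new[]
        {
            new Item("1", "Lisbon"),
            new Item("2", "Porto"),
            new Item("3", "Lisbon"),
        };

        // Items with the same partition key value ("City") belong to the
        // same logical partition; Azure Cosmos DB maps logical partitions
        // onto physical partitions automatically.
        var logicalPartitions = items.GroupBy(i => i.City);
        Console.WriteLine(logicalPartitions.Count()); // 2 logical partitions
    }
}
```

Because routing is based on this value, queries that filter on the partition key can be served by a single partition instead of a fan-out.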

The concepts explained above can be better visualized in the image below.

Containers

Consistency levels in Cosmos DB

Azure Cosmos DB offers five consistency levels so you can balance data availability and query performance according to your needs. These consistency levels are as follows.

Strong Consistency

  • Read operations are guaranteed to return the most recently committed version of an item.
  • Read operations cost as much as with Bounded Staleness and more than with Session, Consistent Prefix, or Eventual consistency.
  • A write operation only becomes readable after the data has been replicated to a majority of the replicas.

Bounded Staleness

  • Read operations may lag behind write operations by a configured number of versions or time window.
  • Read operations cost as much as with Strong Consistency and more than with Session, Consistent Prefix, or Eventual consistency.
  • Offers stronger consistency than Session, Consistent Prefix, or Eventual consistency.
  • Recommended for globally distributed applications that need high availability and low latency.

Session

  • Read operations are guaranteed to see the data written within the same session (read-your-own-writes).
  • Consistency is scoped to a user session; other sessions may read stale data that has just been written elsewhere.
  • The default consistency level for newly created accounts.
  • Read operation costs are lower than with Bounded Staleness and Strong Consistency, but higher than with Eventual Consistency.

Consistent Prefix

  • Read operations never see out-of-order writes: returned data may be stale, but updates are always observed in the order in which they were written.
  • Stale reads happen when one replica has committed a change that has not yet been replicated to the others.
  • Has a stronger consistency level than Eventual Consistency but a weaker one than all the others.

Eventual

  • Read operations offer no ordering or recency guarantees; replicas converge eventually.
  • Weakest consistency level.
  • Lowest latency and best performance among the consistency levels.
  • Read operations cost less than with any other consistency level.
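The default consistency level is configured on the account, but the .NET SDK lets a client relax it per connection (a client can never request stronger consistency than the account default). A sketch with placeholder credentials, assuming the Microsoft.Azure.Cosmos SDK:

```csharp
using Microsoft.Azure.Cosmos;

// Placeholder endpoint and key - replace with your account's values.
CosmosClient client = new CosmosClient(
    "https://<your-account>.documents.azure.com:443/",
    "<your-primary-key>",
    new CosmosClientOptions
    {
        // Ask for weaker consistency than the account default,
        // e.g. Eventual reads against a Session-level account.
        ConsistencyLevel = ConsistencyLevel.Eventual
    });
```

This is useful when a read-heavy workload can tolerate stale data and you want the lower request charges of a weaker level.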

Cosmos DB Containers

Azure Cosmos DB containers are the unit of scalability for both storage and throughput. Containers are also great when you need different sets of configurations within your Azure Cosmos DB account, because each container can be configured individually.

Each Azure Cosmos container has container-specific properties; these properties, which can be system-generated or user-configurable, vary according to the API used.

The property list ranges from unique container identifiers to purging-policy configurations. You may find the complete property list here.

At creation time, you may choose between these two throughput modes.

  1. Dedicated mode, where the provisioned throughput configured on the container is reserved exclusively for that container and is backed by SLAs.
  2. Shared mode, where the provisioned throughput configured on the database is shared among all containers created in shared mode.
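In the .NET SDK, the difference between the two modes is where the throughput is provisioned: on the container (dedicated) or on the database (shared). A sketch, assuming an existing CosmosClient; the database, container, and RU/s values are illustrative:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

static async Task CreateThroughputModesAsync(CosmosClient client)
{
    // Dedicated mode: throughput provisioned on the container itself.
    Database db = await client.CreateDatabaseIfNotExistsAsync("DedicatedDb");
    await db.CreateContainerIfNotExistsAsync(
        new ContainerProperties("DedicatedContainer", "/Name"),
        throughput: 400); // RU/s reserved for this container only

    // Shared mode: throughput provisioned at the database level and shared
    // by every container created in it without dedicated throughput.
    Database sharedDb = await client.CreateDatabaseIfNotExistsAsync(
        "SharedDb", throughput: 400);
    await sharedDb.CreateContainerIfNotExistsAsync(
        new ContainerProperties("SharedContainer", "/Name"));
}
```

Mixing is allowed: a container created with its own `throughput` value inside a shared-throughput database gets dedicated capacity.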

Cosmos DB Containers are available, at the present date, for all Cosmos DB APIs, except Gremlin API and Table API.

Scaling Cosmos DB

Azure Cosmos DB offers manual and automatic scaling, without any interruption to your services or impact on the Azure Cosmos DB SLA.

With automatic scaling, Azure Cosmos DB automatically adjusts your throughput capacity up or down according to usage, without requiring any custom logic or code.

You only need to configure the maximum throughput capacity, and Azure will adjust your Azure Cosmos DB throughput between 10% and 100% of that maximum.

With manual scaling, you can permanently change your throughput capacity.
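With the .NET SDK, both strategies map to ThroughputProperties. A sketch, assuming an existing container object; the RU/s values are illustrative:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

static async Task ScaleAsync(Container container)
{
    // Autoscale: configure only the maximum; Azure adjusts the actual
    // throughput between 10% and 100% of this value.
    await container.ReplaceThroughputAsync(
        ThroughputProperties.CreateAutoscaleThroughput(4000));

    // Manual: permanently set a fixed RU/s value.
    await container.ReplaceThroughputAsync(
        ThroughputProperties.CreateManualThroughput(1000));
}
```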

Keep in mind that it is vital to choose your partition keys wisely before scaling your Azure Cosmos DB. Otherwise, your requests will not be balanced and you may experience a hot partition, which raises costs and reduces performance.

There are also some important settings to configure on your container, as follows.

  • Time to Live (TTL) defines the expiration time for items in your container. It is off by default, but you can turn it on with the TTL being item-specific or applied to all items in the container.
  • Geospatial Configuration is used to query items based on location.
    • Geography represents data in a round-earth coordinate system.
    • Geometry represents data in a flat coordinate system.
  • Partition Key is the key used to distribute and scale your partitions.
  • Indexing Policy sets how the container applies indexes to its items. You may include or exclude properties, set the consistency mode, apply indexes automatically, etc.
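All of these settings can be expressed through ContainerProperties in the .NET SDK. A sketch, reusing this article's sample container and partition key names:

```csharp
using Microsoft.Azure.Cosmos;

var properties = new ContainerProperties("Person", "/Name")
{
    // Time to Live: -1 turns TTL on with no default expiry, so each
    // item can set its own "ttl" value (in seconds).
    DefaultTimeToLive = -1,

    // Indexing policy: consistent mode, applied automatically to all paths.
    IndexingPolicy = new IndexingPolicy
    {
        IndexingMode = IndexingMode.Consistent,
        Automatic = true,
        IncludedPaths = { new IncludedPath { Path = "/*" } }
    }
};
```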

Triggers, Stored Procedures, and user-defined functions with Cosmos DB

Azure Cosmos DB provides a way to execute transactional code by defining Triggers, Stored Procedures, and User-Defined Functions. You can define them through the Azure Portal, the JavaScript Query API for Cosmos DB, or the Cosmos DB SQL API client SDKs.

Azure Cosmos DB has two types of triggers.

  • Pre-triggers, which are executed before an item is changed.
  • Post-triggers, which are executed after an item has changed.
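Triggers are not fired automatically; they must first be registered on the container and then explicitly requested on each operation. A sketch with the .NET SDK: the `container` and `person` objects are assumed, and the trigger id and file name are hypothetical (the body could be the pre-trigger shown later in this article):

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Scripts;

static async Task RegisterAndUseTriggerAsync(Container container, Person person)
{
    // Register the pre-trigger once; the body is plain JavaScript.
    await container.Scripts.CreateTriggerAsync(new TriggerProperties
    {
        Id = "validateItemTimestamp",
        Body = File.ReadAllText("validateItemTimestamp.js"),
        TriggerOperation = TriggerOperation.Create,
        TriggerType = TriggerType.Pre
    });

    // Opt in to the trigger on each request that should run it.
    await container.CreateItemAsync(person, new PartitionKey(person.Name),
        new ItemRequestOptions { PreTriggers = new List<string> { "validateItemTimestamp" } });
}
```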

Change Feed Notifications with Cosmos DB

The Azure Cosmos DB change feed is a feature that monitors a container for changes and distributes the events triggered by those changes across multiple consumers.

The change feed can also be scaled up or down alongside the Cosmos DB containers, and its main components are the ones as follows,

  • The monitored container, whose inserts and updates are reflected in the change feed.
  • The lease container, which stores state and coordinates the change feed processing across workers.
  • The host, the instance that hosts the change feed processor.
  • The delegate, the code executed when the change feed delivers a batch of changes.

The change feed processor may be hosted on any Azure service that supports long-running tasks, like Azure WebJobs, Azure Virtual Machines, Azure Kubernetes Service, and .NET-hosted services.

Practical Sample

The Azure Cosmos DB Emulator can be used here; you can download it from the Microsoft Official Website.

Case study: we are going to create a database schema representing the classes below. Do not forget that your ID field must be specified and must be a unique string.

using System;
using Newtonsoft.Json; // required for the [JsonProperty] attribute

public class Person
{
    // Cosmos DB items must expose a unique string "id" property
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
    public DateTime BirthDate { get; set; }
    public string Name { get; set; }
    public string LastName { get; set; }
    public Address Address { get; set; }
    public Vehicle Vehicle { get; set; }
}

public class Address
{
    public int Id { get; set; }
    public string City { get; set; }
    public string StreetAndNumber { get; set; }
}

public class Vehicle
{
    public int Id { get; set; }
    public int Year { get; set; }
    public string Model { get; set; }
    public string Make { get; set; }
}

Creating Azure Cosmos DB using Azure Portal

In your Azure Portal, search for the Azure Cosmos DB product and then click on Add. Fill out the Basics, Networking, Backup policy, Encryption, and Tags forms then create it.

Here we will be using the Core SQL API and naming the Cosmos DB account sampleazurecosmosdb.

SQL API

After successful deployment, access your Cosmos DB resource to get the endpoint URI and primary key for later use.

Cosmos DB resource

Validate your empty Data Explorer.

Data explorer

Creating an Azure Cosmos DB Database using C#

Requirements

  1. Create a new Console Application
  2. Install the NuGet package Microsoft.Azure.Cosmos

Call the CreateDatabaseIfNotExistsAsync method on your CosmosClient to create your database.

using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

class Program
{
    private static readonly string endpointUri = "https://sampleazurecosmosdb.documents.azure.com:443/";
    private static readonly string primaryKey = "BD43cPOWtjdSsSeBTpy2rbJLIW4lMzhGoNkiVKX6y32cTQ2E2f139J0r8xxS3YR8Sy1bQywls9ByISabRjuaUQ==";

    public static async Task Main(string[] args)
    {
        using (CosmosClient client = new CosmosClient(endpointUri, primaryKey))
        {
            DatabaseResponse databaseResponse = await client.CreateDatabaseIfNotExistsAsync("SampleCosmosDB");
            Database sampleDatabase = databaseResponse.Database;

            await Console.Out.WriteLineAsync($"Database Id:\t{sampleDatabase.Id}");
        }
    }
}

Database

Validate your Data Explorer through Azure Portal.

Azure portal

Creating an Azure Cosmos DB Partitioned Container using C#

Set the indexing policy in your container properties and call the CreateContainerIfNotExistsAsync method on your database object; here we also pass the desired throughput alongside the container properties.

IndexingPolicy indexingPolicy = new IndexingPolicy
{
    IndexingMode = IndexingMode.Consistent,
    Automatic = true,
    IncludedPaths = 
    {
        new IncludedPath
        {
            Path = "/*"
        }
    }
};

var containerProperties = new ContainerProperties("Person", "/Name")
{
    IndexingPolicy = indexingPolicy
};

var sampleResponse = await sampleDatabase.CreateContainerIfNotExistsAsync(containerProperties, 10000);
var customContainer = sampleResponse.Container;
await Console.Out.WriteLineAsync($"Sample Container Id:\t{customContainer.Id}");

Azure Cosmos DB

Validate your Data Explorer through Azure Portal

Validate your Data Explorer

Adding data to your Azure Cosmos DB container using C#

We will be adding a new Person as follows.

private static Person GetPerson()
{
    return new Person
    {
        BirthDate = DateTime.Now.AddYears(-30), // born 30 years ago
        Id = "10.Thiago",
        Name = "Thiago",
        LastName = "Araujo",
        Vehicle = new Vehicle
        {
            Id = 2,
            Make = "Audi",
            Model = "TT",
            Year = 2020
        },
        Address = new Address
        {
            Id = 12,
            City = "Lisbon",
            StreetAndNumber = "Rua 25 de Abril, 4"
        }
    };
}

From your Container, call the CreateItemAsync method and pass the person object alongside its partition key.

var person = GetPerson();
var createPersonResponse = await customContainer.CreateItemAsync<Person>(person, new PartitionKey(person.Name));
await Console.Out.WriteLineAsync($"Created person with Id:\t{createPersonResponse.Resource.Id}. Consuming a total of\t{createPersonResponse.RequestCharge} RUs");

CreateItemAsync

Validate your Data Explorer through Azure Portal.

Github

Creating an Azure Cosmos DB Database using Azure CLI

Creating an Azure Cosmos DB Partitioned Container using Azure CLI.

Setting variables

$resourceGroup = "your resource group"
$cosmosDBAccount = "samplecosmosaccount"
$databaseName = "sampleclidatabase"
$containerName = "samplecontainername"
$partitionKey = "/Name"
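The screenshots below capture the commands that were run; as a reference sketch, the equivalent Azure CLI calls for the Core SQL API look roughly like this (names come from the variables above; double-check the flags against your az version):

```shell
# Create the Cosmos DB account (the Core SQL API is the default)
az cosmosdb create --name $cosmosDBAccount --resource-group $resourceGroup

# Create the database inside the account
az cosmosdb sql database create --account-name $cosmosDBAccount --resource-group $resourceGroup --name $databaseName

# Create the partitioned container inside the database
az cosmosdb sql container create --account-name $cosmosDBAccount --resource-group $resourceGroup --database-name $databaseName --name $containerName --partition-key-path $partitionKey
```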

Creating Cosmos DB Account

 Cosmos DB Account

Creating Cosmos DB Database

 Cosmos DB Database

Creating Cosmos DB Container

Cosmos DB Container

Checking Azure Portal

Checking Azure Portal

Scaling Containers

Inside your Azure Cosmos DB resource, go to Containers and then Scale. Configure your settings and click Save.

Scaling Containers

Creating a Change feed notification

Here I used the Cosmos DB Emulator to get the change feed notification working.

Create databases and containers

CosmosClient cosmosClient = new CosmosClientBuilder(endpointUri, primaryKey).Build();

Database database = await cosmosClient.CreateDatabaseIfNotExistsAsync(databaseName);

await database.CreateContainerIfNotExistsAsync(new ContainerProperties(sourceContainerName, "/id"));

await database.CreateContainerIfNotExistsAsync(new ContainerProperties(leaseContainerName, "/id"));

Start Change Feed Processor

Container leaseContainer = cosmosClient.GetContainer(databaseName, leaseContainerName);
ChangeFeedProcessor changeFeedProcessor = cosmosClient.GetContainer(databaseName, sourceContainerName)
    .GetChangeFeedProcessorBuilder<Person>(processorName: "changeFeedSample", HandleChangesAsync)
        .WithInstanceName("consoleHost")
        .WithLeaseContainer(leaseContainer)
        .Build();

await changeFeedProcessor.StartAsync();

Track Changes from Source Container

static async Task HandleChangesAsync(IReadOnlyCollection<Person> changes, CancellationToken cancellationToken)
{
    Console.WriteLine("Started handling changes...");
    foreach (Person item in changes)
    {
        Console.WriteLine($"Detected operation for person with id {item.Id}, created at {item.CreationDate}.");
        // Simulate some asynchronous operation
        await Task.Delay(10);
    }
    
    Console.WriteLine("Finished handling changes.");
}

Create Items in the Source Container

private static async Task GenerateItemsAsync(CosmosClient cosmosClient)
{
    Container sourceContainer = cosmosClient.GetContainer(databaseName, sourceContainerName);
    while (true)
    {
        Console.WriteLine("Enter a number of people to insert in the container or 'exit' to stop:");
        string command = Console.ReadLine();
        if ("exit".Equals(command, StringComparison.InvariantCultureIgnoreCase))
        {
            Console.WriteLine();
            break;
        }

        if (int.TryParse(command, out int itemsToInsert))
        {
            Console.WriteLine($"Generating {itemsToInsert} people...");
            for (int i = 0; i < itemsToInsert; i++)
            {
                var person = GetPerson();
                await sourceContainer.CreateItemAsync<Person>(person, new PartitionKey(person.Id));
            }
        }
    }
}

Complete Code

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Fluent; // CosmosClientBuilder
using Newtonsoft.Json;

class Program
{
    private static readonly string endpointUri = "https://localhost:8081/";
    private static readonly string primaryKey = "C2y6yDjf5/R+ob0N8A7Cgv30VRDJIWEHLM+4QDU5DE2nQ9nDuVTqobD4b8mGGyPMbIZnqyMsEcaGQy67XIw/Jw==";
    private static readonly string databaseName = "sampleDatabase";
    private static readonly string sourceContainerName = "sampleSourceContainer";
    private static readonly string leaseContainerName = "sampleLeaseContainer";

    static async Task Main(string[] args)
    {
        CosmosClient cosmosClient = new CosmosClientBuilder(endpointUri, primaryKey).Build();

        Database database = await cosmosClient.CreateDatabaseIfNotExistsAsync(databaseName);

        await database.CreateContainerIfNotExistsAsync(new ContainerProperties(sourceContainerName, "/id"));

        await database.CreateContainerIfNotExistsAsync(new ContainerProperties(leaseContainerName, "/id"));

        ChangeFeedProcessor processor = await StartChangeFeedProcessorAsync(cosmosClient);

        await GenerateItemsAsync(cosmosClient);
    }

    private static async Task<ChangeFeedProcessor> StartChangeFeedProcessorAsync(
        CosmosClient cosmosClient)
    {
        Container leaseContainer = cosmosClient.GetContainer(databaseName, leaseContainerName);
        ChangeFeedProcessor changeFeedProcessor = cosmosClient.GetContainer(databaseName, sourceContainerName)
            .GetChangeFeedProcessorBuilder<Person>(processorName: "changeFeedSample", HandleChangesAsync)
                .WithInstanceName("consoleHost")
                .WithLeaseContainer(leaseContainer)
                .Build();

        Console.WriteLine("Starting Change Feed Processor...");
        await changeFeedProcessor.StartAsync();
        Console.WriteLine("Change Feed Processor started.");
        return changeFeedProcessor;
    }

    static async Task HandleChangesAsync(IReadOnlyCollection<Person> changes, CancellationToken cancellationToken)
    {
        Console.WriteLine("Started handling changes...");
        foreach (Person item in changes)
        {
            Console.WriteLine($"Detected operation for person with id {item.Id}, created at {item.CreationDate}.");
            // Simulate some asynchronous operation
            await Task.Delay(10);
        }

        Console.WriteLine("Finished handling changes.");
    }

    private static async Task GenerateItemsAsync(CosmosClient cosmosClient)
    {
        Container sourceContainer = cosmosClient.GetContainer(databaseName, sourceContainerName);
        while (true)
        {
            Console.WriteLine("Enter a number of people to insert in the container or 'exit' to stop:");
            string command = Console.ReadLine();
            if ("exit".Equals(command, StringComparison.InvariantCultureIgnoreCase))
            {
                Console.WriteLine();
                break;
            }

            if (int.TryParse(command, out int itemsToInsert))
            {
                Console.WriteLine($"Generating {itemsToInsert} people...");
                for (int i = 0; i < itemsToInsert; i++)
                {
                    var person = GetPerson();
                    await sourceContainer.CreateItemAsync<Person>(person,
                        new PartitionKey(person.Id));
                }
            }
        }
    }

    private static Person GetPerson()
    {
        Random random = new Random();
        return new Person
        {
            BirthDate = DateTime.Now.AddYears(-30), // born 30 years ago
            Id = random.Next() + "Thiago",
            Name = "Thiago",
            LastName = "Araujo",
            CreationDate = DateTime.Now,
            Vehicle = new Vehicle
            {
                Id = random.Next(),
                Make = "Audi",
                Model = "TT",
                Year = random.Next(1990, 2022) // keep the model year plausible
            },
            Address = new Address
            {
                Id = random.Next(),
                City = "Lisbon",
                StreetAndNumber = "Rua 25 de Abril, 4"
            }
        };
    }
}

public class Person
{
    [JsonProperty(PropertyName = "id")]
    public string Id { get; set; }
    public DateTime BirthDate { get; set; }
    public string Name { get; set; }
    public string LastName { get; set; }
    public Address Address { get; set; }
    public Vehicle Vehicle { get; set; }
    public DateTime CreationDate { get; set; }
}

public class Address
{
    public int Id { get; set; }
    public string City { get; set; }
    public string StreetAndNumber { get; set; }
}

public class Vehicle
{
    public int Id { get; set; }
    public int Year { get; set; }
    public string Model { get; set; }
    public string Make { get; set; }
}

Result

Result

Emulator

DB emulator

Creating Stored Procedures

// SAMPLE STORED PROCEDURE
function sample(prefix) {
    var collection = getContext().getCollection();

    // Query documents and take 1st item.
    var isAccepted = collection.queryDocuments(
        collection.getSelfLink(),
        'SELECT * FROM root r',
        function(err, feed, options) {
            if (err) throw err;

            // Check the feed and if empty, set the body to 'no docs found',
            // else take 1st element from feed
            if (!feed || !feed.length) {
                var response = getContext().getResponse();
                response.setBody('no docs found');
            } else {
                var response = getContext().getResponse();
                var body = { prefix: prefix, feed: feed[0] };
                response.setBody(JSON.stringify(body));
            }
        });

    if (!isAccepted) throw new Error('The query was not accepted by the server.');
}

Executing

Executing

Creating Triggers
 

Pre-Trigger

function validateItemTimestamp() {
    var context = getContext();
    var request = context.getRequest();

    // item to be created in the current operation
    var itemToCreate = request.getBody();

    // validate properties
    if (!("triggerTime" in itemToCreate)) {
        var ts = new Date();
        itemToCreate["triggerTime"] = ts.getTime();
    }

    // update the item that will be created
    request.setBody(itemToCreate);
}

Post-Trigger

function updateMetadata() {
    var context = getContext();
    var container = context.getCollection();
    var response = context.getResponse();

    // item that was created
    var createdItem = response.getBody();

    // query for metadata document
    var filterQuery = 'SELECT * FROM root r WHERE r.id = "_metadata"';
    var accept = container.queryDocuments(container.getSelfLink(), filterQuery, updateMetadataCallback);
    if (!accept) throw new Error("Unable to update metadata, abort");
}

function updateMetadataCallback(err, items, responseOptions) {
    if (err) throw new Error("Error: " + err.message);
    if (items.length != 1) throw new Error("Unable to find metadata document");

    var metadataItem = items[0];

    // update metadata
    metadataItem.createdItems += 1;
    metadataItem.createdNames += " Post trigger";
    var accept = container.replaceDocument(metadataItem._self, metadataItem, function(err, itemReplaced) {
        if (err) throw new Error("Unable to update metadata, abort");
    });
    if (!accept) throw new Error("Unable to update metadata, abort");
    return;
}

External References