Utilize Multi Master Support In Azure Cosmos DB

Introduction

Azure Cosmos DB has become well known in the distributed systems world for its multi-model support, multi-master writes, low latency, and high throughput. You can create an Azure Cosmos DB account using the Azure portal, PowerShell, the Azure CLI, or an Azure Resource Manager template.

In this article, I explain multi-master and how Cosmos DB resolves the conflicts that arise during data replication in a multi-master setup. Before reading this article, please read my previous article to understand the basics of Cosmos DB.

Description

In a distributed system, keeping data highly available while also keeping latency low is a big challenge. Cosmos DB already addresses read latency through multi-region replication and a choice of consistency models. Read latency stays very low because the data is served from the region nearest to you.

Write operations, however, incur noticeably higher latency because only one datacenter accepts writes. Please read this blog to understand how global distribution works. Cosmos DB now offers a new feature, multi-master, to improve write latency.

This article explains Cosmos DB’s multi-master support, a new release in Azure, along with its advantages and how it handles data replication conflicts.

What is Multi-master?

Until now, distributed database architectures have typically had one write server and multiple read servers. Data is replicated from the write server to all the other servers.

With the multi-master feature, every data server can act as a write server, and the data written to each one is replicated to the others. The big challenge with multi-master is avoiding duplicate data during replication.

How to enable Multi-master features using CLI

The command below enables the multi-master feature using the CLI.

az cosmosdb create \
--name "myCosmosDbAccount" \
--resource-group "myResourceGroup" \
--default-consistency-level "Session" \
--enable-automatic-failover "true" \
--locations "EastUS=0" "WestUS=1" \
--enable-multiple-write-locations true

Let’s look at each part of the above command in detail.

az cosmosdb create
az: The Azure CLI command.
cosmosdb: The Azure resource we want to work with.
create: The action; here, we want to create the resource.

--name "myCosmosDbAccount"
name: The name of the Cosmos DB account to create. It must be unique.

--resource-group "myResourceGroup"
resource-group: The name of an existing resource group.

--default-consistency-level "Session"
default-consistency-level: Cosmos DB offers five consistency levels; any one of them can be set as the default.

--enable-automatic-failover "true"
enable-automatic-failover: When enabled, one of the read datacenters can take over as the write datacenter if the current write datacenter becomes unavailable.

--locations "EastUS=0" "WestUS=1"
locations: The regions to which the data is replicated, each with its failover priority.

--enable-multiple-write-locations true
enable-multiple-write-locations: The main property for enabling multiple write datacenters. If it is false, the datacenter at location 0 is the write datacenter and the remaining ones are read-only.
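
The rule described for enable-multiple-write-locations can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not part of the Azure CLI or any SDK; the function name write_regions and the dictionary layout are assumptions made for this example.

```python
def write_regions(locations, enable_multiple_write_locations):
    """Return the regions that accept writes.

    `locations` maps region name -> failover priority, mirroring the
    --locations "EastUS=0" "WestUS=1" argument (hypothetical layout).
    """
    if enable_multiple_write_locations:
        # Multi-master: every listed region accepts writes.
        return sorted(locations, key=locations.get)
    # Single-master: only the priority-0 region accepts writes.
    return [region for region, priority in locations.items() if priority == 0]


locations = {"EastUS": 0, "WestUS": 1}
print(write_regions(locations, True))   # ['EastUS', 'WestUS']
print(write_regions(locations, False))  # ['EastUS']
```

With the flag set to false, only the region with priority 0 is returned, which matches the single-write-region behaviour described above.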

How to enable Cosmos DB Multi-master features using the Azure portal

  1. In a new browser window, sign in to the Azure portal.
  2. Click Create a resource > Databases > Azure Cosmos DB.
  3. Fill in all the details.
  4. The Cosmos DB account name must be unique.
  5. Enable Multi-region writes; it is disabled by default.

Replication difficulties

Let me explain the difficulties of replication in a multi-master architecture.

Let’s assume we have three data servers called A, B, and C, all of which have read and write rights.
 

Data a, b, and c are written to A; data d, e, and f are written to B; and data g, h, and j (the letter i is not used here) are written to C.

First, server B, which already holds d, e, and f, replicates from A. After replication, B has a, b, c, d, e, and f.

 
Second, server C replicates from B and receives a, b, c, d, e, and f. After replication, C has a, b, c, d, e, f, g, h, and j.
 

Third, server C replicates from A, which sends a, b, and c. But this data is already available on server C, because it arrived earlier through B. If the replication runs without any check, the data will be duplicated.
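
The three replication steps above can be simulated with plain Python lists. This is a toy model built only to show where the duplicates come from; it does not describe how Cosmos DB replicates data internally, and the function names are made up for this example.

```python
def replicate_naive(target, incoming):
    """Append everything the source sends, with no duplicate check."""
    target.extend(incoming)


def replicate_dedup(target, incoming):
    """Append only the items the target has not already received."""
    seen = set(target)
    for item in incoming:
        if item not in seen:
            target.append(item)
            seen.add(item)


def run(replicate):
    a = ["a", "b", "c"]
    b = ["d", "e", "f"]
    c = ["g", "h", "j"]
    replicate(b, list(a))  # step 1: B replicates from A
    replicate(c, list(b))  # step 2: C replicates from B (includes A's data)
    replicate(c, list(a))  # step 3: C replicates from A again
    return sorted(c)


print(run(replicate_naive))
# ['a', 'a', 'b', 'b', 'c', 'c', 'd', 'e', 'f', 'g', 'h', 'j']
print(run(replicate_dedup))
# ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'j']
```

The naive version ends with a, b, and c stored twice on C, while the version that checks what it has already received does not.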


Conflict Scenarios

Three types of conflict can occur while working with multi-master.

  • Insert
    It occurs when we insert two or more documents with the same unique index (for example, ID) from two or more regions simultaneously. When conflict resolution is applied, only one document with the original ID is committed, and the others are discarded.

  • Replace
    It occurs when we update a single document simultaneously from two or more regions.

  • Delete
    It occurs when we delete a document from one region and update it from one or more other regions.
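
The three scenarios above can be expressed as a small classifier over two concurrent writes. This is a sketch for illustration only; the Write class and classify_conflict function are hypothetical and are not part of any Cosmos DB SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    region: str
    kind: str    # "insert", "replace", or "delete"
    doc_id: str


def classify_conflict(w1, w2):
    """Return the conflict type for two concurrent writes, or None."""
    # A conflict needs the same document and two different regions.
    if w1.doc_id != w2.doc_id or w1.region == w2.region:
        return None
    kinds = {w1.kind, w2.kind}
    if kinds == {"insert"}:
        return "insert"
    if kinds == {"replace"}:
        return "replace"
    if kinds == {"delete", "replace"}:
        return "delete"
    return None


print(classify_conflict(Write("EastUS", "insert", "1"), Write("WestUS", "insert", "1")))
# insert
print(classify_conflict(Write("EastUS", "delete", "1"), Write("WestUS", "replace", "1")))
# delete
```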

Cosmos DB offers three types of conflict resolution: Last-Writer-Wins (LWW), Custom (user-defined procedure), and Custom (asynchronous).

As of now, only LWW is available for all the Cosmos DB API models, and it is the default. All three can be used only with the SQL API.

LWW

LWW resolves conflicts based on a numeric value stored in a designated property of the document, known as the conflict resolution path.

Insert and Update conflicts

If two or more documents conflict in insert or replace operations, the document that contains the largest value for the conflict resolution path becomes the winner. If multiple documents have the same numeric value for the resolution path, the selected winner is non-deterministic. The winning document is then replicated to all regions.
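
The rule for insert and replace conflicts can be sketched as follows. This is a minimal illustration that assumes the conflict resolution path is a top-level numeric property, here the `_ts` timestamp; the function name resolve_lww and the sample documents are made up for this example.

```python
def resolve_lww(versions, path="_ts"):
    """Pick the version with the largest value at the resolution path.

    Note: on a tie, max() returns the first version it sees. In Cosmos DB
    the winner of a tie is non-deterministic, so do not rely on tie order.
    """
    return max(versions, key=lambda doc: doc[path])


east = {"id": "1", "status": "packed", "_ts": 1700000005}
west = {"id": "1", "status": "shipped", "_ts": 1700000009}

winner = resolve_lww([east, west])
print(winner["status"])  # shipped
```

The version written in the second region wins because its value at the resolution path is larger.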

Delete conflict

If a delete conflict is involved, the deleted version always wins over the conflicting replace operations.
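
The delete rule can be added to the same kind of sketch. Here a deleted version is marked with a hypothetical "deleted" flag, which is an assumption made for this example and not a real Cosmos DB property; the resolution path is again assumed to be `_ts`.

```python
def resolve_with_deletes(versions, path="_ts"):
    """Return the winning version, or None if the document ends up deleted."""
    # A delete wins regardless of the value at the resolution path.
    if any(v.get("deleted", False) for v in versions):
        return None
    return max(versions, key=lambda v: v[path])


replaced = {"id": "1", "status": "shipped", "_ts": 20}
deleted = {"id": "1", "deleted": True, "_ts": 10}

print(resolve_with_deletes([replaced, deleted]))  # None
```

Even though the replaced version has the larger `_ts` value, the result is a deleted document.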

Use cases

  1. Social media
    Social media applications need low latency for both reads and writes to improve user engagement. When someone posts a status update, it should be written in that user’s own region to keep write latency low, and then shared with friends spread across the globe.

  2. E-Commerce
    E-commerce applications need to be highly responsive. Accepting both reads and writes in multiple regions provides greater availability with very low latency.

  3. IoT
    For example, an application needs to track data from a flight that departs from Singapore and arrives in the USA. At takeoff, the home region is Singapore; at landing, the home region is reassigned to the USA, which is then the closest. Writes stay fast because the data is always written to the nearest region.
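
The flight example can be sketched as a nearest-region lookup. This only illustrates how an application might pick the closest write region from its current position; the region list, the approximate coordinates, and the function names are assumptions for this example, not values taken from Azure.

```python
import math

# Approximate coordinates, for illustration only.
REGIONS = {
    "Southeast Asia": (1.35, 103.82),  # Singapore
    "West US": (37.78, -122.42),       # California
}


def haversine_km(p, q):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_region(position, regions=REGIONS):
    """Return the name of the region closest to the given position."""
    return min(regions, key=lambda name: haversine_km(position, regions[name]))


print(nearest_region((1.36, 103.99)))    # at takeoff, near Singapore: Southeast Asia
print(nearest_region((33.94, -118.41)))  # at landing, near Los Angeles: West US
```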

Conclusion

In this article, we have seen the newly released multi-master feature of Cosmos DB and how it resolves conflicts, along with some use cases.