MongoDB Indexing (Day 13)

Before reading this article, I highly recommend reading the previous installments in the series:

Introduction

Indexes support the efficient resolution of queries. Indexes execute queries in an efficient way. Without indexing, MongoDB does a collection scan. In a collection scan, MongoDB scans each document individually to select those documents that match the selection criteria. This approach takes much time and resources. So MongoDB provides indexes that can provide a reliable and efficient way of scanning documents. If an appropriate index exists for a query, then MongoDB can use this index to limit the number of documents it must scan.

Indexes are special data structures that store a small portion of the data set in an easy to traverse form. The index can store the value of a specific field or a set of fields, ordered by the value of the field that is specified in the index.

Types of Index

MongoDB supports several types of indexes. We can create an index on a single field, multiple fields, an index for an embedded field or an index for embedded documents. We should create an index for fields frequently used in queries, such that we can ensure that MongoDB scans the smallest possible number of documents. The createIndex() method creates indexes.

MongoDB has the following types of indexes.
  • Default Index
  • Single Field Index
  • Compound Index
  • MultiKey Index
  • Text Index
  • Geospatial Index
  • Hashed Index

Now to briefly describe each type of index.

Default _ID

Each MongoDB collection contains a builtin index on the _id field of the collection. This is a default index used by MongoDB. The index of the _id field contains the Unique property. In other words, we can’t insert a duplicate value into the _id field. If we don’t specify the value for the _id field then MongoDb inserts a unique value into the _id field automatically. The index of the _id field stores the data in ascending order.

Example

db.post.find({},{_id:1,Likes:1}).pretty()

  1. { "_id" : ObjectId("55c545e1acae58b1a7d7dff1"), "Likes" : 5 }  
  2. { "_id" : ObjectId("55c54617acae58b1a7d7dff2"), "Likes" : 4 }  
  3. { "_id" : ObjectId("55c54665acae58b1a7d7dff3"), "Likes" : 4 }  
  4. { "_id" : ObjectId("55c546a6acae58b1a7d7dff4"), "Likes" : 5 }  
  5. { "_id" : ObjectId("55c546d9acae58b1a7d7dff5"), "Likes" : 3 }  
  6. { "_id" : ObjectId("55c54711acae58b1a7d7dff6"), "Likes" : 6 }  
In the preceding example we retrieve _id, like a field from the “post” collection. The _id in the preceding output is in ascending order.

Single Field Index

In MongoDB, we can create an index for a single field or multiple fields. To create an index on a single field, pass the name of that field with the sorting order option.

Syntax


db.Collection_Name.createIndex({field:sorting_order})

Sorting_Order

1 for sort in ascending order and -1 for sort in descending order.

Example

db.post.createIndex({Likes:1})

Output
  1. {  
  2.     "createdCollectionAutomatically": false,  
  3.     "numIndexesBefore": 1,  
  4.     "numIndexesAfter": 2,  
  5.     "ok": 1  
  6. }  
In the preceding example, we create an index on the “Likes” field and specify 1 for the sorting order. In other words, the index orders the items in ascending order. We can see the createIndex() method return 4 parameters in acknowledge. The first parameter is “createdCollectionAutomatically” which specifies that this index is user-defined or created by the collection automatically. The second parameter is “numIndexesBefore” that specifies the number of indexes before the execution of the current command. The third parameter is “numIndexesAfter” which defines the number of indexes after the execution of the current command. The fourth parameter is “ok” that indicates the creation of an index, 1 for success and 0 for failure.

Index on Embedded Field

In the previous example, we create an index for the top-level field such that we can create an index for the field of the embedded document. Use “dot notation” to create an index on the embedded field.

Example
  1. {  
  2.     "_id": ObjectId("55c545e1acae58b1a7d7dff1"),  
  3.     "PostBY": "Pankaj choudhary",  
  4.     "Time": ISODate("2015-08-07T23:57:21.041Z"),  
  5.     "Title": "Introduction of MongoDB",  
  6.     "Tags": [  
  7.         "NoSQL",  
  8.         "MongoDB",  
  9.         "Database"],  
  10.     "Likes": 5,  
  11.     "Comment": {  
  12.         "CommentBy": "Rahul",  
  13.         "Text": "Nice Article"  
  14.     }  
  15.   
  16. }  
Now we create an index on the “CommentBy” field of the embedded document using the following specification.

db.post.createIndex({"Comment.CommentBy":1})

Index on Embedded Document

In the previous example, we create an index for the embedded field. Now we create an index for the embedded document.

For example, the post-collection contains the “Comment” field that is an embedded document.

Example

db.post.createIndex({"Comment":1})

Compound Indexes

In MongoDB, we can create an index for multiple fields of a collection. MongoDB supports 31 fields in a single compound index.

The following is an example:
  1. {  
  2.     "_id": ObjectId("55c545e1acae58b1a7d7dff1"),  
  3.     "PostBY": "Pankaj choudhary",  
  4.     "Time": ISODate("2015-08-07T23:57:21.041Z"),  
  5.     "Title": "Introduction of MongoDB",  
  6.     "Tags": [  
  7.         "NoSQL",  
  8.         "MongoDB",  
  9.         "Database"],  
  10.     "Likes": 5,  
  11.     "Comment": {  
  12.         "CommentBy": "Rahul",  
  13.         "Text": "Nice Article"  
  14.     }  
  15.   
  16. }  
Now we create a compound index of two fields with the following sorting order.

db.post.createIndex({"Comment":1,Title:-1})

The preceding query indexes the sort, first by the “Comment” field in ascending order, then by the “Title” field in descending order.

Prefixes

An Index prefix refers to the beginning subsets of indexed fields. Let's see an example:

db.post.createIndex({Comment:1,Title:-1,PostBy:1})

The above index has the following index prefixes:
  • { Comment:1}
  • {Comment:1,Title:-1}

MongoDB uses the preceding index for queries containing the following fields.

  • Comment field
  • Comment and Title field
  • Comment and Title and PostBY field

MongoDB never uses the preceding index for queries that contain the following fields.

  • Title field
  • PostBY field
  • Title and PostBY field

MongoDB can use the preceding index for queries that contain “Comment” and “PostBY” fields because the “Comment” field corresponds to a prefix. But this index would not be as efficient as a separate index for the “Comment” and “PostBY” fields.

MultiKey Indexes

If we create an index on a field that holds the array value, then MongoDB creates an index key for each element in the array. Multikey indexes support efficient queries against array fields. Multikey indexes can be created on an array field that contains either scalar values (in other words string or Integer) or nested documents. MongoDB automatically creates a multikey index if an indexed field is an array type. We don’t need to specify the multikey type.

Let's see an example:

  1. {  
  2.     "_id": ObjectId("55c54711acae58b1a7d7dff6"),  
  3.     "PostBY": "Pankaj choudhary",  
  4.     "Time": ISODate("2015-08-08T00:02:25.968Z"),  
  5.     "Title": "MongoDB Day3",  
  6.     "Tags": [  
  7.         "NoSQL",  
  8.         "MongoDB",  
  9.         "Database"],  
  10.     "Likes": 6,  
  11.     "Comment": [{  
  12.         "CommentBy": "Neeraj",  
  13.         "Text": " Nice One"  
  14.     }, {  
  15.         "CommentBy": "Rahul",  
  16.         "Text": "Thanks to Share"  
  17.     }]  
  18. }  
In the preceding document, we have two fields that are array types. The first one is “Tags” that contain scalar values and the second is “Comment” that contains nested documents. Now we create an index for both fields.

Index for “Tags” field.
db.post.createIndex({"Tags":-1})

Index for “Comment” field.
db.post.createIndex({"Comment":1})

Compound Multikey Indexes


In the previous example, we create two indexes, one for the “Tags” field and another for the “Comment” field. But we can’t create a compound index if more than one field is the “Array” type.

Let's try to create a compound index for the “Tags” and “ Comment” fields.

db.post.createIndex({"Comment":-1,"Tags":1 })

Output
  1. {  
  2.     "createdCollectionAutomatically": false,  
  3.     "numIndexesBefore": 10,  
  4.     "errmsg": "exception: cannot index parallel arrays [Tags] [Comment]",  
  5.     "code": 10088,  
  6.     "ok": 0  
  7. }  
MongoDB throws an error if we execute the preceding query because MongoDB doesn’t allow creation of a compound index for multiple “Array” fields.

But we can create a compound index if at most one field is the “Array” type.

db.post.createIndex({"Comment":-1,"Title":1 })

Text Indexes

MongoDB provides the feature “Text Index” to search the string contents in the documents of a collection. The Text Index can contain any field that has a string or array of strings. A collection can have at most one text index. But we can specify multiple fields for the text index.

To create a Text Index field set the value of the “comments” parameter to “text” in the createIndex() method.

Syntax

db.Collection_Name.createIndex({comments:”text”})

Example1

db.post.createIndex({"Title":1 },{comments:"text"})

Example2

db.post.createIndex({"Title":1 ,PostBY:1},{comments:"text"})

Wildcard Text Indexes

We know that a MongoDB database consists of unstructured data, so sometimes it is difficult to predict which fields contain text type data. MongoDB provides the wildcard specifier ($**) to allow for a text search on all fields with string content. Using a wildcard index MongoDB indexes all the fields containing string or text type data.

Let us see an example:

db.post.createIndex( { "$**": "text" } )

The preceding example creates a text index using a wildcard specifier.

We can create a wildcard text index in the compound index. Let us see an example.

db.post.createIndex({"Title":1,"$**":"text"})

In the preceding example, we create a compound index. This compound index contains a “Title” field and a wildcard text index.

Geospatial Indexes

A Geospatial index is mainly used to store the location data or geospatial information. Before storing any data we must decide the type of surface. MongoDB supports two types of surfaces.
  1. Spherical

    Calculates the geometry over a spherical surface (such as Earth) and stores location data on a spherical surface using 2dsphare.

  2. Flat

    Calculates distance over a Euclidean and store location as legacy coordinate pairs using a 2d index.

Hashed Index

Hashed indexed entries are maintained with hashes of the values of the indexed field. The Hashing function collapses embedded documents and computes the hash for the entire value. A Hashed index supports equality queries but doesn’t support range queries and multi-key index. We can’t create a compound index for a Hashed index. However, we can create a non-hashed index (such as a single field or a compound index) and hashed index on the same field.

Syntax

db.Collection_Name.createIndex({Field:”hashed”})

Let us see an example:

db.post.createIndex({"Title":"hashed"})

In the preceding query, we created a hashed index on the “Title” field.

Index Properties

Indexes in MongoDB supports these three properties for indexes.

  • Unique Index
  • TTL Index
  • Sparse Index

Unique Index

Unique index properties are used to restrict all the documents with a duplicate value for the indexed field or that already exists. To create a unique index, set the value of the “Unique” parameter to “true” in the createIndex method. By default the value unique is false.

Syntax

db.Collection_Name.createIndex({fieldname:sorting_order},{unique:true})

Example

Consider the following “Demo” collection. The Id field of the “Demo” collection has the following values.

db.My_Collection.find({},{_id:0})

  1. { "Id" : "1" }  
  2. { "Id" : "2" }  
  3. { "Id" : "3" }  
Now we create a Unique index on the “Id” field .

db.Demo.createIndex({"Id":1},{unique:true})

Now we try to insert a duplicate value into the “Id” field.

db.Demo.insert({Id:"1"})

Output
  1. WriteResult
  2. ({  
  3.          "nInserted" : 0,  
  4.          "writeError" : 
  5.          {  
  6.                  "code" : 11000,  
  7.                  "errmsg" : "E11000 duplicate key error index: Temp.Demo.$Id_1 dup key: { : \"1\  " }"  
  8.          }  
  9.  })  
This output shows that if we try to insert a duplicate value into the “Unique” field then MongoDB will throw an error.

TTL Indexes

TTL indexes are special types of indexes that MongoDB uses to remove documents automatically from a collection after a certain time period. This approach is very useful for machine generating data, logs, session information or storing the information for a temporary time period. To create a TTL index, use the expireAfterSeconds option with the createIndex method. In the expireAfterSeconds option we provide the time period in seconds. The TTL index does not guarantee that expired data will be deleted immediately upon expiration. There may be a delay between the time a document expires and the time that MongoDB removes the document from the database.

Syntax

db.Collection_Name.createIndex({fieldname:sorting_order},{ expireAfterSeconds:time})

Example

The Demo collection contains the following documents.
  1. {  
  2.     "Id": 1,  
  3.     "Name": "Pankaj"  
  4. } {  
  5.     "Id": 2,  
  6.     "Name": "Rahul"  
  7. } {  
  8.     "Id": 3,  
  9.     "Name": "Sandeep"  
  10. }  

Now we create an index on the “Id” field with the expireAfterSeconds option.

db.Demo.createIndex({Name:1},{expireAfterSeconds:120 })

In the preceding example, we create an index and set the value of expireAfterSeconds to 120. In other words, MongoDB removes the document after 120 seconds.

Sparse Index

The Sparse option in an index ensures that the index only contains entries for documents that have the indexed field. It also removes documents containing null values. A non-sparse index contains all documents even if the documents don’t contain the indexed field. The index is “sparse” because it does not include all the documents of the collection.

Syntax

db.Collection_Name.createIndex({fieldname:sorting_order}, { sparse: true })

Example

The Demo collection has the following documents.
  1. {  
  2.     "Id": 1,  
  3.     "Name": "Pankaj"  
  4. } {  
  5.     "Id": 2,  
  6.     "Name": "Rahul"  
  7. } {  
  8.     "Id": 3,  
  9.     "Name": "Sandeep"  
  10. }  
Now we create an index on the “Id” field with the sparse option.

db.Demo.createIndex({Name:1},{sparse:true })

In the preceding example, we create an index and set the value of the sparse option to true.

Basic Index Methods

Now we consider some basic index methods.

getIndexes() Method


The getIndexes() method retrieves all the existing indexes in a collection.

Syntax

db.Collection_Name.getIndexes()

Example

db.Demo.getIndexes()

Output
  1. [{  
  2.     "v": 1,  
  3.     "key": {  
  4.         "_id": 1  
  5.     },  
  6.     "name": "_id_",  
  7.     "ns": "Temp.Demo"  
  8. }, {  
  9.     "v": 1,  
  10.     "key": {  
  11.         "Id": 1  
  12.     },  
  13.     "name": "Id_1",  
  14.     "ns": "Temp.Demo",  
  15.     "expireAfterSeconds": 1200  
  16. }, {  
  17.     "v": 1,  
  18.     "key": {  
  19.         "Name": 1  
  20.     },  
  21.     "name": "Name_1",  
  22.     "ns": "Temp.Demo",  
  23.     "expireAfterSeconds": 12  
  24. }]  
dropIndex() Method

The dropIndex() method removes a specific index from a collection.

Syntax

db.Collection_Name.dropIndex({Field_Name})

Example

db.Demo.dropIndex({"Id":1})

Output

{ "nIndexesWas" : 3, "ok" : 1 }

In the preceding output, “nIndexesWas” reflects the number of indexes before removing this index and “ok” execution status of command.

ensureIndex() Method

The ensureIndex() method creates an index. The ensureIndex() method only creates an index if an index of the same specification does not already exist.

Syntax

db.Collection_Name.ensureIndex({field_name:sorting_order})

Example

db.Demo.ensureIndex({"Id":1})

reIndex() Method

The reIndex() method rebuilds the indexes for a collection. The reIndex() method first drops all the indexes, then rebuilds all the indexes.

Syntax

db.Collection_Name.reIndex()

Example

db.Demo.reIndex()

Output
  1. {  
  2.     "nIndexesWas": 3,  
  3.     "nIndexes": 3,  
  4.     "indexes": [{  
  5.         "key": {  
  6.             "_id": 1  
  7.         },  
  8.         "name": "_id_",  
  9.         "ns": "Temp.Demo"  
  10.     }, {  
  11.         "key": {  
  12.             "Name": 1  
  13.         },  
  14.         "name": "Name_1",  
  15.         "ns": "Temp.Demo",  
  16.         "expireAfterSeconds": 12  
  17.     }, {  
  18.         "key": {  
  19.             "Id": 1  
  20.         },  
  21.         "name": "Id_1",  
  22.         "ns": "Temp.Demo"  
  23.     }],  
  24.     "ok": 1  
  25. }  
Indexes.find() Method

The indexes.find() method returns a list of all indexes on all collections of a database.

Syntax

db.system.indexes.find()

Example

db.system.indexes.find()

Output
  1. {  
  2.     "v": 1,  
  3.     "key": {  
  4.         "_id": 1  
  5.     },  
  6.     "name": "_id_",  
  7.     "ns": "Temp.Employee_Deatils"  
  8. } {  
  9.     "v": 1,  
  10.     "key": {  
  11.         "_id": 1  
  12.     },  
  13.     "name": "_id_",  
  14.     "ns": "Temp.Employee_Details"  
  15. } {  
  16.     "v": 1,  
  17.     "key": {  
  18.         "_id": 1  
  19.     },  
  20.     "name": "_id_",  
  21.     "ns": "Temp.pan"  
  22. } {  
  23.     "v": 1,  
  24.     "key": {  
  25.         "_id": 1  
  26.     },  
  27.     "name": "_id_",  
  28.     "ns": "Temp.post"  
  29. } {  
  30.     "v": 1,  
  31.     "key": {  
  32.         "_id": 1  
  33.     },  
  34.     "name": "_id_",  
  35.     "ns": "Temp.ram"  
  36. } {  
  37.     "v": 1,  
  38.     "key": {  
  39.         "_id": 1  
  40.     },  
  41.     "name": "_id_",  
  42.     "ns": "Temp.Test"  
  43. } {  
  44.     "v": 1,  
  45.     "key": {  
  46.         "_id": 1  
  47.     },  
  48.     "name": "_id_",  
  49.     "ns": "Temp.City_Info"  
  50. } {  
  51.     "v": 1,  
  52.     "key": {  
  53.         "_id": 1  
  54.     },  
  55.     "name": "_id_",  
  56.     "ns": "Temp.Post"  
  57. } {  
  58.     "v": 1,  
  59.     "key": {  
  60.         "_id": 1  
  61.     },  
  62.     "name": "_id_",  
  63.     "ns": "Temp.Employee"  
  64. } {  
  65.     "v": 1,  
  66.     "key": {  
  67.         "Name": 1  
  68.     },  
  69.     "name": "Name_1",  
  70.     "ns": "Temp.Employee"  
  71. } {  
  72.     "v": 1,  
  73.     "key": {  
  74.         "Likes": 1  
  75.     },  
  76.     "name": "Likes_1",  
  77.     "ns": "Temp.post"  
  78. } {  
  79.     "v": 1,  
  80.     "key": {  
  81.         "Likedfs": 1  
  82.     },  
  83.     "name": "Likedfs_1",  
  84.     "ns": "Temp.post"  
  85. } {  
  86.     "v": 1,  
  87.     "key": {  
  88.         "Likedfgdfs": 1  
  89.     },  
  90.     "name": "Likedfgdfs_1",  
  91.     "ns": "Temp.post"  
  92. } {  
  93.     "v": 1,  
  94.     "key": {  
  95.         "Comment.CommentBy": 1  
  96.     },  
  97.     "name": "Comment.CommentBy_1",  
  98.     "ns": "Temp.po  
  99. st"  
  100. }  
Today we learned about indexes, their types and some basic methods for indexes. In the next article we will learn about Aggregation.

Thanks for reading this article!
 
Author
Pankaj Kumar Choudhary
71 26.6k 13.4m
Next » MongoDB Aggregations (Day 14)