In this article, we are going to learn how to create, get, search, update, and delete Elasticsearch documents. Please check the previous part first.
Topics
- Creating a New Document
- Getting a Document
- Searching Documents (Querying)
- Updating a Document
- Deleting a Document
What is an Index?
An index is like a database in a relational database system. Each index may contain various types.
What is a Type?
A type is like a table in a relational database. Each type has various fields.
What is a Document?
A document is a JSON document stored in Elasticsearch. It is like a row in a table. Each document is stored in an index and has a type and a document id.
What is a Field?
A document contains a list of fields. A field is similar to a column in a relational database: it stores a single value, much like a value type.
Definitions are taken from https://www.slideshare.net/ABCTalks/elastic-search-overview
Let’s create a document. First, we need a REST client.
There are various REST clients available, but for this demo I am going to use Postman.
Comparing Query and JSON Object
The query for this data joins multiple tables, and over time, as the tables grow, it will take longer to return results. The solution is to move the data returned by the query into Elasticsearch, where it is easy to search.
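To make the comparison concrete, here is a minimal Python sketch of turning one row returned by such a query into the JSON object that we will store. The column list and the row are assumptions for illustration (they reuse the field names from the document created later in this article); the actual query is not reproduced here.

```python
import json

# Assumed column names and one assumed result row of the joined query.
COLUMNS = ["ProjectID", "ProjectName", "NatureofIndustry", "ProjectCode"]
ROW = ("1", "ABC Bearings Ltd", "Bearings", "A001")


def row_to_document(columns, row):
    """Pair each column name with its value to form a JSON-ready dict."""
    return dict(zip(columns, row))


document = row_to_document(COLUMNS, ROW)
print(json.dumps(document, indent=2))
```

Each row becomes one self-contained document, so no join is needed at search time.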
Creating a New Document
To create a document in Elasticsearch, we are going to use the RESTful API provided by Elasticsearch.
The URL of an Elasticsearch request is divided into segments.
- Server address: the address of the Elasticsearch server
- Index: the database name
- Type: the table name
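The three segments above can be sketched in Python as a small helper. The function name `build_url` is hypothetical; the server address, index, and type are the ones used throughout this article.

```python
def build_url(server, index, doc_type):
    """Join the server address, index ("database") and type ("table")."""
    return f"{server.rstrip('/')}/{index}/{doc_type}"


url = build_url("http://127.0.0.1:9200", "timesheet", "projectmaster")
print(url)  # http://127.0.0.1:9200/timesheet/projectmaster
```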
To create the first document, we are going to use an HTTP POST request.
Steps
Enter URL - http://127.0.0.1:9200/timesheet/projectmaster
After setting the request URL, we set the request headers.
Set Header “Content-Type: application/json”
After setting the headers, we enter the request body.
Enter Request Body
We are going to insert the JSON data below into Elasticsearch.
A snippet of Request JSON
{
  "ProjectID": "1",
  "ProjectName": "ABC Bearings Ltd",
  "NatureofIndustry": "Bearings",
  "ProjectCode": "A001"
}
After entering the request URL, request headers, and request body, we post the data.
Posting data for creating a new document
After posting the data to Elasticsearch, we get a response.
The response contains the index and type names that you set, a unique “_id”, and a result of “created”.
In the same way, we can create other new documents in Elasticsearch.
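For readers who prefer a script to Postman, the same POST request can be sketched with Python's standard library. This is only an illustration under the assumptions of this article (local server, “timesheet” index, “projectmaster” type); the helper names `build_create_request` and `send` are hypothetical, and `send` needs a running Elasticsearch instance.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9200/timesheet/projectmaster"


def build_create_request(document):
    """Build (but do not send) the POST request that creates one document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send a request and decode the JSON response (requires a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


project = {
    "ProjectID": "1",
    "ProjectName": "ABC Bearings Ltd",
    "NatureofIndustry": "Bearings",
    "ProjectCode": "A001",
}
request = build_create_request(project)
print(request.get_method(), request.full_url)
# With Elasticsearch running locally: print(send(request)["result"])
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything reaches the server.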
Note
In the same way, I have inserted 4 more documents into the “timesheet” index.
After inserting all the documents, we will get them back from Elasticsearch.
The five projects which I have created for the demo:
- "ProjectName": "ABC Bearings Ltd"
- "ProjectName": "Alok Industries Ltd"
- "ProjectName": "Ambuja Cement Ltd"
- "ProjectName": "Ansa Pack (Simplex Group)"
- "ProjectName": "Anil Bioplus Ltd (Anil Group Ahmedabad)"
Get Document
The get API allows getting a typed JSON document from the index based on its id.
The snippet below contains the document id (“_id”) which we are going to use to get the document.
A snippet of Response JSON
{
  "_index": "timesheet",
  "_type": "projectmaster",
  "_id": "B_K8nGMBcox2VuUwbivo",
  "_score": 1,
  "_source": {
    "ProjectID": "2",
    "ProjectName": "Alok Industries Ltd",
    "NatureofIndustry": "Textile",
    "ProjectCode": "A002"
  }
}
URL - http://127.0.0.1:9200/timesheet/projectmaster/B_K8nGMBcox2VuUwbivo
After setting the URL, send a GET request to get a response.
Below is the response which we received for that request.
But what if we only want to view the “_source” part of the response? Is that possible?
Yes, it is.
Just append “_source” to the existing request URL, and that’s it.
URL - http://127.0.0.1:9200/timesheet/projectmaster/B_K8nGMBcox2VuUwbivo/_source
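The two GET URLs above differ only in the trailing “_source” segment. A minimal Python sketch, with a hypothetical helper name `build_get_request` and the document id taken from the snippet above, could look like this:

```python
import urllib.request

BASE_URL = "http://127.0.0.1:9200/timesheet/projectmaster"


def build_get_request(doc_id, source_only=False):
    """Build a GET request for one document, optionally for its _source only."""
    url = f"{BASE_URL}/{doc_id}"
    if source_only:
        url += "/_source"
    return urllib.request.Request(url, method="GET")


full_request = build_get_request("B_K8nGMBcox2VuUwbivo")
source_request = build_get_request("B_K8nGMBcox2VuUwbivo", source_only=True)
print(full_request.full_url)
print(source_request.full_url)
```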
Next, we are going to learn how to search documents.
Querying Document (Search)
Get All Data from Elasticsearch (match_all Query)
Now let’s query the data and get everything which we have inserted so far.
- To search data, we need to append “_search” to the URL.
URL - http://127.0.0.1:9200/timesheet/projectmaster/_search
- Setting the header
- Setting the query to get all documents in the “timesheet” index and “projectmaster” type
A snippet of Request JSON
{
  "query": {
    "match_all": {}
  }
}
After posting the query, we get back all the documents which we have entered.
In the response, we get a “total” of 5 documents. This is correct, because we have inserted a total of 5 documents into the “timesheet” index and “projectmaster” type.
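As a rough Python equivalent of the Postman steps above, the sketch below builds the search request with the match_all query as its body. The helper name `build_search_request` is hypothetical, and the use of POST follows the walkthrough in this article.

```python
import json
import urllib.request

SEARCH_URL = "http://127.0.0.1:9200/timesheet/projectmaster/_search"


def build_search_request(query):
    """Wrap a query clause in {"query": ...} and build the search request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        SEARCH_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


match_all_request = build_search_request({"match_all": {}})
print(match_all_request.full_url)
print(match_all_request.data.decode("utf-8"))
```

The same helper can carry any other query clause, since only the dictionary passed in changes.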
Below are the response fields in detail:
- took – time in milliseconds for Elasticsearch to execute the search
- timed_out – tells us if the search timed out or not
- _shards – tells us how many shards were searched, as well as a count of the successful/failed searched shards
- hits – search results
- hits.total – total number of documents matching our search criteria
- hits.hits – an actual array of search results (defaults to first 10 documents)
- hits.sort - sort key for results (missing if sorting by score)
- hits._score and max_score - ignore these fields for now
Referenced from - https://www.elastic.co/guide/en/elasticsearch/reference/current/_the_search_api.html
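To show how the fields listed above can be read in code, here is a small Python sketch that works on a sample response. The sample is an assumption: it has the documented shape, but the “took” and “_shards” values are made up and the hits array is trimmed to two of the five documents for brevity.

```python
# Assumed sample response, trimmed for brevity; not real server output.
sample_response = {
    "took": 3,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "failed": 0},
    "hits": {
        "total": 5,
        "max_score": 1,
        "hits": [
            {"_id": "B_K8nGMBcox2VuUwbivo",
             "_source": {"ProjectName": "Alok Industries Ltd"}},
            {"_id": "CPK8nGMBcox2VuUw5Ss2",
             "_source": {"ProjectName": "Ambuja Cement Ltd"}},
        ],
    },
}


def summarize(response):
    """Return the total hit count and the project names found in hits.hits."""
    hits = response["hits"]
    total = hits["total"]
    # Elasticsearch 7+ reports total as {"value": n, "relation": ...};
    # older versions return a plain integer.
    if isinstance(total, dict):
        total = total["value"]
    names = [hit["_source"]["ProjectName"] for hit in hits["hits"]]
    return total, names


total, names = summarize(sample_response)
print(total, names)
```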
What is a Shard?
Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. When you create an index, you can simply define the number of shards that you want. Each shard is in itself a fully-functional and independent "index" that can be hosted on any node in the cluster.
Sharding is important for two primary reasons:
- It allows you to horizontally split/scale your content volume
- It allows you to distribute and parallelize operations across shards (potentially on multiple nodes) thus increasing performance/throughput
Referenced from - https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html
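As an illustration of defining the number of shards when an index is created, the Python sketch below builds a create-index request with a `number_of_shards` setting. The index name “timesheet_demo” and the value 3 are assumptions chosen for the example, and the helper name is hypothetical; the request is only built, not sent.

```python
import json
import urllib.request

SERVER = "http://127.0.0.1:9200"


def build_create_index_request(index_name, number_of_shards):
    """Build a PUT request that creates an index with the given shard count."""
    settings = {"settings": {"index": {"number_of_shards": number_of_shards}}}
    return urllib.request.Request(
        f"{SERVER}/{index_name}",
        data=json.dumps(settings).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


index_request = build_create_index_request("timesheet_demo", 3)
print(index_request.get_method(), index_request.full_url)
print(index_request.data.decode("utf-8"))
```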
If we expand “hits”, we see the data which we have posted.
Now that we have written a query to get all the data, let’s write a query to get specific data.
Note
What is Query DSL?
The query DSL is a flexible, expressive search language that Elasticsearch uses to expose most of the power of Lucene through a simple JSON interface. It is what you should be using to write your queries in production. It makes your queries more flexible, more precise, easier to read, and easier to debug.
Referenced from - https://www.elastic.co/guide/en/elasticsearch/guide/current/query-dsl-intro.html
Get Specific Data from Elasticsearch (match Query)
In this part, we are going to search the data by project name; to do that, we need to write a different query.
If you want to query for full text or an exact value in almost any field, you can use the match query.
- To search data, we need to append “_search” to the URL.
URL - http://127.0.0.1:9200/timesheet/projectmaster/_search
- Setting the header
- Query for getting all documents that match the project name
A snippet of Request JSON
{
  "query": {
    "match": {
      "ProjectName": "Ambuja Cement Ltd"
    }
  }
}
After sending the request, it returns 4 documents. The match query splits the search text into individual terms and, by default, returns every document that contains at least one of them.
We have five project names, and four of them contain the word “Ltd”, so all four match.
If I remove “Ltd”, the query returns only one project name.
A snippet of Request JSON
{
  "query": {
    "match": {
      "ProjectName": "Ambuja Cement"
    }
  }
}
If we send the above query, we get only one project name as output.
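The difference between the two results can be reproduced with a toy model in Python. This is not how Elasticsearch is implemented: the tokenizer is a rough stand-in for its text analysis, scoring is ignored, and the function names are hypothetical. It only illustrates the "any term matches" behaviour on the five project names used in this article.

```python
import re

PROJECT_NAMES = [
    "ABC Bearings Ltd",
    "Alok Industries Ltd",
    "Ambuja Cement Ltd",
    "Ansa Pack (Simplex Group)",
    "Anil Bioplus Ltd (Anil Group Ahmedabad)",
]


def tokenize(text):
    """Rough stand-in for text analysis: lowercase and split into words."""
    return set(re.findall(r"\w+", text.lower()))


def toy_match(query_text, names):
    """Return the names that share at least one term with the query text."""
    query_terms = tokenize(query_text)
    return [name for name in names if query_terms & tokenize(name)]


with_ltd = toy_match("Ambuja Cement Ltd", PROJECT_NAMES)
without_ltd = toy_match("Ambuja Cement", PROJECT_NAMES)
print(len(with_ltd), len(without_ltd))  # 4 1
```

With “Ltd” in the query, every name containing “Ltd” shares a term with it; without “Ltd”, only “Ambuja Cement Ltd” does.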
There are many ways to query Elasticsearch data; the Query DSL documentation referenced above covers them.
Now that we have an idea of how to query data, let’s look at how to update a document.
Updating Document (Modifying Your Data)
To update a document, we need its document id. Every document has a document id (“_id”), which is auto-generated.
A snippet of Response JSON
{
  "_index": "timesheet",
  "_type": "projectmaster",
  "_id": "CPK8nGMBcox2VuUw5Ss2",
  "_score": 1,
  "_source": {
    "ProjectID": "3",
    "ProjectName": "Ambuja Cement Ltd",
    "NatureofIndustry": "Cement",
    "ProjectCode": "A003"
  }
}
These are the parameters which we are going to set for updating the document.
For the demo, we are going to change the project name “Ambuja Cement Ltd” to “Ambuja Cement”.
A snippet of Request JSON
{
  "doc": {
    "ProjectName": "Ambuja Cement"
  }
}
Now we have set all the parameters to update the data.
After updating the data, we get a response back.
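A possible Python sketch of the update request follows. The request URL is not spelled out in the text above, so the `/_update` suffix on the typed document URL is an assumption: it is the pattern used by Elasticsearch versions that still have types, while version 7 and later use `/{index}/_update/{id}` instead. The helper name `build_update_request` is hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9200/timesheet/projectmaster"


def build_update_request(doc_id, changes):
    """Build a partial-update request; only the fields in `changes` are sent."""
    body = json.dumps({"doc": changes}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/{doc_id}/_update",  # assumed URL pattern, see note above
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


update_request = build_update_request(
    "CPK8nGMBcox2VuUw5Ss2", {"ProjectName": "Ambuja Cement"}
)
print(update_request.get_method(), update_request.full_url)
print(update_request.data.decode("utf-8"))
```

Because the body is wrapped in “doc”, the other fields of the document (ProjectID, NatureofIndustry, ProjectCode) are left untouched.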
Now that we have updated the data, we run the match_all query again.
A snippet of Request JSON
{
  "query": {
    "match_all": {}
  }
}
In the response, we get the updated data.
Now we are going to work on deleting a document.
Delete Document (Deleting Your Data)
To delete a document, we need its document id. Every document has a document id (“_id”), which is auto-generated.
To delete a document, we set the HTTP request method to DELETE.
After that, we set the URL for the delete request, which contains the index, type, and document id.
URL - http://127.0.0.1:9200/timesheet/projectmaster/CPK8nGMBcox2VuUw5Ss2
After setting the parameters, send the request to delete the document.
Below is the response which we receive after sending the delete request.
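The delete request can be sketched in Python in the same way as the earlier requests. The helper name `build_delete_request` is hypothetical; the URL is the one given above, and the request is only built here, not sent.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:9200/timesheet/projectmaster"


def build_delete_request(doc_id):
    """Build a DELETE request for the document with the given id."""
    return urllib.request.Request(f"{BASE_URL}/{doc_id}", method="DELETE")


delete_request = build_delete_request("CPK8nGMBcox2VuUw5Ss2")
print(delete_request.get_method(), delete_request.full_url)
```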
With that, we have completed the CRUD operations with Elasticsearch.