What is ElasticSearch?
ElasticSearch is an open-source full-text search engine. It is a standalone database server written in Java and built on top of Apache Lucene. Because it communicates using JSON over HTTP, it can be used from programming languages other than Java as well. The first version of ElasticSearch was released in February 2010 by Shay Banon. ElasticSearch exposes a REST API for working with data through HTTP verbs: to retrieve information, we use a GET command; to create or update records, we use a PUT or POST command.
For the complete feature list, please visit https://www.elastic.co/products/elasticsearch.
We can download ElasticSearch from http://www.elasticsearch.org/download/. Please note that, as a prerequisite, a Java runtime must be installed and the JAVA_HOME environment variable must be set. To start the server, we run elasticsearch.bat, located in the bin folder of the ElasticSearch folder.
If a Java runtime is not installed or is not correctly configured, we will not see the expected startup output but instead a message saying “JAVA_HOME environment variable must be set!”. In that case, we need to download and install Java. Now open the elasticsearch.in.bat file in Notepad and remove the line highlighted in the following image.
Now, in the elasticsearch.bat file, change “%JAVA_HOME%” to your Java JDK path (which is C:\Program Files\Java\jdk1.7.0_67\jre\bin\java in my case). Once ElasticSearch is up and running, open a browser and type “http://localhost:9200/” in the address bar.
Now we can play with ElasticSearch using RESTClient. To do this, we first need to populate an index with some data. An index is a collection of documents that hold similar data. For example, we can have an index for Employee data. An index is identified by a name, which is used when performing index, search, update and delete operations against the documents in it.
Now let us look at some CRUD operations. We will populate an index called “employees”. Please note that ES supports only lowercase index names. We can use the PUT or POST method to create an index. If the index does not exist, it is created. Similarly, if a document does not exist in the index, it is inserted, and if it exists, it is updated. We use http://localhost:9200/<_index>/<_type>/[<id>] to create or update an index and a document. <_index> and <_type> are required, whereas the id part is optional; when it is omitted, the request must be a POST, and ES generates an id for the document.
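The URL pattern above can be sketched in Python as follows. This is only an illustration using the standard library: the helper name build_index_request and the BASE_URL constant are made up for this example, and using the EmployeeId as the document id is an assumption. The request is built but not sent, so no running server is needed. The URL layout with a type segment is the one described in this article; newer ElasticSearch releases have dropped the type segment.

```python
import json
import urllib.request

# Assumption: a local node at the address used in this article.
BASE_URL = "http://localhost:9200"


def build_index_request(index, doc_type, document, doc_id=None):
    """Build (but do not send) the HTTP request that indexes one document."""
    if index != index.lower():
        raise ValueError("index names must be lowercase")
    url = f"{BASE_URL}/{index}/{doc_type}"
    if doc_id is None:
        method = "POST"  # no id in the URL: the server generates one
    else:
        url = f"{url}/{doc_id}"
        method = "PUT"  # explicit id: insert the document or update it
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )


# One of the employee records from this article; the id 2 is an assumption.
employee = {
    "EmployeeId": "2",
    "FirstName": "Karunesh",
    "LastName": "Singh",
    "Department": "PMO",
    "Salary": "50000",
}
request = build_index_request("employees", "employee", employee, doc_id=2)
print(request.get_method(), request.full_url)
# PUT http://localhost:9200/employees/employee/2
```

To actually send the request to a running server, it can be passed to urllib.request.urlopen(request).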
Here we have created the index “employees” and passed JSON data for one employee. The response shows the version of the document and “created: true”. If we change any data within the JSON and post it to the same URL, the version is incremented and “created” becomes false.
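As a rough sketch of how a client might read that response, the helper below distinguishes a create from an update. The function name is hypothetical, and the two response dictionaries are hand-written illustrations shaped like the response described above (with the fields _id, _version and created), not output captured from a server. Other ElasticSearch versions may report the outcome differently.

```python
def describe_index_response(response):
    """Summarise whether an index call created or updated a document."""
    action = "created" if response["created"] else "updated"
    return f"{action} document {response['_id']} (version {response['_version']})"


# Illustrative responses only: the first call creates the document,
# a second call to the same URL updates it and bumps the version.
first_call = {"_index": "employees", "_type": "employee",
              "_id": "1", "_version": 1, "created": True}
second_call = {"_index": "employees", "_type": "employee",
               "_id": "1", "_version": 2, "created": False}

print(describe_index_response(first_call))
print(describe_index_response(second_call))
```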
I have inserted the following employee documents in the same way.
{
  "EmployeeId": "2",
  "FirstName": "Karunesh",
  "LastName": "Singh",
  "Department": "PMO",
  "Salary": "50000"
}
{
  "EmployeeId": "3",
  "FirstName": "Swati",
  "LastName": "Kumari",
  "Department": "HR",
  "Salary": "15000"
}
{
  "EmployeeId": "4",
  "FirstName": "Rahul",
  "LastName": "Dalal",
  "Department": "MST",
  "Salary": "30000"
}
{
  "EmployeeId": "5",
  "FirstName": "Rajib",
  "LastName": "Ganguly",
  "Department": "MST",
  "Salary": "20000"
}
{
  "EmployeeId": "6",
  "FirstName": "Amit",
  "LastName": "Mohod",
  "Department": "CTS",
  "Salary": "22000"
}
We can read a document from an index with a simple GET request if we know its ID, for example GET http://localhost:9200/employees/employee/<id>.
If we need to remove a single document from the index by ID, we use the same URL with the HTTP method DELETE.
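The read and delete calls share one URL and differ only in the HTTP method, which the following minimal Python sketch makes explicit. The helper names and BASE_URL are assumptions made for this example, and the document id 6 is only illustrative. The requests are built but not sent.

```python
import urllib.request

# Assumption: a local node at the address used in this article.
BASE_URL = "http://localhost:9200"


def document_url(index, doc_type, doc_id):
    """Return the URL that identifies a single document."""
    return f"{BASE_URL}/{index}/{doc_type}/{doc_id}"


def build_get_request(index, doc_type, doc_id):
    """Build the request that reads one document by id."""
    return urllib.request.Request(document_url(index, doc_type, doc_id), method="GET")


def build_delete_request(index, doc_type, doc_id):
    """Build the request that removes one document by id."""
    return urllib.request.Request(document_url(index, doc_type, doc_id), method="DELETE")


read = build_get_request("employees", "employee", 6)
remove = build_delete_request("employees", "employee", 6)
print(read.get_method(), read.full_url)
print(remove.get_method(), remove.full_url)
```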
Now let us see how to search documents with ES. For this, ES provides the “_search” endpoint. To get all the documents of a specific type within an index, we call http://localhost:9200/<_index>/<_type>/_search with the HTTP method GET. In our case, it is:
http://localhost:9200/employees/employee/_search. It will give us all the documents (by default, ES returns at most 10 hits per request, which is enough for our small data set).
To make a more useful search request, we apply a filter by supplying a request body that contains a query, sent with the HTTP method POST. The request body must be a JSON object. Suppose we need to find the documents containing the word “Amit”; then our JSON object should be:
{
  "query": {
    "query_string": { "query": "Amit" }
  }
}
This JSON uses ElasticSearch's query DSL.
If we want to search for “Amit” only in the first name, then we need to use:
{
  "query": {
    "query_string": { "query": "FirstName:Amit" }
  }
}
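The two request bodies above differ only in the query string, so a small helper can build both. The Python sketch below is one possible way to do it; the function names and BASE_URL are assumptions for this example, and the search request is built but not sent.

```python
import json
import urllib.request

# Assumption: a local node at the address used in this article.
BASE_URL = "http://localhost:9200"


def query_string_body(text, field=None):
    """Build a query_string request body, optionally limited to one field."""
    query = f"{field}:{text}" if field else text
    return {"query": {"query_string": {"query": query}}}


def build_search_request(index, doc_type, body=None):
    """Build a _search request: GET without a body, POST with one."""
    url = f"{BASE_URL}/{index}/{doc_type}/_search"
    if body is None:
        return urllib.request.Request(url, method="GET")
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Search every field, then only the FirstName field.
anywhere = query_string_body("Amit")
first_name_only = query_string_body("Amit", field="FirstName")
search = build_search_request("employees", "employee", first_name_only)
print(search.get_method(), search.full_url)
print(json.dumps(first_name_only))
```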
We can combine multiple filters or query strings to build more useful searches. I will try to explain these options in the next article.