Elasticsearch : Queries

So far we have looked at some basic concepts of Elasticsearch. Let's now look at the various ways we can search in Elasticsearch. For this I am going to delete my previous movie mapping and start fresh.


DELETE media
POST /media
{
  "mappings": {
    "movies": {
      "properties": {
        "name": {
          "type": "string"
        },
        "release_date": {
          "type": "date",
          "format": "yyyy-MM-dd"
        },
        "box_office": {
          "type": "integer"
        },
        "directors": {
          "type": "string"
        }
      }
    }
  }
}
POST media/movies/1
{
  "name": "The Avengers",
  "release_date": "2012-04-11",
  "box_office": 1519,
  "directors": "Joss Whedon"
}
POST media/movies/2
{
  "name": "The Dark Knight",
  "release_date": "2008-07-14",
  "box_office": 1005,
  "directors": "Christopher Nolan"
}
POST media/movies/3
{
  "name": "Inception",
  "release_date": "2010-07-08",
  "box_office": 825,
  "directors": "Christopher Nolan"
}

Above, we first deleted our previous index and then created a new one, in which box_office is an integer holding earnings in millions of dollars. We then inserted three documents. Let's look at the different ways to search:

Basic Queries Using Only the Query String

Basic queries can be run using only query string parameters in the URL. This is sometimes called ‘search lite’. Let's see an example:


GET media/movies/_search?q=directors:joss

[Image: es_basicSearch0]

Basic queries use the q query string parameter, which supports the Lucene query parser syntax and can therefore filter on specific fields (e.g. fieldname:value), use wildcards (e.g. abc*) and more. There are a variety of other options (e.g. size, from) that you can specify to customize the query and its results. Full details can be found in the Elasticsearch URI request docs.
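
If you build these URLs from a script, the query string needs to be URL-encoded. Here is a minimal Python sketch of that. The base URL (a local node on port 9200) and the helper name build_uri_search are my own assumptions for illustration, not part of Elasticsearch.

```python
from urllib.parse import urlencode

# Assumption: a local node listening on port 9200; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def build_uri_search(index, doc_type, query, size=None, offset=None):
    """Return a 'search lite' URL (hypothetical helper for illustration)."""
    params = {"q": query}
    if size is not None:
        params["size"] = size      # maximum number of hits to return
    if offset is not None:
        params["from"] = offset    # number of hits to skip (paging)
    # urlencode percent-escapes characters such as ':' and encodes spaces as '+'.
    return f"{BASE_URL}/{index}/{doc_type}/_search?{urlencode(params)}"


url = build_uri_search("media", "movies", "directors:joss", size=5, offset=0)
print(url)
# http://localhost:9200/media/movies/_search?q=directors%3Ajoss&size=5&from=0
```

The encoded colon (%3A) is decoded again on the server side, so the query is still read as directors:joss.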

Query DSL

DSL stands for Domain Specific Language: Elasticsearch accepts specially composed JSON snippets as queries. We will use the same URL, but instead of query string parameters we will send JSON in the request body. Consider the following query:


GET media/movies/_search
{
  "query": {
    "match": {
      "directors": "joss"
    }
  }
}
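
If you are calling Elasticsearch from a script instead of a console, the same request body can be assembled as a plain dictionary and serialized to JSON. Below is a rough Python sketch using only the standard library. The host and the helper names match_query and build_search_request are assumptions of mine, and nothing is actually sent here.

```python
import json
from urllib.request import Request


def match_query(field, text):
    """Build the body of a 'match' query on a single field."""
    return {"query": {"match": {field: text}}}


def build_search_request(base_url, index, doc_type, body):
    """Wrap a query body in a request object (hypothetical helper; sends nothing)."""
    return Request(
        url=f"{base_url}/{index}/{doc_type}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="GET",  # the search endpoint also accepts POST
    )


body = match_query("directors", "joss")
# Assumption: a local node on port 9200.
request = build_search_request("http://localhost:9200", "media", "movies", body)

print(request.get_method(), request.full_url)
print(json.dumps(body, indent=2))
```

Passing the request object to urllib.request.urlopen should send it to a running node.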

Running the above query gives the same result as the earlier query string search. Breaking the JSON down, we are telling Elasticsearch that we are performing a ‘match’ type of ‘query’ on the field ‘directors’, followed by the text to search for. Elasticsearch has many types of queries. Let's perform another one.

GET media/movies/_search
{
  "query": {
    "match": {
      "name": "dark the"
    }
  }
}

[Image: es_dsl0]

Notice the score in the above result. By default Elasticsearch orders results by score, with the highest score at the top. The first result scores higher because it contains both of our search terms.
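
To get a feel for why the ordering comes out this way, here is a toy model in Python. It is only a sketch: the score is simply the number of query terms found in the field, which is far simpler than Elasticsearch's real relevance scoring, and the function names are made up for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a crude stand-in for an analyzer)."""
    return text.lower().split()


def toy_match(docs, field, query):
    """Rank documents by how many distinct query terms appear in the field."""
    query_terms = set(tokenize(query))
    hits = []
    for doc in docs:
        field_terms = set(tokenize(doc[field]))
        score = len(query_terms & field_terms)
        if score > 0:  # a document matches if it contains at least one term
            hits.append((doc[field], score))
    # Highest score first, mirroring the default ordering described above.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


movies = [
    {"name": "The Avengers"},
    {"name": "The Dark Knight"},
    {"name": "Inception"},
]

print(toy_match(movies, "name", "dark the"))
# [('The Dark Knight', 2), ('The Avengers', 1)]
```

Both movies containing ‘the’ match, but only one of them also contains ‘dark’, so it lands on top.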

If you want to match the exact words while maintaining their sequence, ‘match_phrase’ is a good option. It's great for full text searches. Consider the following query:


GET media/movies/_search
{
  "query": {
    "match_phrase": {
      "name": "the dark"
    }
  }
}

[Image: es_dsl1]

As you can see, it fetched only one result rather than two like the previous query.
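
The difference from a plain match can be sketched in a few lines of Python. This is a simplified illustration of phrase matching (terms must appear next to each other and in order), not how Elasticsearch implements it, and contains_phrase is a hypothetical helper.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a crude stand-in for an analyzer)."""
    return text.lower().split()


def contains_phrase(field_value, phrase):
    """True if the phrase's terms occur consecutively and in the same order."""
    tokens = tokenize(field_value)
    wanted = tokenize(phrase)
    if not wanted:
        return False
    last_start = len(tokens) - len(wanted)
    # Slide a window of len(wanted) over the tokens and compare.
    return any(tokens[i:i + len(wanted)] == wanted for i in range(last_start + 1))


names = ["The Avengers", "The Dark Knight", "Inception"]

print([n for n in names if contains_phrase(n, "the dark")])
# ['The Dark Knight']
print([n for n in names if contains_phrase(n, "dark the")])
# []
```

Reversing the words gives no hits at all here, whereas the earlier match query on ‘dark the’ returned two documents.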

Query Filters

Filters allow you to reduce the results returned by Elasticsearch using logical conditions. Let's look at the following query:


GET media/movies/_search
{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "release_date": {
            "gt": "2010-01-01"
          }
        }
      },
      "query": {
        "match": {
          "directors": "nolan"
        }
      }
    }
  }
}

[Image: es_dsl2]

In the above query we searched for the term ‘nolan’ (which is present in 2 documents) and filtered the results to those whose release_date is later than 2010-01-01, so we got only one result. Here we used the ‘range’ filter; there are many other filters as well.
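
The logic of that query can be mimicked on our three documents with a small Python sketch. It is only a mental model of "query plus filter" under my own simplifications (whitespace tokenizing, in-memory list), and the helper names are made up.

```python
from datetime import date

movies = [
    {"name": "The Avengers", "release_date": date(2012, 4, 11), "directors": "Joss Whedon"},
    {"name": "The Dark Knight", "release_date": date(2008, 7, 14), "directors": "Christopher Nolan"},
    {"name": "Inception", "release_date": date(2010, 7, 8), "directors": "Christopher Nolan"},
]


def released_after(doc, cutoff):
    """Plays the role of the 'range' filter with 'gt': a plain yes/no test."""
    return doc["release_date"] > cutoff


def matches_director(doc, term):
    """Plays the role of the 'match' query on the directors field."""
    return term.lower() in doc["directors"].lower().split()


def filtered_search(docs, term, cutoff):
    """Keep documents that pass the filter and match the query."""
    return [d["name"] for d in docs if released_after(d, cutoff) and matches_director(d, term)]


print(filtered_search(movies, "nolan", date(2010, 1, 1)))
# ['Inception']
```

The Dark Knight matches ‘nolan’ but is dropped by the date condition, and The Avengers passes the date condition but does not match the query.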

Highlighting

Highlighting lets you mark the matching text in one or more fields of the search results. Let's see this with an example.


GET media/movies/_search
{
  "query": {
    "match": {
      "directors": "nolan"
    }
  },
  "highlight": {
    "pre_tags": ["<strong>"],
    "post_tags": ["</strong>"],
    "fields": {
      "directors": {}
    }
  }
}

[Image: es_dsl_highlight]

By default, highlighting wraps the matched text in <em> and </em>. This can be changed by setting pre_tags and post_tags, as done above with <strong>. You can then style the tagged text with CSS in your HTML.
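
What the tags do to a field value can be illustrated with a short Python sketch. This only imitates the visible effect with a regular expression. The real highlighter works on analyzed terms, and the highlight function here is a hypothetical stand-in.

```python
import re


def highlight(text, term, pre_tag="<em>", post_tag="</em>"):
    """Wrap whole-word, case-insensitive occurrences of term in the given tags."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    # Keep the original casing of the matched text inside the tags.
    return pattern.sub(lambda m: pre_tag + m.group(0) + post_tag, text)


print(highlight("Christopher Nolan", "nolan"))
# Christopher <em>Nolan</em>
print(highlight("Christopher Nolan", "nolan", "<strong>", "</strong>"))
# Christopher <strong>Nolan</strong>
```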

Aggregation

Elasticsearch comes with a lot of built-in analytics functions. In SQL terms, an aggregation is somewhat like GROUP BY, but far more powerful. Let's see an example:


GET media/movies/_search
{
  "query": {
    "match": {
      "directors": "nolan"
    }
  },
  "aggs": {
    "Box Office Earnings in Million $": {
      "avg": {
        "field": "box_office"
      }
    }
  }
}

[Image: es_dsl_aggregation]

In this query we used an aggregation to compute the average of the box_office field over the documents whose directors field contains the term ‘nolan’. For our data that is (1005 + 825) / 2 = 915.
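
As a sanity check, the same average can be worked out by hand in Python over our three documents. This is just the arithmetic behind the ‘avg’ aggregation on the matching documents, with a made-up helper name, not a client call.

```python
movies = [
    {"name": "The Avengers", "box_office": 1519, "directors": "Joss Whedon"},
    {"name": "The Dark Knight", "box_office": 1005, "directors": "Christopher Nolan"},
    {"name": "Inception", "box_office": 825, "directors": "Christopher Nolan"},
]


def average_box_office(docs, director_term):
    """Average box_office over documents whose directors field contains the term."""
    values = [
        d["box_office"]
        for d in docs
        if director_term.lower() in d["directors"].lower().split()
    ]
    # No matching documents means there is nothing to average.
    return sum(values) / len(values) if values else None


print(average_box_office(movies, "nolan"))
# 915.0
```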

That's it for now, but note that these are just some of the query types available. There are a lot more. Please go through the official Elasticsearch docs to learn about them.
