Search as you type
Using match_phrase_prefix
The match_phrase_prefix query matches a phrase in which the last term is treated as a prefix, which allows partial matching on multi-word phrases. It is similar to match_phrase, except that the final term only needs to match the beginning of a word.
Consider an index of films. We can use match_phrase_prefix to match partial phrases in the title.
Store the Elasticsearch endpoint and credentials in variables:
ELASTIC_ENDPOINT="https://your-elasticsearch-endpoint"
ELASTIC_USER="your-username"
ELASTIC_PW="your-password"
Run the query against the movies index:
curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
  -H 'Content-Type: application/json' \
  -XGET "$ELASTIC_ENDPOINT:9200/movies/_search?pretty" -d '
{
  "query": {
    "match_phrase_prefix": {
      "title": {
        "query": "the terminator",
        "slop": 10
      }
    }
  }
}' | jq
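To make the matching rule concrete, the following sketch imitates it in plain bash. It is only an illustration under simplifying assumptions: the function name phrase_prefix_match is hypothetical, the text is lowercased and split on whitespace instead of being run through a real analyzer, and slop is ignored, so the terms must appear consecutively and in order.

```shell
#!/usr/bin/env bash
# Illustration only: a simplified imitation of match_phrase_prefix with slop 0.
# phrase_prefix_match is a hypothetical helper, not an Elasticsearch API.

# Succeeds (exit status 0) when the query terms appear consecutively in the
# title and the last query term is a prefix of the word at that position.
phrase_prefix_match() {
  local -a title query
  local i j n m
  # Lowercase and split on whitespace, a crude stand-in for a real analyzer.
  read -ra title <<< "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  read -ra query <<< "$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
  n=${#query[@]}
  m=${#title[@]}
  (( n > 0 )) || return 1
  for (( i = 0; i + n <= m; i++ )); do
    # Every term except the last must match exactly.
    for (( j = 0; j < n - 1; j++ )); do
      [[ ${title[i+j]} == "${query[j]}" ]] || continue 2
    done
    # The last term only has to match the beginning of the word.
    [[ ${title[i+n-1]} == "${query[n-1]}"* ]] && return 0
  done
  return 1
}

phrase_prefix_match "The Terminator" "the term" && echo "match"      # prints: match
phrase_prefix_match "The Terminator" "term the" || echo "no match"   # prints: no match
```

The second call fails only because this sketch ignores slop. In the real query, a nonzero slop relaxes how strictly the term positions must line up.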
Search-as-you-type
Search-as-you-type enables real-time, incremental searches as users type. It is optimized for autocomplete and prefix search functionality.
- Fast, partial matching during user input.
- Optimized using special field types such as search_as_you_type in mappings.
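A search_as_you_type field is indexed into additional subfields, including ._2gram and ._3gram, which hold two-word and three-word shingles of the text. The sketch below gives a rough idea of what such shingles look like. The shingles function is a hypothetical helper written for this illustration, and it lowercases and splits on whitespace rather than running a real analyzer.

```shell
#!/usr/bin/env bash
# Illustration only: shingles is a hypothetical helper, not an Elasticsearch API.

# Print every run of N consecutive words (an N-word shingle) in a phrase.
# $1 = shingle size, $2 = phrase
shingles() {
  local n=$1 i
  local -a words
  # Lowercase and split on whitespace, a crude stand-in for a real analyzer.
  read -ra words <<< "$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')"
  for (( i = 0; i + n <= ${#words[@]}; i++ )); do
    echo "${words[@]:i:n}"
  done
}

shingles 2 "Harry Potter and the Goblet"
# Prints:
# harry potter
# potter and
# and the
# the goblet
```

Indexing word pairs and triples ahead of time is what lets a query match a partially typed multi-word phrase without scanning the full text.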
Lab: Autocomplete
In this example, we will test the autocomplete search functionality in Elasticsearch.
- Install the jq utility to handle JSON output formatting.
  sudo apt-get install jq
- Download the movies.json dataset, which contains the movie data that we will import into Elasticsearch.
- Create the movies index. As written, the request sends no body, so no explicit mapping is defined and Elasticsearch maps the fields dynamically when the data is imported.
  curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
    -H 'Content-Type: application/json' \
    -XPUT "$ELASTIC_ENDPOINT:9200/movies"
- Import the data into Elasticsearch.
  curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
    -H 'Content-Type: application/json' \
    -XPUT "$ELASTIC_ENDPOINT:9200/_bulk?pretty" \
    --data-binary @movies.json | jq
- Analyze text with the standard tokenizer and an edge_ngram token filter.
  curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
    -H 'Content-Type: application/json' \
    -XPOST "$ELASTIC_ENDPOINT:9200/movies/_analyze?pretty" \
    -d '{
      "tokenizer": "standard",
      "filter": [{"type": "edge_ngram", "min_gram": 1, "max_gram": 5}],
      "text": "Harry"
    }' | jq
  This command splits the text with the standard tokenizer and then applies an edge_ngram filter, which emits the leading 1 to 5 characters of each token. Edge n-grams are useful for autocomplete because they allow partial token matches: a user can type a few characters and see suggestions immediately.
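As a rough picture of the tokens that the analyze request in this step should return, the sketch below generates edge n-grams for a single token in plain bash. The edge_ngrams function is a hypothetical helper written for this illustration. It mirrors the min_gram and max_gram settings used above but does not call Elasticsearch.

```shell
#!/usr/bin/env bash
# Illustration only: edge_ngrams is a hypothetical helper that mimics what the
# edge_ngram token filter emits for a single token.

# Print the prefixes of a token whose lengths run from min_gram to max_gram.
# $1 = token, $2 = min_gram, $3 = max_gram
edge_ngrams() {
  local token=$1 min=$2 max=$3 len
  # A token shorter than max_gram only yields prefixes up to its own length.
  (( max > ${#token} )) && max=${#token}
  for (( len = min; len <= max; len++ )); do
    echo "${token:0:len}"
  done
}

edge_ngrams "Harry" 1 5
# Prints:
# H
# Ha
# Har
# Harr
# Harry
```

Because every prefix is stored as its own token, a lookup for "Har" becomes an exact term match rather than a scan over all terms.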
- Create the autocomplete index with search_as_you_type mappings for the title and genre fields.
  curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
    -H 'Content-Type: application/json' \
    -XPUT "$ELASTIC_ENDPOINT:9200/autocomplete" \
    -d '{
      "mappings": {
        "properties": {
          "title": {
            "type": "search_as_you_type"
          },
          "genre": {
            "type": "search_as_you_type"
          }
        }
      }
    }' | jq
  The search_as_you_type field type is optimized for autocomplete. It indexes a text field in a form suited to prefix matching, so Elasticsearch can return results while the user is still typing.
  Output:
  {
    "acknowledged": true,
    "shards_acknowledged": true,
    "index": "autocomplete"
  }
- Reindex the data from movies to autocomplete.
  curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
    -H 'Content-Type: application/json' \
    -XPOST "$ELASTIC_ENDPOINT:9200/_reindex?pretty" -d '
  {
    "source": {
      "index": "movies"
    },
    "dest": {
      "index": "autocomplete"
    }
  }' | grep "total\|created\|failures"
  Reindexing copies the documents into the autocomplete index, whose mappings are optimized for search-as-you-type.
  The command should return:
  "total" : 50,
  "created" : 50,
  "failures" : [ ]
- Check the mappings of the autocomplete index.
  curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
    -H 'Content-Type: application/json' \
    -XGET "$ELASTIC_ENDPOINT:9200/autocomplete/_mapping?pretty=true" | jq
  The mappings confirm that the title and genre fields are set to search_as_you_type:
  {
    "autocomplete": {
      "mappings": {
        "properties": {
          "genre": {
            "type": "search_as_you_type",
            "doc_values": false,
            "max_shingle_size": 3
          },
          "id": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "title": {
            "type": "search_as_you_type",
            "doc_values": false,
            "max_shingle_size": 3
          },
          "year": {
            "type": "long"
          }
        }
      }
    }
  }
- Perform a search using the multi_match query.
  curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
    -H 'Content-Type: application/json' \
    -XGET "$ELASTIC_ENDPOINT:9200/autocomplete/_search?pretty" -d '
  {
    "size": 5,
    "query": {
      "multi_match": {
        "query": "Harry",
        "type": "bool_prefix",
        "fields": [
          "title",
          "title._2gram",
          "title._3gram"
        ]
      }
    }
  }'
  This searches the title field and its shingle subfields for "Harry" using the bool_prefix type. Only the _2gram and _3gram subfields are listed because the mapping above has a max_shingle_size of 3, so no _4gram or _5gram subfields exist. bool_prefix treats the last term of the input as a prefix and any earlier terms as ordinary matches, which works well for autocomplete because a partially typed word still matches.
- Set up an interactive autocomplete search. Initialize the INPUT variable to hold user input:
  INPUT=''
- Next, set up an infinite loop to simulate real-time autocomplete searches as you type:
  while true
  do
    # Read a single keypress without echoing it
    IFS= read -rsn1 char
    INPUT=$INPUT$char
    echo "$INPUT"
    curl -s -u "$ELASTIC_USER:$ELASTIC_PW" \
      -H 'Content-Type: application/json' \
      -XGET "$ELASTIC_ENDPOINT:9200/autocomplete/_search" \
      -d '{
        "size": 5,
        "query": {
          "multi_match": {
            "query": "'"$INPUT"'",
            "type": "bool_prefix",
            "fields": [
              "title",
              "title._2gram",
              "title._3gram"
            ]
          }
        }
      }' | jq '.hits.hits[]._source.title' | grep -i "$INPUT"
  done
  This loop captures each character typed by the user, appends it to the INPUT variable, and sends a search query to Elasticsearch. The results are updated in real time based on the input. Press Ctrl+C to exit the loop.
- Begin typing a film title. Keep in mind that the earlier steps use "Harry" as the example text, so they only surface the "Harry Potter" films. To test other films, change the text in steps 5 and 9 (the _analyze request and the multi_match search).
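The interactive loop appends every keypress to INPUT and pastes it straight into the JSON body, so a backspace or a double quote in the input would produce a malformed request. One possible way to harden it is sketched below. The helper names apply_key, json_escape, and bool_prefix_body are hypothetical, the sketch assumes the terminal sends DEL (0x7f) or BS (0x08) for backspace, and it escapes only backslashes and double quotes.

```shell
#!/usr/bin/env bash
# Illustration only: hypothetical helpers for the interactive loop.

# Update the typed text for one keypress: backspace removes the last
# character, anything else is appended.
apply_key() {
  local current=$1 key=$2
  if [[ $key == $'\x7f' || $key == $'\b' ]]; then
    printf '%s' "${current%?}"
  else
    printf '%s%s' "$current" "$key"
  fi
}

# Escape backslashes and double quotes so the text is safe inside a JSON
# string. Control characters are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Print a bool_prefix multi_match request body for the typed text.
bool_prefix_body() {
  local text
  text=$(json_escape "$1")
  printf '{"size":5,"query":{"multi_match":{"query":"%s","type":"bool_prefix","fields":["title","title._2gram","title._3gram"]}}}\n' "$text"
}

# Simulate typing H, a, x, backspace, r.
INPUT=''
for key in H a x $'\x7f' r; do
  INPUT=$(apply_key "$INPUT" "$key")
done
bool_prefix_body "$INPUT"
# Prints:
# {"size":5,"query":{"multi_match":{"query":"Har","type":"bool_prefix","fields":["title","title._2gram","title._3gram"]}}}
```

Inside the loop, INPUT=$(apply_key "$INPUT" "$char") could replace the plain append, and the request body could be passed to curl as -d "$(bool_prefix_body "$INPUT")".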