ElasticSearch – Type – Mapping

Background

In ElasticSearch it is very easy to create new types (the counterpart of tables in traditional databases) and their corresponding columns.

ElasticSearch

If we do not create mappings beforehand, ElasticSearch will create the type, columns, and column types for us.

Schema-Less

The ElasticSearch exercises we have tackled thus far rely on the ability of ElasticSearch to create columns upon first use, that is, the first time a record is added.

biblekjv

Here is the schema created for our biblekjv example.

Image

biblekjv_mapping_20180806_1024PM_01

Textual

  1. Columns
    • book
      • type is text
      • And, type is keyword
    • bookID
      • type is long
    • chapter
      • type is long
    • chapterID
      • type is long
    • passage
      • type is text
      • And, type is keyword

Explanation

  1. Numeric entries
    • Are codified as long
    • The largest integer datatype (64-bit signed)
  2. Strings
    • Represented as both text and keyword
    • Datatype
      • Text
        • Tokenized (analyzed) for full-text search
      • Keyword
        • Exact matches, sorting, and aggregations
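
The defaults described above can be imitated in a few lines of Python. This is only a sketch of the rules for whole numbers and strings, not ElasticSearch's own implementation. The function name guess_mapping and the sample record values are made up for illustration. The ignore_above setting of 256 is what ElasticSearch 5.x and 6.x add to the keyword sub-field by default.

```python
def guess_mapping(record):
    """Imitate the dynamic-mapping defaults for whole numbers and strings."""
    properties = {}
    for field, value in record.items():
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, int) and not isinstance(value, bool):
            properties[field] = {"type": "long"}
        elif isinstance(value, str):
            # strings get a full-text field plus an exact-match sub-field
            properties[field] = {
                "type": "text",
                "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
            }
        else:
            raise TypeError("this sketch only covers whole numbers and strings")
    return properties


# Hypothetical record shaped like the biblekjv example
sample = {
    "book": "Genesis",
    "bookID": 1,
    "chapter": 1,
    "chapterID": 1,
    "passage": "In the beginning",
}

mapping = guess_mapping(sample)
print(mapping["bookID"])
print(mapping["book"])
```

The keyword sub-field is addressed as book.keyword in queries, which is why a single string column shows up with two types in the mapping.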

Schema

Definition

Definition – ElasticSearch

Link

Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. For instance, use mappings to define:

  • which string fields should be treated as full text fields.
  • which fields contain numbers, dates, or geolocations.
  • whether the values of all fields in the document should be indexed into the catch-all _all field.
  • the format of date values.
  • custom rules to control the mapping for dynamically added fields.

Sample – event

Goal

In this post we will create a schema to define an event.

Sample

Sample – Periscope

Single Event Tables and Common Analysis Queries
Link

event_periscope_20180807_0238PM

Code

Create Type
Outline
  1. We will have two files, the JSON file and the command file
  2. The JSON file
    • Mapping
      • Top – mappings
      • Type :- _doc
      • Properties
        • Property
          • userID :- long
          • loginFirst :- date
          • loginLast :- date
          • firstSpendAt :- date
          • totalSpend :- double
          • dob :- date
          • gender :- keyword (m/f/etc.) {gender inclusive}
          • platform :- keyword (ios/android)
json

{
    "mappings": {
        "_doc": {
            "properties": {
                "userID":       { "type": "long" },
                "loginFirst":   { "type": "date" },
                "loginLast":    { "type": "date" },
                "firstSpendAt": { "type": "date" },
                "totalSpend":   { "type": "double" },
                "dob":          { "type": "date" },
                "gender":       { "type": "keyword" },
                "platform":     { "type": "keyword" }
            }
        }
    }
}
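
As an alternative to typing the mapping by hand, the same document can be generated from a small field table. The sketch below is in Python. The names FIELD_TYPES and build_mapping are ours and not part of any ElasticSearch client. The field names and types are the ones listed in the outline above.

```python
import json

# Field name -> ElasticSearch datatype, taken from the outline
FIELD_TYPES = {
    "userID": "long",
    "loginFirst": "date",
    "loginLast": "date",
    "firstSpendAt": "date",
    "totalSpend": "double",
    "dob": "date",
    "gender": "keyword",
    "platform": "keyword",
}


def build_mapping(field_types, type_name="_doc"):
    """Wrap a field table in the mappings / type / properties envelope."""
    properties = {name: {"type": es_type} for name, es_type in field_types.items()}
    return {"mappings": {type_name: {"properties": properties}}}


mapping_json = json.dumps(build_mapping(FIELD_TYPES), indent=4)
print(mapping_json)
```

The output is strict JSON, so it can be checked with any JSON parser before it is saved as the mapping file.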

Batch file

setlocal

set "_binfolder=C:\Downloads\Curl\v7.61.0\Windows\extract\curl-7.61.0-win64-mingw\bin"
set "_binapp=curl.exe"
set "_bin=%_binfolder%\%_binapp%"

set "_url=http://localhost:9200"

set "_esindex=/event"
set "_estype="

set "_espath=%_esindex%%_estype%"

set "_contentType=Content-Type: application/json"

Rem Index - Delete

%_bin% -XDELETE -i %_url%%_espath% 

Rem Index / Mapping - Create

set "_jsonfile=jsonfile\event_mapping.json"

set "_jsonfileFQDN=%cd%\%_jsonfile%"

%_bin% -i  -XPUT %_url%%_espath%  -H "%_contentType%" --data-binary @%_jsonfileFQDN%

endlocal

Output
Output – Delete

type_create_delete_20180807_0253PM.png

Output – Create

typeMappingCreate_20180807_0253PM
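
For readers who are not on Windows, the two calls made by the batch file (delete the index, then create it with the mapping) can be sketched with Python's standard library. The helper names are ours, the mapping is shortened to a single field to keep the listing brief, and the host and index name are the ones used in the batch file. The sketch only builds the requests; the commented lines show how they might be sent to a running node.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"
INDEX = "event"


def delete_index_request(base_url, index):
    """Build the equivalent of: curl -XDELETE {url}/{index}"""
    return urllib.request.Request(f"{base_url}/{index}", method="DELETE")


def create_index_request(base_url, index, mapping):
    """Build the equivalent of: curl -XPUT {url}/{index} with a JSON body"""
    body = json.dumps(mapping).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{index}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Shortened mapping, for illustration only
mapping = {"mappings": {"_doc": {"properties": {"userID": {"type": "long"}}}}}

delete_req = delete_index_request(BASE_URL, INDEX)
create_req = create_index_request(BASE_URL, INDEX, mapping)

print(delete_req.get_method(), delete_req.full_url)
print(create_req.get_method(), create_req.full_url)

# To send a request to a running node (not done here):
#     with urllib.request.urlopen(create_req) as response:
#         print(response.status, response.read().decode("utf-8"))
```

Note that urlopen raises urllib.error.HTTPError on a 4xx reply, so deleting an index that does not exist yet would need a try/except around the call.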

Display Type
Outline
  1. We will have a lone file, the command file
  2. Command file
  3. Payload
Batch file

setlocal

set "_binfolder=C:\Downloads\Curl\v7.61.0\Windows\extract\curl-7.61.0-win64-mingw\bin"
set "_binapp=curl.exe"
set "_bin=%_binfolder%\%_binapp%"

set "_url=http://localhost:9200"

set "_esindex=/event"
set "_estype="
set "_esoperator=/_mapping"

set "_contentType=Content-Type: application/json"

Rem Index / Mapping - List

set "_espath=%_esindex%%_estype%%_esoperator%?pretty"

%_bin% -i  %_url%%_espath% -H "%_contentType%"

endlocal

Output
Output – Browse

typeMappingBrowse_20180807_0320PM
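
The mapping that comes back is nested under the index name, then mappings, then the type name. A short Python sketch can flatten it into a field / datatype listing. The function name field_types is ours, the nesting is assumed to be the 6.x layout, and sample_response is a hand-written, abbreviated stand-in rather than the actual output shown in the screenshot.

```python
def field_types(mapping_response, index="event", type_name="_doc"):
    """Flatten a get-mapping response into {field: datatype}."""
    properties = mapping_response[index]["mappings"][type_name]["properties"]
    # object fields carry no explicit "type" key, so fall back to "object"
    return {name: spec.get("type", "object") for name, spec in properties.items()}


# Abbreviated, hand-written response in the assumed 6.x layout
sample_response = {
    "event": {
        "mappings": {
            "_doc": {
                "properties": {
                    "userID": {"type": "long"},
                    "loginFirst": {"type": "date"},
                    "gender": {"type": "keyword"},
                }
            }
        }
    }
}

types = field_types(sample_response)
for name, es_type in sorted(types.items()):
    print(f"{name}: {es_type}")
```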

Load Data
Outline
    1. There are two files, the JSON file and the command file
    2. The JSON file
      • Two lines for each record
        • The first line contains the action and the _id
        • The second line contains the actual data
          • Property/Value pairs
    3. Command file
      • Syntax
        • curl.exe -i -H "Content-Type: application/x-ndjson" http://{server}:{port}/{index}/{type}/_bulk?pretty --data-binary @{filename}
      • Sample
JSON

{"index":{"_id":1}}
{"userID":137314,"loginFirst":"2015-04-01","loginLast":"2015-04-05","firstSpendAt":"2015-04-03","totalSpend":50,"dob":"1994-01-01","gender":"f","platform":"ios"}
{"index":{"_id":2}}
{"userID":137312,"loginFirst":"2015-04-01","loginLast":"2015-04-02","firstSpendAt":"2015-04-02","totalSpend":2500,"dob":"1987-01-01","gender":"m","platform":"android"}
{"index":{"_id":3}}
{"userID":137310,"loginFirst":"2015-04-01","loginLast":"2015-04-01","dob":"2000-01-01","gender":"m","platform":"ios"}
{"index":{"_id":4}}
{"userID":137311,"loginFirst":"2015-04-01","loginLast":"2015-04-02","dob":"1995-01-01","gender":"f","platform":"ios"}
{"index":{"_id":5}}
{"userID":137313,"loginFirst":"2015-04-01","loginLast":"2015-04-04","dob":"1976-01-01","gender":"m","platform":"android"}

Command

setlocal

set "_binfolder=C:\Downloads\Curl\v7.61.0\Windows\extract\curl-7.61.0-win64-mingw\bin"
set "_binapp=curl.exe"
set "_bin=%_binfolder%\%_binapp%"

set "_url=http://localhost:9200"

set "_esindex=event"
set "_estype=_doc"
set "_esoperation=_bulk"

set "_espath=/%_esindex%/%_estype%/%_esoperation%"

set "_jsonfile=jsonldfile\event.json"
set "_jsonfileFQDN=%cd%\%_jsonfile%"

set "_contentType=Content-Type: application/x-ndjson"

%_bin% -i -H "%_contentType%"  %_url%%_espath%?pretty --data-binary @%_jsonfileFQDN%

endlocal
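
The two-lines-per-record layout of the bulk file is easy to get wrong by hand, so here is a hedged Python sketch that produces it from a list of records. The function name bulk_body is ours, the ids are simply assigned in order starting at 1, and only two of the five records are repeated here. The bulk API expects every line, including the last one, to end with a newline.

```python
import json


def bulk_body(records):
    """Render records as NDJSON: an action line, then the document itself."""
    lines = []
    for doc_id, record in enumerate(records, start=1):
        lines.append(json.dumps({"index": {"_id": doc_id}}, separators=(",", ":")))
        lines.append(json.dumps(record, separators=(",", ":")))
    # the bulk payload must be terminated by a newline
    return "\n".join(lines) + "\n"


records = [
    {"userID": 137314, "loginFirst": "2015-04-01", "loginLast": "2015-04-05",
     "firstSpendAt": "2015-04-03", "totalSpend": 50, "dob": "1994-01-01",
     "gender": "f", "platform": "ios"},
    {"userID": 137310, "loginFirst": "2015-04-01", "loginLast": "2015-04-01",
     "dob": "2000-01-01", "gender": "m", "platform": "ios"},
]

body = bulk_body(records)
print(body, end="")
```

Fields that a record does not have, such as firstSpendAt for a user who never spent, are simply left out of the document line.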
Browse Data
Outline
  1. There are two files, the JSON file and the command file
  2. Payload
    • Syntax :-
      • curl.exe -i -H "Content-Type: application/json" http://{server}:{port}/{index}/{type}/_search?pretty --data-binary @{filename}
    • Sample :-
JSON

{
    "sort": {
        "_id": { "order": "asc" }
    }
}

Command

setlocal

set "_binfolder=C:\Downloads\Curl\v7.61.0\Windows\extract\curl-7.61.0-win64-mingw\bin"
set "_binapp=curl.exe"
set "_bin=%_binfolder%\%_binapp%"

set "_url=http://localhost:9200"

set "_esindex=event"
set "_estype=_doc"
set "_esoperation=_search"
set "_espath=/%_esindex%/%_estype%/%_esoperation%"

set "_jsonfile=jsonfile\event_data_browse.json"
set "_jsonfileFQDN=%cd%\%_jsonfile%"

set "_contentType=Content-Type: application/json"

%_bin% -i -H "%_contentType%"  %_url%%_espath%?pretty --data-binary @%_jsonfileFQDN%

endlocal
Output

dataBrowse_20180807_0357PM
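
Once the search reply is in hand, the documents sit under hits.hits, each with its _id and _source. The Python sketch below builds the same sort payload and pulls the documents out of a reply. The function names search_body and sources are ours, and sample_response is a hand-written, abbreviated stand-in for the reply, not the output captured in the screenshot.

```python
import json


def search_body(sort_field="_id", order="asc"):
    """Build the sort-only search payload used above."""
    return {"sort": {sort_field: {"order": order}}}


def sources(search_response):
    """Return (_id, _source) pairs from a search reply."""
    return [(hit["_id"], hit["_source"]) for hit in search_response["hits"]["hits"]]


print(json.dumps(search_body()))

# Abbreviated, hand-written reply for illustration
sample_response = {
    "hits": {
        "total": 2,
        "hits": [
            {"_index": "event", "_type": "_doc", "_id": "1",
             "_source": {"userID": 137314, "platform": "ios"}},
            {"_index": "event", "_type": "_doc", "_id": "2",
             "_source": {"userID": 137312, "platform": "android"}},
        ],
    }
}

for doc_id, source in sources(sample_response):
    print(doc_id, source["userID"], source["platform"])
```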

Source Control

GitHub

DanielAdeniji/elasticSearchMappingEvent
Link


 
