Logstash – Configuration – Field Names – document_id

Background

While working with Logstash, I got bogged down trying to use the document_id configuration option of the Elasticsearch output.

Reproduction

Logstash

Configuration

Image

configuration_file_20180809_0820AM.PNG

ElasticSearch

Query

Image

query_stackoverflowUser_20180809_0825AM.PNG

Data

Image

stackoverflowUser_output_20180809_0830AM.PNG

 

Troubleshooting

ElasticSearch

Data Review

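  1. Hits
    • Number of Hits
      • We added 10 records
        • The data source query reads “select top 10 *”
      • But hits/total reads 1
    • Lone Hit
      • Data
        • _id
          • “_id”: “%{[Id]}”
          • This is the literal text of the field reference rather than a value, so the reference was never resolved and every record was written to the same document
        • field
          • id
            • “id”: 10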

Logstash

Output

Image

output_20180809_0851AM.PNG

Textual


{"id":-1,"lastaccessdate":"2008-08-26T07:16:53.810Z","reputation":1,"accountid":-1,"location":"on the server farm","downvotes":890820,"displayname":"Community","@timestamp":"2018-08-09T13:48:09.088Z","age":null,"emailhash":null,"views":649,"@version":"1","creationdate":"2008-07-31T07:00:00.000Z","aboutme":"Hi, I'm not really a person.\n\nI'm a background process that helps keep this site clean!\n\nI do things like\n\n<ul>\n<li>Randomly poke old unanswered questions every hour so they get some attention</li>\n<li>Own community questions and answers so nobody gets unnecessary reputation from them</li>\n<li>Own downvotes on spam/evil posts that get permanently deleted</li>\n<li>Own suggested edits from anonymous users</li>\n<li><a href=\"http://meta.stackexchange.com/a/92006\">Remove abandoned questions</a></li>\n</ul>\n","websiteurl":"http://meta.stackexchange.com/","upvotes":225495}

SQL Server

Diagram

dbo.Users

Image

stackOverflow2010.Users.PNG

Explanation
  1. In SQL Server the column name is Id, with a capital I, whereas the Logstash output above names the field id

Resolution

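By default, the Logstash jdbc input lower-cases column names (its lowercase_column_names setting defaults to true), so the column Id arrives as the field id. Field references are case-sensitive; document_id must therefore be “%{[id]}” rather than “%{[Id]}”.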
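
To see why a single upper-case letter reduces 10 records to one hit, the behaviour can be imitated in a few lines of Python. This is only a sketch, not Logstash code: the helper names sprintf and index_events are made up for illustration, the regular expression handles only the simple %{[name]} form, and a plain dictionary stands in for the Elasticsearch index. It assumes, as the lone hit in the data review suggests, that a reference to a missing field is left in place as literal text.

```python
import re

# Matches the simple field-reference form %{[name]} only (illustrative, not Logstash's parser).
FIELD_REF = re.compile(r"%\{\[([^\]]+)\]\}")


def sprintf(template, event):
    """Replace %{[name]} with the event's value; leave unknown references untouched."""
    def substitute(match):
        name = match.group(1)
        return str(event[name]) if name in event else match.group(0)
    return FIELD_REF.sub(substitute, template)


def index_events(events, id_template):
    """Toy index: a dict keyed by _id, so a repeated _id overwrites the earlier document."""
    index = {}
    for event in events:
        index[sprintf(id_template, event)] = event
    return index


# Ten events whose key is the lower-cased "id", as in the Logstash output.
events = [{"id": n} for n in range(1, 11)]

wrong = index_events(events, "%{[Id]}")  # reference matches no field, stays literal
right = index_events(events, "%{[id]}")  # reference matches the lower-cased field

print(len(wrong), list(wrong))  # 1 ['%{[Id]}']
print(len(right))               # 10
```

With the wrong casing every event lands on the same key and the last one written wins, which matches the single hit observed; with the right casing each event gets its own key.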

Logstash

Configuration

Image

configuration_file_20180809_0903AM.PNG

ElasticSearch

Query

Query Output

Image

output_20180809_0908AM.PNG

Explanation
  1. hits/total now reads 10, matching the 10 records added

References

  1. Logstash
    • Configuring
      • Logstash Reference [6.3] » Configuring Logstash » Accessing Event Data and Fields in the Configuration
    • Transforming Data
      • Logstash Reference [6.3] » Transforming Data » Extracting Fields and Wrangling Data
    • Configuring Logstash
      • Logstash Reference [6.3] » Configuring Logstash » Structure of a Config File
    • Output
      • Logstash Reference [6.3] » Output plugins » Elasticsearch output plugin
    • Filter
      • Logstash Reference [6.2] » Filter plugins » Mutate filter plugin

Logstash – Error – “Unrecognized VM option ‘UseParNewGC’”

Background

During my initial evaluation of Logstash, I ran into an easy-to-address error.

Reproduce

Invoke

The invocation is straightforward.


set "_binfolder=C:\Downloads\Elastic\Logstash\v6.3.2\extract\bin"
set "_configuration=stackOverflow2010.User.conf"

call %_binfolder%\logstashImpl.bat -f %_configuration%

Output

Image

UseParNewGC_20180802_0334PM

Textual


Unrecognized VM option 'UseParNewGC'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Troubleshooting

Java

Java Version

Outline

We can ask the Java runtime executable, java.exe, for its version number.

Script


java -version

Output

Image

java_version_20180808_0332PM

Textual


java version "10.0.2" 2018-07-17
Java(TM) SE Runtime Environment 18.3 (build 10.0.2+13)
Java HotSpot(TM) 64-Bit Server VM 18.3 (build 10.0.2+13, mixed mode)
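
The same check can be scripted. Below is a minimal Python sketch that pulls the major version out of the first line of the java -version output. The function names java_major_version and accepts_use_par_new_gc are made up for illustration, and the cut-off at version 10 is an assumption based on the error above and the Java 8 workaround reported in the knowledge base entry, not a statement taken from the JDK documentation.

```python
import re


def java_major_version(version_output):
    """Return the major Java version found in `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first, second = int(match.group(1)), match.group(2)
    # Legacy numbering: "1.8.0_181" means Java 8.
    if first == 1 and second is not None:
        return int(second)
    return first


def accepts_use_par_new_gc(major):
    """Assumption: the UseParNewGC option is rejected from Java 10 onward."""
    return major < 10


sample = 'java version "10.0.2" 2018-07-17'
major = java_major_version(sample)
print(major, accepts_use_par_new_gc(major))  # 10 False
```

Note that java -version writes to standard error, so a script that captures it with subprocess.run(["java", "-version"], capture_output=True, text=True) should read the stderr attribute of the result.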

Knowledge Base (KB)

I googled the error message and got a good hit.

  1. [META] Java 10 Support #9345
    • Opened By :- andrewvc
    • Date Opened :- 2018-April-9th
    • Resolution
      • Resolution #1
        • By :- Armin Braun ( original-brownbear )
        • Workaround :- Currently there is no temporary fix sorry. Switching to Java 8 is the only stable solution for the time being.
      • Resolution #2
        • By :- Josh Daone ( JoshDaone )
        • Workaround :- @original-brownbear thanks! Switched back to openjdk-8-jre and working properly.

Resolution

Target Specific Java Version

Objective

Rather than rely on whichever Java is referenced in our path, we will set the environment variable JAVA_HOME to a pre-v10 version, in this case Java 8.

Review Installed Java

Launch Windows Explorer and open the “C:\Program Files\Java” folder.

Review Java Folders

java_explorer_20180808_0424PM

Code


set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_181"

Additional Reading

  1. Andy Luis
    • mpvjava
      • JDK 9 Migration : 5 point checklist for Garbage Collection
  2. OpenJDK
    • JEP 214: Remove GC Combinations Deprecated in JDK 8
  3. Oracle
    • JDK 9 Release Notes – Removed APIs, Features, and Options

Dedicated

Dedicated to Armin Braun ( original-brownbear )
Less Talking … More Coding …

References

  1. Elastic
    • elastic/logstash
      • [META] Java 10 Support #9345

ElasticSearch – Logstash – SQL Server

Background

Elastic’s Logstash is an ETL tool that allows us to “Request, Collect, Parse, and Send” Data.

Logstash

Definition

Elastic

Here is Elastic’s own definition

Link

definition_logstash_20180808_0125PM

Artifact

Logstash is available from Elastic’s downloads page.

The current version is 6.3.2, a very recent release (2018-July-24th).

artifact_6_3_2

Download

For MS Windows, please choose the ZIP Version.

Extract

Please extract the compressed file.

JDBC Driver

Microsoft SQL Server

Download

Please download the Microsoft JDBC Driver for SQL Server from Microsoft’s download site.

Extract

Extract the downloaded file.

Usage

Files

Configuration File

Outline

  1. Input
    • jdbc
      • Connection String
        • Syntax :- jdbc:sqlserver://[host]:[port-number]
        • Sample :- jdbc:sqlserver://localhost:1433
      • jdbc_user :- database user
      • jdbc_password :- database user password
      • jdbc_driver_library :- full file name of JDBC Driver
      • jdbc_driver_class
        • com.microsoft.sqlserver.jdbc.SQLServerDriver
      • statement
        • select top 10 * from [StackOverflow2010].[dbo].[Users] tblU order by tblU.[Id] asc
  2. Output
    • elasticsearch
      • hosts
        • Syntax :- Elasticsearch host and port
        • Sample :- localhost:9200
      • Index
        • Sample :- stackoveflow2010user
      • Document Type
        • Sample :- _doc
      • document_id
        • Syntax :- %{[field-name]}, where the field name is the lower-cased column name
        • Sample :- %{[id]}

Configuration

input {
  jdbc {
    jdbc_connection_string => "jdbc:sqlserver://localhost:1433"
    # The user we wish to execute our statement as
    jdbc_user => "stackoverflow"
    jdbc_password => "hIy8jA2lNl"
    # The path to our downloaded JDBC driver
    jdbc_driver_library => "C:\Downloads\Microsoft\Java\jdbc\v6.0.8112.200\extract\sqljdbc_6.0\enu\jre8\sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    # Our query
    statement => "select * from [StackOverflow2010].[dbo].[Users] tblU order by tblU.[Id] asc"
  }
}

output {
  # Echo each event to the console, one JSON document per line
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:9200"
    index => "stackoveflow2010user"
    document_type => "_doc"
    # The jdbc input lower-cases column names, hence id and not Id
    document_id => "%{[id]}"
  }
}

Command File


setlocal

REM 2018-08-08 11:16 AM Daniel Adeniji ( dadeniji)
REM SET JAVA_HOME to Version 1.8
set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_181"

set "_binfolder=C:\Downloads\Elastic\Logstash\v6.3.2\extract\bin"
set "_configuration=stackOverflow2010.User.conf"

call %_binfolder%\logstashImpl.bat -f %_configuration%

endlocal

Processing

Script


stackOverflow2010.User.cmd

Output

Output – 01

processing_20180808_0132PM

Output – 02

processing_20180808_0134PM

Output – 03

processing_20180808_0136PM

Output – 04

processing_20180802_011PM

 

Validation

Tools

Postman

Queries

Query – Microsoft

Objective

Find matches for Microsoft

Query


http://localhost:9200/stackoveflow2010user/_doc/_search?q=Microsoft

Design

query_Microsoft_20180808_0117PM

Output

Microsoft_Result_20180808_0118PM
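
For those who would rather script the validation than use Postman, here is a minimal Python sketch of the same request using only the standard library. The function names build_search_url, search and total_hits are made up for illustration. It assumes the Elasticsearch 6.x response layout, in which hits/total is a plain number; the live call is left commented out because it needs a running node.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(host, index, doc_type, term):
    """Build the URI-search address used in the Postman query above."""
    query = urllib.parse.urlencode({"q": term})
    return f"http://{host}/{index}/{doc_type}/_search?{query}"


def search(url):
    """Send the GET request and decode the JSON response."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def total_hits(result):
    """Read hits/total, assumed here to be a plain number (6.x layout)."""
    return result["hits"]["total"]


url = build_search_url("localhost:9200", "stackoveflow2010user", "_doc", "Microsoft")
print(url)

# Requires a running Elasticsearch node:
# print(total_hits(search(url)))
```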

 

References

  1. Elastic
    • Docs
      • Logstash
        • Running Logstash from the Command Line
    • Blog
      • Suyog Rao
        • Little Logstash Lessons: Handling Duplicates
  2. StackOverflow
    • Logstash to Keep Two Databases Synced – Cannot Access %{document_id}
  3. QBox
    • Vineeth Mohan
      • Migrating MySql Data Into Elasticsearch Using Logstash