ElasticSearch – Logstash – SQL Server

Background

Elastic’s Logstash is an ETL tool that allows us to “Request, Collect, Parse, and Send” Data.

Logstash

Definition

Elastic

Here is Elastic’s own definition

Link

definition_logstash_20180808_0125PM

Artifact

Logstash is available here.

The current version is 6.3.2 and it is a very recent release ( 2018-July-24th).

artifact_6_3_2

Download

For MS Windows, please choose the ZIP Version.

Extract

Please extract the compressed file.

JDBC Driver

Microsoft SQL Server

Download

Please download Microsoft SQL Server JDBC Driver from here.

Extract

Extract the downloaded file.

Usage

Files

Configuration File

Outline

  1. Input
    • jdbc
      • Connection String
        • Syntax :- jdbc:sqlserver://[host]:[port-number]
        • Sample :- jdbc:sqlserver://localhost:1433
      • jdbc_user :- database user
      • jdbc_password :- database user passsword
      • jdbc_driver_library :- full file name of JDBC Driver
      • jdbc_driver_class
        • com.microsoft.sqlserver.jdbc.SQLServerDriver
      • statement
        • select top 10 * from [StackOverflow2010].[dbo].[Users] tblU order by tblU.[Id] asc
  2. Output
    • elasticSearch
      • hosts
        • Syntax :- Elastic Search host and Port
        • Sample :- localhost:9200
      • Index
        • Sample :- stackoveflow2010user
      • Document Type
        • Sample :- _doc
      • document_id
        • Syntax :- %{[column-name]}
        • Sample :- %{[id]}

Configuration

input {
  jdbc {
    jdbc_connection_string => "jdbc:sqlserver://localhost:1433"
    # The user we wish to execute our statement as
    jdbc_user => "stackoverflow"
    jdbc_password => "hIy8jA2lNl"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "C:\Downloads\Microsoft\Java\jdbc\v6.0.8112.200\extract\sqljdbc_6.0\enu\jre8\sqljdbc42.jar"
    jdbc_driver_class => "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    # our query
    statement => "select * from [StackOverflow2010].[dbo].[Users] tblU order by tblU.[Id] asc"
    }
  }
output {
  stdout { codec => json_lines }
  elasticsearch {
  "hosts" => "localhost:9200"
  "index" => "stackoveflow2010user"
  "document_type" => "_doc"
  document_id => "%{[id]}"
  }
}

Command File


setlocal

REM 2018-08-08 11:16 AM Daniel Adeniji ( dadeniji)
REM SET JAVA_HOME to Version 1.8
set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_181"

set "_binfolder=C:\Downloads\Elastic\Logstash\v6.3.2\extract\bin"
set "_configuration=stackOverflow2010.User.conf"

call %_binfolder%\logstashImpl.bat -f %_configuration%

endlocal

Processing

Script


stackOverflow2010.User.cmd

Output

Output – 01

processing_20180808_0132PM

Output – 02

processing_20180808_0134PM

Output – 03

processing_20180808_0136PM

Output – 04

processing_20180802_011PM

 

Validation

Tools

Postman

Queries

Query – Microsoft

Objective

Find matches for Microsoft

Query


http://localhost:9200/stackoveflow2010user/_doc/_search?q=Microsoft

Design

query_Microsoft_20180808_0117PM

Output

Microsoft_Result_20180808_0118PM

 

References

  1. Elastic
    • Docs
      • Logstash
        • Running Logstash from the Command Line
          Link
    • Blog
      • Suyog Rao
        • Little Logstash Lessons: Handling Duplicates
          Link
  2. StackOverflow
    • Logstash to Keep Two Databases Synced – Cannot Access %{document_id}
      Link
  3. QBox
    • Vineeth Mohan
      • Migrating MySql Data Into Elasticsearch Using Logstash
        Link

 

DBeaver – Table Relationships

Background

Just wanting to make sure that I am properly tracking table relationships in MySQL.

 

Code

INFORMATION_SCHEMA

INFORMATION_SCHEMA.KEY_COLUMN_USAGE

Code

Sample


set @schema := 'sakila';
set @table := 'inventory';

select 

         case

            when tblKCU.TABLE_NAME = @table then 'References'
            when REFERENCED_TABLE_NAME = @table then 'Referenced'

         end as 'isReferencingOrReferenced'
       , tblKCU.CONSTRAINT_NAME as 'constraint'
       , tblKCU.TABLE_NAME as 'table'
       , tblKCU.COLUMN_NAME as 'column'

       , tblKCU.REFERENCED_TABLE_NAME as 'refTable'
       , tblKCU.REFERENCED_COLUMN_NAME as 'refColumn'

       , tblKCU.ORDINAL_POSITION as 'colPos'

from   information_schema.KEY_COLUMN_USAGE tblKCU

where  tblKCU.TABLE_SCHEMA = @schema

/* Foreign Key Relationship */
and    (

             ( tblKCU.TABLE_NAME = @table )
          or ( tblKCU.REFERENCED_TABLE_NAME = @table )

       )

/* Skip Primary Key */
and    (

              ( tblKCU.TABLE_NAME is not null )
          and ( tblKCU.REFERENCED_TABLE_NAME is not null )

       )

order by       

         tblKCU.TABLE_NAME
       , tblKCU.COLUMN_NAME

       , tblKCU.REFERENCED_TABLE_NAME
       , tblKCU.REFERENCED_COLUMN_NAME

       , tblKCU.CONSTRAINT_NAME       

       , tblKCU.ORDINAL_POSITION

;

Output

 

Tools

DBeaver

Outline

Steps to follow to track visually through DBeaver

  1. Launch DBeaver
  2. Connect to MySQL Instance
  3. Navigate to Database Tables <Specific Table>
  4. Double Click on the selected table
  5. On the right panel, choose the “ER Diagram” Tab
  6. Review the shown tables and their relationships to each other

Image

MySQL :- Error – “You are using safe update mode and you tried to update a table without a WHERE that uses a KEY column”

Background

As I tested out the code from our last blog, discovered a nice MySQL safety net.

BTW, that blog is here.

 

Error

Error Text

The error reads:


You are using safe update mode and you tried to update a table 
without a WHERE that uses a KEY column

 

Error Image

Troubleshooting

Review code and identify DML operations ( Update, Delete ) where the key column is not being filtered on.

Here are possible scenarios that will raise this error :-

  1. Delete from [table];
  2. Delete from [table] where [id] > 0;

 

Remediation

In our case we replaced


delete

from tblIndexColumn

with


truncate table tblColumn;

Workaround

SQL_SAFE_UPDATES


SET SQL_SAFE_UPDATES=0;

call listIndexColumns();

SET SQL_SAFE_UPDATES=1;

Transact SQL – STRING_AGG

Background

In our post on MySQL – Information_schema.statistics we spoke glowingly of the GROUP_CONCAT Function.

I really could not find a way around using it in MySQL.

I ran into some difficulties using it and as with problems one just googles for workarounds.

BTW, the MySQL post is here.

SQL Server

Introduction

It seems that in version 2017, MSFT played catch up.

And, added a similar function.

String_Agg

Code


select
        [object]
            = quoteName(tblSS.[name])
              + '.'
              + quoteName(tblSO.[name])

        , [indexID]
            = tblSI.[index_id]

        , [index]
            = tblSI.[name]

        , [indexType]
            = tblSI.[type_desc]

        , [columnList]
            = 
                STRING_AGG 
                (
                      concat
                      (
                          tblSC.[name]
                        , ''
                      )	

                    , ', '
                ) 
                WITHIN GROUP 
                (
                    ORDER BY 
                        tblSIC.[key_ordinal]
                )
            

from   sys.schemas tblSS

inner join sys.objects tblSO

        on   tblSS.[schema_id] = tblSO.[schema_id]
 
inner join sys.indexes tblSI

        on   tblSO.[object_id] = tblSI.[object_id]

inner join sys.index_columns tblSIC

        on   tblSI.[object_id] = tblSIC.[object_id]
        and  tblSI.[index_id]  = tblSIC.[index_id]

inner join sys.columns tblSC

        on   tblSIC.[object_id] = tblSC.[object_id]
        and  tblSIC.column_id = tblSC.column_id

/*
    Skip MS Shipped Objects
*/
where tblSO.is_ms_shipped = 0

/*
    Exclude Included Columns
    Only Include actual Key Columns
*/
and   tblSIC.[key_ordinal] > 0

group by

                quoteName(tblSS.[name])
              + '.'
              + quoteName(tblSO.[name])

        , tblSI.[index_id]

        , tblSI.[name]

        , tblSI.[type_desc]

order by

                quoteName(tblSS.[name])
              + '.'
              + quoteName(tblSO.[name])

        , tblSI.[index_id]

        , tblSI.[name]



Output

 

Crediting

Crediting Dan M for asking the question.

And, Martin Smith for ably.

Simulating group_concat MySQL function in Microsoft SQL Server 2005?
Link

 

References

  1. Microsoft
    • String_Agg
    • sys.index_columns
  2. Stack Overflow
    • Simulating group_concat MySQL function in Microsoft SQL Server 2005?
      Link

MySQL – information_schema.statistics

Background

As a quick follow-up to our discussion of metadata on tables, let us discuss indexes.

BTW, the discussion on tables is here.

INFORMATION_SCHEMA.STATISTICS

Code


select

		tblStat.TABLE_NAME as 'table'

	  , tblStat.INDEX_NAME as 'index'

	  , case tblStat.NON_UNIQUE
			   when 0 then 'No'
			   when 1 then 'Yes'
			   else '?'
		end as 'Unique'


	  , GROUP_CONCAT( COLUMN_NAME order by SEQ_IN_INDEX ) as colList


	  , tblStat.CARDINALITY


from  INFORMATION_SCHEMA.STATISTICS tblStat


where  tblStat.TABLE_SCHEMA = 'drupal'


group by

		 tblStat.TABLE_SCHEMA
	   , tblStat.TABLE_NAME
	   , tblStat.INDEX_NAME


order by

		 tblStat.TABLE_SCHEMA
	  ,  tblStat.TABLE_NAME
	  ,  tblStat.INDEX_NAME


LIMIT 25

;


Output

 

Comparison with Other Databases

  1. GROUP_CONCAT
    • The group_concat is very, very powerful and has few equal in other database platforms
    • Function Documentation

 

SHOW INDEX

To get a bit more information, specifically cardinality ( Number of unique values in each column combination ), we can issue “show index” against the table.

Code

Syntax


show index from [table] from [database];

Sample


show index from search_index from drupal;

Output

References

  1. Reference
    • INFORMATION_SCHEMA Tables
      • INFORMATION_SCHEMA STATISTICS Table
    • Show Index
    •  Group_Function
      • Group By Function
        Link
    • Group_Concat

MySQL – information_schema.tables

Background

With SQL being a lingua-franca for quering data it should not be a far stretch to note that the various governing bodies have agreed on standards on how to expose dictionary.

 

Mysql

For instance, if we want to see which tables are in a database we can launch our favorite GUI SQL Editor or just access MySQL’s Client query tool eponymous named mysql.

 

information_schema.tables

SQL

Sample


select 
       TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
     , TABLE_TYPE, ENGINE
     , TABLE_ROWS, AVG_ROW_LENGTH 

from information_schema.tables 

where TABLE_TYPE not in ("SYSTEM VIEW")  

LIMIT 10


 

Output

Explanation

  1. Database
    • def
  2. Schema
    • audit_repository
  3. Table
  4. Table Type
    • Available Options
      • SYSTEM VIEW
      • BASE TABLE
  5. Engine
    • Available Options
      • Memory – System Tables
      • InnoDB
      • MyISAM
  6. TABLE ROWS
    • Number of records in table
  7. AVG ROW LENGTH
    • Based on column data type and used size allocation

 

Comparison with Other Databases

SQL Server

  1. Able to use double-quotes in query
    • We are able to use double-quotes (“) to delimit column values
  2. Limit N
    • In Transact SQL we use top N.
    • On the other hand, in MySQL we use LIMIT N

 

show tables

Objective

List tables and views that fits specified name format.

SQL

Syntax


show full tables from [database] like [name] where TABLE_TYPE in [BASE_TABLE | VIEW ];

Sample


show full tables from drupal like 'role%';

 

Output

 

describe

Objective

Display table’s structure.

SQL

Syntax


describe [table];

Sample


describe drupal.role;

Output

 

References

  1. MySQL
    • Show Tables
      • Show Tables Syntax
        Link
    • Show Commands
      • Extensions to SHOW Statements
        Link

MySQL – Binaries Logs – Review Settings

Background

A quick documentation on how to start reviewing Binary Log settings for a MySQL Instance.

Review

Review Account mysql is running under

ps aux

Code


$ ps aux | grep -v "grep" | grep "mysql"

Output

Explanation

  1. User
    • mysql
  2. Process ID
    • 4400
  3. Program Name
    • /apps/mysql/sbin/mysqld
  4. Base dir
    • /apps/mysql
  5. Data dir
    • /data/mysql
  6. plugin-dir
    • /usr/lib64/mysql/plugin
  7. user
    • mysql
  8. log-error
    • /apps/mysql/log/mysql-err.log
  9. pid-file
    • /var/run/mysql/mysqld.pid
  10. socket
    • /var/run/mysql/mysql.sock
  11. port
    • 3306

sudo to account mysql is running under

syntax


sudo su – mysql

confirm

Script


whoami

Output


Where is conf file?

/apps/mysql/bin

syntax


ls -la /apps/mysql/bin/*conf*

output

/etc

syntax


ls -la /etc/my*

output

Explanation

Matches found

  1. /etc/my.cnf

 

Configuration file

/etc/my.cnf

Syntax


cat /etc/my.cnf

Image

Explanation

  1. log_bin
    • log_bin=/logs/mysql/mysql-bin.log
  2.  max_binlog_size
    • max_binlog_size=100MB
  3.  expire_logs_days
    • expire_logs_days=1
  4. binlog_format
    • binlog_format=row

/logs/mysql/mysql-bin.log

Objective

Review /logs/mysql folder for mysql-bin.###### files.

Syntax


ls -la --block-size=M /logs/mysql/mysql-bin*

Image

/logs/mysql/mysql-bin.index

Objective

Review /logs/mysql/mysql-bin.index file for a listing of binary log files.

Syntax


cat /logs/mysql/mysql-bin.index

Image