In this post, I will explain the basics of Logstash. This tool is a powerful gateway that can apply transformations to a message while processing it. It can listen on a port and wait for messages, or connect to a service and extract data like an ETL tool.
I will show you how to create a small Logstash port listener and forward the data to Elasticsearch.
First, to download Logstash, go to this page: Download Logstash.
Either download the .deb or .rpm file for a quick and easy install, or the compressed archive for Windows, Linux or macOS.
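On a Debian-based system, for example, the .deb package can be installed with dpkg (the version number below is an assumption; use the file you actually downloaded):

sudo dpkg -i logstash-8.1.0-amd64.deb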
The extracted folder contains the “logstash” binary (“logstash.bat” on Windows) in the bin folder, and a config folder containing the “pipelines.yml” and “logstash.yml” configuration files.
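The relevant layout looks roughly like this (the version in the folder name is an assumption):

logstash-8.1.0/
├── bin/
│   ├── logstash
│   └── logstash.bat
└── config/
    ├── logstash.yml
    └── pipelines.yml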
- Create a Logstash config file in the config folder and name it example.conf
- With the configuration example below, Logstash will listen for log messages on HTTP port 5891 and on the Beats protocol on port 5947, and will forward the data to Elasticsearch at the URL http://localhost:9200. Every day, Logstash will create an index following the naming convention tomcat-local-yyyy-MM-dd
# Logstash configuration file
# Log messages can be received using HTTP on port 5891
# or
# Log messages can be received using Beats on port 5947
input {
  http {
    port => 5891
    codec => json
  }
  beats {
    port => 5947
    codec => json
  }
}

# Data is sent to Elasticsearch on port 9200
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "tomcat-local-%{+yyyy-MM-dd}"
  }
}
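Before wiring the file into a pipeline, you can ask Logstash to validate the syntax and exit (the path is an assumption based on where you saved example.conf):

bin/logstash -f config/example.conf --config.test_and_exit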
- Tell Logstash to take the config file example.conf into account
- Add the config file to the pipelines.yml file
- Give a unique pipeline id to this listener worker group
- Point to the configuration file example.conf
- Specify how many concurrent threads will process the incoming data in parallel (if pipeline.workers is not specified, it defaults to the number of CPU cores on the host)
- pipeline.id: example
  path.config: "C:/logstash-8.1/config/example.conf"
  pipeline.workers: 3
Start Logstash, ensure that the ports are listening, and send a JSON example to see if Logstash forwards it to Elasticsearch.
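A minimal smoke test could look like this (the message body is arbitrary):

# Start Logstash; without -f it reads the pipelines declared in config/pipelines.yml
bin/logstash

# Send a test JSON message to the HTTP input
curl -X POST http://localhost:5891 -H "Content-Type: application/json" -d '{"app":"tomcat","message":"hello logstash"}'

# Check that the document reached Elasticsearch
curl "http://localhost:9200/tomcat-local-*/_search?pretty"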
Now I will show you how to connect to a MySQL database and send table rows to Elasticsearch.
- The input section uses the MySQL connector library to connect to MySQL and run the SELECT statement every 5 minutes. The filter part copies the id column into the event metadata and removes 3 fields before sending the data to Elasticsearch. The example also stores the latest value processed (sql_last_value) to make sure that rows won't be processed twice.
input {
  jdbc {
    jdbc_driver_library => "C:\mysql-connector\mysql-connector-java-8.0.16.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/databaseexample"
    jdbc_user => "USER_MYSQL"
    jdbc_password => "PASSWORD_MYSQL"
    jdbc_paging_enabled => true
    # Run the query every 5 minutes
    schedule => "*/5 * * * *"
    statement => "SELECT * FROM example_table WHERE last_modified_time > :sql_last_value"
    # Track the last processed value so rows are not read twice
    use_column_value => true
    tracking_column => "last_modified_time"
    tracking_column_type => "timestamp"
  }
}
filter {
  mutate {
    # Keep the database id as the future Elasticsearch document id
    copy => { "id" => "[@metadata][_id]" }
    # Drop fields that should not be indexed
    remove_field => ["id", "@version", "unix_ts_in_secs"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "db-local-%{+yyyy-MM-dd}"
    document_id => "%{[@metadata][_id]}"
  }
}
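To check that the rows are being indexed (and updated in place rather than duplicated, thanks to document_id), you can query the daily index:

curl "http://localhost:9200/db-local-*/_search?pretty"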