Archives

ELK

This post is intended to share a tool I find very useful to store data, and especially logs.

  • ELK is a product made of 3 tools:
    • E: Elasticsearch
      • A Big Data store based on the Apache Lucene engine, storing only JSON documents.
      • Elasticsearch exposes an API that is very useful to communicate with it.
    • L: Logstash
      • A powerful data gateway used to forward data to Elasticsearch.
      • It is also an ETL that can transform the received data before forwarding it to Elasticsearch.
    • K: Kibana
      • Kibana is the front-end part, used to:
        • Visualize data stored in Elasticsearch.
        • Create dashboards.
        • Create alerts.
        • Give developers a console to post API requests to Elasticsearch.
        • Visualize metrics.
        • Manage the stack (indexes, policies, pipelines, roles, users, …).

ESQL best practices

ESQL is a powerful language that enables fast and flexible development in IIB/ACE/CP4I.

The compute node:

When adding an ESQL Compute node, the Compute mode property is a crucial choice for performance.
Different options are available:

  • Message: This option is the most commonly used.
  • LocalEnvironment: This option is useful when you need to change the behavior of the next nodes.
  • LocalEnvironment and Message: This option is useful when you need to change the behavior of the next nodes and modify the message as well.
  • Exception: This option is useful for error management only.
  • Exception and Message: This option is useful for error management when you also need to log the message, or part of it.
  • Exception and LocalEnvironment: This option is useful for error management when you also need to log node information held in the local environment.
  • All: Try to avoid this option as much as possible.
  • Between a Java compute node and an ESQL compute node, for performance reasons, try to prioritize ESQL.

LocalEnvironment and Environment:

Difference between LocalEnvironment and Environment.

The LocalEnvironment is an important tree that lets the next nodes know what they will have to do.
Examples (see the sketch after this list):

  • To change the behavior of an HTTP Request node, I will override the variables:
    • OutputLocalEnvironment.Variables.HTTP_Method
    • OutputLocalEnvironment.Destination.HTTP.RequestURL
  • To change the destination queue of an MQ Output node, I will override the variable:
    • OutputLocalEnvironment.Destination.MQ.DestinationData[1].queueName
  • To change the folder and file to read of a File Input node, I will override the variables:
    • OutputLocalEnvironment.Destination.File.Directory
    • OutputLocalEnvironment.Destination.File.Name
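
As a minimal sketch (the URL and queue name below are made up for illustration), a Compute node whose Compute mode includes LocalEnvironment could apply such overrides like this:

CREATE COMPUTE MODULE OverrideDestinations
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Start from the incoming local environment so existing entries are kept
    SET OutputLocalEnvironment = InputLocalEnvironment;
    -- Point the next HTTP Request node to another endpoint (illustrative URL)
    SET OutputLocalEnvironment.Destination.HTTP.RequestURL = 'http://backend.example.com/service';
    -- Route the message to another queue through the next MQ Output node (illustrative queue name)
    SET OutputLocalEnvironment.Destination.MQ.DestinationData[1].queueName = 'MY.BACKEND.QUEUE';
    RETURN TRUE;
  END;
END MODULE;

Keep in mind that the MQ Output node only reads DestinationData when its destination mode is set to Destination List.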

Many middleware developers confuse the usage of LocalEnvironment and Environment.
Never use the LocalEnvironment to store your own variables: if you overwrite a known variable that a later node needs to use, the flow may misbehave, the issue becomes quite difficult to debug, and you may waste hours before finding the root cause. Store your own data in the Environment tree instead, which belongs to you and lives for the whole duration of the flow.
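
As a minimal sketch inside a Compute node (the field names are invented), this is the kind of assignment that belongs in the Environment tree rather than in the LocalEnvironment:

-- Own working data goes in the Environment tree; no built-in node depends on it
SET Environment.Variables.myCorrelationId = InputRoot.XMLNSC.Order.OrderId;
-- Any later Compute node in the flow can read it back
SET OutputRoot.XMLNSC.Reply.CorrelationId = Environment.Variables.myCorrelationId;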

Things to keep in mind when developing:

  • For performance reasons, never use CARDINALITY on a list just to check that it is not empty. CARDINALITY browses the whole list to return its size, while the EXISTS function stops at the first occurrence (see the sketch after this list).
  • Try to avoid browsing repeating elements with an integer index and CARDINALITY; use reference variables instead and browse the next elements with MOVE NEXTSIBLING.
  • When manipulating XML elements, use the XMLNSC domain instead of XML or XMLNS.
    • Difference between XML, XMLNS and XMLNSC:
    • XML: this domain is deprecated and should not be used anymore. It is still present for compatibility with older flows coming from WMB.
    • XMLNS: a programmatic parser that treats the XML message as a string; validation is not possible, either for generation or parsing. It is namespace aware. XMLNS parses the whole message and keeps the information in memory, so it uses less CPU but more memory to handle the message. XMLNS is therefore more suitable if XPath access is needed.
    • XMLNSC: the newer, optimized way to treat XML messages. It is namespace aware. XMLNSC parses the message sequentially, portion by portion, and builds a compact tree. It is optimized to use less memory than XMLNS, but may consume a little more CPU.
  • When manipulating JSON elements, take care whether float elements need a scientific representation or not; this must be configured through the environment variables of the node/integration server.
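
As a minimal sketch of the first two points (the Order/Item message structure is invented for the example, and the message is assumed to be parsed in the XMLNSC domain):

CREATE COMPUTE MODULE BrowseItems
  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- EXISTS stops at the first occurrence, unlike CARDINALITY which counts the whole list
    IF EXISTS(InputRoot.XMLNSC.Order.Item[]) THEN
      -- Browse the repeating elements with a reference instead of an integer index
      DECLARE itemRef REFERENCE TO InputRoot.XMLNSC.Order.Item[1];
      WHILE LASTMOVE(itemRef) DO
        -- ... process the current item here, e.g. FIELDVALUE(itemRef.Quantity) ...
        MOVE itemRef NEXTSIBLING REPEAT TYPE NAME;
      END WHILE;
    END IF;
    RETURN TRUE;
  END;
END MODULE;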

Shared variables and external variables:

  • A shared variable is a very useful type of variable that keeps living across all the threads (instances) of the flow (see the sketch after this list).
    • The variable keeps living until the flow or the integration server is stopped.
  • An external variable is a constant: once a value has been assigned, it can no longer be changed.
    • This variable can be useful for security reasons, when a value must never be changed.
    • This variable is also useful to bring a user-defined property of the flow into the ESQL code.
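
As a minimal sketch (the variable names and default value are illustrative, and the external variable assumes a user-defined property of the same name on the flow), both kinds are declared at module level:

CREATE COMPUTE MODULE SharedAndExternalExample
  -- Kept across all instances of the flow until the flow or integration server is stopped
  DECLARE msgCounter SHARED INTEGER 0;
  -- Constant; its value can be supplied by a user-defined property named backendUrl
  DECLARE backendUrl EXTERNAL CHARACTER 'http://backend.example.com/service';

  CREATE FUNCTION Main() RETURNS BOOLEAN
  BEGIN
    -- Update shared variables inside an ATOMIC block to stay thread safe
    BEGIN ATOMIC
      SET msgCounter = msgCounter + 1;
    END;
    SET OutputLocalEnvironment.Destination.HTTP.RequestURL = backendUrl;
    RETURN TRUE;
  END;
END MODULE;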

Difference between Docker Swarm, Kubernetes and OpenShift

In this post, I will describe the differences between Docker Swarm, Kubernetes and OpenShift:

  • Docker Swarm:
    • Easy and fast setup.
    • Works with the other existing Docker tools.
    • Lightweight installation.
    • Open source.
    • Limited functionalities with the Docker API.
    • Updates for Swarm must be scheduled.
    • Limited fault tolerance.
  • Kubernetes:
    • Open source and modular.
    • Runs on any Linux OS.
    • Easy service organisation with pods.
    • Backed by years of expert experience.
    • Laborious install and configuration.
    • Updates for Kubernetes are monitored and controlled progressively.
    • Incompatible with the Docker CLI and Docker Compose.
  • OpenShift:
    • Manages Kubernetes.
    • Helps abstract Kubernetes limitations such as network features.
    • A Red Hat OS is mandatory (except for dev).
    • Security features are better managed than in Kubernetes.
    • Smaller community than the Kubernetes one.
    • OpenShift lets Kubernetes handle the updates.

Security

For security matters, Ansible has a feature called Ansible Vault to store sensitive data.

Since Ansible is an infrastructure-as-code technology, you need to store the code in a source control service such as CVS, SVN, Git, TFS, …

So, to prevent anyone from reading sensitive data, use Ansible Vault to secure the content of your playbooks.

  • Create and keep sensitive data encrypted with AES:
    • Run the command line ansible-vault create secret-info.yml
      • Enter a vault password twice
      • Enter your sensitive data in the text editor
  • Edit the vault:
    • ansible-vault edit secret-info.yml
    • Edit your sensitive data in the text editor
  • Use the vault:
    • Add vars_files to your playbook:
      • vars_files:
      •   - secret-info.yml
    • ansible-playbook playbook.yml --ask-vault-pass
      • It will prompt for the vault password
      • If you try to automate the runs, it could be a good idea to retrieve the password from a secured tool such as HashiCorp Vault.

Ansible Galaxy

Ansible Galaxy is a hub to share your playbook projects in public repositories.

https://galaxy.ansible.com/

Each share is categorized into sections.

Roles allow you to create portable and shareable Ansible projects.

To create a new Galaxy project, run -> ansible-galaxy init PATH

Create a YAML file at the root; let's call it test.yml:

# Minimal playbook (the hosts value is just an example)
- hosts: localhost
  tasks:
    - name: use role
      include_role:
        name: PATH

Run it with -> ansible-playbook PATH/test.yml

Service handlers and error handlers

A task can notify when a change has been made, and a handler is then triggered.

Example:

tasks:
  - name: change_port
    lineinfile: path=/etc/httpd/http.conf regexp='^port' line='port=8080'
    notify: Restart_Apache

handlers:
  - name: Restart_Apache
    service: name=apache2 state=restarted

Error management of tasks:

  • To ignore a change status -> changed_when: false
    • For instance for uname or a service status command
  • Force a change status if a word is found -> changed_when: "'SUCCESS' in cmd_output.stdout"
  • Force an error status if a word is found -> failed_when: "'FAIL' in cmd_output.stdout"
  • Ignore an error status -> ignore_errors: yes
    • Easy to test with the command -> /bin/false

Variables

Ansible variables are referenced between double curly braces, for instance {{ example_var }}.

See all the available default variables of a host -> ansible -m setup hostname

  • Variables can be declared statically in the YAML file:
    • vars:
      • example_var: "This is a variable example"
      • my_deb_file: zabbix-release_4.4-1+bionic_all.deb
  • Variables can be passed as parameters on the command line:
    • ansible-playbook xxxxx.yml --extra-vars "variable=value"