Ingest Pipelines

In this post I will explain what ingest pipelines are, what their use cases are, and how to create them.

https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest.html

  • What is an ingest pipeline: it is a series of processors that Elasticsearch applies to incoming documents, so the data can be transformed before it is saved to an index.
  • Possible actions: the available transformations include removing a field, adding a field, enriching the value of a field, and converting a field's type.
  • The ingest pipeline option is located in the Stack Management section of Kibana.
  • Use case:
    • If Logstash sits between an agent or application and Elasticsearch, you can use its filters (such as grok) to perform the same actions as an ingest pipeline.
    • But if agents or applications send data directly to Elasticsearch and you would like to manipulate the data before it is indexed, you can use an ingest pipeline to do the transformation.
      • It is also a good fit when you are not allowed to change the agent or the application that sends the data.
  • How to use it:
    • In the image above, you see the home page of the ingest pipeline menu.
      • Click the blue “Create pipeline” button and choose “New pipeline”.
      • Give your new pipeline a relevant name and a short description.
      • Click the “Add a processor” button. You can add several processors to the same pipeline.
      • In my example I convert the type of a field from integer to string.
      • I will use the JSON field response.
      • Next, click the “Add” button.
      • Next to the text “Test Pipeline:”, click the “Add documents” link.
      • Insert a JSON sample you would like to test and run the test with the “Run the pipeline” button.
      • Check the result to see whether the transformation worked.
      • When your pipeline is complete, you can save its configuration as an HTTP PUT request, which lets you deploy it on other ELK environments or clusters.
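
The pipeline built in the steps above can also be written out as the HTTP PUT request mentioned in the last step. The Python sketch below only builds and prints that request, it does not send anything. The pipeline name convert-response-to-string and the description text are placeholders I chose for this example, and the body assumes a single convert processor on the response field.

```python
import json

# Hypothetical pipeline name, chosen for this example only.
PIPELINE_NAME = "convert-response-to-string"


def build_pipeline_body():
    """Return a pipeline definition with one convert processor."""
    return {
        "description": "Convert the response field from integer to string",
        "processors": [
            {
                "convert": {
                    "field": "response",  # field used in this post
                    "type": "string",     # target type
                }
            }
        ],
    }


def as_console_request(name, body):
    """Format the definition as a request that can be pasted into Kibana Dev Tools."""
    return f"PUT _ingest/pipeline/{name}\n{json.dumps(body, indent=2)}"


print(as_console_request(PIPELINE_NAME, build_pipeline_body()))
```

The printed text can be pasted into the Dev Tools console of another cluster to recreate the same pipeline there.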

Here is the JSON sample I used; note the response field in _source, which holds the integer 200:

{
  "_index": "kibana_sample_data_logs",
  "_id": "l_zi9oAB8WFQcfknI5oN",
  "_version": 1,
  "_score": 1,
  "_source": {
    "agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24",
    "bytes": 4460,
    "clientip": "123.217.24.241",
    "extension": "",
    "geo": {
      "srcdest": "US:US",
      "src": "US",
      "dest": "US",
      "coordinates": {
        "lat": 42.71720944,
        "lon": -71.12343
      }
    },
    "host": "www.elastic.co",
    "index": "kibana_sample_data_logs",
    "ip": "123.217.24.241",
    "machine": {
      "ram": 11811160064,
      "os": "ios"
    },
    "memory": null,
    "message": "123.217.24.241 - - [2018-08-01T07:02:46.200Z] \"GET /enterprise HTTP/1.1\" 200 4460 \"-\" \"Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24\"",
    "phpmemory": null,
    "referer": "http://nytimes.com/success/konstantin-kozeyev",
    "request": "/enterprise",
    "response": 200,
    "tags": [
      "success",
      "info"
    ],
    "timestamp": "2022-05-25T07:02:46.200Z",
    "url": "https://www.elastic.co/downloads/enterprise",
    "utc_time": "2022-05-25T07:02:46.200Z",
    "event": {
      "dataset": "sample_web_logs"
    }
  },
  "fields": {
    "referer": [
      "http://nytimes.com/success/konstantin-kozeyev"
    ],
    "request": [
      "/enterprise"
    ],
    "agent": [
      "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24"
    ],
    "extension": [
      ""
    ],
    "tags.keyword": [
      "success",
      "info"
    ],
    "geo.coordinates": [
      {
        "coordinates": [
          -71.12343,
          42.71720944
        ],
        "type": "Point"
      }
    ],
    "geo.dest": [
      "US"
    ],
    "response.keyword": [
      "200"
    ],
    "machine.os": [
      "ios"
    ],
    "utc_time": [
      "2022-05-25T07:02:46.200Z"
    ],
    "agent.keyword": [
      "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24"
    ],
    "clientip": [
      "123.217.24.241"
    ],
    "host": [
      "www.elastic.co"
    ],
    "machine.ram": [
      11811160064
    ],
    "extension.keyword": [
      ""
    ],
    "host.keyword": [
      "www.elastic.co"
    ],
    "machine.os.keyword": [
      "ios"
    ],
    "hour_of_day": [
      7
    ],
    "timestamp": [
      "2022-05-25T07:02:46.200Z"
    ],
    "geo.srcdest": [
      "US:US"
    ],
    "ip": [
      "123.217.24.241"
    ],
    "request.keyword": [
      "/enterprise"
    ],
    "index": [
      "kibana_sample_data_logs"
    ],
    "geo.src": [
      "US"
    ],
    "index.keyword": [
      "kibana_sample_data_logs"
    ],
    "message": [
      "123.217.24.241 - - [2018-08-01T07:02:46.200Z] \"GET /enterprise HTTP/1.1\" 200 4460 \"-\" \"Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24\""
    ],
    "url": [
      "https://www.elastic.co/downloads/enterprise"
    ],
    "url.keyword": [
      "https://www.elastic.co/downloads/enterprise"
    ],
    "tags": [
      "success",
      "info"
    ],
    "@timestamp": [
      "2022-05-25T07:02:46.200Z"
    ],
    "bytes": [
      4460
    ],
    "response": [
      "200"
    ],
    "message.keyword": [
      "123.217.24.241 - - [2018-08-01T07:02:46.200Z] \"GET /enterprise HTTP/1.1\" 200 4460 \"-\" \"Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24\""
    ],
    "event.dataset": [
      "sample_web_logs"
    ]
  }
}
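
The same kind of test can be run outside the UI with the simulate pipeline API. As a rough sketch, the script below builds a simulate request with an inline pipeline definition and one test document. I trimmed the sample to three fields to keep it short, and the helper name build_simulate_body is hypothetical.

```python
import json


def build_simulate_body(source):
    """Wrap one document's _source in a simulate request with an inline pipeline."""
    return {
        "pipeline": {
            "processors": [
                {"convert": {"field": "response", "type": "string"}}
            ]
        },
        "docs": [{"_source": source}],
    }


# Trimmed copy of the sample document (only a few fields kept).
sample_source = {
    "host": "www.elastic.co",
    "request": "/enterprise",
    "response": 200,
}

print("POST _ingest/pipeline/_simulate")
print(json.dumps(build_simulate_body(sample_source), indent=2))
```

Only the _source part of the document is needed for the test; the fields section of the sample is not part of what gets indexed.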

And here is the JSON result. As you can see, the response field is now a string ("200"):

{
  "docs": [
    {
      "doc": {
        "_index": "kibana_sample_data_logs",
        "_id": "l_zi9oAB8WFQcfknI5oN",
        "_version": "1",
        "_source": {
          "referer": "http://nytimes.com/success/konstantin-kozeyev",
          "request": "/enterprise",
          "agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24",
          "extension": "",
          "memory": null,
          "ip": "123.217.24.241",
          "index": "kibana_sample_data_logs",
          "message": "123.217.24.241 - - [2018-08-01T07:02:46.200Z] \"GET /enterprise HTTP/1.1\" 200 4460 \"-\" \"Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) Chrome/11.0.696.50 Safari/534.24\"",
          "url": "https://www.elastic.co/downloads/enterprise",
          "tags": [
            "success",
            "info"
          ],
          "geo": {
            "coordinates": {
              "lon": -71.12343,
              "lat": 42.71720944
            },
            "srcdest": "US:US",
            "dest": "US",
            "src": "US"
          },
          "utc_time": "2022-05-25T07:02:46.200Z",
          "bytes": 4460,
          "machine": {
            "os": "ios",
            "ram": 11811160064
          },
          "response": "200",
          "clientip": "123.217.24.241",
          "host": "www.elastic.co",
          "event": {
            "dataset": "sample_web_logs"
          },
          "phpmemory": null,
          "timestamp": "2022-05-25T07:02:46.200Z"
        },
        "_ingest": {
          "timestamp": "2022-05-25T07:23:34.685600556Z"
        }
      }
    }
  ]
}
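
To check a result like this without reading the whole response, a small script can look up the field and report its type. This is only a minimal sketch: the function name field_type_after is hypothetical, and the response here is a trimmed copy of the one above.

```python
def field_type_after(simulate_response, field):
    """Return the Python type name of `field` in the first simulated document."""
    source = simulate_response["docs"][0]["doc"]["_source"]
    return type(source[field]).__name__


# Trimmed copy of the result shown above.
result = {
    "docs": [
        {"doc": {"_source": {"response": "200", "bytes": 4460}}}
    ]
}

print(field_type_after(result, "response"))  # str: converted by the pipeline
print(field_type_after(result, "bytes"))     # int: left untouched
```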