Optimizing Elasticsearch for security log collection – part 1: reducing the number of shards

Nowadays, logs collection for security monitoring is about indexing, searching and datalakes; this is why at NVISO we use Elasticsearch for our threat hunting activities. Collecting, aggregating and searching data at a very high speed is challenging in big environment, especially when the flow is bigger than expected. At NVISO, we are constantly seeking for enhancements and optimizations to increase our performance. We are especially trying to take the maximum benefit of the very latest proven technology available.

In this & the following blog posts, we will show you the performance issues we suffered when using Elasticsearch for log collection. In three subsequent blog posts, I’ll present you how we use the latest features of Elasticsearch to overcome these performance concerns.

Our concerns

Many Elasticsearch environments suffer from the “too many shards” syndrome. Every shard is maintained in memory, and the more shards you have, the more memory you’ll need. Handling too many shards in memory will decrease the performance because the implied overhead of the java garbage collector will be too high.

By default, every created index in Elasticsearch is created with 5 primary shards. It can be good for some environment, but it can be a performance nightmare if that number doesn’t fit your use-cases, which is often the case.

The following Elasticsearch log lines are a sign of a “too many shards” syndrome:

[2018-01-31T18:30:21,491][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6178] overhead, spent [382ms] collecting in the last [1s]
[2018-01-31T18:30:22,507][WARN ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6179] overhead, spent [539ms] collecting in the last [1s]
[2018-01-31T18:35:13,640][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6467] overhead, spent [361ms] collecting in the last [1s]
[2018-01-31T18:35:14,656][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6468] overhead, spent [503ms] collecting in the last [1s]
[2018-01-31T18:35:15,672][INFO ][o.e.m.j.JvmGcMonitorService] [server1] [gc][6469] overhead, spent [427ms] collecting in the last [1s]

Reference: Overhead and heap issues (Elastic.co) –https://discuss.elastic.co/t/overhead-and-heap-issues/117888

As time series data are growing indefinitely, it’s disabling us from using a single index for all the data we want to ingest. At some point, the index will be too large to be queried and we’ll eventually need to create a new one. For years, the best practice was to create a new index every day for new incoming data. With such strategy, we have faced clusters showing very bad performance because there were too many indices with too many shards.

We used to work mostly with a single machine for simplicity and it did work as the amount of data we ingested together with our retention time did not justify the addition of more hardware. To stay in line with best practices dictated by Elasticsearch, we decided in the past to deploy four different instances of Elasticsearch on the same machine in the form of docker containers. We thought that adding more instances until the memory was full will increase the overall performance. However, we were wrong[1].

Not all of our nodes had the same role: one was a coordinating node whereas all the others took the other roles. The coordinating node consumed 10GB whereas the others consumed 32GB of memory where it was possible, making the whole cluster consuming around 106GB of memory. Moreover, in some environment where 106GB of memory wasn’t an option, we reduced the java heap space down to 8GB. This way, we uncontiously forgot to leave any memory available for the file system cache.

Our Logstash configuration managed the rollover of indices by appending the current timestamp at the end of indices’ name. Indices older than 30 days were closed and indices older than 180 days were deleted; we had ten different indices to rollover in average. Our cluster had to manage 30 days x 10 index x 5 shards per index= 1500 shards with sometimes only 8GB of memory per node where most of the shards were sometimes less than 10MB big.

The above concepts resulted at times in severe performance and memory consumption issues; so something had to change. All signals were pointing in the direction of a cluster making use of too many shards.

Rescuing our cluster

To remediate these issues, we decided the following:

  • Use only one Elasticsearch instance per host to give memory to the file system cache.
  • Merge indices from days to months to reduce the number of indices and shards, facilitating short term resolution.
  • Apply a rollover policy to keep indices and shards number under control, to facilitate long term resolution.
  • Integrate this strategy with an Ansible playbook.

This first blog post will only cover the merging of indices and reducing the number of Elasticsearch instances (scale-in). We will talk about shot-term resolutions in order to rescue our suffering cluster.

Reducing the number of shards

It is not an easy task to reduce the number of shards on a cluster; whenever an index is created, you have to re-index your data to reduce the number of shards.

To remediate this issue, we decided to merge all indices within a month to a single index:

POST _reindex
{
  "source": {
    "index": "logstash-eagleeye-suricatafilter-smtp-2019.03.*"
  },
  "dest": {
    "index": "logstash-eagleeye-suricatafilter-smtp-2019.03"
  },
  "script": {
    "inline": "ctx.remove(\"_id\")"
  }
}

We did this operation for every index type (we have 10 in total). We used the default number of shards; however, it can be smart to reduce the number of shard to one on environment with a lot of index type.

When the re-indexing was done, it was time to free up some shards be deleting all the older shards:

DELETE logstash-eagleeye-suricatafilter-smtp-2019.03.*

This operation can be performed safely as we know data from those indices are now indexed in the new index logstash-eagleeye-suricatafilter-smtp-2019.03. Feel free to use the _count API to double check before deleting!

By doing such, we potentially deleted 30 x 10 x 5 = 1500 shards and created 1 x 10 x 5 = 50 shards. Our cluster just came from 1500 shards to 50 shards by deleting 1450 shards. Our new indices were still around 1GB, so we didn’t suffer from any performance issue of an index that was too big. That operation was worth it but implied some temporary performance decrease as re-indexing data is not free in terms of resource consumption.

Reduce the number of Elasticsearch instances

We wasted a lot of memory in our environment by running three different instances of Elasticsearch. There is no performance gain of running three instances instead of one on the same hardware as soon as you keep the same number of shards for your indices. It is better to run a single instance with 32GB of heap space rather than four instances with 8GB each.

In our cluster, we had three data nodes and one coordinating node. The plan was to decommission two data nodes and delete the coordinating node ending with a single instance of Elasticsearch. To perform this, we just excluded two data nodes from the cluster:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.exclude._name": "esnode2,esnode3",
    "cluster.routing.rebalance.enable": "none"
  }
}

This operation moved all the shards from esnode2 and esnode3 to other data nodes available. As we only had three data nodes, all the shards went on the first esnode1 Elasticsearch instance. As for the re-indexing, this operation is quite long, especially when we keep ingesting data in the meantime. When it was done, we could safely delete the two data nodes containing no shards anymore. The coordinating node was also safe to be deleted as it didn’t contain any data.

So, we went from a three nodes cluster with each node of 8GB heap space with a total of 1500 shards each worth 10MB to a healthy single node cluster of 32GB heap space with a total of 50 shards each worth less than 1GB. This is a nice improvement in terms of memory consumption, especially on systems where we had three instances of 32GB and were able to save 42GB of host memory.

We can keep running like this and rollover the indices every month, but it won’t keep things fully under control. What if at some point in time, we ingest more data than expected filling-in indices to more than 200GB? Can we still enhance our indices management strategy and try to reach the best Elasticsearch performance?

In the next blog post, we will show you how to use the latest Elasticsearch features to define and apply a good index lifecycle management policy.

Until next time, and happy clustering!

[1]Elasticsearch strongly relies on the file system cache to reach its performance level. You need some spare memory for the file system cache.

5 thoughts on “Optimizing Elasticsearch for security log collection – part 1: reducing the number of shards

    1. Hi Bob,

      We went from a non-working cluster to a stable one. We didn’t do any benchmarks as it wasn’t even stable to be executed.

      Like

  1. Had good results with going away from time based indexes and use size based rollover. This way you can have two tier storage, hot for rent data on SSD and warm on spinning rust strip raids for older data. Drawback is that you can not guarantee exact number of days/weeks of stored data, but you can use your storage capacity to full.
    Eg. we store 3.36TB in 220 indices with 880 shards on 4 nodes with 6Gb heap each plus one Master and ingest node.

    Like

    1. Hi Peter,

      This is definitely a good idea. We actually use the same strategy in our index lifecycle management presented in a next blog post.

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s