Should we run Elasticsearch on OpenShift?

Suny Kim

If you are wondering about running Elasticsearch in OpenShift, you might be interested in these insights from a project where this was our goal.

What is OpenShift?

OpenShift manages containers. It’s Red Hat’s supported version of Kubernetes, adding some components like a software-defined network. Still, everything below applies to Kubernetes as well. Strictly speaking, OpenShift runs pods, which are collections of containers. For this article, it’s enough to think of OpenShift as a platform that bundles your physical servers into a pool of resources and lets you spin up hundreds of pods on it with a couple of commands. In the pods, you run your applications. They get their share of resources as defined in OpenShift. From their perspective, the pods are servers.

What is the Elastic Stack?

The Elastic Stack consists of an event forwarder, Logstash, that brings your data into a data store, Elasticsearch. Then you can search and analyze it in a web frontend, Kibana. One of the nice features is that you don’t have to worry about the size of your data. As the log volume grows, the stack will happily scale horizontally, serving terabytes of data with ease. This solution is great for centralized log management. And many teams are asking for it.

Wouldn’t it be great if you could combine OpenShift and the Elastic Stack, spinning up stacks with one command?

Kibana and Logstash are fine with OpenShift: they happily scale and respawn, and they are stateless (unless you use Logstash persistent queues, in which case you’ll have some of the data trouble described below). Let’s focus on Elasticsearch. (Wait. Didn’t we just claim that Elasticsearch was so great with scaling? And now it’s the one component that isn’t? That’s right – there’s scaling the Elasticsearch way, and there’s scaling the OpenShift way. We’ll have to go into the details, see below.)

What is cool about OpenShift?

(A) OpenShift abstracts away the underlying servers – you don’t have to worry about the hardware your pods run on.

(B) It’s simple to scale up (and down) pods.

(C) It’s simple to add underlying servers.

(D) OpenShift restarts failed pods for you.

(E) OpenShift assigns pods to services for you. If you define a readiness probe, OpenShift knows when to add it to the service and have it receive traffic, and when to take it out.

(F) Everything is defined in code and configuration.

(G) It’s what cool people do!

What’s not so cool about OpenShift?

(W) It adds another layer of complexity

(X) License fees

(Y) Will it still work under heavy load?

(Z) Wait. Does it work at all?

Yes, it works. The question is if it’s worth the trouble. Now that (Z) is answered, let’s go through the other points.

The Devil’s Details are Eating my Coolness

Before we dive into the details, let me note that these are experiences from early 2018, OpenShift version 3.7 (and Elastic Stack 6.2). Both are moving fast.

Let’s have a closer look at

(A) OpenShift abstracts away the underlying servers – you don’t have to worry about the hardware your pods run on.

Really? Don’t you worry, when you have data that you care about?

The Elasticsearch pod needs storage. Your data has to be protected from the failure of an underlying server. This means you either use a distributed file system (GlusterFS) – which is still much more involved than local storage – or you go with local volumes, which is what we did. Then you have to pin the volume to a specific physical server and the pod to its volume.
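To make the pinning concrete, here is a minimal sketch of what that looks like: a local volume that only exists on one worker, created via oc. This assumes a recent Kubernetes/OpenShift release where local persistent volumes are supported (the mechanics in 3.7 were more primitive); the node name, path and sizes are made up.

```python
# Hypothetical sketch: a local PersistentVolume that only exists on one worker.
# Any pod that claims it is thereby pinned to that physical server.
import json
import subprocess

NODE = "worker-03.example.com"   # assumption: the server that owns the disk

local_pv = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    "metadata": {"name": "es-data-worker-03"},
    "spec": {
        "capacity": {"storage": "900Gi"},
        "accessModes": ["ReadWriteOnce"],
        "persistentVolumeReclaimPolicy": "Retain",
        "storageClassName": "es-local",
        "local": {"path": "/mnt/es-data"},   # disk mounted on that node
        "nodeAffinity": {                    # the volume is only usable on NODE
            "required": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [{
                        "key": "kubernetes.io/hostname",
                        "operator": "In",
                        "values": [NODE],
                    }]
                }]
            }
        },
    },
}

# `oc create -f -` reads the manifest from stdin (JSON is valid YAML).
subprocess.run(["oc", "create", "-f", "-"],
               input=json.dumps(local_pv).encode(), check=True)
```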

  • At this point, let’s look at what’s so cool about Elasticsearch, and how its scaling magic works. It’s best when you install it on some medium-sized, independent servers, say, with 64 GB RAM and 1 TB of SSD disks. Define a reasonable number of replicas (data copies) – not too many, to save space, but enough to ensure availability if nodes fail (a minimal example follows below). Elasticsearch will handle data distribution for you. It will place the replicas on different servers so that if one machine fails, the data won’t be lost, and it will reorganize the data on the remaining nodes. If you need more resources, add a server: Elasticsearch will take it into the cluster and nicely rebalance the data.
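For illustration, setting the number of replicas is a one-liner against the index settings API – a hedged sketch, with URL and index name made up:

```python
# Hedged sketch: one replica per shard means every document lives on two nodes,
# so a single failing server doesn't lose data. Index name and URL are made up.
import requests

requests.put(
    "http://localhost:9200/logs-2018.03/_settings",
    json={"index": {"number_of_replicas": 1}},
    timeout=10,
).raise_for_status()
```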

You lose some of this elegance if the underlying servers aren’t completely independent. If some of them are located in the same rack and may fail at the same time, you have to configure shard allocation awareness so that Elasticsearch doesn’t put all replicas into the same rack. In OpenShift, this means that you need this configuration as soon as you have more than one Elasticsearch node per physical server.
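A hedged sketch of that configuration: each Elasticsearch node gets an attribute describing where it physically runs (in its elasticsearch.yml, e.g. node.attr.rack), and the cluster is told to spread replicas across that attribute. The URL and the attribute name are assumptions.

```python
# Hedged sketch: make Elasticsearch aware of the physical topology so that
# replicas of the same shard don't land on the same rack / OpenShift worker.
# Prerequisite (per node, in elasticsearch.yml): node.attr.rack: <rack-id>
import requests

ES = "http://localhost:9200"   # assumption: where the cluster answers

requests.put(
    f"{ES}/_cluster/settings",
    json={"persistent": {
        "cluster.routing.allocation.awareness.attributes": "rack",
    }},
    timeout=10,
).raise_for_status()
```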

With OpenShift, you lose all of Elasticsearch’s elegance. It cannot manage data distribution for you the way it usually does, because you took the servers away.

And you do the same to OpenShift. Its beauty lies in the pods that freely float in the pool of resources, going down here and coming up there, just like happy ducks in a pond. But with local storage, you brutally pin them down to the underlying servers – the level that you hoped to forget about.

You may look at the illustration now (I know this is the first thing you did, it’s just so much more fun than reading). You are the queen who asked two systems to make your life easier. OpenShift does it in a way that makes it impossible for Elasticsearch to do her job. So you end up with more work to fix this.

 


Configuring Elasticsearch on OpenShift

There are a couple of settings to consider when running Elasticsearch on OpenShift:

  • Elasticsearch hates swapping – OpenShift completely agrees: OpenShift Ansible disables it on all hosts, so there is nothing to do here.
  • File descriptors, max_map_count and number of threads: The Elasticsearch documentation recommends settings under the assumption that one instance runs on one server. How do you set them correctly when you run a thousand instances on one machine? I don’t know. I could come up with some formula involving the probability that n instances hit the maximum at the same time. But I’d say that it makes more sense to monitor these resources.
  • JVM DNS cache settings – in OpenShift, things shift, so you probably don’t want to cache positive hostname resolutions indefinitely.
  • Check if Elasticsearch has a correct idea of the resources it’s allowed to use: disk, memory and CPU (using the _nodes/stats API). If there are any misunderstandings, inform Elasticsearch of the actual values – e.g. by setting the disk watermarks explicitly, or the number of processors (see the sketch after this list). If Elasticsearch requests another core that OpenShift won’t let it use, it may hang.
  • As mentioned, think about data loss on every level (except on the RAID level – use RAID 0 there, because Elasticsearch handles the mirroring). Depending on your setup you may have to configure shard allocation awareness.
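Here is a hedged sketch of that resource check: ask every node what it believes it has, and set explicit disk watermarks if the picture is wrong. The endpoints are the standard Elasticsearch APIs; the URL and the threshold values are assumptions, not recommendations.

```python
# Hedged sketch: compare what Elasticsearch thinks it has with what OpenShift
# actually grants, then set explicit disk watermarks if needed.
import requests

ES = "http://localhost:9200"   # assumption

stats = requests.get(f"{ES}/_nodes/stats/fs,jvm", timeout=10).json()
info = requests.get(f"{ES}/_nodes/os", timeout=10).json()

for node_id, node in stats["nodes"].items():
    disk_gb = node["fs"]["total"]["total_in_bytes"] / 1024 ** 3
    heap_gb = node["jvm"]["mem"]["heap_max_in_bytes"] / 1024 ** 3
    cpus = info["nodes"][node_id]["os"].get("allocated_processors")
    print(f"{node['name']}: disk={disk_gb:.0f}G heap={heap_gb:.0f}G cpus={cpus}")

# If the reported disk doesn't match the space the pod may really use, set the
# watermarks explicitly (placeholder values; in 6.x all of them must use the
# same unit, hence flood_stage is included as well):
requests.put(
    f"{ES}/_cluster/settings",
    json={"persistent": {
        "cluster.routing.allocation.disk.watermark.low": "50gb",
        "cluster.routing.allocation.disk.watermark.high": "20gb",
        "cluster.routing.allocation.disk.watermark.flood_stage": "10gb",
    }},
    timeout=10,
).raise_for_status()
# The `processors` setting is static and belongs into elasticsearch.yml.
```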

 

Next, let’s look at:

(B) It’s simple to scale up (and down) pods.

First, forget about Elasticsearch and autoscaling. There’s an excellent answer on Stack Overflow to the question „How to setup Elasticsearch cluster with auto-scaling on Amazon EC2?“ (EC2 autoscaling and OpenShift autoscaling are similar: if the load of a service gets too high, additional nodes are spun up and added.):

Auto scaling doesn’t make a lot of sense with ElasticSearch.

Shard moving and re-allocation is not a light process, especially if you have a lot of data. It stresses IO and network, and can degrade the performance of ElasticSearch badly. (If you want to limit the effect you should throttle cluster recovery using settings like cluster.routing.allocation.cluster_concurrent_rebalance, indices.recovery.concurrent_streams, indices.recovery.max_size_per_sec . This will limit the impact but will also slow the re-balancing and recovery).

This is still valid, though Elasticsearch recovery got a lot smarter.
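Two of the setting names in the quote are from older Elasticsearch versions; as a hedged sketch under the current names, throttling rebalancing and recovery looks roughly like this (URL and values are assumptions):

```python
# Hedged sketch: limit how aggressively shards are moved around, so recovery
# doesn't eat all I/O and network bandwidth.
import requests

requests.put(
    "http://localhost:9200/_cluster/settings",
    json={"transient": {
        "cluster.routing.allocation.cluster_concurrent_rebalance": 1,
        "indices.recovery.max_bytes_per_sec": "40mb",
    }},
    timeout=10,
).raise_for_status()
```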

When a web server farm gets high load, it helps to add servers. When Elasticsearch is under heavy load, this worsens the situation.

For controlled scaling, keep in mind the persistent volume hassle outlined in (A). In the project, I only got as far as spinning up fresh, small Elastic Stacks in OpenShift. I suspect that this is the easy part compared to scaling horizontally – which would be such a piece of cake on physical servers.

(C) It’s simple to add underlying servers.

I’m not sure about this one. It should be simple, but it wasn’t in our project (maybe due to our automation, see below in (F)). This may be different in other projects – please share your experience! In any case, consider the topics in the Box „You Must be This Tall to Use OpenShift“: How smooth is server provisioning, hardware and basic operating system setup in your company?

(D) OpenShift restarts failed pods for you.

If it happens once in a while, it’s nice to know that some mechanism takes care of it and you don’t have to get up in the middle of the night. Still, investigate every crash. If this happens a lot, sorry: Not cool for data systems. This may help with some buggy web application (though to me „in case of trouble, restart“ does feel like an old school Windows desktop administration strategy). If your Elasticsearch node keeps crashing, it won’t help to keep restarting it. Data systems want to be shut down gracefully. They hate it when their cache gets invalidated.

(E) OpenShift assigns pods to services for you. If you define a readiness probe, OpenShift knows when to add it to the service and have it receive traffic, and when to take it out. 

This is an interesting point. For Elasticsearch, readiness is not as simple as for a web server. Answering „200“ to an HTTP request is good, but not good enough. It means that the node is ready to form a cluster, but not that the cluster is ready to serve data. The startup behaviour of your Elasticsearch node depends on the situation. When new nodes are added to an existing cluster, they simply find the service and join. If you design your readiness probe for this scenario only, adding nodes will work nicely. But at the next full cluster restart, or at cluster creation, every node will wait for the service, forever. Your readiness probe has to be modified: the first nodes to come up create the service, the others join. How does a node know that it’s one of the first? Because there’s no service yet … this works, but again it is not elegant.
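To make this concrete, here is a hedged sketch of such a probe script – not a production probe, and the service name is made up. The idea: a node is ready if it has joined a cluster, or if there is no discovery service with endpoints yet, in which case somebody has to go first.

```python
# Hedged sketch of the readiness logic described above; names are illustrative.
import socket
import sys
import requests

LOCAL = "http://localhost:9200"
DISCOVERY_SERVICE = "elasticsearch-discovery"   # assumption: headless service

def service_has_endpoints(name: str) -> bool:
    """During bootstrap the service resolves to nothing; the first nodes must
    then report ready so they can be added and form the initial cluster."""
    try:
        return bool(socket.getaddrinfo(name, 9300))
    except socket.gaierror:
        return False

def node_joined_cluster() -> bool:
    try:
        health = requests.get(f"{LOCAL}/_cluster/health", timeout=5).json()
        # even "red" means the node is part of a cluster with an elected master
        return health.get("status") in ("green", "yellow", "red")
    except requests.RequestException:
        return False

if node_joined_cluster() or not service_has_endpoints(DISCOVERY_SERVICE):
    sys.exit(0)   # ready: join the service (or create it, if we are first)
sys.exit(1)       # not ready yet
```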

And as with the storage handling you solve something twice: Elasticsearch already manages nodes that leave and join the cluster.

When it comes to more complex orchestration tasks such as „Kibana and Logstash should wait for Elasticsearch to become green“, I didn’t find a way to implement this in OpenShift.

Another nice feature of OpenShift for web server farms is deployments without downtime: you take one server out, update it, take it back in and proceed to the next. Again, I didn’t manage to implement this for Elasticsearch in OpenShift. Restarts of single nodes and rolling upgrades need some actions and intelligence: stop shard allocation, flush to disk, restart one node, restart shard allocation, and wait for the cluster to recover before you proceed.
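Per node, that choreography boils down to something like the hedged sketch below – the actual pod restart is left as a placeholder, and it assumes Elasticsearch 6.x (synced flush was removed in later versions):

```python
# Hedged sketch: the rolling-restart steps from the text, driven via the
# Elasticsearch APIs. URL and timings are assumptions.
import time
import requests

ES = "http://localhost:9200"

def set_allocation(mode: str) -> None:
    requests.put(
        f"{ES}/_cluster/settings",
        json={"transient": {"cluster.routing.allocation.enable": mode}},
        timeout=10,
    ).raise_for_status()

def wait_for_status(wanted: str = "green", poll_seconds: int = 10) -> None:
    while requests.get(f"{ES}/_cluster/health", timeout=10).json().get("status") != wanted:
        time.sleep(poll_seconds)

set_allocation("primaries")                        # 1. stop shard allocation
requests.post(f"{ES}/_flush/synced", timeout=60)   # 2. flush to disk (best effort)
# 3. restart exactly one node here, e.g. delete its pod and wait for it to rejoin
set_allocation("all")                              # 4. restart shard allocation
wait_for_status("green")                           # 5. recover before the next node
```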

Don’t confuse readiness probes with liveness probes. The latter don’t make sense for Elasticsearch. A liveness probe tells OpenShift to kill a container when the application doesn’t work, and we are back at point (D): killing data systems may produce more issues than it solves. Elasticsearch may have a range of reasons for answering slowly or not at all.

(F) Everything is defined in code and configuration

Yes, I agree that „everything as software“ is an advantage of OpenShift. However, I wasn’t particularly happy with OpenShift Ansible, which I found rather messy (Red Hat recently rewrote it). Helm might be better. And of course, if your world is completely automated already anyway, this isn’t an advantage specific to OpenShift.

This and the last point – „(G) It’s what cool people do“ – appear to be the only ones that survive scrutiny unharmed.

 


Should we run Elasticsearch in Docker?

Let’s ask the same question: What’s cool about Docker? For me, it’s that it simplifies installation, because it controls most of the dependencies of your process on its surroundings. Take, for example, a Ruby application. Resolving all the dependencies can be a struggle even for experienced administrators. Once you get everything right, put it into a Docker image and rejoice.

Now, about Elasticsearch – it runs in a JVM, so there never were many dependencies in the first place. Install Java 8. OK, this you have to do. Look at the other system requirements:

  • Disable Swapping
  • Increase File Descriptors
  • Ensure sufficient virtual memory
  • Ensure sufficient threads
  • JVM DNS cache settings

Except for the last point, this has to be done on the system hosting the container. So these aren’t the kind of tasks where Docker helps. The same is true of disk optimizations like setting the correct SSD scheduler or noatime on the data mountpoint.
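For completeness, a hedged sketch of checking these host-level prerequisites – Linux-only, with the thresholds the Elasticsearch documentation recommends:

```python
# Hedged sketch: verify host prerequisites on the machine that will run
# Elasticsearch (or, for the file-descriptor limit, inside the container).
import resource

def proc_int(path: str) -> int:
    with open(path) as f:
        return int(f.read().split()[0])

print("vm.max_map_count:", proc_int("/proc/sys/vm/max_map_count"), "(want >= 262144)")

swap_kb = 0
with open("/proc/meminfo") as f:
    for line in f:
        if line.startswith("SwapTotal:"):
            swap_kb = int(line.split()[1])
print("swap:", swap_kb, "kB (want 0, or use bootstrap.memory_lock)")

soft_nofile, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open file descriptors (soft limit):", soft_nofile, "(want >= 65535)")
```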

Docker and Elasticsearch may give you some headaches with the cluster networking. On the other hand, it may make your life easier when you have to set up lots of instances in different environments. My personal conclusion is: Dockerize Elasticsearch, or don’t – you won’t regret it.


 

What about the Downsides?

Now let’s look at the reasons why you might not want to use OpenShift.

(W) It adds another layer of complexity

True. Think about updates: update the OS, update OpenShift – they come with substantial changes – and update your automation … Also, if this is your first OpenShift cluster, you’ll have to rethink topics such as patch management, hardening, monitoring and handling of logs. (The OpenShift-initiated know that OpenShift comes with its own Elasticsearch–Fluentd–Kibana stack. The Elastic Stack-initiated know that this stack is kind of basic.)

(X) License fees

Not if you run OpenShift Origin. True if you want support. If you have support, this is great, use it. Obviously, keeping part of the Elastic Stack outside the OpenShift cluster will save you licenses.

(Y) Will it still work under heavy load?

Unfortunately, I don’t know – I left the project before the usage got serious. I’m very curious about how the software-defined network will cope with the intra-cluster traffic, and how to recognize issues with software-defined networks.

 


You Must be This Tall to Use OpenShift

The title of this box is an homage to Martin Fowler’s timeless post „You Must be this Tall to Use Microservices“.

In order to really enjoy OpenShift, it helps to be in control of the following topics (putting this more German and less Californian, read it as: „If you don’t control the following topics but still go with OpenShift, you are in trouble“):

  • Repositories (reliable access to packages and docker images)
  • Getting certificates / PKI (if you plan to use TLS)
  • Time on your servers – if time on the servers differs, your cluster will be badly confused.
  • Provisioning new servers, disks, and other resources
  • Configuration and updates of the underlying operating system (note that it may interfere with OpenShift Ansible)

 

Conclusion

Many of the reasons to use OpenShift get less shiny when you put Elasticsearch into the pods. Similarly, Elasticsearch loses much of its beauty. Things are changing around persistent volumes, but as it is now, I’d prefer to run Elasticsearch on some friendly medium-sized servers outside of OpenShift. Be sure to consider your alternatives before you find yourself in a jungle of complexity where you hoped to simplify your world.
