Anyone who has had the pleasure of upgrading an Elasticsearch cluster knows what a tedious and sometimes eventful process it can be. Unfortunately, I’ve not yet had the experience of working on large-scale clusters (20+ nodes); however, my experience with slightly smaller clusters has highlighted the need to automate this task.
For those who don’t know, when you upgrade an Elasticsearch cluster, it’s best to perform what’s called a ‘rolling upgrade’. This involves a few steps:
Disabling shard allocation and performing a synced flush
Stopping the Elasticsearch service on one node
Upgrading Elasticsearch on that node
Restarting the Elasticsearch service/node
Waiting for the Elasticsearch node to join the cluster
Re-enabling shard allocation
Waiting for the cluster to return to green
Repeat for every node!
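To make the steps above a bit more concrete, here is a rough Python sketch of the Elasticsearch REST calls that sit behind them. It is only an illustration under a few assumptions: the node answers on http://localhost:9200 without authentication, the helper names (allocation_body, es_request and so on) are made up for this example, and the cluster is old enough to still offer the synced flush endpoint (it was deprecated in 7.6 and removed in 8.0, where a plain POST /_flush is used instead).

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local node, no authentication


def allocation_body(value):
    """Body for PUT /_cluster/settings.

    "none" disables shard allocation; None (JSON null) resets the setting
    to its default, which re-enables allocation.
    """
    return {"persistent": {"cluster.routing.allocation.enable": value}}


def es_request(method, path, body=None, base_url=ES_URL):
    """Send one JSON request to Elasticsearch and return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        base_url + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def before_stopping_node():
    """Disable shard allocation, then perform a synced flush."""
    es_request("PUT", "/_cluster/settings", allocation_body("none"))
    es_request("POST", "/_flush/synced")


def after_node_rejoins():
    """Re-enable shard allocation, then ask the cluster to wait for green."""
    es_request("PUT", "/_cluster/settings", allocation_body(None))
    return es_request(
        "GET", "/_cluster/health?wait_for_status=green&timeout=60s"
    )
```

The health call can give up before the cluster is green, so in practice it needs to be retried in a loop until the reported status is green. Newer Elasticsearch documentation suggests setting allocation to "primaries" rather than "none" during a rolling upgrade; both are valid values for the setting.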
Don’t get me wrong, I absolutely love Elasticsearch! It’s the technology behind HoneypotDB’s data store and search, but when nodes take a long time to rebalance (the wait for the cluster to return to green), an upgrade can take a while.
Ansible to the rescue
Ansible is a really powerful tool. Put simply, it enables remote management of devices and automation of tasks, which sounds perfect for tackling the beast that is Elasticsearch upgrades.
I’ve been working more and more with Ansible lately. I’m using it to automate the deployment of HoneypotDB’s infrastructure and to deploy new pots. Furthermore, all my OS and Elasticsearch updates are automated and run every week by Ansible Tower. It’s great!
I’ve compiled the steps into a single Ansible role, and uploaded it to my GitHub for you to download. Please feel free to give it a try.
Do give the readme a read (lol), as it highlights some important information.
The playbook will:
Check the node’s installed Elasticsearch version is lower than the target version
If it is lower, disable shard allocation and perform a synced flush
Stop the Elasticsearch service
Upgrade to the target version
Reinstall the Elasticsearch S3 snapshot plugin (comment these steps out if you don’t need them)
Restart the Elasticsearch service
Wait for the node to rejoin the cluster
Re-enable shard allocation
Wait for the cluster to return to green
If Kibana is installed on the node, stop and upgrade Kibana then start it up again
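The first check in that list decides whether a node gets touched at all, so it is worth seeing why it has to be a numeric comparison rather than a string one. The Python sketch below is only an illustration of the idea, not the code from the role; the function names are hypothetical and it assumes plain numeric x.y.z version strings with no suffixes.

```python
def parse_version(text):
    """Turn '6.8.1' into (6, 8, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(installed, target):
    """True only when the installed version is strictly lower than the target."""
    return parse_version(installed) < parse_version(target)


# A plain string comparison would rank "6.8.13" below "6.8.2";
# comparing integer tuples avoids that.
print(needs_upgrade("6.8.2", "6.8.13"))   # True
print(needs_upgrade("6.8.13", "6.8.13"))  # False
```

A node that is already on the target version (or a newer one) is simply skipped, which is what makes it safe to re-run the upgrade against the whole cluster.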