Refresh your elasticsearch index with zero downtime

I’m using a basic elasticsearch-rails setup on one of the applications I’m working on. We sometimes need to change the ES index: add new fields, or rework existing ones. When a full reindex of your database takes at least 10 minutes (it was closer to 1 hour before we reworked the implementation), you can’t really afford that downtime. Since elasticsearch-rails sets things up for you in a certain way, we had to use some of the underlying ES features to get the job done correctly.

So the goal was to completely remove the downtime during a full reindex (deleting and recreating the Elasticsearch index). If you are already using elasticsearch-rails, you are familiar with the nomenclature, and probably with the methods described. A complete reindex is usually done with a rake task like this:

namespace :elasticsearch do
  task reindex: :environment do
    index_name = Person.index_name
    # force: true deletes the existing index before creating it again
    Person.__elasticsearch__.create_index! force: true
    Person.find_in_batches(batch_size: 1000) do |group|
      # one bulk "index" action per record
      group_for_bulk = group.map do |person|
        { index: { _id: person.id, data: person.as_indexed_json } }
      end
      Person.__elasticsearch__.client.bulk(
        index: index_name,
        type: "person",
        body: group_for_bulk
      )
    end
  end
end

This removes the index if it exists and loads the data into a fresh one, which leaves search practically unusable until the task is done. Because this wasn’t an option, we looked into ES aliases and found them helpful. What we needed to do was create the new index under a unique name and fill it while the current one was still operational, then point an alias carrying the original index name at it. So no downtime is needed.

#...
index_name = "#{Person.index_name}_#{SecureRandom.hex}"
client = Person.__elasticsearch__.client
Person.__elasticsearch__.create_index! index: index_name, force: true
Person.find_in_batches(batch_size: 1000) do |group|
  #...
end
if client.indices.exists_alias(name: Person.index_name)
  # collecting old indices that the alias currently points to
  old_indices = client.indices.get_alias(name: Person.index_name).map do |key, val|
    { index: key, name: val['aliases'].keys.first }
  end
else
  # first run: make sure no concrete index named Person.index_name blocks the alias
  client.indices.delete(index: Person.index_name, ignore: 404)
  old_indices = []
end
# creating new alias
client.indices.put_alias(index: index_name, name: Person.index_name)
# removing old indices
old_indices.each do |index|
  client.indices.delete_alias(index)
  client.indices.delete(index: index[:index])
end
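
The listing above adds the alias to the new index and removes it from the old ones in separate calls, so for a short moment the alias points at more than one index. The Elasticsearch aliases API can apply several add and remove actions in a single request, which closes that gap. Here is a minimal sketch of building such an action list; the helper name alias_swap_actions and the index names in the example are made up for illustration and are not part of our task.

```ruby
# Build the action list for one atomic alias swap.
# alias_name and new_index are strings, old_indices is an array of index names.
# (alias_swap_actions is a hypothetical helper, not part of elasticsearch-rails.)
def alias_swap_actions(alias_name, new_index, old_indices)
  # detach the alias from every index that is not the new one
  removals = old_indices.reject { |name| name == new_index }.map do |name|
    { remove: { index: name, alias: alias_name } }
  end
  # attach the alias to the freshly built index
  removals + [{ add: { index: new_index, alias: alias_name } }]
end

# Example with made-up index names:
actions = alias_swap_actions("people", "people_new", ["people_old"])

# Assumed call site inside the rake task, where `client` is the ES client:
#   client.indices.update_aliases(body: { actions: actions })
```

After the swap, the old indices can be deleted exactly as in the loop above, since nothing is reading from them anymore.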

So that is about it: you call bin/rake elasticsearch:reindex and you have refreshed your Elasticsearch index with zero downtime. Of course, you will have to implement some system to track the records that changed while you were reindexing (remember, we are working on a live system, so data is changing all the time). We used Redis for that: we marked the start of reindexing with $redis.set('elasticsearch:reindex_running', true), collected the ids of all changed records in a Redis array, processed them via the regular indexing worker, and deleted the key with $redis.del('elasticsearch:reindex_running') once the alias was linked.
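
The tracking described above could look roughly like the sketch below. It is not our exact code: the class name ReindexTracker and the key elasticsearch:reindex_changed_ids are invented for the example, and it collects ids in a Redis set rather than a list so that a record changed twice is only reprocessed once. The flag is stored as the string "1" because newer versions of the redis gem may refuse to serialize booleans.

```ruby
# Sketch of change tracking during a reindex. `redis` is any object that
# responds to set/get/del/sadd/smembers, for example a redis-rb client.
class ReindexTracker
  FLAG_KEY = "elasticsearch:reindex_running"      # key used in the article
  IDS_KEY  = "elasticsearch:reindex_changed_ids"  # hypothetical key name

  def initialize(redis)
    @redis = redis
  end

  # Call right before the new index is created.
  def start
    @redis.del(IDS_KEY)
    @redis.set(FLAG_KEY, "1")
  end

  def running?
    !@redis.get(FLAG_KEY).nil?
  end

  # Call wherever a record is saved; ids are only kept while a reindex runs.
  def record_change(id)
    @redis.sadd(IDS_KEY, id) if running?
  end

  # Call after the alias points at the new index.
  # Returns the ids (as strings) that should be sent to the indexing worker.
  def finish
    @redis.del(FLAG_KEY)
    ids = @redis.smembers(IDS_KEY)
    @redis.del(IDS_KEY)
    ids
  end
end
```

In the rake task this would wrap the whole run: tracker.start before create_index!, and tracker.finish after the alias is linked, with each returned id handed to the regular indexing worker.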
