Cinder HA Active-Active specs up for review


It’s been some time since I last wrote here about High Availability Active-Active configurations in OpenStack’s Cinder service, and I am quite pleased (and a little embarrassed that it took so long) to announce that all the specs are now up for review.

More difficult and Less difficult

Three months ago I wrote a couple of posts proposing different solutions that would allow Cinder Volume nodes to support Active-Active configurations. One of them played with AMQP’s ACKs and was somewhat complex, while the other focused on using the database. The solutions proposed there had some minor mistakes, but overall they were fine.

Based on those posts, mostly on the simpler version, we now have a series of specifications up for review that together aim to provide a robust and reliable solution to the HA Active-Active issue in Cinder. These, together with their respective implementations, will be grouped under the same Blueprint.

There are 6 different specifications (although 2 of them could have been merged) that address each of the specific issues (races in the code, local file locks, job distribution, etc.) and 1 specification that provides a general view of the problem and a summary of the solutions provided in each of the individual specifications.

Proposed specifications are:

The most important difference between these specifications and the solutions suggested in previous posts is that we no longer need a DLM or new states in the resources (like reading) to ensure mutual exclusion between the different nodes of a cluster. We will now do the locking with the same workers table we use for the cleanup process. To acquire the lock we’ll use a conditional insert query with retries (similar to the compare-and-swap we’ll use to remove API races), and to release the lock we’ll use a delete query. The lock timeout management needed to release locks held by crashed nodes will be handled by the cleanup mechanism.
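As a rough illustration of the idea (not the code from the specs), the acquire and release steps could look like the sketch below. It uses SQLite from the Python standard library so it can run standalone; the `workers` schema, the column names and the `acquire_lock`/`release_lock` helpers are all assumptions made up for this example, and the cleanup of locks left by crashed nodes is not shown.

```python
import sqlite3
import time


def create_workers_table(conn):
    # Hypothetical, simplified schema: one row per locked resource.
    conn.execute(
        "CREATE TABLE workers ("
        " resource_id TEXT PRIMARY KEY,"
        " node TEXT NOT NULL,"
        " updated_at REAL NOT NULL)"
    )


def acquire_lock(conn, resource_id, node, retries=3, delay=0.01):
    """Try to take the lock with a conditional insert; True on success."""
    for attempt in range(retries):
        # The row is inserted only if no row exists for this resource,
        # so the existence check and the insert are a single statement.
        cur = conn.execute(
            "INSERT INTO workers (resource_id, node, updated_at) "
            "SELECT ?, ?, ? "
            "WHERE NOT EXISTS (SELECT 1 FROM workers WHERE resource_id = ?)",
            (resource_id, node, time.time(), resource_id),
        )
        conn.commit()
        if cur.rowcount == 1:
            return True
        if attempt < retries - 1:
            time.sleep(delay)
    return False


def release_lock(conn, resource_id, node):
    """Release the lock with a delete; only the owning node can do it."""
    cur = conn.execute(
        "DELETE FROM workers WHERE resource_id = ? AND node = ?",
        (resource_id, node),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_workers_table(conn)
    print(acquire_lock(conn, "vol-1", "node-a"))
    print(acquire_lock(conn, "vol-1", "node-b"))
    print(release_lock(conn, "vol-1", "node-a"))
```

The number of rows affected by the insert is what tells the caller whether it won the lock, so no separate read is needed before writing.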

With this locking mechanism we’ll be able to avoid using a DLM and make it easier for operators to create Active-Active cluster configurations with Cinder.

The last-minute rush to get these specifications up for review was to facilitate discussions during the OpenStack Summit in Tokyo, since we have 2 specific design sessions on the topic (one on C-vol active/active as a whole and another on the specifics of the Volume Manager Locks), and a good understanding of this complicated matter, which involves many moving pieces, will be essential to being productive.

While I believe this is a viable path to enable High Availability Active-Active configurations in Cinder, these specifications can also be useful to anyone looking for other solutions, as they give a detailed description of the problem as well as a good number of cases that any proper solution needs to be able to handle.

Alternatives to any of the issues, or to the whole approach for that matter, are more than welcome, preferably simpler ones, but they need to be well thought out, since we have already discussed some approaches that failed to meet the expected basic requirements:

  • Work in all cases for the issue they are trying to fix.
  • Play well with the other individual solutions to give a reliable global solution.

The only implementation effort currently in progress for these specifications is the removal of code races on the API nodes using compare-and-swap. It has an ongoing series of 9 patches, with more to come, that still need reviewing. The first patch of the series may seem unrelated to the topic, but it is needed by the atomic conditional updating mechanism implemented in the next patch. That second patch is the base for all the other patches, since it builds the query from a simple syntax.
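To show why compare-and-swap removes this kind of race, here is a minimal standalone sketch using SQLite. It is not the Cinder implementation: the `volumes` table, the `make_db` helper and this free-standing `conditional_update` function are assumptions for illustration only. The point is that the check on the current status and the change to the new status happen in one UPDATE statement, so two concurrent requests cannot both pass the check.

```python
import sqlite3


def make_db():
    # Hypothetical minimal table holding just what the example needs.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE volumes (id TEXT PRIMARY KEY, status TEXT)")
    conn.execute("INSERT INTO volumes VALUES ('vol-1', 'available')")
    conn.commit()
    return conn


def conditional_update(conn, volume_id, new_status, expected_statuses):
    """Set new_status only if the current status is one of the expected ones.

    Returns True if the row was changed, False if the condition failed.
    """
    placeholders = ", ".join("?" for _ in expected_statuses)
    cur = conn.execute(
        "UPDATE volumes SET status = ? "
        f"WHERE id = ? AND status IN ({placeholders})",
        (new_status, volume_id, *expected_statuses),
    )
    conn.commit()
    return cur.rowcount == 1


if __name__ == "__main__":
    conn = make_db()
    # Two delete requests arrive for the same volume: only the first wins.
    print(conditional_update(conn, "vol-1", "deleting", ("available", "error")))
    print(conditional_update(conn, "vol-1", "deleting", ("available", "error")))
```

A read-then-write version of the same logic would let both requests see ‘available’ before either of them writes, which is exactly the race the patches aim to remove.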

The conditional updating mechanism is quite flexible (we’ve even had to add a new feature to SQLAlchemy to support some of its functionality), and it will allow us to do atomic updates using any of the Versioned Objects available in Cinder. Some of the included functionality is:

  • Flexibility in the compare condition:
    • Inclusion: status == ‘available’
    • Exclusion: status != ‘in-use’
    • Multi-inclusion: status in (‘available’, ‘error’)
    • Multi-exclusion: status not in (‘attaching’, ‘in-use’). By default it treats NULL values in the DB like Python treats None instead of like SQL does, so checking for not in (1, 2) on a field with 2 rows in the DB with values NULL and 3 will match both entries instead of just matching 3 like SQL does.
    • SQLAlchemy filters
  • Flexibility in the values to set:
    • Fixed values
    • Conditional values depending on current DB values.
    • Values from another field in the DB
    • Operations on DB fields
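The NULL handling mentioned under multi-exclusion is easy to get wrong, so here is a small standalone sketch of the difference using SQLite. The table `t`, its column `field` and the `matching` helper are made up for this example; the second query shows one way to get the Python-like behaviour in plain SQL, and is not necessarily how the Cinder mechanism builds its query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, field INTEGER)")
# Three rows: NULL, 3 and 1.
conn.executemany("INSERT INTO t (field) VALUES (?)", [(None,), (3,), (1,)])


def matching(where):
    """Return the values of `field` for rows that satisfy the WHERE clause."""
    rows = conn.execute(f"SELECT field FROM t WHERE {where} ORDER BY id")
    return [row[0] for row in rows]


# Plain SQL semantics: NULL NOT IN (1, 2) evaluates to NULL, so the
# NULL row is filtered out and only the row with 3 matches.
sql_style = matching("field NOT IN (1, 2)")

# Python-like semantics: None is simply "not 1 and not 2", so the NULL
# row has to be included explicitly.
python_style = matching("field NOT IN (1, 2) OR field IS NULL")

print(sql_style)     # [3]
print(python_style)  # [None, 3]
```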

One example of using filters would be the delete volume operation that needs to check the existence of snapshots:

 # Conditions the volume row must meet for the update to be applied.
 expected = {'attach_status': db.Not('attached'),
             'migration_status': None,
             'consistencygroup_id': None}
 if not force:
     expected['status'] = ('available', 'error', 'error_restoring',
                           'error_extending')

 # Values to set if all the conditions hold.
 values = {'status': 'deleting',
           'previous_status': volume.model.status,
           'terminated_at': timeutils.utcnow()}

 # Extra SQLAlchemy filter: the volume must not have any snapshots.
 filters = [~sql.exists().where(volume.model.id ==
                                objects.Snapshot.model.volume_id)]

 updated = volume.conditional_update(values,
                                     expected,
                                     filters)

An example of setting values based on current DB fields would be:

 expected = {'status': 'available'}

 has_snapshot_filter = sql.exists().where(
     objects.Snapshot.model.volume_id == volume.model.id)
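 values = {
     # Operation on a DB field: increase the stored size by 1.
     'size': volume.model.size + 1,
     # Conditional value that depends on the current DB contents.
     'status': volume.Case([(has_snapshot_filter, 'has-snapshot')],
                           else_='no-snapshot')
 }

 volume.conditional_update(values, expected)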

Picture: “Cinder blocks” by Tom Simpson is licensed under CC BY NC ND 2.0