Write a Cinder Backup driver in an evening


Not too long ago I was asked about the complexity of writing a driver in Cinder, and after a brief explanation I closed by saying that, in contrast, backup drivers are a lot simpler. That got me thinking about how easy it actually is to create a backup driver for Cinder, and to check I wrote one for Google Cloud Storage (GCS). In this post I’ll give some pointers so you can see for yourself how easy it really is.

Driver found

Like many things in life, writing a Cinder backup driver is easy as long as you know what you need to do. That is the purpose of this post: to provide directions for anyone who wants to write a backup driver without having to ask for directions at every step or understand every little detail.

When facing the task of writing a backup driver in Cinder we have multiple options, but I’ll explain only the easiest one, because I believe it’s the best one and only in a few cases would we want to go with the more complex solution.

To get your backup driver merged in Cinder there are 4 steps:

1- Client to access the storage backend
2- Cinder backup driver itself
3- Tests for the backup driver
4- Submit and review process

Storage backend client

This is the hardest part of it all, but nowadays many storage backends provide Python bindings, so if you are lucky enough you will have one for the backend you want to use.

In my case there was Google’s API Python client, but I decided to go with my own GCS client because I didn’t like the way that client handled retries (there is no exponential backoff, and some exceptions are not retried at all) and because it doesn’t allow writing an object in chunks.
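For reference, this is roughly what I mean by exponential backoff retries. It is only a minimal sketch and not the code from my GCS client: TransientError and retry_with_backoff are made-up names, and a real client would decide which backend errors count as retryable.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable backend error."""


def retry_with_backoff(func, max_attempts=5, base_delay=0.5, max_delay=30.0,
                       sleep=time.sleep):
    """Call func(), retrying on TransientError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            # The delay doubles on every attempt (0.5, 1, 2, ...), is capped
            # at max_delay, and gets some random jitter so that parallel
            # clients don't all retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * random.uniform(0.5, 1.0))
```

The sleep function is a parameter only so the helper can be exercised without actually waiting.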

But maybe in your case there is no Python client package for your backend, and you will have to implement it on your own. If that is the case I suggest you consider doing so outside of the Cinder code base as it has some benefits.

In any case, you must make sure that the license of the client is compatible with the Apache v2.0 license OpenStack uses to be able to include it in Cinder requirements.

Cinder backup driver

I think it is worth mentioning, before going into the details of the driver, that creating a Cinder backup driver isn’t easy by accident: it is the result of having a nice reference implementation and the right use of OOP inheritance.

Our driver will inherit from ChunkedBackupDriver, which will keep the code base down to a minimum while allowing us to provide all currently existing functionality in our driver – volume backups, volume incremental backups, in-use volume backups, snapshot backups, backup restore – with minimal effort, and will make it easier to support future functionality.

There are 5 things we need to do in the new Cinder module that will be holding our driver:

  • Configuration options
  • Backup class
  • Reader class
  • Writer class
  • Backup instance

Configuration options

To configure our driver we’ll use Oslo configuration options just like any other project and driver in OpenStack. And sure, we’ll have some storage backend specific options that are unique to our driver, but there are also some that will be related to the ChunkedBackupDriver, as in:

  • Default container name to use: this could be a directory, depending on our implementation
  • Size in bytes of the objects, or files depending on our implementation, that volumes are split into
  • The block size in bytes used to track changes when doing incremental backups
  • Enable sending progress notifications to Ceilometer

Backup class

We need to implement all 8 abstract methods defined in ChunkedBackupDriver. These methods perform very limited and specific tasks, and they are as follows:

  • put_container: Given a name this method must ensure that a container/bucket by that name exists, creating it if necessary and not failing if it already does.
  • get_container_entries: Given a container name returns a list with all the names of objects inside it. This is basically an ls command of the container.
  • get_object_reader: Given the names of a container and an object returns a simplified reader object instance.
  • get_object_writer: Given the names of a container and an object returns a simplified writer object instance.
  • delete_object: Given the names of a container and an object deletes the given object from the specified container.
  • _generate_object_name_prefix: Returns the prefix for the container’s entries, to which the base class will append a hyphen (-) and a 5-digit object id to identify each of the chunks that make up the backup. The prefix is usually composed of the availability zone, the backup id, the volume id and a timestamp. Cinder saves this information in the backup’s service_metadata, which is really important for restoring existing backups if we change the prefix format in future driver versions.
  • update_container_name: This method receives the container’s name that will be used and should return either None if it doesn’t wish to change it or a new name if it does.
  • get_extra_metadata: Returns a JSON serializable object that will be passed to the get_object_reader and get_object_writer methods and will be stored in the backend itself. Usually it just returns None.
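To make the list above a bit more concrete, here is a sketch of the 8 methods backed by a plain dictionary instead of a real storage backend. Take it with a grain of salt: the class names are invented, the method signatures (including the prefix argument and the extra_metadata keyword) are my assumption of what the base class expects, the backup is treated as a simple dict, and the prefix format is only illustrative. In Cinder the class would inherit from ChunkedBackupDriver, which is left out here so the example runs on its own.

```python
import datetime


class _MemoryObject(object):
    """Tiny reader/writer over a dict; stands in for a real backend object."""

    def __init__(self, objects, name):
        self._objects = objects
        self._name = name

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # never swallow exceptions

    def read(self):
        return self._objects[self._name]

    def write(self, data):
        self._objects[self._name] = data


class InMemoryBackupDriver(object):
    """The eight methods, storing data in a dict (hypothetical example)."""

    def __init__(self):
        self._containers = {}  # container name -> {object name: bytes}

    def put_container(self, container):
        # Create it if missing, silently succeed if it already exists.
        self._containers.setdefault(container, {})

    def get_container_entries(self, container, prefix=''):
        # The "ls" of the container, optionally filtered by prefix.
        return sorted(name for name in self._containers[container]
                      if name.startswith(prefix))

    def get_object_reader(self, container, object_name, extra_metadata=None):
        return _MemoryObject(self._containers[container], object_name)

    def get_object_writer(self, container, object_name, extra_metadata=None):
        return _MemoryObject(self._containers[container], object_name)

    def delete_object(self, container, object_name):
        del self._containers[container][object_name]

    def _generate_object_name_prefix(self, backup):
        # `backup` is assumed to be a dict here, not a Cinder Backup object.
        now = datetime.datetime.now(datetime.timezone.utc)
        timestamp = now.strftime('%Y%m%d%H%M%S')
        return 'volume_%s/%s/az_%s_backup_%s' % (
            backup['volume_id'], timestamp,
            backup['availability_zone'], backup['id'])

    def update_container_name(self, backup, container):
        return None  # keep the name we were given

    def get_extra_metadata(self, backup, volume):
        return None  # nothing extra to store
```

A chunk name is then the prefix plus the hyphen and 5-digit id mentioned above, for example prefix + '-%05d' % 1.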

Reader class

This is not a normal file/object class that allows you to open, close, seek, read, write, etc.

For backups only the context manager interface – __enter__ and __exit__ methods – and the read method need to be implemented. We don’t need a different class for the reader and the writer either: we can have them as 2 classes or as 1, it doesn’t really matter.

The read method does not need to take any argument, since the base class will be reading the whole chunk in one go.
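As an illustration, this is about all a reader needs. The sketch assumes a backend where each chunk is simply a file inside a directory that plays the role of the container; FileObjectReader and its constructor arguments are hypothetical names, not something taken from Cinder.

```python
import os


class FileObjectReader(object):
    """Hypothetical reader: one backup chunk stored as one local file."""

    def __init__(self, container_path, object_name):
        self._path = os.path.join(container_path, object_name)
        self._file = None

    def __enter__(self):
        self._file = open(self._path, 'rb')
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._file.close()
        return False  # let any exception propagate

    def read(self):
        # No size argument: the whole chunk is returned in one call.
        return self._file.read()
```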

Writer class

Here we have the same situation as with the Reader class: we only need a simplified version of it, so we just need to implement the context manager interface – __enter__ and __exit__ methods – and a write method that writes a whole chunk in one go.
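One possible shape for the writer is to buffer the chunk in memory and send it to the backend when the with block ends without errors. This is a sketch under assumptions: BufferedObjectWriter is a made-up name, and upload is a hypothetical callable wrapping whatever call your storage client uses to store an object.

```python
class BufferedObjectWriter(object):
    """Hypothetical writer: buffers the chunk, uploads it on clean exit."""

    def __init__(self, upload, container, object_name):
        # `upload(container, object_name, data)` is an assumed callable
        # provided by the driver; it is not part of any Cinder API.
        self._upload = upload
        self._container = container
        self._object_name = object_name
        self._data = bytearray()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            # Only send the object if the with block finished cleanly.
            self._upload(self._container, self._object_name,
                         bytes(self._data))
        return False  # never hide an exception from the caller

    def write(self, data):
        self._data.extend(data)
```

Uploading only on a clean exit means a failure in the middle of a chunk never leaves a half-written object behind.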

Backup instance

Our driver module needs to provide a function called get_backup_driver outside of the driver class, that returns an instance of our driver.
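In its simplest form that function is a one-liner. In this sketch MyBackupDriver is just a placeholder for the real driver class, and the context argument is my assumption about what the caller passes in.

```python
class MyBackupDriver(object):
    """Placeholder driver; in Cinder it would subclass ChunkedBackupDriver."""

    def __init__(self, context):
        self.context = context


def get_backup_driver(context):
    # Module-level function, outside the class, that returns the instance.
    return MyBackupDriver(context)
```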

Tests for the backup driver

As you can imagine, like with any other patch in OpenStack you’ll need to provide tests for the new code.

Here you’ll probably want to use any of the other ChunkedBackupDriver tests as a reference, for example Swift’s tests.

Submit and review process

This process is the same as with any other patch in OpenStack: you just need to follow the process outlined in the documentation, and be prompt in replying to comments and updating your patches.

Conclusion

As you can see, implementing a backup driver in Cinder this way is quite easy (here’s my implementation of a GCS backup driver), and you don’t even need to know anything about the workflow of reading volume contents, detecting incremental changes, dealing with the DB, dealing with requests and responses, etc. You just need to implement a few very specific methods and that’s it.

If you do want to know a little bit more about the workflow of incremental backups you can read my post titled “Inside Cinder’s Incremental Backup”.

Looking at Cinder’s code you’ll see that we now have a GCS backup driver and that it’s not mine. The reason is that I wrote mine while on vacation to check, as I mentioned earlier, how easy it was; then I got distracted by my vacation, and when I came back to work I was too busy catching up with the backlog to pick it up again, create the tests and submit it for review.

Fortunately someone else had the same idea, and they actually submitted a patch upstream. Though it was not exactly like mine, since they used Google’s API Python client, the logic was the same and it was really easy for me to review, as I had already implemented it myself.

One last piece of advice: make your code and tests Python 3 compatible, as we are quite advanced in the process of making Cinder Python 2 & 3 compatible.

 


Picture: “Driver found” by Matthew Stevens is licensed under CC BY-NC-ND 2.0


3 thoughts on “Write a Cinder Backup driver in an evening”

  • toandomino

    Great!!!

    I have a question:

    I’m using Cinder volumes to run virtual machines, and I snapshot them (with force) while they are in the ‘in-use’ status and the virtual machine is running. What risks are there for the virtual machine and the snapshot?

    Thanks!

    • geguileo Post author

      To have consistent snapshots, best practice dictates that these should be taken on available volumes (that’s why we need the force flag to snapshot an attached volume). If that’s not possible, the next best thing is to take the snapshot on a volume that is not being written to, even if it is attached.

      If neither of those is an option we can snapshot an attached volume like you say, but if the volume is under heavy use the snapshot may contain an inconsistent file system, not to be confused with a corrupt file system.

      • toandomino

        Thanks bro,

        I’m following your blog. It’s interesting! Please share your knowledge to the community.