In my previous post on OpenStack’s volume backups I gave an overview of the current status of Cinder’s Backup service, and I mentioned that some of its current limitations could easily be overcome by scripting a helper tool. In this post I’m going to explain the different options for creating such a script and provide one as a sample/reference.
Among the current limitations of the Cinder Backup service, these are some of the things that can be automated:
- Backup in-use volumes keeping reference to original volume’s ID
- Scheduled backups
- Multiple backups/restores at a time
- Export/Import multiple backup metadata at a time
- Persist name and description of original volume in restoration
But there is no reason to limit the script to just a fixing tool. In the spirit of the principle of least effort – in the smart way – you should include all features that would automate your frequent tasks and make your daily life easier, like:
- Backup, as an administrator, your tenants’ volumes
- Hide backed up volumes from the tenants
- Implement backup rotation to keep only last N backups
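To make the last item concrete, backup rotation mostly comes down to grouping backups by volume and selecting everything beyond the newest N. The following is a minimal sketch, not the sample script's actual code. It assumes each backup object exposes `volume_id` and `created_at` (an ISO 8601 string) attributes; the function name `backups_to_delete` is hypothetical.

```python
from collections import defaultdict


def backups_to_delete(backups, keep_only):
    """Return the backups that fall outside the newest `keep_only` per volume.

    Assumption: every backup has `volume_id` and `created_at` attributes,
    and `created_at` is an ISO 8601 string, so sorting the strings sorts
    the backups chronologically.
    """
    if keep_only < 1:
        raise ValueError("keep_only must be at least 1")

    # Group the backups by the volume they were taken from.
    by_volume = defaultdict(list)
    for backup in backups:
        by_volume[backup.volume_id].append(backup)

    expired = []
    for volume_backups in by_volume.values():
        # Newest first; anything past the first keep_only entries is surplus.
        volume_backups.sort(key=lambda b: b.created_at, reverse=True)
        expired.extend(volume_backups[keep_only:])
    return expired
```

The function only selects candidates; actually deleting them is left to the caller. For in-use volumes backed up through a temporary volume, the grouping key would have to be wherever the original volume ID is recorded rather than `volume_id`.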
Once we have decided what we want to automate we have to look into how we are actually going to do it.
One of the beauties of IT is the broad range of possibilities we can choose from to solve any given problem. Usually we’ll pick based on our requirements: time, available resources, knowledge, acceptable results, the environment the solution will be executed in, and so on.
In this particular case there are at least 3 different paths.
1. REST APIs
Every OpenStack service and extension exposes a REST API to access information and operations on your cloud. So you can use this API to create your script in your preferred language.
Unfortunately the Backup API is not documented in the current Block Storage API reference, so you would need to look at the actual code to figure out how to perform backup, restore, import and export operations.
Using this option means that you will have to do everything yourself, from the requests to error handling and response parsing. And don’t forget you would need to manage Keystone’s tokens for your requests as well.
If you are thinking about going down this path, please don’t! It is a waste of resources, as it requires more time and effort to get results than any of the other options, and there are so many better things that you could be doing…
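To give a sense of the plumbing involved, here is a rough sketch that only builds the two HTTP requests needed to create a single backup. It assumes Keystone's v2.0 token API and a tenant-scoped Cinder endpoint taken from the service catalog. The function names are hypothetical, and sending the requests, parsing the catalog, handling errors and polling for completion are all still missing.

```python
import json
from urllib.request import Request


def build_token_request(auth_url, username, password, tenant_name):
    """Build (without sending) a Keystone v2.0 token request.

    Assumption: auth_url already ends in the API version, for example
    http://keystone.example:5000/v2.0
    """
    body = {
        "auth": {
            "tenantName": tenant_name,
            "passwordCredentials": {
                "username": username,
                "password": password,
            },
        }
    }
    return Request(
        auth_url.rstrip("/") + "/tokens",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_backup_request(cinder_endpoint, token, volume_id, name=None):
    """Build (without sending) a request that asks Cinder for a backup.

    Assumption: cinder_endpoint is the tenant-scoped volume endpoint
    found in the token response's service catalog.
    """
    body = {"backup": {"volume_id": volume_id, "name": name}}
    return Request(
        cinder_endpoint.rstrip("/") + "/backups",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
        },
        method="POST",
    )
```

Even in this reduced form, token handling and URL construction are already your responsibility, which is the main argument against this route.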
2. Cinder client’s CLI
Like the smart people we are, we quickly dismissed reinventing the wheel by using the REST APIs directly. So our minds will go next to familiar ground and think of Cinder client’s CLI; after all, we use it every day to manage our cloud and we know it very well, so why not use it to automate our backups?
This is a valid option for your automation script, as you are at least one level of abstraction above our first option and Cinder client is well documented, both online and in the CLI itself. But remember you’ll still have to capture stdout/stderr from your Cinder client CLI calls and parse them for results, which is not a terrible thing but is indeed a bother.
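The parsing burden looks roughly like this. The sketch below turns the ASCII table printed by a command such as `cinder backup-list` into a list of dictionaries. The helper names and the sample table are made up for illustration, and the real column set may differ between client versions.

```python
import subprocess


def parse_cli_table(output):
    """Parse a '+----+' / '| a | b |' style table into a list of dicts."""
    rows = []
    header = None
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ borders and any blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def list_backups_via_cli():
    """Run the CLI (credentials come from the environment) and parse it."""
    output = subprocess.check_output(
        ["cinder", "backup-list"], universal_newlines=True
    )
    return parse_cli_table(output)


# Hypothetical output, shortened for illustration.
SAMPLE = """\
+------+-----------+-----------+
|  ID  | Volume ID |   Status  |
+------+-----------+-----------+
| b-01 |   v-01    | available |
| b-02 |   v-02    |   error   |
+------+-----------+-----------+
"""
```

Calling `parse_cli_table(SAMPLE)` yields one dictionary per backup, keyed by column title. Any change in the table layout between client releases would silently break this kind of parser.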
3. Cinder client’s Python API
You may only be familiar with Cinder client’s command line interface, but the client also exposes a robust and feature-rich Python API (the cinderclient module) you can use in your Python scripts.
This is actually how the Cinder CLI is implemented: on top of the Python module’s API.
The biggest inconvenience of this option is that, since this API is meant for OpenStack’s programmers, it is not well documented and you are expected to look at the code, the module’s section or the CLI’s code, to figure out how to use it.
To be frank, saying that it is not well documented is an understatement, because there’s basically no documentation. But even so, this is the option that makes the most sense to me, because you don’t have to bother parsing any stdout/stderr or calling external programs, and all the code in the end is Python code instead of a mix of a scripting language plus calls to a CLI. And that’s why I chose this option to create the sample script.
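For comparison with the other two options, here is a minimal sketch of what using the Python API can look like. It assumes the positional `Client(version, username, password, tenant, auth_url)` form accepted by the 1.x cinderclient releases. The helper names and the `auto-` name prefix are hypothetical and are not taken from the sample script.

```python
def make_client(username, password, tenant_name, auth_url, version="2"):
    """Create a cinderclient instance (requires python-cinderclient)."""
    # Imported lazily so the helper below can be exercised without the
    # package installed, for example with a stand-in client object.
    from cinderclient import client

    return client.Client(version, username, password, tenant_name, auth_url)


def backup_available_volumes(cinder):
    """Request a backup for every volume that is in 'available' status.

    Returns the backup objects handed back by the client. The backups are
    created asynchronously, so they may still be in progress on return.
    """
    created = []
    for volume in cinder.volumes.list():
        if volume.status != "available":
            continue  # in-use volumes need the snapshot workaround instead
        backup = cinder.backups.create(volume.id, name="auto-" + volume.id)
        created.append(backup)
    return created
```

There is no token handling and no output parsing here: the client returns objects with attributes such as `id` and `status`, which is what makes this route attractive despite the lack of documentation.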
At the OpenStack Summit in Vancouver there was a session called “Dude, where’s my volume? A guide to storage backup, migration, and replication with OpenStack Cinder” that presented an overview of workflows for storage disaster recovery in OpenStack. During this presentation I showcased a script to facilitate the backup/restore workflow, and that is the sample script I’ll be talking about.
Currently this script provides the following capabilities:
- Backup/Restore all volumes:
  - From the tenant
  - From all tenants accessible by provided credentials
- Backups can be hidden from owner tenants if done by admin
- Admin can easily restore either their own backups or tenants’ backups and make them visible/invisible to the original tenant
- Backup rotation: Can limit how many automatic backups are kept for each volume
- Can restore by original volume id or by backup id
- Optional preservation of the original volume’s name and description
- Backup in-use volumes through a temporary snapshot and volume creation/destruction
- Export/Import all automatic backups metadata at once into one file
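The in-use volume workaround from the list above can be sketched as follows: snapshot the attached volume, create a temporary volume from the snapshot, back that up, then clean up. This is an illustration under assumptions rather than the script's code. The function names are hypothetical, and storing the original volume ID in the backup description is just one possible way of keeping the reference.

```python
import time


def wait_for_status(get, resource_id, wanted, timeout=600, interval=2,
                    sleep=time.sleep):
    """Poll get(resource_id) until the resource reaches `wanted` status."""
    waited = 0
    while True:
        resource = get(resource_id)
        if resource.status == wanted:
            return resource
        if resource.status.startswith("error"):
            raise RuntimeError("%s ended in status %s"
                               % (resource_id, resource.status))
        if waited >= timeout:
            raise RuntimeError("timed out waiting for %s" % resource_id)
        sleep(interval)
        waited += interval


def backup_in_use_volume(cinder, volume, sleep=time.sleep):
    """Back up an attached volume through a temporary snapshot and volume."""
    # force=True is needed to snapshot a volume that is attached.
    snapshot = cinder.volume_snapshots.create(volume.id, force=True)
    tmp_volume = None
    try:
        wait_for_status(cinder.volume_snapshots.get, snapshot.id,
                        "available", sleep=sleep)
        tmp_volume = cinder.volumes.create(volume.size,
                                           snapshot_id=snapshot.id)
        wait_for_status(cinder.volumes.get, tmp_volume.id,
                        "available", sleep=sleep)
        # Assumption: the original ID is kept in the description field.
        backup = cinder.backups.create(tmp_volume.id,
                                       name="auto-" + volume.id,
                                       description=volume.id)
        wait_for_status(cinder.backups.get, backup.id,
                        "available", sleep=sleep)
        return backup
    finally:
        # Deletions are asynchronous; real code should wait for the volume
        # to disappear before deleting the snapshot it was created from.
        if tmp_volume is not None:
            cinder.volumes.delete(tmp_volume)
        cinder.volume_snapshots.delete(snapshot)
```

The `try`/`finally` keeps temporary resources from piling up when a step fails, which matters for a job that runs unattended.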
Friendly reminder: Always export your backups’ metadata and keep it safely stored. Remember that without this metadata your backups will be useless if you lose your DB and want to recover the backups into a new/different deployment, because that deployment doesn’t know your backups even exist.
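As a rough idea of what exporting everything into one file involves, the sketch below uses cinderclient's `backups.export_record` and `backups.import_record` calls and stores the records as a JSON list. The JSON layout and the function names are assumptions made for this example and do not describe the file format used by the sample script.

```python
import json


def export_all_metadata(cinder, path):
    """Write the export record of every available backup to one file."""
    records = []
    for backup in cinder.backups.list():
        if backup.status != "available":
            continue  # only completed backups can be exported
        record = cinder.backups.export_record(backup.id)
        records.append({
            "backup_id": backup.id,
            "backup_service": record["backup_service"],
            "backup_url": record["backup_url"],
        })
    with open(path, "w") as handle:
        json.dump(records, handle, indent=2)
    return len(records)


def import_all_metadata(cinder, path):
    """Re-register in Cinder every backup record stored in the file."""
    with open(path) as handle:
        records = json.load(handle)
    for record in records:
        cinder.backups.import_record(record["backup_service"],
                                     record["backup_url"])
    return len(records)
```

The resulting file is what lets a different deployment learn about existing backups, so it should be stored outside the cloud it describes.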
The script has only 2 requirements to run: Python 2.7 and cinderclient v1.1.1 or newer. If an older version is used, some of the features will not be available, since they depend on cinderclient’s support for the all-tenants option.
Cinderclient is available from the python-cinderclient package or from PyPI as python-cinderclient.
Please keep in mind that this is just a sample script; it’s not fail-proof and therefore is not really production ready.
The script is available on GitHub as cinderback, so feel free to give it a try.
By the way, the aforementioned presentation included a quick video showing the script running to illustrate its use. Given the speed of the video, annotations were included to help attendees follow what the script was doing; unfortunately, due to last-minute technical difficulties, the video included in the presentation didn’t have these annotations, which made it a lot harder to follow. But here’s the video as it was meant to be presented.
If you want to automate your backups all you need to do is use the script in a cron job according to your needs.
With the current code you basically have 5 available actions:
- export: Export all automatic backups metadata to a single file
- import: Import multiple backups metadata information from a single file
- list: List backups information including original Volume ID for backed up in-use volumes
- backup: Backup all volumes with optional metadata export to file
- restore: Restore all volumes or single volume with optional metadata import from file
As with any OpenStack client, the optional arguments os-username, os-password, os-tenant-name and os-auth-url are available, and if they are not provided the corresponding environment variables will be used.
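This fallback to environment variables is a common pattern that is easy to reproduce with `argparse`. The sketch below assumes the usual `OS_USERNAME`, `OS_PASSWORD`, `OS_TENANT_NAME` and `OS_AUTH_URL` variable names used by OpenStack clients. The `build_parser` function is hypothetical and is not the script's actual argument handling.

```python
import argparse
import os


def build_parser(environ=None):
    """Build a parser whose credential options default to the environment."""
    # Allow a custom mapping to be passed in, which simplifies testing.
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Backup helper (sketch)")
    parser.add_argument("--os-username",
                        default=environ.get("OS_USERNAME"),
                        help="user name, defaults to OS_USERNAME")
    parser.add_argument("--os-password",
                        default=environ.get("OS_PASSWORD"),
                        help="password, defaults to OS_PASSWORD")
    parser.add_argument("--os-tenant-name",
                        default=environ.get("OS_TENANT_NAME"),
                        help="tenant name, defaults to OS_TENANT_NAME")
    parser.add_argument("--os-auth-url",
                        default=environ.get("OS_AUTH_URL"),
                        help="Keystone URL, defaults to OS_AUTH_URL")
    return parser
```

A value given on the command line wins over the environment, and an option that is set in neither place ends up as `None`, which the caller can check before trying to authenticate.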
Create backups for all our volumes using credentials from environment and keep only the previous backup deleting older ones (backup rotation):
Restore the latest backups for all our volumes using credentials from environment:
List existing automatic backups:
As administrator create backups for all tenants’ volumes, export metadata and hide backups from tenants:
As administrator import metadata and restore all backups created by us (if we created volumes for other tenants they will also be restored) to their original IDs (volumes with those IDs must exist):
As administrator import newest backups for every volume (created by us or by tenants):
Restore only 1 specific automatic backup using the backup id (used for backups other than the latest):
Restore only the latest automatic backup for a specific volume:
List existing backups from all tenants:
Where to go from here
I encourage you to try the script and provide feedback on the bugs you find. And if you feel like it, add more features to the script.
Here is a list of features that would be really useful:
- Allow using filters to select which volumes to backup/restore
- Use Consistency Groups to allow multi-volume consistent backups
- Add incremental backup support once all backends support it in the same manner
- Allow parallel backups when volumes are hosted in different Cinder nodes
- Restore volumes by date