Why Back Up Cloud Volumes?
When working in the cloud, the idea of doing backups like in the old days may seem counterintuitive. After all, the main reasons for having backups are recovering data after it's lost to deletion or corruption, and recovering it from an earlier point in time. You have those covered with fast volume snapshots and fault-tolerant back-ends like Ceph that replicate your volumes' data. So why would you still need old-school backups?
The answer is simple: disasters. What would happen if a flood or an earthquake hit your data center? How would you recover your precious data then?
That's why it's essential to keep your data backed up off-site, in a different place from where your volumes are, in case a disaster comes knocking at your door.
If you have a proper Disaster Recovery Plan (DRP) in place, I'm sure you didn't even have to ask yourself about the purpose of these backups.
Usually a DRP will combine replication, for critical data that needs to be up to date in real or near-real time and can justify the cost, with standard backups that are updated periodically depending on business objectives.
I won't get into what a proper DRP should look like; backups and replication are only a small piece of the puzzle, but they are one of its pillars, so one must always pay careful attention to them.
Now that we've settled that you probably want backups in your organization, let's see how OpenStack enables backing up your volumes.
The Grizzly release introduced a backup service in Cinder, OpenStack's Block Storage service, that allowed full backups to a Swift back-end.
By enabling the Cinder Backup service, OpenStack users are able to create backups of their volumes using Horizon or the command line tools.
Since backup functionality was first introduced, new back-end drivers and features have been added to the service continuously, and more are on the way.
As of the Kilo release, the Cinder Backup service has the following functionality:
- Back up available Cinder volumes
- Multiple drivers available: RGW/Swift, Ceph, TSM, NFS
- Back up encrypted volumes
- Incremental backups
- Metadata support in Cinder Backups
- Ability to import/export backups into Cinder
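As a quick illustration, here is what triggering a backup from the command line looks like. This is only a sketch based on the Kilo-era cinder client (flag names changed in later releases); the volume name is a placeholder, and the commands require a configured OpenStack environment:

```shell
#!/bin/sh
# Sketch of basic Cinder Backup usage with the Kilo-era cinder CLI.
# The volume passed in must be in the "available" state.

nightly_backup() {
    vol="$1"
    # Create a full backup of the volume
    cinder backup-create --display-name "nightly-$vol" "$vol"
    # List existing backups to verify it was created
    cinder backup-list
}

# Example (requires OpenStack credentials sourced in the environment):
# nightly_backup my-volume
```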
Cinder Backup in the Kilo release has some limitations and bugs, and it's important to know what they are so we can keep them in mind when using the service.
But don't let the number of items scare you: they are not really that big of a problem, and most of them have workarounds, have already been fixed upstream, are being actively worked on, or are already planned to be fixed.
The most notable limitations and bugs are:
- Tightly coupled to Volume service
- Single back-end
- Can't back up in-use volumes
- Manual process
- Individual backups
- Can’t tell if backup has dependents
- Limitations on incremental backup
- Backup import into Cinder doesn’t work correctly
- Restore to same volume ID only if volume is available
- Name and description are lost
1. Tightly coupled to Volume service
Right now Cinder Backup is tightly coupled to the Cinder Volume service: it can only back up volumes that are managed by the Cinder Volume service on the same node.
This prevents you from backing up volumes from any node, as you can only back up from your own node; but with good deployment planning the impact can be limited.
The big issue is that this prevents the backup service from scaling out. Since backup is a CPU-bound process, especially with SSL and compression, this is a serious bottleneck.
2. Single back-end
Cinder Backup, unlike Cinder Volume, only allows one back-end to be used for backups.
This affects backups in a similar way to the tight coupling above: you have to back up all your volumes to the same back-end and cannot choose where each volume's backups go.
3. Cannot back up in-use volumes
Currently only available volumes can be backed up, so if you have an in-use volume you want to back up, you have two options:
- Detach the volume, do the backup, and reattach the volume.
- Use a snapshot:
- Create a temporary snapshot
- Create a temporary volume from this snapshot (we cannot back up a snapshot)
- Back up the temporary volume
- Delete temporary volume
- Delete temporary snapshot
The first option is easier and faster, but in most cases it will not be possible, as the volume may be in use and cannot be detached.
The second option has the advantage of allowing an in-use volume to be backed up, but it has multiple disadvantages as well. Besides being a tedious manual process, you will lose the original volume ID, since the backup will be made from the temporary volume instead of the original one. And while you are creating the backup, the temporary snapshot and volume count against the tenant's quota.
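The snapshot-based workaround can be scripted. Below is a minimal sketch assuming the Kilo-era cinder CLI; it parses IDs out of the client's table output, which is fragile but workable, and a real script would also wait for the temporary volume to become available before backing it up:

```shell
#!/bin/sh
# Sketch: back up an in-use volume via a temporary snapshot and volume.
# Assumes the Kilo-era cinder CLI; volume name and size are placeholders.

# Extract a field's value from the cinder client's table output
table_value() {  # usage: cinder ... | table_value <field-name>
    awk -F'|' -v f="$1" '{ gsub(/ /, "", $2) }
                         $2 == f { gsub(/ /, "", $3); print $3 }'
}

backup_in_use_volume() {
    vol="$1"; size="$2"

    # 1. Create a temporary snapshot (--force because the volume is in use)
    snap=$(cinder snapshot-create --force True "$vol" | table_value id)

    # 2. Create a temporary volume from that snapshot
    tmp=$(cinder create --snapshot-id "$snap" "$size" | table_value id)

    # (a real script should poll here until $tmp becomes "available")

    # 3. Back up the temporary volume
    cinder backup-create --display-name "backup-$vol" "$tmp"

    # 4-5. Clean up the temporary volume and snapshot
    cinder delete "$tmp"
    cinder snapshot-delete "$snap"
}
```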
4. Manual process
All backups are triggered manually, using Horizon or the command line, as there is no scheduling functionality that would let you do daily or weekly backups.
Fortunately there is an easy workaround: you can create scheduled backups yourself using the command line tools and cron jobs.
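For example, a nightly backup could be scheduled with a crontab entry like the following. This is just a sketch: the openrc path and volume name are placeholders, and the openrc file must supply your OpenStack credentials:

```shell
# m h dom mon dow  command
# Run a full backup every night at 02:00 (note that % must be escaped in cron)
0 2 * * * . /home/backup/openrc && cinder backup-create --display-name "nightly-$(date +\%F)" my-volume
```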
5. Individual backups
Backups must be done one by one; there is no way to request multiple backups using filters, or even to back up all volumes of a tenant.
This can be solved by creating a script that automates the backups.
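A minimal sketch of such a script, again assuming the Kilo-era cinder CLI and parsing volume IDs out of the client's table output (fragile, but there was no machine-readable output format at the time):

```shell
#!/bin/sh
# Sketch: back up every available volume visible to the current tenant.
# Assumes the Kilo-era cinder CLI.

# Extract the ID column from `cinder list` table output,
# skipping border and header rows
list_ids() {
    awk -F'|' 'NF > 2 { gsub(/ /, "", $2); if ($2 != "" && $2 != "ID") print $2 }'
}

backup_all_available() {
    for vol in $(cinder list --status available | list_ids); do
        cinder backup-create --display-name "auto-$vol" "$vol"
    done
}
```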
6. Can’t tell if backup has dependents
Looking at the detailed information for a backup, you cannot tell whether it has incremental backups that depend on it.
You will find out the hard way when you try to delete it and receive an error. That error is a good thing, because otherwise you would render useless all the incremental backups that depend on that one.
7. Limitations on incremental backup
Incremental backup as a generic feature of Cinder Backup was introduced in the Kilo release. Most drivers didn't have this functionality before, so they inherited it from the new base class that facilitates incremental backups. But at least one driver, Ceph, had implemented it from the start, in the Havana release (yes, you read that right: Ceph had it three releases earlier), enabled by default when Ceph is also used as the volume back-end.
Unfortunately, the Ceph driver is not yet in sync with this new incremental backup interface: you cannot choose between full and incremental backups, and you will not get incremental backups at all if your volume back-end is not Ceph.
So if you are using Ceph for Cinder Backup, it works like this:
- If volumes are stored in Ceph: all your backups will be incremental (this works between different pools in the same cluster, or between different clusters).
- If volumes are stored in a non-Ceph back-end: all your backups will be full backups.
8. Backup import into Cinder doesn’t work correctly
Usually a volume backup can only be restored on the same Block Storage service, because restoring a volume from a backup requires metadata that lives in the database used by the Block Storage service.
That's why the Icehouse release introduced the ability to export/import backup metadata from/to Cinder. This way you can create a backup and export its metadata, knowing that whenever you need to import the backup you will be able to, even if the destination is a completely new deployment.
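The export/import flow looks roughly like this with the Kilo-era CLI. `backup-export` prints a backup_service and backup_url pair that you must keep somewhere safe, ideally off-site together with the backup data itself:

```shell
#!/bin/sh
# Sketch: export and import backup metadata with the Kilo-era cinder CLI.

export_backup_record() {
    # Prints the backup_service and backup_url fields; redirect them to a
    # file and store it off-site along with the backup itself
    cinder backup-export "$1" > "backup-record-$1.txt"
}

import_backup_record() {
    # Arguments are the backup_service and backup_url values saved earlier;
    # on success the backup becomes visible to the destination deployment
    cinder backup-import "$1" "$2"
}
```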
Unfortunately, there is a bug in the import routine that breaks some backups. For example, with Ceph all links to the backup source are lost, and all incremental backups are lost as well, since the parent information is not imported.
9. Restore to same volume ID only if volume is available
Currently you can only restore a backup to the same volume ID it originally had if that original volume's status is available. Any other volume state, including the volume not existing, will result in an error.
This is inconvenient if you are trying to recover your volumes into an empty database and would like to preserve the original volume IDs.
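Restoring with the Kilo-era CLI looks like the sketch below; the --volume-id form only succeeds when that volume already exists and is in the available state:

```shell
#!/bin/sh
# Sketch: restore a backup with the Kilo-era cinder CLI.

restore_backup() {
    backup="$1"; vol="$2"
    if [ -n "$vol" ]; then
        # Restore into an existing volume; it must be "available"
        cinder backup-restore --volume-id "$vol" "$backup"
    else
        # Restore into a brand new volume
        cinder backup-restore "$backup"
    fi
}
```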
10. Name and description are lost
When you back up a volume and then restore it to a new volume, instead of restoring it to the original volume, the volume name and description will not be restored.
What’s being done
Quite a lot of work has already been done on Cinder Backup since the Kilo freeze, and more will be done during the Liberty cycle.
Some examples are:
- Backup in use volumes: Blueprint, Specs
- Fix backup metadata import: patch available
- Restore to volume Id on empty DB: patch available
- Show when backup has dependents: patch available
- Allow force delete backups: patch 1, patch 2
- GlusterFS driver: patch available
- Specifying volume display name on restore: patch available
- Decoupling: Etherpad from Vancouver Design Session
- And so much more…
As we all know, backups should never be neglected, so you should definitely have a look at Cinder Backup if you are not using it already.
Don't let the limitations I mentioned in this post discourage you from giving it a try; none of them is really serious, and once you know they are there, you can work around them without too much effort.
Cinder Backup is a good service, and a lot of work is currently being done on it, so we can expect plenty of good new things from it in the future.