M202 July 2014 Third week
The third week's topic was disaster recovery and backup.
Disaster recovery
The first videos covered what disaster recovery is. There was an emphasis on how to configure MongoDB replica sets and sharded clusters, those being the most common production setups.
There are two ways of dealing with network or hardware failure: manual and automatic. From the videos I learned that, at the moment, automatic failover requires a minimum of three distinct network groups. For convenience the videos used the term data centre, but any configuration that keeps at least two network groups up should suffice. The reason is that elections require a majority vote; without one, none of the remaining systems will automatically be promoted to primary.
So the key to successful disaster recovery is: location, location, location. With the smallest recommended number of nodes, three, one location holds the primary, and the other two either each hold a secondary or one holds a secondary and the other a light-weight arbiter. Additional mongod processes should be scattered appropriately across these three locations.
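As a concrete sketch of that three-location layout, the replica set could be initiated as below. The hostnames are hypothetical, and this assumes the variant with an arbiter in the third location:

```shell
# Three-location replica set: data centres A and B each hold a data-bearing
# node, C holds a light-weight arbiter. Any single location can fail while a
# majority (2 of 3 votes) remains available for elections.
mongo --host dc-a.example.net --eval '
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "dc-a.example.net:27017", priority: 2 },      // preferred primary
    { _id: 1, host: "dc-b.example.net:27017" },                   // secondary
    { _id: 2, host: "dc-c.example.net:27017", arbiterOnly: true } // arbiter, holds no data
  ]
})'
```

The arbiter only votes in elections, so losing data centre A still leaves B and C able to elect B as the new primary.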
When it comes to config servers, the minimum number that needs to be up is one, even though the system is only truly healthy when all three are up. Three is also the maximum number of config servers, which means that in a scenario with more than three locations there will be locations without a config server.
Backup
The videos explained several ways to do backups. You can either do live backups (with slight performance degradation in certain scenarios) or take down instances and use more conventional backup processes.
The recommended scenario is of course to do a backup while the system is live. This so-called point-in-time backup is supported by many systems, for instance the LVM layer in Linux and EBS volumes on Amazon.
The other scenario is to stop writing to disk using [db.fsyncLock()](http://docs.mongodb.org/manual/reference/method/db.fsyncLock/ "MongoDB docs: method db.fsyncLock()") and then copy the data to a backup location using conventional tools such as `cp`, `scp` or `rsync`. Although some file systems provide a way to become read-only, in case you want to back up more than just mongod data, it's still necessary to explicitly tell the mongod process to stop attempting writes to disk; otherwise lots of pointless errors will follow.
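A minimal sketch of that lock-copy-unlock cycle could look like this; the paths and backup host are hypothetical, and you'd typically run this against a secondary rather than a primary, since fsyncLock blocks writes for the duration:

```shell
# Flush pending writes to disk and block further writes on this mongod.
mongo --port 27017 --eval 'db.fsyncLock()'

# Copy the now-quiescent data files with a conventional tool.
rsync -a /data/db/ backup-host:/backups/mongodb/

# Resume normal write operations.
mongo --port 27017 --eval 'db.fsyncUnlock()'
```

The window between lock and unlock should be kept as short as possible, which is another argument for fast incremental tools like `rsync` over a plain `cp`.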
When you make a backup of the data, you will usually back up the oplog as well. The oplog helps get a restored system up to speed. Without an oplog backup you're crippling the restore process and increasing the chance of failure.
Backing up with RAID
A special scenario was backing up data stored on a RAID configuration. Depending on the exact timing of the backup, data across the RAID might be in an inconsistent state; more specifically, part of a stripe in a striped volume might not be written yet. The chance of this increases with the number of disks. Solutions live in more abstract layers such as LVM which, according to the videos, can guarantee a point-in-time snapshot. There's a special section on RAID 10 on EBS volumes in the MongoDB documentation, since this seems to be a difficult task.
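An LVM-based point-in-time backup could be sketched as below. The volume group and logical volume names are hypothetical; this assumes journaling is enabled and all mongod data lives on the one volume being snapshotted:

```shell
# Take a copy-on-write snapshot of the volume holding the mongod data files.
# The snapshot is a frozen, consistent image; mongod keeps running meanwhile.
lvcreate --snapshot --size 10G --name mongo-snap /dev/vg0/mongo-data

# Mount the snapshot read-only and archive it to the backup location.
mkdir -p /mnt/mongo-snap
mount -o ro /dev/vg0/mongo-snap /mnt/mongo-snap
tar -czf /backups/mongodb-$(date +%F).tar.gz -C /mnt/mongo-snap .

# Clean up: long-lived snapshots degrade write performance on the origin volume.
umount /mnt/mongo-snap
lvremove -f /dev/vg0/mongo-snap
```

The `--size` given to the snapshot only needs to hold the blocks that change while the backup runs, not the whole data set.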
Test your restore process
What often happens is that people make backups and feel safe. They are right, it is safe to have a backup, but pointless if you can't restore from it (fast enough).
The most important part of backups is that you can always restore from them to get your data back to a known state. This goes for everything from the latest backup to the oldest one. One of the videos mentioned a situation where people did test restores from the latest backup, but weren't able to restore the backup from the week before because the compression software contained a bug that corrupted the data.
In the case of MongoDB you want to make sure your restore is fast enough. For this you need to take into consideration not only how fast the backed-up data is moved to a (new) location, but also how fast it will be available and caught up with the rest of the replica set.
The time available for the restore process is mainly dictated by the amount of time between the first and last operation in the replica set's oplog, also referred to as the oplog window. The course recommends a decent buffer so that there's room for error. Useful techniques to at least warm up the node, and so speed up the restore process, were covered in week 2 in the pre-heating videos.
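Checking the oplog window on a member is straightforward with the built-in shell helpers; a restore plus catch-up has to fit within the reported log length:

```shell
# Human-readable summary: oplog size, "log length start to end" (the window),
# and the timestamps of the first and last oplog entries.
mongo --port 27017 --eval 'rs.printReplicationInfo()'

# The same window as a raw number of seconds, for scripting or monitoring.
mongo --port 27017 --eval 'print(db.getReplicationInfo().timeDiff)'
```

If the window is, say, 24 hours, a restore process that takes 30 hours to copy and catch up will never succeed without a full resync, hence the recommended buffer.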
Mongodump and its use
In the second week I posted a question on the discussion forum asking why `mongodump`/`mongorestore` was not mentioned as a way to shrink disk usage. The teaching assistant said the discussion could be picked up after the completion of the second week, but that didn't happen. As I expected, the third week's videos provided some sort of answer, although not as complete as you might hope.
So `mongodump` is an excellent tool to dump the contents of collections, either online on a live system or offline when the data files are accessible. It can even record some of the oplog to replay later. There are however a few disadvantages to a dump made with this tool:
- indexes are only stored as descriptions; they need to be rebuilt after restoring, because their data isn't dumped
- on a live system, running `mongodump` will page all data into memory, which will likely force other data out
- on large data sets, creating a dump will take a lot of time, and the same goes for restoring them
In my mind collection compaction and [repairDatabase()](http://docs.mongodb.org/manual/reference/command/repairDatabase/#dbcmd.repairDatabase "MongoDB docs: repairDatabase() command") also take a lot of time and memory, with the difference that they at least seem to keep indexes intact.
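A dump-and-restore round trip with oplog capture could be sketched as follows; host names and paths are hypothetical, and `--oplog` only works against a replica set member:

```shell
# Dump all databases from a live replica set member. --oplog also records the
# operations that happen *during* the dump, so the result is a consistent
# point-in-time snapshot rather than a fuzzy copy.
DUMP_DIR=/backups/dump-$(date +%F)
mongodump --host localhost:27017 --oplog --out "$DUMP_DIR"

# Restore on the target, replaying the captured oplog tail. Indexes are only
# stored as descriptions in the dump, so they are rebuilt here, which is part
# of why restoring large data sets takes so long.
mongorestore --host restore-target:27017 --oplogReplay "$DUMP_DIR"
```

Because the restore rewrites all documents compactly and rebuilds the indexes, this is also the round trip that reclaims disk space, which was the point of my forum question.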
Backup and MMS
MongoDB Management Service (MMS) has some backup functionality as well. This SaaS solution creates several backups, with the option to request point-in-time restores; these are created from the last backup patched with replication oplog data up to the requested time. Currently data is not encrypted at rest, because that would obstruct the point-in-time restore in some way; the only layer of security is that two-factor authentication is needed to restore data. It is possible to get the same functionality on premise with the enterprise edition of MongoDB.
Homework
The homework of week 3 looked daunting at first and there were some tough questions. With some logical thinking and making notes on paper I made it through.
This week gave me new insights into not just how to manage MongoDB, but also how to improve my skills in general. The fourth week will cover fault tolerance and availability. I am looking forward to what I might learn then.