How do we back up a million databases every day?
As part of our packages, OVH offers one or more shared SQL databases, along with optional private SQL servers. Both types of databases are backed up daily by OVH - that’s over a million databases. This huge volume presents an interesting challenge: how can we distribute the backups over time to avoid creating a peak load on the infrastructure? What methods do we use to complete these backups? The database team, which is responsible for this at OVH, explains more.
When do we make backups?
Making backups naturally increases the load on the infrastructure, whether it’s a database backup or a backup of any other type of service. The data must be read and written, which generates Input/Output (I/O) load, and it also needs to be transferred over the network from one machine to another.
This problem is well-known to sysadmins, and is the reason why backups are often made during the service’s off-peak hours - usually at night time.
We historically adopted this intuitive approach. For the vast majority of our OVH hosting users, the off-peak period is between 01:00 and 07:00 (French time), so database backups were logically planned to take place during these hours.
Why do we no longer do this? This approach quickly showed its limitations. During the 01:00 - 07:00 slot, our system indicators often turned bright red. We were noticing bottlenecks, dumps (SQL database exports) were taking longer to complete, and the service quality was at risk because the time slot allotted to backups would sometimes run out. This meant that the platform had to try and finish the backups while managing the increasing inflows of early-morning visitors to hosted sites. Basically, it wasn’t working any more.
We then adopted a different approach and spread the backup tasks throughout the day, to maintain optimal performance on hosted websites at any time of the day or night.
The distribution algorithm
To achieve an even distribution of backups throughout the day, we needed to link a time of day with each database.
We couldn’t perform this partition (or “sharding”) using only the name of the database, since a huge number of databases in the infrastructure are simply called "wordpress", "prestashop", or even "test" or "demo". That would have meant all of the “wordpress” databases being backed up at the same time, which would not have been very effective.
Although this may seem counter-intuitive, we also couldn’t shard according to the name of the instance hosting the database. With far fewer instances than databases, the law of large numbers meant that distributing backups throughout the day based on the instance name alone would have resulted in a less even distribution than sharding by database name. We therefore combined these two pieces of information, and we now perform our sharding using a concatenation of the instance name and the database name.
Next, we just had to find a formula linking this string of characters to a time of day, that is, a number between 0 and 1,439 (1,440 being the number of minutes in a day). Since 1,440 is divisible by 16 (thanks to the Babylonians, who adopted an easily-divisible base-60 system for minutes and seconds), we could build this mapping from a simple cryptographic hash, read as a hexadecimal number:
int(hashlib.sha512((instance_name + "." + db_name).encode()).hexdigest()[:90], 16) % 1440
As the following graph shows, the backups are distributed evenly throughout the day.
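For illustration, here’s a small, self-contained Python 3 sketch of that formula. The function name and the sanity check at the end are ours, not OVH’s actual tooling:

```python
import hashlib

def backup_minute(instance_name: str, db_name: str) -> int:
    """Map an (instance, database) pair to a minute of the day (0-1439)."""
    key = (instance_name + "." + db_name).encode()
    # Take the first 90 hex digits of the SHA-512 digest, read them as an
    # integer, then reduce modulo the number of minutes in a day.
    return int(hashlib.sha512(key).hexdigest()[:90], 16) % 1440

# Identically named databases land on different minutes because the
# instance name is part of the hash input. Hash 2,000 "wordpress"
# databases hosted on different instances and count distinct minutes.
slots = {backup_minute(f"instance-{i}", "wordpress") for i in range(2000)}
print(len(slots), "distinct minutes out of 1440")
```

Because SHA-512 output is effectively uniform, the resulting minutes spread evenly across the day regardless of how repetitive the database names are.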
How do we make backups?
Database backups can be made using one of two methods: either by exporting the data in SQL format (a dump), or by saving the databases in the native format of the database management system (DBMS) - MySQL, for example. Both methods have their advantages, disadvantages, and typical use cases. We decided to use a combination of the two, to benefit from the advantages of each one.
The idea behind the dump method is to generate a text file of SQL commands which, when fed back to the server, recreates the database in the same state as it was at the time of the dump. This has the advantage of allowing data to migrate to other versions of the DBMS, or to other engines. It’s also possible to modify the file by hand before you import it back (which can be useful if, for example, you want to clone a production environment for test purposes, but some of the large tables are unnecessary).
Because of these advantages, we use this method and we also give you access to the dumps, which are visible and downloadable directly from your customer control panel. As mentioned above, dumps are performed at the same time every day, and are retained for a one-month rolling period.
For a dump to be consistent (that is, it reflects the status of the database at a given point in time), we can use two methods:
• Lock the tables
• We freeze the database in its current state and place incoming connections on hold, which start to queue up. Once the dump has been completed, we unlock the tables and release the queue.
• The advantage of this method is that it is compatible with all storage engines (including leading engines such as MyISAM and InnoDB), and the dump is always consistent.
• On the downside, the tables are locked, and you have to wait until the end of the dump for the database (and therefore the website) to become accessible again. This is not a problem in the large majority of cases, but it can become one when the dump takes a long time and many connections arrive in the meantime, since the limit on the maximum number of simultaneous connections to the database may be reached before the dump finishes. Increasing the limit would only delay the problem rather than solve it. Limiting the number of connections also protects your server: each connection requires RAM, and beyond 200 simultaneous connections (the maximum allowed for SQL Private instances), the RAM allocated to open connections could cause an Out Of Memory error.
• Use a transaction
• Pros: the dump is completely transparent. No locking. The dump is also always consistent.
• Disadvantages: transactions are only possible with engines that support them (such as InnoDB). This means that if even a single table uses MyISAM, the dump will be inconsistent. Also, even with InnoDB, if an ALTER, CREATE, DROP, RENAME or TRUNCATE TABLE statement is executed during the dump, the dump will be inconsistent.
In view of all this, we have combined the two methods: If your database is 100% InnoDB, then we use a transaction. If not, we use table locking.
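That decision rule is simple enough to sketch. In the hypothetical helper below, `dump_flags` and its input are our names (not OVH’s actual tooling); the two flags are the standard `mysqldump` options for each approach:

```python
def dump_flags(table_engines):
    """Choose mysqldump consistency flags for one database.

    `table_engines` is the list of storage engines reported for the
    database's tables (e.g. from information_schema.tables).
    """
    if table_engines and all(e.lower() == "innodb" for e in table_engines):
        # All-InnoDB database: a repeatable-read transaction gives a
        # consistent dump with no locking.
        return ["--single-transaction"]
    # At least one non-transactional table (e.g. MyISAM): fall back to
    # locking the tables for the duration of the dump.
    return ["--lock-all-tables"]

print(dump_flags(["InnoDB", "InnoDB"]))   # ['--single-transaction']
print(dump_flags(["InnoDB", "MyISAM"]))   # ['--lock-all-tables']
```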
That’s why we strongly recommend that you convert all of your tables to InnoDB, which has shipped with MySQL since version 4, and has been the default engine since MySQL 5.5.5!
Backups in DBMS format
As we have seen, performing a database dump has many advantages in comparison to a “traditional” backup, but it does have a drawback: the time it takes to restore, which depends on the size of the dump.
The dump volume that we create on a daily basis allows us to establish accurate statistics on how long it takes to export and restore databases. Below is the result of the linear regression that we calculated on the durations of over 40,000 exports of 100% InnoDB databases (a choice that means we don’t need to take the waiting time prior to locking the tables into account).
$ ./lr.py list
y = 0.142865163057 * x1 + 4.65205077853
R² = 0.792955
Dump duration according to the database size, in MB
Here, R² is the coefficient of determination. A value of around 0.8 means that the correlation is strong, i.e. determining the dump duration from the database size makes sense, as the two are closely linked.
We wondered whether other parameters should be taken into account when calculating dump and restore times, so we repeated the exercise with several variables (number of rows, average record size, etc.). In the end, the only combination that proved slightly more relevant was taking into account the number of records (rows) in addition to the size of the database:
$ ./lr.py list
y = 2.62412392084 * x1 + 0.130888461045 * x2 + 4.60953501036
R² = 0.800904
Dump duration based on the number of records (in millions) and the size of the database (in MB)
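This kind of two-variable fit is straightforward to reproduce with ordinary least squares. The sketch below uses synthetic data with invented coefficients (not OVH’s measurements), purely to show the method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the export measurements: x1 = rows (millions),
# x2 = size (MB), y = dump duration (s). The "true" coefficients here
# are invented for illustration.
n = 1000
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 2000, n)
y = 2.6 * x1 + 0.13 * x2 + 4.6 + rng.normal(0, 5, n)

# Ordinary least squares: stack a column of ones for the intercept.
A = np.column_stack([x1, x2, np.ones(n)])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Coefficient of determination R².
pred = A @ coef
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - np.mean(y)) ** 2)
print(coef, r2)
```

With enough samples, the fitted coefficients recover the generating ones closely, which is essentially what the `lr.py` output above reports for the real export data.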
Encouraged by these findings, we adapted our strategy to minimise database recovery time in the event of an incident on our systems, because we know that most of the time, an unavailable database means an inaccessible website, which is critical for the user. So, when the size of a database-hosting instance exceeds 4 GB (which corresponds to an average restore time of around 10 minutes), we systematically supplement the dumps with a backup in DBMS format. This allows a much faster restore in the event of a storage cluster problem. These backups, which are performed for all shared database hosting instances (and on a case-by-case basis for SQL Private instances exceeding 4 GB), are not available to users; they are used only in the event of a storage cluster problem. Users only have access to their dumps.
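As a rough cross-check, plugging a 4 GB database into the first regression above lands close to that 10-minute figure (treating dump duration as a proxy for restore time, which is our simplification, not a claim from the article):

```python
# Coefficients from the first regression (duration in seconds, size in MB).
size_mb = 4 * 1024
duration_s = 0.142865163057 * size_mb + 4.65205077853
print(round(duration_s / 60, 1))  # roughly 9.8 minutes
```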
To perform these backups in DBMS format for MySQL and MariaDB, we use XtraBackup in full backup mode (rather than incremental) to make restoration easier. XtraBackup is an open-source tool from Percona which makes consistent backups without locking tables, whatever engine you’re using (MyISAM, InnoDB, etc.). One thing to note is that XtraBackup backs up a whole instance, not just a single database. For PostgreSQL, we use pg_basebackup.
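In command-line terms, the two tools are invoked along these lines. The helper below is a hypothetical sketch (credentials, directory layout and extra options omitted), not OVH’s production code:

```python
def backup_command(dbms: str, target_dir: str) -> list:
    """Build a full physical-backup command line for the given DBMS."""
    if dbms in ("mysql", "mariadb"):
        # Percona XtraBackup in full (non-incremental) backup mode;
        # it backs up the whole instance into target_dir.
        return ["xtrabackup", "--backup", "--target-dir=" + target_dir]
    if dbms == "postgresql":
        # pg_basebackup writes a base backup of the whole instance.
        return ["pg_basebackup", "-D", target_dir]
    raise ValueError("unsupported DBMS: " + dbms)

print(backup_command("mysql", "/backups/instance42"))
```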
Checking tables and RAM requirements
Before carrying out a dump (either daily or on request), we check the state of your tables and repair them as needed. With MySQL and MariaDB instances, we use the mysqlcheck command. This particular concern is gradually disappearing: recent versions of MySQL and MariaDB handle write operations interrupted by crashes increasingly well, and the problem does not exist at all with PostgreSQL.
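For MySQL and MariaDB, that check-and-repair step maps onto standard `mysqlcheck` options. The helper below is an illustration of those options, not OVH’s exact invocation:

```python
def check_command(db_name: str) -> list:
    # --check verifies the tables of the given database; --auto-repair
    # repairs any table that the check reports as corrupted.
    return ["mysqlcheck", "--check", "--auto-repair", db_name]

print(check_command("wordpress"))
```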
Checking a table, and especially repairing one, may take up much more RAM than is available in your instance. That’s why throughout the mysqlcheck and the dump, we temporarily increase the RAM in your instance by adding 4 GB. If your databases are large enough, you can see this extra RAM in your customer control panel:
If the dumps take less than a minute, this extra RAM may go unnoticed between the two actions, so you won't see it in the graph on your customer control panel.
It’s worth noting that this peak in available memory will dominate your graph, making the memory usage chart almost unreadable - so feel free to click on 'Maximum RAM limit' at the bottom of the graph to hide that series and reveal the actual memory usage.
Where do we store the backups?
A backup is only truly useful if it is stored on a separate platform. That’s why we keep your dumps on our Public Cloud Storage platform, which is fully separate from our web hosting database platform. Your backups are replicated synchronously across three separate sites - one in Gravelines, one in Roubaix, and the last in Strasbourg - and kept for a month.
Private SQL backups using XtraBackup and pg_basebackup from the Paris datacenter (web hosting created before July 2016) are stored on a Ceph cluster other than the one used to store production data, while backups from the Gravelines datacenter (web hosting created after July 2016) are stored on local hard disks (and the production data is stored on a Ceph cluster).