One of my client's apps has a large MySQL database that I regularly have to get a dump of and reload a local database from the dump for dev and testing. To expedite this process, I recently started using the excellent mydumper/myloader in place of the venerable mysqldump. It does shave considerable time off of the reload process, but I got strange results the first few times I tried to use it.
Basically, there were a few tables that look like this in production:
col1 | col2 ------+---------------- 0 | some value 1 | some other value 2 | some third value ... | ...and so on...
When I'd use mysqldump to dump them, and source the dump to load my local db, it would reliably dump/load the same data, in the same order.
When I'd use mydumper/myloader, I'd get something like:
col1 | col2 ------+---------------- 1 | some other value 2 | some third value ... | ...and so on... 20 | some value
The first row (with col1 = 0) would end up at the end of the table and col1 would have the wrong value. The app really cares too, so I had to figure out what was happening, and how to fix it.
This took a while, but the problem ultimately came down to:
- col1 is an auto-increment column
- mydumper and mysqldump both output instructions to:
- create the table
- reset the next-auto-increment value to whatever it was in the database that is being dumped
- insert rows, using the exact values from the database that is being dumped
- by default, if you insert a 0 into an auto-increment column, mysql substitutes the next auto-increment value for the 0
- mysqldump outputs one big .sql file to create tables and insert rows, and it includes instructions to disable this behavior
- mydumper outputs lots of individual files (for each table - a script to create the table, and a binary full of data to load into it) but none of them contain instructions to disable this behavior
So, basically if you source a script created by mysqldump, if col1 = 0, then you get a 0 for col1, but if you use myloader to load the output of mydumper, if col1 = 0, then you get whatever the next auto-increment value is. For all other values of col1, both work as expected.
Working around the problem was tricky. There's no obvious way to add a SET SESSION sql_mode='...,NO_AUTO_VALUE_ON_ZERO' to the mydumper binary data file. I thought about adding it to each of the table-create scripts, but I wasn't sure if the data-load would necessarily be run in the same session as the create, so the SET SESSION might not even be in effect when the data was being loaded.
I could have run a SET GLOBAL sql_mode='...,NO_AUTO_VALUE_ON_ZERO' prior to the dump, but I reboot the system all the time and I'd certainly forget to re-run it the next time, and even if I remembered, I'd forget exactly what it was that I needed to run. The only sure-fire solution was to alter the sql_mode of the server itself.
This involved adding a line to /etc/mysql/my.cnf, in the [mysqld] section, like this:
[mysqld] ... sql-mode = NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,NO_AUTO_VALUE_ON_ZERO
...and bouncing the server. Actually, there's probably some way to get the server to reload the options, but I could easily bounce mine, so I did.
The NO_AUTO_VALUE_ON_ZERO part was the operative part, but I had to add the other options because they were apparently already in effect, according to:
select @@sql_mode
The sketchy part of this solution is that I'm not 100% sure that somewhere deep in that app (or in another app that uses the same DB), there isn't some query that relies on NO_AUTO_VALUE_ON_ZERO being disabled. So, arguably, this isn't the best solution. I don't know of a better one though, so for now I'm going with this.
Ideally, it would be great if mydumper handled this, per-session, itself. I think I'll submit a feature request...
(Update: looks like there's already an open issue for this: https://github.com/maxbube/mydumper/issues/142)