mysql_upgrade destroys Maria tables?

Bug #944422 reported by nbrnhardt on 2012-03-01
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
Medium
Michael Widenius

Bug Description

Upgrading tables from 5.2 to 5.3.5 with mysql_upgrade seems to cause MySQL to zerofill tables...

[Note] Zerofilling moved table: '.\database\technikervermerke'

which then seems to destroy some tables. At least when you run a check, it states:

Found row with transaction id 808 when max transaction id according to maria_control_file is 84

---

This issue might be linked to:
http://askmonty.org/worklog/Maria-BackLog/?tid=126
http://bugs.mysql.com/bug.php?id=42379 (status: "not supported")

Attached is an archive with the 5.2-state as well as 5.3-state table `technikervermerke`.

If the upgrade causes corrupt tables, this bug should be fixed. If it is an erroneus message, it should be disabled.

Related branches

nbrnhardt (nb-k) wrote :
Elena Stepanova (elenst) wrote :

Hi,

In the provided archive, the 5.2.10-before_upgrade/database folder is empty, there are no tables:

Processing archive: bug.7z

Extracting bug/5.3.5-after_upgrade/database/technikervermerke.frm
Extracting bug/5.3.5-after_upgrade/database/technikervermerke.MAD
Extracting bug/5.3.5-after_upgrade/database/technikervermerke.MAI
Extracting bug/5.3.5-after_upgrade/database
Extracting bug/5.3.5-after_upgrade
Extracting bug/5.2.10-before_upgrade/database
Extracting bug/5.2.10-before_upgrade
Extracting bug

Everything is Ok

Folders: 5
Files: 3
Size: 94640
Compressed: 7905

Changed in maria:
status: New → Incomplete
nbrnhardt (nb-k) wrote :

Sorry, was a "bug" in my brain.

Elena Stepanova (elenst) wrote :

Hi,

Could you please describe in more details how exactly you perform the upgrade?
Do you uninstall the previous version, install the new one and point it at the same exact datadir?
Or do you copy your datadir (or parts of it) for the new instance?
How do you start server and run mysql_upgrade?
What system are you running your server on?

I tried the following with your data:

1.
- Install a new clean 5.3.5 instance,
- copy your database schema from 5.2.10-before_upgrade to the datadir,
- start mysqld, no specific parameters (no errors or zerofill in the log),
- run mysql_upgrade without any parameters (it says OK about the table, no errors or zerofill in the log),
- start mysql client, do CHECK TABLE on the table, it says "Corrupt / Found row with transaction id 7424 when max transaction id according to aria_control_file is 2" (still no errors or or zerofill in the log).

To my understanding, the message about 'transaction id' caused by the fact that I don't have the old aria_control_file.

2.
- install a new clean 5.2.10 instance,
- copy your database schema from 5.2.10-before_upgrade to the datadir,
- start mysqld, no specific parameters (no errors or zerofill in the log),
- do something with the table (to have updates in aria logs),
- stop mysqld;
- install 5.3.5 instance, point it at the same datadir;
- start mysqld;
- run mysql_upgrade without any parameters (it says OK about the table, no errors or zerofill in the log),
- start mysql client, do CHECK TABLE on the table, it says OK.

Here I have aria_control_file synchronized with the table, so no error.

As you can see, in both cases I am not getting the zerofill message, so you apparently do something differently, could you please point it out?

nbrnhardt (nb-k) wrote :

I do the following:

1. Backup datadir
2. Update old program folders with new MariaDB version (leaving datadir as is)
3. Delete the log files from datadir
4. Start server
5. Run mysql_upgrade
6. Check table(s)
7. Get a "Found row with transaction id 7424 when max transaction id according to aria_control_file is 2"

It seems that (3) causes the "Found row..." message.
The log file looks like:
120314 14:55:10 [ERROR] mysql: Table '.\dbb\technikervermerke' is marked as crashed and should be repaired
120314 14:55:10 [Warning] Recovering table: '.\dbb\technikervermerke'

When I stop the server, rename the database folder and restart the server, the result of CHECK TABLE is "ok".
But then the log file looks like:
120314 14:57:11 [Note] Zerofilling moved table: '.\database\technikervermerke'

It seems to me that the transaction id (in the table) should be reset when aria_control_file is deleted.

I attach the complete datadir (w/o mysql database).

Elena Stepanova (elenst) on 2012-03-16
Changed in maria:
status: Incomplete → New
Elena Stepanova (elenst) wrote :

Hi,

Thank you, I can now see all the quoted error messages. The remaining question is, you mentioned earlier that this sequence of events destroys some tables. Did you have any evidence of the data being actually corrupt, or do you base your suspicions on all these warnings and records produced in the CHECK output and/or error log?

Thank you.

nbrnhardt (nb-k) wrote :

Besides one table that gets indeed corrupted (and connected clients receive appropriate error messages), the rest at least works.
We have additional problems filed as separate bug reports where the server says "bye" as soon as there is some load on the server. I try to compile a debug version for Windows next mid-week to get more specific details.

Question : is there any specific reason for your upgrade workflow, rather than mysql_upgrade_service (or even a graphical wizard if you like) . Backup is good, but why delete log files?

any reason why procedures described in http://kb.askmonty.org/en/upgrading-mariadb-on-windows do not work for you?

nbrnhardt (nb-k) wrote :

There is really something going wrong when Maria table logs aren't identical to what they are expected to be. So this is not limited to an uprade, I think.

Following scenario to set up / sync new slave servers (5.2.10):
- stop master
- delete the log files (that is, we have a lot of aria_xxx files)
- copy the data dir to the slaves
- restart master, and start the slaves servers.

On the MASTER following happens:
As soon as the first user uses a a table, MariaDB seems to notice some differences (between table and (non-existing) log?), and issues in the error log:
120327 7:04:52 [Note] Zerofilling moved table: '.\dbv\belegliste'
120327 7:04:54 [Warning] Checking table: '.\dbv\belegliste'

Not only limited to this table, but here it does so here hundreds of times during the next hour.
Then the server got restarted and an REPAIR TABLE belegliste was issued:
120327 8:10:23 [Warning] Checking table: '.\dbv\belegliste'
120327 8:10:23 [Warning] Recovering table: '.\dbv\belegliste'
120327 8:10:23 [Note] Retrying repair of: '.\dbv\belegliste' with keycache
120327 8:10:23 [ERROR] Couldn't repair table: dbv.belegliste

After another restart, the error log tells:
120327 8:11:10 [ERROR] MySQL: Table '.\dbv\belegliste' is marked as crashed and last (automatic?) repair failed

On another REPAIR TABLE, the result is:
120327 8:11:37 [Note] Found 6004 of 0 rows when repairing '.\dbv\belegliste'

After issuing a REPAIR TABLE to all Maria tables, the database is working fine again.

On the SLAVE, `belegliste` gives a "Got error: 176 when reading datafile". Couldn't find anything on Google.

It seems that on a read or write of a MariaDB table, the database server compares the table to the log file that isn't there anymore. Is there any sort of counter that tells MariaDB that a table is 'corrupt'? More specificly, is there a counter stored in the table to sync the position of the log files? If the log files change or get deleted, this counter should automatically re-synced with the new logs.

Any ideas?

If you need the table `belegliste`, I would upload it on FTP.

Michael Widenius (monty) wrote :

The reason you get the 'zerofilling table' is that you removed the aria_log_control file. This is something one should not normally do as it contains some critical information for Aria.

If the aria_log_control file is removed, the Aria engine is re-initialized and transaction id's are starting from 1 again.
The effect of this is that all Aria data files are regarded as 'moved' from another server and have to be zerofilled to be usable.

Zerofilling means that all transaction id's are removed from the table and that all unused data is set to 0.
The first part is needed for the table to work, the second is mainly useful if one wants to compress and archive the table.
No real data is removed from the table. I will document this in the knowledgebase shortly.

In no case should one get the error: Found row with transaction id 808; This is a bug and we will fix it as soon as I get a repeatable test case for it (Elena said she can repeat it and will produce one for me).

Elena Stepanova (elenst) wrote :

To reproduce the error message "Found row with transaction id..", do the following:

- start server (no specific server parameters required);
- run
mysql -uroot -e "DROP DATABASE IF EXISTS db; CREATE DATABASE db; USE db; \
CREATE TABLE t ( i INT ) ENGINE=Aria; BEGIN; INSERT INTO t VALUES (1),(2); COMMIT; \
BEGIN; INSERT INTO t VALUES (1),(2); COMMIT;"
(adjust connection parameter such as port or socket, if needed)

- stop server, e.g. using mysqladmin -uroot shutdown
- remove aria_log* files from the datadir;
- start server on the same datadir;
- run
mysql -uroot -e "CHECK TABLE db.t"
see the output:

+-------+-------+----------+---------------------------------------------------------------------------------------------+
| Table | Op | Msg_type | Msg_text |
+-------+-------+----------+---------------------------------------------------------------------------------------------+
| db.t | check | error | Found row with transaction id 5 when max transaction id according to aria_control_file is 1 |
| db.t | check | error | Corrupt |
+-------+-------+----------+---------------------------------------------------------------------------------------------+

(the actual transaction id in the message might be different depending on the pre-conditions of your test).

Please note that it's important either to use MySQL client in the batch mode (with -e, as described), or start it in interactive mode with -A.

Changed in maria:
assignee: nobody → Michael Widenius (monty)
importance: Undecided → Medium
milestone: none → 5.1
Michael Widenius (monty) on 2012-03-28
Changed in maria:
status: New → In Progress
Michael Widenius (monty) wrote :

The bug was that check table (and also optimize and related admin functions) didn't check if table was moved and needed zerofill before starting to work on them.
I have now fixed that check table will detect that a table is moved and give a proper error in this case.
I also added more warning messages when zerofill happens.

This is pushed into 5.1 and I will merge it to 5.2 and 5.3.

Changed in maria:
status: In Progress → Fix Released
Elena Stepanova (elenst) on 2012-03-29
tags: added: aria
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers