Unjust "Too many open files"

Bug #1183322 reported by Olaf van Zandwijk
This bug affects 10 people
Affects             Status        Importance  Assigned to         Milestone
Percona XtraBackup  Fix Released  Medium      Hrvoje Matijakovic
2.0                 Fix Released  Medium      Hrvoje Matijakovic
2.1                 Fix Released  Medium      Hrvoje Matijakovic
2.2                 Fix Released  Medium      Hrvoje Matijakovic

(Percona XtraBackup has moved to https://jira.percona.com/projects/PXB)

Bug Description

After upgrading from xtrabackup 2.0.7 to 2.1.3, backups fail.

The 2.1.3 error output is below. However, the mentioned database/table is not wrong or corrupt: CHECK TABLE, OPTIMIZE TABLE, and several other operations all work fine. I then tried the options mentioned in the error output (i.e. setting innodb_force_recovery), but that didn't change anything. The server sees nothing wrong with the table, yet xtrabackup cannot create a backup.

Downgrading to 2.0.7 allows me to make a backup again. I tried to start a new slave from this backup, and that works fine.

xtrabackup: Target instance is assumed as followings.
xtrabackup: innodb_data_home_dir = /srv/mysql/data
xtrabackup: innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup: innodb_log_group_home_dir = ./
xtrabackup: innodb_log_files_in_group = 3
xtrabackup: innodb_log_file_size = 268435456
xtrabackup: using O_DIRECT
130523 13:33:54 InnoDB: Warning: allocated tablespace 10, old maximum was 9
130523 13:33:54 InnoDB: Operating system error number 24 in a file operation.
InnoDB: Error number 24 means 'Too many open files'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.5/en/operating-system-error-codes.html
InnoDB: Error: could not open single-table tablespace file
InnoDB: ./<databasename>/<tablename>.ibd!
InnoDB: We do not continue the crash recovery, because the table may become
InnoDB: corrupt if we cannot apply the log records in the InnoDB log to it.
InnoDB: To fix the problem and start mysqld:
InnoDB: 1) If there is a permission problem in the file and mysqld cannot
InnoDB: open the file, you should modify the permissions.
InnoDB: 2) If the table is not needed, or you can restore it from a backup,
InnoDB: then you can remove the .ibd file, and InnoDB will do a normal
InnoDB: crash recovery and ignore that table.
InnoDB: 3) If the file system or the disk is broken, and you cannot remove
InnoDB: the .ibd file, you can set innodb_force_recovery > 0 in my.cnf
InnoDB: and force InnoDB to continue crash recovery here.

Tags: doc

Revision history for this message
Alexey Kopytov (akopytov) wrote :

The reason is that, with the fix for bug #1079700, XtraBackup does not reuse file descriptors for multiple tablespaces and keeps them open until the corresponding tablespaces are copied into the backup.

This is the only reliable way to fix bug #1079700 and similar ones. The downside is that the operating system limit on the number of open files for the user under which xtrabackup is running must be high enough to allow opening all tablespaces simultaneously.

We should document this for both 2.0 (when the fix for bug #1079700 is released in 2.0.8) and 2.1. Converting to a doc request.

tags: added: doc
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

I think XB can follow mysqld's course, which tries to set the ulimit itself before dropping privileges.

So, in the XB code, if EMFILE is returned, it can probably try to raise the ulimit itself.

Note, there is also the soft limit in ulimit (nofile). So it is also possible for a non-root user (if innobackupex is running as the mysql user) to raise the soft limit until it hits the hard limit.

Revision history for this message
Olaf van Zandwijk (olafz) wrote :

I agree with Raghavendra that XB should follow mysqld's course. For MySQL, there is an option in my.cnf to set open-files-limit to some (high) value. Maybe this (or something similar) is possible for XB?

In my case, XB runs as root, and intuitively I think XB should be able to complete without my having to set the ulimit manually. That would be fixed by Raghavendra's suggestion.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Yes, XtraBackup can at least try to set the file limit to the maximum allowed value. But the problem still holds: the operating system must be configured appropriately so that XtraBackup can open all its files.

Reported this request separately as bug #1183793. Thanks for the feedback, Olaf and Raghavendra!

Revision history for this message
JonathanLevin (boogybo) wrote :

I have this same error. ulimit is set to unlimited and open_files_limit is set to 65535.
How can I resolve this error to use the backup?

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Jonathan,

Are you sure "ulimit -n" for the same user that xtrabackup is using to access data files shows "unlimited"?

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@Jonathan,

"ulimit is set to unlimited and open_files_limit is set to 65535."

This means 'nofile' (which is the file descriptor limit for that user) is not unlimited but limited to 65535.

What is the output of:

ulimit -S -n

and

ulimit -H -n

(which are the soft and hard fd limits, respectively)?

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

"This means 'nofile' (which is the file descriptor limit for that user) is not unlimited but limited to 65535."

Since xtrabackup doesn't check that mysqld variable (reading it from my.cnf is what lp:1183793 is about), only the shell's ulimits apply here.
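
To verify which limits actually apply, one can inspect the running process directly (an editor's sketch, assuming Linux and a single xtrabackup process; the binary name may differ by version):

# Show the soft and hard "Max open files" limits of the running backup process:
grep 'Max open files' /proc/$(pidof xtrabackup)/limits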

Revision history for this message
Virginia Banh (virginia-banh) wrote :

Hello all,

I followed the steps below to modify /etc/security/limits.conf, and it works.

http://stackoverflow.com/questions/14068793/how-to-get-etc-security-limits-conf-changes-reflected-for-processes-running-und

Regards,
Virginia

Revision history for this message
Virginia Banh (virginia-banh) wrote :

1. vi /etc/security/limits.conf as root

* soft nofile 24000
* hard nofile 32000

2. logout and log back in

3. ulimit -n
24000

4. ulimit -S -n
24000

5. ulimit -H -n
32000

Revision history for this message
Jacob Leatherman (jleatherman) wrote :

I actually have this problem as well, but even increasing the ulimit doesn't seem to do the trick. I'm on Ubuntu and don't see a way to set ulimit -n to unlimited.

ulimit -n
1024000

InnoDB: Warning: allocated tablespace 10, old maximum was 9
InnoDB: Operating system error number 24 in a file operation.
InnoDB: Error number 24 means 'Too many open files'.

I have a lot of databases on this server with a lot of small tables and a few large ones. It's about 1.2 TB on disk.

I want to use 2.1 for the compact option, but I can't with this file limit issue.

I'd back up each DB one at a time, but it seems that backing up a single DB pays the entire "crash recovery" process overhead across all DBs.

Any advice?

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Jacob,

How many InnoDB tablespaces do you have, approximately?

Also note that changing ulimits in a user session will only raise those limits for that session. To change them globally, you should update /etc/security/limits.conf and log in again for the new limits to take effect.

Revision history for this message
Jacob Leatherman (jleatherman) wrote :

I was running the backup in the same session for that reason.

What's the best way to check the number of tablespaces? SELECT * FROM information_schema.TABLESPACES yields nothing. Selecting from TABLES locks up the DB (that wasn't fun).

I have ~21,000 DBs, each with ~150 tables.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Jacob,

The following should be the least intrusive way to answer that question:

find /usr/local/mysql/data/ -name '*.ibd' | wc -l

Revision history for this message
Jacob Leatherman (jleatherman) wrote :

Took 9 min to count the files:

4750821

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Thanks, Jacob. That's the highest number of tablespaces I've ever seen. I'll check if we can reconsider the fix for bug #1079700 or at least provide a workaround for cases like this.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

I don't see a way to fix this (i.e. remove the one-descriptor-per-tablespace requirement for XtraBackup) without a risk of losing data on table renames.

Suppose that the fix for bug #1079700 is reverted, i.e. instead of opening all tablespaces on backup start, XtraBackup just opens and closes tablespaces when it has to copy them. In this case, it has to detect tablespace renames and deal with them somehow to avoid losing data:

1. It can detect a renamed tablespace if, at the time it has to open it for copying, the space ID from the first page (or the inode from the filesystem, it doesn't matter) doesn't match the space ID or inode this tablespace had at the time the list of files was created. So detecting such conditions is the easy part.

2. What can be done to still copy the corresponding tablespace even if its name was changed while the backup was in progress? XtraBackup has to discover its new name in the filesystem. In order to do so, it has to scan all tablespaces in all directories under the datadir and find the one having the same space ID or inode values, or consider the tablespace removed if it cannot be found.

3. The above would work fine if either the tablespace could never be renamed again while the directory scan is in progress, or we could scan directories atomically (i.e. get a snapshot of the directories' contents and iterate over it without interfering with concurrent file renames/removals/creation). Nothing guarantees either of those conditions:

  a) there's always a possibility that the tablespace we are looking for is renamed again while the directory scan is in progress (and that in itself may take minutes on servers with huge numbers of tablespaces)

  b) there's no way to scan a directory atomically, let alone scan multiple directories atomically. The opendir()/readdir() combo does not provide any atomicity. For example, if a file is renamed between the opendir() call and subsequent readdir() calls, then depending on timing, we could see the old file name, the new file name, neither, or even both! I am not aware of any utilities that are capable of handling this in a reasonable way: "rm -f", rmdir, cp, rsync -- all may produce inconsistent and unexpected results if there are concurrent directory changes while the directory scan is in progress. So if XtraBackup scans the directory and doesn't find a new name for the tablespace it is looking for, does that mean the tablespace was removed so we can omit it from the backup? No, it could also mean that the tablespace has been renamed again, and we are back to the original problem we were trying to solve with the scan.

With the above in mind, I don't see a better way to handle tablespace renames than what I implemented as a fix for bug #1079700. Any attempt to fix it differently would result in lost data.

So I'm converting this bug back to a documentation request. We should document that:

1. The number of file descriptors required by XtraBackup at the backup stage is at most (number_of_tablespaces_to_copy * 2 + 10), where number_of_tablespaces_to_copy is either the total number of InnoDB tablespaces in the server (for full backups), or the number of tablespaces th...

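A rough pre-flight check based on that formula (an editor's sketch; the datadir path is an example, adjust it to your installation):

# Compare the descriptor requirement against the current soft limit:
n=$(find /srv/mysql/data -name '*.ibd' | wc -l)
echo "descriptors needed: $((n * 2 + 10)), soft limit: $(ulimit -S -n)"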

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Would http://linux.die.net/man/7/inotify help?

Revision history for this message
Alexey Kopytov (akopytov) wrote : Re: [Bug 1183322] Re: Unjust "Too many open files"

Hi Laurynas,

On Thu, 08 Aug 2013 12:26:27 -0000, Laurynas Biveinis wrote:
> Would http://linux.die.net/man/7/inotify help?
>

No. I see many issues with inotify (basically, all the items from the Limitations and caveats section make it inapplicable for XtraBackup's purposes). It can only be used reliably for XtraBackup to be notified when _something_ has changed in the datadir. But as I wrote, detecting changes is easy; acting on them is not.

Revision history for this message
Rob M (arcane47) wrote :

I just want to add that I am suffering the same issue as Jacob. A simple "find /mnt/mysql/ -name '*.ibd' | wc -l" would leave me with over 166741 file descriptors.

My only option now is to revert my XtraBackup setup.

Regards,
Rob M.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Hi Rob,

On Mon, 12 Aug 2013 14:33:55 -0000, Rob M wrote:
> I just want to add that I am suffering the same issue as Jacob. A
> simple "find /mnt/mysql/ -name '*.ibd' | wc -l" would leave me with
> over 166741 file descriptors.
>
> My only option now is to revert my XtraBackup setup.
>

Would raising the user and system limits on file descriptors be an option?

Revision history for this message
Jacob Leatherman (jleatherman) wrote :

I'm confident that my application does not do tablespace renaming during backup windows; can we make this "rename aware" functionality dependent on a switch that is on by default?

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Jacob,

It is possible to make that functionality optional, but that would open another possibility for users to shoot themselves in the foot. Similar to the --no-lock option, it is easy to forget that the option is passed to xtrabackup from some backup script, and then discover that a backup is broken simply because the "no renames" condition does not hold anymore.

What is not clear to me is why people insist on implementing it instead of accepting the suggested workaround: raise the OS limit on file descriptors accordingly. That looks like a simple and feasible solution to me. Am I missing something?

Revision history for this message
markus_albe (markus-albe) wrote :

Alexey: a customer suggested that it "would be possible for xtrabackup to do a rough comparison (find over data directory) with 'ulimit -n' output before it fires up". If the comparison shows that ulimit -n is significantly lower than the number of tablespaces to be backed up, xtrabackup could issue a warning. All that said, it might be hard to tell what a sensible threshold for triggering the warning would be...

Revision history for this message
Jacob Leatherman (jleatherman) wrote :

Alexey,

I can't raise the limit high enough to accommodate the files I have - as I said on July 10, raising it up to the max in Ubuntu doesn't cut it, and there is no "unlimited" option available to me.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Jacob,

You are right. According to http://stackoverflow.com/questions/1212925/on-linux-set-maximum-open-files-to-unlimited-possible, it is impossible to set the open files limit to 'unlimited'; the maximum possible value is 1048576, which is a hard-coded constant in the Linux kernel.

That means an option to disable the "rename aware" functionality is the only possible solution in your case. I reported it as bug #1222062. Let's keep this bug as a documentation request to document the current state of things.
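
For reference, the kernel's per-process cap can be inspected directly (an editor's note; the value varies by kernel version and configuration):

cat /proc/sys/fs/nr_open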

Revision history for this message
Swany (greenlion) wrote :

Another option is to take a hard link of all the files. You can then back up the links. If the database renames a table, the link in the data directory will go away, but the file will still be readable via the hard link made by xtrabackup. Xtrabackup can remove the hard links after completing the backup.

One danger is that if innobackupex dies, it won't clean up the hard links, so this option should probably be used only when the number of tables would exceed the limit.
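
A shell sketch of the hard-link idea (an editor's illustration only, not something xtrabackup implements; it assumes GNU cp and that the linked tree lives on the same filesystem as the datadir):

# Create a hard-linked copy of the datadir tree; a concurrent RENAME in the
# live datadir cannot invalidate the linked copy:
cp -al /srv/mysql/data /srv/mysql/data.links
# ... run the backup against /srv/mysql/data.links ...
rm -rf /srv/mysql/data.links   # removes only the links, not the data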

Revision history for this message
Alexey Kopytov (akopytov) wrote :

@Swany,

Yes, I considered the hard links option when looking for a way to fix this problem. I rejected it for the same reasons: a software/hardware failure may leave the filesystem in a rather bad and hard-to-recover state.

Revision history for this message
Danny Gueta (danny-gueta) wrote :

Hello all,

I'm experiencing this bug as well. Is there any planned fix? Currently I have to revert back to rsync; we have 900 DBs with 194078 .ibd files.

I've tried everything: setting a higher limit in limits.conf and raising the InnoDB open files limit in MySQL. Nothing seems to help.

Any advice? I'm using the latest xtrabackup 2.1.8; here is a small piece of the log:

>> log scanned up to (878733638246)
InnoDB: Allocated tablespace 177580, old maximum was 0
>> log scanned up to (878733978964)
2014-03-09 16:11:33 7f4d2692a740 InnoDB: Operating system error number 24 in a file operation.
InnoDB: Error number 24 means 'Too many open files'.
InnoDB: Some operating system error numbers are described at

Regards,
Danny.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Danny,

Yes, the solution will be provided as a fix for bug #1222062. It should have made it into 2.1.8, but has not due to various reasons. I hope it will be available in the next 2.1 point release.

Revision history for this message
Ovais Tariq (ovais-tariq) wrote :

Users of Percona XtraDB Cluster affected by this bug can set the mysqld option open-files-limit to a high enough value, which according to Alexey's formula is twice the number of tablespaces plus 10. The MySQL manual wrongly states that the maximum value of open-files-limit is 65535. It can certainly be larger than that, as shown in the bug report http://bugs.mysql.com/bug.php?id=72039
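
For example (a hypothetical my.cnf fragment, assuming roughly 200,000 tablespaces):

[mysqld]
open-files-limit = 400010   # 2 * 200000 + 10, per the formula above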

Revision history for this message
Jan Ingvoldstad (jan-launchpad-xud) wrote :

Although opening millions of files is just BAD BAD BAD, Linux doesn't really prevent you.

If you want to shoot that foot and permit 10 million open file descriptors, the following commands run as root may "help" you.

nfiles=10000000
sysctl fs.nr_open=$nfiles
ulimit -n $nfiles

I'm also adding this comment to bug #1222062.

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Jan,

Unfortunately, in some Linux kernel versions (probably just older ones) the hard-coded limit for fs.nr_open is 1048576.

Revision history for this message
Jan Ingvoldstad (jan-launchpad-xud) wrote :

Alexey,

Yes, Linux versions older than 2.6.27 (IIRC) have this hardcoded limit.

Revision history for this message
Fungo Wang (fungo) wrote :

The current precaution of holding all .ibd file handles open at the beginning of the backup is too heavy, especially in a cloud environment: we cannot control the number of tables in a single mysqld instance, and there can be many mysqld instances on one host. The system-level file descriptor limit can easily be exhausted.

I think we can avoid keeping .ibd file handles open as long as we can detect an .ibd mismatch and fail the backup.

I filed another bug for this issue: https://bugs.launchpad.net/percona-xtrabackup/+bug/1741397

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports, so this bug report has been migrated to: https://jira.percona.com/browse/PXB-636
