Comment 14 for bug 666211

Jay Freeman (saurik) (saurik) wrote :

Stefan,

These stack traces for pgbouncer are all in sys_write(), btw, which is in turn backed by ext4. Both from what I know about how pgbouncer operates and from grepping through its source code, the only file-backed operation it performs is writing to its log file, which it normally does only once a minute unless it is encountering some kind of connection failure.

It looked (and still looks) to me like the filesystem is simply locking up. It should be noted that my dmesg log also includes another process that got stuck: run-parts, which got blocked in a call to sys_getdents().
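For anyone who wants to check their own dmesg output the same way, here is a rough sketch of a script that lists each hung task and the first syscall frame in its trace. It assumes the kernel's usual hung-task report line ("INFO: task NAME:PID blocked for more than N seconds.") and stack frames of the form "sys_xxx+0x..."; the function name blocked_syscalls is made up for this example, and the patterns may need adjusting for other kernel versions.

```python
import re

# Assumed format of the kernel's hung-task report header line.
HUNG = re.compile(
    r"INFO: task (?P<name>.+?):(?P<pid>\d+) blocked for more than \d+ seconds"
)
# Assumed format of a syscall frame in the stack trace, e.g. "sys_write+0x4a/0x90".
SYSCALL = re.compile(r"\b(sys_\w+)\+0x")


def blocked_syscalls(log_text):
    """Return one dict per hung-task report: task name, pid, and the first
    sys_* frame seen in the trace that follows it (None if there is none)."""
    results = []
    current = None
    for line in log_text.splitlines():
        header = HUNG.search(line)
        if header:
            current = {
                "name": header.group("name"),
                "pid": int(header.group("pid")),
                "syscall": None,
            }
            results.append(current)
            continue
        # Only record the first syscall frame after each header.
        if current is not None and current["syscall"] is None:
            frame = SYSCALL.search(line)
            if frame:
                current["syscall"] = frame.group(1)
    return results
```

Feeding it the saved output of dmesg (for example, the contents of a file you wrote with "dmesg > dmesg.txt") should make it easy to see whether every stuck process is sitting in a filesystem-related syscall.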

Also, I looked into the AIO completion ordering change you mentioned, and it seems totally unrelated. The author of that patch referred to a reproduction of the bug they were fixing, which was a "you now read a bunch of zeros when you were expecting data" race condition, not a deadlock. Specifically, operations involving "unwritten extents" would be reported as "completed" via AIO while they were still pending; the reordering fixed this.

http://www.spinics.net/lists/linux-ext4/msg19590.html
http://thread.gmane.org/gmane.comp.file-systems.ext4/19659

-J