Comment 26 for bug 1505948

Revision history for this message
In , Ian (ian-redhat-bugs) wrote :

Description of problem:

After upgrading a node from F20 to F21, node crashes accessing glusterfs volume.
The remaining F20 nodes have no problem accessing the volume.

Aug 16 20:24:25 bagel kernel: [ 1810.077267] ------------[ cut here ]------------
Aug 16 20:24:25 bagel kernel: [ 1810.081945] kernel BUG at mm/slub.c:3413!
Aug 16 20:24:25 bagel kernel: [ 1810.085998] invalid opcode: 0000 [#1] SMP
Aug 16 20:24:25 bagel kernel: [ 1810.090177] Modules linked in: vhost_net vhost m
acvtap macvlan ebt_arp ebtable_nat fuse nfsv3 nfs_acl nfs lockd grace sunrpc fsca
che ebtable_filter ebtables ip6table_filter ip6_tables softdog scsi_transport_isc
si xt_physdev br_netfilter nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_connt
rack nf_conntrack vfat fat coretemp kvm_intel kvm bcache iTCO_wdt crct10dif_pclmu
l ipmi_devintf crc32_pclmul iTCO_vendor_support gpio_ich igb crc32c_intel ptp pps
_core lpc_ich ghash_clmulni_intel i2c_i801 mfd_core ipmi_si dca ipmi_msghandler i
2c_ismt tpm_tis shpchp tpm acpi_cpufreq ast i2c_algo_bit drm_kms_helper ttm drm 8
021q garp mrp tun bridge stp llc bonding
Aug 16 20:24:25 bagel kernel: [ 1810.149526] CPU: 1 PID: 4794 Comm: qemu-system-x
86 Not tainted 4.1.4-100.fc21.x86_64 #1
Aug 16 20:24:25 bagel kernel: [ 1810.157603] Hardware name: Supermicro A1SRM-2758
F/A1SRM-2758F, BIOS 1.2 02/16/2015
Aug 16 20:24:25 bagel kernel: [ 1810.165246] task: ffff88085a1313c0 ti: ffff8803b
09b4000 task.ti: ffff8803b09b4000
Aug 16 20:24:25 bagel kernel: [ 1810.172800] RIP: 0010:[<ffffffff81208532>] [<ff
ffffff81208532>] kfree+0x152/0x160
Aug 16 20:24:25 bagel kernel: [ 1810.180467] RSP: 0018:ffff8803b09b7c98 EFLAGS:
00010246
Aug 16 20:24:25 bagel kernel: [ 1810.185833] RAX: 005ffff80000002c RBX: ffff88020
08b9960 RCX: dead000000200200
Aug 16 20:24:25 bagel kernel: [ 1810.193032] RDX: 000077ff80000000 RSI: ffff88085
a1313c0 RDI: ffff8802008b9960
Aug 16 20:24:25 bagel kernel: [ 1810.200231] RBP: ffff8803b09b7cb8 R08: ffff8803b
09b7c80 R09: ffffea0008022e40
Aug 16 20:24:25 bagel kernel: [ 1810.207431] R10: 0000000000002fe4 R11: 000000000
0000000 R12: 0000000149928000
Aug 16 20:24:25 bagel kernel: [ 1810.214629] R13: ffffffffa02e5c8c R14: ffff8803b
09b7e50 R15: ffff8801009b5600
Aug 16 20:24:25 bagel kernel: [ 1810.221829] FS: 00007f35609ff700(0000) GS:ffff88087fc40000(0000) knlGS:0000000000000000
Aug 16 20:24:25 bagel kernel: [ 1810.229992] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Aug 16 20:24:25 bagel kernel: [ 1810.235799] CR2: 00007fbf24022a98 CR3: 0000000100a81000 CR4: 00000000001027e0
Aug 16 20:24:25 bagel kernel: [ 1810.243001] Stack:
Aug 16 20:24:25 bagel kernel: [ 1810.245037] ffff8802008b9960 ffff8802008b9960 0000000149928000 ffff8803b09b7da8
Aug 16 20:24:25 bagel kernel: [ 1810.252590] ffff8803b09b7d48 ffffffffa02e5c8c 0000000000004800 ffff8806eea842c0
Aug 16 20:24:25 bagel kernel: [ 1810.260145] 0000000000004800 00000001f4000000 000000014992c800 0000000000000000
Aug 16 20:24:25 bagel kernel: [ 1810.267699] Call Trace:
Aug 16 20:24:25 bagel kernel: [ 1810.270189] [<ffffffffa02e5c8c>] fuse_direct_IO+0x20c/0x340 [fuse]
Aug 16 20:24:25 bagel kernel: [ 1810.276525] [<ffffffff811ac2fa>] generic_file_read_iter+0x4ca/0x600
Aug 16 20:24:25 bagel kernel: [ 1810.282941] [<ffffffffa02e22ac>] fuse_file_read_iter+0x4c/0x70 [fuse]
Aug 16 20:24:25 bagel kernel: [ 1810.289531] [<ffffffff81227e1e>] __vfs_read+0xce/0x100
Aug 16 20:24:25 bagel kernel: [ 1810.294810] [<ffffffff8122849a>] vfs_read+0x8a/0x140
Aug 16 20:24:25 bagel kernel: [ 1810.299910] [<ffffffff812295c2>] SyS_pread64+0x92/0xc0
Aug 16 20:24:25 bagel kernel: [ 1810.305186] [<ffffffff8179a76e>] system_call_fastpath+0x12/0x71
Aug 16 20:24:25 bagel kernel: [ 1810.311253] Code: 00 4d 8b 49 30 e9 35 ff ff ff 0f 1f 80 00 00 00 00 4c 89 d1 48 89 da 4c 89 ce e8 ca f9 ff ff e9 73 ff ff ff 0f 1f 44 00 00 0f 0b <0f> 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 89
Aug 16 20:24:25 bagel kernel: [ 1810.336949] RIP [<ffffffff81208532>] kfree+0x152/0x160
Aug 16 20:24:25 bagel kernel: [ 1810.344889] RSP <ffff8803b09b7c98>
Aug 16 20:24:25 bagel kernel: [ 1810.360802] ---[ end trace 76f7ea1ab5ea1b36 ]---

Version-Release number of selected component (if applicable):

kernel-4.1.4-100.fc21.x86_64
glusterfs-fuse-3.5.5-2.fc21.x86_64

How reproducible:

Every time I would start a VM whose disk lived on the gluster volume, the crash would happen immediately. The node would become mostly unresponsive and require a hard reset.

Steps to Reproduce:
1. glusterfs distributed-replicated volume across 3 F20 nodes.
2. upgrade one node from F20 to F21
3. attempt to run a VM on the new F21 node (accessing a disk image on the gluster volume)

Actual results:
Accessing files on gluster volume causes node crash.

Expected results:
No crash.

Additional info: