xfs_growfs fails on EBS volumes created from snapshots with increased size

Bug #1236041 reported by Shlomo on 2013-10-06
This bug affects 11 people
Affects          Importance  Assigned to
linux (Ubuntu)   Medium      Luis Henriques
Precise          Medium      Unassigned
Quantal          Medium      Unassigned

Bug Description

1. Create a new EBS volume with 1 GB size, attach it, format it with xfs.
2. Create a snapshot of the volume.
3. Create a new volume from the snapshot, with 2GB size.
4. Attach the new volume and mount it.
5. sudo xfs_growfs /newvol
meta-data=/dev/xvdy              isize=256    agcount=4, agsize=65536 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=262144, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
xfs_growfs: XFS_IOC_FSGROWFSDATA xfsctl failed: Cannot allocate memory
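
The geometry printed above is enough to work out what xfs_growfs is being asked to do. A minimal Python sketch, assuming the "2GB" volume from step 3 is 2 GiB and that the grown filesystem keeps the same allocation-group size (both are assumptions, not stated in the report):

```python
# Values copied from the xfs_growfs output above.
BSIZE = 4096      # data bsize: bytes per filesystem block
BLOCKS = 262144   # data blocks in the existing filesystem
AGSIZE = 65536    # blocks per allocation group (AG)
AGCOUNT = 4       # AGs in the existing filesystem

# Assumption: the new EBS volume is exactly 2 GiB.
NEW_DEVICE_BYTES = 2 * 1024**3

fs_bytes = BSIZE * BLOCKS
new_blocks = NEW_DEVICE_BYTES // BSIZE
# Assumption: the AG size is unchanged by the grow.
new_agcount = new_blocks // AGSIZE

print(f"current filesystem: {fs_bytes} bytes in {AGCOUNT} AGs")
print(f"after growing:      {new_blocks} blocks in {new_agcount} AGs")
```

Under those assumptions the existing filesystem covers exactly 1 GiB, and growing it to fill the device would double the allocation-group count from 4 to 8.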

/var/log/syslog has this:

Oct 6 17:57:42 playground1 kernel: [3498992.808304] XFS (xvdy): _xfs_buf_find: Block out of range: block 0x380001, EOFS 0x200000
Oct 6 17:57:42 playground1 kernel: [3498992.808322] XFS (xvdy): _xfs_buf_find: Block out of range: block 0x380001, EOFS 0x200000
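
The two hexadecimal values in these messages can be decoded by hand. A sketch, assuming the kernel reports both in 512-byte units (my reading of the message, not something the report states) and reusing the bsize and agsize from the xfs_growfs output:

```python
SECTOR = 512     # assumed unit of the values in the log message
BSIZE = 4096     # bytes per filesystem block (from xfs_growfs output)
AGSIZE = 65536   # filesystem blocks per allocation group

block = 0x380001  # "block" value from the log line
eofs = 0x200000   # "EOFS" value from the log line

sectors_per_fsblock = BSIZE // SECTOR
sectors_per_ag = AGSIZE * sectors_per_fsblock

# End of the filesystem, in bytes, as the kernel currently sees it.
eofs_bytes = eofs * SECTOR

# Which allocation group the rejected block would fall in, and where.
ag_index, sector_in_ag = divmod(block, sectors_per_ag)

print(f"EOFS = {eofs_bytes} bytes")
print(f"block 0x{block:x} = AG {ag_index}, sector {sector_in_ag}")
```

If the unit assumption holds, EOFS is the end of the existing 1 GiB filesystem, and the rejected block is sector 1 of what would become allocation group 7: space past the current end that the grow operation needs to touch.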

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: xfsprogs 3.1.7
ProcVersionSignature: Ubuntu 3.2.0-54.82-virtual 3.2.50
Uname: Linux 3.2.0-54-virtual x86_64
ApportVersion: 2.0.1-0ubuntu17.5
Architecture: amd64
Date: Sun Oct 6 18:03:57 2013
Ec2AMI: ami-b6089bdf
Ec2AMIManifest: ubuntu-us-east-1/images/ubuntu-precise-12.04-amd64-server-20130222.manifest.xml
Ec2AvailabilityZone: us-east-1b
Ec2InstanceType: m1.small
Ec2Kernel: aki-88aa75e1
Ec2Ramdisk: unavailable
MarkForUpload: True
ProcEnviron:
 TERM=xterm-color
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: xfsprogs
UpgradeStatus: No upgrade log present (probably fresh install)

Shlomo (shlomo-swidler) wrote :
TGL (tom-gl) wrote :

Same symptom/problem here with 3.2.0-54. Switching back to previous installed kernel (3.2.0-51 in my case) solves the issue.

I suspect the regression comes from this commit:
https://lists.ubuntu.com/archives/kernel-team/2013-September/032287.html
  * xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end
    - LP: #1151527
    - CVE-2013-1819

For what it's worth, the same change produced the same effects in redhat six months ago:
https://bugzilla.redhat.com/show_bug.cgi?id=909602

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xfsprogs (Ubuntu):
status: New → Confirmed
TGL (tom-gl) wrote :

A bit more googling shows that this change was actually not suitable for inclusion in kernel versions prior to 3.8.0:
http://oss.sgi.com/archives/xfs/2013-02/msg00201.html

...which is why it had been reverted in 3.7.9, after it was added by mistake in a previous 3.7.x version:
https://lkml.org/lkml/2013/2/15/638

Hope this helps.

tags: added: regression-update
tags: removed: regression-update
TGL (tom-gl) wrote :

FYI, SUSE is also having issues with this CVE-2013-1819 fix being backported to an older kernel:
https://bugzilla.novell.com/show_bug.cgi?id=807471#c25
Note that there is an alternative fix attached to this bug report.

Luis Henriques (henrix) wrote :

I confirm that I can reproduce this issue using xfs_growfs both on Precise and Quantal kernels. For the moment, I'm going to propose to revert the offending commit in both kernels (upstream eb178619f930fa2ba2348de332a1ff1c66a31424 "xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end") and evaluate the patch identified by TGL in comment #5.

Thank you!

Luis Henriques (henrix) on 2013-10-09
affects: xfsprogs (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
assignee: nobody → Luis Henriques (henrix)
Luis Henriques (henrix) wrote :

I've uploaded a few test kernels here:

http://people.canonical.com/~henrix/lp1236041/

There are test kernels for both Precise and Quantal (amd64 and i386). I've reverted the original offending commit and applied the alternative patch referred to by TGL in comment #5. I was able to reproduce this bug in both Precise and Quantal, and verified that with these test kernels I can't reproduce it anymore. Is it possible for someone else to also give them a try? Thanks!

Shlomo (shlomo-swidler) wrote :

Happy to try it out on EC2. What is the kernel-id (aki) in us-east-1 that I should try?

Changed in linux (Ubuntu Precise):
importance: Undecided → Medium
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Quantal):
importance: Undecided → Medium
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Precise):
status: New → Confirmed
Changed in linux (Ubuntu Quantal):
status: New → Confirmed
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-precise' to 'verification-done-precise'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-precise
tags: added: verification-needed-quantal
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-quantal' to 'verification-done-quantal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

Shlomo (shlomo-swidler) wrote :

I'd love to test this. Please point me to instructions for using the new -proposed kernel in EC2.

Luis Henriques (henrix) wrote :

I cannot reproduce the issue using the Precise (3.2.0-56.86) and Quantal (3.5.0-43.66) kernels in the -proposed pocket. Tagging as verified.
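
For anyone checking a machine against this thread, the kernel versions mentioned so far can be collected into a small lookup. This is only a sketch: the table holds just the versions named in this report, the `classify` helper is hypothetical, and kernels not listed here (for example 3.2.0-52, -53 or -55) are deliberately left unclassified rather than guessed at.

```python
import re

# Kernel versions named in this report, keyed by (base version, ABI number).
REPORTED = {
    ("3.2.0", 51): "reported working",   # TGL's previously installed kernel
    ("3.2.0", 54): "reported affected",  # original report and TGL
    ("3.2.0", 56): "reported fixed",     # Precise, 3.2.0-56.86
    ("3.5.0", 43): "reported fixed",     # Quantal, 3.5.0-43.66
}


def classify(release: str) -> str:
    """Classify a `uname -r` style string such as '3.2.0-54-virtual'."""
    m = re.match(r"(\d+\.\d+\.\d+)-(\d+)", release)
    if not m:
        return "unparsed"
    key = (m.group(1), int(m.group(2)))
    return REPORTED.get(key, "not mentioned in this report")


print(classify("3.2.0-54-virtual"))
```

The string to pass in is whatever `uname -r` prints on the instance; package versions such as `3.2.0-56.86` parse the same way.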

tags: added: verification-done-precise verification-done-quantal
removed: verification-needed-precise verification-needed-quantal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.2.0-56.86

---------------
linux (3.2.0-56.86) precise; urgency=low

  [ Steve Conklin ]

  * Release Tracking Bug
    - LP: #1242901

  [ Upstream Kernel Changes ]

  * Revert "xfs: fix _xfs_buf_find oops on blocks beyond the filesystem
    end"
    - LP: #1236041
    - CVE-2013-1819 fix backport:
  * cciss: fix info leak in cciss_ioctl32_passthru()
    - LP: #1188355
    - CVE-2013-2147
  * cpqarray: fix info leak in ida_locked_ioctl()
    - LP: #1188355
    - CVE-2013-2147
  * SAUCE: (no-up) Only let characters through when there are active
    readers.
    - LP: #1208740
  * Btrfs: fix hash overflow handling
    - LP: #1091187, #1091188
    - CVE-2012-5375
 -- Steve Conklin <email address hidden> Mon, 21 Oct 2013 15:11:01 -0500

Changed in linux (Ubuntu Precise):
status: Confirmed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.5.0-43.66

---------------
linux (3.5.0-43.66) quantal; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1242895

  [ Timo Aaltonen ]

  * SAUCE: ubuntu/i915: silence unclaimed register poking debug messages
    - LP: #1138787

  [ Upstream Kernel Changes ]

  * Revert "xfs: fix _xfs_buf_find oops on blocks beyond the filesystem
    end"
    - LP: #1236041
    - CVE-2013-1819 fix backport:
  * Revert "sctp: fix call to SCTP_CMD_PROCESS_SACK in
    sctp_cmd_interpreter()"
    - LP: #1241093
  * get rid of full-hash scan on detaching vfsmounts
    - LP: #1226726
  * Smack: Fix the bug smackcipso can't set CIPSO correctly
    - LP: #1236743
  * SAUCE: (no-up) Only let characters through when there are active
    readers.
    - LP: #1208740
  * usb: xhci: define port register names and use them instead of magic
    numbers
    - LP: #1229576
  * usb: xhci: add USB2 Link power management BESL support
    - LP: #1229576
  * iwl4965: fix rfkill set state regression
    - LP: #1241093
  * ath9k_htc: Restore skb headroom when returning skb to mac80211
    - LP: #1241093
  * ALSA: opti9xx: Fix conflicting driver object name
    - LP: #1241093
  * SUNRPC: Fix memory corruption issue on 32-bit highmem systems
    - LP: #1241093
  * drm/i915: ivb: fix edp voltage swing reg val
    - LP: #1241093
  * drm/vmwgfx: Split GMR2_REMAP commands if they are to large
    - LP: #1241093
  * ALSA: ak4xx-adda: info leak in ak4xxx_capture_source_info()
    - LP: #1241093
  * Bluetooth: Add support for Foxconn/Hon Hai [0489:e04d]
    - LP: #1241093
  * [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a
    signal
    - LP: #1241093
  * xen-gnt: prevent adding duplicate gnt callbacks
    - LP: #1241093
  * usb: config->desc.bLength may not exceed amount of data returned by the
    device
    - LP: #1241093
  * USB: cdc-wdm: fix race between interrupt handler and tasklet
    - LP: #1241093
  * xhci-plat: Don't enable legacy PCI interrupts.
    - LP: #1241093
  * ASoC: wm8960: Fix PLL register writes
    - LP: #1241093
  * rculist: list_first_or_null_rcu() should use list_entry_rcu()
    - LP: #1241093
  * USB: mos7720: use GFP_ATOMIC under spinlock
    - LP: #1241093
  * USB: mos7720: fix big-endian control requests
    - LP: #1241093
  * staging: comedi: dt282x: dt282x_ai_insn_read() always fails
    - LP: #1241093
  * usb: ehci-mxc: check for pdata before dereferencing
    - LP: #1241093
  * usb: xhci: Disable runtime PM suspend for quirky controllers
    - LP: #1241093
  * USB: OHCI: Allow runtime PM without system sleep
    - LP: #1241093
  * ACPI / EC: Add HP Folio 13 to ec_dmi_table in order to skip DSDT scan
    - LP: #1241093
  * ACPI / EC: Add ASUSTEK L4R to quirk list in order to validate ECDT
    - LP: #1241093
  * USB: fix build error when CONFIG_PM_SLEEP isn't enabled
    - LP: #1241093
  * ALSA: hda - hdmi: Fallback to ALSA allocation when selecting CA
    - LP: #1241093
  * regmap: silence GCC warning
    - LP: #1241093
  * target: Fix trailing ASCII space usage in INQUIRY vendor+model
    - LP: #1241093
  * iwlwifi: dvm: don't send BT_CONFIG on devices w/o Bluetooth
    - LP: #1...

Changed in linux (Ubuntu Quantal):
status: Confirmed → Fix Released
Changed in linux (Ubuntu):
status: Confirmed → Invalid