[SRU] ceph 10.2.3

Bug #1628809 reported by James Page on 2016-09-29
Affects                Importance   Assigned to
Ubuntu Cloud Archive   Undecided    Unassigned
  Mitaka               High         James Page
ceph (Ubuntu)          High         James Page
  Xenial               High         James Page
  Yakkety              High         James Page

Bug Description

This point release fixes several important bugs in RBD mirroring, RGW multi-site, CephFS, and RADOS.

We recommend that all v10.2.x users upgrade.

Notable changes in this release include:
* build/ops: 60-ceph-partuuid-workaround-rules still needed by debian jessie (udev 215-17) (#16351, runsisi, Loic Dachary)
* build/ops: ceph Resource Agent does not work with systemd (#14828, Nathan Cutler)
* build/ops: ceph-base requires parted (#16095, Ken Dreyer)
* build/ops: ceph-osd-prestart.sh contains Upstart-specific code (#15984, Nathan Cutler)
* build/ops: mount.ceph: move from ceph-base to ceph-common and add symlink in /sbin for SUSE (#16598, #16645, Nathan Cutler, Dan Horák, Ricardo Dias, Kefu Chai)
* build/ops: need rocksdb commit 7ca731b12ce for ppc64le build (#17092, Nathan Cutler)
* build/ops: rpm: OBS needs ExclusiveArch (#16936, Michel Normand)
* cli: ceph command line tool chokes on ceph –w (the dash is unicode 'en dash' &ndash, copy-paste to reproduce) (#12287, Oleh Prypin, Kefu Chai)
* common: expose buffer const_iterator symbols (#16899, Noah Watkins)
* common: global-init: fixup chown of the run directory along with log and asok files (#15607, Karol Mroz)
* fs: ceph-fuse: link to libtcmalloc or jemalloc (#16655, Yan, Zheng)
* fs: client: crash in unmount when fuse_use_invalidate_cb is enabled (#16137, Yan, Zheng)
* fs: client: fstat cap release (#15723, Yan, Zheng, Noah Watkins)
* fs: essential backports for OpenStack Manila (#15406, #15614, #15615, John Spray, Ramana Raja, Xiaoxi Chen)
* fs: fix double-unlock on shutdown (#17126, Greg Farnum)
* fs: fix mdsmap print_summary with standby replays (#15705, John Spray)
* fs: fuse mounted file systems fail SAMBA CTDB ping_pong rw test with v9.0.2 (#12653, #15634, Yan, Zheng)
* librados: Add cleanup message with time to rados bench output (#15704, Vikhyat Umrao)
* librados: Missing export for rados_aio_get_version in src/include/rados/librados.h (#15535, Jim Wright)
* librados: osd: bad flags can crash the osd (#16012, Sage Weil)
* librbd: Close journal and object map before flagging exclusive lock as released (#16450, Jason Dillaman)
* librbd: Crash when utilizing advisory locking API functions (#16364, Jason Dillaman)
* librbd: ExclusiveLock object leaked when switching to snapshot (#16446, Jason Dillaman)
* librbd: FAILED assert(object_no < m_object_map.size()) (#16561, Jason Dillaman)
* librbd: Image removal doesn't necessarily clean up all rbd_mirroring entries (#16471, Jason Dillaman)
* librbd: Object map/fast-diff invalidated if journal replays the same snap remove event (#16350, Jason Dillaman)
* librbd: Timeout sending mirroring notification shouldn't result in failure (#16470, Jason Dillaman)
* librbd: Whitelist EBUSY error from snap unprotect for journal replay (#16445, Jason Dillaman)
* librbd: cancel all tasks should wait until finisher is done (#16517, Haomai Wang)
* librbd: delay acquiring lock if image watch has failed (#16923, Jason Dillaman)
* librbd: fix missing return statement if failed to get mirror image state (#16600, runsisi)
* librbd: flag image as updated after proxying maintenance op (#16404, Jason Dillaman)
* librbd: mkfs.xfs slow performance with discards and object map (#16707, #16689, Jason Dillaman)
* librbd: potential use after free on refresh error (#16519, Mykola Golub)
* librbd: rbd-nbd does not properly handle resize notifications (#15715, Mykola Golub)
* librbd: the option 'rbd_cache_writethrough_until_flush=true' doesn't work (#16740, #16386, #16708, #16654, #16478, Mykola Golub, xinxin shu, Xiaowei Chen, Jason Dillaman)
* mds: tell command blocks forever with async messenger (TestVolumeClient.test_evict_client failure) (#16288, Douglas Fuller)
* mds: Confusing MDS log message when shut down with stalled journaler reads (#15689, John Spray)
* mds: Deadlock on shutdown active rank while busy with metadata IO (#16042, Patrick Donnelly)
* mds: Failing file operations on kernel based cephfs mount point leaves unaccessible file behind on hammer 0.94.7 (#16013, Yan, Zheng)
* mds: Fix shutting down mds timed-out due to deadlock (#16396, Zhi Zhang)
* mds: MDSMonitor fixes (#16136, xie xingguo)
* mds: MDSMonitor::check_subs() is very buggy (#16022, Yan, Zheng)
* mds: Session::check_access() is buggy (#16358, Yan, Zheng)
* mds: StrayManager.cc: 520: FAILED assert(dnl->is_primary()) (#15920, Yan, Zheng)
* mds: enforce a dirfrag limit on entries (#16164, Patrick Donnelly)
* mds: fix SnapRealm::have_past_parents_open() (#16299, Yan, Zheng)
* mds: fix getattr starve setattr (#16154, Yan, Zheng)
* mds: wrongly treat symlink inode as normal file/dir when symlink inode is stale on kcephfs (#15702, Zhi Zhang)
* mon: "mon metadata" fails when only one monitor exists (#15866, John Spray, Kefu Chai)
* mon: Monitor: validate prefix on handle_command() (#16297, You Ji)
* mon: OSDMonitor: drop pg temps from not the current primary (#16127, Samuel Just)
* mon: prepare_pgtemp needs to only update up_thru if newer than the existing one (#16185, Samuel Just)
* msgr: AsyncConnection::lockmsg/async lockdep cycle: AsyncMessenger::lock, MDSDaemon::mds_lock, AsyncConnection::lock (#16237, Haomai Wang)
* msgr: async messenger mon crash (#16378, #16418, Haomai Wang)
* msgr: backports of all asyncmsgr fixes to jewel (#15503, #15372, Yan Jun, Haomai Wang, Piotr Dałek)
* msgr: msg/async: connection race hang (#15849, Haomai Wang)
* osd: FileStore: umount hang because sync thread doesn't exit (#15695, Kefu Chai)
* osd: Fixes for list-inconsistent-* (#15766, #16192, #15719, David Zafman)
* osd: New pools have bogus stuck inactive/unclean HEALTH_ERR messages until they are first active and clean (#14952, Sage Weil)
* osd: OSD crash with Hammer to Jewel Upgrade: void FileStore::init_temp_collections() (#16672, David Zafman)
* osd: OSD failed to subscribe skipped osdmaps after ceph osd pause (#17023, Kefu Chai)
* osd: ObjectCacher split BufferHead read fix (#16002, Greg Farnum)
* osd: ReplicatedBackend doesn't increment stats on pull, only push (#16277, Kefu Chai)
* osd: Scrub error: 0/1 pinned (#15952, Samuel Just)
* osd: crash adding snap to purged_snaps in ReplicatedPG::WaitingOnReplicas (#15943, Samuel Just)
* osd: partprobe intermittent issues during ceph-disk prepare (#15176, Marius Vollmer, Loic Dachary)
* osd: saw valgrind issues in ReplicatedPG::new_repop (#16801, Kefu Chai)
* osd: sparse_read on ec pool should return extents with correct offset (#16138, kofiliu)
* osd: sched_time not actually randomized (#15890, xie xingguo)
* rbd: ImageReplayer::is_replaying does not include flush state (#16970, Jason Dillaman)
* rbd: Journal duplicate op detection can cause lockdep error (#16363, Jason Dillaman)
* rbd: Journal needs to handle duplicate maintenance op tids (#16362, Jason Dillaman)
* rbd: Unable to disable journaling feature if in unexpected mirror state (#16348, Jason Dillaman)
* rbd: bashism in src/rbdmap (#16608, Jason Dillaman)
* rbd: doc: format 2 now is the default image format (#17026, Chengwei Yang)
* rbd: When journaling is enabled, a flush request shouldn't flush the cache (#15761, Yuan Zhou)
* rbd: possible race condition during journal transition from replay to ready (#16198, Jason Dillaman)
* rbd: qa/workunits/rbd: respect RBD_CREATE_ARGS environment variable (#16289, Mykola Golub)
* rbd: rbd-mirror should disable proxied maintenance ops for non-primary image (#16411, Jason Dillaman)
* rbd: rbd-mirror: FAILED assert(m_local_image_ctx->object_map != nullptr) (#16558, Jason Dillaman)
* rbd: rbd-mirror: FAILED assert(m_on_update_status_finish == nullptr) (#16956, Jason Dillaman)
* rbd: rbd-mirror: FAILED assert(m_state == STATE_STOPPING) (#16980, Jason Dillaman)
* rbd: rbd-mirror: ensure replay status formatter has completed before stopping replay (#16352, Jason Dillaman)
* rbd: rbd-mirror: include local pool id in resync throttle unique key (#16536, #15239, #16488, #16491, #16329, #15108, #15670, Ricardo Dias, Jason Dillaman)
* rbd: rbd-mirror: potential race condition accessing local image journal (#16230, Jason Dillaman)
* rbd: rbd-mirror: reduce memory footprint during journal replay (#16321, #16489, #16622, #16539, #16223, #16349, Mykola Golub, Jason Dillaman)
* rgw: A query on a static large object fails with 404 error (#16015, Radoslaw Zarzynski)
* rgw: Add zone rename to radosgw_admin (#16934, Shilpa Jagannath)
* rgw: Bucket index shards orphaned after bucket delete (#16412, Orit Wasserman)
* rgw: Bug when using port 443s in rgw. (#16548, Pritha Srivastava)
* rgw: Fallback to Host header for bucket name. (#15975, Robin H. Johnson)
* rgw: Fix civetweb IPv6 (#16928, Robin H. Johnson)
* rgw: Increase log level for messages occurring while running rgw admin command (#16935, Shilpa Jagannath)
* rgw: No Last-Modified, Content-Size and X-Object-Manifest headers if no segments in DLO manifest (#15812, Radoslaw Zarzynski)
* rgw: RGWPeriodPuller tries to pull from itself (#16939, Casey Bodley)
* rgw: Set Access-Control-Allow-Origin to an asterisk if allowed in a rule (#15348, Wido den Hollander)
* rgw: Swift API returns double space usage and objects of account metadata (#16188, Albert Tu)
* rgw: account/container metadata not actually present in a request are deleted during POST through Swift API (#15977, #15779, Radoslaw Zarzynski)
* rgw: add socket backlog setting via ceph.conf (#16406, Feng Guo)
* rgw: add tenant support to multisite sync (#16469, #16121, #16665, Yehuda Sadeh, Josh Durgin, Casey Bodley, Pritha Srivastava)
* rgw: add_zone only clears master_zone if --master=false (#15901, Casey Bodley)
* rgw: aws4 parsing issue (#15940, #15939, Yehuda Sadeh)
* rgw: aws4: add STREAMING-AWS4-HMAC-SHA256-PAYLOAD support (#16146, Radoslaw Zarzynski, Javier M. Mellid)
* rgw: backport merge of static sites fixes (#15555, #15532, #15531, Robin H. Johnson)
* rgw: can set negative max_buckets on RGWUserInfo (#14534, Yehuda Sadeh)
* rgw: cleanup radosgw-admin temp command as it was deprecated (#16023, Vikhyat Umrao)
* rgw: comparing return code to ERR_NOT_MODIFIED in rgw_rest_s3.cc (needs minus sign) (#16327, Nathan Cutler)
* rgw: custom metadata aren't camelcased in Swift's responses (#15902, Radoslaw Zarzynski)
* rgw: data sync stops after getting error in all data log sync shards (#16530, Yehuda Sadeh)
* rgw: default zone and zonegroup cannot be added to a realm (#16839, Casey Bodley)
* rgw: document multi tenancy (#16635, Pete Zaitcev)
* rgw: don't unregister request if request is not connected to manager (#15911, Yehuda Sadeh)
* rgw: failed to create bucket after upgrade from hammer to jewel (#16627, Orit Wasserman)
* rgw: fix ldap bindpw parsing (#16286, Matt Benjamin)
* rgw: fix multi-delete query param parsing. (#16618, Robin H. Johnson)
* rgw: improve support for Swift's object versioning. (#15925, Radoslaw Zarzynski)
* rgw: initial slashes are not properly handled in Swift's BulkDelete (#15948, Radoslaw Zarzynski)
* rgw: master: build failures with boost > 1.58 (#16392, #16391, Abhishek Lekshmanan)
* rgw: multisite segfault on ~RGWRealmWatcher if realm was deleted (#16817, Casey Bodley)
* rgw: multisite sync races with deletes (#16222, #16464, #16220, #16143, Yehuda Sadeh, Casey Bodley)
* rgw: multisite: preserve zone's extra pool (#16712, Abhishek Lekshmanan)
* rgw: object expirer's hints might be trimmed without processing in some circumstances (#16705, #16684, Radoslaw Zarzynski)
* rgw: radosgw-admin failure for user create after upgrade from hammer to jewel (#15937, Orit Wasserman, Abhishek Lekshmanan)
* rgw: radosgw-admin: EEXIST messages for create operations (#15720, Abhishek Lekshmanan)
* rgw: radosgw-admin: inconsistency in uid/email handling (#13598, Matt Benjamin)
* rgw: realm pull fails when using apache frontend (#15846, Orit Wasserman)
* rgw: retry on bucket sync errors (#16108, Yehuda Sadeh)
* rgw: s3website: x-amz-website-redirect-location header returns malformed HTTP response (#15531, Robin H. Johnson)
* rgw: segfault in RGWOp_MDLog_Notify (#16666, Casey Bodley)
* rgw: segmentation fault on error_repo in data sync (#16603, Casey Bodley)
* rgw: selinux denials in RGW (#16126, Boris Ranto)
* rgw: support size suffixes for --max-size in radosgw-admin command (#16004, Vikhyat Umrao)
* rgw: updating CORS/ACLs might not work in some circumstances (#15976, Radoslaw Zarzynski)
* rgw: use zone endpoints instead of zonegroup endpoints (#16834, Casey Bodley)
* tests: improve rbd-mirror test case coverage (#16197, Mykola Golub, Jason Dillaman)
* tests: rados/test.sh workunit times out on OpenStack (#15403, Loic Dachary)
* tools: ceph-disk: Accept bcache devices as data disks (#13278, Peter Sabaini)
* tools: src/script/subman fails with KeyError: 'nband' (#16961, Loic Dachary, Ali Maredia)

For more detailed information, refer to the complete changelog[1] and the
release notes[2].

Getting Ceph
------------

* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-10.2.3.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy

[1]: http://docs.ceph.com/docs/master/_downloads/v10.2.3.txt
[2]: http://docs.ceph.com/docs/master/release-notes/#v10-2-3-jewel

Regards,
Abhishek

James Page (james-page) on 2016-09-29
Changed in ceph (Ubuntu Xenial):
status: New → Triaged
Changed in ceph (Ubuntu Yakkety):
status: New → Triaged
Changed in ceph (Ubuntu Xenial):
importance: Undecided → High
Changed in ceph (Ubuntu Yakkety):
importance: Undecided → High
Changed in cloud-archive:
status: New → Invalid
James Page (james-page) on 2016-09-29
Changed in ceph (Ubuntu Yakkety):
assignee: nobody → James Page (james-page)
Changed in ceph (Ubuntu Xenial):
assignee: nobody → James Page (james-page)
David Medberry (med) wrote :

Does this include the changes/fixes that James Troup reported in bug #1628750? Or do we expect an urgent update to this?

James Page (james-page) wrote :

Plan is to pick that fix at the same time.

James Page (james-page) wrote :

Dumping packages for all targets here:

https://launchpad.net/~openstack-ubuntu-testing/+archive/ubuntu/ceph-sru

Once they build and test OK I'll upload for SRU; note that yakkety is broken in other ways atm (bug 1629102).

James Page (james-page) wrote :

Uploaded to xenial-proposed; SRU testing is covered as part of the OpenStack Team Stable Release Update process:

  https://wiki.ubuntu.com/OpenStack/StableReleaseUpdates

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 10.2.3-0ubuntu1

---------------
ceph (10.2.3-0ubuntu1) yakkety; urgency=medium

  * New upstream point release (LP: #1628809):
    - d/p/rocksdb-flags.patch: Dropped, included upstream.
    - d/p/*: Refreshed.
    - d/p/32bit-ftbfs.patch: Cherry pick fix for 32bit arch compat.
    - d/ceph-{fs-common,fuse}.install: Fix install locations
      for mount{.fuse}.ceph.
  * Limit the amount of data per chunk in omap push operations to 64k,
    ensuring that OSD threads don't hit timeouts during recovery
    operations (LP: #1628750):
    - d/p/osd-limit-omap-data-in-push-op.patch: Cherry pick fix from
      upstream master branch.

 -- James Page <email address hidden> Thu, 29 Sep 2016 21:44:33 +0100
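
The 64k limit on omap data per push operation described in the changelog above can be illustrated with a minimal Python sketch. This is not the upstream C++ patch: the function name `chunk_omap_entries`, the `MAX_CHUNK_BYTES` constant, and the choice to measure size as key length plus value length are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# Assumed cap for illustration: 64 KiB of omap data per push chunk.
MAX_CHUNK_BYTES = 64 * 1024


def chunk_omap_entries(
    entries: Dict[bytes, bytes], max_bytes: int = MAX_CHUNK_BYTES
) -> Iterator[List[Tuple[bytes, bytes]]]:
    """Yield lists of (key, value) pairs whose combined size does not exceed max_bytes.

    A single entry larger than max_bytes is still emitted, alone in its own
    chunk, so that the transfer always makes progress.
    """
    chunk: List[Tuple[bytes, bytes]] = []
    used = 0
    for key, value in sorted(entries.items()):
        size = len(key) + len(value)
        # Close the current chunk before adding an entry that would exceed the cap.
        if chunk and used + size > max_bytes:
            yield chunk
            chunk, used = [], 0
        chunk.append((key, value))
        used += size
    if chunk:
        yield chunk
```

Bounding the work done per push in this way is what keeps any single operation short enough that a worker thread does not sit on one large transfer until a timeout fires.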

Changed in ceph (Ubuntu Yakkety):
status: Triaged → Fix Released

Hello James, or anyone else affected,

Accepted ceph into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/10.2.3-0ubuntu0.16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!
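
When reporting the package version that was tested, as requested above, the version can be read from the output of `apt-cache policy ceph`. The following is a minimal Python sketch of extracting it; the function name `parse_apt_policy` is hypothetical, and the sample text is hand-written for illustration rather than captured from a real system.

```python
import re
from typing import Dict


def parse_apt_policy(text: str) -> Dict[str, str]:
    """Extract the Installed and Candidate versions from `apt-cache policy` output."""
    versions: Dict[str, str] = {}
    for line in text.splitlines():
        match = re.match(r"\s*(Installed|Candidate):\s*(\S+)", line)
        if match:
            versions[match.group(1).lower()] = match.group(2)
    return versions


# Illustrative input only; on a real system, capture the output of
# `apt-cache policy ceph` instead of using this hand-written sample.
sample = """\
ceph:
  Installed: 10.2.3-0ubuntu0.16.04.1
  Candidate: 10.2.3-0ubuntu0.16.04.1
  Version table:
"""
print(parse_apt_policy(sample))
```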

Changed in ceph (Ubuntu Xenial):
status: Triaged → Fix Committed
tags: added: verification-needed
James Page (james-page) wrote :

Hello James, or anyone else affected,

Accepted ceph into mitaka-proposed. The package will build now and be available in the Ubuntu Cloud Archive in a few hours, and then in the -proposed repository.

Please help us by testing this new package. To enable the -proposed repository:

  sudo add-apt-repository cloud-archive:mitaka-proposed
  sudo apt-get update

Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-mitaka-needed to verification-mitaka-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-mitaka-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-mitaka-needed
Brian Murray (brian-murray) wrote :

The lack of verification on this bug and bug 1628750 is preventing the acceptance of another ceph upload into -proposed.

James Page (james-page) wrote :

Brian

I'd like to stack the fix for bug 1628750 on top of this SRU, rather than deliver two sets of updates to end users if possible.

James Page (james-page) wrote :

Oh wait - not bug 1628750 - I'd like to stack bug 1587261

Brian Murray (brian-murray) wrote :

Hello James, or anyone else affected,

Accepted ceph into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/ceph/10.2.3-0ubuntu0.16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

James Page (james-page) on 2016-11-23
tags: added: verification-done
removed: verification-needed

The verification of the Stable Release Update for ceph has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package ceph - 10.2.3-0ubuntu0.16.04.2

---------------
ceph (10.2.3-0ubuntu0.16.04.2) xenial; urgency=medium

  * rgw: Fixes for creation times for buckets (LP: #1587261):
    - d/p/rgw_rados-creation_time.patch: Backport fix from upstream master.
      Fix logic error that leads to creation time being 0 instead of current
      time when creating buckets.

ceph (10.2.3-0ubuntu0.16.04.1) xenial; urgency=medium

  * New upstream stable release (LP: #1628809).
    - d/p/*: Refresh.
    - d/p/rocksdb-flags.patch: Dropped, accepted upstream.
    - d/p/32bit-ftbfs.patch: Cherry pick fix for 32bit arch compat.
    - d/ceph-{fs-common,fuse}.install: Fix install locations
      for mount{.fuse}.ceph.
  * Limit the amount of data per chunk in omap push operations to 64k,
    ensuring that OSD threads don't hit timeouts during recovery
    operations (LP: #1628750):
    - d/p/osd-limit-omap-data-in-push-op.patch: Cherry pick fix from
      upstream master branch.

 -- Frode Nordahl <email address hidden> Fri, 28 Oct 2016 13:50:40 +0200
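
The changelog above describes the rgw fix only as a logic error that left the bucket creation time at 0; it does not show the faulty code. As a hypothetical Python illustration of the intended behaviour (not the actual C++ patch), a zero timestamp is treated as "not supplied" and replaced with the current time. The function name `resolve_creation_time` and its parameters are assumptions.

```python
import time
from typing import Optional


def resolve_creation_time(requested: float, now: Optional[float] = None) -> float:
    """Return the creation time to store for a newly created bucket.

    `requested` is the caller-supplied timestamp, where 0 means none was given.
    `now` lets a caller inject the clock value; it defaults to time.time().
    """
    current = time.time() if now is None else now
    # A zero timestamp means "not supplied": fall back to the current time.
    # Inverting this condition would store 0 for every new bucket.
    return current if requested == 0 else requested
```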

Changed in ceph (Ubuntu Xenial):
status: Fix Committed → Fix Released