mount_check does not prevent writing to root mount

Bug #1470576 reported by Doug Mayer
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Fix Released
Medium
Ben Martin

Bug Description

Running Swift 2.2.2 on Ubuntu 14.04 x64.

/etc/swift/{object,container,account}-server.conf includes mount_check:
    [DEFAULT]
    mount_check = true

When the drive-audit service unmounts a failing drive, the object-server service creates a directory in the root mount and syncs files into it, filling up the root partition.

As a workaround (though we'll see a lot of errors in the logs), we are able to chown /srv/node to root, preventing the swift user from writing to it after an unmount.

CVE References

Revision history for this message
clayg (clay-gerrard) wrote :

I don't think mount_check was meant to prevent that behavior (It's for the quick 507 support) - but it *is* very annoying.

... somewhere there's a makedirs call that we need to replace with something custom that won't create directories in the configured devices.

Changed in swift:
status: New → Confirmed
importance: Undecided → Medium
tags: added: low-hanging-fruit
Ben Martin (blmartin)
Changed in swift:
assignee: nobody → Ben Martin (blmartin)
Revision history for this message
Pete Zaitcev (zaitcev) wrote :

1. chown root on all mount points under /srv/root is not a workaround, but the established practice that everyone should have been followed from the start

2. There's no way for whoever creates files to make checks race-free. Swift code has some scattered checks here and there, but they are only in to reduce the number of exceptions.

I say this bug is invalid, everything works like it's supposed to. We might want to work to catch tracebacks and thus reduce the amount of useless logging when swift-drive-audit unmounts.

Revision history for this message
Doug Mayer (doxavore) wrote :

Should there be something added to the docs then to suggest that directory/permission layout? As long as mount_check exists and is documented to the extent that it is today (particularly in the deployment/administration guides, perhaps the developer docs are more detailed), this behavior seems non-obvious. I also didn't see anything in the docs about giving eg /srv/node root ownership, though perhaps some more time to reflect calmly on the situation would have made that obvious. :)

Is there a reason we wouldn't want an unmounted point to not receive files when mount_check is turned on? With replication running, you may fill up logs with errors when running on an unmounted drive for any length of time. It isn't always desirable to remove it from the ring and rebalance if you're talking only a few days before the drive will be repaired.

Revision history for this message
Samuel Merritt (torgomatic) wrote :

Just to add on to Pete Zaitcev's point, when node A is going to replicate to node B, it first issues a REPLICATE request to get a list of stuff to push, then invokes rsync to actually push the stuff. If B's disk gets unmounted between the REPLICATE request and the rsync push, then things will go on the root filesystem if allowed. Rsync can't be told to check mountedness.

Probably some more words in the administration guide about having root ownership on the mount points would be good. I think it's somewhere in the docs already, but repetition will make it more likely that people actually read it.

Revision history for this message
Doug Mayer (doxavore) wrote :

So my [limited] understanding is that, as it stands today, mount_check will cause the REPLICATE request to return a 507. While an in-progress replication cycle may be running, it won't start any NEW replication cycles after that? And therein lies Pete's point that it's as good as it will get.

Perhaps I'm just falling into a cycle where it's trying to replicate over 200GB of data in 1 replication cycle, but that seems unlikely with my data size and partition count.

Furthermore, if the disk fills up entirely, and I start clearing files off of the disk, replication will very quickly start filling it back up again. If the 507 was working, it seems like after a failure during one replication cycle, it would recognize the unmounted device before restarting the replication. (And if it's not, maybe that's the real underlying issue?)

Revision history for this message
clayg (clay-gerrard) wrote :

... ok so rsync will create the device dirs. So sam's right.

FWIW when reverting a handoff/rebalance there is no preflight REPLICATE request - the node holding the partition that doesn't belong on that node will just try to rsync it where it goes - no opportunity for 507. If the device's dir does not exist on the target node the remote rsync module will try to create it - so it's important that operation either fail, or is a broken symlink.

I'm not sure if there are any flags/options that rsync might have that would prevent this.

This is yet another of many reasons we need to get in the datapath for object replication (think SSYNC or hummingbird's SYNC)

Changed in swift:
status: Confirmed → In Progress
Revision history for this message
Ben Martin (blmartin) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/199203
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=89f59062864e5cbfc839a6084c323ce35438aa57
Submitter: Jenkins
Branch: master

commit 89f59062864e5cbfc839a6084c323ce35438aa57
Author: Ben Martin <email address hidden>
Date: Mon Jul 27 14:19:09 2015 -0500

    +Document method to avoid rsync filling root drive

    When rsync pushes to a remote node with an unmounted drive and if
    certain steps are not taken, rsync may attempt to write files to
    the local drive at the location where the drive was mounted.

    There are two suggested solutions for this issue:
      1) Set the permissions for all mount points in /srv/node/
           to root:root 755
      2) Mount the drives elsewhere and symlink the drives to /srv/.../

    The first method ensures that only root and not the swift user
    can write in the /srv/.../ directories.

    The second method will prompt a broken link issue if rsync
    attempts to write to an unmounted drive.

    Change-Id: I60ce4ed9ef8401768d5f78b6806cbb2e2a65303e
    Closes-Bug: #1470576

Changed in swift:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in swift:
milestone: none → 2.4.0
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/crypto)

Fix proposed to branch: feature/crypto
Review: https://review.openstack.org/219775

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/crypto)
Download full text (43.3 KiB)

Reviewed: https://review.openstack.org/219775
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=257e468e9bfd1088040419ad408106ac3c77b531
Submitter: Jenkins
Branch: feature/crypto

commit e02609c66a804845672413b06830b87395afef31
Author: Samuel Merritt <email address hidden>
Date: Tue Sep 1 15:19:50 2015 -0700

    Preserve traceback in swift-dispersion-report

    Commit c690bcb fixed a bug in the dispersion report, but changed this
    from a bare "raise" to "raise err", which loses the traceback. Not a
    big deal, but worth putting back IMO.

    Change-Id: Id5b72153a4b8df8e3faaf1fa3fb2040e28ba85cc

commit d06d4ad0fd2dfe69da8008e729651264522c6c06
Author: Minwoo Bae <email address hidden>
Date: Tue Sep 1 15:08:44 2015 -0500

    Included reference in swift.obj.diskfile to enumerate the string
    used for data file paths.

    Change-Id: Ie22caa678bc00dfc43fabec7efbbb9f34490f1b5

commit 524c89b7eeff037b8a6b421888771e15f98c2da2
Author: John Dickinson <email address hidden>
Date: Fri Aug 21 13:39:41 2015 -0700

    Updated CHANGELOG, AUTHORS, and .mailmap for 2.4.0 release.

    Change-Id: Ic6301146b839c9921bb85c4f4c1e585c9ab66661

commit 05de1305a903ee4ce9c8c50fde53c552d5b90d51
Author: Clay Gerrard <email address hidden>
Date: Thu Aug 27 18:35:09 2015 -0700

    Make ssync_sender send valid chunked requests

    The connect method of ssync_sender tells the remote connection that it's
    going to send a valid HTTP chunked request, but if the remote end needs
    to respond with an error of any kind sender throws HTTP right out the
    window, picks up his ball, and closes the socket down hard - much to the
    surprise of the eventlet.wsgi server who up to this point had been
    playing along quite nicely with this 'SSYNC' nonsense assuming that
    everyone here is consenting mature adults.

    If you're going to make a "Transfer-Encoding: chunked" request have the
    good decency to finish the job with a proper '0\r\n\r\n'. [1]

    N.B. It might be possible to handle an error status during the
    initialize_request phase with some sort of 100-continue support, but
    honestly it's not entirely clear to me when the server isn't going to
    close the connection if the client is still expected to send the body
    [2] - further if the error comes later during missing_check or updates
    we'll for sure want to send the chunk transfer termination line before
    we close down the socket and this way we cover both.

    1. Really, eventlet.wsgi shouldn't be so blasted brittle about this [3]
    2. https://lists.w3.org/Archives/Public/ietf-http-wg/2005AprJun/0007.html
    3. https://github.com/eventlet/eventlet/commit/c3ce3eef0b4d0dfdbfb1ec0186d4bb204fb8ecd5

    Closes-Bug #1489587
    Change-Id: Ic17c6c3075553f8cf6ef6213e62a00282f0d01cf

commit 993ee4e37af1961adba2047d5aa2eb210e423eb3
Author: nakagawamsa <email address hidden>
Date: Fri Aug 28 11:49:43 2015 +0900

    Remove duplicate X-Backend-Storage-Policy-Index key

    There is duplicate 'X-Backend-Storage-Policy-Index' dictionary key in unit.obj.test_server.py.
    One key has fixed policy index value, and another ha...

tags: added: in-feature-crypto
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/hummingbird)

Fix proposed to branch: feature/hummingbird
Review: https://review.openstack.org/221410

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/hummingbird)
Download full text (70.7 KiB)

Reviewed: https://review.openstack.org/221410
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=eb8f1f83f1cfc63d8452bc30096fd1c145781527
Submitter: Jenkins
Branch: feature/hummingbird

commit cb683d391cb66d0f52830de16760c80fd2afedf9
Author: OpenStack Proposal Bot <email address hidden>
Date: Sat Sep 5 06:17:51 2015 +0000

    Imported Translations from Transifex

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I2d92b8e34a665fb0bb4c048cfb0c59de295dfce6

commit e4542455c8a07b7981c247df8b737816062c1655
Author: Emett Speer <email address hidden>
Date: Wed Sep 2 17:18:03 2015 -0700

    [Labs] Update links to Cloud Admin Guide

    Update links to the Cloud Admin Guide after the
    RST conversion of that book altered URLs.

    Change-Id: I899f8938498b744e62887968a65e58c00ef27f1b

commit 58fcc07523978306cd3889ada73af5d9e664cf59
Author: Christian Schwede <email address hidden>
Date: Wed Sep 2 10:52:34 2015 +0000

    Test if container_sweep is executed on unmounted devices

    This change ensures that container_sweep is not run if a device is not mounted
    and mount_check is set to True.

    Change-Id: I823083c8431d9e61fd426508033ec9188503957b

commit e02609c66a804845672413b06830b87395afef31
Author: Samuel Merritt <email address hidden>
Date: Tue Sep 1 15:19:50 2015 -0700

    Preserve traceback in swift-dispersion-report

    Commit c690bcb fixed a bug in the dispersion report, but changed this
    from a bare "raise" to "raise err", which loses the traceback. Not a
    big deal, but worth putting back IMO.

    Change-Id: Id5b72153a4b8df8e3faaf1fa3fb2040e28ba85cc

commit d06d4ad0fd2dfe69da8008e729651264522c6c06
Author: Minwoo Bae <email address hidden>
Date: Tue Sep 1 15:08:44 2015 -0500

    Included reference in swift.obj.diskfile to enumerate the string
    used for data file paths.

    Change-Id: Ie22caa678bc00dfc43fabec7efbbb9f34490f1b5

commit 615c7a204b9386e05c5bab658bfe96766ad1e680
Author: Brian Cline <email address hidden>
Date: Tue Sep 1 10:51:20 2015 -0500

    Adds useful dispersion info from changelog

    Change-Id: I1a45088fc32620b02ff9a754b02ec1eb75a59d6e

commit 3b8755098a1786c5447abf158bd686293a82977c
Author: janonymous <email address hidden>
Date: Sun Aug 2 21:29:13 2015 +0530

    Replace a / b with a // b to use integer division where needed

    Change-Id: I72c81faa62786e140b0de00e3a04934bf1b5adbd

commit 524c89b7eeff037b8a6b421888771e15f98c2da2
Author: John Dickinson <email address hidden>
Date: Fri Aug 21 13:39:41 2015 -0700

    Updated CHANGELOG, AUTHORS, and .mailmap for 2.4.0 release.

    Change-Id: Ic6301146b839c9921bb85c4f4c1e585c9ab66661

commit 05de1305a903ee4ce9c8c50fde53c552d5b90d51
Author: Clay Gerrard <email address hidden>
Date: Thu Aug 27 18:35:09 2015 -0700

    Make ssync_sender send valid chunked requests

    The connect method of ssync_sender tells the remote connection that it's
    going to send a valid HTTP chunked request, but if the remote end needs
    to respond with an error of any kind sender th...

tags: added: in-feature-hummingbird
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.