Improve error reporting when running on an unsupported filesystem

Bug #966671 reported by Ben Hartshorne
16
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Fix Released
Low
Matthew Oliver

Bug Description

When you start Swift (specifically the object-server) with an unsupported filesystem (eg one that doesn't support XATTR) it does not complain. You don't get any errors until you try and PUT an object, at which point it dumps a stack trace into syslog (indicating IOError):

Mar 27 22:10:02 i-000001c9 object-server ERROR __call__ error with PUT /vdb/4548/AUTH_.auth/.account_id/AUTH_4c3531a8-0735-4ec4-a88b-fd
eb357cccde : #012Traceback (most recent call last):#012 File "/usr/lib/pymodules/python2.6/swift/obj/server.py", line 715, in __call__
#012 res = getattr(self, req.method)(req)#012 File "/usr/lib/pymodules/python2.6/swift/obj/server.py", line 520, in PUT#012 file
.put(fd, tmppath, metadata)#012 File "/usr/lib/python2.6/contextlib.py", line 34, in __exit__#012 self.gen.throw(type, value, trace
back)#012 File "/usr/lib/pymodules/python2.6/swift/obj/server.py", line 251, in mkstemp#012 yield fd, tmppath#012 File "/usr/lib/p
ymodules/python2.6/swift/obj/server.py", line 520, in PUT#012 file.put(fd, tmppath, metadata)#012 File "/usr/lib/pymodules/python2.
6/swift/obj/server.py", line 275, in put#012 write_metadata(fd, metadata)#012 File "/usr/lib/pymodules/python2.6/swift/obj/server.p
y", line 89, in write_metadata#012 setxattr(fd, '%s%s' % (METADATA_KEY, key or ''), metastr[:254])#012 File "/usr/lib/pymodules/pyt
hon2.6/xattr/__init__.py", line 188, in setxattr#012 return xattr(f).set(attr, value, options=options)#012 File "/usr/lib/pymodules
/python2.6/xattr/__init__.py", line 81, in set#012 self._set(name, value, 0, options | self.options)#012 File "/usr/lib/pymodules/p
ython2.6/xattr/__init__.py", line 16, in _func#012 return func(first, *args)#012IOError: [Errno 95] Operation not supported

I suggest checking that the filesystem supports the necessary functionality (eg setting or checking an xattr) on process start so that the error appears when you start swift rather than when you try and PUT an object.

(fwiw, the user-visible error is just a 500 server error without any additional information.)

Revision history for this message
Pete Zaitcev (zaitcev) wrote :

Aww man, this is so true. However, I am against checking for a list of supported filesystems (such as XFS). If we decide to fix this, we need to devise a test that actually stores an xattr. In Fedora, people run Swift on ext4, for example. This requires some coding and testing... Honestly, I'm tempted to punt it. I knew that xattrs were necessary when I installed Swift, even back in 2010.

Revision history for this message
Pete Zaitcev (zaitcev) wrote :

(I'm marking this Confirmed, but this may be later rejected)

Changed in swift:
status: New → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/32269

Changed in swift:
assignee: nobody → Kun Huang (academicgareth)
status: Confirmed → In Progress
Revision history for this message
Kun Huang (academicgareth) wrote :

I'm trying to fix this, but have some problems.

If we need a check, we must know which dir we will check for xattr

That dir is something like "/srv/node/sda/object/xxx"

/srv/node/ is devices_dir in object-server.conf
sda/ is in ring
object/ means object server

In object-server, we could get the 1st and 3rd easily. But sda in the ring is a little hard.
Typically, the ring.devs looks like
[
{....ip:192.168.1.1, device:sda, region:1...},
{....ip:192.168.1.2, device:sdb, region:1...},
{....ip:192.168.1.3, device:sdc, region:1...},
]
We want the specific ip of one object server, and only way we could use to distinguish device is ip. If user set ip as 192.168.1.x, we could easily get its own device. but if user set ip as 0.0.0.0, there're on other ways to get device. This issue block my whole solution.

Is there any other thought to fix this bug?

Revision history for this message
Madhuri Kumari (madhuri-rai07) wrote :

Hi Kun Huang,

This bug is also affecting me.

I tried to write data over filesystem that doesnot support xattr and it failed giving error message:

Object PUT failed: http://127.0.0.1:8080/v1/AUTH_test/new/file.txt 503 Internal Server Error [first 60 chars of response] <html><h1>Service Unavailable</h1><p>The server is currently

Log from storage1.error:
object-server: ERROR __call__ error with PUT /sdb1/667/AUTH_test/container/cirros.img : #012Traceback (most recent call last):#012 File "/root/swift/swift/obj/server.py", line 666, in __call__#012 res = method(req)#012 File "/root/swift/swift/common/utils.py", line 1915, in wrapped#012 return func(*a, **kw)#012 File "/root/swift/swift/common/utils.py", line 687, in _timing_stats#012 resp = func(ctrl, *args, **kwargs)#012 File "/root/swift/swift/obj/server.py", line 438, in PUT#012 writer.put(metadata)#012 File "/root/swift/swift/obj/diskfile.py", line 663, in put#012 self._finalize_put, metadata, target_path)#012 File "/root/swift/swift/common/utils.py", line 2212, in force_run_in_thread#012 return self._run_in_eventlet_tpool(func, *args, **kwargs)#012 File "/root/swift/swift/common/utils.py", line 2195, in _run_in_eventlet_tpool#012 raise result#012IOError: [Errno 95] Operation not supported (txn: tx88bb330458364718adb7f-0052943338)

which shows that data is written first to disk partition and after that metadata is written to disk but it failed due to xattr support. So it is better to check support of xattr when object-server starts rather before putting data.

Revision history for this message
Samuel Merritt (torgomatic) wrote :

This isn't something that can be checked at object-server start. Devices can be mounted and unmounted while the object server is running, the ring can change while the object server is running... and that's just off the top of my head.

One way to fix this might be to change the ordering of data and metadata writes in the object server. Currently, it looks something like this:

  * create tempfile
  * fallocate()
  * write data, ..., write data
  * write xattrs
  * rename tempfile

If we moved the metadata writing up, then we'd be better off:

  * create tempfile
  * fallocate()
  * write xattrs
  * write data, ..., write data
  * rename tempfile

Then we could catch IOError with the right errno and turn that into a 507 response from the object server, and we can also log a human-readable message about xattr support at the same time.

If we did that, then the proxy could see the 507 and move on to a handoff node before sending any request body to the backend; as it is now, the proxy sends all the bytes over and then gets a 500, which is wasteful.

Revision history for this message
Peter Portante (peter-a-portante) wrote :

Sam, that sounds like a good idea. Couple of questions:

Do we know all of the xattrs that need to be written before all the data is given to us? I had thought that we don't always know the content length, or something like that. In that case, we can fall back to the previous behavior.

And don't forget the fsync at the end before the close and the rename. :)

Revision history for this message
Samuel Merritt (torgomatic) wrote :

Oh yeah, if we're getting data with Transfer-Encoding: chunked, then we don't know the content length. Darn.

Still, chunked transfers can fail early for other reasons, like running out of disk, so another late failure mode isn't the end of the world. It makes the code more complex, but it might be worth it.

Revision history for this message
Samuel Merritt (torgomatic) wrote :

And by "fail early", I of course mean "fail late".

Someone send coffee, please. :)

Changed in swift:
assignee: Kun Huang (academicgareth) → nobody
Tom Fifield (fifieldt)
Changed in swift:
status: In Progress → Confirmed
Revision history for this message
Madhuri Kumari (madhuri-rai07) wrote :

Hi Samuel,

I have a question how can the underlying filesystem be changed while object server is running?
According to your comment#6, the check for xattr should not be done at start of server.

But i think the filesystem cannot be changed while object server is running. So I think the check can be done at the time server starts rather than before writing the data chunks to partition.

Could you please validate my understanding?

Revision history for this message
Samuel Merritt (torgomatic) wrote :

> But i think the filesystem cannot be changed while object server is running.

Think again. :)

One can unmount a disk (say, /dev/sdf mounted at /srv/node/d25), format it with a different filesystem, and then remount it at its old mount point, and one can do this while the object server is running.

clayg (clay-gerrard)
Changed in swift:
importance: Undecided → Low
Matthew Oliver (matt-0)
Changed in swift:
assignee: nobody → Matthew Oliver (matt-0)
Revision history for this message
Matthew Oliver (matt-0) wrote :

Unless I'm reading the code wrong, the object server first attempts to read metadata (disk_file.read_metadata()) before writing an object. This check metadata uses xattr.getxattr, which will cause a 'IOError: [Errno 95] Operation not supported' like mentioned at the start of this bug.

At the moment, this isn't handled the same as the setxttr.
My thoughts are we can wrap this code with a better exception handling and should be able to use this to alert back to the user that the xattr isn't supprted by the current filesystem.

Best part is, this would allow us to send an error back before the write, just like Sam has suggested previously.

If I'm wrong please correct me. Either way, I'll do some testing and write up a patch soon.

Matt

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/99883

Changed in swift:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/99883
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=2659888c921d09bd1dd23cda6ee2f158187d80e6
Submitter: Jenkins
Branch: master

commit 2659888c921d09bd1dd23cda6ee2f158187d80e6
Author: Matthew Oliver <email address hidden>
Date: Fri Jun 13 19:12:31 2014 +1000

    When a filesystem does't support xattr return a 507

    Currently when the object server tries to write an object's metadata
    to a filesystem that doesn't support xattr, it errors with a stacktrace
    and returns a 500 error back to the user with no information.

    This patch catches the resulting IOError when attempting to read or write
    the xattr metadata, logs the error nicely and then returns a 507 error
    back to the user.

    Seeing as this change is sending back a 507, it also catches and logs
    the out of disk space errors (ENOSPC and EDQUOT).

    Change-Id: I31932b57582817a0b3b58dd315a996bd0bcbc99b
    Closes-Bug: #966671

Changed in swift:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/ec)

Fix proposed to branch: feature/ec
Review: https://review.openstack.org/138165

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/ec)
Download full text (15.6 KiB)

Reviewed: https://review.openstack.org/138165
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=0d3ebf09b94b41782b2c2a6bbcf255bf1203eca0
Submitter: Jenkins
Branch: feature/ec

commit 977d7c14daa38ab9c9d79bbf8b92371024b93fc8
Author: John Dickinson <email address hidden>
Date: Wed Nov 26 14:19:08 2014 -0800

    Fix tempfile bugs from commit 6978275

    Commit 6978275 changed xprofile middleware's usage of mktemp
    and moved to using tempfile. But it was clearly never tested,
    because the os.close() calls never worked. This patch updates
    that previous patch to use a context to open and close the file.

    Change-Id: I40ee42e8539551fd8e4dfb353f50146ab40a7847

commit dec97fc3ba2c71884f1c098e7d9cd1f709f74958
Author: OpenStack Proposal Bot <email address hidden>
Date: Wed Nov 26 06:13:29 2014 +0000

    Imported Translations from Transifex

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: Ibf319f7cc1b5036ad8031776cf2c6018fb8a0159

commit 01f6e860066640a2ba1406a23c93a72b34ec495e
Author: Clay Gerrard <email address hidden>
Date: Fri Nov 21 17:28:13 2014 -0800

    Add Expected Failure for ssync with sys-meta

    Sysmeta included with an object PUT persists with the PUT data - if an
    internal operation such as POST-as-copy during partial failure, or ssync
    with fast-POST (not supported), causes that data to be lost then the
    associated sysmeta will also be lost.

    Since object sys-meta persistence in the face of a POST when the
    original .data is unavailable requires fast-POST with .meta files the
    probetest that validates object sys-meta persistence of a POST when the
    most up-to-date copy of the object with sys-meta is unavailable
    configures an InternalClient with object_post_as_copy = false.

    This non-default configuration option is not supported by ssync and
    results in a loss of sys-meta very similar to the object sys-meta
    failure you would see with object_post_as_copy = true when the COPY part
    of the POST is unable to retrieve the most recently written object with
    sys-meta.

    Until we can fix the default POST behavior to make metadata updates
    without stomping on newer data file timestamps we should expect object
    sys-meta to be "very very best possible but not really guaranteed
    effort".

    Until we can fix ssync to replicate metadata updates without stomping on
    newer data file timestamps we should expect this test to fail.

    When ssync replication of fast-POST metadata update is fixed this test
    will fail signaling that the expected failure cruft should be removed,
    but other parts of ssync replication will still work and some other bugs
    can be fixed while we wait.

    Change-Id: Ifc5d49514de79b78f7715408e0fe0908357771d3

commit a8751ae557616cab1cafd98a338cad352526a262
Author: Cedric Dos Santos <email address hidden>
Date: Tue Nov 25 12:37:05 2014 +0100

    Correct misspelled words

    In some files I found misspelling words.

    bin/swift-reconciler-enqueue#l26
       prima...

Thierry Carrez (ttx)
Changed in swift:
milestone: none → 2.2.1
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.