swift.common.memcached client always uses absolute (unix) time expiration values

Bug #1076148 reported by Peter Portante
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Undecided
Peter Portante

Bug Description

The current memcached client implementation always uses absolute (unix) time values when a time is provided. This requires that the client server time and the memcached server time are in sync.

Seems like using delta timeout values will work just fine, and prevent unexpected behaviors in how objects are cached when server times are not in sync. See http://code.google.com/p/memcached/issues/detail?id=296.

Revision history for this message
Samuel Merritt (torgomatic) wrote :

There's a bunch of other things that break when the times on your cluster nodes get out of sync, so some kind of time synchronizer is essential.

I guess you could have a separate set of memcached boxes whose time could drift, though.

Changed in swift:
status: New → Confirmed
Revision history for this message
Samuel Merritt (torgomatic) wrote :

I looked at using deltas instead of absolute timestamps, and there's a problem. Specifically, the problem is that the memcache protocol has a suboptimal* design.

memcached differentiates between deltas and absolute timestamps by comparing them to an arbitrarily-chosen value** that's about 30 days. Anything over that is an absolute timestamp.

What this means for Swift is that you can't use deltas and have long-lived stuff in cache. Now, that's all abstract and hand-wavy, so here's two concrete examples:

TempAuth lets you set auth-token lifetimes to be as long as you want. If we used deltas, that would be limited to 30 days.

The cname middleware caches DNS lookups for the DNS TTL. If a domain owner sets a ridiculously long TTL on their domain, then the cname middleware would no longer cache lookups for it.

So, while it would be nice to use deltas instead of absolute timeout values, the memcached protocol is too limited* to let Swift do that without breaking a bunch of other stuff.

* stupid

** https://github.com/memcached/memcached/blob/master/memcached.c#L142

Changed in swift:
status: Confirmed → Won't Fix
Revision history for this message
Peter Portante (peter-a-portante) wrote :

So that goal of this bug report was to alert folks of a problem that users of OpenStack Swift might run into, namely that servers that don't keep in sync can result in attempts to cache something in memcache to silently fail.

One way to avoid that is to use deltas, which has been rejected, but instead of always using absolute times in the memcached client, would it not be possible to use deltas for values under 30 days, and use absolute times over 30 days?

If that is not acceptable, could we at least add note of warning in the docs for memcache in the configuration tables that the proxy-server host needs to be kept in sync with the memcached hosts?

Thanks for considering.

Revision history for this message
dormando (dormando) wrote :

There's really no place in your connector where you can do: if (TTL > (30 * 86400) { TTL = time() + TTL; } ?

Because that's, you know, pretty close to how we do it internally. Since the protocol is "dumb" we have to do that differentiation on our own, too.

When an expiration value is sufficiently far enough into the future to warrant a date, usually the clocks being off by a few seconds doesn't matter. Most small values are noticable problems however.

Revision history for this message
Samuel Merritt (torgomatic) wrote :

Sure, a note in the documentation sounds like a good idea.

I don't like the idea of using deltas sometimes. That leads to code that only works sometimes.

Changed in swift:
status: Won't Fix → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/16284

Changed in swift:
assignee: nobody → Peter Portante (peter-a-portante)
status: Confirmed → In Progress
Revision history for this message
Peter Portante (peter-a-portante) wrote :

I filed https://review.openstack.org/#/c/16284/ since it really seems the code is not that different and the behavior difference is effectively the same.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/16284
Committed: http://github.com/openstack/swift/commit/1ac7b88a271d8b4fd4557b301fdeb80442dcabbe
Submitter: Jenkins
Branch: master

commit 1ac7b88a271d8b4fd4557b301fdeb80442dcabbe
Author: Peter Portante <email address hidden>
Date: Fri Nov 16 00:09:14 2012 -0500

    Use a delta timeout for memcache where possible

    We use a delta timeout value for timeouts under 30 days (in seconds)
    since that is the limit which the memcached protocols will recognize a
    timeout as a delta. Greater than 30 days and it interprets it as an
    absolute time in seconds since the epoch.

    This helps to address an often difficult-to-debug problem of time
    drift between memcache clients and the memcache servers. Prior to this
    change, if a client's time drifts behind the servers, short timeouts
    run the danger of not being cached at all. If a client's time drifts
    ahead of the servers, short timeouts run the danger of persisting too
    long. Using delta's avoids this affect. For absolute timeouts 30 days
    or more in the future small time drifts between clients and servers
    are inconsequential.

    See also bug 1076148 (https://bugs.launchpad.net/swift/+bug/1076148).

    This also fixes incr and decr to handle timeout values in the same way
    timeouts are handled for set operations.

    Change-Id: Ie36dbcedfe9b4db9f77ed4ea9b70ff86c5773310
    Signed-off-by: Peter Portante <email address hidden>

Changed in swift:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in swift:
milestone: none → 1.7.6
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers