Delete operation in swift-bench fails when proxy affinity is enabled

Bug #1198926 reported by Ksenia Svechnikova
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Ksenia Svechnikova
Milestone: 1.10.0

Bug Description

ProblemType: Bug
DistroRelease: Ubuntu 12.04 LTS
Package: swift 1.9.0

On SAIO, after enabling proxy affinity:

[app:proxy-server]
use = egg:swift#proxy
allow_account_management = true
account_autocreate = true
sorting_method = affinity
read_affinity = r1z1=100
write_affinity = r1

With 2 regions in the ring:

Devices: id region zone ip address port replication ip replication port name weight partitions balance meta
             0 1 1 127.0.0.1 6010 127.0.0.1 6010 sdb1 1.00 131072 0.00
             1 1 2 127.0.0.1 6020 127.0.0.1 6020 sdb2 1.00 131072 0.00
             2 2 3 127.0.0.1 6030 127.0.0.1 6030 sdb3 1.00 131072 0.00
             3 2 4 127.0.0.1 6040 127.0.0.1 6040 sdb4 1.00 131072 0.00

swift-bench completes, but the DELETE phase reports failures, because deletes only succeed reliably once replication has moved the objects to their proper homes.
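
A toy sketch of the race (plain Python, not Swift code; device names and placement are made up for illustration), assuming a 2-replica ring spanning regions r1 and r2 with write_affinity = r1:

    # One primary per region for this object's partition; with write_affinity = r1
    # the PUT goes only to region 1, so the second copy lands on an r1 handoff.
    primaries = ["r1z1-sdb1", "r2z3-sdb3"]
    put_targets = ["r1z1-sdb1", "r1z2-sdb2"]   # handoff device in region 1

    def delete_responses(stored_on):
        # each primary answers 204 if it holds the object, else 404
        return [204 if node in stored_on else 404 for node in primaries]

    print(delete_responses(put_targets))   # [204, 404] -> proxy returns 503
    print(delete_responses(primaries))     # [204, 204] once replication has
                                           # moved the copy to region 2 -> 204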

swift@regions:/home/swift/bin$ swift-bench -A http://127.0.0.1:8080/auth/v1.0 -U test:tester -K testing -V 1.0
swift-bench 2013-07-08 13:38:58,262 INFO Auth version: 1.0
swift-bench 2013-07-08 13:38:58,969 INFO Auth version: 1.0
swift-bench 2013-07-08 13:39:01,591 INFO 1 PUTS [0 failures], 0.4/s
swift-bench 2013-07-08 13:39:16,727 INFO 60 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:39:31,960 INFO 112 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:39:47,072 INFO 170 PUTS [0 failures], 3.5/s
swift-bench 2013-07-08 13:40:02,137 INFO 224 PUTS [0 failures], 3.5/s
swift-bench 2013-07-08 13:40:17,230 INFO 274 PUTS [0 failures], 3.5/s
swift-bench 2013-07-08 13:40:32,339 INFO 323 PUTS [0 failures], 3.5/s
swift-bench 2013-07-08 13:40:47,490 INFO 381 PUTS [0 failures], 3.5/s
swift-bench 2013-07-08 13:41:02,624 INFO 439 PUTS [0 failures], 3.6/s
swift-bench 2013-07-08 13:41:17,880 INFO 486 PUTS [0 failures], 3.5/s
swift-bench 2013-07-08 13:41:33,129 INFO 533 PUTS [0 failures], 3.5/s
swift-bench 2013-07-08 13:41:48,215 INFO 579 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:42:03,287 INFO 628 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:42:18,630 INFO 681 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:42:33,833 INFO 740 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:42:49,667 INFO 789 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:43:04,873 INFO 831 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:43:20,091 INFO 879 PUTS [0 failures], 3.4/s
swift-bench 2013-07-08 13:43:35,210 INFO 920 PUTS [0 failures], 3.3/s
swift-bench 2013-07-08 13:43:50,405 INFO 967 PUTS [0 failures], 3.3/s
swift-bench 2013-07-08 13:43:58,731 INFO 1000 PUTS **FINAL** [0 failures], 3.3/s
swift-bench 2013-07-08 13:43:58,731 INFO Auth version: 1.0
swift-bench 2013-07-08 13:44:00,735 INFO 340 GETS [0 failures], 170.0/s
swift-bench 2013-07-08 13:44:15,737 INFO 3199 GETS [0 failures], 188.1/s
swift-bench 2013-07-08 13:44:30,752 INFO 5228 GETS [0 failures], 163.3/s
swift-bench 2013-07-08 13:44:45,753 INFO 7745 GETS [0 failures], 164.7/s
swift-bench 2013-07-08 13:45:00,753 INFO 9805 GETS [0 failures], 158.1/s
swift-bench 2013-07-08 13:45:03,620 INFO 10000 GETS **FINAL** [0 failures], 154.1/s
swift-bench 2013-07-08 13:45:03,620 INFO Auth version: 1.0
swift-bench 2013-07-08 13:45:05,672 INFO 110 DEL [12 failures], 53.8/s
swift-bench 2013-07-08 13:45:20,689 INFO 921 DEL [21 failures], 54.0/s
swift-bench 2013-07-08 13:45:21,976 INFO 1000 DEL **FINAL** [21 failures], 54.5/s
swift-bench 2013-07-08 13:45:21,976 INFO Auth version: 1.0

proxy.error:

Jun 26 13:58:36 regions proxy-server Handoff requested (2) (txn: tx30cf1da0b47842e19b162-0051cabb4c) (client_ip: 127.0.0.1)
Jun 26 13:58:36 regions proxy-server Handoff requested (2) (txn: txcdd5c483faf84df58a2be-0051cabb4c) (client_ip: 127.0.0.1)
Jun 26 13:58:36 regions proxy-server Handoff requested (2) (txn: tx0bb91eeba5d0447fb6f74-0051cabb4c) (client_ip: 127.0.0.1)
Jun 26 13:58:36 regions proxy-server Handoff requested (2) (txn: txf860bc3943a141e8a19dc-0051cabb4c) (client_ip: 127.0.0.1)
Jul 8 13:36:22 regions proxy-server SIGTERM received
Jul 8 13:36:22 regions proxy-server Exited
Jul 8 13:37:15 regions proxy-server Started child 29295
Jul 8 13:38:46 regions proxy-server Handoff requested (1) (txn: txdf236cf5c0cb4b1fb6f5f-0051da88a6) (client_ip: 127.0.0.1)
Jul 8 13:38:46 regions proxy-server Handoff requested (2) (txn: txdf236cf5c0cb4b1fb6f5f-0051da88a6) (client_ip: 127.0.0.1)
Jul 8 13:39:03 regions proxy-server ERROR with Container server 127.0.0.1:6021/sdb2 re: Trying to HEAD /v1/AUTH_test/fb139fa365f5417fa84751456c267efd_2: ConnectionTimeout (0.5s) (txn: tx63151415b39a4c4c9bdcc-0051da88b6)
Jul 8 13:39:03 regions proxy-server ERROR with Container server 127.0.0.1:6041/sdb4 re: Trying to HEAD /v1/AUTH_test/fb139fa365f5417fa84751456c267efd_19: ConnectionTimeout (0.5s) (txn: txd20c8cf804a3494aabe25-0051da88b6)
Jul 8 13:45:03 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: txe82e5239b3654dc692a8b-0051da8a1f) (client_ip: 127.0.0.1)
Jul 8 13:45:04 regions proxy-server Object DELETE returning 503 for (404, 204) (txn: txf8eaa22100f642efa6a73-0051da8a20) (client_ip: 127.0.0.1)
Jul 8 13:45:04 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: tx7c4fbfe09709402d8273a-0051da8a20) (client_ip: 127.0.0.1)
Jul 8 13:45:04 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: txabcadddcbcb248e3988f5-0051da8a20) (client_ip: 127.0.0.1)
Jul 8 13:45:04 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: txb545dcc4e5f54f388c8af-0051da8a20) (client_ip: 127.0.0.1)
Jul 8 13:45:04 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: txeb3872e6a1e84ebf977ad-0051da8a20) (client_ip: 127.0.0.1)
Jul 8 13:45:08 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: txc81de90b4cba431590c95-0051da8a23) (client_ip: 127.0.0.1)
Jul 8 13:45:08 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: tx73dce0eb38484b18a9643-0051da8a24) (client_ip: 127.0.0.1)
Jul 8 13:45:08 regions proxy-server Object DELETE returning 503 for (204, 404) (txn: tx197af5afff2543909e3fc-0051da8a24) (client_ip: 127.0.0.1)
Jul 8 13:45:08 regions proxy-server Object DELETE returning 503 for (404, 204) (txn: tx5d5839e0e5364b9d8f656-0051da8a24) (client_ip: 127.0.0.1)

Revision history for this message
clayg (clay-gerrard) wrote :

It seems like maybe this is a two replica ring, can you confirm?

Seems like there's a line missing from the builder output.

-Clay


Revision history for this message
Ksenia Svechnikova (kdemina) wrote :

Yes, this is a two replica ring:

/etc/swift/object.builder, build version 4
262144 partitions, 2.000000 replicas, 2 regions, 4 zones, 4 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices: id region zone ip address port replication ip replication port name weight partitions balance meta
             0 1 1 127.0.0.1 6010 127.0.0.1 6010 sdb1 1.00 131072 0.00
             1 1 2 127.0.0.1 6020 127.0.0.1 6020 sdb2 1.00 131072 0.00
             2 2 3 127.0.0.1 6030 127.0.0.1 6030 sdb3 1.00 131072 0.00
             3 2 4 127.0.0.1 6040 127.0.0.1 6040 sdb4 1.00 131072 0.00

Revision history for this message
clayg (clay-gerrard) wrote :

Hi Ksenia,

Thank you for confirming. In a three-replica geo-distributed cluster you'd get a 404 or a 200 - I think both are valid and correct depending on the system state, given eventual consistency.

with the object on one primary - 200 404 404 => 404
with the object on two primaries - 200 200 404 => 200

This is a generalizable, pre-existing two-replica *bug*:

So we wrote one copy at a primary location, and one at a handoff.

We sent the delete to the two primary nodes; one said 200, one said 404, the proxy said 503 because we refuse to define a quorum with two nodes.

Personally I think we could do better, but we'll have to start accepting and triaging 2-replica bugs like this. As a start I'd want to move the quorum from > 50% to >= 50%. I think it makes more sense in both the 2- and 4-replica cases and makes no difference with 3.

This is not the first time I've said this, though, so I'm not sure about the fate of this bug. IMHO, we should at a minimum change the description.
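
To make the quorum arithmetic concrete, here is a small sketch (hypothetical helpers, not Swift's actual quorum code) comparing the "> 50%" rule described above with the proposed ">= 50%" rule for a few replica counts:

    def quorum_strict(replicas):
        # strictly more than half of the replicas must agree ("> 50%")
        return replicas // 2 + 1

    def quorum_relaxed(replicas):
        # at least half of the replicas must agree (">= 50%")
        return (replicas + 1) // 2

    for n in (2, 3, 4, 6, 7):
        print(n, quorum_strict(n), quorum_relaxed(n))
    # 2 -> 2 vs 1  (a lone 204 out of (204, 404) could count instead of forcing a 503)
    # 3 -> 2 vs 2  (no difference)
    # 4 -> 3 vs 2
    # 6 -> 4 vs 3
    # 7 -> 4 vs 4

With 6 or 7 replicas the strict rule needs 4 matching responses either way, which is the point raised in the next comment.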

Revision history for this message
Kun Huang (academicgareth) wrote :

@clayg

Do you have any advice on the number of replicas for building a ring across two IDCs?

4/6 or 5/7 - which is better? I think with 6 replicas, ">50%" means finding at least 4 nodes with the same result, which is the same as with 7 replicas. 7 makes ">50%" more intuitive, but 5 saves more storage.

Revision history for this message
Ksenia Svechnikova (kdemina) wrote :

Hi Clay,

Thank you for the answer.
To test performance with several regions, I reconfigured SAIO; there are now 6 devices, 3 regions, and 3 replicas.
There are still DELETE errors with 3 replicas when affinity is enabled:

swift@regions:/home/swift/scenario$ swift-ring-builder /etc/swift/object.builder
/etc/swift/object.builder, build version 6
262144 partitions, 3.000000 replicas, 3 regions, 6 zones, 6 devices, 0.00 balance
The minimum number of hours before a partition can be reassigned is 1
Devices: id region zone ip address port replication ip replication port name weight partitions balance meta
             0 1 1 127.0.0.1 6010 127.0.0.1 6010 sdb1 1.00 131072 0.00
             1 1 2 127.0.0.1 6020 127.0.0.1 6020 sdb2 1.00 131072 0.00
             2 2 3 127.0.0.1 6030 127.0.0.1 6030 sdb3 1.00 131072 0.00
             3 2 4 127.0.0.1 6040 127.0.0.1 6040 sdb4 1.00 131072 0.00
             4 3 5 127.0.0.1 6050 127.0.0.1 6050 sdb5 1.00 131072 0.00
             5 3 6 127.0.0.1 6060 127.0.0.1 6060 sdb6 1.00 131072 0.00

swift-bench 2013-07-17 17:34:57,226 INFO Auth version: 1.0
swift-bench 2013-07-17 17:34:59,249 INFO 44 DEL [0 failures], 21.9/s
swift-bench 2013-07-17 17:35:17,039 INFO 123 DEL [0 failures], 6.2/s
swift-bench 2013-07-17 17:35:32,220 INFO 254 DEL [0 failures], 7.3/s
swift-bench 2013-07-17 17:35:51,434 INFO 287 DEL [1 failures], 5.3/s
swift-bench 2013-07-17 17:36:11,622 INFO 296 DEL [10 failures], 4.0/s
swift-bench 2013-07-17 17:36:32,089 INFO 306 DEL [19 failures], 3.2/s
swift-bench 2013-07-17 17:36:56,990 INFO 316 DEL [29 failures], 2.6/s
swift-bench 2013-07-17 17:37:12,107 INFO 596 DEL [32 failures], 4.4/s
swift-bench 2013-07-17 17:37:24,045 INFO 1000 DEL **FINAL** [32 failures], 6.8/s
swift-bench 2013-07-17 17:37:24,045 INFO Auth version: 1.0

swift@regions:/home/swift/scenario$ sudo tail -n 20 /var/log/swift/proxy.error
Jul 17 17:36:49 regions proxy-server Handoff requested (3) (txn: txad01b791de4846b69fe54-0051e69de7) (client_ip: 127.0.0.1)
Jul 17 17:36:50 regions proxy-server ERROR with Object server 127.0.0.1:6030/sdb3 re: Trying to DELETE /AUTH_test/04fbc0d8b76c46e09546c0bbffcae431_1/0508c01c397a4734b44d7d9bf2c77c5d: Timeout (10s) (txn: tx1134b5bfbf9f40d6bf01b-0051e69de8)
Jul 17 17:36:50 regions proxy-server Handoff requested (1) (txn: tx1134b5bfbf9f40d6bf01b-0051e69de8) (client_ip: 127.0.0.1)
Jul 17 17:36:50 regions proxy-server ERROR with Object server 127.0.0.1:6020/sdb2 re: Trying to DELETE /AUTH_test/04fbc0d8b76c46e09546c0bbffcae431_1/0508c01c397a4734b44d7d9bf2c77c5d: Timeout (10s) (txn: tx1134b5bfbf9f40d6bf01b-0051e69de8)
Jul 17 17:36:50 regions proxy-server Handoff requested (2) (txn: tx1134b5bfbf9f40d6bf01b-0051e69de8) (client_ip: 127.0.0.1)
Jul 17 17:36:50 regions proxy-server ERROR with Object server 127.0.0.1:6060/sdb6 re: Trying to DELETE /AUTH_test/04fbc0d8b76c46e09546c0bbffcae431_1/0508c01c397a4734...


Revision history for this message
clayg (clay-gerrard) wrote :

No, I don't think changing any timeout setting in swift-bench would have an effect.

It's not really swift-bench that was timing out - it just got a bad response from the proxy. It's actually hard to say exactly what happened from the .error log alone.

I'm not really sure what you're seeing; a timeout normally means a network partition or an overloaded server. Maybe one of the object servers was misconfigured and didn't start? Do the functests pass against this configuration?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/37865

Changed in swift:
assignee: nobody → Ksenia Demina (kdemina)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/37865
Committed: http://github.com/openstack/swift/commit/600e9c86722301cbe881f8a7e8a9706bb41ffacc
Submitter: Jenkins
Branch: master

commit 600e9c86722301cbe881f8a7e8a9706bb41ffacc
Author: Ksenia Demina <email address hidden>
Date: Fri Jul 19 14:32:55 2013 +0400

    Add delay in swift-bench

    With enable write affinity, it's necessary to wait until
    replication has moved things to their proper homes before
    running delete request. With write affinity turned on, only
    nodes in local region will get the object right after PUT request.

    Fix bug #1198926

    Change-Id: I3aa8933d45c47a010ae05561e12176479e7c9bcc
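
The idea behind the change can be sketched roughly as follows (simplified, with hypothetical function names; not the actual swift-bench code):

    import time

    def run_bench(put_phase, get_phase, delete_phase, delete_delay=0):
        put_phase()
        get_phase()
        if delete_delay:
            # with write affinity enabled, give object replication time to move
            # handoff copies to their primary locations before deleting
            time.sleep(delete_delay)
        delete_phase()

    # e.g. run_bench(do_puts, do_gets, do_deletes, delete_delay=60)
    # (the delay value is arbitrary and should cover at least one replication pass)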

Changed in swift:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/ec)

Fix proposed to branch: feature/ec
Review: https://review.openstack.org/45897

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/ec)

Reviewed: https://review.openstack.org/45897
Committed: http://github.com/openstack/swift/commit/2d5210da89380ba5892f56eb4326082ef3e448d2
Submitter: Jenkins
Branch: feature/ec

commit ce12d66cf970aec6dbbf0dc84500665c62638cd5
Author: Clay Gerrard <email address hidden>
Date: Tue Jul 30 11:44:11 2013 -0700

    fix swift i18n

    Change-Id: I53cea28a6d7593a1b308dbcf77dddf7f40d76cb2

commit 54d5f3bde9c8685a2f5c238fc6af4162a9dae01b
Author: Alex Gaynor <email address hidden>
Date: Mon Sep 9 14:49:39 2013 -0700

    Fixed a suite that was over-indented

    Change-Id: I3d05b29e57b77c3751d9f5ff694085bd082e8eb1

commit 537626ac6b79679ea775a84266901545d1a6a864
Author: Alex Gaynor <email address hidden>
Date: Mon Sep 9 09:58:26 2013 -0700

    Don't stat the path in ``unlink_older_than``

    ``listdir`` already handles the ENOENT and returns an empty list in
    that case.

    Change-Id: I597d7ffa9979f668a856519062839505d26129f2

commit 816c73e0151f5d75e591e8bf490974855037ea58
Author: Dirk Mueller <email address hidden>
Date: Sat Sep 7 16:29:15 2013 +0200

    Add Apache 2.0 licensing headers

    Change-Id: I38fae2a78b2369a897b7f298c1aead9b963bf7c9

commit 00f9d718d2d746ca8664290b79c852bb91fca1cc
Author: Dirk Mueller <email address hidden>
Date: Fri Aug 30 23:56:55 2013 +0200

    Move string expansion outside localisation (H702)

    String expansion should be done outside localisation call (_()),
    otherwise there will never be a matching string found in the
    catalogue.

    Also enable gating on this Hacking check (H702).

    Change-Id: Ie7c89fbfb52629e75d5e68e9afda8bcf50bf4cdd

commit 3d36a76156a5080f29aa32f6fce019d7b4f1e18b
Author: Dirk Mueller <email address hidden>
Date: Wed Aug 28 21:16:08 2013 +0200

    Use Python 3.x compatible except construct

    except x,y: was deprected and is removed in Python 3.x.
    Use "except x as y:" instead which works in any Python
    version >= 2.6.

    Change-Id: I7008c74b807340f3457d3a0c8bd0b83f23169d14

commit 3102ad48d56ec2df00da66307191a9dd711e4784
Author: Dirk Mueller <email address hidden>
Date: Sat Sep 7 10:14:00 2013 +0200

    Do not use locals() for string formatting (H501)

    Fixes a warning triggered by Hacking 0.7.x or newer. There
    is no need to use a positional string formatting here, since
    this is not going to be localized.

    Change-Id: Ie38d620aecb0b48cd113af45cc9ca0d61f8f8ff1

commit 698023f477e2b66a1463aeb7082ef4547d0e6dd7
Author: Peter Portante <email address hidden>
Date: Tue Sep 3 11:10:22 2013 -0400

    Provide a method for retrieving on-disk metadata

    We hide the internal dictionary for the metadata providing a method to
    retrieve it to abstract away the implementation details of how
    DiskFile object provides and maintains that metadata.

    This is in anticipation of the DiskFile API refactoring.

    Change-Id: I1c0dc01a4680bd435512405e2d31fba24421720a
    Signed-off-by: Peter Portante <email address hidden>

commit 9d98070f7b2e06ac8cc30d12523798df4418eed0
Author: Peter Portante <email address hidden>
Date: Tue Sep 3 16:34:09 2013 -0400

    Remove reference to 'file' built-in...

Thierry Carrez (ttx)
Changed in swift:
status: Fix Released → Fix Committed
Thierry Carrez (ttx)
Changed in swift:
milestone: none → 1.10.0-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in swift:
milestone: 1.10.0-rc1 → 1.10.0
no longer affects: python-swiftclient