swift-get-nodes results

Bug #867518 reported by Jérémy Guermonprez
This bug affects 4 people
Affects: OpenStack Object Storage (swift)
Status: Opinion
Importance: Undecided
Assigned to: Xingchao Yu

Bug Description

The swift-get-nodes results don't seem to be correct:

# swift-get-nodes /etc/swift/object.ring.gz account user file.html

Account account
Container user
Object file.html

Partition 210777
Hash cdd66570cde3189f63879e769a428117

Server:Port Device 192.168.1.2:6000 sdb1
Server:Port Device 192.168.1.3:6000 sdb1
Server:Port Device 192.168.1.4:6000 sdb1
Server:Port Device 192.168.1.5:6000 sdb1 [Handoff]

curl -I -XHEAD "http://192.168.1.2:6000/sdb1/210777/account/user/file.html"
curl -I -XHEAD "http://192.168.1.3:6000/sdb1/210777/account/user/file.html"
curl -I -XHEAD "http://192.168.1.4:6000/sdb1/210777/account/user/file.html"
curl -I -XHEAD "http://192.168.1.5:6000/sdb1/210777/account/user/file.html" # [Handoff]

ssh 192.168.1.2 "ls -lah /srv/node/sdb1/objects/210777/117/cdd66570cde3189f63879e769a428117/"
ssh 192.168.1.3 "ls -lah /srv/node/sdb1/objects/210777/117/cdd66570cde3189f63879e769a428117/"
ssh 192.168.1.4 "ls -lah /srv/node/sdb1/objects/210777/117/cdd66570cde3189f63879e769a428117/"
ssh 192.168.1.5 "ls -lah /srv/node/sdb1/objects/210777/117/cdd66570cde3189f63879e769a428117/" # [Handoff]

On the specified hosts, the files don't exist:

# ls -lah /srv/node/sdb1/objects/210777/117/cdd66570cde3189f63879e769a428117/
ls: cannot access /srv/node/sdb1/objects/210777/117/cdd66570cde3189f63879e769a428117/: No such file or directory

The correct path for this file is: /srv/node/sdb1/objects/22005/de6/157d4aa9de9d03c594b14285b43fcde6/

Thanks for your help.
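
Both directory paths in the description follow the same layout: partition, then the last three hex characters of the hash, then the full hash. The following is a minimal Python sketch of that mapping, not the swift-get-nodes source. It assumes the partition is the top bits of the first four bytes of the MD5 path hash, and a partition power of 18, which is inferred from the numbers in this report rather than stated by anyone in the thread. The function names are hypothetical.

```python
import struct


def partition_from_hash(hash_hex, part_power):
    """Return the ring partition for a 32-character hex MD5 path hash.

    Assumption: the partition is the top `part_power` bits of the first
    four bytes of the digest, read as a big-endian unsigned integer.
    """
    first_four = struct.unpack_from(">I", bytes.fromhex(hash_hex))[0]
    return first_four >> (32 - part_power)


def object_dir(device, hash_hex, part_power, root="/srv/node"):
    """Build the on-disk object directory in the layout seen in this report."""
    partition = partition_from_hash(hash_hex, part_power)
    suffix = hash_hex[-3:]  # last three hex characters of the hash
    return f"{root}/{device}/objects/{partition}/{suffix}/{hash_hex}/"


if __name__ == "__main__":
    # part_power=18 is an inferred value, not taken from the ring file.
    print(object_dir("sdb1", "cdd66570cde3189f63879e769a428117", 18))
    print(object_dir("sdb1", "157d4aa9de9d03c594b14285b43fcde6", 18))
```

Under these assumptions both hashes reproduce the partitions quoted above (210777 and 22005), which suggests the tool and the storage nodes share one ring and differ only in the hash they compute, that is, in the path that was hashed.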

Revision history for this message
Jay Payne (letterj) wrote :

The partitions are completely different. Are you using the same rings everywhere?

Can you do an md5sum on the object.ring.gz files to make sure they are the same?

Thanks
--J
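
The ring comparison requested above can also be scripted. This is a minimal Python sketch, assuming the ring files from each server have already been copied to local paths; the helper names and the example paths are hypothetical.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def rings_match(paths):
    """Return True when every file in `paths` has the same MD5 digest."""
    return len({file_md5(p) for p in paths}) == 1


if __name__ == "__main__":
    import sys

    # Example: python compare_rings.py proxy/object.ring.gz store1/object.ring.gz
    for path in sys.argv[1:]:
        print(file_md5(path), path)
    if len(sys.argv) > 2:
        print("identical" if rings_match(sys.argv[1:]) else "DIFFERENT")
```

Matching digests only rule out a stale or mismatched ring; they say nothing about whether the right account name was passed to the tool.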

Jérémy Guermonprez (jguermonprez) wrote :

Yes, the rings are the same everywhere; I've just checked with md5sum.
I also reproduced the problem on another installation.

Anton Altaparmakov (aia21) wrote :

Just wanted to say that we get the same results. I have verified that /etc/swift/object.ring.gz has the same md5 sum both on the proxy server and on one of the storage servers. This has never worked, by the way. It happens on both 1.4.2 and 1.4.3. I just updated to 1.4.3-0ubuntu1~lucid1~ppa1, tried it again, and got the same failure. I always thought that perhaps I was using the tool incorrectly, but it sounds like it is a real bug. FWIW, here is an example (all servers running 1.4.3-0ubuntu1~lucid1~ppa1) where "swift" is the proxy server (10.10.10.1) and "swift-store1" (10.10.10.2) to "swift-store5" (10.10.10.6) are the storage servers:

root@swift:~# swift-get-nodes /etc/swift/object.ring.gz uxsup UxSup-Test-Container mytest.txt

Account uxsup
Container UxSup-Test-Container
Object mytest.txt

Partition 127243
Hash 7c42e15e7feed75108596bfa3e67b446

Server:Port Device 10.10.10.2:6000 sdf1
Server:Port Device 10.10.10.5:6000 sdd1
Server:Port Device 10.10.10.4:6000 sdj1
Server:Port Device 10.10.10.6:6000 sdb3 [Handoff]
Server:Port Device 10.10.10.3:6000 sdl1 [Handoff]

curl -I -XHEAD "http://10.10.10.2:6000/sdf1/127243/uxsup/UxSup-Test-Container/mytest.txt"
curl -I -XHEAD "http://10.10.10.5:6000/sdd1/127243/uxsup/UxSup-Test-Container/mytest.txt"
curl -I -XHEAD "http://10.10.10.4:6000/sdj1/127243/uxsup/UxSup-Test-Container/mytest.txt"
curl -I -XHEAD "http://10.10.10.6:6000/sdb3/127243/uxsup/UxSup-Test-Container/mytest.txt" # [Handoff]
curl -I -XHEAD "http://10.10.10.3:6000/sdl1/127243/uxsup/UxSup-Test-Container/mytest.txt" # [Handoff]

ssh 10.10.10.2 "ls -lah /srv/node/sdf1/objects/127243/446/7c42e15e7feed75108596bfa3e67b446/"
ssh 10.10.10.5 "ls -lah /srv/node/sdd1/objects/127243/446/7c42e15e7feed75108596bfa3e67b446/"
ssh 10.10.10.4 "ls -lah /srv/node/sdj1/objects/127243/446/7c42e15e7feed75108596bfa3e67b446/"
ssh 10.10.10.6 "ls -lah /srv/node/sdb3/objects/127243/446/7c42e15e7feed75108596bfa3e67b446/" # [Handoff]
ssh 10.10.10.3 "ls -lah /srv/node/sdl1/objects/127243/446/7c42e15e7feed75108596bfa3e67b446/" # [Handoff]

None of those directories exist on any of the storage nodes and in fact the file is not actually there at all on 10.10.10.2 (swift-store1) for example where it is claimed to be!

I have located it on 10.10.10.3 (swift-store2) even though that is claimed to be a "Handoff" location. I found it here:

root@swift-store2:/srv/node/sdk1/objects/56481/ed0/37287da13f5a25c4027965bd942b0ed0# ls -l
total 4
-rw------- 1 swift swift 33 2011-10-05 14:41 1317822065.97936.data

I know it is the correct file as I did "cat 1317822065.97936.data" and it contains the string I wrote into the text file before uploading it.

This is only a test installation to play around with / evaluate openstack swift, so I am very happy to try things or if nothing else works I can give you ssh access to the server though I expect you can replicate the bug locally very easily given we are running a completely default openstack swift installation and have followed the online installation documentation to the dot...

Best regards,

Anton

Jay Payne (letterj) wrote :

Thanks for your response. We are talking about a multi-server environment and not an SAIO, right?

I'll do some testing today on our test clusters and see if I can duplicate the problem.

Jérémy Guermonprez (jguermonprez) wrote :

Yes, I'm in a multi-server environment.

Anton Altaparmakov (aia21) wrote :

Yes, 6 physical servers. One proxy server that just runs swift-proxy-server and five storage servers, (four with 16 storage disks given over to swift and one with six disks given over to swift). A cumulative storage capacity of 25TiB. I know it's not much but we assembled it out of scrap pieces of hardware we found kicking around the machine room just to play around with the openstack swift software...

Best regards,

Anton

Anton Altaparmakov (aia21) wrote :

Oh, and I forgot to say in case it matters, each of the five storage nodes is one ring thus we have five rings in our swift installation and replication is set to three copies thus all data should be on three out of the five storage nodes.

Best regards,

Anton

crayon_z (crayon-z) wrote :

Is this bug fixed? I'm running into this problem as well. Does anyone have any idea how to solve it?

mhy (mhy19) wrote :

I have also run into this problem in two different swift environments.
The first one: ubuntu10.04 + swift1.2.0
The second one: ubuntu11.10 + swift1.4.3

Bob Souza (bob-souza) wrote :

The 'account' argument for swift-get-nodes is the *hash* of the user's account, usually something that starts with 'AUTH_'.

Perhaps you are using the account *name* rather than the account *hash*?

Xingchao Yu (yuxcer) wrote :

Hey, it's not a bug. Please add the reseller_prefix before your account name. For example, assume you're using tempauth and have an account called system; the real account name is then AUTH_system:

swift-get-nodes /etc/swift/account.ring.gz AUTH_system test etc/swift/proxy-server.conf

Account AUTH_system
Container test
Object etc/swift/proxy-server.conf

Partition 87200
Hash 552810c4f98f18a7dd34e4d1cb424346

If you just use system as the account name, you will get the wrong partition:

root@sws50:/srv/node/sdc/objects/87200/346/552810c4f98f18a7dd34e4d1cb424346# swift-get-nodes /etc/swift/account.ring.gz system test etc/swift/proxy-server.conf

Account system
Container test
Object etc/swift/proxy-server.conf

Partition 37186
Hash 245090f61e02f1488feadc4b6c30c9c8

This happens because swift uses 'AUTH_system' as the account name saved in the account DB. You can find it in the account_stat table. I also suggest that swift-get-nodes should describe its usage in more detail; the current help easily misleads people.
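
To illustrate why the prefix matters: the hash is taken over the whole "/account/container/object" path, so any change to the account string yields an unrelated hash and therefore an unrelated partition. The following is a rough Python sketch, assuming the hash is an MD5 of that path plus a cluster-specific secret suffix; the suffix value below is a placeholder, so the printed hashes will not match the ones in this thread.

```python
import hashlib


def hash_path(account, container=None, obj=None, suffix=b""):
    """Hash an account/container/object path the way the ring lookup is
    assumed to: MD5 over "/account[/container[/object]]" plus a suffix."""
    parts = [account]
    if container:
        parts.append(container)
    if obj:
        parts.append(obj)
    path = "/" + "/".join(parts)
    return hashlib.md5(path.encode("utf-8") + suffix).hexdigest()


if __name__ == "__main__":
    # Placeholder secret; a real cluster has its own value in its swift config.
    suffix = b"placeholder-suffix"
    obj = "etc/swift/proxy-server.conf"
    print("AUTH_system:", hash_path("AUTH_system", "test", obj, suffix))
    print("system:     ", hash_path("system", "test", obj, suffix))
```

The two printed hashes share nothing in common, which is consistent with the two unrelated partitions (87200 and 37186) shown above for the same container and object.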

Changed in swift:
status: New → Invalid
status: Invalid → Opinion
Xingchao Yu (yuxcer)
Changed in swift:
assignee: nobody → yuxcer (yuxcer)
Jérémy Guermonprez (jguermonprez) wrote :

Hi,

Thank you for your response.
In fact it works with "account_id" not with "reseller_prefix_account_name".

mhy (mhy19) wrote :

Hi Bob,
Thanks for the reply.
I found the correct locations of the object, container, and account using the approach you suggested. Thank you very much!

Sergio Rubio (rubiojr) wrote :

Took me a while to figure out that the account name must be AUTH + hash when using keystone.

More descriptive output from "swift-get-nodes --help" indicating this would be great.

Kun Huang (academicgareth) wrote :

You're right. It needs more helpful output. A related solution is here: https://bugs.launchpad.net/swift/+bug/907908
