Cannot Download Large Object via CLI (radosgw)

Bug #1482888 reported by Jerome Bell
This bug affects 1 person
Affects              Status         Importance   Assigned to           Milestone
Mirantis OpenStack   Fix Released   High         Radoslaw Zarzynski
5.1.x                Won't Fix      High         MOS Maintenance
6.0.x                Won't Fix      High         MOS Maintenance
6.1.x                Won't Fix      High         Alex Ermolov
7.0.x                Won't Fix      High         Radoslaw Zarzynski
8.0.x                Fix Released   High         Rodion Tikunov

Bug Description

Hi, I attempted to download a large file (7.6 GB) via the command line and got the following error:

Object GET failed: http://192.168.2.3:6780/swift/v1/Disc%20Images/HP%20Pavilion%20dv7-6c95dx%20Recovery%20Disc%20%281%20of%203%29.ISO 500 Internal Server Error [first 60 chars of response] <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><he

I am running MOS 5.1. I'm not sure where to troubleshoot next and any help is greatly appreciated. Thanks!

tags: added: customer-found
Changed in mos:
importance: Undecided → High
Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Hi Jerome,

you have probably hit bug https://bugs.launchpad.net/mos/+bug/1481321 - could you please apply the workaround for that issue and try to reproduce again?

How much free disk space do you have on the controller? (Please attach the 'df -h' output.)

Also, could you please attach Glance logs or Fuel diagnostic snapshot?

Thank you!

Changed in mos:
status: New → Incomplete
milestone: none → 5.1-updates
assignee: nobody → Jerome Bell (jeromebell)
milestone: 5.1-updates → none
Revision history for this message
Jerome Bell (jeromebell) wrote :

I have 42 GB of free disk space. I am strictly using Swift (backed by Ceph) and not Glance at all. I don't think the workaround is relevant, since it concerns upload, whereas I am trying to DOWNLOAD a large file. I only have 3 GB of RAM on the controller and 2 GB is always in use. Could it be that downloads are also chunked in memory, and that is why it fails?

Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Hi MOS Swift team, could you please check the issue and prepare the fix or describe the workaround for the issue?

Thank you!

Looks like this bug was reproduced on a customer environment with MOS 5.1.1, which means we need to track the fix for the update releases too. I have added series for MOS 5.1.1 updates, MOS 6.0 updates and MOS 6.1 updates.

Changed in mos:
assignee: Jerome Bell (jeromebell) → MOS Swift (mos-swift)
milestone: none → 7.0
status: Incomplete → Won't Fix
status: Won't Fix → Confirmed
Revision history for this message
Alexey Khivin (akhivin) wrote :

@Jerome

Can you please provide more information?
> I am strictly using Swift (backed by Ceph)
Do you mean you are using radosgw instead of Swift?
Swift does not use Ceph as a backend, but you may use swift-client with RadosGW, which emulates the Swift API.
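One way to check which service actually backs the object-store endpoint (a sketch, assuming the Keystone v2 CLI shipped with that release and the usual OS_* credentials exported on the controller) is to look at the service catalog; an endpoint path like /swift/v1, as in the error URL above, is the radosgw Swift-compatible API, whereas a native Swift proxy typically exposes /v1/AUTH_<tenant>:

# List the Keystone service catalog and look at the object-store entry
keystone catalog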

Changed in mos:
status: Confirmed → Incomplete
Revision history for this message
Alexey Khivin (akhivin) wrote :

@Jerome

I would highly appreciate it if you could provide the exact commands and the Swift configs.

Revision history for this message
Alexey Khivin (akhivin) wrote :

@Timur
Can you provide more information about how you reproduced this bug? Did you use RadosGW or Swift?

I tried to download 10 GB from Swift and succeeded.

How was the file uploaded? Was it an image uploaded by Glance, or a file uploaded with swift-client?

Revision history for this message
Alexey Khivin (akhivin) wrote :

@Jerome @Timur
Are you trying to download a file that was uploaded a moment ago, or one that was uploaded some time ago?

Could you provide a list of the chunks for the file you cannot download?

Revision history for this message
Timur Nurlygayanov (tnurlygayanov) wrote :

Hi,

it looks like we have the following issue with the default configuration of Swift+Glance: during image upload, Swift stores all chunks in the /srv directory, which is mounted on the root (/) partition. By default the root partition has little free space, so when a user tries to store a large file the upload fails once /srv runs out of space; and after the first failed upload we cannot store other images either, because the chunks left behind by the failed upload are never removed. A quick way to check whether this is happening on a given controller is shown below.
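A generic check (assuming chunks are kept under /srv as described above):

# Free space on the root partition and on /srv, where the chunks are stored
df -h / /srv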

More than that, this is probably another issue with Swift. @Jerome, could you please attach the Swift and Glance logs to the bug? It is hard to reproduce and investigate the issue without any information.

Revision history for this message
Jerome Bell (jeromebell) wrote :

I am using radosgw instead of Swift.

The file was uploaded with swift-client using 2 GB chunks (I don't remember the verbatim command since it was uploaded many months ago). There was never any error uploading the file.

Where can I find the Swift logs? (Are Glance logs relevant, since I did not use Glance at all to upload the file and I am not using Glance at all to download it?)

Revision history for this message
Alexey Khivin (akhivin) wrote :

So, there is no Swift involved in this issue; it looks like a RadosGW issue. Reassigning to the Ceph team.

Changed in mos:
assignee: MOS Swift (mos-swift) → MOS Ceph (mos-ceph)
Revision history for this message
Jerome Bell (jeromebell) wrote :

The exact command (with password removed) is

 swift -v -V 2.0 -A http://192.168.2.3:5000/v2.0/ -U Jerome:jerome -K ###### download "Disc Images" "HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO"

Revision history for this message
Radoslaw Zarzynski (rzarzynski) wrote :

Hello Jerome,

Could you please take a look at the radosgw log (in MOS 5.1.1 it should be located at /var/log/radosgw/ceph-client.radosgw.gateway) and send some parts of it? We are especially interested in fragments illustrating the failed operations. They can be recognized by "http_status=500" in the closing marker. Example:

2015-08-18 13:24:45.476762 7fa79ffaf700 1 ====== starting new request req=0x7fa758015d20 =====
2015-08-18 13:24:45.476994 7fa79ffaf700 <a lot of useful information between here>
2015-08-18 13:24:45.514256 7fa79ffaf700 1 ====== req done req=0x7fa758015d20 http_status=500 =====
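For example, something like this should pull the failed requests out of the log together with the preceding context lines (assuming the default log path mentioned above):

grep -B 20 'http_status=500' /var/log/radosgw/ceph-client.radosgw.gateway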

Could you also provide us with the list of segments for the failed object? You can obtain it by listing the container called "Disc Images_segments".

Changed in mos:
assignee: MOS Ceph (mos-ceph) → Radoslaw Zarzynski (rzarzynski)
Revision history for this message
Jerome Bell (jeromebell) wrote :

The "/varlog/radosgw" directory is empty. I actually misplaced the VM containing MOS, is there a way to confirm the MOS version on the controller?

swift -v -V 2.0 -A http://192.168.2.3:5000/v2.0/ -U Jerome:jerome -K ##### list "Disc Images_segments"

yields

HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO/1349485243.680366/8192393216/2147483648/00000000
HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO/1349485243.680366/8192393216/2147483648/00000001
HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO/1349485243.680366/8192393216/2147483648/00000002
HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO/1349485243.680366/8192393216/2147483648/00000003
HP Pavilion dv7-6c95dx Recovery Disc (2 of 3).ISO/1349487161.213882/8400273408/2147483648/00000000
HP Pavilion dv7-6c95dx Recovery Disc (2 of 3).ISO/1349487161.213882/8400273408/2147483648/00000001
HP Pavilion dv7-6c95dx Recovery Disc (2 of 3).ISO/1349487161.213882/8400273408/2147483648/00000002
HP Pavilion dv7-6c95dx Recovery Disc (2 of 3).ISO/1349487161.213882/8400273408/2147483648/00000003

Revision history for this message
Jerome Bell (jeromebell) wrote :

Correction, the "/var/log/radosgw" directory is empty.

Changed in mos:
status: Incomplete → In Progress
Revision history for this message
Radoslaw Zarzynski (rzarzynski) wrote :

It looks like the problem is caused by the lack of URL decoding of the values (container name and object prefix) stored in the X-Object-Manifest attribute. The problem has been reproduced even on the current master branch of Ceph. I've filed a bug report [1] and prepared a pull request [2] with an initial, early proposal of the fix.

[1] http://tracker.ceph.com/issues/12728
[2] https://github.com/ceph/ceph/pull/5617
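For a rough illustration (a sketch, reusing the $publicURL and $token placeholders introduced in the workaround further down): the manifest object carries an X-Object-Manifest value naming the segments container and object prefix, and on an affected gateway that value appears to still contain percent-encoded names (e.g. Disc%20Images_segments/...), which the unfixed radosgw did not decode before looking up the segments. The stored value can be inspected with a HEAD request:

# HEAD the manifest object and look at the returned X-Object-Manifest header
curl -I "$publicURL/Disc%20Images/HP%20Pavilion%20dv7-6c95dx%20Recovery%20Disc%20%281%20of%203%29.ISO" -H "X-Auth-Token: $token"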

Revision history for this message
Jerome Bell (jeromebell) wrote :

Is there a workaround?

Revision history for this message
Radoslaw Zarzynski (rzarzynski) wrote :

Hi Jerome,

sorry for the late reply. The pull request has been merged and will be backported to Hammer.

I think a workaround exists. You might try to delete the currently present DLO (but not its segments) and then re-upload it with a non-urlencoded X-Object-Manifest header.

I used commands similar to these:

curl -i "$publicURL/Disc Images/HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO" -X DELETE -H "X-Auth-Token: $token"

curl -i "$publicURL/Disc Images/HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO" -X PUT -H "X-Object-Manifest: Disc Images_segments/HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO/1349485243.680366/8192393216/2147483648/" -H "X-Auth-Token: $token"

However, be aware that the HTTP server in front of radosgw might affect this solution. I tried it with nginx, not Apache.

Changed in mos:
status: In Progress → Fix Committed
Revision history for this message
Jerome Bell (jeromebell) wrote :

What value should I use for "$publicURL"?

Revision history for this message
Jerome Bell (jeromebell) wrote :

And what value should I use for "$token"?

Revision history for this message
Radoslaw Zarzynski (rzarzynski) wrote :

http://<controller_ip>:6780/swift/v1
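As for $token: one way to obtain both values with the same credentials used in the download command above (a sketch; the verbose stat output of python-swiftclient prints them, though the exact field names may vary between versions):

swift -v -V 2.0 -A http://192.168.2.3:5000/v2.0/ -U Jerome:jerome -K ###### stat
# The verbose output should include a StorageURL line (use it as $publicURL)
# and an Auth Token line (use it as $token).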

tags: added: on-verification
Revision history for this message
Alexander Petrov (apetrov-n) wrote :

I was able to reproduce this bug on mos-70 build 286.
Env: Controller+Ceph - 1, Compute + Ceph - 1, Neutron with VLAN segmentation

{"build_id": "286", "build_number": "286", "release_versions": {"2015.1.0-7.0": {"VERSION": {"build_id": "286", "build_number": "286", "api": "1.0", "fuel-library_sha": "ff63a0bbc93a3a0fb78215c2fd0c77add8dfe589", "nailgun_sha": "5c33995a2e6d9b1b8cdddfa2630689da5084506f", "feature_groups": ["mirantis"], "fuel-nailgun-agent_sha": "d7027952870a35db8dc52f185bb1158cdd3d1ebd", "openstack_version": "2015.1.0-7.0", "fuel-agent_sha": "082a47bf014002e515001be05f99040437281a2d", "production": "docker", "python-fuelclient_sha": "1ce8ecd8beb640f2f62f73435f4e18d1469979ac", "astute_sha": "8283dc2932c24caab852ae9de15f94605cc350c6", "fuel-ostf_sha": "1f08e6e71021179b9881a824d9c999957fcc7045", "release": "7.0", "fuelmain_sha": "9ab01caf960013dc882825dc9b0e11ccf0b81cb0"}}}, "auth_required": true, "api": "1.0", "fuel-library_sha": "ff63a0bbc93a3a0fb78215c2fd0c77add8dfe589", "nailgun_sha": "5c33995a2e6d9b1b8cdddfa2630689da5084506f", "feature_groups": ["mirantis"], "fuel-nailgun-agent_sha": "d7027952870a35db8dc52f185bb1158cdd3d1ebd", "openstack_version": "2015.1.0-7.0", "fuel-agent_sha": "082a47bf014002e515001be05f99040437281a2d", "production": "docker", "python-fuelclient_sha": "1ce8ecd8beb640f2f62f73435f4e18d1469979ac", "astute_sha": "8283dc2932c24caab852ae9de15f94605cc350c6", "fuel-ostf_sha": "1f08e6e71021179b9881a824d9c999957fcc7045", "release": "7.0", "fuelmain_sha": "9ab01caf960013dc882825dc9b0e11ccf0b81cb0"}

Steps to reproduce:
1. create container "Disc Images"

2. create file "HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO" by the command:

dd if=/dev/zero of="HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO" bs=1000M count=8

3. upload it to container "Disc Images"

swift upload "Disc Images" "HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO" -S 1000000000

4. download it by the command:

swift download "Disc Images" "HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO"

I get the result:

Error downloading object 'Disc Images/HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO': Object GET failed: http://192.168.0.2:8080/swift/v1/Disc%20Images/HP%20Pavilion%20dv7-6c95dx%20Recovery%20Disc%20%281%20of%203%29.ISO 500 Internal Server Error [first 60 chars of response] <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><he

Changed in mos:
status: Fix Committed → Confirmed
tags: removed: on-verification
Changed in mos:
assignee: Radoslaw Zarzynski (rzarzynski) → Mike Fedosin (mfedosin)
Changed in mos:
milestone: 7.0 → 7.0-updates
Revision history for this message
Mike Fedosin (mfedosin) wrote :

Seriously, I have no idea why this bug was assigned to me. These were direct Swift requests via the swift CLI, without any Glance usage.

I'm going to assign this bug to mos-swift.

Revision history for this message
Alexey Khivin (akhivin) wrote :

Please read https://bugs.launchpad.net/mos/+bug/1482888/comments/15 before assigning this to the Swift team.
There is no Swift in this ticket, only RadosGW.

tags: added: radosgw
summary: - Cannot Download Large Object via CLI
+ Cannot Download Large Object via CLI (radosgw)
Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Radoslaw, please review the fix and backport to 7.0 if applicable. If not, please update the status accordingly (Won't Fix or Invalid).

Revision history for this message
Vitaly Sedelnik (vsedelnik) wrote :

Moved out of 7.0-mu-1 scope per feedback from Radoslaw

Egor Kotko (ykotko)
tags: added: on-verification
Egor Kotko (ykotko)
tags: added: on-verificatin
removed: on-verification
Revision history for this message
Egor Kotko (ykotko) wrote :
Revision history for this message
Egor Kotko (ykotko) wrote :
Egor Kotko (ykotko)
tags: removed: on-verificatin
Revision history for this message
Roman Podoliaka (rpodolyaka) wrote :

Egor, are you sure it's the same issue? It looks like the error in your paste is with *upload*, not *download*.

tags: added: area-ceph
Revision history for this message
Radoslaw Zarzynski (rzarzynski) wrote :

The fix for the bug was merged into Ceph's master branch in August. It has been backported to the Hammer LTS branch quite recently [1] and thus will be part of the upcoming Ceph Hammer v0.94.6 (not released yet).

IMO, backporting it manually to the Ceph v0.94.5 we will ship in MOS 8.0 isn't a good solution, as each new build must be extensively tested. Selective patching would also introduce an extra degree of freedom when diagnosing problems in the field.

We may release v0.94.6 as a part of 8.0-updates.

I'm also setting the proper status for the bug ("Fix Committed" wasn't adequate; sorry for the burden).

[1] #13513 backport task on Ceph's tracker:
    http://tracker.ceph.com/issues/13513

Changed in mos:
milestone: 8.0 → 9.0
status: Incomplete → Confirmed
tags: added: move-to-mu
tags: added: release-notes
tags: added: 8.0 release-notes-done
removed: release-notes
Revision history for this message
Denis Meltsaykin (dmeltsaykin) wrote :

As per comment from Radoslaw (#30) closing this as Won't fix for 5.1.1, 6.0, 6.1 and 7.0.

tags: added: wontfix-risky
Revision history for this message
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check performed automatically)
Please, make sure that bug description contains the following sections filled in with the appropriate data related to the bug you are describing:

actual result

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Revision history for this message
Radoslaw Zarzynski (rzarzynski) wrote :

The fix has been backported to Ceph v0.94.6 [1][2], which will be shipped in MOS 9 [3]. I'm changing the status to "Fix Committed".

[1] http://tracker.ceph.com/issues/13513
[2] http://ceph.com/releases/v0-94-6-hammer-released/
[3] https://review.fuel-infra.org/#/c/17407/

Changed in mos:
status: Confirmed → Fix Committed
Changed in mos:
status: Fix Committed → Fix Released
Revision history for this message
Maksym Shalamov (mshalamov) wrote :

Verified on:

MOS 9.0

cat /etc/fuel_build_id:
 200
cat /etc/fuel_build_number:
 200
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6332.noarch
 fuel-misc-9.0.0-1.mos8291.noarch
 python-packetary-9.0.0-1.mos131.noarch
 fuel-openstack-metadata-9.0.0-1.mos8648.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8648.noarch
 python-fuelclient-9.0.0-1.mos306.noarch
 fuel-9.0.0-1.mos6332.noarch
 fuel-nailgun-9.0.0-1.mos8648.noarch
 rubygem-astute-9.0.0-1.mos738.noarch
 fuel-library9.0-9.0.0-1.mos8291.noarch
 fuel-agent-9.0.0-1.mos272.noarch
 fuel-ui-9.0.0-1.mos2659.noarch
 fuel-setup-9.0.0-1.mos6332.noarch
 nailgun-mcagents-9.0.0-1.mos738.noarch
 shotgun-9.0.0-1.mos87.noarch
 network-checker-9.0.0-1.mos72.x86_64
 fuel-bootstrap-cli-9.0.0-1.mos272.noarch
 fuel-migrate-9.0.0-1.mos8291.noarch
 fuelmenu-9.0.0-1.mos268.noarch
 fuel-notify-9.0.0-1.mos8291.noarch
 fuel-ostf-9.0.0-1.mos924.noarch
 fuel-mirror-9.0.0-1.mos131.noarch
 fuel-utils-9.0.0-1.mos8291.noarch

Revision history for this message
Rodion Tikunov (rtikunov) wrote :

I tried to reproduce this bug in an 8.0 lab. After following the steps described in https://bugs.launchpad.net/mos/+bug/1482888/comments/21, I got this result:
root@node-2:~# swift download "Disc Images" "HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO"
HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO [auth 0.788s, headers 1.196s, total 1.197s, 0.000 MB/s]
"HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO" have a zero size.

curl -i "http://192.168.0.2:8080/swift/v1/Disc%20Images/HP%20Pavilion%20dv7-6c95dx%20Recovery%20Disc%20%281%20of%203%29.ISO" -X GET -H "X-Auth-Token: $token"
HTTP/1.1 200 OK
Date: Thu, 01 Sep 2016 14:52:57 GMT
Server: Apache
Content-Length: 0
Content-Type: application/x-iso9660-image

No "Internal Server Error" but file also can't be downloaded because of file's Content-Length = 0.

Revision history for this message
Rodion Tikunov (rtikunov) wrote :

Files without special symbols in their names upload and download without problems.

Revision history for this message
Rodion Tikunov (rtikunov) wrote :

Proposed patch https://review.fuel-infra.org/#/c/25640/ solves this issue.

tags: added: on-verification
Revision history for this message
Ekaterina Shutova (eshutova) wrote :

Verified on MOS 8.0 mu4. Used scenario from comment #21.
root@node-1:~# swift download "Disc Images" "HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO"
HP Pavilion dv7-6c95dx Recovery Disc (1 of 3).ISO [auth 0.915s, headers 1.207s, total 1.253s, 0.000 MB/s]
root@node-1:~# swift list
Disc Images
Disc Images_segments
