Sahara + swift + DLO : file not found
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Sahara | New | Undecided | Unassigned |
Bug Description
I'm using Liberty, with Sahara 4.0.0 from the CentOS 7 repo. This is a vanilla installation, except for the hadoop-swift jar file: I downloaded a fairly recent version because it fixed a previous issue I had (socket timeout while accessing Swift). When I try to process a large file from Swift, I get a FileNotFound exception.
Steps to reproduce:
-------------------
In Swift, upload a big file split into several 1 GB segments. I use the following argument with my jar file:
Input source file = swift:/
The saveAsHadoop method should download all segments in the 2mass_segments container, as specified in the 2mass.csv manifest, but Java exits with the following error: Exception in thread "main" java.io.
swift:/
It seems that the path should be :
swift:/
Information:
--------------
MD5 of the hadoop-swift jar:
ubuntu@
cbe1478523ba794
Content of my swift container :
-------
ubuntu@
2mass
2mass_segments
ubuntu@
0 2016-10-26 08:03:48 None 2mass.csv
0
Header :
-------
ubuntu@
Account: v1
Container: 2mass
Object: 2mass.csv
Content Type: binary/octet-stream
Content Length: 58G
Last Modified: Wed, 26 Oct 2016 08:03:48 GMT
ETag: "c1a3b941c9a1a4
Manifest: 2mass_segments/
Meta Mtime: 1476283376.280434
Accept-Ranges: bytes
X-Timestamp: 1477469028.93125
X-Trans-Id: tx0000000000000
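For reference, here is a minimal sketch of how a DLO manifest is expected to resolve to its segments. Swift reads the Manifest header value as `<container>/<prefix>` and serves, in name order, the concatenation of every object in that container whose name starts with the prefix. The manifest object itself holds no data, which is why the container listing shows it as 0 bytes while the header reports the full length. The prefix and object names below are made up for illustration (the real Manifest value above is truncated), and this is not the hadoop-swift implementation.

```python
def split_manifest(value):
    """Split a DLO manifest value of the form '<container>/<prefix>'.

    Note: Swift expects this value to be URL-encoded; this sketch assumes
    plain ASCII names and skips decoding.
    """
    container, _, prefix = value.partition("/")
    return container, prefix


def resolve_segments(manifest_value, container_listing):
    """Return the segment names a DLO manifest refers to, in name order.

    container_listing: iterable of object names found in the segment container.
    """
    _, prefix = split_manifest(manifest_value)
    return sorted(name for name in container_listing if name.startswith(prefix))


# Hypothetical values, for illustration only.
manifest = "2mass_segments/example-prefix/"
listing = [
    "example-prefix/00000001",
    "example-prefix/00000000",
    "unrelated/00000000",
]

container, prefix = split_manifest(manifest)
segments = resolve_segments(manifest, listing)
print(container, segments)
```

A client that reads the object through the manifest has to look for the segments in the container named by the Manifest header (here 2mass_segments), not in the container that holds the manifest (2mass).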
Spark command line (as Sahara builds it):
-------
ubuntu@
2016-11-07 13:49:13,329 INFO Running /opt/spark/
.openstack.
ml cds.xmatch.
cat stderr :
------------
<---- SNIP --->
16/11/07 13:49:18 INFO scheduler.
Exception in thread "main" java.io.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
at org.apache.
<--- SNIP --->
Thanks everyone
Is it a duplicate of https://bugs.launchpad.net/sahara/+bug/1593663 ?