simplestreams by-hash storage

Bug #1281767 reported by Scott Moser
This bug affects 1 person
Affects                 Status     Importance  Assigned to  Milestone
simplestreams           Confirmed  Undecided   Unassigned
simplestreams (Ubuntu)  Confirmed  Medium      Unassigned

Bug Description

[Description]
Currently, simplestreams is able to mirror a source to a local filesystem using the paths listed in the source.
For example:
 sstream-mirror --keep --max=1 http://cloud-images.ubuntu.com/releases/ out.d

That will mirror content from the download stream
 http://cloud-images.ubuntu.com/releases/streams/v1/com.ubuntu.cloud:released:download.json
into out.d/ with paths like server/releases/saucy/release-20140129/ubuntu-13.10-server-cloudimg-amd64.tar.gz .

That is safe (per definition of simplestreams format) and external paths are checked for safety.

The issue is that if multiple paths refer to content with the same sha256 sum, then we'll download that content multiple times. The files could be listed on completely separate sites or under different content on the same site.

This is wasteful if downloads are hundreds of megabytes.
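
As a rough illustration of the duplication, the entries in a stream can be grouped by sha256 so that each distinct checksum is fetched only once. This is a minimal sketch, not simplestreams code: the function name group_by_sha256 and the sample entries are hypothetical, and the only assumption carried over from the format is that each entry has a "path" and a "sha256".

```python
from collections import defaultdict


def group_by_sha256(items):
    """Map each sha256 to the list of paths that reference it."""
    groups = defaultdict(list)
    for item in items:
        groups[item["sha256"]].append(item["path"])
    return dict(groups)


# Hypothetical entries: two paths share one checksum.
items = [
    {"path": "server/a/disk.img", "sha256": "aa11"},
    {"path": "server/b/disk.img", "sha256": "aa11"},
    {"path": "server/c/disk.img", "sha256": "bb22"},
]

groups = group_by_sha256(items)
# One download per distinct checksum, rather than one per path.
downloads_needed = len(groups)
```

With path-based mirroring, the three entries above cost three downloads; grouped by checksum, only two are needed.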

[Changes]

The goal is to just change simplestreams to be able to store "by-hash", and create hardlink or symlink trees from the paths listed in the source.

The format used for the by-hash storage would be just like git's object storage:
 <storage>[A-Z][A-Z]/....
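
The proposal could be sketched roughly as follows. This is only an illustration under stated assumptions, not the simplestreams implementation: mirror_item, by_hash_path, the ".by-hash" directory name and the fetch callback are all hypothetical, and the split after the first two characters of the checksum is assumed from the comparison with git. The sketch tries a hardlink first and falls back to a symlink if hardlinks are not available.

```python
import hashlib
import os
import tempfile


def by_hash_path(storage, sha256):
    # Assumption: split after the first two characters, as git does.
    return os.path.join(storage, sha256[:2], sha256[2:])


def mirror_item(storage, tree, rel_path, sha256, fetch):
    """Place rel_path under tree, downloading only if the blob is missing."""
    blob = by_hash_path(storage, sha256)
    if not os.path.exists(blob):
        data = fetch()
        if hashlib.sha256(data).hexdigest() != sha256:
            raise ValueError("checksum mismatch for %s" % rel_path)
        os.makedirs(os.path.dirname(blob), exist_ok=True)
        tmp = blob + ".tmp"
        with open(tmp, "wb") as fp:
            fp.write(data)
        os.replace(tmp, blob)  # publish the blob only once fully written

    target = os.path.join(tree, rel_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    if os.path.lexists(target):
        os.remove(target)
    try:
        os.link(blob, target)  # hardlink into the path tree
    except OSError:
        os.symlink(os.path.abspath(blob), target)  # fall back to a symlink
    return target


# Demo: two paths with the same checksum trigger a single fetch.
root = tempfile.mkdtemp()
storage = os.path.join(root, ".by-hash")
tree = os.path.join(root, "out.d")
payload = b"example image contents"
digest = hashlib.sha256(payload).hexdigest()
fetch_calls = []


def fetch():
    fetch_calls.append(1)
    return payload


first = mirror_item(storage, tree, "server/a/disk.img", digest, fetch)
second = mirror_item(storage, tree, "server/b/disk.img", digest, fetch)
```

Both paths end up referring to the same stored file, and the second call does not invoke fetch at all because the blob is already present.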

Scott Moser (smoser)
Changed in simplestreams (Ubuntu):
milestone: none → ubuntu-14.04
importance: Undecided → Medium
Scott Moser (smoser)
summary: - FFE: simplestreams by-hash storage
+ simplestreams by-hash storage
Changed in simplestreams:
status: New → Confirmed
Changed in simplestreams (Ubuntu):
status: New → Confirmed
Mathew Hodson (mhodson)
Changed in simplestreams (Ubuntu):
milestone: ubuntu-14.04 → none