    def _create_z_content_from_chunks(self, chunks):
        compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)
        # Peak in this point is 1 fulltext, 1 compressed text, + zlib overhead
        # (measured peak is maybe 30MB over the above...)
        compressed_chunks = map(compressor.compress, chunks)
        compressed_chunks.append(compressor.flush())
        # Ignore empty chunks
        self._z_content_chunks = [c for c in compressed_chunks if c]
        self._z_content_length = sum(map(len, self._z_content_chunks))
The whole GroupCompressBlock class is based on a format in which each compressed block is preceded by the length of the compressed data, and I don't think we can compatibly remove that per-file length prefix. What we could probably do, however, is spill the compressed content to a temporary file, trading disk for memory pressure. Once compression is done we know the length, and we can copy from the temporary file out to the actual pack. We just need to be careful not to read the whole file back into memory at once.
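The spill-to-disk idea above could look roughly like the following minimal sketch. The function name, the newline-terminated decimal length prefix, and the copy chunk size are all assumptions for illustration, not the actual pack format; the point is only that the length is known after flush(), and that shutil.copyfileobj streams the temporary file in bounded chunks rather than reading it whole.

```python
import shutil
import tempfile
import zlib

def spill_compressed_to_pack(chunks, out_file, copy_chunk_size=64 * 1024):
    """Sketch: compress chunks into a temp file, then stream into out_file.

    The compressed length is only known after flush(), so we spill to a
    temporary file first, write the (hypothetical) length prefix, then copy
    the spill file across in bounded chunks -- peak memory stays at roughly
    one chunk, regardless of the total compressed size.
    """
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)
    with tempfile.TemporaryFile() as spill:
        for chunk in chunks:
            spill.write(compressor.compress(chunk))
        spill.write(compressor.flush())
        z_length = spill.tell()
        out_file.write(b'%d\n' % z_length)  # hypothetical length prefix
        spill.seek(0)
        # copyfileobj reads copy_chunk_size bytes at a time, never the
        # whole compressed content
        shutil.copyfileobj(spill, out_file, copy_chunk_size)
    return z_length
```

A real implementation would also need a threshold below which it keeps the compressed content in memory, since a temp file per small block would be wasteful.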
For local storage we could leave space for the count, write out the compressed content, and then seek back to fill it in; but that obviously won't work when sending these things across the network, and it might perform poorly over some transports.
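The seek-back trick for local storage can be sketched as below. The fixed 8-byte big-endian prefix and the function name are assumptions made for the example; they stand in for whatever length encoding the real format uses, and the approach only works on a seekable file.

```python
import struct
import zlib

def write_with_backpatched_length(path, chunks):
    """Sketch: reserve space for the length, write the compressed content,
    then seek back and patch in the real value.

    Avoids buffering the compressed content, but requires a seekable
    target, so it cannot be used when streaming over the network.
    """
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION)
    with open(path, 'wb') as f:
        f.write(b'\x00' * 8)                  # placeholder for the length
        for chunk in chunks:
            f.write(compressor.compress(chunk))
        f.write(compressor.flush())
        z_length = f.tell() - 8
        f.seek(0)
        f.write(struct.pack('>Q', z_length))  # backpatch the real length
    return z_length
```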
In passing, https://code.launchpad.net/~mbp/bzr/remove-pylzma/+merge/82097 can clean this up a bit.