Chunk has:
  -- raw digest (always)
  -- file (many)
  -- offset (many)
  -- length (many)
  -- enc digest (many)

Chunk description in .brackup file:

  Chunks: offset;raw_length;stored_length;typed_digest

Proposal to change:

  Chunks: offset;raw_length;stored_length;typed_digest(stored);typed_digest(raw);flags

Where flags is a comma-separated list of \w+, e.g. "gz" for gzip compression.

----

PositionedChunk (subclass of RawChunk)
  - has a:
      file
      offset
      RawChunk
  - used by:
      $file->foreach_chunk(sub { my $poschunk = shift; });
      restoring stuff?
  - can:
      write back to disk?

RawChunk
  - has a:
      length
      digest
      contents
  - used by:
      PositionedChunk

ChunkHandle
  - has a:
      digest of stored chunk
  - used by:
      return value from asking the target whether it has a raw chunk,
      or after it stores a raw chunk.

StoredChunk

------

Document:
  The purpose of naming chunks by their final (stored) digest is twofold:
    1) can verify the integrity of the storage medium. Is it corrupt?
    2) hides proof of ownership of the contents (when encrypted)
  Side effect:
    -- when we do compression, we'll be consistent and store the chunk
       under its compressed digest, even if it is not encrypted as well.

Maybe we don't need per-chunk meta files:
  -- can get it all from the .brackup (meta)files on the server.
     (TODO: abstract out parser for multiple users)

---

Smart chunk-sizing on certain files with metadata and data separate, like
mp3 files and their id3. Have a smart chunker that splits the data part
from the id3 part, so updating the id3 later doesn't reupload the entire
data part. :-)
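
A rough sketch of the smart-chunker idea for mp3 files, written in Python purely for illustration (it is not Brackup code, and the function names are made up here). It assumes the common layout: an optional ID3v2 tag at the start of the file (10-byte header beginning with "ID3", with a 4-byte synchsafe size) and an optional 128-byte ID3v1 tag at the end beginning with "TAG". Anything else is treated as audio data. A real chunker would probably also split the audio region into size-limited chunks.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (kind, offset, length); hypothetical shape


def synchsafe_to_int(raw: bytes) -> int:
    """Decode a 4-byte ID3v2 synchsafe integer (7 usable bits per byte)."""
    return (raw[0] << 21) | (raw[1] << 14) | (raw[2] << 7) | raw[3]


def split_mp3_regions(data: bytes) -> List[Region]:
    """Split file contents into id3v2 / audio / id3v1 regions.

    Each region can then be chunked on its own, so rewriting a tag
    leaves the audio bytes (and hence their digests) untouched, as long
    as the audio bytes themselves do not change.
    """
    start, end = 0, len(data)
    regions: List[Region] = []

    # ID3v2: "ID3", 2 version bytes, 1 flags byte, 4 synchsafe size bytes.
    if len(data) >= 10 and data[:3] == b"ID3":
        size_bytes = data[6:10]
        if all(b < 0x80 for b in size_bytes):
            tag_len = 10 + synchsafe_to_int(size_bytes)
            if data[5] & 0x10:  # footer-present flag adds 10 bytes
                tag_len += 10
            tag_len = min(tag_len, len(data))
            regions.append(("id3v2", 0, tag_len))
            start = tag_len

    # ID3v1: fixed 128-byte trailer starting with "TAG".
    trailer = None
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        trailer = ("id3v1", end - 128, 128)
        end -= 128

    if end > start:
        regions.append(("audio", start, end - start))
    if trailer is not None:
        regions.append(trailer)
    return regions
```

With this split, the audio region keeps the same bytes after a tag edit, although its offset moves if the tag at the front changes size.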
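
To make the proposed six-field chunk record concrete, here is a minimal Python sketch that parses and re-serializes one record (the part after "Chunks:"). It is only an illustration, not Brackup's parser: the names ChunkRecord, parse_chunk_record and format_chunk_record are invented here, and the "sha1:..." digest strings in the usage example are an assumed shape for a typed digest. Digests are kept as opaque strings; flags are checked against \w+ as the proposal describes.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Each flag must be a run of word characters, per the proposal (\w+).
FLAG_RE = re.compile(r"\w+", re.ASCII)


@dataclass(frozen=True)
class ChunkRecord:
    offset: int
    raw_length: int
    stored_length: int
    stored_digest: str      # typed_digest(stored), kept opaque
    raw_digest: str         # typed_digest(raw), kept opaque
    flags: Tuple[str, ...]  # e.g. ("gz",) for gzip compression


def parse_chunk_record(text: str) -> ChunkRecord:
    """Parse 'offset;raw_length;stored_length;stored;raw;flags'."""
    fields = text.strip().split(";")
    if len(fields) != 6:
        raise ValueError(f"expected 6 ';'-separated fields, got {len(fields)}")
    offset, raw_len, stored_len, stored_digest, raw_digest, flag_field = fields
    flags = tuple(flag_field.split(",")) if flag_field else ()
    for flag in flags:
        if not FLAG_RE.fullmatch(flag):
            raise ValueError(f"bad flag: {flag!r}")
    return ChunkRecord(int(offset), int(raw_len), int(stored_len),
                       stored_digest, raw_digest, flags)


def format_chunk_record(rec: ChunkRecord) -> str:
    """Serialize a record back into the ';'-separated form."""
    return ";".join([
        str(rec.offset), str(rec.raw_length), str(rec.stored_length),
        rec.stored_digest, rec.raw_digest, ",".join(rec.flags),
    ])
```

Usage, with made-up digest values:

```python
rec = parse_chunk_record("0;1048576;524300;sha1:aaaa;sha1:bbbb;gz")
print(rec.flags)                 # ('gz',)
print(format_chunk_record(rec))  # 0;1048576;524300;sha1:aaaa;sha1:bbbb;gz
```

Keeping both digests in the record means the stored digest can be used to check the storage medium, while the raw digest identifies the original contents.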