Are you thinking about adding stream support? I.e., something along the lines of i) build up an efficient vocabulary up front for the whole data set and then ii) compress in chunks, so the data can be decompressed in chunks as well. This is important for seeking within the data and for stream processing.
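To make the idea concrete, here is a minimal sketch of the two steps using Python's standard `zlib` module and its preset-dictionary (`zdict`) support. It only illustrates the pattern and is not this project's API: `build_dictionary`, `compress_chunks`, and `decompress_chunk` are hypothetical helpers, and keeping the tail of the concatenated samples is a crude stand-in for real vocabulary training.

```python
import zlib

def build_dictionary(samples, max_size=32768):
    """Step i: build a shared dictionary up front (hypothetical helper).

    Assumption: we simply concatenate the samples and keep the last
    max_size bytes; a real implementation would train a proper vocabulary.
    """
    return b"".join(samples)[-max_size:]

def compress_chunks(data, zdict, chunk_size=4096):
    """Step ii: compress each chunk independently against the shared dictionary."""
    chunks = []
    for start in range(0, len(data), chunk_size):
        comp = zlib.compressobj(level=9, zdict=zdict)
        chunks.append(comp.compress(data[start:start + chunk_size]) + comp.flush())
    return chunks

def decompress_chunk(chunk, zdict):
    """Decompress one chunk on its own; only the shared dictionary is needed."""
    decomp = zlib.decompressobj(zdict=zdict)
    return decomp.decompress(chunk) + decomp.flush()

# Made-up sample data, just to exercise the sketch.
data = b"timestamp=1700000000 level=INFO msg=hello\n" * 1000
zdict = build_dictionary([data[:2048]])
chunks = compress_chunks(data, zdict)

# Seeking: decode only the third chunk without touching the others.
third = decompress_chunk(chunks[2], zdict)
```

Because every chunk is compressed with a fresh compressor, any chunk can be decoded in isolation, which is what makes seeking and streaming possible. The trade-off is that chunks cannot reference each other's contents, so the shared dictionary has to carry the common vocabulary.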
Yes, definitely! Chunking support is currently in development. Streaming, seeking, and related features are ones we will certainly pursue as we mature towards an eventual v1.0.0.
Great! I find Apache Arrow IPC the most sensible format I've come across for organising stream data. Headers come first, so you learn what data you're working with; it's columnar, for good SIMD and compression; and deeply nested data structures are supported. It might serve as inspiration.