Though I think it's only strictly true if the intervals you sample over are the same. E.g. they both sample some messages every second, and they all start their second-long intervals on the same nanosecond (or close enough).
I find it easier to reason about reservoir sampling in an alternative formulation: the article talks about flipping a random (biased) coin for each arrival. Instead we can re-interpret reservoir sampling as assigning a random priority to each item, and then keeping the items with the top k priority.
It's fairly easy to see in this reformulation whether specific combinations of algorithms would compose: you only need to think about whether they would still select the top k items by priority.
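A minimal sketch of that reformulation (the function name and the use of a min-heap are my choices, not from the article): give every arriving item an i.i.d. uniform priority and keep the k items with the highest priorities. A min-heap of (priority, item) pairs makes evicting the current lowest priority O(log k).

```python
import heapq
import random

def reservoir_sample(stream, k):
    # Min-heap of (priority, item); heap[0] is the lowest-priority survivor.
    heap = []
    for item in stream:
        entry = (random.random(), item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # New item outranks the weakest survivor: swap it in.
            heapq.heapreplace(heap, entry)
    return [item for _, item in heap]
```

Since each item's priority is independent of its arrival order, every subset of size k is equally likely to end up with the top k priorities, which is exactly the uniformity guarantee of classic reservoir sampling.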
The second formulation sounds easier to use to adapt to specific use cases too: just bump the priority of a message based on your business rules to make it more likely that interesting events get to your log database.
You could do (category, random priority) and then do lexicographic comparison. That way higher categories always outrank lower categories.
But depending on what you need, you might also just do (random priority + weight * category) or so. Or you just keep separate reservoirs for high importance items and for everything else.
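The (category, random priority) variant is a one-line change to the priority sketch, since Python tuples already compare lexicographically. A hedged sketch (names and stream shape are assumptions for illustration):

```python
import heapq
import random

def category_reservoir(stream, k):
    # stream yields (category, item) pairs. Priorities are
    # (category, tiebreak) tuples compared lexicographically, so a
    # higher category always outranks a lower one, and ties within a
    # category are broken uniformly at random.
    heap = []  # min-heap of ((category, tiebreak), item)
    for category, item in stream:
        entry = ((category, random.random()), item)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            heapq.heapreplace(heap, entry)
    return [item for _, item in heap]
```

With this scheme, lower-category items only survive when there are fewer than k higher-category items in the window, which is the "always prefer important events" behavior; the additive (random priority + weight * category) variant trades that hard guarantee for a softer bias.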
I would expect that any fair way of sampling from a truly fair sample would necessarily yield a truly fair sample. I can't imagine how it could possibly not.
In the first instance, every second we get a 'truly fair' random sample from all the messages in that second.
Going from there to e.g. a 'truly fair' random sample from all the messages in a minute is not trivial. And it's not even possible from the samples alone, without auxiliary information.
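One candidate for that auxiliary information is the random priorities themselves, per the top-k reformulation discussed in this thread: if each second's sampler keeps its items' priorities, then any item in the minute's overall top k must also be in its own second's top k, so a global top-k over the union of the per-second samples recovers an exact uniform sample of the whole minute. A sketch, assuming each reservoir is a list of (priority, item) pairs:

```python
import heapq

def merge_reservoirs(reservoirs, k):
    # Each reservoir holds the (priority, item) pairs with the top-k
    # priorities seen in its own interval. Taking the global top k over
    # the union yields a uniform sample of all items across intervals,
    # because the overall top-k items cannot have been evicted locally.
    return heapq.nlargest(k, (pair for r in reservoirs for pair in r))
```

Without the priorities (or at least the per-second message counts), the per-second samples alone can't be combined fairly: a second with a million messages and a second with ten messages both contribute k items, and nothing in the samples distinguishes them.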