Weighted Random Sampling over Data Streams
Abstract
In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case ([2, 4]), discuss sampling with and without replacement and show adaptations of the algorithms for several WRS problems and evolving data streams.
Code References
ClickHouse/ClickHouse
1 file
src/Common/WeightedRandomSampling.h
1
/// The implementation uses the A-ES method from the paper https://arxiv.org/pdf/1012.0256
cockroachdb/cockroach
1 file
pkg/kv/kvserver/split/weighted_finder.go
1
// https://arxiv.org/pdf/1012.0256.pdf.
Link copied to clipboard!