8 papers
8 files
10 references

Papers Referenced in This Repository

New cardinality estimation algorithms for HyperLogLog sketches

Otmar Ertl
2017
10 citations
5 references

This paper presents new methods to estimate the cardinalities of data sets recorded by HyperLogLog sketches. A theoretically motivated extension to the original estimator is presented that eliminates the bias for small and large cardinalities. Based on the maximum likelihood principle a second unbia...

Show 2 references in code

The LDBC Social Network Benchmark

Renzo Angles, János Benjamin Antal, Alex Averbuch, Altan Birler, Peter Boncz, Márton Búr, Orri Erlin...
2020
1 reference

The Linked Data Benchmark Council's Social Network Benchmark (LDBC SNB) is an effort intended to test various functionalities of systems used for graph-like data management. For this, LDBC SNB uses the recognizable scenario of operating a social network, characterized by its graph-shaped data. LDBC ...

Show 1 reference in code

A Parallel Space Saving Algorithm For Frequent Items and the Hurwitz zeta distribution

Massimo Cafaro, Marco Pulimeno, Piergiulio Tempesta
2014
4 references

We present a message-passing based parallel version of the Space Saving algorithm designed to solve the $k$--majority problem. The algorithm determines in parallel frequent items, i.e., those whose frequency is greater than a given threshold, and is therefore useful for iceberg queries and many othe...

Show 1 reference in code

Note on a Method for Calculating Corrected Sums of Squares and Products

B. P. Welford
1962
1 reference

"Note on a Method for Calculating Corrected Sums of Squares and Products." Technometrics, 4(3), pp. 419–420

Show 1 reference in code

Cuckoo hashing.

Rasmus Pagh, Flemming Friche Rodler
2004
2 references
Show 1 reference in code

Efficient Evaluation of Arbitrarily-Framed Holistic SQL Aggregates and Window Functions

Adrian Vogelsgesang, Thomas Neumann, Viktor Leis, A. Kemper
2022
1 reference

Window functions became part of the SQL standard in SQL:2003 and are widely used for data analytics: Percentiles, rankings, moving averages, running sums and local maxima are all expressed as window functions in SQL. Yet, the features offered by SQL's window functions lack composability: Framing is ...

Show 1 reference in code

Dynamic programming strikes back

G. Moerkotte, Thomas Neumann
2008
1 reference

Two highly efficient algorithms are known for optimally ordering joins while\navoiding cross products:\nDPccp, which is based on dynamic programming, and Top-Down Partition Search, \nbased on memoization.\nBoth have two severe limitations:\nThey handle only (1) simple (binary) join predicates and (2...

Show 1 reference in code
Link copied to clipboard!