💾

Databases

Database engines, storage systems, and query optimization

Repositories

(11)

Papers

(41)

Showing 20 of 41 papers

A contextual-bandit approach to personalized news article recommendation

Lihong Li, Wei Chu, John Langford, Robert E. Schapire

2010

1 reference

Personalized web services strive to adapt their services (advertisements, news articles, etc.) to individual users by making use of both content and user information. Despite a few recent advances, this problem remains challenging for at least two re...

View Paper PDF DOI

A cool and practical alternative to traditional hash tables

Ú. Erlingsson, M. Manasse, Frank McSherry

2006

4 references

View Paper

Algorithms for geodesics

Charles F. F. Karney

2011

3 references

Algorithms for the computation of geodesics on an ellipsoid of revolution are given. These provide accurate, robust, and fast solutions to the direct and inverse geodesic problems and they allow differential and integral properties of geodesics to be...

View Paper PDF DOI

An Adaptive Hash Join Algorithm for Multiuser Environments.

Hansjörg Zeller, J. Gray

1990

1 reference

View Paper

A Study of Replacement Algorithms for Virtual-Storage Computer

L. A. Belady

1966

1 reference

One of the basic limitations of a digital computer is the size of its available memory. 1 In most cases, it is neither feasible nor economical for a user to insist that every problem program fit into memory. The number of words of information in a pr...

View Paper DOI

A Tutorial on Thompson Sampling

Daniel Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen

2017

1 reference

Thompson sampling is an algorithm for online decision problems where actions are taken sequentially in a manner that must balance between exploiting what is known to maximize immediate performance and investing to accumulate new information that may ...

View Paper PDF DOI

Cuckoo hashing.

Rasmus Pagh, Flemming Friche Rodler

2004

2 references

View Paper DOI

Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs.

Yu. A. Malkov, D. A. Yashunin

2020

4 references

We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search str...

View Paper DOI

Elle: Inferring Isolation Anomalies from Experimental Observations

Kyle Kingsbury, Peter Alvaro

2020

1 reference

Users who care about their data store it in databases, which (at least in principle) guarantee some form of transactional isolation. However, experience shows [Kleppmann 2019, Kingsbury and Patella 2019a] that many databases do not provide the isolat...

View Paper PDF

Less hashing, same performance: Building a better Bloom filter

Adam Kirsch, Michael Mitzenmacher

2006

2 references

Abstract A standard technique from the hashing literature is to use two hash functions h 1 ( x ) and h 2 ( x ) to simulate additional hash functions of the form g i ( x ) = h 1 ( x ) + i h 2 ( x ). We demonstrate that this technique can be usefully a...

View Paper PDF DOI

Low-Latency Sliding-Window Aggregation in Worst-Case Constant Time

Kanat Tangwongsan, Martin Hirzel, Scott Schneider

2017

1 reference

Sliding-window aggregation is a widely-used approach for extracting insights from the most recent portion of a data stream. The aggregations of interest can usually be cast as binary operators that are associative, but they are not necessarily commut...

View Paper DOI

New cardinality estimation algorithms for HyperLogLog sketches

Otmar Ertl

2017

10 citations

5 references

This paper presents new methods to estimate the cardinalities of data sets recorded by HyperLogLog sketches. A theoretically motivated extension to the original estimator is presented that eliminates the bias for small and large cardinalities. Based ...

View Paper PDF DOI

Novel Table Lookup-Based Algorithms for High-Performance CRC Generation

Michael E. Kounavis, Frank L. Berry

2008

2 references

A framework for designing a family of novel fast cyclic redundancy code (CRC) generation algorithms is presented. Our algorithms can ideally read arbitrarily large amounts of data at a time, while optimizing their memory requirement to meet the const...

View Paper DOI

On the correct and complete enumeration of the core search space

Guido Moerkotte, Pit Fender, Marius Eich

2013

1 reference

Reordering more than traditional joins (e.g. outerjoins, antijoins) requires some care, since not all reorderings are valid. To prevent invalid plans, two approaches have been described in the literature. We show that both approaches still produce in...

View Paper DOI

RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search

Jianyang Gao, Cheng Long

2024

1 reference

Searching for approximate nearest neighbors (ANN) in the high-dimensional Euclidean space is a pivotal problem. Recently, with the help of fast SIMD-based implementations, Product Quantization (PQ) and its variants can often efficiently and accuratel...

View Paper PDF

R-trees: a dynamic index structure for spatial searching

A. Guttman

1984

2 references

In order to handle spatial data efficiently, as required in computer aided design and geo-data applications, a database system needs an index mechanism that will help it retrieve data items quickly according to their spatial locations However, tradit...

View Paper DOI

Serializable isolation for snapshot databases

Michael J. Cahill, Uwe Röhm, Alan Fekete

2009

2 references

Many popular database management systems implement a multiversion concurrency control algorithm called snapshot isolation rather than providing full serializability based on locking. There are well-known anomalies permitted by snapshot isolation that...

View Paper DOI

The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles.

Norbert Beckmann, Hans‐Peter Kriegel, Ralf Schneider, Bernhard Seeger

1990

2 references

The R-tree, one of the most popular access methods for rectangles, is based on the heuristic optimization of the area of the enclosing rectangle in each inner node. By running numerous experiments in a standardized testbed under highly varying data, ...

View Paper PDF DOI

The serializability of concurrent database updates

Christos H. Papadimitriou

1979

1 reference

ABSTRACT A sequence of interleaved user transactions in a database system may not be ser:ahzable, t e, equivalent to some sequential execution of the individual transactions Using a simple transaction model, it ~s shown that recognizing the transacti...

View Paper PDF DOI

Weighted Random Sampling over Data Streams

Pavlos S. Efraimidis

2010

2 references

In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case ([2, 4]), discuss samplin...

View Paper PDF

Page 1 of 3 Next

Databases

Repositories

apache/cassandra

ClickHouse/ClickHouse

cockroachdb/cockroach

dgraph-io/dgraph

duckdb/duckdb

facebook/rocksdb

mongodb/mongo

postgres/postgres

redis/redis

sqlite/sqlite

yugabyte/yugabyte-db

Papers