Operating Systems
OS kernels, schedulers, memory management, and system software
Repositories (6)
freebsd/freebsd-src
FreeRTOS/FreeRTOS
sel4/sel4
SerenityOS/serenity
torvalds/linux
zephyrproject-rtos/zephyr
Papers (46)
8-bit Numerical Formats for Deep Neural Networks
Given the current trend of increasing size and complexity of machine learning architectures, it has become of critical importance to identify new approaches to improve the computational efficiency of model training. In this context, we address the ad...
An Experimental Study of Dynamic Dominators
Motivated by recent applications of dominator computations, we consider the problem of dynamically maintaining the dominators of flow graphs through a sequence of insertions and deletions of edges. Our main theoretical contribution is a simple increm...
Concurrent Hash Tables
Concurrent hash tables are one of the most important concurrent data structures, which are used in numerous applications. For some applications, it is common that hash table accesses dominate the execution time. To efficiently solve these problems in...
Euclidean Affine Functions and Applications to Calendar Algorithms
We study properties of Euclidean affine functions (EAFs), namely those of the form $f(r) = (α\cdot r + β)/δ$, and their closely related expression $\mathring{f}(r) = (α\cdot r + β)\%δ$, where $r$, $α$, $β$ and $δ$ are integers, and where $/$ and $\%$...
Fast Random Integer Generation in an Interval
In simulations, probabilistic algorithms and statistical tests, we often generate random integers in an interval (e.g., [0,s)). For example, random integers in an interval are essential to the Fisher-Yates random shuffle. Consequently, popular langua...
FP8 Formats for Deep Learning
FP8 is a natural progression for accelerating deep learning training and inference beyond the 16-bit formats common in modern processors. In this paper we propose an 8-bit floating point (FP8) binary interchange format consisting of two encodings - E4M3 ...
Number Parsing at a Gigabyte per Second
With disks and networks providing gigabytes per second, parsing decimal numbers from strings becomes a bottleneck. We consider the problem of parsing decimal numbers to the nearest binary floating-point value. The general problem requires variable-pr...
Optimizing Function Layout for Mobile Applications
Function layout, also referred to as function reordering or function placement, is one of the most effective profile-guided compiler optimizations. By reordering functions in a binary, compilers are able to greatly improve the performance of large-sc...
R-trees: a dynamic index structure for spatial searching
In order to handle spatial data efficiently, as required in computer-aided design and geo-data applications, a database system needs an index mechanism that will help it retrieve data items quickly according to their spatial locations. However, tradit...
Ryū: fast float-to-string conversion
We present Ryū, a new routine to convert binary floating point numbers to their decimal representations using only fixed-size integer operations, and prove its correctness. Ryū is simpler and approximately three times faster than the previously faste...
SAX-PAC (Scalable And eXpressive PAcket Classification)
Efficient packet classification is a core concern for network services. Traditional multi-field classification approaches, in both software and ternary content-addressable memory (TCAMs), entail tradeoffs between (memory) space and (lookup) time. T...
The R*-Tree: An Efficient and Robust Access Method for Points and Rectangles
The R-tree, one of the most popular access methods for rectangles, is based on the heuristic optimization of the area of the enclosing rectangle in each inner node. By running numerous experiments in a standardized testbed under highly varying data, ...
Worst-Case TCAM Rule Expansion
Designers of TCAMs (Ternary CAMs) for packet classification deal with unpredictable sets of rules, resulting in highly variable rule expansions, and rely on heuristic encoding algorithms with no reasonable expansion guarantees. In this paper, given s...
Xorshift RNGs
Description of a class of simple, extremely fast random number generators (RNGs) with periods $2^k - 1$ for $k$ = 32, 64, 96, 128, 160, 192. These RNGs seem to pass tests of randomness very well.