openjdk/jdk
Papers Referenced in This Repository
Faster Base64 Encoding and Decoding Using AVX2 Instructions
Web developers use base64 formats to include images, fonts, sounds, and other resources directly inside HTML, JavaScript, JSON, and XML files. We estimate that billions of base64 messages are decoded every day. We are motivated to improve the efficiency of base64 encoding and decoding. Compared to s...
Base64 encoding and decoding at almost the speed of a memory copy
Many common document formats on the Internet are text-only such as email (MIME) and the Web (HTML, JavaScript, JSON and XML). To include images or executable code in these documents, we first encode them as text using base64. Standard base64 encoding uses 64~ASCII characters: both lower and upper ca...
A modified ziggurat algorithm for generating exponentially- and normally-distributed pseudorandom numbers
The Ziggurat Algorithm is a very fast rejection sampling method for generating PseudoRandom Numbers (PRNs) from common statistical distributions. The algorithm divides a distribution into rectangular layers that stack on top of each other (resembling a Ziggurat), subsuming the desired distribution. ...
An Efficient Method for Generating Discrete Random Variables with General Distributions
article Free Access Share on An Efficient Method for Generating Discrete Random Variables with General Distributions Author: Alastair J. Walker Department of Electrical Engineering, University of Witwatersrand, 1 Jan Smuts Ave., Johnnesburg 2001, South Africa Department of Electrical Engineering, Un...
ScissorGC: scalable and efficient compaction for Java full garbage collection
Java runtime frees applications from manual memory management through automatic garbage collection (GC). This, however, is usually at the cost of stop-the-world pauses. State-of-the-art collectors leverage multiple generations, which will inevitably suffer from a full GC phase scanning and compactin...
Understanding and improving JVM GC work stealing at the data center scale
Garbage collection (GC) is a critical part of performance in managed run-time systems such as the OpenJDK Java Virtual Machine (JVM). With a large number of latency sensitive applications written in Java the performance of the JVM is essential. Java application servers run in data centers on a large...
Fast and Robust Vectorized In-Place Sorting of Primitive Types.
Modern CPUs provide single instruction-multiple data (SIMD) instructions. SIMD instructions process several elements of a primitive data type simultaneously in fixed-size vectors. Classical sorting algorithms are not directly expressible in SIMD instructions. Accelerating sorting algorithms with SIM...
A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake
The modern CPU's design, which is composed of hierarchical memory and SIMD/vectorization capability, governs the potential for algorithms to be transformed into efficient implementations. The release of the AVX-512 changed things radically, and motivated us to search for an efficient sorting algorit...
Computationally easy, spectrally good multipliers for congruential pseudorandom number generators
Congruential pseudorandom number generators rely on good multipliers, that is, integers that have good performance with respect to the spectral test. We provide lists of multipliers with a good lattice structure up to dimension eight and up to lag eight for generators with typical power‐of‐two modul...
LXM: better splittable pseudorandom number generators (and almost as fast)
In 2014, Steele, Lea, and Flood presented SplitMix, an object-oriented pseudorandom number generator (prng) that is quite fast (9 64-bit arithmetic/logical operations per 64 bits generated) and also splittable . A conventional prng object provides a generate method that returns one pseudorandom valu...
The Monty Python method for generating random variables
We suggest an interesting and fast method for generating normal, exponential, t, von Mises, and certain other important random variables used in Monte Carlo studies. The right half of a symmetric density is cut into pieces, then, using simple area-preserving transformations, reassembled into a recta...
Gaussian random number generators
Rapid generation of high quality Gaussian random numbers is a key capability for simulations across a wide range of disciplines. Advances in computing have brought the power to conduct simulations with very large numbers of random numbers and with it, the challenge of meeting increasingly stringent ...
User-level Threading
An important class of computer software, such as network servers, exhibits concurrency through many loosely coupled and potentially long-running communication sessions. For these applications, a long-standing open question is whether thread-per-session programming can deliver comparable performance ...
An Extension of the String-to-String Correction Problem
article Free Access Share on An Extension of the String-to-String Correction Problem Authors: Robert A. Wagner Department of Systems and Information Sciences, Vanderbilt University, Nashville, TN Department of Systems and Information Sciences, Vanderbilt University, Nashville, TNView Profile , Roy L...
When is double rounding innocuous?
Double rounding is the phenomenon that occurs when the result of an operation is rounded to fit some intermediate destination, and then again when delivered to its final destination. This can be a common occurrence when using some floating-point arithmetic engines which lack single precision registe...
SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions
In this paper, we present techniques used to implement our portable\nvectorized library of C standard mathematical functions written entirely in C\nlanguage. In order to make the library portable while maintaining good\nperformance, intrinsic functions of vector extensions are abstracted by inline\n...