Classical Compilers
Production implementations of compiler algorithms and techniques
Repositories
(7)dotnet/runtime
gcc-mirror/gcc
golang/go
llvm/llvm-project
openjdk/jdk
rust-lang/rust
swiftlang/swift
Papers
(100)A Novel Hybrid Quicksort Algorithm Vectorized using AVX-512 on Intel Skylake
The modern CPU's design, which is composed of hierarchical memory and SIMD/vectorization capability, governs the potential for algorithms to be transformed into efficient implementations. The release of the AVX-512 changed things radically, and motiv...
Base64 encoding and decoding at almost the speed of a memory copy
Many common document formats on the Internet are text-only such as email (MIME) and the Web (HTML, JavaScript, JSON and XML). To include images or executable code in these documents, we first encode them as text using base64. Standard base64 encoding...
Consistently faster and smaller compressed bitmaps with Roaring
Compressed bitmap indexes are used in databases and search engines. Many bitmap compression techniques have been proposed, almost all relying primarily on run-length encoding (RLE). However, on unsorted data, we can get superior performance with a hy...
Euclidean Affine Functions and Applications to Calendar Algorithms
We study properties of Euclidean affine functions (EAFs), namely those of the form $f(r) = (α\cdot r + β)/δ$, and their closely related expression $\mathring{f}(r) = (α\cdot r + β)\%δ$, where $r$, $α$, $β$ and $δ$ are integers, and where $/$ and $\%$...
Experience Report: Developing the Servo Web Browser Engine using Rust
All modern web browsers - Internet Explorer, Firefox, Chrome, Opera, and Safari - have a core rendering engine written in C++. This language choice was made because it affords the systems programmer complete control of the underlying hardware feature...
Fast and Robust Vectorized In-Place Sorting of Primitive Types.
Modern CPUs provide single instruction-multiple data (SIMD) instructions. SIMD instructions process several elements of a primitive data type simultaneously in fixed-size vectors. Classical sorting algorithms are not directly expressible in SIMD inst...
Faster Base64 Encoding and Decoding Using AVX2 Instructions
Web developers use base64 formats to include images, fonts, sounds, and other resources directly inside HTML, JavaScript, JSON, and XML files. We estimate that billions of base64 messages are decoded every day. We are motivated to improve the efficie...
MOLPIPx: an end-to-end differentiable package for permutationally invariant polynomials in Python and Rust
In this work, we present MOLPIPx, a versatile library designed to seamlessly integrate Permutationally Invariant Polynomials (PIPs) with modern machine learning frameworks, enabling the efficient development of linear models, neural networks, and Gau...
Oxide: The Essence of Rust
Rust claims to advance industrial programming by bridging the gap between low-level systems programming and high-level application programming. At the heart of the argument that this enables programmers to build more reliable and efficient software i...
Polymorphism, subtyping, and type inference in MLsub
We present a type system combining subtyping and ML-style parametric polymorphism. Unlike previous work, our system supports type inference and has compact principal types. We demonstrate this system in the minimal language MLsub, which types a stric...
Reenix: Implementing a Unix-Like Operating System in Rust
This paper describes the experience, problems and successes found in implementing a unix-like operating system kernel in rust. Using the basic design and much of the lowest-level support code from the Weenix operating system written for CS167/9 I was...
Rust as a language for high performance GC implementation
High performance garbage collectors build upon performance-critical low-level code, typically exhibit multiple levels of concurrency, and are prone to subtle bugs. Implementing, debugging and maintaining such collectors can therefore be extremely cha...
Ryū: fast float-to-string conversion
We present Ryū, a new routine to convert binary floating point numbers to their decimal representations using only fixed-size integer operations, and prove its correctness. Ryū is simpler and approximately three times faster than the previously faste...
ScissorGC: scalable and efficient compaction for Java full garbage collection
Java runtime frees applications from manual memory management through automatic garbage collection (GC). This, however, is usually at the cost of stop-the-world pauses. State-of-the-art collectors leverage multiple generations, which will inevitably ...
SyRust: automatic testing of Rust libraries with semantic-aware program synthesis
Rust’s type system ensures the safety of Rust programs; however, programmers can side-step some of the strict typing rules by using the unsafe keyword. A common use of unsafe Rust is by libraries. Bugs in these libraries undermine the safety of the e...
Understanding and improving JVM GC work stealing at the data center scale
Garbage collection (GC) is a critical part of performance in managed run-time systems such as the OpenJDK Java Virtual Machine (JVM). With a large number of latency sensitive applications written in Java the performance of the JVM is essential. Java ...
User-level Threading
An important class of computer software, such as network servers, exhibits concurrency through many loosely coupled and potentially long-running communication sessions. For these applications, a long-standing open question is whether thread-per-sessi...
When is double rounding innocuous?
Double rounding is the phenomenon that occurs when the result of an operation is rounded to fit some intermediate destination, and then again when delivered to its final destination. This can be a common occurrence when using some floating-point arit...