Programming Languages & Compilers

Number Parsing at a Gigabyte per Second

Daniel Lemire

2021

7 citations

1 reference

With disks and networks providing gigabytes per second, parsing decimal numbers from strings becomes a bottleneck. We consider the problem of parsing decimal numbers to the nearest binary floating-point value. The general problem requires variable-precision arithmetic. However, we need at most 17 di...

ompTest – Unit Testing with OMPT

Jan-Patrick Lehr, Michael Halkenhäuser, Dhruva Chakrabarti, Saiyedul Islam, Dan Palermo, Ron Lieberm...

2024

1 reference

OpenMP® is a widely used API in high-performance computing that enables parallelization on the host as well as offloading work to an accelerator, such as a GPU. The OpenMP specification defines an OpenMP Tool Interface (OMPT), which allows a third-party tool to be notified about OpenMP runtime events. Ens...

Optimistic and Scalable Global Function Merging

Kyungwoo Lee, Manman Ren, Ellis Hoag

2024

1 reference

Function merging is a pivotal technique for reducing code size by combining identical or similar functions into a single function. While prior research has extensively explored this technique, it has not been assessed in conjunction with function outlining and linker’s identical code folding, despit...

Left Recursion in Parsing Expression Grammars

Sérgio Medeiros, Fabio Mascarenhas, Roberto Ierusalimschy

2012

1 reference

Parsing Expression Grammars (PEGs) are a formalism that can describe all deterministic context-free languages through a set of rules that specify a top-down parser for some language. PEGs are easy to use, and there are efficient implementations of PEG libraries in several programming languages. A ...

Applications of finite automata representing large vocabularies

Cláudio L. Lucchesi, Tomasz Kowaltowski

1993

1 reference

The construction of minimal acyclic deterministic partial finite automata to represent large natural language vocabularies is described. Applications of such automata include spelling checkers and advisers, multilanguage dictionaries, thesauri, minimal perfec...

Copy-and-Patch Compilation: A fast compilation algorithm for high-level languages and bytecode

Haoran Xu, Fredrik Kjolstad

2020

1 reference

Fast compilation is important when compilation occurs at runtime, such as query compilers in modern database systems and WebAssembly virtual machines in modern browsers. We present copy-and-patch, an extremely fast compilation technique that also produces good quality code. It is capable of lowering...

RL4ReAl: Reinforcement Learning for Register Allocation

S. VenkataKeerthy, Siddhartha Jain, Rohit Aggarwal, Albert Cohen, Ramakrishna Upadrasta

2022

8 citations

3 references

We aim to automate decades of research and experience in register allocation, leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with state-of-the-art techniques. We formalize the constraints that precisely def...

RL4ReAl: Reinforcement Learning for Register Allocation

S. VenkataKeerthy, Siddharth Jain, Anilava Kundu, Rohit Aggarwal, Albert Cohen, Ramakrishna Upadrast...

2022

8 citations

1 reference

We aim to automate decades of research and experience in register allocation, leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with state-of-the-art techniques. We formalize the constraints that precisely def...

IR2VEC

S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, M. Desarkar, Ramakrishna Upadrasta, Y. Srikant

2020

65 citations

3 references

We propose IR2VEC, a Concise and Scalable encoding infrastructure to represent programs as a distributed embedding in continuous space. This distributed embedding is obtained by combining representation learning methods with flow information to capture the syntax as well as the semantics of the inpu...

OMPTBench – OpenMP Tool Interface Conformance Testing

Jan-Patrick Lehr, Michael Halkenhäuser, Dhruva Chakrabarti, Saiyedul Islam, Dan Palermo, Ron Lieberm...

2024

1 citation

1 reference

OpenMP® is a highly relevant parallelization standard in high-performance computing and all major compiler vendors support it. The standard defines the OpenMP Tool Interface (OMPT) as a mechanism for third-party tools to obtain information on dedicated runtime events. However, the implementation sta...

An abstract interpretation for SPMD divergence on reducible control flow graphs

Julian Rosemann, Simon Moll, Sebastian Hack

2021

8 citations

1 reference

Vectorizing compilers employ divergence analysis to detect at which program point a specific variable is uniform, i.e. has the same value on all SPMD threads that execute this program point. They exploit uniformity to retain branching to counter branch divergence and defer computations to scalar pro...

The program dependence graph and its use in optimization

Jeanne Ferrante, Karl J. Ottenstein, Joe D. Warren

1987

1 reference

In this paper we present an intermediate program representation, called the program dependence graph (PDG), that makes explicit both the data and control dependences for each operation in a program. Data dependences have been used to repres...

Parallel Runtime Interface for Fortran (PRIF) Specification, Revision 0.6

Dan Bonachea, Katherine Rasmussen

2025

1 reference

This document specifies an interface to support the multi-image parallelism features of Fortran, named the Parallel Runtime Interface for Fortran (PRIF). PRIF is a solution in which a runtime library is primarily responsible for implementing coarray allocation, deallocation and accesses, image synch...

Memory Tagging and how it improves C/C++ memory safety

Kostya Serebryany, Evgenii Stepanov, Aleksey Shlyapnikov, Vlad Tsyrklevich, Dmitry Vyukov

2018

2 references

Memory safety in C and C++ remains largely unresolved. A technique usually called "memory tagging" may dramatically improve the situation if implemented in hardware with reasonable overhead. This paper describes two existing implementations of memory tagging: one is the full hardware implementation ...

IR2Vec: LLVM IR based Scalable Program Embeddings

S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, Maunendra Sankar Desarkar, Ramakrishna Upadrasta, Y...

2019

1 reference

We propose IR2Vec, a Concise and Scalable encoding infrastructure to represent programs as a distributed embedding in continuous space. This distributed embedding is obtained by combining representation learning methods with flow information to capture the syntax as well as the semantics of the inpu...

Optimizing Function Layout for Mobile Applications

Ellis Hoag, Kyungwoo Lee, Julián Mestre, Sergey Pupyrev

2022

1 reference

Function layout, also referred to as function reordering or function placement, is one of the most effective profile-guided compiler optimizations. By reordering functions in a binary, compilers are able to greatly improve the performance of large-scale applications or reduce the compressed size of ...

Reliable and fast DWARF-based stack unwinding

T. Bastian, Stephen Kell, Francesco Zappa Nardelli

2019

2 references

Debug information, usually encoded in the DWARF format, is a hidden and obscure component of our computing infrastructure. Debug information is obviously used by debuggers, but it also plays a key role in program analysis tools, and, most surprisingly, it can be relied upon by the runtime of high-le...

Ryū revisited: printf floating point conversion

Ulf Adams

2019

4 citations

2 references

Ryū Printf is a new algorithm to convert floating-point numbers to decimal strings according to the printf %f, %e, and %g formats: %f generates ‘full’ output (integer part of the input, dot, configurable number of digits), %e generates scientific output (one leading digit, dot, configurable number o...
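
For illustration only, the three conversion styles named in this abstract can be seen with Python's built-in printf-style `%` operator. This is a sketch of the output formats, not of the Ryū Printf algorithm itself; the helper name `format_three_ways` and the sample value are assumptions made for the example.

```python
def format_three_ways(value, digits=3):
    """Format value with the %f, %e and %g conversions at a given precision.

    Hypothetical helper for illustration; it relies on Python's built-in
    printf-style formatting, not on Ryu Printf.
    """
    return {
        # Fixed notation: integer part, dot, `digits` fractional digits.
        "f": "%.*f" % (digits, value),
        # Scientific notation: one leading digit, dot, `digits` digits, exponent.
        "e": "%.*e" % (digits, value),
        # General notation: `digits` significant digits, fixed or scientific
        # depending on the magnitude of the value.
        "g": "%.*g" % (digits, value),
    }


if __name__ == "__main__":
    for style, text in format_three_ways(1234.5678).items():
        print(f"%{style}: {text}")
```

With the sample value 1234.5678 and a precision of 3, the `%f` result keeps the full integer part, while `%e` and `%g` both switch to an exponent.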
