Machine Learning

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Untert...

2020

2 references

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used ...

A System for Massively Parallel Hyperparameter Tuning

Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet ...

2018

2 references

Modern learning models are characterized by large hyperparameter spaces and long training times. These properties, coupled with the rise of parallel computing and the growing demand to productionize machine learning workloads, motivate the need to de...

Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization

Vage Egiazarian, Roberto L. Castro, Denis Kuznedelev, Andrei Panferov, Eldar Kurtic, Shubhra Pandit,...

2025

1 reference

The recent hardware-accelerated microscaling 4-bit floating-point formats such as MXFP4 and NVFP4, supported on NVIDIA and AMD GPUs, promise to revolutionize large language model (LLM) inference. Yet, their practical benefits remain unproven. We pres...

Compact Language Models via Pruning and Knowledge Distillation

Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, Marcin Chochowski, Mostofa Patwary...

2024

1 reference

Large language models (LLMs) targeting different deployment scales and sizes are currently produced by training each variant from scratch; this is extremely compute-intensive. In this paper, we investigate if pruning an existing LLM and then re-train...

Curiosity-driven Exploration by Self-supervised Prediction

Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell

2017

6 references

In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether. In such cases, curiosity can serve as an intrinsic reward signal to enable the agent to explore its environment and learn skills that might be us...

Decoding the Molecular Language of Proteins with Evolla

Xibin Zhou, Chenchen Han, Yingqi Zhang, Jin Su, Kai Zhuang, Shiyu Jiang, Zichen Yuan, Wei Zheng, Fen...

2025

1 reference

Proteins, nature’s intricate molecular machines, are the products of billions of years of evolution and play fundamental roles in sustaining life. Yet, deciphering their molecular language - that is, understanding how protein sequences and s...

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

Carlos Riquelme, George Tucker, Jasper Snoek

2018

1 reference

Recent advances in deep reinforcement learning have made significant strides in performance on applications such as Go and Atari games. However, developing practical methods to balance exploration and exploitation in complex domains remains largely u...

Evolutionary-scale prediction of atomic level protein structure with a language model

Zeming Lin, Halil Akin, Roshan Rao, Brian Hie, Zhongkai Zhu, Wenting Lu, Nikita Smetanin, Robert Ver...

2 references

Artificial intelligence has the potential to open insight into the structure of proteins at the scale of evolution. It has only recently been possible to extend protein structure prediction to two hundred million cataloged proteins. Character...

High-Dimensional Continuous Control Using Generalized Advantage Estimation

John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel

2015

6 references

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks. The two main challenges ...

IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks

Michael Luo, Jiahao Yao, Richard Liaw, Eric Liang, Ion Stoica

2019

6 references

The practical usage of reinforcement learning agents is often bottlenecked by the duration of training time. To accelerate training, practitioners often turn to distributed reinforcement learning architectures to parallelize and accelerate the traini...

MapReduce: simplified data processing on large clusters

Jeffrey Dean, Sanjay Ghemawat

2008

1 reference

MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the und...

Mastering Atari with Discrete World Models

Yujin Tang, Duong Nguyen, David Ha

2020

20 references

Inattentional blindness is the psychological phenomenon that causes one to miss things in plain sight. It is a consequence of the selective attention in perception that lets us remain focused on important parts of our world without distraction from i...

Mastering Diverse Domains through World Models

Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap

2023

41 references

Developing a general algorithm that learns to solve tasks across a wide range of applications has been a fundamental challenge in artificial intelligence. Although current reinforcement learning algorithms can be readily applied to tasks similar to w...

Population Based Training of Neural Networks

Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, O...

2017

2 references

Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm. In this work w...

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shi...

2018

2 references

In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global ...

Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents (Extended Abstract)

Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, J. Veness, Matthew J. Hausknecht, Michael Bowli...

2018

2 references

The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving in...

Soft Actor-Critic Algorithms and Applications

Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar,...

2018

4 references

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample complexity an...

Stabilizing Transformers for Reinforcement Learning

Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakum...

2019

2 references

Owing to their ability to both effectively integrate information over long time horizons and scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP), achieving state...

Volcano - An Extensible and Parallel Query Evaluation System

G. Graefe

1994

1 reference

To investigate the interactions of extensibility and parallelism in database query processing, we have developed a new dataflow query execution system called Volcano. The Volcano effort provides a rich environment for research and education in databa...

XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models

Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, M...

2023

1 reference

Large multilingual language models typically rely on a single vocabulary shared across 100+ languages. As these models have increased in parameter count and depth, vocabulary size has remained largely unchanged. This vocabulary bottleneck li...

Repositories

huggingface/transformers

microsoft/onnxruntime

mlflow/mlflow

pytorch/pytorch

ray-project/ray

scikit-learn/scikit-learn

tensorflow/tensorflow
