This survey provides a comprehensive review of Optimal Transport (OT) theory, with a focus on its computational methods and applications in data sciences. It highlights how entropic regularization, particularly through the Sinkhorn-Knopp algorithm, has made OT computationally feasible for large-scale problems, detailing various formulations and their use across machine learning, computer vision, and statistics.
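As a concrete illustration of the entropic approach highlighted above, here is a minimal NumPy sketch of the Sinkhorn-Knopp iterations for entropy-regularized OT between two discrete histograms; the variable names (`a`, `b`, `C`, `eps`) and the toy 1-D example are illustrative and not taken from the survey.

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iters=500):
    """Entropy-regularized OT between histograms a and b with cost matrix C.

    Returns an (approximate) optimal coupling P with row sums a and column sums b.
    """
    K = np.exp(-C / eps)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):
        v = b / (K.T @ u)         # scale columns to match marginal b
        u = a / (K @ v)           # scale rows to match marginal a
    return u[:, None] * K * v[None, :]

# toy example: two Gaussian-like histograms on a 1-D grid
x = np.linspace(0, 1, 50)
a = np.exp(-((x - 0.3) ** 2) / 0.01); a /= a.sum()
b = np.exp(-((x - 0.7) ** 2) / 0.01); b /= b.sum()
C = (x[:, None] - x[None, :]) ** 2   # squared Euclidean cost
P = sinkhorn(a, b, C)
print("regularized transport cost:", np.sum(P * C))
```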
Gabriel Peyré's course notes introduce Optimal Transport, a mathematical framework for comparing and manipulating probability distributions, balancing theoretical rigor with computational methods. The notes cover the Monge and Kantorovich formulations, the Wasserstein distance, and practical algorithms like Sinkhorn, enabling applications in machine learning such as distribution matching, generative modeling, and image processing.
Atlas, a retrieval-augmented language model, achieves state-of-the-art few-shot learning performance on knowledge-intensive tasks, often surpassing purely parametric LLMs with significantly fewer parameters. It demonstrates that effective few-shot learning can be accomplished by integrating external knowledge sources, reducing reliance on in-parameter memorization.
This paper demonstrates that 'lazy training,' where deep neural networks behave linearly, is a general property of differentiable models driven by scaling, and critically shows that this regime leads to degraded generalization performance in practical deep convolutional neural networks. The findings suggest that the success of deep learning in real-world tasks likely stems from a non-lazy regime involving substantial non-linear feature learning.
Addressing real-world optimization problems becomes particularly challenging when analytic objective functions or constraints are unavailable. While numerous studies have addressed the issue of unknown objectives, limited research has focused on scenarios where feasibility constraints are not given explicitly. Overlooking these constraints can lead to spurious solutions that are unrealistic in practice. To deal with such unknown constraints, we propose to perform optimization within the data manifold using diffusion models. To constrain the optimization process to the data manifold, we reformulate the original optimization problem as a sampling problem from the product of the Boltzmann distribution defined by the objective function and the data distribution learned by the diffusion model. Depending on the differentiability of the objective function, we propose two different sampling methods. For differentiable objectives, we propose a two-stage framework that begins with a guided diffusion process for warm-up, followed by a Langevin dynamics stage for further correction. For non-differentiable objectives, we propose an iterative importance sampling strategy using the diffusion model as the proposal distribution. Comprehensive experiments on a synthetic dataset, six real-world black-box optimization datasets, and a multi-objective molecule optimization dataset show that our method achieves better or comparable performance with previous state-of-the-art baselines.
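A minimal sketch of what the Langevin correction stage for differentiable objectives might look like, assuming a pre-trained score network approximating the gradient of the log data density; `objective`, `score_net`, `temperature` and the step size are placeholders, and the paper's actual two-stage procedure (guided diffusion warm-up followed by correction) is more involved.

```python
import torch

def langevin_correction(x, objective, score_net, temperature=1.0,
                        step=1e-3, n_steps=200):
    """Langevin dynamics targeting p(x) proportional to exp(-f(x)/T) * p_data(x).

    `score_net(x)` is assumed to approximate the score of the data
    distribution (e.g. a diffusion model's score at a small noise level).
    """
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        f = objective(x).sum()
        grad_f = torch.autograd.grad(f, x)[0]
        drift = -grad_f / temperature + score_net(x)   # gradient of log of the product density
        noise = torch.randn_like(x)
        x = x + step * drift + (2 * step) ** 0.5 * noise
    return x.detach()
```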
We consider the problem of sampling distributions stemming from non-convex potentials with the Unadjusted Langevin Algorithm (ULA). We prove the stability of the discrete-time ULA to drift approximations under the assumption that the potential is strongly convex at infinity. In many contexts, e.g. imaging inverse problems, potentials are non-convex and non-smooth. The Proximal Stochastic Gradient Langevin Algorithm (PSGLA) is a popular algorithm to handle such potentials. It combines the forward-backward optimization algorithm with a ULA step. Our main stability result combined with properties of the Moreau envelope allows us to derive the first proof of convergence of the PSGLA for non-convex potentials. We empirically validate our methodology on synthetic data and in the context of imaging inverse problems. In particular, we observe that PSGLA exhibits faster convergence rates than the Stochastic Gradient Langevin Algorithm for posterior sampling while preserving its restoration properties.
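A hedged NumPy sketch of a generic PSGLA iteration as described above: a noisy gradient (ULA) step on the smooth part of the potential followed by a proximal step on the non-smooth part; `grad_f`, `prox_g` and the step size are placeholders rather than the paper's exact implementation.

```python
import numpy as np

def psgla(x0, grad_f, prox_g, step=1e-3, n_iters=5000, rng=None):
    """Proximal Stochastic Gradient Langevin Algorithm (sketch).

    Targets a density proportional to exp(-(f(x) + g(x))) with f smooth
    (possibly non-convex) and g non-smooth but proximable. Each iteration
    is a noisy forward (gradient) step on f followed by a backward
    (proximal) step on g.
    """
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    samples = []
    for _ in range(n_iters):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_f(x) + np.sqrt(2 * step) * noise  # ULA step on f
        x = prox_g(x, step)                                   # proximal step on g
        samples.append(x.copy())
    return np.array(samples)

# example prox for an l1 term g(x) = lam * |x|_1: soft-thresholding
soft_thresh = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)
```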
Composed Image Retrieval (CoIR) has recently gained popularity as a task that considers both text and image queries together, to search for relevant images in a database. Most CoIR approaches require manually annotated datasets, comprising image-text-image triplets, where the text describes a modification from the query image to the target image. However, manual curation of CoIR triplets is expensive and prevents scalability. In this work, we instead propose a scalable automatic dataset creation methodology that generates triplets given video-caption pairs, while also expanding the scope of the task to include composed video retrieval (CoVR). To this end, we mine paired videos with a similar caption from a large database, and leverage a large language model to generate the corresponding modification text. Applying this methodology to the extensive WebVid2M collection, we automatically construct our WebVid-CoVR dataset, resulting in 1.6 million triplets. Moreover, we introduce a new benchmark for CoVR with a manually annotated evaluation set, along with baseline results. We further validate that our methodology is equally applicable to image-caption pairs, by generating 3.3 million CoIR training triplets using the Conceptual Captions dataset. Our model builds on BLIP-2 pretraining, adapting it to composed video (or image) retrieval, and incorporates an additional caption retrieval loss to exploit extra supervision beyond the triplet. We provide extensive ablations to analyze the design choices on our new CoVR benchmark. Our experiments also demonstrate that training a CoVR model on our datasets effectively transfers to CoIR, leading to improved state-of-the-art performance in the zero-shot setup on the CIRR, FashionIQ, and CIRCO benchmarks. Our code, datasets, and models are publicly available at this https URL ventural/covr.
Optimal Transport (OT) has recently emerged as a central tool in data sciences to compare in a geometrically faithful way point clouds and more generally probability distributions. The wide adoption of OT into existing data analysis and machine learning pipelines is however plagued by several shortcomings. This includes its lack of robustness to outliers, its high computational costs, the need for a large number of samples in high dimension and the difficulty to handle data in distinct spaces. In this review, we detail several recently proposed approaches to mitigate these issues. We insist in particular on unbalanced OT, which compares arbitrary positive measures, not restricted to probability distributions (i.e. their total mass can vary). This generalization of OT makes it robust to outliers and missing data. The second workhorse of modern computational OT is entropic regularization, which leads to scalable algorithms while lowering the sample complexity in high dimension. The last point presented in this review is the Gromov-Wasserstein (GW) distance, which extends OT to cope with distributions belonging to different metric spaces. The main motivation for this review is to explain how unbalanced OT, entropic regularization and GW can work hand-in-hand to turn OT into efficient geometric loss functions for data sciences.
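To make the unbalanced and entropic ingredients concrete, here is a hedged NumPy sketch of Sinkhorn-style scaling iterations for unbalanced entropic OT with KL penalties of weight `rho` on the marginals; the damped exponent `rho / (rho + eps)` is the standard form for this relaxation, but parameter names and defaults are illustrative.

```python
import numpy as np

def unbalanced_sinkhorn(a, b, C, eps=0.1, rho=1.0, n_iters=500):
    """Entropic unbalanced OT with KL marginal penalties of weight rho (sketch).

    Compared to balanced Sinkhorn, the scaling updates are damped by the
    exponent rho / (rho + eps); as rho grows this recovers the balanced
    updates, while small rho tolerates mass mismatch, missing data and
    outliers.
    """
    K = np.exp(-C / eps)
    tau = rho / (rho + eps)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = (a / (K @ v)) ** tau
        v = (b / (K.T @ u)) ** tau
    return u[:, None] * K * v[None, :]
```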
Denoising diffusions are state-of-the-art generative models exhibiting remarkable empirical performance. They work by diffusing the data distribution into a Gaussian distribution and then learning to reverse this noising process to obtain synthetic datapoints. The denoising diffusion relies on approximations of the logarithmic derivatives of the noised data densities using score matching. Such models can also be used to perform approximate posterior simulation when one can only sample from the prior and likelihood. We propose a unifying framework generalising this approach to a wide class of spaces and leading to an original extension of score matching. We illustrate the resulting models on various applications.
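A minimal PyTorch sketch of standard (Euclidean, variance-exploding) denoising score matching, the ingredient the paper generalizes to a wider class of spaces; `score_net` is a placeholder network taking the noised sample and the noise scale, and the weighting shown is one common choice rather than the paper's.

```python
import torch

def dsm_loss(score_net, x0, sigma_min=0.01, sigma_max=1.0):
    """Denoising score matching loss (sketch).

    Perturb data with Gaussian noise of random scale sigma and train
    score_net(x_t, sigma) to match the score of the noised conditional
    density, which for Gaussian noise equals -(x_t - x0) / sigma^2.
    """
    log_min = torch.log(torch.tensor(sigma_min))
    log_max = torch.log(torch.tensor(sigma_max))
    sigma = torch.exp(torch.rand(x0.shape[0], 1) * (log_max - log_min) + log_min)
    noise = torch.randn_like(x0)
    x_t = x0 + sigma * noise
    target = -noise / sigma                        # score of N(x_t; x0, sigma^2 I)
    pred = score_net(x_t, sigma)
    return ((sigma * (pred - target)) ** 2).mean()  # sigma^2-weighted objective
```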
Transformers are deep architectures that define "in-context mappings" which enable predicting new tokens based on a given set of tokens (such as a prompt in NLP applications or a set of patches for a vision transformer). In this work, we study in particular the ability of these architectures to handle an arbitrarily large number of context tokens. To mathematically, uniformly address their expressivity, we consider the case that the mappings are conditioned on a context represented by a probability distribution of tokens which becomes discrete for a finite number of these. The relevant notion of smoothness then corresponds to continuity in terms of the Wasserstein distance between these contexts. We demonstrate that deep transformers are universal and can approximate continuous in-context mappings to arbitrary precision, uniformly over compact token domains. A key aspect of our results, compared to existing findings, is that for a fixed precision, a single transformer can operate on an arbitrary (even infinite) number of tokens. Additionally, it operates with a fixed embedding dimension of tokens (this dimension does not increase with precision) and a fixed number of heads (proportional to the dimension). The use of MLPs between multi-head attention layers is also explicitly controlled. We consider both unmasked attentions (as used for the vision transformer) and masked causal attentions (as used for NLP and time series applications). We tackle the causal setting leveraging a space-time lifting to analyze causal attention as a mapping over probability distributions of tokens.
We consider the problem of minimizing a function over the manifold of orthogonal matrices. The majority of algorithms for this problem compute a direction in the tangent space, and then use a retraction to move in that direction while staying on the manifold. Unfortunately, the numerical computation of retractions on the orthogonal manifold always involves some expensive linear algebra operation, such as matrix inversion, exponential or square-root. These operations quickly become expensive as the dimension of the matrices grows. To bypass this limitation, we propose the landing algorithm which does not use retractions. The algorithm is not constrained to stay on the manifold but its evolution is driven by a potential energy which progressively attracts it towards the manifold. One iteration of the landing algorithm only involves matrix multiplications, which makes it cheap compared to its retraction counterparts. We provide an analysis of the convergence of the algorithm, and demonstrate its promises on large-scale and deep learning problems, where it is faster and less prone to numerical errors than retraction-based methods.
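A hedged NumPy sketch of a single landing-style update as described in the abstract: a relative-gradient term built from the skew-symmetric part of the Euclidean gradient times the transpose of the iterate, plus an attraction term pulling the iterate back toward orthogonality; the step size and attraction weight are illustrative placeholders.

```python
import numpy as np

def landing_step(X, grad_f, eta=0.1, lam=1.0):
    """One iteration of a landing-type update on the orthogonal manifold (sketch).

    The update mixes a relative-gradient term, which decreases the objective
    while approximately rotating X, with an attraction term
    lam * (X X^T - I) X that pulls the iterate back toward orthogonality.
    Only matrix multiplications are involved: no retraction, inverse,
    exponential or square root.
    """
    G = grad_f(X)
    A = G @ X.T
    skew = 0.5 * (A - A.T)                 # skew-symmetric part of G X^T
    landing_field = skew @ X + lam * (X @ X.T - np.eye(X.shape[0])) @ X
    return X - eta * landing_field
```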
Reinforcement Learning from Human Feedback (RLHF) has become a popular approach to align language models (LMs) with human preferences. This method involves collecting a large dataset of human pairwise preferences across various text generations and using it to infer (implicitly or explicitly) a reward model. Numerous methods have been proposed to learn the reward model and align an LM with it. However, the costly process of collecting human preferences has received little attention and could benefit from theoretical insights. This paper addresses this issue and aims to formalize reward-model training in RLHF. We frame the selection of an effective dataset as a simple regret minimization task, using a linear contextual dueling bandit method. Given the potentially large number of arms, this approach is more coherent than the best-arm identification setting. We then propose an offline framework for solving this problem. Under appropriate assumptions - linearity of the reward model in the embedding space, and boundedness of the reward parameter - we derive bounds on the simple regret. Finally, we provide a lower bound that matches our upper bound up to constant and logarithmic terms. To our knowledge, this is the first theoretical contribution in this area to provide an offline approach as well as worst-case guarantees.
Transformers exhibit compositional reasoning on sequences not observed during training, a capability often attributed to in-context learning (ICL) and skill composition. We investigate this phenomenon using the Random Hierarchy Model (RHM), a probabilistic context-free grammar that generates sequences through recursive rule application. Models are trained on subsets of sequences and evaluated across four generalization conditions: memorization, in-distribution generalization, out-of-distribution generalization with the same rules, and cross-layer transfer. Behaviorally, performance improves systematically with task complexity and the number of in-context examples, with out-of-distribution tasks requiring substantially more examples than in-distribution scenarios. Mechanistically, we identify a progressive emergence of layer specialization during training that correlates with generalization performance. Principal component analysis and attention pattern clustering reveal that transformers develop structured, hierarchically organized representations in specialized layers. These results demonstrate that transformers develop modular, interpretable mechanisms supporting compositional reasoning, linking internal algorithmic structure to observed behavioral capabilities.
Most automatic speech processing systems register degraded performance when applied to noisy or reverberant speech. But how can one tell whether speech is noisy or reverberant? We propose Brouhaha, a neural network jointly trained to extract speech/non-speech segments, speech-to-noise ratios, and C50 room acoustics from single-channel recordings. Brouhaha is trained using a data-driven approach in which noisy and reverberant audio segments are synthesized. We first evaluate its performance and demonstrate that the proposed multi-task regime is beneficial. We then present two scenarios illustrating how Brouhaha can be used on naturally noisy and reverberant data: 1) to investigate the errors made by a speaker diarization model (this http URL); and 2) to assess the reliability of an automatic speech recognition model (Whisper from OpenAI). Both our pipeline and a pretrained model are open source and shared with the speech community.
Gabriel Peyré, affiliated with CNRS and ENS, provides a comprehensive overview of the mathematical foundations enabling modern artificial intelligence, particularly focusing on analytical and probabilistic tools for neural network architectures and optimization. The article demonstrates how diverse mathematical disciplines underpin AI advancements while simultaneously showcasing how AI problems catalyze new mathematical development.
(Author affiliations for the gravitational-wave analysis below: an extensive list of institutions from the LIGO and Virgo collaborations and partner observatories, omitted here.)
The ever-increasing number of detections of gravitational waves (GWs) from compact binaries by the Advanced LIGO and Advanced Virgo detectors allows us to perform ever-more sensitive tests of general relativity (GR) in the dynamical and strong-field regime of gravity. We perform a suite of tests of GR using the compact binary signals observed during the second half of the third observing run of those detectors. We restrict our analysis to the 15 confident signals that have false alarm rates $\leq 10^{-3}\,\mathrm{yr}^{-1}$. In addition to signals consistent with binary black hole (BH) mergers, the new events include GW200115_042309, a signal consistent with a neutron star--BH merger. We find the residual power, after subtracting the best fit waveform from the data for each event, to be consistent with the detector noise. Additionally, we find all the post-Newtonian deformation coefficients to be consistent with the predictions from GR, with an improvement by a factor of ~2 in the -1PN parameter. We also find that the spin-induced quadrupole moments of the binary BH constituents are consistent with those of Kerr BHs in GR. We find no evidence for dispersion of GWs, non-GR modes of polarization, or post-merger echoes in the events that were analyzed. We update the bound on the mass of the graviton, at 90% credibility, to $m_g \leq 2.42 \times 10^{-23}\,\mathrm{eV}/c^2$. The final mass and final spin as inferred from the pre-merger and post-merger parts of the waveform are consistent with each other. The studies of the properties of the remnant BHs, including deviations of the quasi-normal mode frequencies and damping times, show consistency with the predictions of GR. In addition to considering signals individually, we also combine results from the catalog of GW signals to calculate more precise population constraints. We find no evidence in support of physics beyond GR.
A rigorous proof of convergence for Reservoir Computing to a deterministic recurrent kernel in the infinite-width limit, establishing an O(1/√N) rate, is presented. Additionally, Structured Reservoir Computing (SRC) is introduced, reducing the computational complexity of the recurrent step from O(N^2) to O(N log N) and demonstrating comparable performance on chaotic time series prediction tasks at significantly larger reservoir sizes.
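For context, a minimal NumPy sketch of the standard dense reservoir recurrence, whose per-step cost is O(N^2); the structured variant (SRC) replaces the dense matrix-vector product with fast structured transforms to reach O(N log N), which is not implemented here. Hyperparameters are illustrative.

```python
import numpy as np

def run_reservoir(inputs, n_reservoir=500, spectral_radius=0.9, seed=0):
    """Standard (dense) reservoir computing recurrence (sketch).

    x_{t+1} = tanh(W x_t + W_in u_t), with a dense random W whose
    matrix-vector product costs O(N^2) per step. A linear readout is
    then fit by regression on the collected states.
    """
    rng = np.random.default_rng(seed)
    dim_in = inputs.shape[1]
    W = rng.standard_normal((n_reservoir, n_reservoir)) / np.sqrt(n_reservoir)
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    W_in = rng.standard_normal((n_reservoir, dim_in))
    x = np.zeros(n_reservoir)
    states = []
    for u in inputs:
        x = np.tanh(W @ x + W_in @ u)
        states.append(x.copy())
    return np.array(states)
```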
Researchers from Inria, ENS, CNRS, and PSL introduce WARI and SMS, two new evaluation measures for time series segmentation, alongside a formal typology of segmentation errors. These measures enhance the interpretability of segmentation quality by accounting for temporal error positions and specific error types, providing diagnostic insights into algorithm performance.
Recent methods for visual question answering rely on large-scale annotated datasets. Manual annotation of questions and answers for videos, however, is tedious, expensive and prevents scalability. In this work, we propose to avoid manual annotation and generate a large-scale training dataset for video question answering making use of automatic cross-modal supervision. We leverage a question generation transformer trained on text data and use it to generate question-answer pairs from transcribed video narrations. Given narrated videos, we then automatically generate the HowToVQA69M dataset with 69M video-question-answer triplets. To handle the open vocabulary of diverse answers in this dataset, we propose a training procedure based on a contrastive loss between a video-question multi-modal transformer and an answer transformer. We introduce the zero-shot VideoQA task and show excellent results, in particular for rare answers. Furthermore, we demonstrate our method to significantly outperform the state of the art on MSRVTT-QA, MSVD-QA, ActivityNet-QA and How2QA. Finally, for a detailed evaluation we introduce iVQA, a new VideoQA dataset with reduced language biases and high-quality redundant manual annotations. Our code, datasets and trained models are available at this https URL.
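A hedged PyTorch sketch of a batch-wise contrastive loss between a video-question embedding and an answer embedding, in the spirit of the training procedure described above; the temperature and the symmetric cross-entropy (InfoNCE-style) form are generic choices, not necessarily the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def contrastive_vqa_loss(vq_embed, ans_embed, temperature=0.07):
    """Contrastive objective (sketch) between video-question and answer embeddings.

    Within a batch, the i-th video-question pair should match the i-th
    answer; all other answers act as negatives, and symmetrically for
    answers matched against video-question pairs.
    """
    vq = F.normalize(vq_embed, dim=-1)
    ans = F.normalize(ans_embed, dim=-1)
    logits = vq @ ans.T / temperature
    targets = torch.arange(logits.shape[0], device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))
```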
Traditional Reinforcement Learning from Human Feedback (RLHF) often relies on reward models, frequently assuming preference structures like the Bradley-Terry model, which may not accurately capture the complexities of real human preferences (e.g., intransitivity). Nash Learning from Human Feedback (NLHF) offers a more direct alternative by framing the problem as finding a Nash equilibrium of a game defined by these preferences. In this work, we introduce Nash Mirror Prox ($\mathtt{Nash-MP}$), an online NLHF algorithm that leverages the Mirror Prox optimization scheme to achieve fast and stable convergence to the Nash equilibrium. Our theoretical analysis establishes that Nash-MP exhibits last-iterate linear convergence towards the $\beta$-regularized Nash equilibrium. Specifically, we prove that the KL-divergence to the optimal policy decreases at a rate of order $(1+2\beta)^{-N/2}$, where $N$ is the number of preference queries. We further demonstrate last-iterate linear convergence for the exploitability gap and uniformly for the span semi-norm of log-probabilities, with all these rates being independent of the size of the action space. Furthermore, we propose and analyze an approximate version of Nash-MP where proximal steps are estimated using stochastic policy gradients, making the algorithm closer to applications. Finally, we detail a practical implementation strategy for fine-tuning large language models and present experiments that demonstrate its competitive performance and compatibility with existing methods.
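To illustrate the underlying optimization scheme, here is a hedged NumPy sketch of mirror prox (extragradient with KL geometry) applied to a $\beta$-regularized symmetric preference game on a finite action set; the payoff matrix `P`, the uniform reference policy, and the step size are illustrative, and the paper's Nash-MP for language-model policies is substantially more elaborate.

```python
import numpy as np

def nash_mirror_prox(P, beta=0.1, eta=0.5, n_iters=200):
    """Mirror prox for a beta-regularized preference game (sketch).

    P[i, j] is the probability that action i is preferred to action j.
    The update targets the regularized equilibrium of the symmetric game,
    with a uniform reference policy.
    """
    n = P.shape[0]
    ref = np.full(n, 1.0 / n)
    pi = ref.copy()

    def grad(p, opponent):
        # gradient of p^T P opponent - beta * KL(p || ref) w.r.t. p (up to a constant)
        return P @ opponent - beta * np.log(p / ref)

    def kl_prox(base, g):
        # entropic (KL) proximal step from `base` along gradient g
        new = base * np.exp(eta * g)
        return new / new.sum()

    for _ in range(n_iters):
        look_ahead = kl_prox(pi, grad(pi, pi))          # extrapolation step
        pi = kl_prox(pi, grad(look_ahead, look_ahead))  # update with look-ahead gradient
    return pi
```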