A comprehensive survey from the University of Oxford formally defines Agentic Reinforcement Learning (RL) for Large Language Models (LLMs) as a Partially Observable Markov Decision Process (POMDP), distinct from conventional LLM-RL, and provides a two-tiered taxonomy of capabilities and task domains. The work consolidates open-source resources and outlines critical open challenges for the field.
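For orientation, the standard POMDP formalism such a formulation builds on specifies states, actions, transition dynamics, rewards, observations, an observation function, and a discount factor; the sketch below uses conventional notation, which may differ from the survey's exact symbols.

```latex
% Conventional POMDP tuple (standard notation; the survey's exact formulation may differ).
% The agent conditions on partial observations o_t (e.g., tool outputs, dialogue context)
% rather than on the full environment state s_t.
\[
\mathcal{M} = \langle \mathcal{S}, \mathcal{A}, \mathcal{P}, \mathcal{R}, \Omega, \mathcal{O}, \gamma \rangle,
\qquad
o_t \sim \mathcal{O}(\cdot \mid s_t), \quad
a_t \sim \pi_\theta(\cdot \mid o_{\le t}), \quad
s_{t+1} \sim \mathcal{P}(\cdot \mid s_t, a_t).
\]
```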
VGGT, developed by VGG at the University of Oxford and Meta AI, introduces a 1.2 billion-parameter feed-forward transformer that directly infers camera parameters, depth maps, and 3D point clouds from multiple input images in a single pass. This model achieves state-of-the-art accuracy in 3D reconstruction and camera pose estimation (e.g., 85.3 AUC@30 on RealEstate10K) while significantly reducing inference time to approximately 0.2 seconds per scene.
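As a rough sketch of what a single-pass, feed-forward multi-view interface looks like (the function name, shapes, and output keys here are illustrative assumptions, not VGGT's actual API):

```python
# Hypothetical single-pass multi-view reconstruction call (illustrative only).
import torch

def reconstruct(model: torch.nn.Module, images: torch.Tensor) -> dict:
    """images: (N, 3, H, W) -- N views of one scene, processed jointly in one forward pass."""
    with torch.no_grad():
        out = model(images.unsqueeze(0))      # add batch dim -> (1, N, 3, H, W)
    return {
        "extrinsics": out["extrinsics"],      # (1, N, 4, 4) camera poses
        "intrinsics": out["intrinsics"],      # (1, N, 3, 3) camera intrinsics
        "depth":      out["depth"],           # (1, N, H, W) per-view depth maps
        "points":     out["points"],          # (1, N, H, W, 3) per-pixel 3D points
    }
```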
A new method, Compute as Teacher (CaT), generates supervision signals for large language models in post-training scenarios lacking traditional ground truth or programmatic verifiers. It leverages inference compute to synthesize reference-free supervision, leading to relative performance increases of up to 33% on MATH-500 and 30% on HealthBench.
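The core loop can be sketched as follows, assuming a generic `generate`/`score` interface on the models involved; this illustrates the synthesize-then-score idea rather than the paper's exact recipe.

```python
# Sketch of reference-free supervision from inference compute (illustrative only).
def compute_as_teacher_reward(policy, anchor, prompt, n_rollouts=8):
    # 1) Spend inference compute: sample several candidate solutions from the policy.
    rollouts = [policy.generate(prompt) for _ in range(n_rollouts)]

    # 2) A frozen anchor model reconciles the rollouts into one synthesized reference.
    synthesis_prompt = (
        "Combine the following attempted solutions into a single best answer:\n\n"
        + "\n\n---\n\n".join(rollouts)
    )
    reference = anchor.generate(synthesis_prompt)

    # 3) Score each rollout against the synthesized reference (e.g., via a rubric or
    #    matcher), yielding a training signal without ground truth or verifiers.
    rewards = [anchor.score(candidate, reference) for candidate in rollouts]
    return rewards, reference
```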
A tutorial developed by the University of Oxford and Hugging Face guides readers through modern robot learning, detailing the transition from classical methods to data-driven, learning-based paradigms. It provides conceptual understanding and practical tools using the `lerobot` open-source library, covering Reinforcement Learning, Imitation Learning, and generalist Vision-Language-Action policies with end-to-end examples.
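To give a flavor of the imitation-learning paradigm the tutorial covers, here is a minimal behavior-cloning loop on dummy tensors (illustrative only; it does not use the `lerobot` dataset or policy APIs).

```python
# Behavior cloning: supervised regression of actions onto expert demonstrations.
import torch
import torch.nn as nn

obs_dim, act_dim = 12, 4
policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, act_dim))
optim = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Dummy demonstration data: (observation, expert action) pairs.
obs = torch.randn(256, obs_dim)
expert_actions = torch.randn(256, act_dim)

for _ in range(100):
    loss = nn.functional.mse_loss(policy(obs), expert_actions)
    optim.zero_grad()
    loss.backward()
    optim.step()
```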
Researchers from University College London and collaborators developed GARNN, an interpretable graph attentive recurrent neural network for predicting blood glucose levels from multivariate time series data. The model consistently achieved state-of-the-art prediction accuracy across four clinical datasets while providing clinically justifiable temporal and global interpretations of variable importance, particularly excelling at attributing sparse event contributions.
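A minimal sketch of the general pattern, attention over variables at each step followed by recurrence over time, is below; layer sizes, pooling, and the attention form are assumptions rather than GARNN's actual architecture (whose attention weights also supply the interpretability).

```python
# Graph-attentive recurrent forecaster for multivariate time series (illustrative sketch).
import torch
import torch.nn as nn

class GraphAttentiveRecurrent(nn.Module):
    def __init__(self, n_vars: int, d_model: int = 32, horizon: int = 1):
        super().__init__()
        self.embed = nn.Linear(1, d_model)                       # per-variable scalar -> embedding
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.gru = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, horizon)                  # forecast future glucose values

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_vars)
        b, t, v = x.shape
        h = self.embed(x.reshape(b * t, v, 1))                   # (b*t, vars, d)
        h, attn_weights = self.attn(h, h, h)                     # attention over variables per step
        h = h.mean(dim=1).reshape(b, t, -1)                      # pool variables -> (b, t, d)
        h, _ = self.gru(h)                                       # recurrence over time
        return self.head(h[:, -1])                               # predict from last hidden state

# Example: 24 past steps of 8 variables -> next reading
model = GraphAttentiveRecurrent(n_vars=8)
print(model(torch.randn(2, 24, 8)).shape)  # torch.Size([2, 1])
```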
Researchers from Imperial College London, Shanghai AI Lab, FLock.io, and HKUST developed zkFL, a system that integrates Zero-Knowledge Proofs to guarantee the integrity of gradient aggregation in Federated Learning against a malicious central aggregator. An extended blockchain-based variant optimizes client-side verification and achieves significantly reduced on-chain costs compared to prior blockchain FL approaches, while maintaining training performance.
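A toy illustration of the property zkFL targets, namely that the published aggregate really is the sum of the committed client updates, is sketched below; a naive recomputation stands in for what a zero-knowledge proof would establish without revealing individual updates.

```python
# Toy check of aggregation integrity (hash commitments + recomputation; a real ZKP
# would prove the sum is correct without opening the individual client updates).
import hashlib
import numpy as np

def commit(update: np.ndarray) -> str:
    # Clients publish a binding commitment (hash) to their local update.
    return hashlib.sha256(update.tobytes()).hexdigest()

def verify_aggregate(claimed: np.ndarray,
                     updates: list[np.ndarray],
                     commitments: list[str]) -> bool:
    # 1) Each opened update must match its previously published commitment.
    if any(commit(u) != c for u, c in zip(updates, commitments)):
        return False
    # 2) The aggregator's claimed aggregate must equal the true sum of the updates.
    return np.allclose(claimed, np.sum(updates, axis=0))

updates = [np.random.randn(4).astype(np.float32) for _ in range(3)]
commitments = [commit(u) for u in updates]
print(verify_aggregate(np.sum(updates, axis=0), updates, commitments))  # True
```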
Research from institutions including the UK AI Security Institute and Anthropic demonstrates that the success of poisoning attacks on Large Language Models is determined by a near-constant absolute number of malicious samples, rather than by a percentage of the total training data. As few as 250 poisoned documents were sufficient to backdoor models ranging from 600 million to 13 billion parameters, though subsequent alignment training significantly reduced attack success.
Researchers at Meta Superintelligence Labs and the University of Oxford empirically demonstrate that Large Language Models acquire separable visual perceptual and reasoning abilities from text-only pre-training, identifying optimal data mixtures for cultivating these "visual priors" and showing improved multimodal performance. The work introduces the Multi-Level Existence Bench (MLE-Bench) for fine-grained perceptual evaluation.
OpenAI researchers and collaborators evaluate GPT-5's utility in accelerating scientific research across diverse fields, demonstrating its capacity to contribute to rediscovering known results, literature search, collaborative problem-solving, and the generation of novel scientific findings. The model was shown to compress research timelines from months to hours and provided verifiable new insights in mathematics, physics, and biology.
Sakana AI and collaborating institutions developed a comprehensive framework for fully automated scientific discovery, enabling AI agents to conduct an entire research endeavor from hypothesis to peer-reviewed paper. This system generated hundreds of medium-quality papers in machine learning at a cost of under $15 per paper, with its automated reviewer achieving near-human-level performance.
The paper investigates whether "thinking models" acquire entirely new reasoning capabilities or simply learn to better utilize pre-existing ones from their base counterparts. It demonstrates that base models possess latent reasoning abilities and that thinking models primarily learn *when* and *how* to deploy these mechanisms; a hybrid approach that steered a base LLM with specific reasoning vectors recovered up to 91% of the performance gap on mathematical benchmarks.
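Activation steering of this kind is commonly implemented by adding a fixed vector to a layer's hidden states during generation; the sketch below shows that mechanism in PyTorch and does not reproduce the paper's specific hybrid-model construction or its reasoning vectors.

```python
# Generic activation-steering hook: add a fixed vector to a layer's hidden states.
import torch

def add_steering_hook(layer: torch.nn.Module, steering_vector: torch.Tensor, alpha: float = 1.0):
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + alpha * steering_vector.to(hidden.device, hidden.dtype)
        return (steered, *output[1:]) if isinstance(output, tuple) else steered
    return layer.register_forward_hook(hook)

# Usage (assuming a Hugging Face-style decoder; the layer path is model-dependent):
# handle = add_steering_hook(model.model.layers[20], reasoning_vector, alpha=4.0)
# out = model.generate(**inputs)
# handle.remove()
```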
VGG networks systematically demonstrated that increasing convolutional network depth using homogeneous 3x3 filters consistently improves large-scale image recognition accuracy, achieving top results on ImageNet and establishing powerful, transferable feature extractors.
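The design principle is easy to show in code: stacks of homogeneous 3x3 convolutions (two such layers cover a 5x5 receptive field, three cover 7x7) separated by 2x2 max pooling. The sketch below mirrors the VGG-16 channel progression.

```python
# VGG-style feature extractor built from homogeneous 3x3 convolution blocks.
import torch.nn as nn

def vgg_block(in_ch: int, out_ch: int, n_convs: int) -> nn.Sequential:
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, kernel_size=3, padding=1),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(kernel_size=2, stride=2))   # halve spatial resolution per block
    return nn.Sequential(*layers)

# VGG-16-style stack: 64-64 / 128-128 / 256-256-256 / 512-512-512 / 512-512-512
features = nn.Sequential(
    vgg_block(3, 64, 2), vgg_block(64, 128, 2), vgg_block(128, 256, 3),
    vgg_block(256, 512, 3), vgg_block(512, 512, 3),
)
```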