alphaXiv

Stanford University

9,559

06 Oct 2025

agentic-frameworks agents computer-science

Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

UC Berkeley

Stanford University

SambaNova

The Agentic Context Engineering (ACE) framework dynamically evolves and curates comprehensive 'playbook' contexts for large language models, allowing them to continuously improve performance. This enables smaller, open-source models to match or exceed proprietary LLM agent performance on benchmarks like AppWorld, simultaneously reducing adaptation latency by up to 91.5% and token cost by 83.6%.

3,812

04 Nov 2025

computer-science artificial-intelligence computation-and-language

FlowRL: Matching Reward Distributions for LLM Reasoning

Shanghai AI Laboratory

Shanghai Jiao Tong University

Tsinghua University

Stanford University

Renmin University of China

Peking University

Microsoft

Toyota Technological Institute at Chicago

daixuan cheng

FlowRL presents a policy optimization algorithm for large language models that leverages GFlowNet principles to match reward distributions rather than merely maximizing expected reward. This approach yielded superior performance on math and code reasoning benchmarks and notably increased the diversity of generated solutions.

4,450

10 May 2025

attention-mechanisms computer-science computation-and-language

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

Alibaba Group

Tsinghua University

Stanford University

University of Edinburgh

MIT

Yu Le

A gating mechanism applied to the output of scaled dot-product attention in large language models improves training stability and performance across benchmarks while mitigating attention sink issues, demonstrated through extensive experiments on 15B parameter MoE models and 1.7B dense models trained on 3.5 trillion tokens.
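To make the idea of gating the attention output concrete, here is a minimal single-head sketch in pure Python. It assumes a sigmoid gate computed from the layer input `x` through a hypothetical weight matrix `w_gate`, multiplied elementwise into the output of scaled dot-product attention. The function names, the unmasked attention, and the exact gate placement are illustrative assumptions, not the paper's implementation.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain list-of-rows matrix product a @ b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def sigmoid(z):
    # Stable logistic function for both signs of z.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def sdpa(q, k, v):
    # Scaled dot-product attention without masking: softmax(q k^T / sqrt(d)) v.
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v)

def gated_attention(x, q, k, v, w_gate):
    # Assumed form: output = sigmoid(x @ w_gate) * SDPA(q, k, v), elementwise.
    attn = sdpa(q, k, v)
    gate_logits = matmul(x, w_gate)
    return [[a * sigmoid(g) for a, g in zip(a_row, g_row)]
            for a_row, g_row in zip(attn, gate_logits)]
```

With an all-zero `w_gate` every gate equals 0.5, so the output is exactly half of the ungated attention output. A learned gate can instead push individual entries toward zero, which is the sparsity the title refers to.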

32,677

30 Jul 2024

computer-science artificial-intelligence computation-and-language

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Stanford University

CZ Biohub

Direct Preference Optimization (DPO) introduces a method for fine-tuning large language models to align with human preferences that avoids the complexity of Reinforcement Learning from Human Feedback (RLHF). It reparameterizes the RLHF objective, allowing direct policy optimization and matching or exceeding PPO-based methods in performance and stability across summarization and dialogue tasks.
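The reparameterization mentioned above leads to a simple per-example loss on a preferred and a rejected response. The sketch below computes that loss from sequence log-probabilities in pure Python. The function name, argument names, and the default `beta=0.1` are illustrative assumptions, and the log-probabilities are assumed to be precomputed by the policy and a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit rewards are beta-scaled log-ratios between policy and reference.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # Loss is -log(sigmoid(margin)), evaluated in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy equals the reference model the margin is zero and the loss is log 2. The loss decreases as the policy raises the likelihood of the preferred response relative to the rejected one.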

2,400

25,703

05 Sep 2024

computer-science machine-learning robotics

OpenVLA: An Open-Source Vision-Language-Action Model

Google DeepMind

UC Berkeley

Stanford University

MIT

Toyota Research Institute

Physical Intelligence

OpenVLA introduces a fully open-source, 7B-parameter Vision-Language-Action model that sets a new state of the art for generalist robot manipulation, outperforming larger closed-source models by 16.5% absolute success rate. The model also demonstrates effective and efficient fine-tuning strategies for adapting to new robot setups and tasks on commodity hardware.

2,361

48,023

30 Jul 2025

computer-science data-structures-and-algorithms

Breaking the Sorting Barrier for Directed Single-Source Shortest Paths

Tsinghua University

Stanford University

Max-Planck Institute for Informatics

Researchers from Tsinghua University, Stanford University, and the Max Planck Institute for Informatics developed a deterministic algorithm for the single-source shortest path problem on directed graphs with non-negative real edge weights. The algorithm achieves O(m log^(2/3) n) time complexity, marking the first time the long-standing O(m + n log n) sorting barrier has been surpassed in the comparison-addition model for this problem.
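For context, the O(m + n log n) bound is the running time of Dijkstra's algorithm with a Fibonacci heap. The sketch below is the common binary-heap variant of Dijkstra, which runs in O((m + n) log n). It is the classical baseline the paper improves on, not the paper's algorithm, and the adjacency-list format is an assumption made for illustration.

```python
import heapq

def dijkstra(graph, source):
    # graph maps each vertex to a list of (neighbor, non-negative weight) pairs.
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        # Skip stale heap entries left behind by earlier improvements.
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Dijkstra settles vertices in increasing order of distance, which implicitly sorts them; that ordering requirement is where the "sorting barrier" name comes from.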

2,702

02 Oct 2025

agents chain-of-thought computer-science

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems

Carnegie Mellon University

Stanford University

The RLAD framework enables large language models to self-discover and leverage high-level reasoning abstractions, leading to substantial improvements in accuracy and compute efficiency on challenging mathematical reasoning tasks. This approach teaches models to propose and utilize concise procedural and factual knowledge to guide complex problem-solving.

107

2,517

20 Nov 2025

agentic-frameworks agents computer-science

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Stanford University

Salesforce Research

UNC-Chapel Hill

Agent0, developed by researchers at UNC-Chapel Hill, Salesforce Research, and Stanford University, introduces a fully autonomous framework for evolving high-performing LLM agents from scratch, without human-curated data. The system achieves substantial improvements, such as an 18% increase in mathematical reasoning and a 24% boost in general reasoning for Qwen3-8B-Base models, through a co-evolutionary loop and tool-integrated learning.

2,093

23 Feb 2024

computer-science artificial-intelligence computer-vision-and-pattern-recognition

Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts

Stanford University

École Polytechnique Fédérale de Lausanne (EPFL)

The Co-Supervised Learning (CSL) framework enhances weak-to-strong generalization by leveraging a hierarchical mixture of specialized weak teachers and an iterative, student-guided assignment process. This approach, which also incorporates a conservative noise reduction mechanism, achieves over 15% improvement in Performance Gap Recovery (PGR) on ImageNet and 17% PGR on DomainNet compared to vanilla single-teacher methods.

9,313

31 Jan 2024

computer-science computation-and-language machine-learning

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

Stanford University

RAPTOR, developed by researchers at Stanford University, introduces a method for creating a hierarchical tree-structured index of documents to improve retrieval for large language models. This approach enables the retrieval of information at various levels of abstraction, leading to superior performance on challenging long-document question-answering tasks.

1,100

2,019

29 Feb 2024

ai-for-genomics computer-science computer-vision-and-pattern-recognition

Is your data alignable? Principled and interpretable alignability testing and integration of single-cell data

Stanford University

The Spectral Manifold Alignment and Inference (SMAI) framework enables principled and interpretable integration of single-cell data by first rigorously assessing dataset alignability and then applying a structure-preserving alignment. This approach reduces data distortion, improves the reliability of downstream analyses such as differential expression and spatial gene prediction, and provides a clear, quantitative interpretation of batch effects as combinations of scaling, translation, and rotation.

1,909

19 Sep 2025

computer-science contrastive-learning artificial-intelligence

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

Tsinghua University

Stanford University

NVIDIA

DiffusionNFT presents an online reinforcement learning framework that optimizes diffusion models on the forward process through negative-aware fine-tuning. This method achieves 3x to 25x higher efficiency than previous state-of-the-art methods and significantly improves image generation quality and alignment, notably enabling Classifier-Free Guidance-free operation.

190

1,892

06 Nov 2025

computer-science continual-learning computer-vision-and-pattern-recognition

Cambrian-S: Towards Spatial Supersensing in Video

New York University

Stanford University

Ellis Brown

Researchers from NYU and Stanford introduce a "spatial supersensing" hierarchy for video-based Multimodal Large Language Models (MLLMs) and new VSI-SUPER benchmarks to reveal current models' limitations in genuine spatial and temporal reasoning. They develop Cambrian-S, a specialized MLLM trained on a large spatial dataset achieving state-of-the-art on VSI-Bench, and prototype a "predictive sensing" paradigm that leverages prediction error ("surprise") to robustly improve memory management and event segmentation in arbitrarily long videos.

1,850

07 Oct 2025

agentic-frameworks agents chain-of-thought

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Stanford University

University of California, San Diego

Texas A&M University

LAMBDA

Isaac Zhang

AGENTFLOW introduces a trainable agentic framework for large language models that integrates specialized modules with in-the-flow reinforcement learning for effective planning and tool use. Utilizing a 7B-scale backbone, this system surpassed GPT-4o's performance across diverse complex reasoning tasks.

463

11,858

24 Jun 2022

attention-mechanisms computer-science machine-learning

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

Stanford University

University at Buffalo SUNY

FlashAttention, developed at Stanford's Hazy Research lab, introduces an exact attention algorithm that optimizes memory access patterns to mitigate the IO bottleneck on GPUs. This method achieves substantial speedups and reduces memory footprint for Transformer models, enabling processing of significantly longer sequence lengths.

2,195

27 Oct 2024

computer-science continual-learning machine-learning

Learning Continually by Spectral Regularization

Google DeepMind

Stanford University

University of Alberta

Researchers introduce spectral regularization, a method that maintains neural network plasticity and trainability by explicitly controlling the spectral norms of layer weights. This technique consistently improved performance in diverse continual supervised and reinforcement learning tasks while demonstrating robustness across various non-stationarities and reduced hyperparameter sensitivity.
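As a rough illustration of controlling a layer's spectral norm, the sketch below estimates the largest singular value of a weight matrix with power iteration and turns it into a penalty. The penalty form `(sigma - target) ** 2` with `target=1.0` and both function names are assumptions for illustration and may differ from the regularizer used in the paper.

```python
import math

def spectral_norm(w, iters=100):
    # Power iteration on W^T W; w is a list of rows.
    # Assumes the start vector is not orthogonal to the top singular vector.
    n_rows, n_cols = len(w), len(w[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iters):
        u = [sum(wij * vj for wij, vj in zip(row, v)) for row in w]              # u = W v
        v = [sum(w[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # v = W^T u
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in v]
    u = [sum(wij * vj for wij, vj in zip(row, v)) for row in w]
    return math.sqrt(sum(x * x for x in u))

def spectral_penalty(w, target=1.0):
    # Assumed penalty: squared deviation of the spectral norm from the target.
    return (spectral_norm(w) - target) ** 2
```

In training, a term like this would be summed over layers and added to the task loss with a small coefficient, so that no layer's largest singular value drifts far from the target.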

2,143

23 May 2024

computer-science computer-vision-security artificial-intelligence

Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials

Shanghai AI Laboratory

Shanghai Jiao Tong University

Stanford University

The Chinese University of Hong Kong

Nanyang Technological University

Fudan University

Researchers from Fudan University and Shanghai AI Laboratory introduce "Make-it-Real," a framework that leverages GPT-4V to automatically paint 3D objects with realistic materials from albedo-only inputs. It generates a full suite of SVBRDF maps, achieving up to 77.8% human user preference and 84.8% GPT evaluation preference for refined objects over unrefined ones, significantly enhancing visual authenticity.

176

2,134

02 Dec 2024

ai-for-health computer-science artificial-intelligence

Best Practices for Large Language Models in Radiology

University of Zurich

Stanford University

Hugging Face

Icahn School of Medicine at Mount Sinai

UT Health San Antonio

University Hospital Zurich

Researchers from the Stanford Center for Artificial Intelligence in Medicine and Imaging present a comprehensive guide outlining best practices for integrating Large Language Models (LLMs) into radiology workflows. The work synthesizes technical foundations, identifies critical applications and challenges, and recommends strategies for responsible development, evaluation, and deployment, emphasizing techniques like Retrieval-Augmented Generation for improved factual accuracy and safety.

2,089

11 Mar 2024

computer-science artificial-intelligence computer-vision-and-pattern-recognition

Mapping High-level Semantic Regions in Indoor Environments without Object Recognition

Georgia Institute of Technology

Stanford University

University of Modena and Reggio Emilia

This research develops a method for robots to generate high-level semantic region maps of indoor environments (e.g., "kitchen," "bedroom") without relying on explicit object recognition. The approach adapts Vision-Language Models for embodied perception, outperforming object-based and traditional scene classification baselines in online mapping.

2,040

11 Apr 2024

computer-science robotics imitation-learning

One-Shot Transfer of Long-Horizon Extrinsic Manipulation Through Contact Retargeting

Stanford University

NVIDIA

Ruocheng Wang

Researchers from Stanford University and NVIDIA developed a contact retargeting framework that enables one-shot transfer of long-horizon extrinsic manipulation skills to diverse objects and environments. This method achieved an 80.5% success rate on hardware across various complex tasks, significantly outperforming prior reinforcement learning approaches by effectively generalizing from a single human demonstration.
