Stanford University
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

The Agentic Context Engineering (ACE) framework dynamically evolves and curates comprehensive 'playbook' contexts for large language models, allowing them to improve continuously. This enables smaller open-source models to match or exceed the performance of proprietary LLM agents on benchmarks like AppWorld, while reducing adaptation latency by up to 91.5% and token cost by 83.6%.
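
The core loop is easy to picture in code. Below is a minimal sketch of an ACE-style generate/reflect/curate cycle over a persistent playbook; `llm` and the prompt strings are hypothetical stand-ins for this sketch, not the authors' implementation.

```python
def llm(prompt: str) -> str:
    # Hypothetical stand-in: plug in any chat-completion client here.
    raise NotImplementedError

def ace_step(playbook: list[str], task: str) -> list[str]:
    context = "\n".join(playbook)
    # 1. Generate: attempt the task with the current playbook as context.
    trajectory = llm(f"Playbook:\n{context}\n\nTask: {task}\nSolve step by step.")
    # 2. Reflect: distill concrete, reusable lessons from the attempt.
    lessons = llm(f"Trajectory:\n{trajectory}\n\nList reusable lessons, one per line.")
    # 3. Curate: apply lessons as incremental delta updates rather than
    #    rewriting the whole context -- this is what keeps adaptation cheap.
    for lesson in lessons.splitlines():
        if lesson and lesson not in playbook:
            playbook.append(lesson)
    return playbook
```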

FlowRL: Matching Reward Distributions for LLM Reasoning

FlowRL presents a policy optimization algorithm for large language models that leverages GFlowNet principles to match reward distributions rather than merely maximizing expected reward. This approach yielded superior performance on math and code reasoning benchmarks and notably increased the diversity of generated solutions.
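
The heart of the method is a GFlowNet-style trajectory-balance objective: rather than maximizing reward, the policy's log-probability is pushed toward the scaled reward minus a learned log-partition term, so responses are sampled in proportion to exp(β·r). A minimal sketch of that squared residual, omitting refinements the full objective adds on top:

```python
import numpy as np

def trajectory_balance_loss(log_pi, reward, log_z, beta=1.0):
    """log_pi: summed token log-probs of sampled responses, shape (B,)
    reward: scalar rewards for those responses, shape (B,)
    log_z:  learned per-prompt log-partition estimates, shape (B,)
    Driving the residual to zero makes pi(y|x) proportional to exp(beta*r),
    which preserves solution diversity instead of collapsing onto one mode."""
    residual = np.asarray(log_z) + np.asarray(log_pi) - beta * np.asarray(reward)
    return np.mean(residual ** 2)
```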

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

A gating mechanism applied to the output of scaled dot-product attention in large language models improves training stability and performance across benchmarks while mitigating attention-sink issues. The authors demonstrate this through extensive experiments on 15B-parameter MoE models and 1.7B-parameter dense models trained on 3.5 trillion tokens.
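
In its simplest form the change is one extra projection and an elementwise product. A single-head NumPy sketch, assuming the gate is computed from the layer input with a dedicated projection (the paper studies several placements and head-specific variants):

```python
import numpy as np

def gated_attention(q, k, v, x, w_gate):
    """Scaled dot-product attention followed by a sigmoid output gate.
    q, k, v, x: (seq, d) arrays; w_gate: (d, d) gate projection."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    out = weights @ v
    gate = 1.0 / (1.0 + np.exp(-(x @ w_gate)))     # sigmoid gate from input
    return gate * out   # input-dependent gating adds non-linearity and sparsity
```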

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Direct Preference Optimization (DPO) introduces a method for fine-tuning large language models to align with human preferences that avoids the complexity of Reinforcement Learning from Human Feedback (RLHF). It reparameterizes the RLHF objective so the policy can be optimized directly, matching or exceeding PPO-based methods in performance and stability across summarization and dialogue tasks.
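
The resulting objective is compact enough to quote: a logistic loss on the difference of log-probability margins between the policy and a frozen reference model. A NumPy version, taking summed per-response log-probs as inputs:

```python
import numpy as np

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """pi_* / ref_*: summed log-probs of the chosen and rejected responses
    under the trained policy and the frozen reference model, shape (B,).
    beta controls how far the policy may drift from the reference."""
    margin = (pi_chosen - pi_rejected) - (ref_chosen - ref_rejected)
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed stably:
    return np.mean(np.logaddexp(0.0, -beta * margin))
```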

OpenVLA: An Open-Source Vision-Language-Action Model

OpenVLA introduces a fully open-source, 7B-parameter Vision-Language-Action model that sets a new state of the art for generalist robot manipulation, outperforming larger closed-source models by 16.5% in absolute success rate. The model also demonstrates effective and efficient fine-tuning strategies for adapting to new robot setups and tasks on commodity hardware.

Breaking the Sorting Barrier for Directed Single-Source Shortest Paths

Researchers from Tsinghua University, Stanford University, and the Max Planck Institute for Informatics developed a deterministic algorithm for the single-source shortest path problem on directed graphs with non-negative real edge weights. The algorithm achieves O(m log^(2/3) n) time complexity, marking the first time the long-standing O(m + n log n) sorting barrier has been surpassed in the comparison-addition model for this problem.

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems

The RLAD framework enables large language models to self-discover and leverage high-level reasoning abstractions, leading to substantial improvements in accuracy and compute efficiency on challenging mathematical reasoning tasks. This approach teaches models to propose and utilize concise procedural and factual knowledge to guide complex problem-solving.

Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

Agent0, developed by researchers at UNC-Chapel Hill, Salesforce Research, and Stanford University, introduces a fully autonomous framework for evolving high-performing LLM agents from scratch, without human-curated data. The system achieves substantial improvements, such as an 18% increase in mathematical reasoning and a 24% boost in general reasoning for Qwen3-8B-Base models, through a co-evolutionary loop and tool-integrated learning.

Co-Supervised Learning: Improving Weak-to-Strong Generalization with Hierarchical Mixture of Experts

The Co-Supervised Learning (CSL) framework enhances weak-to-strong generalization by leveraging a hierarchical mixture of specialized weak teachers and an iterative, student-guided assignment process. This approach, which also incorporates a conservative noise-reduction mechanism, improves Performance Gap Recovery (PGR) by over 15% on ImageNet and 17% on DomainNet compared to vanilla single-teacher methods.
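
A rough sketch of the assignment step, under stated assumptions: each sample is routed to the weak teacher that agrees most with the current student, and low-confidence pseudo-labels are discarded. The agreement measure and threshold below are illustrative choices, not the paper's exact criteria.

```python
import numpy as np

def assign_and_filter(student_probs, teacher_probs, tau=0.5):
    """student_probs: (N, C) current student predictions.
    teacher_probs: (T, N, C) predictions from T specialized weak teachers.
    Returns pseudo-labels, a keep-mask, and the chosen teacher per sample."""
    # Negative cross-entropy of the student under each teacher:
    # higher means that teacher agrees more with the student.
    agreement = (teacher_probs * np.log(student_probs + 1e-12)[None]).sum(-1)
    chosen = agreement.argmax(0)                       # teacher per sample
    idx = np.arange(student_probs.shape[0])
    picked = teacher_probs[chosen, idx]                # (N, C)
    labels = picked.argmax(-1)                         # pseudo-labels
    keep = picked.max(-1) >= tau                       # conservative filtering
    return labels, keep, chosen
```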

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

RAPTOR, developed by researchers at Stanford University, introduces a method for creating a hierarchical tree-structured index of documents to improve retrieval for large language models. This approach enables the retrieval of information at various levels of abstraction, leading to superior performance on challenging long-document question-answering tasks.
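
The build procedure is a short recursion: embed the leaf chunks, cluster them, summarize each cluster, and repeat on the summaries until a single root remains; retrieval then searches across every level of the tree. A minimal sketch, with `embed`, `cluster`, and `summarize` as hypothetical stand-ins (the paper uses soft Gaussian-mixture clustering over reduced embeddings and an LLM summarizer):

```python
def build_tree(chunks, embed, cluster, summarize):
    """chunks: list of text chunks. cluster() must return groups of node
    indices and reduce the node count each round, or the loop won't end."""
    levels = [list(chunks)]
    while len(levels[-1]) > 1:
        nodes = levels[-1]
        groups = cluster([embed(n) for n in nodes])   # list of index lists
        levels.append([summarize([nodes[i] for i in g]) for g in groups])
    return levels   # retrieval searches all levels, from leaves to root
```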

Is your data alignable? Principled and interpretable alignability testing and integration of single-cell data

The Spectral Manifold Alignment and Inference (SMAI) framework enables principled and interpretable integration of single-cell data by first rigorously assessing dataset alignability and then applying a structure-preserving alignment. This approach reduces data distortion, improves the reliability of downstream analyses such as differential expression and spatial gene prediction, and provides a clear, quantitative interpretation of batch effects as combinations of scaling, translation, and rotation.
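
That transformation class (rotation, isotropic scaling, translation) is the classical similarity/Procrustes family, which is what makes the recovered batch effect interpretable. A minimal NumPy version of the alignment step on matched cells; SMAI's alignability test and spectral matching, which precede this step, are not reproduced here:

```python
import numpy as np

def similarity_align(x, y):
    """Least-squares similarity transform mapping point set x onto y.
    Rows are matched cells, columns features. Reflections are not
    handled in this sketch."""
    mx, my = x.mean(axis=0), y.mean(axis=0)
    xc, yc = x - mx, y - my
    u, s, vt = np.linalg.svd(xc.T @ yc)
    rotation = u @ vt                         # optimal rotation
    scale = s.sum() / (xc ** 2).sum()         # optimal isotropic scaling
    translation = my - scale * mx @ rotation
    return scale * x @ rotation + translation
```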

DiffusionNFT: Online Diffusion Reinforcement with Forward Process

DiffusionNFT presents an online reinforcement learning framework that optimizes diffusion models on the forward process through negative-aware fine-tuning. This method achieves 3x to 25x higher efficiency than previous state-of-the-art methods and significantly improves image generation quality and alignment, notably enabling operation without Classifier-Free Guidance.

Cambrian-S: Towards Spatial Supersensing in Video

Researchers from NYU and Stanford introduce a "spatial supersensing" hierarchy for video-based Multimodal Large Language Models (MLLMs) and new VSI-SUPER benchmarks that reveal current models' limitations in genuine spatial and temporal reasoning. They develop Cambrian-S, a specialized MLLM trained on a large spatial dataset that achieves state-of-the-art results on VSI-Bench, and prototype a "predictive sensing" paradigm that leverages prediction error ("surprise") to robustly improve memory management and event segmentation in arbitrarily long videos.
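
The predictive-sensing prototype reduces to a simple rule: a learned model predicts the next frame's features, and frames whose prediction error spikes are treated as event boundaries that trigger memory consolidation. A schematic sketch; `predict` and `threshold` are hypothetical stand-ins:

```python
import numpy as np

def surprise_boundaries(features, predict, threshold):
    """features: list of per-frame feature vectors; predict: learned
    next-frame predictor. Returns indices where surprise spikes."""
    boundaries = []
    for t in range(1, len(features)):
        surprise = np.linalg.norm(features[t] - predict(features[t - 1]))
        if surprise > threshold:
            boundaries.append(t)   # unexpected frame: start a new segment
    return boundaries
```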

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

AGENTFLOW introduces a trainable agentic framework for large language models that integrates specialized modules with in-the-flow reinforcement learning for effective planning and tool use. Using a 7B-scale backbone, the system surpasses GPT-4o across diverse complex reasoning tasks.

FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness

FlashAttention, developed at Stanford's Hazy Research lab, introduces an exact attention algorithm that optimizes memory access patterns to mitigate the IO bottleneck on GPUs. This method achieves substantial speedups and reduces memory footprint for Transformer models, enabling processing of significantly longer sequence lengths.
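
The key trick is an online softmax: stream K and V through SRAM-sized blocks while carrying a running row maximum and normalizer, so the full n×n score matrix never touches slow memory. A single-head NumPy sketch of the algorithm (the real contribution is the fused, IO-aware CUDA kernel):

```python
import numpy as np

def flash_attention(q, k, v, block=128):
    """Blockwise attention with online softmax; never materializes
    the full (n x n) score matrix. q: (n, d); k, v: (m, d)."""
    n, d = q.shape
    out = np.zeros((n, d))
    row_max = np.full(n, -np.inf)    # running row-wise max
    norm = np.zeros(n)               # running softmax normalizer
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = q @ kb.T / np.sqrt(d)                  # scores for this block
        new_max = np.maximum(row_max, s.max(axis=1))
        scale = np.exp(row_max - new_max)          # rescale earlier partials
        p = np.exp(s - new_max[:, None])
        norm = norm * scale + p.sum(axis=1)
        out = out * scale[:, None] + p @ vb
        row_max = new_max
    return out / norm[:, None]
```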

Learning Continually by Spectral Regularization

Researchers introduce spectral regularization, a method that maintains neural network plasticity and trainability by explicitly controlling the spectral norms of layer weights. This technique consistently improved performance in diverse continual supervised and reinforcement learning tasks while demonstrating robustness across various non-stationarities and reduced hyperparameter sensitivity.
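
Controlling the spectral norm only needs a cheap estimate of the largest singular value, typically obtained by power iteration and carried across training steps. A NumPy sketch of one regularization step; the target value and loss weighting are per-task choices, shown here as assumptions:

```python
import numpy as np

def spectral_penalty(w, u, iters=1, target=1.0):
    """Estimates the largest singular value of weight matrix w via power
    iteration (warm-started from u), then penalizes its squared distance
    from a target value to keep the layer trainable. Returns the penalty
    and the updated u to carry into the next step."""
    for _ in range(iters):
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v                 # estimated spectral norm
    return (sigma - target) ** 2, u
```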

Make-it-Real: Unleashing Large Multimodal Model for Painting 3D Objects with Realistic Materials

Researchers from Fudan University and Shanghai AI Laboratory introduce "Make-it-Real," a framework that leverages GPT-4V to automatically paint 3D objects with realistic materials from albedo-only inputs. It generates a full suite of SVBRDF maps, achieving up to 77.8% human user preference and 84.8% GPT evaluation preference for refined objects over unrefined ones, significantly enhancing visual authenticity.

Best Practices for Large Language Models in Radiology

Researchers from the Stanford Center for Artificial Intelligence in Medicine and Imaging present a comprehensive guide outlining best practices for integrating Large Language Models (LLMs) into radiology workflows. The work synthesizes technical foundations, identifies critical applications and challenges, and recommends strategies for responsible development, evaluation, and deployment, emphasizing techniques like Retrieval-Augmented Generation for improved factual accuracy and safety.

Mapping High-level Semantic Regions in Indoor Environments without Object Recognition

This research develops a method for robots to generate high-level semantic region maps of indoor environments (e.g., "kitchen," "bedroom") without relying on explicit object recognition. The approach adapts Vision-Language Models for embodied perception, outperforming object-based and traditional scene classification baselines in online mapping.

One-Shot Transfer of Long-Horizon Extrinsic Manipulation Through Contact Retargeting
11 Apr 2024

Researchers from Stanford University and NVIDIA developed a contact retargeting framework that enables one-shot transfer of long-horizon extrinsic manipulation skills to diverse objects and environments. This method achieved an 80.5% success rate on hardware across various complex tasks, significantly outperforming prior reinforcement learning approaches by effectively generalizing from a single human demonstration.
