Verma et al. introduce a new computational definition and a comprehensive taxonomy for deception, offering a foundational framework for domain-independent detection. Their work provides empirical evidence through linguistic analysis and deep learning experiments that generalizable linguistic cues for deception exist and can be transferred across diverse domains, challenging previous skepticism in the field.
In drug discovery, molecular dynamics (MD) simulation for protein-ligand binding provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. There has been a long history of improving the efficiency of MD simulations through better numerical methods and, more recently, by utilizing machine learning (ML) methods. Yet, challenges remain, such as accurate modeling of extended-timescale simulations. To address this issue, we propose NeuralMD, the first ML surrogate that can facilitate numerical MD and provide accurate simulations of protein-ligand binding dynamics. We propose a principled approach that incorporates a novel physics-informed multi-grained group-symmetric framework. Specifically, we propose (1) the BindingNet model, which satisfies group symmetry using vector frames and captures the multi-level protein-ligand interactions, and (2) an augmented neural differential equation solver that learns the trajectory under Newtonian mechanics. For the experiments, we design ten single-trajectory and three multi-trajectory binding simulation tasks. We demonstrate the efficiency and effectiveness of NeuralMD, achieving over a 1,000x speedup compared to standard numerical MD simulations. NeuralMD also outperforms all other ML approaches, achieving up to a 15x reduction in reconstruction error and a 70% increase in validity. Additionally, we qualitatively illustrate that the oscillations in the predicted trajectories align more closely with ground-truth dynamics than those of other ML methods. We believe NeuralMD lays the foundation for a new research paradigm in simulating protein-ligand dynamics.
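The core numerical idea, integrating Newton's second law with a learned force field in place of a physical force evaluation, can be sketched as follows. The tiny MLP here is a hypothetical stand-in for BindingNet, and the semi-implicit Euler integrator is an illustrative choice, not the paper's actual solver:

```python
import numpy as np

def mlp_force(x, params):
    # Hypothetical learned force field: a two-layer MLP mapping particle
    # positions to forces (a stand-in for BindingNet's output head).
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def rollout(x0, v0, params, mass=1.0, dt=1e-3, steps=100):
    # Integrate Newtonian mechanics a = F(x)/m as a second-order ODE,
    # which is the structure an "augmented" neural ODE solver learns.
    x, v = x0.copy(), v0.copy()
    traj = [x.copy()]
    for _ in range(steps):
        a = mlp_force(x, params) / mass
        v = v + dt * a        # update velocity from the learned force
        x = x + dt * v        # then advance positions
        traj.append(x.copy())
    return np.stack(traj)

rng = np.random.default_rng(0)
d = 3
params = (rng.normal(size=(d, 16)) * 0.1, np.zeros(16),
          rng.normal(size=(16, d)) * 0.1, np.zeros(d))
traj = rollout(np.zeros(d), np.ones(d), params)
print(traj.shape)  # (101, 3): initial state plus 100 integration steps
```

In the real system the force model would additionally respect the group symmetries (via vector frames) and multi-level protein-ligand structure described above.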
AutoCode introduces a framework for large language models to generate high-quality competitive programming problems and robust test cases, enhancing the rigor of evaluation benchmarks. The system achieves 91.1% consistency with official judgments in test case generation, reducing false positive and false negative rates, and produces novel problems with 3.2% reaching ICPC/IOI quality levels after human review.
OpenVision 2 introduces a family of generative pretrained visual encoders that simplify multimodal learning by exclusively using a caption-only generative objective, abandoning contrastive learning. This approach maintains or improves performance on multimodal benchmarks while reducing training time by up to 50% and memory consumption by 1.8x, enabling billion-parameter scale vision encoders.
R-KV introduces a training-free and model-agnostic method for compressing the Key-Value (KV) cache in reasoning Large Language Models by identifying and pruning redundant tokens during decoding. This approach achieved up to 90% memory savings and a 6.6x throughput increase, while remarkably sometimes surpassing uncompressed models in reasoning accuracy.
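The decoding-time pruning idea can be sketched as a greedy selection that keeps important, non-redundant cache entries. The importance scores, cosine-similarity test, and threshold below are illustrative assumptions, not R-KV's actual criteria:

```python
import numpy as np

def compress_kv(keys, values, importance, budget, sim_thresh=0.95):
    # Greedily keep high-importance cache entries, skipping any token whose
    # key is a near-duplicate (cosine similarity >= sim_thresh) of one kept.
    unit = keys / (np.linalg.norm(keys, axis=1, keepdims=True) + 1e-8)
    kept = []
    for i in np.argsort(-importance):
        if all(unit[i] @ unit[j] < sim_thresh for j in kept):
            kept.append(i)
        if len(kept) == budget:
            break
    kept = [int(i) for i in sorted(kept)]  # restore positional order
    return keys[kept], values[kept], kept

keys = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # token 1 duplicates token 0
values = np.array([[10.0], [20.0], [30.0]])
importance = np.array([3.0, 2.0, 1.0])
_, _, kept = compress_kv(keys, values, importance, budget=2)
print(kept)  # [0, 2]: the redundant duplicate is pruned
```

Because the selection needs no gradients or extra parameters, a scheme of this shape stays training-free and model-agnostic, as the summary describes.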
LeVERB is a framework designed for humanoid robots to execute agile whole-body actions through latent vision-language instructions, enabling zero-shot sim-to-real transfer. The framework achieved an average 58.5% success rate across diverse tasks in simulation, a 7.8x improvement over a naive hierarchical VLA, and successfully generalized to unseen commands on a Unitree G1 robot.
A new method called CAST enhances vision-language-action (VLA) models' instruction-following capabilities by generating counterfactual language and action labels from existing robot trajectories. This approach helps overcome the posterior collapse problem, improving success rates by 27% over standard hindsight-labeled VLAs and 19% over prior methods in diverse real-world navigation tasks.
VLMnav enables zero-shot robot navigation by reframing spatial reasoning as a question-answering task for off-the-shelf Vision-Language Models. The approach leverages carefully designed prompting and visual annotations, achieving a 50.7% success rate on the ObjectNav benchmark and outperforming existing prompting methods on ObjectNav and GOAT benchmarks.
I developed these lecture notes based on my "Causal Inference" course at the University of California, Berkeley over the past seven years. Since half of the students were undergraduates, the notes require only basic knowledge of probability theory, statistical inference, and linear and logistic regression.
AI-driven design problems, such as DNA/protein sequence design, are commonly tackled from two angles: generative modeling, which efficiently captures the feasible design space (e.g., natural images or biological sequences), and model-based optimization, which utilizes reward models for extrapolation. To combine the strengths of both approaches, we adopt a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL. Although prior work has explored similar avenues, they primarily focus on scenarios where accurate reward models are accessible. In contrast, we concentrate on an offline setting where a reward model is unknown, and we must learn from static offline datasets, a common scenario in scientific domains. In offline scenarios, existing approaches tend to suffer from overoptimization, as they may be misled by the reward model in out-of-distribution regions. To address this, we introduce a conservative fine-tuning approach, BRAID, by optimizing a conservative reward model, which includes additional penalization outside of offline data distributions. Through empirical and theoretical analysis, we demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models while avoiding the generation of invalid designs through pre-trained diffusion models.
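The conservative objective can be sketched as a learned reward minus a penalty that grows away from the offline data. The nearest-neighbor distance used as the out-of-distribution proxy here is a hypothetical simplification of BRAID's actual penalization:

```python
import numpy as np

def conservative_reward(x, reward_model, offline_data, alpha=1.0):
    # Learned reward minus an out-of-distribution penalty; the penalty here
    # is simply the distance to the nearest offline design (an illustrative
    # proxy for penalization outside the offline data distribution).
    ood_penalty = np.linalg.norm(offline_data - x, axis=1).min()
    return reward_model(x) - alpha * ood_penalty

reward_model = lambda x: float(x.sum())          # hypothetical learned reward
offline_data = np.array([[0.0, 0.0], [1.0, 1.0]])

print(conservative_reward(np.array([1.0, 1.0]), reward_model, offline_data))  # 2.0 (in-distribution: no penalty)
print(conservative_reward(np.array([5.0, 5.0]), reward_model, offline_data))  # 10 minus a large OOD penalty
```

Fine-tuning a diffusion model against such a penalized reward discourages the overoptimization failure described above: designs far from the data cannot score well no matter how the reward model extrapolates there.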
QT-Opt, a deep reinforcement learning framework by Google Brain and X, enables scalable, vision-based robotic grasping, achieving a 96% success rate on previously unseen objects through closed-loop control and an efficient, distributed data collection and training system leveraging over 580,000 real-world grasp attempts.
Researchers at the University of California, Berkeley, and Toyota Motor North America developed the Model-Based ReAnnotation (MBRA) framework, which transforms noisy crowd-sourced or action-free internet video data into high-quality training signals for robot navigation. This approach enabled LogoNav, a long-horizon navigation policy, to achieve state-of-the-art performance with a 0.857 goal success rate and 0.924 coverage, demonstrating zero-shot generalization across diverse global environments and different robot platforms.
Researchers characterized how Physics-Informed Neural Networks (PINNs) often fail to learn solutions for moderately complex partial differential equations due to optimization difficulties. They introduced curriculum regularization and a sequence-to-sequence learning approach, which reduced prediction errors by up to two orders of magnitude in challenging cases.
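The warm-starting mechanics of curriculum training can be sketched on a deliberately simple convex toy: fitting targets of increasing frequency while reusing the previous stage's weights. This illustrates only the easy-to-hard schedule, not the PINN residual loss itself:

```python
import numpy as np

def fit_stage(target_k, w, x, lr=0.05, steps=500):
    # One curriculum stage: least-squares fit of y = sin(k*x) with a linear
    # model over fixed sinusoidal features, started from the incoming weights
    # rather than from scratch.
    F = np.stack([np.sin(j * x) for j in range(1, 9)], axis=1)
    y = np.sin(target_k * x)
    for _ in range(steps):
        w = w - lr * F.T @ (F @ w - y) / len(x)
    return w, float(np.mean((F @ w - y) ** 2))

x = np.linspace(0, 2 * np.pi, 200)
w = np.zeros(8)
for k in (1, 3, 5):          # easy-to-hard schedule over the target frequency
    w, loss = fit_stage(k, w, x)
print(loss)                  # small: each stage warm-starts the next
```

In the PINN setting the analogue of `k` is a PDE coefficient (e.g. convection or reaction strength), and the warm start is what lets optimization escape the failure modes the paper characterizes.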
A deep reinforcement learning framework, Demonstration Augmented Policy Gradient (DAPG), enables robotic hands with many degrees of freedom to learn intricate manipulation skills by integrating a small number of human demonstrations. This approach achieves a 30-fold increase in learning efficiency for tasks like object relocation while yielding robust, human-like behaviors.
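The demonstration augmentation can be sketched as an update that mixes the policy gradient with a decaying behavior-cloning gradient computed on the human demonstrations; the schedule constants below are illustrative, not DAPG's exact values:

```python
import numpy as np

def dapg_update(pg_grad, bc_grad, step, lam0=0.1, decay=0.99):
    # Combine the RL policy gradient with a behavior-cloning gradient on
    # demonstrations; the cloning weight decays over training so that demos
    # steer early learning and reinforcement learning takes over later.
    lam = lam0 * decay ** step
    return pg_grad + lam * bc_grad

g_early = dapg_update(np.zeros(2), np.ones(2), step=0)    # demos still steer updates
g_late = dapg_update(np.ones(2), np.ones(2), step=1000)   # essentially pure RL
print(g_early, g_late)
```

This decaying mix is what lets a handful of demonstrations bootstrap exploration without permanently constraining the final policy to imitate them.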
A multi-faceted safety assessment of large reasoning models (LRMs) like DeepSeek-R1 reveals a substantial safety gap compared to proprietary models and demonstrates that reasoning distillation can inadvertently degrade safety alignment. The study uniquely found that the internal 'thinking process' of LRMs often contains hidden safety risks not present in their final answers, and when unsafe, their outputs are more harmful due to enhanced reasoning capabilities.
Researchers at UC Berkeley developed Goal-Conditioned Supervised Learning (GCSL), an algorithm that learns goal-reaching policies by iteratively training a policy via supervised learning on self-generated, hindsight-relabelled experience. GCSL exhibits improved robustness to hyperparameter settings and performs comparably to or better than deep reinforcement learning baselines like TD3-HER and PPO across simulated robotic manipulation and navigation tasks.
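The relabelling step at the heart of GCSL can be sketched directly: every later state in a trajectory becomes a goal that the earlier action is treated as progress toward, yielding (state, goal, action) tuples for supervised learning:

```python
def hindsight_relabel(states, actions):
    # states has length T+1 and actions length T: actions[t] moved
    # states[t] -> states[t+1]. Every state visited after step t is
    # relabelled as a goal that actions[t] "reached", turning arbitrary
    # self-generated experience into supervised (state, goal, action) data.
    dataset = []
    T = len(actions)
    for t in range(T):
        for g in range(t + 1, T + 1):
            dataset.append((states[t], states[g], actions[t]))
    return dataset

examples = hindsight_relabel(states=[0, 1, 2, 3], actions=["a0", "a1", "a2"])
print(len(examples))   # 6 = T*(T+1)/2 tuples for T = 3
print(examples[0])     # (0, 1, 'a0')
```

A goal-conditioned policy is then fit by ordinary supervised learning on these tuples, and GCSL simply iterates collection and fitting; no value function or reward shaping is involved.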
The paper introduces randomization-based sanity checks to rigorously evaluate saliency map methods, revealing that some popular techniques like Guided Backpropagation are invariant to model parameter and data label randomization, essentially functioning as input feature detectors rather than true explanations of a model's learned behavior. The work establishes a methodology for assessing explanation faithfulness, showing that methods such as Gradients and GradCAM generally pass these tests while others do not.
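The parameter-randomization check can be sketched with a linear model: a gradient-based map depends on the weights and therefore changes when they are re-randomized, while a model-ignoring map (a caricature of the reported failure mode) does not:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=16)        # a fixed input
w1 = rng.normal(size=16)       # "trained" model parameters
w2 = rng.normal(size=16)       # the same model after weight randomization

def grad_saliency(x, w):
    # Gradient explanation of a linear score f(x) = w @ x: df/dx = w,
    # so the map genuinely reflects the model's parameters.
    return np.abs(w)

def edge_saliency(x, w):
    # A parameter-free "explanation" that never consults the model,
    # acting as a pure input-feature (edge) detector.
    return np.abs(np.diff(x, append=x[-1]))

# The sanity check: a faithful method must produce a different map once
# the weights are re-randomized; an invariant one fails the test.
print(np.allclose(grad_saliency(x, w1), grad_saliency(x, w2)))  # False -> passes
print(np.allclose(edge_saliency(x, w1), edge_saliency(x, w2)))  # True  -> fails
```

The paper's fuller methodology compares maps with rank-correlation metrics across layer-by-layer randomization, but the pass/fail logic is the same as this toy.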
This paper presents a unified framework explaining Large Language Model (LLM) hallucinations as a result of competing statistical associations between input subsequences and potential outputs. It demonstrates how these associations are encoded within transformer architectures and introduces a tracing algorithm that identifies specific input subsequences responsible for triggering hallucinations, showing a strong correlation between these identified associations and patterns in the model's training data.
Researchers at UC Berkeley, Simon Fraser University, and Mila developed a deep reinforcement learning framework for bipedal locomotion, enabling the Cassie robot to perform diverse and dynamic skills like sustained running and various jumps with zero-shot sim-to-real transfer. The approach leverages a novel dual-history policy architecture and a multi-stage training curriculum, resulting in robust performance and adaptability to real-world conditions over extended periods.