Verma et al. introduce a new computational definition and a comprehensive taxonomy for deception, offering a foundational framework for domain-independent detection. Their work provides empirical evidence through linguistic analysis and deep learning experiments that generalizable linguistic cues for deception exist and can be transferred across diverse domains, challenging previous skepticism in the field.
In drug discovery, molecular dynamics (MD) simulation for protein-ligand binding provides a powerful tool for predicting binding affinities, estimating transport properties, and exploring pocket sites. There has been a long history of improving the efficiency of MD simulations through better numerical methods and, more recently, by utilizing machine learning (ML) methods. Yet, challenges remain, such as accurate modeling of extended-timescale simulations. To address this issue, we propose NeuralMD, the first ML surrogate that can facilitate numerical MD and provide accurate simulations of protein-ligand binding dynamics. We propose a principled approach that incorporates a novel physics-informed multi-grained group-symmetric framework. Specifically, we propose (1) the BindingNet model, which satisfies group symmetry using vector frames and captures the multi-level protein-ligand interactions, and (2) an augmented neural differential equation solver that learns the trajectory under Newtonian mechanics. For the experiments, we design ten single-trajectory and three multi-trajectory binding simulation tasks. We demonstrate the efficiency and effectiveness of NeuralMD, achieving over a 1,000x speedup compared to standard numerical MD simulations. NeuralMD also outperforms all other ML approaches, achieving up to a 15x reduction in reconstruction error and a 70% increase in validity. Additionally, we qualitatively illustrate that the oscillations in the predicted trajectories align more closely with ground-truth dynamics than those of other ML methods. We believe NeuralMD lays the foundation for a new research paradigm in simulating protein-ligand dynamics.
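The core numerical idea, integrating Newton's second law with a learned force field in place of a physical force evaluation, can be sketched as follows. The tiny MLP here is a hypothetical stand-in for BindingNet, and the semi-implicit Euler integrator is an illustrative choice, not the paper's actual solver:

```python
import numpy as np

def mlp_force(x, params):
    # Hypothetical learned force field: a two-layer MLP mapping particle
    # positions to forces (a stand-in for BindingNet's output head).
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def rollout(x0, v0, params, mass=1.0, dt=1e-3, steps=100):
    # Integrate Newtonian mechanics a = F(x)/m as a second-order ODE,
    # which is the structure an "augmented" neural ODE solver learns.
    x, v = x0.copy(), v0.copy()
    traj = [x.copy()]
    for _ in range(steps):
        a = mlp_force(x, params) / mass
        v = v + dt * a        # update velocity from the learned force
        x = x + dt * v        # then advance positions
        traj.append(x.copy())
    return np.stack(traj)

rng = np.random.default_rng(0)
d = 3
params = (rng.normal(size=(d, 16)) * 0.1, np.zeros(16),
          rng.normal(size=(16, d)) * 0.1, np.zeros(d))
traj = rollout(np.zeros(d), np.ones(d), params)
print(traj.shape)  # (101, 3): initial state plus 100 integration steps
```

In the real system the force model would additionally respect the group symmetries (via vector frames) and multi-level protein-ligand structure described above.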
AutoCode introduces a framework for large language models to generate high-quality competitive programming problems and robust test cases, enhancing the rigor of evaluation benchmarks. The system achieves 91.1% consistency with official judgments in test case generation, reducing false positive and false negative rates, and produces novel problems with 3.2% reaching ICPC/IOI quality levels after human review.
OpenVision 2 introduces a family of generative pretrained visual encoders that simplify multimodal learning by exclusively using a caption-only generative objective, abandoning contrastive learning. This approach maintains or improves performance on multimodal benchmarks while reducing training time by up to 50% and memory consumption by 1.8x, enabling billion-parameter scale vision encoders.
R-KV introduces a training-free and model-agnostic method for compressing the Key-Value (KV) cache in reasoning Large Language Models by identifying and pruning redundant tokens during decoding. This approach achieved up to 90% memory savings and a 6.6x throughput increase, while remarkably sometimes surpassing uncompressed models in reasoning accuracy.
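The decoding-time pruning idea can be sketched as a greedy selection that keeps important, non-redundant cache entries. The importance scores, cosine-similarity test, and threshold below are illustrative assumptions, not R-KV's actual criteria:

```python
import numpy as np

def compress_kv(keys, values, importance, budget, sim_thresh=0.95):
    # Greedily keep high-importance cache entries, skipping any token whose
    # key is a near-duplicate (cosine similarity >= sim_thresh) of one kept.
    unit = keys / (np.linalg.norm(keys, axis=1, keepdims=True) + 1e-8)
    kept = []
    for i in np.argsort(-importance):
        if all(unit[i] @ unit[j] < sim_thresh for j in kept):
            kept.append(i)
        if len(kept) == budget:
            break
    kept = [int(i) for i in sorted(kept)]  # restore positional order
    return keys[kept], values[kept], kept

keys = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # token 1 duplicates token 0
values = np.array([[10.0], [20.0], [30.0]])
importance = np.array([3.0, 2.0, 1.0])
_, _, kept = compress_kv(keys, values, importance, budget=2)
print(kept)  # [0, 2]: the redundant duplicate is pruned
```

Because the selection needs no gradients or extra parameters, a scheme of this shape stays training-free and model-agnostic, as the summary describes.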
LeVERB is a framework designed for humanoid robots to execute agile whole-body actions through latent vision-language instructions, enabling zero-shot sim-to-real transfer. The framework achieved an average 58.5% success rate across diverse tasks in simulation, a 7.8x improvement over a naive hierarchical VLA, and successfully generalized to unseen commands on a Unitree G1 robot.
A new method called CAST enhances vision-language-action (VLA) models' instruction-following capabilities by generating counterfactual language and action labels from existing robot trajectories. This approach helps overcome the posterior collapse problem, improving success rates by 27% over standard hindsight-labeled VLAs and 19% over prior methods in diverse real-world navigation tasks.
VLMnav enables zero-shot robot navigation by reframing spatial reasoning as a question-answering task for off-the-shelf Vision-Language Models. The approach leverages carefully designed prompting and visual annotations, achieving a 50.7% success rate on the ObjectNav benchmark and outperforming existing prompting methods on ObjectNav and GOAT benchmarks.
I developed these lecture notes based on my "Causal Inference" course at the University of California, Berkeley over the past seven years. Since half of the students were undergraduates, the notes require only basic knowledge of probability theory, statistical inference, and linear and logistic regression.
AI-driven design problems, such as DNA/protein sequence design, are commonly tackled from two angles: generative modeling, which efficiently captures the feasible design space (e.g., natural images or biological sequences), and model-based optimization, which utilizes reward models for extrapolation. To combine the strengths of both approaches, we adopt a hybrid method that fine-tunes cutting-edge diffusion models by optimizing reward models through RL. Although prior work has explored similar avenues, they primarily focus on scenarios where accurate reward models are accessible. In contrast, we concentrate on an offline setting where a reward model is unknown, and we must learn from static offline datasets, a common scenario in scientific domains. In offline scenarios, existing approaches tend to suffer from overoptimization, as they may be misled by the reward model in out-of-distribution regions. To address this, we introduce a conservative fine-tuning approach, BRAID, by optimizing a conservative reward model, which includes additional penalization outside of offline data distributions. Through empirical and theoretical analysis, we demonstrate the capability of our approach to outperform the best designs in offline data, leveraging the extrapolation capabilities of reward models while avoiding the generation of invalid designs through pre-trained diffusion models.
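The conservative objective can be sketched as a learned reward minus a penalty that grows away from the offline data. The nearest-neighbor distance used as the out-of-distribution proxy here is a hypothetical simplification of BRAID's actual penalization:

```python
import numpy as np

def conservative_reward(x, reward_model, offline_data, alpha=1.0):
    # Learned reward minus an out-of-distribution penalty; the penalty here
    # is simply the distance to the nearest offline design (an illustrative
    # proxy for penalization outside the offline data distribution).
    ood_penalty = np.linalg.norm(offline_data - x, axis=1).min()
    return reward_model(x) - alpha * ood_penalty

reward_model = lambda x: float(x.sum())          # hypothetical learned reward
offline_data = np.array([[0.0, 0.0], [1.0, 1.0]])

print(conservative_reward(np.array([1.0, 1.0]), reward_model, offline_data))  # 2.0 (in-distribution: no penalty)
print(conservative_reward(np.array([5.0, 5.0]), reward_model, offline_data))  # 10 minus a large OOD penalty
```

Fine-tuning a diffusion model against such a penalized reward discourages the overoptimization failure described above: designs far from the data cannot score well no matter how the reward model extrapolates there.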
QT-Opt, a deep reinforcement learning framework by Google Brain and X, enables scalable, vision-based robotic grasping, achieving a 96% success rate on previously unseen objects through closed-loop control and an efficient, distributed data collection and training system leveraging over 580,000 real-world grasp attempts.
Researchers at the University of California, Berkeley, and Toyota Motor North America developed the Model-Based ReAnnotation (MBRA) framework, which transforms noisy crowd-sourced or action-free internet video data into high-quality training signals for robot navigation. This approach enabled LogoNav, a long-horizon navigation policy, to achieve state-of-the-art performance with a 0.857 goal success rate and 0.924 coverage, demonstrating zero-shot generalization across diverse global environments and different robot platforms.
Researchers characterized how Physics-Informed Neural Networks (PINNs) often fail to learn solutions for moderately complex partial differential equations due to optimization difficulties. They introduced curriculum regularization and a sequence-to-sequence learning approach, which reduced prediction errors by up to two orders of magnitude in challenging cases.
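The warm-starting mechanics of curriculum training can be sketched on a deliberately simple convex toy: fitting targets of increasing frequency while reusing the previous stage's weights. This illustrates only the easy-to-hard schedule, not the PINN residual loss itself:

```python
import numpy as np

def fit_stage(target_k, w, x, lr=0.05, steps=500):
    # One curriculum stage: least-squares fit of y = sin(k*x) with a linear
    # model over fixed sinusoidal features, started from the incoming weights
    # rather than from scratch.
    F = np.stack([np.sin(j * x) for j in range(1, 9)], axis=1)
    y = np.sin(target_k * x)
    for _ in range(steps):
        w = w - lr * F.T @ (F @ w - y) / len(x)
    return w, float(np.mean((F @ w - y) ** 2))

x = np.linspace(0, 2 * np.pi, 200)
w = np.zeros(8)
for k in (1, 3, 5):          # easy-to-hard schedule over the target frequency
    w, loss = fit_stage(k, w, x)
print(loss)                  # small: each stage warm-starts the next
```

In the PINN setting the analogue of `k` is a PDE coefficient (e.g. convection or reaction strength), and the warm start is what lets optimization escape the failure modes the paper characterizes.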
A deep reinforcement learning framework, Demonstration Augmented Policy Gradient (DAPG), enables robotic hands with many degrees of freedom to learn intricate manipulation skills by integrating a small number of human demonstrations. This approach achieves a 30-fold increase in learning efficiency for tasks like object relocation while yielding robust, human-like behaviors.
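The demonstration augmentation can be sketched as an update that mixes the policy gradient with a decaying behavior-cloning gradient computed on the human demonstrations; the schedule constants below are illustrative, not DAPG's exact values:

```python
import numpy as np

def dapg_update(pg_grad, bc_grad, step, lam0=0.1, decay=0.99):
    # Combine the RL policy gradient with a behavior-cloning gradient on
    # demonstrations; the cloning weight decays over training so that demos
    # steer early learning and reinforcement learning takes over later.
    lam = lam0 * decay ** step
    return pg_grad + lam * bc_grad

g_early = dapg_update(np.zeros(2), np.ones(2), step=0)    # demos still steer updates
g_late = dapg_update(np.ones(2), np.ones(2), step=1000)   # essentially pure RL
print(g_early, g_late)
```

This decaying mix is what lets a handful of demonstrations bootstrap exploration without permanently constraining the final policy to imitate them.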
A multi-faceted safety assessment of large reasoning models (LRMs) like DeepSeek-R1 reveals a substantial safety gap compared to proprietary models and demonstrates that reasoning distillation can inadvertently degrade safety alignment. The study uniquely found that the internal 'thinking process' of LRMs often contains hidden safety risks not present in their final answers, and when unsafe, their outputs are more harmful due to enhanced reasoning capabilities.
Researchers at UC Berkeley developed Goal-Conditioned Supervised Learning (GCSL), an algorithm that learns goal-reaching policies by iteratively training a policy via supervised learning on self-generated, hindsight-relabelled experience. GCSL exhibits improved robustness to hyperparameter settings and performs comparably to or better than deep reinforcement learning baselines like TD3-HER and PPO across simulated robotic manipulation and navigation tasks.
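The relabelling step at the heart of GCSL can be sketched directly: every later state in a trajectory becomes a goal that the earlier action is treated as progress toward, yielding (state, goal, action) tuples for supervised learning:

```python
def hindsight_relabel(states, actions):
    # states has length T+1 and actions length T: actions[t] moved
    # states[t] -> states[t+1]. Every state visited after step t is
    # relabelled as a goal that actions[t] "reached", turning arbitrary
    # self-generated experience into supervised (state, goal, action) data.
    dataset = []
    T = len(actions)
    for t in range(T):
        for g in range(t + 1, T + 1):
            dataset.append((states[t], states[g], actions[t]))
    return dataset

examples = hindsight_relabel(states=[0, 1, 2, 3], actions=["a0", "a1", "a2"])
print(len(examples))   # 6 = T*(T+1)/2 tuples for T = 3
print(examples[0])     # (0, 1, 'a0')
```

A goal-conditioned policy is then fit by ordinary supervised learning on these tuples, and GCSL simply iterates collection and fitting; no value function or reward shaping is involved.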
The paper introduces randomization-based sanity checks to rigorously evaluate saliency map methods, revealing that some popular techniques like Guided Backpropagation are invariant to model parameter and data label randomization, essentially functioning as input feature detectors rather than true explanations of a model's learned behavior. The work establishes a methodology for assessing explanation faithfulness, showing that methods such as Gradients and GradCAM generally pass these tests while others do not.
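The parameter-randomization check can be sketched with a linear model: a gradient-based map depends on the weights and therefore changes when they are re-randomized, while a model-ignoring map (a caricature of the reported failure mode) does not:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=16)        # a fixed input
w1 = rng.normal(size=16)       # "trained" model parameters
w2 = rng.normal(size=16)       # the same model after weight randomization

def grad_saliency(x, w):
    # Gradient explanation of a linear score f(x) = w @ x: df/dx = w,
    # so the map genuinely reflects the model's parameters.
    return np.abs(w)

def edge_saliency(x, w):
    # A parameter-free "explanation" that never consults the model,
    # acting as a pure input-feature (edge) detector.
    return np.abs(np.diff(x, append=x[-1]))

# The sanity check: a faithful method must produce a different map once
# the weights are re-randomized; an invariant one fails the test.
print(np.allclose(grad_saliency(x, w1), grad_saliency(x, w2)))  # False -> passes
print(np.allclose(edge_saliency(x, w1), edge_saliency(x, w2)))  # True  -> fails
```

The paper's fuller methodology compares maps with rank-correlation metrics across layer-by-layer randomization, but the pass/fail logic is the same as this toy.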
This paper presents a unified framework explaining Large Language Model (LLM) hallucinations as a result of competing statistical associations between input subsequences and potential outputs. It demonstrates how these associations are encoded within transformer architectures and introduces a tracing algorithm that identifies specific input subsequences responsible for triggering hallucinations, showing a strong correlation between these identified associations and patterns in the model's training data.
Researchers at UC Berkeley, Simon Fraser University, and Mila developed a deep reinforcement learning framework for bipedal locomotion, enabling the Cassie robot to perform diverse and dynamic skills like sustained running and various jumps with zero-shot sim-to-real transfer. The approach leverages a novel dual-history policy architecture and a multi-stage training curriculum, resulting in robust performance and adaptability to real-world conditions over extended periods.