MoE Key Laboratory of Intelligent Perception and Human Machine Collaboration

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

27 May 2025

ke hu

University College London

Shanghai Jiao Tong University

GenPO introduces a framework that integrates generative diffusion policies into on-policy reinforcement learning, enabling tractable likelihood computations. The method consistently achieves higher returns and enhanced stability across diverse IsaacLab robotic control tasks compared to existing model-free and off-policy diffusion RL baselines.

#computer-science #machine-learning #deep-reinforcement-learning

Diffusion Bridge or Flow Matching? A Unifying Framework and Comparative Analysis

29 Sep 2025

ShanghaiTech University

MoE Key Laboratory of Intelligent Perception and Human Machine Collaboration

Diffusion Bridge and Flow Matching have both demonstrated compelling empirical performance in transformation between arbitrary distributions. However, there remains confusion about which approach is generally preferable, and the substantial discrepancies in their modeling assumptions and practical implementations have hindered a unified theoretical account of their relative merits. We have, for the first time, provided a unified theoretical and experimental validation of these two models. We recast their frameworks through the lens of Stochastic Optimal Control and prove that the cost function of the Diffusion Bridge is lower, guiding the system toward more stable and natural trajectories. Simultaneously, from the perspective of Optimal Transport, interpolation coefficients $t$ and $1-t$ of Flow Matching become increasingly ineffective when the training data size is reduced. To corroborate these theoretical claims, we propose a novel, powerful architecture for Diffusion Bridge built on a latent Transformer, and implement a Flow Matching model with the same structure to enable a fair performance comparison in various experiments. Comprehensive experiments are conducted across Image Inpainting, Super-Resolution, Deblurring, Denoising, Translation, and Style Transfer tasks, systematically varying both the distributional discrepancy (different difficulty) and the training data size. Extensive empirical results align perfectly with our theoretical predictions and allow us to delineate the respective advantages and disadvantages of these two models. Our code is available at this https URL.
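For readers unfamiliar with the interpolation coefficients $t$ and $1-t$ mentioned in the abstract, the following is a minimal sketch of the linear interpolation commonly used in Flow Matching. It is not taken from the paper's code. The function name `interpolate`, the convention that $t=0$ is the source sample and $t=1$ the target sample, and the constant-velocity target are all assumptions made for illustration.

```python
from typing import List, Tuple


def interpolate(x0: List[float], x1: List[float], t: float) -> Tuple[List[float], List[float]]:
    """Hypothetical helper: linear Flow Matching interpolation.

    Assumes the convention x_t = (1 - t) * x0 + t * x1, where x0 is a
    source sample and x1 is a target sample. The paper may use a
    different convention.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    if len(x0) != len(x1):
        raise ValueError("x0 and x1 must have the same length")
    # Point on the straight line between the two samples.
    x_t = [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]
    # Regression target for the velocity field along that line.
    velocity = [b - a for a, b in zip(x0, x1)]
    return x_t, velocity


source = [0.0, 2.0]  # toy source sample (illustrative values)
target = [4.0, 6.0]  # toy target sample (illustrative values)
x_mid, v = interpolate(source, target, 0.25)
print(x_mid, v)
```

Under this convention the coefficients weight the two endpoints directly, so the path depends only on how source and target samples are paired.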

#computer-science #computer-vision-and-pattern-recognition #generative-models

One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion

03 Jun 2025

ShanghaiTech University

MoE Key Laboratory of Intelligent Perception and Human Machine Collaboration

Researchers at ShanghaiTech University developed DreamPolicy, a unified framework for versatile humanoid locomotion that enables a single policy to generalize zero-shot across diverse and unseen terrains. It achieves this by integrating offline data, a diffusion-based trajectory planner that generates 'Humanoid Motion Imagery', and a physics-constrained reinforcement learning policy, outperforming existing methods by an average of 20% in unseen environments.

#computer-science #machine-learning #robotics
