Eyeline Labs
The VChain framework injects a large multimodal model's reasoning capabilities into video generation through a three-stage, inference-time process. It employs "chain-of-visual-thought" keyframes to guide pre-trained diffusion models, producing videos with enhanced causal and physical coherence and more logically consistent, plausible dynamics.
The "Virtually Being" framework introduces a method for customizing camera-controllable video diffusion models using multi-view performance captures. It achieves superior multi-view identity preservation and precise 3D camera control, enabling high-fidelity character generation and interaction tailored for virtual production workflows.