alphaXiv

TARS Robotics

04 Dec 2025

computer-science computer-vision-and-pattern-recognition efficient-transformers

LiteVGGT: Boosting Vanilla VGGT via Geometry-aware Cached Token Merging

Nanjing University of Posts and Telecommunications

Nanjing University

Zhejiang University

Horizon Robotics

Macau University of Science and Technology

China Mobile Zijin Innovation Institute

TARS Robotics

LiteVGGT introduces a geometry-aware cached token merging strategy to enhance the Visual Geometry Grounded Transformer (VGGT) for multi-view 3D reconstruction. The approach speeds up inference by up to 10x and can process 1000-image scenes without running out of memory, while largely preserving geometric and pose-estimation accuracy.
