LiteVGGT introduces a geometry-aware cached token merging strategy that improves the efficiency of the Visual Geometry Grounded Transformer (VGGT) for multi-view 3D reconstruction. The approach speeds up inference by up to 10x and allows 1000-image scenes to be processed without out-of-memory errors, while largely preserving geometric and pose estimation accuracy.
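
To make the idea of cached token merging concrete, the following is a minimal, generic sketch in pure Python. It is not LiteVGGT's actual algorithm: the stride-based anchor selection, the cosine-similarity assignment, and the function names (`compute_merge_map`, `merge`, `unmerge`) are all assumptions chosen for illustration, and the geometry-aware component is not modeled. It only shows the general pattern of computing a merge map once, reusing it across several layers, and restoring the original token count afterwards.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def compute_merge_map(tokens, stride):
    """Hypothetical helper: keep every `stride`-th token as an anchor and
    assign each remaining token to its most similar anchor."""
    anchors = list(range(0, len(tokens), stride))
    anchor_group = {idx: g for g, idx in enumerate(anchors)}
    assignment = []
    for i, tok in enumerate(tokens):
        if i in anchor_group:
            assignment.append(anchor_group[i])
        else:
            best = max(range(len(anchors)),
                       key=lambda g: cosine(tok, tokens[anchors[g]]))
            assignment.append(best)
    return assignment, len(anchors)


def merge(tokens, assignment, num_groups):
    """Average all tokens that share a group into one merged token."""
    dim = len(tokens[0])
    sums = [[0.0] * dim for _ in range(num_groups)]
    counts = [0] * num_groups
    for tok, g in zip(tokens, assignment):
        counts[g] += 1
        for d in range(dim):
            sums[g][d] += tok[d]
    # Every group contains at least its anchor, so counts[g] >= 1.
    return [[s / counts[g] for s in sums[g]] for g in range(num_groups)]


def unmerge(merged, assignment):
    """Restore the original token count by copying each merged token back."""
    return [list(merged[g]) for g in assignment]


# Toy input: four 2-D tokens forming two similar pairs.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

# The merge map is computed once and cached ...
assignment, num_groups = compute_merge_map(tokens, stride=2)

# ... then reused for several (placeholder) layers without recomputation.
merged = merge(tokens, assignment, num_groups)
for _layer in range(3):
    merged = [list(tok) for tok in merged]  # stand-in for an attention layer

restored = unmerge(merged, assignment)
```

Because global self-attention scales quadratically with the number of tokens, attending over the merged set (here 2 tokens instead of 4) reduces that cost substantially, and reusing the cached map avoids paying for the similarity search at every layer.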