With the advent of portable 360° cameras, panoramas have gained
significant attention in applications like virtual reality (VR), virtual tours,
robotics, and autonomous driving. As a result, wide-baseline panorama view
synthesis has emerged as a vital task, where high resolution, fast inference,
and memory efficiency are essential. Nevertheless, existing methods are
typically constrained to lower resolutions (512 × 1024) due to demanding
memory and computational requirements. In this paper, we present PanSplat, a
generalizable, feed-forward approach that efficiently supports resolutions up to
4K (2048 × 4096). Our approach features a tailored spherical 3D Gaussian
pyramid with a Fibonacci lattice arrangement, enhancing image quality while
reducing information redundancy. To accommodate the demands of high resolution,
we propose a pipeline that integrates a hierarchical spherical cost volume and
Gaussian heads with local operations, enabling two-step deferred
backpropagation for memory-efficient training on a single A100 GPU. Experiments
demonstrate that PanSplat achieves state-of-the-art results with superior
efficiency and image quality across both synthetic and real-world datasets.
Code is available at this https URL
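
As a rough illustration of why a Fibonacci lattice can reduce redundancy relative to a per-pixel equirectangular layout, the following minimal sketch generates the standard golden-angle Fibonacci lattice on the unit sphere and compares how many samples each layout spends near the poles. It is not the PanSplat implementation: the function names, the 60° latitude threshold, and the sample counts are assumptions chosen for illustration only.

```python
import math


def fibonacci_sphere(n):
    """Return n roughly uniform unit vectors (standard Fibonacci lattice).

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Evenly spaced z values give equal-area latitude bands.
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        theta = golden_angle * i
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points


def fibonacci_polar_fraction(n, lat_deg):
    """Fraction of lattice points whose |latitude| exceeds lat_deg."""
    z_min = math.sin(math.radians(lat_deg))
    return sum(1 for (_, _, z) in fibonacci_sphere(n) if abs(z) > z_min) / n


def equirect_polar_fraction(height, lat_deg):
    """Fraction of equirectangular rows whose |latitude| exceeds lat_deg.

    Every row holds the same number of pixels, so this is also the
    fraction of pixels spent on the polar caps.
    """
    count = 0
    for row in range(height):
        lat = 90.0 - (row + 0.5) * 180.0 / height
        if abs(lat) > lat_deg:
            count += 1
    return count / height


if __name__ == "__main__":
    # Assumed illustrative sizes; the true cap area fraction is 1 - sin(60 deg).
    print("sphere area above 60 deg:", 1.0 - math.sin(math.radians(60.0)))
    print("Fibonacci lattice       :", fibonacci_polar_fraction(10000, 60.0))
    print("equirectangular rows    :", equirect_polar_fraction(512, 60.0))
```

Under these assumptions, the lattice places samples in the polar caps roughly in proportion to their surface area, while the equirectangular grid spends about a third of its pixels there. This is the kind of oversampling that a lattice-based Gaussian arrangement could avoid.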