Researchers from UESTC, Megvii Technology, and Dzine AI/SeeKoo introduced a two-stage adapter framework to enable ultra-high-resolution (4K+) text-guided image inpainting. Their Patch-Adapter model achieved an FID of 14.594 and an Aesthetic Score of 6.021 on the photo-concept-bucket dataset, significantly improving perceptual quality and semantic alignment over prior methods.
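A common building block for ultra-high-resolution processing is to tile the image into overlapping patches, process each patch, and blend the results back with feathered weights so seams don't show. The sketch below is a minimal, hypothetical illustration of that patch split/blend idea in numpy; the function names, tiling scheme, and feathering window are assumptions, not the paper's actual Patch-Adapter pipeline.

```python
import numpy as np

def split_into_patches(img, patch, overlap):
    """Split a 2-D image into overlapping square patches.
    Hypothetical helper; assumes image sides are >= patch."""
    stride = patch - overlap
    h, w = img.shape
    ys = list(range(0, h - patch + 1, stride))
    xs = list(range(0, w - patch + 1, stride))
    if ys[-1] != h - patch:          # make sure the bottom edge is covered
        ys.append(h - patch)
    if xs[-1] != w - patch:          # make sure the right edge is covered
        xs.append(w - patch)
    return [(y, x, img[y:y + patch, x:x + patch]) for y in ys for x in xs]

def blend_patches(patches, shape, patch):
    """Recombine (possibly processed) patches, averaging overlaps with a
    linear-ramp feathering window so transitions are smooth."""
    out = np.zeros(shape)
    weight = np.zeros(shape)
    ramp = np.minimum(np.arange(1, patch + 1),
                      np.arange(patch, 0, -1)).astype(float)
    win = np.outer(ramp, ramp)       # 2-D feathering weights
    for y, x, p in patches:
        out[y:y + patch, x:x + patch] += p * win
        weight[y:y + patch, x:x + patch] += win
    return out / weight              # every pixel has weight > 0
```

With an identity per-patch "model", split followed by blend reconstructs the input exactly, which is a handy sanity check for any tiling scheme.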
The paper introduces PerVFI, a perception-oriented video frame interpolation (VFI) method, which employs asymmetric synergistic blending and a conditional normalizing flow-based generator. It significantly reduces blur and ghosting artifacts, achieving superior perceptual quality validated by quantitative metrics like LPIPS and VFIPS, and confirmed by user studies.
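The core property of a normalizing-flow generator is that each layer is exactly invertible with a tractable log-determinant. A standard way to get this is an affine coupling layer: half the variables pass through unchanged, and the other half are scaled and shifted by a small network conditioned on the pass-through half (and, for a conditional flow, on extra context). The toy numpy sketch below illustrates that mechanism under those assumptions; it is not PerVFI's actual architecture, and the conditioner "net" here is just a single tanh layer.

```python
import numpy as np

def coupling_forward(x, cond, w, b):
    """One conditional affine coupling layer (illustrative only).
    x: 1-D array of even length; cond: context vector."""
    x1, x2 = np.split(x, 2)
    h = np.tanh(w @ np.concatenate([x1, cond]) + b)  # toy conditioner net
    log_s, t = np.split(h, 2)
    y2 = x2 * np.exp(log_s) + t                      # affine transform
    return np.concatenate([x1, y2]), log_s.sum()     # y, log|det Jacobian|

def coupling_inverse(y, cond, w, b):
    """Exact inverse: recompute (log_s, t) from the unchanged half y1,
    then undo the affine map on y2."""
    y1, y2 = np.split(y, 2)
    h = np.tanh(w @ np.concatenate([y1, cond]) + b)
    log_s, t = np.split(h, 2)
    x2 = (y2 - t) * np.exp(-log_s)
    return np.concatenate([y1, x2])
```

Because the transformed half depends only on the unchanged half and the condition, inversion is exact, which is what lets a flow-based generator map latents to frames and back without information loss.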