Researchers from CASIA, Meituan, GigaAI, and other institutions developed FullVQ (FVQ), a scalable training method for vector-quantized networks that consistently achieves 100% codebook utilization by introducing a novel VQBridge projector. FVQ sets a new state of the art for discrete tokenizers with an rFID of 0.88 and enables autoregressive models to surpass advanced diffusion models in image generation quality without incurring inference overhead.
Bayesian Prompt Flow Learning (Bayes-PFL) models the text prompt space as a learnable probability distribution using normalizing flows to enhance Zero-Shot Anomaly Detection (ZSAD) with Vision-Language Models. This method achieves state-of-the-art performance across 15 industrial and medical datasets, demonstrating substantial gains, such as a 3.8% improvement in pixel-level AUROC on the ISIC medical dataset.
This work addresses flawed ground-truth generation and makes better use of geometric relationships in sparse-point 3D lane detection. It improves the F1-scores of state-of-the-art models on challenging datasets such as OpenLane and ApolloSim.