alphaXiv

Huawei Consumer Business Group

08 May 2025

computer-science artificial-intelligence machine-learning

Low-bit Model Quantization for Deep Neural Networks: A Survey

Huawei Noah’s Ark Lab

Beihang University

Shanghai Jiao Tong University Huawei Consumer Business Group ETH Zürich

This survey reviews recent advancements in low-bit model quantization techniques for deep neural networks, categorizing methods to enable efficient deployment on resource-constrained devices while aiming to preserve model accuracy. It details improvements in quantization parameter optimization, training mechanisms, mixed precision, data handling, and specialized formats for complex models like diffusion networks.
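For readers unfamiliar with the term, the sketch below illustrates the most basic form of low-bit quantization: uniform affine quantization of a list of floats with a scale and a zero point. It is a generic textbook-style illustration, not a method taken from the survey; the function names `quantize_uniform` and `dequantize_uniform` and the default of 4 bits are assumptions chosen for this example.

```python
def quantize_uniform(values, num_bits=4):
    """Map floats to unsigned integers in [0, 2**num_bits - 1].

    Illustrative sketch only: min/max calibration over the input list,
    with the range widened to include zero so that 0.0 is representable.
    """
    qmax = 2 ** num_bits - 1
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Guard against an all-zero input, which would give a zero scale.
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    # Round to the nearest integer level and clamp into the valid range.
    q = [max(0, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_uniform(q, scale, zero_point):
    """Recover approximate floats from the integer codes."""
    return [(qi - zero_point) * scale for qi in q]


weights = [-0.8, -0.1, 0.0, 0.3, 1.2]
codes, scale, zero_point = quantize_uniform(weights, num_bits=4)
restored = dequantize_uniform(codes, scale, zero_point)
```

Here `scale` and `zero_point` are the quantization parameters. Each restored value should differ from the original by no more than about one quantization step, and the step grows as `num_bits` shrinks, which is the accuracy cost that low-bit methods try to limit.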

06 Nov 2025

computer-science computer-vision-and-pattern-recognition fine-tuning

DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Shanghai Jiao Tong University

Westlake University Huawei Consumer Business Group

DOVE is an efficient one-step diffusion model designed for real-world video super-resolution, which fine-tunes a pretrained video generation model using a novel latent-pixel training strategy. It achieves state-of-the-art restoration quality while delivering up to a 28x speed-up in inference time compared to previous diffusion-based methods.

07 Mar 2020

computer-science computer-vision-and-pattern-recognition machine-learning

Distilling portable Generative Adversarial Networks for Image Translation

Huawei Noah’s Ark Lab Peng Cheng Laboratory

Peking University The University of Sydney Huawei Consumer Business Group

Although Generative Adversarial Networks (GANs) have been widely used in various image-to-image translation tasks, they can hardly be applied on mobile devices due to their heavy computation and storage costs. Traditional network compression methods focus on visual recognition tasks but do not address generation tasks. Inspired by knowledge distillation, a student generator with fewer parameters is trained by inheriting the low-level and high-level information from the original heavy teacher generator. To strengthen the student generator, we include a student discriminator to measure the distances between real images and images generated by the student and teacher generators. An adversarial learning process is therefore established to optimize the student generator and the student discriminator. Qualitative and quantitative analyses on benchmark datasets demonstrate that the proposed method can learn portable generative models with strong performance.

07 Apr 2025

computer-science computation-and-language machine-translation

DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation

Soochow University Huawei Translation Services Center Huawei Consumer Business Group

Document-level context is crucial for handling discourse challenges in text-to-text document-level machine translation (MT). Despite the increased discourse challenges introduced by noise from automatic speech recognition (ASR), the integration of document-level context in speech translation (ST) remains insufficiently explored. In this paper, we develop DoCIA, an online framework that enhances ST performance by incorporating document-level context. DoCIA decomposes the ST pipeline into four stages. Document-level context is integrated into the ASR refinement, MT, and MT refinement stages through auxiliary modules based on large language models (LLMs). Furthermore, DoCIA leverages document-level information in a multi-level manner while minimizing computational overhead. Additionally, a simple yet effective determination mechanism is introduced to prevent hallucinations from excessive refinement, ensuring the reliability of the final results. Experimental results show that DoCIA significantly outperforms traditional ST baselines in both sentence and discourse metrics across four LLMs, demonstrating its effectiveness in improving ST performance.