alphaXiv

YanTron Technology Co. Ltd

474

28 May 2025

computer-science computation-and-language machine-learning

Domain-Specific Pruning of Large Mixture-of-Experts Models with Few-shot Demonstrations

Renmin University of China; University of International Business and Economics; EBTech Co. Ltd; YanTron Technology Co. Ltd

Researchers introduced EASY-EP, a domain-specific pruning method for large Mixture-of-Experts (MoE) models, leveraging few-shot demonstrations to identify and retain critical experts. The approach yielded up to a 4.33x increase in inference throughput and substantial memory reduction on models like DeepSeek-R1, while preserving over 90% of the original model's performance across diverse benchmarks.
