alphaXiv

Jiutian Artificial Intelligence Research Institute

23 Sep 2025

audio-and-speech-processing electrical-engineering

Group Relative Policy Optimization for Text-to-Speech with Large Language Models

University of Science and Technology of China, China Mobile, iFlytek Research, National Engineering Research Center of Speech and Language Information Processing, Jiutian Artificial Intelligence Research Institute

This research applies Group Relative Policy Optimization (GRPO) to fine-tune large language model (LLM)-based Text-to-Speech (TTS) models, using a novel composite reward derived from an off-the-shelf automatic speech recognition (ASR) model. The approach improves speech intelligibility and naturalness, with consistent gains in character/word error rates and Mean Opinion Scores across multiple languages and diverse LLM-TTS architectures.
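
As a rough illustration of the "group relative" part of GRPO, the sketch below normalizes the rewards of several speech samples generated for the same text against their own group statistics. It is a minimal sketch, not the authors' implementation: the function name, the reward values, the use of the population standard deviation, and the `eps` constant are all assumptions made for illustration, and the composite ASR-derived reward itself is treated as an already computed number because its exact form is not given here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar reward per sample generated for the same
    input text. `eps` (an assumed constant) avoids division by zero when
    every sample in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four speech samples from one text prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

Samples that score above the group mean receive a positive advantage and are reinforced, while those below it receive a negative one, so no separate value model is needed to provide a baseline.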
