cv
Basics
Name | Yuancheng Wang |
Label | Ph.D. Student at CUHK(SZ) |
Email | yuanchengwang@link.cuhk.edu.cn |
Phone | (+86) 189-5643-5965 |
Url | https://HeCheng0625.github.io/ |
Summary | A second-year Ph.D. student at CUHK(SZ), interested in text-to-speech synthesis, text-to-audio generation, and unified audio representation and generation. |
Internship
-
2024.05 - Present Shenzhen, China
-
2022.12 - 2023.06 Beijing, China
Research Intern
Microsoft Research Asia
Worked on audio generation and editing, and on large-scale text-to-speech synthesis.
- Audio Generation & Editing
- Speech Synthesis
Volunteer
-
2024.12 - 2024.12 Macau, China
Education
-
2023.09 - Present Shenzhen, China
-
2019.09 - 2023.06 Shenzhen, China
-
2016.09 - 2019.06 Hefei, Anhui, China
Awards
Publications
-
2025 MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
ICLR 2025
A fully non-autoregressive, large-scale zero-shot TTS model that eliminates the need for phone-level duration prediction.
-
2024 Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation
IEEE SLT 2024
We collect a 100,000-hour in-the-wild speech dataset for speech generation.
-
2024 Amphion: An Open-Source Audio, Music, and Speech Generation Toolkit
IEEE SLT 2024
We develop a unified toolkit for audio, music, and speech generation.
-
2024 SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words
NeurIPS 2024
We propose a benchmark dataset to evaluate spoken dialogue understanding and generation.
-
2024 NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
ICML 2024 Oral
A large-scale zero-shot TTS model that achieves quality on par with human recordings.
-
2023 AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models
NeurIPS 2023
The first audio editing model that can follow natural language instructions.
Skills
Computer Science & AI | |
Python | |
PyTorch | |
Deep Learning | |
Generative Models |
Languages
Chinese | |
Native speaker |
English | |
Interests
Deep Learning | |
Generative Models | |
Speech Synthesis | |
Speech Language Models | |
Reinforcement Learning |