帖文详情
@_akhaliq@x.good.news
VALL-E 2
Neural Codec Language Models are Human Parity Zero-Shot Text to Speech Synthesizers
This paper introduces VALL-E 2, the latest advancement in neural codec language models that marks a milestone in zero-shot text-to-speech synthesis (TTS), achieving human parity