SpeechT5

The SpeechT5 framework consists of a shared sequence-to-sequence (encoder-decoder) network and six modality-specific (speech/text) pre-nets and post-nets. It can address several spoken language processing tasks, including text-to-speech and voice conversion.

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Paper • 2110.07205 • Published Oct 14, 2021 • 1

microsoft/speecht5_tts
Text-to-Speech • Updated 26 days ago • 40.3k • 251

SpeechT5 Speech Synthesis Demo
Running on T4 • 181

microsoft/speecht5_vc
Audio-to-Audio • Updated Mar 22 • 16.5k • 36
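
The description above says one shared seq2seq network is wrapped by six modality-specific pre/post-nets. The following is a minimal sketch of how a task's input and output modality might select which of those nets are active. It is an illustration only, not the library's API: the names `ALL_NETS`, `Route`, `route_for`, and `TASKS` are hypothetical, and the sketch assumes each task uses one encoder pre-net, one decoder pre-net, and one decoder post-net.

```python
from dataclasses import dataclass

MODALITIES = ("speech", "text")

# Hypothetical labels for the six modality-specific nets:
# one encoder pre-net, one decoder pre-net, and one decoder post-net per modality.
ALL_NETS = tuple(
    f"{modality}_{role}"
    for modality in MODALITIES
    for role in ("encoder_prenet", "decoder_prenet", "decoder_postnet")
)


@dataclass(frozen=True)
class Route:
    """The nets assumed to surround the shared encoder-decoder for one task."""
    encoder_prenet: str
    decoder_prenet: str
    decoder_postnet: str


def route_for(input_modality: str, output_modality: str) -> Route:
    """Pick the nets for a task from its input and output modality."""
    for modality in (input_modality, output_modality):
        if modality not in MODALITIES:
            raise ValueError(f"unknown modality: {modality!r}")
    return Route(
        encoder_prenet=f"{input_modality}_encoder_prenet",
        decoder_prenet=f"{output_modality}_decoder_prenet",
        decoder_postnet=f"{output_modality}_decoder_postnet",
    )


# The two task types listed in this collection (illustrative mapping).
TASKS = {
    "text_to_speech": ("text", "speech"),      # e.g. microsoft/speecht5_tts
    "voice_conversion": ("speech", "speech"),  # e.g. microsoft/speecht5_vc
}

if __name__ == "__main__":
    for task, (src, tgt) in TASKS.items():
        print(task, route_for(src, tgt))
```

In this sketch the shared network itself never changes; only the surrounding nets differ per task, which is the sense in which the backbone is shared across modalities.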