Extensional Experiments on Text-to-Speech Synthesis

All speech audio samples use HiFi-GAN as the vocoder.

Comparison with Other Models

This type was introduced into England by Wynkyn de Worde, Caxton's successor

Samples: GT, GT (Mel), Tacotron 2, Glow-TTS, FastSpeech 2, DiffSpeech

Most of Caxton's own types are of an earlier character, though they also much resemble Flemish or Cologne letter.

Samples: GT, GT (Mel), Tacotron 2, Glow-TTS, FastSpeech 2, DiffSpeech

the worst, which perhaps was the English, was a terrible falling off from the work of the earlier presses;

Samples: GT, GT (Mel), Tacotron 2, Glow-TTS, FastSpeech 2, DiffSpeech

Ablation Study

This type was introduced into England by Wynkyn de Worde, Caxton's successor

Samples: DiffSpeech, DiffSpeech Naive

Most of Caxton's own types are of an earlier character, though they also much resemble Flemish or Cologne letter.

Samples: DiffSpeech, DiffSpeech Naive

the worst, which perhaps was the English, was a terrible falling off from the work of the earlier presses;

Samples: DiffSpeech, DiffSpeech Naive