DiffSinger

Extended Experiments on Text-to-Speech Synthesis

For all speech audio samples, we use HiFi-GAN as the vocoder.

Comparison with Other Models

This type was introduced into England by Wynkyn de Worde, Caxton's successor

GT	GT (Mel)	Tacotron 2

Glow-TTS	FastSpeech 2	DiffSpeech

Most of Caxton's own types are of an earlier character, though they also much resemble Flemish or Cologne letter.

GT	GT (Mel)	Tacotron 2

Glow-TTS	FastSpeech 2	DiffSpeech

the worst, which perhaps was the English, was a terrible falling off from the work of the earlier presses;

GT	GT (Mel)	Tacotron 2

Glow-TTS	FastSpeech 2	DiffSpeech

Ablation Study

This type was introduced into England by Wynkyn de Worde, Caxton's successor

DiffSpeech	DiffSpeech Naive

Most of Caxton's own types are of an earlier character, though they also much resemble Flemish or Cologne letter.

DiffSpeech	DiffSpeech Naive

the worst, which perhaps was the English, was a terrible falling off from the work of the earlier presses;

DiffSpeech	DiffSpeech Naive