Tacotron-pytorch

A PyTorch implementation of Tacotron: A Fully End-to-End Text-To-Speech Synthesis Model.

Requirements

  • Install Python 3
  • Install PyTorch == 0.2.0
  • Install the remaining requirements:
    pip install -r requirements.txt
    

Data

I used the LJSpeech dataset, which consists of pairs of text transcripts and wav files. The complete dataset (13,100 pairs) can be downloaded here. I referred to https://github.com/keithito/tacotron for the preprocessing code.
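LJSpeech ships a metadata.csv with pipe-separated fields (file ID, raw transcript, normalized transcript). A minimal sketch of pairing each transcript with its wav file follows; the file layout is standard LJSpeech, but the parsing itself is illustrative and not this repo's actual loader:

```python
import csv
import os

def load_metadata(data_path):
    """Parse LJSpeech's metadata.csv into (wav_path, transcript) pairs.

    Each line looks like: LJ001-0001|raw text|normalized text
    """
    pairs = []
    with open(os.path.join(data_path, "metadata.csv"), encoding="utf-8") as f:
        # QUOTE_NONE because transcripts may contain literal quote characters.
        for row in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
            file_id, _raw, normalized = row[0], row[1], row[2]
            wav_path = os.path.join(data_path, "wavs", file_id + ".wav")
            pairs.append((wav_path, normalized))
    return pairs
```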

File description

  • hyperparams.py includes all hyperparameters that are needed.
  • data.py loads the training data and preprocesses text into index sequences and wav files into spectrograms. The text-preprocessing code is in the text/ directory.
  • module.py contains all module definitions, including CBHG, highway networks, the prenet, and so on.
  • network.py contains the networks: the encoder, the decoder, and the post-processing network.
  • train.py is for training.
  • synthesis.py is for generating TTS samples.

Training the network

  • STEP 1. Download and extract the LJSpeech data to any directory you want.
  • STEP 2. Adjust the hyperparameters in hyperparams.py, especially 'data_path', which should point to the directory where you extracted the files, and the others if necessary.
  • STEP 3. Run train.py.
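For STEP 2, hyperparams.py holds module-level settings you edit before training. The sketch below shows the kind of values involved; these names and defaults are hypothetical (the LJSpeech sample rate of 22050 Hz is real), so check the actual file for the real ones:

```python
# Hypothetical sketch of the settings hyperparams.py might hold;
# the real file defines its own names and defaults.
class Hyperparams:
    data_path = "./LJSpeech-1.0"  # directory where you extracted the dataset
    sample_rate = 22050           # LJSpeech audio sample rate
    num_mels = 80                 # mel-spectrogram channels
    batch_size = 32
    lr = 0.001
```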

Generate TTS wav file

  • STEP 1. Run synthesis.py. Make sure the restore step (i.e. which saved checkpoint to load) is set correctly.
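Setting the restore step means pointing synthesis at a saved training checkpoint. A hedged sketch of picking the most recent checkpoint by step number follows; the file-naming pattern is an assumption, not this repo's actual scheme:

```python
import os
import re

def latest_checkpoint(checkpoint_dir):
    """Return the checkpoint path with the highest training step.

    Assumes files named like 'checkpoint_60000.pth' -- a hypothetical
    pattern; adjust the regex to match the actual saved file names.
    """
    best_step, best_path = -1, None
    for name in os.listdir(checkpoint_dir):
        m = re.fullmatch(r"checkpoint_(\d+)\.pth", name)
        if m and int(m.group(1)) > best_step:
            best_step = int(m.group(1))
            best_path = os.path.join(checkpoint_dir, name)
    return best_path
```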

Samples

  • You can check the generated samples in the 'samples/' directory. The model was trained for only 60K steps, so the quality is not good yet.

Comments

  • Any comments on the code are always welcome.