# Tutorial For Nervous Beginners
## Installation
User-friendly installation. Recommended if you only want to synthesize speech.
```bash
$ pip install TTS
```
Developer-friendly installation. Use this if you plan to train or modify models.
```bash
$ git clone https://github.com/coqui-ai/TTS
$ cd TTS
$ pip install -e .
```
## Training a `tts` Model
Below is a breakdown of a simple script that trains a GlowTTS model on the LJSpeech dataset. See the comments in the script for more details.
### Pure Python Way
0. Download your dataset.
In this example, we download and use the LJSpeech dataset. Set the download directory based on your preferences.
```bash
$ python -c 'from TTS.utils.downloaders import download_ljspeech; download_ljspeech("../recipes/ljspeech/");'
```
1. Define `train.py`.
```{literalinclude} ../../recipes/ljspeech/glow_tts/train_glowtts.py
```
2. Run the script.
```bash
CUDA_VISIBLE_DEVICES=0 python train.py
```
- Continue a previous run.
```bash
CUDA_VISIBLE_DEVICES=0 python train.py --continue_path path/to/previous/run/folder/
```
- Fine-tune a model.
```bash
CUDA_VISIBLE_DEVICES=0 python train.py --restore_path path/to/model/checkpoint.pth
```
- Run multi-gpu training.
```bash
CUDA_VISIBLE_DEVICES=0,1,2 python -m trainer.distribute --script train.py
```
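The invocations above differ only in the launcher and one optional flag. As a rough sketch, a small helper can assemble them when you want to start training from another Python script. `build_train_command` is a hypothetical name, not part of the TTS package; only the flags and the `trainer.distribute` module are taken from the commands above, and the sketch assumes the optional flags are appended the same way for single- and multi-GPU runs.

```python
import os
import sys


def build_train_command(script="train.py", gpus=(0,), continue_path=None, restore_path=None):
    """Return (argv, env) for one of the training invocations listed above.

    Hypothetical helper for illustration; it is not part of the TTS package.
    """
    # Restriction of this sketch only: pick one way of resuming per run.
    if continue_path and restore_path:
        raise ValueError("use either continue_path or restore_path, not both")

    # Expose only the requested GPUs to the training process.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(str(g) for g in gpus))

    if len(gpus) > 1:
        # Multi-GPU runs go through the distributed launcher.
        argv = [sys.executable, "-m", "trainer.distribute", "--script", script]
    else:
        argv = [sys.executable, script]

    # Assumption: the flags are passed identically in both modes.
    if continue_path:
        argv += ["--continue_path", continue_path]
    if restore_path:
        argv += ["--restore_path", restore_path]
    return argv, env


if __name__ == "__main__":
    argv, env = build_train_command(gpus=(0, 1, 2))
    print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
    # To actually launch the run: subprocess.run(argv, env=env, check=True)
```

The helper only builds the command; nothing is executed until you pass `argv` and `env` to `subprocess.run`.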
### CLI Way
We still support running training from the CLI, as in earlier versions. The same training run can also be started as follows.
1. Define your `config.json`.
```json
{
"run_name": "my_run",
"model": "glow_tts",
"batch_size": 32,
"eval_batch_size": 16,
"num_loader_workers": 4,
"num_eval_loader_workers": 4,
"run_eval": true,
"test_delay_epochs": -1,
"epochs": 1000,
"text_cleaner": "english_cleaners",
"use_phonemes": false,
"phoneme_language": "en-us",
"phoneme_cache_path": "phoneme_cache",
"print_step": 25,
"print_eval": true,
"mixed_precision": false,
"output_path": "recipes/ljspeech/glow_tts/",
"datasets":[{"name": "ljspeech", "meta_file_train":"metadata.csv", "path": "recipes/ljspeech/LJSpeech-1.1/"}]
}
```
2. Start training.
```bash
$ CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tts.py --config_path config.json
```
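If you would rather generate `config.json` from Python than edit it by hand, the sketch below writes the same fields with the standard `json` module and, before you start training, checks that each dataset entry points at an existing directory and metadata file. `write_config` and `missing_dataset_files` are hypothetical helpers, not TTS APIs, and the check assumes that `meta_file_train` is resolved relative to the dataset `path`.

```python
import json
from pathlib import Path

# Same fields as the config.json shown above.
CONFIG = {
    "run_name": "my_run",
    "model": "glow_tts",
    "batch_size": 32,
    "eval_batch_size": 16,
    "num_loader_workers": 4,
    "num_eval_loader_workers": 4,
    "run_eval": True,
    "test_delay_epochs": -1,
    "epochs": 1000,
    "text_cleaner": "english_cleaners",
    "use_phonemes": False,
    "phoneme_language": "en-us",
    "phoneme_cache_path": "phoneme_cache",
    "print_step": 25,
    "print_eval": True,
    "mixed_precision": False,
    "output_path": "recipes/ljspeech/glow_tts/",
    "datasets": [
        {
            "name": "ljspeech",
            "meta_file_train": "metadata.csv",
            "path": "recipes/ljspeech/LJSpeech-1.1/",
        }
    ],
}


def write_config(config, path="config.json"):
    """Serialize the config dict to disk and return the path written."""
    path = Path(path)
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return path


def missing_dataset_files(config, base_dir="."):
    """List dataset directories or metadata files that do not exist.

    Assumption: meta_file_train lives inside the dataset directory.
    """
    missing = []
    for dataset in config["datasets"]:
        root = Path(base_dir) / dataset["path"]
        meta = root / dataset["meta_file_train"]
        if not root.is_dir():
            missing.append(str(root))
        elif not meta.is_file():
            missing.append(str(meta))
    return missing


if __name__ == "__main__":
    print("wrote", write_config(CONFIG))
    for item in missing_dataset_files(CONFIG):
        print("missing:", item)
```

Running the check first catches a wrong dataset path before the training process spends time starting up.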
## Training a `vocoder` Model
```{literalinclude} ../../recipes/ljspeech/hifigan/train_hifigan.py
```
❗️ Note that you can also train vocoder models from the CLI with `train_vocoder.py`, in the same way as the `tts` models above.
## Synthesizing Speech
You can run `tts` and synthesize speech directly on the terminal.
```bash
$ tts -h # see the help
$ tts --list_models # list the available models.
```
![cli.gif](https://github.com/coqui-ai/TTS/raw/main/images/tts_cli.gif)
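To call the same command-line tool from a script, one option is to shell out with `subprocess`. This is a minimal sketch under stated assumptions: `tts_argv` and `synthesize` are hypothetical wrappers, the `tts` executable is assumed to be on your `PATH`, and the `--text`, `--out_path` and `--model_name` options should be confirmed against `tts -h` for your installed version.

```python
import shutil
import subprocess


def tts_argv(text, out_path="speech.wav", model_name=None):
    """Build the argument list for one `tts` CLI call (hypothetical helper)."""
    argv = ["tts", "--text", text, "--out_path", out_path]
    if model_name:
        # Use a name reported by `tts --list_models`.
        argv += ["--model_name", model_name]
    return argv


def synthesize(text, out_path="speech.wav", model_name=None):
    """Run the `tts` CLI and return the path of the audio file it was asked to write."""
    if shutil.which("tts") is None:
        raise RuntimeError("`tts` not found on PATH; install it with `pip install TTS`")
    # check=True raises CalledProcessError if the CLI exits with an error.
    subprocess.run(tts_argv(text, out_path, model_name), check=True)
    return out_path


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(tts_argv("Hello world.")))
```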
You can call `tts-server` to start a local demo server, which you can open in
your favorite web browser and 🗣️.
```bash
$ tts-server -h # see the help
$ tts-server --list_models # list the available models.
```
![server.gif](https://github.com/coqui-ai/TTS/raw/main/images/demo_server.gif)