mirror of https://github.com/coqui-ai/TTS.git
README.md update
parent 664f42df33
commit c9e2df1451
<p align="center"><img src="https://user-images.githubusercontent.com/1402048/52643646-c2102980-2edd-11e9-8c37-b72f3c89a640.png" data-canonical-src="" width="320" height="95" /></p>

<center>
<img src="https://travis-ci.org/mozilla/TTS.svg?branch=dev"/>
[](https://discourse.mozilla.org/c/tts)
</center>

This project is a part of [Mozilla Common Voice](https://voice.mozilla.org/en). TTS aims to be a deep-learning-based Text2Speech engine, low in cost and high in quality.

You can also help us implement more models. Some TTS related work can be found [here](https://github.com/erogol/TTS-papers).

## Features
- High performance Deep Learning models for Text2Speech tasks.
- Text2Spec models (Tacotron, Tacotron2).
- Speaker Encoder to compute speaker embeddings efficiently.
- Vocoder models (MelGAN, Multiband-MelGAN, GAN-TTS, ParallelWaveGAN).
- Fast and efficient model training.
- Detailed training logs on console and Tensorboard.
- Support for multi-speaker TTS.
- Efficient multi-GPU training.
- Ability to convert PyTorch models to Tensorflow 2.0 and TFLite for inference.
- Released models in PyTorch, Tensorflow and TFLite.
- Tools to curate Text2Speech datasets under ```dataset_analysis```.
- Demo server for model testing.
- Notebooks for extensive model benchmarking.
- Modular (but not too much) code base enabling easy testing for new ideas.

## Main Requirements and Installation
It is highly recommended to use [miniconda](https://conda.io/miniconda.html) for easier installation.
* python>=3.6
* pytorch>=1.4.1
* tensorflow>=2.2
* librosa
* tensorboard
* tensorboardX
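As a quick sanity check, the version floors listed above can be verified from Python. This is a minimal sketch under stated assumptions: the helper `meets_minimum` is ours, not part of TTS, and the PyTorch/Tensorflow checks run only if those packages are installed.

```python
import sys

def meets_minimum(installed, minimum):
    """Numerically compare dotted version strings, e.g. '1.10.0' >= '1.4.1'."""
    def as_parts(version):
        # Keep only purely numeric components ('2.2.0rc1' -> [2, 2]).
        return [int(p) for p in version.split(".") if p.isdigit()]
    return as_parts(installed) >= as_parts(minimum)

# python>=3.6, from the list above
assert sys.version_info >= (3, 6), "TTS needs python>=3.6"

# pytorch>=1.4.1 and tensorflow>=2.2, checked only when importable
for module_name, minimum in [("torch", "1.4.1"), ("tensorflow", "2.2")]:
    try:
        module = __import__(module_name)
    except ImportError:
        print(module_name, "not installed")
        continue
    print(module_name, "ok" if meets_minimum(module.__version__, minimum) else "too old")
```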
<img src="images/example_model_output.png?raw=true" alt="example_output" width="400"/>

## [Mozilla TTS Tutorials and Notebooks](https://github.com/mozilla/TTS/wiki/TTS-Notebooks-and-Tutorials)

## Datasets and Data-Loading
TTS provides a generic dataloader that is easy to use for new datasets. You need to write a preprocessor function to integrate your own dataset. Check ```datasets/preprocess.py``` to see some examples. After writing the function, you need to set the ```dataset``` field in ```config.json```. Do not forget the other data-related fields too.
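As an illustration, a preprocessor is just a function that turns your metadata file into text/wav-path pairs. The sketch below is hypothetical, not one of the shipped preprocessors: the function name `my_dataset`, the `wavs/` directory layout, and the pipe-separated metadata format are all assumptions.

```python
import os

def my_dataset(root_path, meta_file):
    """Hypothetical preprocessor: expects one 'wav_id|transcription' pair per line."""
    items = []
    with open(os.path.join(root_path, meta_file), encoding="utf-8") as f:
        for line in f:
            # Split only on the first '|' so transcriptions may contain pipes.
            wav_id, text = line.strip().split("|", 1)
            wav_path = os.path.join(root_path, "wavs", wav_id + ".wav")
            items.append([text, wav_path])
    return items
```

With a function like this in place, the ```dataset``` field in ```config.json``` would name it (here ```"dataset": "my_dataset"```, matching the function name).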