From fe38c26b86efaf1b3a1e0e85afc27344993b436e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eren=20G=C3=B6lge?=
Date: Tue, 10 Sep 2019 13:32:37 +0300
Subject: [PATCH 1/6] Update README.md

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 39e507e1..50b62059 100644
--- a/README.md
+++ b/README.md
@@ -50,7 +50,7 @@ Below you see Tacotron model state after 16K iterations with batch-size 32 with
 
 Audio examples: [https://soundcloud.com/user-565970875](https://soundcloud.com/user-565970875)
 
-![example_model_output](images/example_model_output.png?raw=true)
+example_output
 
 ## Runtime
 The most time-consuming part is the vocoder algorithm (Griffin-Lim) which runs on CPU. By setting its number of iterations lower, you might have faster execution with a small loss of quality. Some of the experimental values are below.
@@ -176,4 +176,4 @@ Please feel free to offer new changes and pull things off. We are happy to discu
 
 ### References
 - https://github.com/keithito/tacotron (Dataset pre-processing)
-- https://github.com/r9y9/tacotron_pytorch (Initial Tacotron architecture)
\ No newline at end of file
+- https://github.com/r9y9/tacotron_pytorch (Initial Tacotron architecture)

From 92b7bd1c85475aba43916e9a1a5cc5febdf75a5b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eren=20G=C3=B6lge?=
Date: Fri, 20 Sep 2019 12:41:20 +0200
Subject: [PATCH 2/6] Spanish Dataset added

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 50b62059..0d2bb4ff 100644
--- a/README.md
+++ b/README.md
@@ -78,6 +78,7 @@ Some of the open-sourced datasets that we successfully applied TTS, are linked b
 - [TWEB](https://www.kaggle.com/bryanpark/the-world-english-bible-speech-dataset)
 - [M-AI-Labs](http://www.caito.de/2019/01/the-m-ailabs-speech-dataset/)
 - [LibriTTS](https://openslr.org/60/)
+- [Spanish](https://drive.google.com/file/d/1Sm_zyBo67XHkiFhcRSQ4YaHPYM0slO_e/view?usp=sharing) - thx! @carlfm01
 
 ## Training and Fine-tuning LJ-Speech
 Here you can find a [CoLab](https://gist.github.com/erogol/97516ad65b44dbddb8cd694953187c5b) notebook for a hands-on example, training LJSpeech. Or you can manually follow the guideline below.

From 4ff2b2f6a6fc6c5bd7a3f35007d8f3c58a576fb4 Mon Sep 17 00:00:00 2001
From: Anand <40825655+anand372@users.noreply.github.com>
Date: Sun, 22 Sep 2019 19:36:52 +0530
Subject: [PATCH 3/6] Update README.md

some typo errors which were identified:
integrade->integrate
listenning->listening to
---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 50b62059..dc29a569 100644
--- a/README.md
+++ b/README.md
@@ -69,7 +69,7 @@ Audio length is approximately 6 secs.
 
 ## Datasets and Data-Loading
 
-TTS provides a generic dataloder easy to use for new datasets. You need to write an preprocessor function to integrade your own dataset.Check ```datasets/preprocess.py``` to see some examples. After the function, you need to set ```dataset``` field in ```config.json```. Do not forget other data related fields too.
+TTS provides a generic dataloder easy to use for new datasets. You need to write an preprocessor function to integrate your own dataset.Check ```datasets/preprocess.py``` to see some examples. After the function, you need to set ```dataset``` field in ```config.json```. Do not forget other data related fields too.
 
 Some of the open-sourced datasets that we successfully applied TTS, are linked below.
 
@@ -82,7 +82,7 @@ Some of the open-sourced datasets that we successfully applied TTS, are linked b
 ## Training and Fine-tuning LJ-Speech
 Here you can find a [CoLab](https://gist.github.com/erogol/97516ad65b44dbddb8cd694953187c5b) notebook for a hands-on example, training LJSpeech. Or you can manually follow the guideline below.
 
-To start with, split ```metadata.csv``` into train and validation subsets respectively ```metadata_train.csv``` and ```metadata_val.csv```. Note that for text-to-speech, validation performance might be misleading since the loss value does not directly measure the voice quality to the human ear and it also does not measure the attention module performance. Therefore, running the model with new sentences and listenning the results is the best way to go.
+To start with, split ```metadata.csv``` into train and validation subsets respectively ```metadata_train.csv``` and ```metadata_val.csv```. Note that for text-to-speech, validation performance might be misleading since the loss value does not directly measure the voice quality to the human ear and it also does not measure the attention module performance. Therefore, running the model with new sentences and listening to the results is the best way to go.
 
 ```
 shuf metadata.csv > metadata_shuf.csv

From 7eb291cae64d5510f847af1d8216af4f2c6f4c2f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eren=20G=C3=B6lge?=
Date: Tue, 29 Oct 2019 12:05:20 +0100
Subject: [PATCH 4/6] Update CODE_OF_CONDUCT.md

---
 CODE_OF_CONDUCT.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
index 498baa3f..3b6d813c 100644
--- a/CODE_OF_CONDUCT.md
+++ b/CODE_OF_CONDUCT.md
@@ -1,3 +1,7 @@
+# Ethical Notice
+
+Please consider possible consequences and be mindful of any adversarial use cases of this project. In this regard, please contact us if you have any concerns.
+
 # Community Participation Guidelines
 
 This repository is governed by Mozilla's code of conduct and etiquette guidelines.

From 5ffd4e24ecc2bf07bc980591fb6b2b347e70cb7c Mon Sep 17 00:00:00 2001
From: Neil Stoker
Date: Wed, 30 Oct 2019 23:55:44 +0000
Subject: [PATCH 5/6] Remove matplotlib version restriction

So far as I can tell the matplotlib version restriction (to 2.0.2) is not
necessary and causes difficulties in conda (for me at least it triggers
reinstallation of an older version which then fails to compile)
---
 setup.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/setup.py b/setup.py
index f6916741..fe41f12f 100644
--- a/setup.py
+++ b/setup.py
@@ -82,7 +82,7 @@ setup(
         "librosa==0.6.2",
         "unidecode==0.4.20",
         "tensorboardX",
-        "matplotlib==2.0.2",
+        "matplotlib",
         "Pillow",
         "flask",
         # "lws",

From 52e4f9940072db64a503c3056656376e9a2a5726 Mon Sep 17 00:00:00 2001
From: Neil Stoker
Date: Wed, 30 Oct 2019 23:57:53 +0000
Subject: [PATCH 6/6] Update requirements.txt

---
 requirements.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/requirements.txt b/requirements.txt
index c9f074f1..e70956e0 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -4,10 +4,10 @@ librosa==0.5.1
 Unidecode==0.4.20
 tensorboard
 tensorboardX
-matplotlib==2.0.2
+matplotlib
 Pillow
 flask
 scipy==0.19.0
 tqdm
 git+git://github.com/bootphon/phonemizer@master
-soundfile
\ No newline at end of file
+soundfile