Commit Graph

4474 Commits (fix_xtts_v1.1_)

Author SHA1 Message Date
manmay nakhashi 624513018d
add energy by default to Fastspeech2 config (#2326)
* add energy by default

* added energy to base tts

* fix energy dataset

* fix styles

* fix test
2023-03-06 10:20:25 +01:00
Florian Quirin 478c8178b8
Basic Mary-TTS API compatibility (#2352)
* added basic Mary-TTS API endpoints to server

- imported `parse_qs` from `urllib.parse` to parse HTTP POST parameters
- imported `render_template_string` from `flask` to return text as endpoint result
- added new routes:
  - `/locales` - returns list of locales (currently locale of active model)
  - `/voices` - returns list of voices (currently locale and name of active model)
  - `/process` - accepts synth. request (GET and POST) with parameter `INPUT_TEXT` (other parameters ignored since we have only one active model)
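
A minimal sketch of how `parse_qs` can pull `INPUT_TEXT` out of a URL-encoded POST body, as the `/process` route described above needs to do. The helper name `extract_input_text` and the sample body are hypothetical and not taken from server.py; only `urllib.parse.parse_qs` is the real standard-library call.

```python
from urllib.parse import parse_qs


def extract_input_text(raw_body: bytes, encoding: str = "utf-8") -> str:
    """Return the INPUT_TEXT field of a URL-encoded form body, or '' if absent.

    Hypothetical helper for illustration; all other parameters are ignored,
    mirroring the single-active-model behaviour described in the commit.
    """
    params = parse_qs(raw_body.decode(encoding))
    # parse_qs maps every key to a list of values; take the first one.
    return params.get("INPUT_TEXT", [""])[0]


# Sample body (assumed shape of a Mary-TTS style request).
body = b"INPUT_TEXT=Hello+world&LOCALE=en_US&VOICE=ignored"
print(extract_input_text(body))
```

`parse_qs` also decodes `+` and percent-escapes, so the returned text can be passed to the synthesizer without further unquoting.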

* better log messages for Mary-TTS API

- smaller tweaks to log output

* use f-string in log print to please linter

* updated server.py to match 'make style' result
2023-03-06 10:08:21 +01:00
thennal10 d39bc74f57
OverFlow with test sentences (#2253)
* Fix typo in function definition

* Swap hasattr out

`hasattr(self, "speaker_manager")` and `hasattr(self, "language_manager")` seem to be redundant since BaseTTS defines both.
2023-03-01 09:11:30 +01:00
Edresson Casanova 16b9862252
Fix Speaker Consistency Loss (SCL) (#2364) 2023-02-27 09:14:00 +03:00
p0p4k a365a7e888
numpy version for py310 (#2316)
* numpy version for py310

requested in #2315

* Update requirements.txt
2023-02-13 10:34:00 +01:00
Eren Gölge d488b4f1c6 Merge branch 'dev' into main 2023-02-10 17:39:37 +01:00
Eren Gölge 661725b95e Bump up to v0.11.1 2023-02-10 15:59:05 +01:00
Eren Gölge 0196b4dfbf Merge branch 'add_neural_hmm_model' into dev 2023-02-10 15:23:56 +01:00
Eren Gölge ea5bd7dcbc Merge branch 'dev' into main 2023-02-10 10:27:34 +01:00
Eren Gölge 914280a556
Bump up to v0.11.0 (#2329)
* Make style

* Bump up to v0.11.0
2023-02-08 13:58:49 +01:00
Eren Gölge 6cfb590eb2 Merge branch 'dev' into main 2023-02-06 11:47:18 +01:00
Eren Gölge 683b4d432f Fixup 2023-02-06 11:44:56 +01:00
Eren Gölge c7184dcef9 Linter fix 2023-02-06 11:30:36 +01:00
Eren Gölge 910a218652 Merge branch 'dev' into main 2023-02-06 11:25:33 +01:00
Eren Gölge 4e75b6262c Update docs 2023-02-06 11:20:32 +01:00
Eren Gölge 85b3a04b37 Merge branch 'api_model_path' into dev 2023-02-06 11:18:00 +01:00
Eren Gölge c496b1a986 Linter fix 2023-02-06 11:17:28 +01:00
Eren Gölge baed2a2c2b Update README 2023-02-06 11:15:43 +01:00
marius851000 1f4d8bf0f1
Fix tts-server for multi-lingual models (#2257) 2023-02-06 10:54:34 +01:00
Eren Gölge 6ee94f8bad Fixup 2023-01-30 14:02:25 +01:00
Eren Gölge 713e8c8d04 Add pretrained model 2023-01-30 13:55:17 +01:00
Eren Gölge 7fddabc8ac Implement cloning in API 2023-01-30 13:35:48 +01:00
Eren Gölge 335b8ed44e Add vocoder path 2023-01-30 12:59:29 +01:00
Martin Weinelt 994be163e1
Use packaging.version for version comparisons (#2310)
* Use packaging.version for version comparisons

The distutils package is deprecated¹ and relies on PEP 386² version
comparisons, which have been superseded by PEP 440³ which is implemented
through the packaging module.

With more recent distutils versions, provided through setuptools
vendoring, we are seeing the following exception during version
comparisons:

> TypeError: '<' not supported between instances of 'str' and 'int'

This is fixed by this migration.

[1] https://docs.python.org/3/library/distutils.html
[2] https://peps.python.org/pep-0386/
[3] https://peps.python.org/pep-0440/

* Improve espeak version detection robustness

On many modern systems espeak is just a symlink to espeak-ng. In that
case looking for the 3rd word in the version output will break the
version comparison, when it finds `text-to-speech:`, instead of a proper
version.

This will not break during runtime, where espeak-ng would be
prioritized, but the phonemizer and tokenizer tests force the backend
to `espeak`, which exhibits this breakage.

This improves the version detection by simply looking for the version
after the "text-to-speech:" token.
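
The detection approach described above can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: the function name `parse_espeak_version` is hypothetical, and the two sample strings are assumed examples of `--version` output from classic espeak and from espeak-ng.

```python
def parse_espeak_version(version_output: str) -> str:
    """Return the token that follows 'text-to-speech:' in the version output.

    Hypothetical helper; the fixed-position approach (taking the 3rd word)
    breaks when `espeak` is a symlink to espeak-ng, because the extra 'NG'
    word shifts every later token by one.
    """
    marker = "text-to-speech:"
    tokens = version_output.split()
    if marker not in tokens:
        raise ValueError(f"no {marker!r} token in: {version_output!r}")
    idx = tokens.index(marker)
    if idx + 1 >= len(tokens):
        raise ValueError("version output ends right after the marker")
    return tokens[idx + 1]


# Assumed sample outputs, for illustration only.
classic = "eSpeak text-to-speech: 1.48.03  04.Mar.14  Data at: /usr/lib/espeak-data"
ng = "eSpeak NG text-to-speech: 1.50  Data at: /usr/share/espeak-ng-data"

print(parse_espeak_version(classic))
print(parse_espeak_version(ng))
```

Anchoring on the marker token rather than on a word position keeps the parser independent of how many words precede the version.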

* Replace distutils.copy_tree with shutil.copytree

The distutils module is deprecated and slated for removal in Python
3.12. Its usage should be replaced, in this case by a compatible method
from shutil.
2023-01-29 23:47:00 +01:00
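
The last item of the commit above swaps the distutils `copy_tree` helper for `shutil.copytree`. A minimal sketch of such a replacement, assuming Python 3.8+ and assuming the destination directory may already exist (which `shutil.copytree` rejects unless `dirs_exist_ok=True` is passed). The wrapper name `copy_tree` and the file names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def copy_tree(src, dst) -> Path:
    """Recursively copy src into dst, merging into dst if it already exists.

    Hypothetical wrapper; dirs_exist_ok requires Python 3.8 or newer.
    """
    return Path(shutil.copytree(src, dst, dirs_exist_ok=True))


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "src"
    dst = Path(tmp) / "dst"
    (src / "nested").mkdir(parents=True)
    (src / "nested" / "config.json").write_text("{}")
    dst.mkdir()  # the destination already exists before the copy

    copy_tree(src, dst)
    copied = (dst / "nested" / "config.json").read_text()

print(copied)
```

Without `dirs_exist_ok=True`, the same call raises `FileExistsError` when the destination is already present.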
Eren Gölge cf076345e7 Make style 2023-01-23 13:49:51 +01:00
Eren Gölge 13334d507c Load model from path 2023-01-23 13:45:45 +01:00
Gerard Sant Muniesa c59b3f75b8
Add Catalan text cleaners for Catalan support (#2295) 2023-01-23 11:56:30 +01:00
Shivam Mehta d83ee8fe45
Adding neural HMM TTS Model (#2272)
* Adding neural HMM TTS

* Adding tests

* Adding neural hmm on readme

* renaming training recipe

* Removing overflow's decoder parameters from the config

* Update the Trainer requirement version for a compatible one (#2276)

* Bump up to v0.10.2

* Adding neural HMM TTS

* Adding tests

* Adding neural hmm on readme

* renaming training recipe

* Removing overflow's decoder parameters from the config

* fixing documentation

Co-authored-by: Edresson Casanova <edresson1@gmail.com>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
2023-01-23 11:53:04 +01:00
Eren Gölge 497f22b20b
Cache speaker encoder model (#2284) 2023-01-23 11:49:51 +01:00
Eren Gölge 6e3f74fc29 Fix #2191 2023-01-15 23:11:57 +01:00
manmay nakhashi bc422f2f3c
Fastspeech2 (#2073)
* added EnergyDataset

* add energy to Dataset

* add compute_energy

* added energy params

* added energy to forward_tts

* added plot_avg_energy for visualisation

* Update forward_tts.py

* create file

* added fastspeech2 recipe

* add fastspeech2 config

* removed energy from fast pitch

* add energy loss to forward tts

* Update fastspeech2_config.py

* change run_name

* Update numpy_transforms.py

* fix typo

* fix typo

* fix typo

* linting issues

* use_energy default value --> False

* Update numpy_transforms.py

* linting fixes

* fix typo

* linting_fix

* linting_fix

* fix

* fixes

* fixes

* lint fix

* lint fixes

* added training test

* wrong import

* wrong import

* trailing whitespace

* style fix

* changed class name because of error

* class name change

* class name change

* change class name

* fixed styles
2023-01-15 22:39:22 +01:00
Eren Gölge 14d45b5347
Bump up to v0.10.2 2023-01-11 01:06:02 +01:00
Edresson Casanova 49dfaa5234
Update the Trainer requirement version for a compatible one (#2276) 2023-01-11 01:01:46 +01:00
Khalid Bashir 42afad5e79
Fixed bug related to yourtts speaker embeddings issue (#2234)
* Fixed bug related to yourtts speaker embeddings issue

* Reverted code for base_tts

* Bug fix on VITS d_vector_file type

* Ignore the test speakers on YourTTS recipe

* Add speaker encoder model and config on YourTTS recipe to easily do zero-shot inference

* Update YourTTS config file

* Update ModelManager._update_path to deal with list attributes

* Fix lint checks

* Remove unused code

* Fix unit tests

* Reset name_to_id to get the right speaker ids on load_embeddings_from_list_of_files

* Set weighted_sampler_multipliers as an empty dict to prevent users' mistakes

Co-authored-by: Edresson Casanova <edresson1@gmail.com>
2023-01-02 14:20:02 +01:00
Eren Gölge da93d768b8 Update docs 2023-01-02 10:07:03 +01:00
Julian Weber a07397733b
Multilingual tokenizer (#2229)
* Implement multilingual tokenizer

* Add multi_phonemizer recipe

* Fix lint

* Add TestMultiPhonemizer

* Fix lint

* make style
2023-01-02 10:03:19 +01:00
Eren Gölge a31af762e8
v0.10.1 (#2242)
* Add Ukrainian LADA (female) voice

* Add ca and fa models

* Add pth files to manager

* Bump up to v0.10.1

Co-authored-by: Yehor Smoliakov <yehors@ukr.net>
2022-12-26 15:46:21 +01:00
Eren Gölge f814d52394 Bump up to v0.10.1 2022-12-26 14:29:46 +01:00
Eren Gölge 8c32a6998a Add pth files to manager 2022-12-26 14:29:25 +01:00
Eren Gölge cf765cb3f2 Add ca and fa models 2022-12-26 14:29:10 +01:00
Eren Gölge 0910cb76bc
Merge pull request #2226 from egorsmkv/patch-1
Add Ukrainian LADA (female) voice
2022-12-16 13:17:16 +01:00
Yehor Smoliakov 046b137946
Add Ukrainian LADA (female) voice 2022-12-16 12:30:44 +02:00
Eren Gölge a04db8d632
Merge pull request #2205 from coqui-ai/dev
🚀 v0.10.0
2022-12-15 12:02:16 +01:00
Eren Gölge 46b0ad37e7 Bump up to v0.10.0 2022-12-15 11:19:23 +01:00
Eren Gölge a9167cf239
Fixup overflow (#2218)
* Update overflow config

* Pulling shuffle and drop_last from config

* Print training stats for overflow
2022-12-15 00:56:48 +01:00
Eren Gölge ecea43ec81
Adding pre-trained Overflow model (#2211)
* Adding pretrained Overflow model

* Stabilize HMM

* Fixup model manager

* Return `audio_unique_name` by default

* Distribute max split size over datasets

* Fixup eval_split_size

* Make style
2022-12-14 16:55:48 +01:00
Edresson Casanova 061ac43187
Add Original YourTTS vocabulary for full transfer learning (#2206) 2022-12-13 09:02:10 +01:00
Edresson Casanova 3b1a28fa95
Add YourTTS VCTK recipe (#2198)
* Add YourTTS VCTK recipe

* Fix lint

* Add compute_embeddings and resample_files functions to be able to reuse it

* Add automatic download and speaker embedding computation for YourTTS VCTK recipe

* Add parameter for eval metadata file on compute embeddings function
2022-12-12 16:14:25 +01:00
Shivam Mehta 3b8b105b0d
Adding OverFlow (#2183)
* Adding encoder

* currently modifying hmm

* Adding hmm

* Adding overflow

* Adding overflow setting up flat start

* Removing runs

* adding normalization parameters

* Fixing models on same device

* Training overflow and plotting evaluations

* Adding inference

* At the end of epoch the test sentences are coming on cpu instead of gpu

* Adding figures from model during training to monitor

* reverting tacotron2 training recipe

* fixing inference on gpu for test sentences on config

* moving helpers and texts within overflows source code

* renaming to overflow

* moving loss to the model file

* Fixing the rename

* Model training but not plotting the audio of the test config sentences

* Formatting logs

* Changing model name to camelcase

* Fixing test log

* Fixing plotting bug

* Adding some tests

* Adding more tests to overflow

* Adding all tests for overflow

* making changes to camel case in config

* Adding information about parameters and docstring

* removing compute_mel_statistics moved statistic computation to the model instead

* Added overflow in readme

* Adding more test cases; now it doesn't save tensors like transition_p and can be dumped as JSON
2022-12-12 12:44:15 +01:00
p0p4k 2e153d54a8
Adding missing key to formatter (#2194)
Quick fix for #2156.
Added 'root_path' key.
2022-12-12 12:25:37 +01:00