* initial commit
* Tortoise inference
* revert path change
* style fix
* remove accidental remove
* style fixes
* style fixes
* removed unwanted assests and deps
* remove changes
* remove cvvp
* style fix black
* added tortoise config and updated config and args, refactoring the code
* added tortoise to api
* Pull mel_norm from url
* Use TTS cleaners
* Let download model files
* add ability to pass tortoise presets through coqui api
* fix tests
* fix style and tests
* fix tts commandline for tortoise
* Add config.json to tortoise
* Use kwargs
* Use regular model api for loading tortoise
* Add load from dir to synthesizer
* Fix Tortoise floats
* Use model_dir when there are multiple urls
* Use `synthesize` when exists
* lint fixes and resolve preset bug
* resolve a download bug and update model link
* fix json
* do tortoise inference from voice dir
* fix
* fix test
* fix speaker id and remove assests
* update inference_tests.yml
* replace inference_test.yml
* fix extra dir as None
* fix tests
* remove space
* Reformat docstring
* Add docs
* Update docs
* lint fixes
---------
Co-authored-by: Eren Gölge <egolge@coqui.ai>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
* Warn when lang is not avail
* Make style
* Implement Coqui Studio API
* Test
* Update docs
* Set action
* Make style
* Make lint
* Update README
* Make style
* Fix action
* Run actions
* Use packaging.version for version comparisons
The distutils package is deprecated¹ and relies on PEP 386² version
comparisons, which have been superseded by PEP 440³ which is implemented
through the packaging module.
With more recent distutils versions, provided through setuptools
vendoring, we are seeing the following exception during version
comparisons:
> TypeError: '<' not supported between instances of 'str' and 'int'
This is fixed by this migration.
[1] https://docs.python.org/3/library/distutils.html
[2] https://peps.python.org/pep-0386/
[3] https://peps.python.org/pep-0440/
* Improve espeak version detection robustness
On many modern systems espeak is just a symlink to espeak-ng. In that
case looking for the 3rd word in the version output will break the
version comparison, when it finds `text-to-speech:`, instead of a proper
version.
This will not break during runtime, where espeak-ng would be
prioritized, but the phonemizer and tokenizer tests force the backend
to `espeak`, which exhibits this breakage.
This improves the version detection by simply looking for the version
after the "text-to-speech:" token.
* Replace distuils.copy_tree with shutil.copytree
The distutils module is deprecated and slated for removal in Python
3.12. Its usage should be replaced, in this case by a compatible method
from shutil.
* Adding neural HMM TTS
* Adding tests
* Adding neural hmm on readme
* renaming training recipe
* Removing overflow\s decoder parameters from the config
* Update the Trainer requirement version for a compatible one (#2276)
* Bump up to v0.10.2
* Adding neural HMM TTS
* Adding tests
* Adding neural hmm on readme
* renaming training recipe
* Removing overflow\s decoder parameters from the config
* fixing documentation
Co-authored-by: Edresson Casanova <edresson1@gmail.com>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
* Fixed bug related to yourtts speaker embeddings issue
* Reverted code for base_tts
* Bug fix on VITS d_vector_file type
* Ignore the test speakers on YourTTS recipe
* Add speaker encoder model and config on YourTTS recipe to easily do zero-shot inference
* Update YourTTS config file
* Update ModelManager._update_path to deal with list attributes
* Fix lint checks
* Remove unused code
* Fix unit tests
* Reset name_to_id to get the right speaker ids on load_embeddings_from_list_of_files
* Set weighted_sampler_multipliers as an empty dict to prevent users' mistakes
Co-authored-by: Edresson Casanova <edresson1@gmail.com>
* Adding encoder
* currently modifying hmm
* Adding hmm
* Adding overflow
* Adding overflow setting up flat start
* Removing runs
* adding normalization parameters
* Fixing models on same device
* Training overflow and plotting evaluations
* Adding inference
* At the end of epoch the test sentences are coming on cpu instead of gpu
* Adding figures from model during training to monitor
* reverting tacotron2 training recipe
* fixing inference on gpu for test sentences on config
* moving helpers and texts within overflows source code
* renaming to overflow
* moving loss to the model file
* Fixing the rename
* Model training but not plotting the test config sentences's audios
* Formatting logs
* Changing model name to camelcase
* Fixing test log
* Fixing plotting bug
* Adding some tests
* Adding more tests to overflow
* Adding all tests for overflow
* making changes to camel case in config
* Adding information about parameters and docstring
* removing compute_mel_statistics moved statistic computation to the model instead
* Added overflow in readme
* Adding more test cases, now it doesn't saves transition_p like tensor and can be dumped as json
* Cache fsspec downloaded files
* Use diff paths for test
* Make fsspec caching optional
* Decom GPU docker tests
* Make progress bar optional for better CI log
* Check path local
* Update BaseDatasetConfig
- Add dataset_name
- Chane name to formatter_name
* Update compute_embedding
- Allow entering dataset by args
- Use released model by default
- Use the new key format
* Update loading
* Update recipes
* Update other dep code
* Update tests
* Fixup
* Load multiple embedding files
* Fix argument names in dep code
* Update docs
* Fix argument name
* Fix linter
* Update requirements.txt
install jamo for korean
* Update formatters.py
add KSS formatter
KSS is a korean single speech dataset (12hours)
* Add files via upload
add phonemizer for korean
* Add files via upload
add korean phonemizer
* Update requirements.txt
* change code style with `black` and `pylint`
* reflecting pylint's Evaluation
* reflecting pylint's Evaluation
* reflecting pylint's Evaluation-2
* isort
* edit about separator
write test case and add 'nltk' for requirements.txt
* add korean g2p (g2pkk)
* isort
* TTS/tts/utils/text/phonemizers/ko_kr_phonemizer.py:43:24: W0621: Redefining name 'text' from outer scope (line 58) (redefined-outer-name)
TTS/tts/utils/text/korean/korean.py:28:8: R1705: Unnecessary "else" after "return" (no-else-return)
* black
* Set n_jobs to 1 for resample script
* Delete resample test
* Set n_jobs 1 in vad test
* delete vad test
* Revert "Delete resample test"
This reverts commit bb7c8466af.
* Remove tests with resample