* Bug Fix on XTTS load
* Bug fix in MP3 length on TTSDataset
* Update TTS/tts/datasets/dataset.py
Co-authored-by: Aarni Koskela <akx@iki.fi>
* Uses mutagen for all audio formats
* Add dataloader test with all supported audio formats
* Use mutagen.File
* Update
* Fix aux unit tests
* Bug fix on unit tests
---------
Co-authored-by: Aarni Koskela <akx@iki.fi>
* refactor(punctuation): remove orphan code for handling lone punctuation
The case of lone punctuation is already handled at the top of restore(). The
removed if statement would never be called and would in fact raise an
AttributeError because the _punc_index named tuple doesn't have the attribute
`mark`.
* refactor(punctuation): remove unused argument
* fix(punctuation): correctly handle initial punctuation
Stripping and restoring initial punctuation didn't work correctly because the
string-splitting caused an additional empty string to be inserted in the text
list (because `".A".split(".")` => `["", "A"]`). Now, an initial empty string is
skipped and relevant test cases are added.
Fixes #3333
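The split behaviour behind the initial-punctuation fix can be illustrated with a minimal sketch. The helper name `split_on_mark` is hypothetical and not taken from the code base; it only shows why the leading empty string appears and one way to skip it.

```python
from typing import List


def split_on_mark(text: str, mark: str) -> List[str]:
    """Split text on mark, skipping the empty string caused by a leading mark."""
    parts = text.split(mark)
    # ".A".split(".") yields ["", "A"]: the leading "" comes from the
    # initial punctuation and is not real text, so it is dropped here.
    if parts and parts[0] == "":
        parts = parts[1:]
    return parts


print(".A".split("."))           # ['', 'A']
print(split_on_mark(".A", "."))  # ['A']
```

Text without initial punctuation is unaffected, since its first element is never empty.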
* Implement most similar ref training approach
* Use non-enhanced hifigan for test samples
* Add Perceiver
* Update GPT Trainer for perceiver support
* Update XTTS docs
* Bug fix masking with XTTS perceiver
* Bug fix on gpt forward
* Bug Fix on XTTS v2.0 training
* Add XTTS v2.0 unit tests
* Add XTTS v2.0 inference unit tests
* Bug Fix on diffusion inference
* Add XTTS v2.0 training recipe
* Placeholder model entry
* Add cloning params to config
* Make prompt embedding configurable
* Make cloning configurable
* Cheap fix for a cheaper fix
* Prevent resampling
* Update model entry
* Update docs
* Update requirements
* Code linting
* Add xtts v2 to sep tests
* Bug fix on XTTS get_gpt_cond_latents
* Bug fix on rebase
* Make style
* Bug fix in Japanese tokenizer
* Add num2words to deps
* Remove unused kwarg and add num_beams=1 as default
---------
Co-authored-by: Eren Gölge <egolge@coqui.ai>
* add cli options for play and speed
--play argument uses simpleaudio to play the tts wav
--speed <float 0.0-2.0> passes speed argument to Coqui Studio models
* remove simpleaudio not referenced in file
* fix simpleaudio dependency version
* add ALSA headers for simpleaudio compilation
* Dockerfile ALSA headers for simpleaudio
* base changes to use stdout instead of play audio
Considering switching to piping wav data to another program, such as aplay,
for audio playback.
This is incomplete code, pushed to get feedback before proceeding with the
implementation.
* replace play with pipe_out arg that suppresses stdout
Removed play and the simpleaudio dependency in favor of pipe
functionality that allows passing wav file data to a program
dedicated to playing audio.
* scipy.io.wavfile.write fails with /dev/null target
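The pipe approach described in the commits above can be sketched roughly as follows. This is a sketch under assumptions, not the project's implementation: it uses only the standard-library `wave` module, assumes 16-bit mono PCM samples, and the names `wav_bytes` and `pipe_out` as used here are hypothetical helpers.

```python
import io
import struct
import sys
import wave


def wav_bytes(samples, sample_rate=22050):
    """Encode 16-bit mono PCM samples as an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)            # mono
        wf.setsampwidth(2)            # 2 bytes per sample = 16-bit
        wf.setframerate(sample_rate)
        wf.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buf.getvalue()


def pipe_out(samples, stream, sample_rate=22050):
    """Write the encoded WAV data to a binary stream such as sys.stdout.buffer."""
    stream.write(wav_bytes(samples, sample_rate))
    stream.flush()

# Usage (assumed): call pipe_out(samples, sys.stdout.buffer) in a script and
# pipe that script's output into a player such as aplay.
```

Writing to a binary stream keeps playback outside the tool itself, which is the stated reason for dropping the simpleaudio dependency.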
* Streaming inference for XTTS 🚀 (#3035)
* v0.17.7
* Redownload XTTS when the local and remote configs do not match
* Remove unused method
* Print a message when it is already downloaded
* Try-except to present an error when the user doesn't have a connection
* Fix style
* 0.17.8
* v0.17.8
---------
Co-authored-by: Julian Weber <julian.weber@hotmail.fr>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
Co-authored-by: Edresson Casanova <edresson1@gmail.com>
Co-authored-by: ggoknar <ggoknar@coqui.ai>
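The re-download check mentioned in this group (re-download when the local and remote configs differ, and report an error when there is no connection) might look roughly like the sketch below. It is an illustration under assumptions, not the project's code: `needs_redownload` and `fetch_remote` are hypothetical names, a missing connection is assumed to surface as an `OSError`, and the remote fetch is injected as a callable so the sketch does not depend on any particular HTTP library.

```python
import json
from pathlib import Path
from typing import Callable


def needs_redownload(local_config: Path, fetch_remote: Callable[[], dict]) -> bool:
    """Return True when the model should be downloaded again."""
    if not local_config.exists():
        return True  # nothing downloaded yet
    try:
        remote = fetch_remote()
    except OSError as err:
        # Assumed behavior: without a connection, report it and keep the local copy.
        print(f"Could not fetch the remote config ({err}); keeping the local model.")
        return False
    with local_config.open(encoding="utf-8") as f:
        local = json.load(f)
    return local != remote
```

Keeping the local copy when the remote config is unreachable is a design choice of this sketch; the commits above only state that an error is presented to the user.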
* initial commit
* Tortoise inference
* revert path change
* style fix
* remove accidental remove
* style fixes
* style fixes
* removed unwanted assets and deps
* remove changes
* remove cvvp
* style fix black
* added tortoise config and updated config and args, refactoring the code
* added tortoise to api
* Pull mel_norm from url
* Use TTS cleaners
* Let download model files
* add ability to pass tortoise presets through coqui api
* fix tests
* fix style and tests
* fix tts commandline for tortoise
* Add config.json to tortoise
* Use kwargs
* Use regular model api for loading tortoise
* Add load from dir to synthesizer
* Fix Tortoise floats
* Use model_dir when there are multiple urls
* Use `synthesize` when it exists
* lint fixes and resolve preset bug
* resolve a download bug and update model link
* fix json
* do tortoise inference from voice dir
* fix
* fix test
* fix speaker id and remove assets
* update inference_tests.yml
* replace inference_test.yml
* fix extra dir as None
* fix tests
* remove space
* Reformat docstring
* Add docs
* Update docs
* lint fixes
---------
Co-authored-by: Eren Gölge <egolge@coqui.ai>
Co-authored-by: Eren Gölge <erogol@hotmail.com>
* Warn when lang is not avail
* Make style
* Implement Coqui Studio API
* Test
* Update docs
* Set action
* Make style
* Make lint
* Update README
* Make style
* Fix action
* Run actions