Commit Graph

1381 Commits (29915ba85c77db944071ea1e0be2b153c2e74102)

Author SHA1 Message Date
Eren Gölge d023872f0d
Merge pull request #407 from thllwg/dev
Configurable number of workers for DataLoader
2020-05-16 15:45:07 +02:00
Eren Gölge 93cab61d80
Merge pull request #406 from mittimithai/patch-6
small change for multispeaker
2020-05-15 11:11:04 +02:00
thllwg 65b9c7d3d6 number of workers as config parameter 2020-05-15 10:19:52 +02:00
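A minimal sketch of what this commit enables (illustrative, not the repo's exact code): the DataLoader worker count comes from a config value instead of being hard-coded. The name num_loader_workers is an assumption about the config key.

import torch
from torch.utils.data import DataLoader, TensorDataset

num_loader_workers = 4  # hypothetical: the repo would read this from config.json
dataset = TensorDataset(torch.randn(32, 10))
loader = DataLoader(dataset, batch_size=8, num_workers=num_loader_workers)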
erogol d5d9e6e8ea bug fix 2020-05-13 13:52:17 +02:00
mittimithai 25f466f299
more whitespace problems 2020-05-12 19:02:37 -07:00
mittimithai 42ff83f9b9
trying to fix trailing whitespace 2020-05-12 18:50:58 -07:00
mittimithai a4aca623c3
removed + chars
silly copy-paste mistake
2020-05-12 15:23:45 -07:00
mittimithai 85a822e319
small change for multispeaker
just threads speaker_id through decoder.run_model
2020-05-12 15:02:24 -07:00
erogol 1cd25ccf0d bug fix 2020-05-12 16:23:32 +02:00
erogol 68dbcee746 import condition update for synthesis with TF 2020-05-12 16:23:32 +02:00
erogol 84c5c4a587 config remove empty chars 2020-05-12 16:23:32 +02:00
erogol b3ec50b5c4 tf backend for synthesis 2020-05-12 16:23:32 +02:00
erogol d99fda8e42 init batch norm with explicit initial values 2020-05-12 16:23:32 +02:00
erogol 6f5c8773d6 enable encoder lstm bias 2020-05-12 16:23:32 +02:00
erogol 9504b71f79 fix: set lstm biases to True 2020-05-12 16:23:32 +02:00
erogol de2918c85b bug fixes 2020-05-12 16:23:32 +02:00
erogol 736f169cc9 tf lstm does not match torch lstm w.r.t. bias vectors, so I avoid biases in the LSTM as an easy workaround. 2020-05-12 16:23:32 +02:00
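For context on the bias mismatch above: torch.nn.LSTM keeps two bias vectors per layer (bias_ih and bias_hh), while a TF/Keras LSTM cell has a single one. Since both enter the gate pre-activations additively, they could also be merged when porting weights; the commit instead sidesteps the issue by disabling biases. A minimal sketch (editor's illustration, not repo code):

import torch

lstm = torch.nn.LSTM(input_size=8, hidden_size=16, bias=True)
# both biases are added inside the gate pre-activations, so a TF-style
# single bias can be formed as their element-wise sum
tf_style_bias = (lstm.bias_ih_l0 + lstm.bias_hh_l0).detach()
print(tf_style_bias.shape)  # torch.Size([64]) == 4 gates * hidden_size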
erogol d282222553 renaming layers to be converted to TF counterpart 2020-05-12 16:23:32 +02:00
erogol bee288fa93 fixing console logging colors 2020-05-12 16:22:59 +02:00
erogol 88bde77061 fix checkpointing 2020-05-12 14:09:28 +02:00
erogol 0ec42fa279 more aggressive folder removal 2020-05-12 14:09:16 +02:00
erogol 3b2d726e2d radam pytorch 1.5 update 2020-05-12 13:58:10 +02:00
erogol 574968b249 refactoring utils 2020-05-12 13:57:37 +02:00
erogol 720c4690db update imports 2020-05-12 13:57:26 +02:00
erogol c0c3c6e331 train.py update imports for utils refactoring 2020-05-12 13:57:13 +02:00
erogol 2d9dcd60ba update imports for util refactoring 2020-05-12 13:56:49 +02:00
Eren Gölge 7292d303b9
Merge pull request #404 from Edresson/patch-3
Fix bug in Graves Attn
2020-05-07 17:37:25 +02:00
Eren Gölge 7a9247f91f
Merge pull request #403 from Edresson/patch-2
fix bug in bidirectional decoder train
2020-05-05 00:33:53 +02:00
Edresson Casanova cce13ee245
Fix bug in Graves Attn
On my machine, in Graves attention the variable self.J (self.J = torch.arange(0, inputs.shape[1]+2).to(inputs.device) + 0.5) is a LongTensor, but it must be a float tensor, so I get the following error:

Traceback (most recent call last):
  File "train.py", line 704, in <module>
    main(args)
  File "train.py", line 619, in main
    global_step, epoch)
  File "train.py", line 170, in train
    text_input, text_lengths, mel_input, speaker_embeddings=speaker_embeddings)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/models/tacotron.py", line 121, in forward
    self.speaker_embeddings_projected)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/tacotron.py", line 435, in forward
    output, stop_token, attention = self.decode(inputs, mask)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/tacotron.py", line 367, in decode
    self.attention_rnn_hidden, inputs, self.processed_inputs, mask)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/common_layers.py", line 180, in forward
    phi_t = g_t.unsqueeze(-1) * (1.0 / (1.0 + torch.sigmoid((mu_t.unsqueeze(-1) - j) / sig_t.unsqueeze(-1))))
RuntimeError: expected type torch.cuda.FloatTensor but got torch.cuda.LongTensor


In addition, the + 0.5 operation is silently cancelled when the tensor is a LongTensor.
Test:
>>> torch.arange(0, 10) 
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> torch.arange(0, 10) + 0.5
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> torch.arange(0, 10.0) + 0.5
tensor([0.5000, 1.5000, 2.5000, 3.5000, 4.5000, 5.5000, 6.5000, 7.5000, 8.5000,
        9.5000])

To resolve this I forced the arange bounds to float:
self.J = torch.arange(0, inputs.shape[1]+2.0).to(inputs.device) + 0.5
2020-05-04 17:52:58 -03:00
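A standalone sketch of the dtype trap fixed above (illustrative shapes, not the repo's attention code): integer bounds make torch.arange return a LongTensor, and on the PyTorch builds of the time the later + 0.5 was silently dropped instead of promoting; a float bound guarantees a floating dtype.

import torch

inputs = torch.randn(2, 10, 4)  # hypothetical (batch, time, dim) encoder output
j_int = torch.arange(0, inputs.shape[1] + 2)  # integer bounds -> LongTensor
print(j_int.dtype)  # torch.int64
j = torch.arange(0, inputs.shape[1] + 2.0).to(inputs.device) + 0.5  # the fix
print(j.dtype)  # torch.float32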
Edresson Casanova e4e29f716e
fix bug in bidirectional decoder train 2020-05-04 17:39:35 -03:00
erogol 67420eeb86 more console printing formatting 2020-04-29 11:58:51 +02:00
erogol 091711459d remove redundant avg keeper 2020-04-29 11:58:26 +02:00
Eren Gölge 2e2221f146
Merge pull request #399 from fatihkiralioglu/master
Tacotron1 + wavernn configuration fix
2020-04-27 12:00:39 +02:00
fatihkiralioglu cc11be06d7
fixing "No space allowed before..." compile errors
fixing "No space allowed before..." compile errors
2020-04-27 09:53:52 +03:00
fatihkiralioglu 70a8210283
Tacotron1 + wavernn configuration fix
Tacotron1 + wavernn configuration: corrected the input format for the wavernn vocoder by converting linear spectrograms to mels
2020-04-25 15:18:46 +03:00
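A minimal sketch of the spectrogram-to-mel conversion this commit describes (generic librosa code, not the repo's audio pipeline; the sr/n_fft/n_mels values are assumptions):

import librosa
import numpy as np

sr, n_fft, n_mels = 22050, 1024, 80  # hypothetical; the repo reads these from config.json

def linear_to_mel(linear_spec):
    # project a linear magnitude spectrogram of shape (n_fft//2 + 1, T) onto mel bands
    mel_basis = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels)
    return np.dot(mel_basis, linear_spec)

mel = linear_to_mel(np.random.rand(n_fft // 2 + 1, 100))
print(mel.shape)  # (80, 100)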
Eren Gölge f7b1cad9ee
Merge pull request #398 from PNRxA/dev
numpy to use CPU when using CUDA
2020-04-25 12:01:04 +02:00
PNRxA 61a1d59ac5 numpy to use CPU when using CUDA 2020-04-25 16:30:19 +10:00
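Context for the fix above: PyTorch refuses to convert a CUDA tensor to numpy directly, so such tensors must be detached and moved to the CPU first. A minimal sketch (editor's illustration, not the PR's exact diff):

import torch

x = torch.randn(4, requires_grad=True)
if torch.cuda.is_available():
    x = x.to("cuda")
# x.numpy() raises on CUDA tensors (and on tensors that require grad);
# detach and move to CPU before converting
x_np = x.detach().cpu().numpy()
print(x_np.shape)  # (4,)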
erogol 373a682c08 logging fix 2020-04-24 18:51:15 +02:00
erogol 95385c8797 adding a dummy normalization file for testing mean-var norm 2020-04-23 16:06:12 +02:00
erogol 53b24625a7 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-04-23 15:48:05 +02:00
erogol f63bce89f6 pylint fix 2020-04-23 15:46:45 +02:00
erogol 3673cc1e30 passing reduced losses to loss dict 2020-04-23 15:46:11 +02:00
erogol 6e2c8c6537 update config.json 2020-04-23 14:37:12 +02:00
erogol d5093bf6fb checkpoint log 2020-04-23 14:24:38 +02:00
erogol 0e7ecca33f fancier and more flexible (self-adapting to loss_dict) console logging; fixing multi-GPU loss reduce 2020-04-23 14:14:09 +02:00
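For context on the multi-GPU loss reduce mentioned above: under torch.nn.DataParallel a scalar loss returned from forward() comes back as a vector with one entry per GPU and must be reduced before logging or backward. A hedged sketch of the general pattern (the function name and loss_dict layout are assumptions, not the repo's code):

import torch

def reduce_loss_dict(loss_dict):
    # average per-GPU loss entries; leave true scalars untouched
    return {k: v.mean() if isinstance(v, torch.Tensor) and v.dim() > 0 else v
            for k, v in loss_dict.items()}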
erogol 668a695763 bug fixes and consider the fmin fmax plotting specs 2020-04-09 12:28:52 +02:00
Eren Gölge 99be88c338
Merge pull request #388 from mittimithai/patch-3
Small fix for "Tacotron" use
2020-04-02 13:07:17 +02:00
erogol 3293d4e05f bug fix to use tacotron 2020-04-02 13:06:19 +02:00
mittimithai 6501369b0a
Small fix for "Tacotron" use
"Tacotron" won't work without this fix, since the linear spectrograms end up not getting computed
2020-04-01 11:57:53 -07:00
erogol 391dab45f0 update ExtractTTSSpecs notebook 2020-03-29 23:07:12 +02:00