Commit Graph

1346 Commits (263dc2f7ce067ad5535f779f9ff2d0bad2cb4a2f)

Author SHA1 Message Date
erogol 263dc2f7ce fix weird lint problem but not caring 2020-05-20 16:27:48 +02:00
erogol ddd7de6439 update notebook 2020-05-20 16:12:10 +02:00
erogol 4a6949632b bug fixes and benchmark notebook update 2020-05-20 16:05:57 +02:00
erogol ed67cadf98 requirements for testingwq 2020-05-20 15:06:41 +02:00
erogol a893405739 raise not implemented for multispeaker TTS_tf inference 2020-05-20 14:26:56 +02:00
erogol cb9ac27b65 TTS_tf notebook update 2020-05-20 14:26:47 +02:00
erogol 6ccf32c2b9 update tests 2020-05-20 14:00:31 +02:00
erogol ca359727bc config update and change default debug mode 2020-05-20 12:30:06 +02:00
erogol 1835628335 tf conversion fixes 2020-05-20 12:25:24 +02:00
erogol dc166b42e3 update config.json 2020-05-20 11:55:32 +02:00
erogol 97cd39bf99 console logger fix 2020-05-19 16:44:37 +02:00
erogol f75b0a6439 linter updates 2020-05-18 18:46:13 +02:00
erogol 496ff68dec config update 2020-05-18 18:45:30 +02:00
erogol 327c88b4bb dme update 2020-05-18 13:31:14 +02:00
erogol df8fd3823d Merge branch 'tf-convert2' into dev 2020-05-18 13:13:21 +02:00
Eren Gölge e55e28bb37 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-05-18 13:12:31 +02:00
erogol 342d6303d4 update TF model notebook 2020-05-18 12:20:51 +02:00
erogol 8e6aedccee update readme 2020-05-18 12:00:10 +02:00
erogol 523fa5dfd2 pass sequence mask to the same device as the input 2020-05-18 11:35:19 +02:00
erogol 8805370645 add tf tacotron2 test and edit test utils imports after utils refactoring 2020-05-18 11:34:13 +02:00
Eren Gölge 67397be1c0 tf folder add 2020-05-18 11:02:36 +02:00
Eren Gölge d023872f0d
Merge pull request #407 from thllwg/dev
Configurable number of workers for DataLoader
2020-05-16 15:45:07 +02:00
Eren Gölge 93cab61d80
Merge pull request #406 from mittimithai/patch-6
small change for multispeaker
2020-05-15 11:11:04 +02:00
thllwg 65b9c7d3d6 number of workers as config parameter 2020-05-15 10:19:52 +02:00
erogol d5d9e6e8ea bug fix 2020-05-13 13:52:17 +02:00
mittimithai 25f466f299
more whitespace problems 2020-05-12 19:02:37 -07:00
mittimithai 42ff83f9b9
trying to fix trailing whitespace 2020-05-12 18:50:58 -07:00
mittimithai a4aca623c3
removed + chars
silly mistake copy pasting
2020-05-12 15:23:45 -07:00
mittimithai 85a822e319
small change for multispeaker
just threads speaker_id through decoder.run_model
2020-05-12 15:02:24 -07:00
erogol 1cd25ccf0d bug fix 2020-05-12 16:23:32 +02:00
erogol 68dbcee746 import condition update for synthesis with TF 2020-05-12 16:23:32 +02:00
erogol 84c5c4a587 config remove empty chars 2020-05-12 16:23:32 +02:00
erogol b3ec50b5c4 tf backend for synthesis 2020-05-12 16:23:32 +02:00
erogol d99fda8e42 init batch norm explicit initial values 2020-05-12 16:23:32 +02:00
erogol 6f5c8773d6 enable encoder lstm bias 2020-05-12 16:23:32 +02:00
erogol 9504b71f79 fix lstm biases True 2020-05-12 16:23:32 +02:00
erogol de2918c85b bug fixes 2020-05-12 16:23:32 +02:00
erogol 736f169cc9 tf lstm does not match torch lstm wrt bias vectors. So I avoid bias in LSTM as an easy solution. 2020-05-12 16:23:32 +02:00
erogol d282222553 renaming layers to be converted to TF counterpart 2020-05-12 16:23:32 +02:00
erogol bee288fa93 fixing console logging colors 2020-05-12 16:22:59 +02:00
erogol 88bde77061 fix checkpointing 2020-05-12 14:09:28 +02:00
erogol 0ec42fa279 more aggressive remove folder 2020-05-12 14:09:16 +02:00
erogol 3b2d726e2d radam pytorch 1.5 update 2020-05-12 13:58:10 +02:00
erogol 574968b249 refactoring utils 2020-05-12 13:57:37 +02:00
erogol 720c4690db update imports 2020-05-12 13:57:26 +02:00
erogol c0c3c6e331 train.py update imports for utils refactoring 2020-05-12 13:57:13 +02:00
erogol 2d9dcd60ba update imports for util refactoring 2020-05-12 13:56:49 +02:00
Eren Gölge 7292d303b9
Merge pull request #404 from Edresson/patch-3
Fix bug in Graves Attn
2020-05-07 17:37:25 +02:00
Eren Gölge 7a9247f91f
Merge pull request #403 from Edresson/patch-2
fix bug in bidirectional decoder train
2020-05-05 00:33:53 +02:00
Edresson Casanova cce13ee245
Fix bug in Graves Attn
On my machine, in Graves attention, the variable self.J (self.J = torch.arange(0, inputs.shape[1]+2).to(inputs.device) + 0.5) is a LongTensor, but it must be a float tensor, so I get the following error:

Traceback (most recent call last):
  File "train.py", line 704, in <module>
    main(args)
  File "train.py", line 619, in main
    global_step, epoch)
  File "train.py", line 170, in train
    text_input, text_lengths, mel_input, speaker_embeddings=speaker_embeddings)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/models/tacotron.py", line 121, in forward
    self.speaker_embeddings_projected)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/tacotron.py", line 435, in forward
    output, stop_token, attention = self.decode(inputs, mask)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/tacotron.py", line 367, in decode
    self.attention_rnn_hidden, inputs, self.processed_inputs, mask)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/common_layers.py", line 180, in forward
    phi_t = g_t.unsqueeze(-1) * (1.0 / (1.0 + torch.sigmoid((mu_t.unsqueeze(-1) - j) / sig_t.unsqueeze(-1))))
RuntimeError: expected type torch.cuda.FloatTensor but got torch.cuda.LongTensor


In addition, the + 0.5 offset is silently lost when the tensor is a LongTensor.
Test:
>>> torch.arange(0, 10) 
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> torch.arange(0, 10) + 0.5
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> torch.arange(0, 10.0) + 0.5
tensor([0.5000, 1.5000, 2.5000, 3.5000, 4.5000, 5.5000, 6.5000, 7.5000, 8.5000,
        9.5000])

To resolve this, I forced the arange bounds to float:
self.J = torch.arange(0, inputs.shape[1]+2.0).to(inputs.device) + 0.5
2020-05-04 17:52:58 -03:00
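
The fix in the Graves attention commit above relies on a float bound (`+2.0`) to make `torch.arange` return a float tensor. A minimal sketch of an alternative, assuming the same shape convention, is to request the dtype explicitly so the result does not depend on the bound's type or on the PyTorch version's type-promotion rules. The helper name `graves_positions` and the example input shape are hypothetical and not taken from the repository.

```python
import torch


def graves_positions(inputs: torch.Tensor) -> torch.Tensor:
    """Return positions 0.5, 1.5, ... as a float tensor on the input's device.

    Hypothetical helper for illustration; it mirrors the `self.J` expression
    quoted in the commit message but sets the dtype explicitly.
    """
    num_positions = inputs.shape[1] + 2
    # dtype=torch.float32 guarantees a float tensor, so the + 0.5 offset is
    # kept and later float arithmetic does not hit a Long/Float mismatch.
    positions = torch.arange(num_positions, dtype=torch.float32, device=inputs.device)
    return positions + 0.5


# Example input: (batch, time, features); the sizes are arbitrary.
inputs = torch.zeros(2, 8, 16)
j = graves_positions(inputs)
print(j.dtype, j.shape)
```

Creating the tensor directly on `inputs.device` also avoids the separate `.to(...)` copy, although either form should give the same values.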