Commit Graph

1352 Commits (2a071a75055b5b266fe982602585be35d2667ad0)

Author SHA1 Message Date
erogol 720c4690db update imports 2020-05-12 13:57:26 +02:00
erogol c0c3c6e331 train.py update imports for utils refactoring 2020-05-12 13:57:13 +02:00
erogol 2d9dcd60ba update imports for util refactoring 2020-05-12 13:56:49 +02:00
Eren Gölge 7292d303b9
Merge pull request #404 from Edresson/patch-3
Fix bug in Graves Attn
2020-05-07 17:37:25 +02:00
Eren Gölge 7a9247f91f
Merge pull request #403 from Edresson/patch-2
fix bug in bidirectional decoder train
2020-05-05 00:33:53 +02:00
Edresson Casanova cce13ee245
Fix bug in Graves Attn
On my machine, in Graves attention, the variable self.J (self.J = torch.arange(0, inputs.shape[1]+2).to(inputs.device) + 0.5) is a LongTensor, but it must be a float tensor, so I get the following error:

Traceback (most recent call last):
  File "train.py", line 704, in <module>
    main(args)
  File "train.py", line 619, in main
    global_step, epoch)
  File "train.py", line 170, in train
    text_input, text_lengths, mel_input, speaker_embeddings=speaker_embeddings)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/models/tacotron.py", line 121, in forward
    self.speaker_embeddings_projected)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/tacotron.py", line 435, in forward
    output, stop_token, attention = self.decode(inputs, mask)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/tacotron.py", line 367, in decode
    self.attention_rnn_hidden, inputs, self.processed_inputs, mask)
  File "/home/edresson/anaconda3/envs/TTS2/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/edresson/DD/TTS/voice-clonning/TTS/tts_namespace/TTS/layers/common_layers.py", line 180, in forward
    phi_t = g_t.unsqueeze(-1) * (1.0 / (1.0 + torch.sigmoid((mu_t.unsqueeze(-1) - j) / sig_t.unsqueeze(-1))))
RuntimeError: expected type torch.cuda.FloatTensor but got torch.cuda.LongTensor


In addition, the + 0.5 offset is lost if self.J is a LongTensor, because the result stays an integer tensor.
Test:
>>> torch.arange(0, 10) 
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> torch.arange(0, 10) + 0.5
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> torch.arange(0, 10.0) + 0.5
tensor([0.5000, 1.5000, 2.5000, 3.5000, 4.5000, 5.5000, 6.5000, 7.5000, 8.5000,
        9.5000])

To resolve this, I forced the arange range to float:
self.J = torch.arange(0, inputs.shape[1]+2.0).to(inputs.device) + 0.5
2020-05-04 17:52:58 -03:00
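A minimal sketch of a related way to guarantee a float tensor, assuming it is acceptable to request the dtype explicitly rather than pass a float end value. This is not the change made in the commit above; the helper name graves_positions and its num_inputs parameter are hypothetical and exist only for illustration.

```python
import torch


def graves_positions(num_inputs, device="cpu"):
    """Return the offsets 0.5, 1.5, ... as a float tensor (hypothetical helper)."""
    # Requesting the dtype explicitly avoids depending on how a given PyTorch
    # version infers the dtype of arange or handles int + float arithmetic.
    return torch.arange(0, num_inputs + 2, dtype=torch.float32, device=device) + 0.5


j = graves_positions(3)
print(j.dtype)
print(j)
```

Creating the tensor directly with device= also avoids the separate .to(inputs.device) copy, though whether that matters in practice depends on the surrounding code.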
Edresson Casanova e4e29f716e
fix bug in bidirectional decoder train 2020-05-04 17:39:35 -03:00
erogol 67420eeb86 more console printing formatting 2020-04-29 11:58:51 +02:00
erogol 091711459d remove redundant avg keeper 2020-04-29 11:58:26 +02:00
Eren Gölge f7b1cad9ee
Merge pull request #398 from PNRxA/dev
numpy to use CPU when using CUDA
2020-04-25 12:01:04 +02:00
PNRxA 61a1d59ac5 numpy to use CPU when using CUDA 2020-04-25 16:30:19 +10:00
erogol 373a682c08 logging fix 2020-04-24 18:51:15 +02:00
erogol 95385c8797 adding a dummy normalization file for testing mean-var norm 2020-04-23 16:06:12 +02:00
erogol 53b24625a7 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-04-23 15:48:05 +02:00
erogol f63bce89f6 pylint fix 2020-04-23 15:46:45 +02:00
erogol 3673cc1e30 passing reduced losses to loss dict 2020-04-23 15:46:11 +02:00
erogol 6e2c8c6537 update config.json 2020-04-23 14:37:12 +02:00
erogol d5093bf6fb checkpoint log 2020-04-23 14:24:38 +02:00
erogol 0e7ecca33f fancier and more flexible (self adapting to loss_dict) console logging. Fixing multi-gpu loss reduce 2020-04-23 14:14:09 +02:00
erogol 668a695763 bug fixes and consider the fmin fmax plotting specs 2020-04-09 12:28:52 +02:00
Eren Gölge 99be88c338
Merge pull request #388 from mittimithai/patch-3
Small fix for "Tacotron" use
2020-04-02 13:07:17 +02:00
erogol 3293d4e05f bug fix to use tacotron 2020-04-02 13:06:19 +02:00
mittimithai 6501369b0a
Small fix for "Tacotron" use
"Tacotron" won't work without this fix, since the linear spectrograms end up not getting computed
2020-04-01 11:57:53 -07:00
erogol 391dab45f0 update ExtractTTSSpecs notebook 2020-03-29 23:07:12 +02:00
erogol a678d684a2 bug fix 2020-03-27 14:17:03 +01:00
erogol d5efe040f7 compute stft paddings to correct wav and spec alignment especially for vocoder training 2020-03-26 21:10:37 +01:00
erogol 52c0b4e3e1 bug fix adding missing output for synthesis 2020-03-25 01:56:29 +01:00
erogol 9915d79173 return inputs with synthesis 2020-03-24 01:30:58 +01:00
erogol 745cc4e20a audio.py updates 2020-03-24 01:30:46 +01:00
erogol 20dd509430 change the way how ref_level_db is handled in audio.py 2020-03-17 18:24:05 +01:00
erogol 5223678a4b update audio tests more verbose 2020-03-17 18:22:55 +01:00
erogol b9df54adcd bug fix and check cleaner config field by comparing with the list of available functions 2020-03-17 15:05:13 +01:00
erogol fa795347a9 turkish cleaner and data preprocessor 2020-03-17 14:47:59 +01:00
erogol fd4e6d0245 Merge branch 'mean-var2' of https://github.com/erogol/TTS_experiments into mean-var2 2020-03-17 13:39:27 +01:00
erogol fd9f469ddc visualization updates wrt mean-var scaling 2020-03-17 13:38:57 +01:00
erogol 77f36b65b8 StandardScaler added 2020-03-17 13:38:57 +01:00
erogol cef9f06887 changes of audio.py for mean-var scaling 2020-03-17 13:38:57 +01:00
erogol 52b0dc39a6 testing mean-var scaling and updating test config 2020-03-17 13:38:57 +01:00
erogol 40cb4a53a6 update test attention notebooks 2020-03-17 13:38:57 +01:00
erogol 25bcbe2887 config update for mean-var scaling 2020-03-17 13:38:57 +01:00
erogol 568c743632 update compute_statistics.py 2020-03-17 13:38:57 +01:00
erogol 2a903888c1 Merge branch 'dev' of https://github.com/mozilla/TTS into dev 2020-03-17 13:38:38 +01:00
erogol 3bbeb43f57 visualization updates wrt mean-var scaling 2020-03-17 13:28:15 +01:00
erogol d7cf34ca34 StandardScaler added 2020-03-17 13:27:53 +01:00
erogol 92ebec01b1 changes of audio.py for mean-var scaling 2020-03-17 13:27:25 +01:00
erogol d1e9f8dff1 testing mean-var scaling and updating test config 2020-03-17 13:26:46 +01:00
erogol acccac72f5 update test attention notebooks 2020-03-17 13:24:30 +01:00
erogol 141797b6ae write model description to tensorboard 2020-03-17 13:23:25 +01:00
erogol 0ee1dd54a3 config update for mean-var scaling 2020-03-17 12:44:18 +01:00
erogol 069c8e4315 update compute_statistics.py 2020-03-17 12:43:38 +01:00