From 9b041f958bf72160f6c3a9496ea9821223f6b70e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Eren=20G=C3=B6lge?=
Date: Sun, 2 Jul 2023 13:09:40 +0200
Subject: [PATCH] Update docs and credits

---
 README.md                  | 5 +++--
 docs/source/models/bark.md | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 9cefb002..b55edba7 100644
--- a/README.md
+++ b/README.md
@@ -106,8 +106,9 @@ Underlined "TTS*" and "Judy*" are 🐸TTS models
 
 ### End-to-End Models
 - VITS: [paper](https://arxiv.org/pdf/2106.06103)
-- YourTTS: [paper](https://arxiv.org/abs/2112.02418)
-- Tortoise: [orig. repo](https://github.com/neonbjb/tortoise-tts)
+- 🐸 YourTTS: [paper](https://arxiv.org/abs/2112.02418)
+- 🐢 Tortoise: [orig. repo](https://github.com/neonbjb/tortoise-tts)
+- 🐶 Bark: [orig. repo](https://github.com/suno-ai/bark)
 
 ### Attention Methods
 - Guided Attention: [paper](https://arxiv.org/abs/1710.08969)
diff --git a/docs/source/models/bark.md b/docs/source/models/bark.md
index d07cca3f..978d793a 100644
--- a/docs/source/models/bark.md
+++ b/docs/source/models/bark.md
@@ -6,6 +6,7 @@ It is architecturally very similar to Google's [AudioLM](https://arxiv.org/abs/2
 
 ## Acknowledgements
 - 👑[Suno-AI](https://www.suno.ai/) for training and open-sourcing this model.
+- 👑[gitmylo](https://github.com/gitmylo) for finding [the solution](https://github.com/gitmylo/bark-voice-cloning-HuBERT-quantizer/) to the semantic token generation for voice clones and finetunes.
 - 👑[serp-ai](https://github.com/serp-ai/bark-with-voice-clone) for controlled voice cloning.