mirror of https://gitlab.com/christosangel/sapo
update README.md
parent
6f7df5492c
commit
24f3dc7a65
30
README.md
@ -22,37 +22,44 @@ https://dirk.net/2021/10/31/tts-fix-max-decoder-steps/
* Likewise, instead of the _celluloid_ audio player, you can use any other player you prefer, such as _xplayer, mplayer, smplayer, vlc, mpv_, etc. Just make sure to substitute celluloid with your preferred player in line 211 of __Sapo.sh__.
* The same applies to _Audacity_: any other preferred wave editor can be substituted in line 222 of Sapo.sh. _Audacity_ is not a strict dependency of the script, but having a wave editor installed can be useful when fixing potential errors, so that option is offered.

---
### DETECTING ERRORS
### I. CLATTER IN AUDIO OUTPUT
Sometimes the output wav file of a text file line is longer than necessary, containing hissing sounds, unrecognisable utterances and clatter at the end of it.

In order to detect which wav files have this problem, the ratio of _character count of line / duration of audio file_ is calculated. This ratio gives a rough estimate of which lines were rendered with errors.

The lines that _possibly_ present this problem are written to the error.tsv file that is generated. After all the lines have been rendered,

* the lines listed in the tsv file get re-rendered. Many times this alone is enough.
* After that, each line can be examined one by one. The user can
1. __Play__ the respective audio file.

2. __Re-render__ the line after making minor changes (e.g. putting a full stop at the end of the line).

3. __Trim the clutter__ at the end of the audio file, i.e. anything that comes after half a second of detected silence.

4. __Split__ the line text and render it in two batches, which are concatenated afterwards (useful for long sentences).

5. __Edit__ the respective audio file with a wave editor (e.g. _Audacity_).

6. __Remove__ the respective audio file directly.

7. By hitting __OK__, the user can accept the audio file, either as is or after correcting it, and proceed to the next one.

---
![5.png](screenshots/5.png)

After that, the audio files from all the lines will be concatenated into one.

---
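
The ratio check described in section I can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of Sapo.sh: the function names `is_suspect` and `scan_lines`, the file naming (`N.wav` for line N), the tsv layout and the threshold value are all hypothetical. It also assumes that clatter makes the audio too long, so a _low_ characters-per-second ratio marks a line as suspect.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds (exit status 0) when a line looks suspect,
# i.e. when its characters-per-second ratio is below the given threshold.
# Arguments: character count, audio duration in seconds, threshold.
is_suspect() {
  awk -v c="$1" -v d="$2" -v t="$3" 'BEGIN { exit !(d > 0 && c / d < t) }'
}

# Hypothetical scan: line N of the text file is assumed to be rendered as N.wav.
# Suspect lines are appended to error.tsv as "line number<TAB>line text".
scan_lines() {
  local text_file=$1 threshold=$2 n=0 line duration
  while IFS= read -r line; do
    n=$((n + 1))
    [ -f "$n.wav" ] || continue
    duration=$(soxi -D "$n.wav")   # duration in seconds; requires SoX
    if is_suspect "${#line}" "$duration" "$threshold"; then
      printf '%s\t%s\n' "$n" "$line" >> error.tsv
    fi
  done < "$text_file"
}
```

For instance, `scan_lines book.txt 8` would flag every line of a (hypothetical) `book.txt` whose audio holds fewer than 8 characters per second. A suitable threshold depends on the voice model and has to be found by trial.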
### II. SED SCRIPT
@ -61,8 +68,11 @@ sapofonetix.sed is a script that substitutes words that get mispelled with othe
will substitute the word _biscuit_ (or the capitalized _Biscuit_) with the word _biskeet_ (_Biskeet_), whose pronunciation sounds more proper.

The list of words is growing as the script gets used more; this will be an ongoing task:

___<u>feel free to chime in!</u>___

___
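
A substitution of the kind described in section II could look roughly like the sketch below. It is an illustration only: the function name `fix_pronunciation` and the exact expression are assumptions, not the actual contents of sapofonetix.sed. The capture group keeps the case of the first letter, so both _biscuit_ and _Biscuit_ are covered by one rule.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper around a single sed rule: replaces "biscuit" with
# "biskeet" and keeps the case of the first letter via the \1 back-reference.
fix_pronunciation() {
  sed -e 's/\([Bb]\)iscuit/\1iskeet/g'
}

# Usage: pipe text through the function before it is sent to the TTS engine.
printf 'Biscuit and biscuits\n' | fix_pronunciation
```

New words can be contributed in the same way, one `s/.../.../g` rule per word.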
### SCREENSHOTS