"Five hours seven and a half minutes" was parsing as 5.5. This is
resolved. Multiple fractions/decimals still cause problems, e.g.
convert_words_to_numbers("seven and a half and nine and a half")
Out[5]: '7 and a 0.5 and 9 and a 0.5'
This is in support of issues-1959.
A small bug caused things like "two hundred twenty" to return only the
"hundred tenty" for the text. This has been fixed.
extract_numbers_with_text was updated to deal with the new return types
of the functions it depends on. Specifically, it accounts for the start
and end index values.
This is in support of issues-1959.
Previously it was assumed that the orgiginal text would be enough to
determine where in a string a number should go, however, in some
scenarios, that does not work, and results in the wrong values being
parsed.
A different, and smarter approach is being taken now, in which the
original string is initially split into a list of tuples of
(index, word) where index is the index of the word within the string.
All subsequent processing is done on these tuples, meaning we always
know exactly where the words were in the orginal string. This should
make text replacement perfect, as we can always sub out the exact,
correct words, based on their indicies.
extract_number_with_text_en now returns the number parsed, the text that
represents the number, the start index, and the end index.
Things are not yet working perfectly. Here is roughly the current state
of the world:
from mycroft.util.lang.parse_en import *
extract_number_with_text_en("this is some two hundred thousand twenty
two hours")
Out[3]: (200022, 'hundred thousand twenty two', 4, 7)
extract_number_with_text_en("this is some twenty two hours")
Out[4]: (22, 'twenty two', 3, 4)
extract_number_with_text_en("this is some twenty hours")
Out[5]: (20, 'twenty', 3, 3)
extract_number_with_text_en("this is some two and a half hours")
Out[6]: (2, 'two', 3, 3)
extract_number_with_text_en("this is some two point five hours")
Out[7]: (2, 'two', 3, 3)
The list of tuples is a bit of a hassle to deal with. In a future
commite the will be replaced with dictionaries, or even better, Token
objects, that contain the word and it's index. This would make the
code easier to reason about (removing lots things like words[0][1]
which has no meaning without deep understanding of the code).
This is in support of issues-1959.
Articles (a, an, the) that appeared immediately before the number were
included in the returned text. This fixes those issues.
The solution isn't super clean - it craetes a new function to wrap the
old one (unavoidable, since the articles can be needed for fractions),
and splits the returned string, the strips leading artivles.
Since strings are immutable, this is probably not super efficient. Might
be better to eventually use lists as much as possible, and only create a
string at the end (though the lists will come with their own problems,
so that could turn out to be a wash). In any case, this works for now.
Issues fixed:
Lists, e.g. "some words one two three" would return (3, "one two three")
Negaitve words were not included in output, e.g. "negative five" would
return (-5, "five").
This is in support of issues-1959.
Any articles (a, an, the) appearing before the number would be included
in the text, e.g. "set a timer for eight minutes" returned:
(8, 'a eight')
This fix clears the number words list if no number words have been
found. Note that articles can't simple be filtered, as they are
often a part of the numebr (e.g. three and a half).
This is in support of issues-1959.
Methods implemented include:
extract_number_with_text
extract_numbers_with_text
convert_words_to_numbers
extract_duration
This is in support of issues-1959. This continues the work of
returning the relevant text that corresponds to a number
parsed from a string.
This is in support of issues-1959. This begins the work of returning the
relevant text that corresponds to a number parsed from a string. Here's
a sample showing the basic functionality/state of the world (showing
some of the remaining cases to handle)
>>> from mycroft.util.lang.parse_en import *
>>> extractnumber_en_with_text("three hours twenty minutes")
(3, 'three')
>>> extractnumber_en_with_text("twenty minutes")
(20, 'twenty')
>>> extractnumber_en_with_text("twenty five minutes")
(25, 'twenty five')
>>> extractnumber_en_with_text("two hundred twenty five minutes")
(225, 'two hundred twenty five')
>>> extractnumber_en_with_text("three and a half minutes")
(3.5, 'three and a half minutes')
>>> extractnumber_en_with_text("three point five minutes")
(3.5, 'three point five minutes')
==== Tech Notes ====
Checks if the word being parsed is relevant to number parsing. If it is
not, and we have already found number words, we return what we have. If
it is, we add it to a list of words representing the current number
being parsed.
==== Documentation Notes ====
The old implementation of extractnumber_en seems to generally return the
last number in the text. This change will cause the first number to be
returned.
This is part of a refactor of extractnumber_en, with the ultimate
goal of making it easier to maintain and extend (should also
improve perf). This is in support of issues-1959.
This is part of a refactor of extractnumber_en, with the ultimate
goal of making it easier to maintain and extend (should also
improve perf). This is in support of issues-1959.
All tests (minus extract_duration, which has not yet been implemented)
are passing at this stage.
This is part of a refactor of extractnumber_en, with the ultimate
goal of making it easier to maintain and extend (should also
improve perf). This is in support of issues-1959.
All tests (minus extract_duration, which has not yet been implemented)
are passing at this stage.
Audiotest now prints the device and samplerate used as well as the commandline used to playback the audio for easier debugging if things doesn't work as intended.
This is part of a refactor of extractnumber_en, with the ultimate goal
of making it easier to maintain and extend (should also improve perf).
This is in support of issues-1959.
Begins a refactor of extractnumber_en, with the ultimate goal of making
it easier to maintain and extend (should also improve perf). This is
in support of issues-1959.
* Fix mimic2 negative numbers
Make the regex extracting numbers also match negative numbers when preparsing phrases sent to the mimic2 service
* Update pronounce_number to use "minus" for negatives
After discussion in the chat it was suggested to use "minus" for negatives as default.
When scientific notation is used the term "negative " is still used.
add support for decades, centuries, millemniums
add support for "within the hour", "in a second/minute",
add support for "a couple time_unit" and "a couple of time_unit"
Fix trillion being saved with wrong number (10e10 instead of 10e12)
==== Fixed Issues ====
1718
==== Tech Notes ====
NONE - explain new algorithms in detail, tool changes, etc.
==== Documentation Notes ====
NONE - description of a new feature or notes on behavior changes
==== Localization Notes ====
NONE - point to new strings, language specific functions, etc.
==== Environment Notes ====
NONE - new package requirements, new files being written to disk, etc.
==== Protocol Notes ====
NONE - message types added or changed, new signals, APIs, etc.
==== Fixed Issues ====
==== Tech Notes ====
NONE - explain new algorithms in detail, tool changes, etc.
==== Documentation Notes ====
NONE - description of a new feature or notes on behavior changes
==== Localization Notes ====
NONE - point to new strings, language specific functions, etc.
==== Environment Notes ====
NONE - new package requirements, new files being written to disk, etc.
==== Protocol Notes ====
NONE - message types added or changed, new signals, APIs, etc.
* Make the combo lock handle multiple users
On the Mark-1 the locking would fail since the first time the lock is
accessed the process is run under root (script checking the enclosure
version). This caused the locking to fail since the normal mycroft
processes got an access error (lock-file owned by root)
This change creates lock-files and sets the permissions to 0o777
if the file doesn't exist so it's accessible for all users
Also added an Apache license header
The test for today/tomorrow/yesterday was only looking at the day,
not the month and year. So on July 14, 2018 it would claim that
the date June 14 2018 and July 14 2020 are all "today".
Now the full date is used in the comparison
Adds the mycroft.util.combo_lock ComboLock class for interprocess/Thread
lock.
Loading updated to be more reliable:
- Flush and sync file
- wait 1.2 seconds before load
Split the logic from the locking so the lock can be avoided when calling
update from save or load from get.
The mycroft.util.time.now_utc() was returning a naive datetime object, potentially causing
issues in skills running on instances that aren't using UTC system-wide. This didn't impact
Picroft or Mark 1, but Github installs would experience issues with skills such as the Alarm.
Added functions for the spanish parser
* Added spanish function calls in parser.py for extractnumber_es
and extract_datetime_es
* Added spanish functions in util/paser_es.py for extractnumber_es
extract_datetime_es and added some missing numbers
Merged in changes from #1804
* Fix ambiguous time handling
In certain cases the ambiguous time handling skipped a day forward. This updates the logic to handle a bit better.
* Adds a test for the ambiguous time
* Remove references to timeStr
timeStr was never set and the logic that used it would never activate.
* Remove rename of currentDate
* Add extract_datetime parameter default_time
If a time is not found in the input string the time will be set from the
datetime/time object passed in as the default_time argument. If None the
time will be Midnight as previously.
* Add MycroftSkill.find_resource() that locates a localization resource file
from either under the old vocab/regex/dialog folder, or under any folder in
the new 'locale' folder. The locale folder unifies the three different
folders and allows arbitrary subfolders underneath it.
* MycroftSkill.speak_dialog() will now speak the entry name if no dialog file
is found. Periods are replaced with spaces, so
```self.speak_dialog("this.is.a.test.")``` would return "this is a test" if
no file named this.is.a.test.dialog is found.
* Remove MycroftSkills.vocab_dir value
* Minor edits to several docstrings
This is reimplementation of #1649 which became divergent.
## Description
Adds a Ogg123Service and a play_ogg exactly like Mpg123Service and play_mp3
## How to test
I have a skill for a podcast which does not have an mp3 feed:
https://github.com/joshuacox/skill-GNUworldOrder
## Contributor license agreement signed?
signed by @joshuacox
Response formatting for German language ordinals depending on cases/prepositions for dates
"am 1. März" -> "am ersten März" (on the first of March)
"der 1. März" -> "der erste März" (the first of March)
"1. März" -> "erster März" (first of March)
Response formatting for mathematical results
"10 ^ 2" -> "10 hoch 2" (ten to the power of two)
Can be tested via the corresponding test_format_de or by using wolfram alpha skill:
"Was ist die Fläche von Canada"
"Wann ist George Washington geboren"
Switched from "AM" to "a.m.", and "PM" to "p.m". These are pronounced better
in Mimic, and are also the more normally accepted ways of writing the
abbreviations for ante meridiem and post meridiem (at least according to the
AP and Chicago Style guides).
Many cases that were missed in the unittests for extract_datetime()
from the original source. Restored those tests and made code
adjustments to support them all.
Also adding the mycroft.util.time module. This supports:
* mycroft.util.time.default_timezone()
Returns the user-configured timezone based on location
* mycroft.util.time.now_utc()
Returns the time in UTC
* mycroft.util.time.now_local()
Returns the time in the user's timezone
* mycroft.util.time.to_utc()
Converts to UTC
* mycroft.util.time.to_local()
Converts to user's timezone
NOTE: Several skills should be updated to use these now.
==== Fixed Issues ====
Several issues for skills regarding parsing of "today"
==== Documentation Notes ====
Note the new module: mycroft.util.time
==== Localization Notes ====
Localized versions of extract_datetime() likely need to be
updated, as most were based on the original English implementation
* Restore extractdatetime to return False
- Restore extractdatetime to return False if no time was found
- Add tests to make sure this is true
- Add a extract_datetime function to keep coherency with the rest of the functions. (the old extractdatetime still exists for compatibility)
- Update documentation to match
* Minor corrections to docstrings.
The default socket timeout was changed when checking the connectivity through connecting to a remote server (8.8.8.8). This has now been updated to only change the timeout for the socket used for the connection.
Lots of minor fixes including, sorting dicts, making ints of strings,
MagicMock file spec and some other things
A couple of issues in the mycroft-core code base were identified and
fixed. Most notably the incorrect version check for python three when
adding basestring.
Update .travis.yml
This sends a ctrl+c signal to each process which will allow code to exit properly by handling KeyboardInterrupt
Other notable changes:
- create_daemon method used to clean up create daemon threads
- create_echo_function used to reduce code duplication with messagebus
echo functions
- wait_for_exit_signal used to wait for ctrl+c (SIGINT)
- reset_sigint_handler used to ensure SIGINT will raise KeyboardInterrupt
Add support for:
* mycroft.util.format.nice_time()
* mycroft.util.format.prounce_number()
* implemented unittests for above
Also renamed the helper method convert_number() to
_convert_to_mixed_fraction()
Move the language specific functions and constants into separate files.
This will avoid many unnecessary conflicts due to involuntary encoding
changes.
Add Python 2/3 compatibility
==== Tech Notes ====
This allows the main bus, skills and cli to be run in both python 2.7 and
3.5+.
Mainly trivial changes
- syntax for exceptions
- logic for importing correct Queue module
- .iteritems -> future.utils.iteritems when accessing dicts key value
pairs
* Allow audio service to be run in python 3
* Make speech client work with python 3
* Importing of Queue version dependent
* Exception syntax corrected
* Creating sound buffer is version dependant
- Adapt context use range from builtins
- Use compatible next() instead of .next() when walking the skill
directory
* Make CLI Python 3 Compatible
- Use compatible BytesIO instead of StringsIO
- Open files as text instead of binary
- Make sure integer divisions are used
* Make messagebus send compatible
* Fix failing travis
Re-add future 0.16.0
* Make string checks compatible
* basestring doesn't exist in python 3 so it's imported from the "past"
* Fix latest compatibility issues in speech client
- handle urllib
- handle encoding before calling md5
* Make Api.build_json() python 2/3 compatible
- "tonight" is re-interpreted as PM
- check is performed to check if previous word exist before accessing it
to handle sentences containing only a simple date
In code like this:
self.speak_dialog("something")
mycroft.audio.wait_while_speaking()
It was possible that the speaking of "something" would take longer to
start than the 0.1 seconds that was built into the wait_while_speaking().
The definition of this behavior is slightly fuzzy, but this is definitely
a case where the expectation is that previous request for speech would
start and complete. For now, I have just bumped the minimum wait to
0.3 seconds.
In the long run we might consider tracking specific speak requests and
generating a notification when that request has been serviced. Then the
skill could automatically hold off until that request has been serviced.
But the basic skill code won't have to change to make this happen, so
this additional sleep is adequate for today.
Also snuck in a minor change to a comment.
PR 1049 introduced several cosmetic PEP8 errors that were easily fixed.
Additionally there are unittests that include non-ASCII characters which are
failing. As Pt-PT support is a work-in-progress, I just commented them out
with TODOs next to them.
==== Fixed Issues ====
#1141
==== Tech Notes ====
Curate cache now only removes cache files if the diskspace is below the
set percentage AND if below a set amount of free disk space
==== Tech Notes ====
Since the voice is a quite large download stalling download is a real
possibility. Using wget allows for resume and retry of download in a
simple way.
Fix log autocomplete
==== Tech Notes ====
Due to the use of setattr(), IDEs like PyCharm couldn't use autocomplete on the LOG class. This unrolls the loop so that the logging attributes appear in the autocomplete menu.
This commit officially switches the mycroft-core repository from
GPLv3.0 licensing to Apache 2.0. All dependencies on GPL'ed code
have been removed and we have contacted all previous contributors
with still-existing code in the repository to agree to this change.
Going forward, all contributors will sign a Contributor License
Agreement (CLA) by visiting https://mycroft.ai/cla, then they will
be included in the Mycroft Project's overall Contributor list,
found at: https://github.com/MycroftAI/contributors. This cleanly
protects the project, the contributor and all who use the technology
to build upon.
Futher discussion can be found at this blog post:
https://mycroft.ai/blog/right-license/
This commit also removes all __author__="" from the code. These
lines are painful to maintain and the etiquette surrounding their
maintainence is unclear. Do you remove a name from the list if the
last line of code the wrote gets replaced? Etc. Now all
contributors are publicly acknowledged in the aforementioned repo,
and actual authorship is maintained by Github in a much more
effective and elegant way!
Finally, a few references to "Mycroft AI" were changed to the correct
legal entity name "Mycroft AI Inc."
==== Fixed Issues ====
#403 Update License.md and file headers to Apache 2.0
#400 Update LICENSE.md
==== Documentation Notes ====
Deprecated the ScheduledSkill and ScheduledCRUDSkill classes.
These capabilities have been superceded by the more flexible MycroftSkill
class methods schedule_event(), schedule_repeating_event(), update_event(),
and cancel_event().
==== Tech Notes ====
Downloader() is a Thread subclass starting and handling a download. The
property done and status is set when the download is completed or
failed.
download manager ensuring continuation of started downloads (per process)
complete_action, a function to be called after download is complete can be
supplied to the downloader
Some exception code was attempting to use the LOGGER, but one had
not been created for this file. So when the exception occurred
it caused a crash.
NOTE: We need to standardize on log/LOG/LOGGER!
Adds the ExtractDateTime parse function from Christopher. When imported from mycroft/util/parse.py, it'll take a sentence like "What's the weather like 5 weeks from next Wednesday?" and will extract a python datetime object for that date.
* Added requirements.txt change for importing dateutil
* BUGFIX: The big bug was calling is_paired() during wake_word_in_audio(). When not paired, that call hit the server, taking about a second. Since it happened multiple times a second, the audio buffers got backed up hugely. This resulted in weird behavior later as the buffers get cleared out.
* Added mycroft.api.has_been_paired(), which just looks for the pairing key (it does not validate it is still active with the server, like is_paired())
* The enclosure now checks for internet connectivity and kicks off the wifisetup process, not the wifisetup client itself.
* During the "onboarding" process, the microphone is muted using the new "mycroft.mic.mute" message. After pairing completes, the "mycroft.mic.unmute" is expected to be sent from the pairing skill. Unmuting again after a re-pairing is harmless.
* mute_and_speak() is smart enough to not unmute itself when complete if muted before
* util.check_for_signal() now accepts -1 as the lifetime. This means it never times out.
* util.stop_speaking() is more intelligent about shutting down the spoken text (including text that has been split at periods) and visemes
* The wifi setup no longer stops after 5 minutes unless already paired (i.e. still onboarding)
* The mouth resets on wifisetup stop (clearing the scrolling home.mycroft.ai)
* Changed several ConfigurationManager.get() calls to ConfigurationManager.instance(). Exactly the same, but .instance() is clearer/preferred.
* Added a mycroft.util.stop_speaking() method. Not perfect, but works for now and can be replaced later when AudioManager is in place.
A forthcoming change to the pairing skill will utilize the stop_speaking method.
* Added a mycroft.api.is_paired() method
* Added mycroft.util.is_speaking and mycroft.util.wait_while_speaking() methods
* RESET now waits for the spoken notice to complete
* Stopped the "Checking for updates" and "Skills updated" prompts (commented out for now, probably will eliminate)
* Wifi setup filters out hidden ("x00") networks
* Visemes should keep up better if they get behind (will skip)
* Mimic is now searched for on the users path
* Onboarding process:
- wifi setup starts automatically
- User is walked through the process
- wake word and button pressing are ignored
- At end, a short tutorial is given
This addresses this in several ways:
* Created mechanism to load 'commented' JSON (using '//' or '#' comments on a single line)
* Embedded comments into the mycroft.conf, indicating use, legal values, and where they get overridden
* Create ConfigurationManager.instance() static method to replace ConfigurationManager.get(). This produces more readable code like:
ConfigurationManager.instance().get("value") instead of ConfigurationManager.get().get("value")
* Made _ConfigurationListener 'private'
* docstring'ed things