* Added GLOB capability to entityfilter and every place that uses it. All existing tests are passing
* added tests for components affected by glob change
* fixed flake8 error
* mocking the correct listener
* mocking correct bus method in azure test
* tests passing in 3.7 and 3.8
* fixed formatting issue from rebase/conflict
* Checking against glob patterns in more performant way
* perf improvements and reverted unnecessarily adjusted tests
* added new benchmark test around filters
* no longer using get with default in entityfilter
* changed filter name and removed logbook from filter benchmark
* simplified benchmark tests from feedback
* fixed apache tests and returned include exclude schemas to normal
* fixed azure event hub tests to properly go through component logic
* fixed azure test and clean up for other tests
* renaming test files to match standard
* merged mqtt statestream test changes with base
* removed dependency on recorder filter schema from history
* fixed recorder tests after merge and a bunch of lint errors
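A minimal sketch of the glob matching described above, assuming the patterns are combined into one compiled regular expression so that each entity is checked with a single match call. The names `compile_globs` and `make_glob_filter` are hypothetical, and the include/exclude precedence here is simplified compared with the real entityfilter helper.

```python
import fnmatch
import re
from typing import Callable, Iterable, Optional, Pattern


def compile_globs(globs: Iterable[str]) -> Optional[Pattern[str]]:
    """Combine glob patterns into one compiled regex, or None if there are none."""
    translated = [fnmatch.translate(glob) for glob in globs]
    if not translated:
        return None
    # One alternation is matched once per entity instead of once per glob.
    return re.compile("|".join(translated))


def make_glob_filter(
    include_globs: Iterable[str], exclude_globs: Iterable[str]
) -> Callable[[str], bool]:
    """Return a function that tells whether an entity_id passes the filter."""
    include_re = compile_globs(include_globs)
    exclude_re = compile_globs(exclude_globs)

    def entity_allowed(entity_id: str) -> bool:
        # Simplified assumption: an exclude match always wins.
        if exclude_re is not None and exclude_re.match(entity_id):
            return False
        # With include globs present, only matching entities pass.
        if include_re is not None:
            return bool(include_re.match(entity_id))
        # No include globs: everything not excluded passes.
        return True

    return entity_allowed


battery_filter = make_glob_filter(["sensor.*_battery"], ["sensor.test_*"])
```

Compiling once when the filter is built keeps the per-entity check cheap, which matters because the filter runs for every recorded state change.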
Cleanup indexes as >50% of the db size was indexes,
many of them unused in any current query
Logbook search was having to filter event_types without
an index:
Created ix_events_event_type_time_fired
Dropped ix_events_event_type
States had a redundant key on a composite index:
Dropped ix_states_entity_id
It's unused since we have ix_states_entity_id_last_updated
De-duplicate storage of context in states as
it's always stored in events and can be found
by joining the state on the event_id.
Dropped ix_states_context_id
Dropped ix_states_context_parent_id
Dropped ix_states_context_user_id
After schema v9:
STATES............................................ 10186 40.9%
EVENTS............................................ 5502 22.1%
IX_STATES_ENTITY_ID_LAST_UPDATED.................. 2177 8.7%
IX_EVENTS_EVENT_TYPE_TIME_FIRED................... 1910 7.7%
IX_EVENTS_CONTEXT_ID.............................. 1592 6.4%
IX_EVENTS_TIME_FIRED.............................. 1383 5.6%
IX_STATES_LAST_UPDATED............................ 1079 4.3%
IX_STATES_EVENT_ID................................ 375 1.5%
IX_EVENTS_CONTEXT_PARENT_ID....................... 347 1.4%
IX_EVENTS_CONTEXT_USER_ID......................... 346 1.4%
IX_RECORDER_RUNS_START_END........................ 1 0.004%
RECORDER_RUNS..................................... 1 0.004%
SCHEMA_CHANGES.................................... 1 0.004%
SQLITE_MASTER..................................... 1 0.004%
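The events index swap can be sketched with the standard-library `sqlite3` module. This is an illustration only: the table is reduced to three columns, and the column order `(event_type, time_fired)` is an assumption inferred from the index name. The real migration goes through the recorder's migration code, not raw SQL like this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the events table with the old single-column index.
conn.executescript(
    """
    CREATE TABLE events (
        event_id INTEGER PRIMARY KEY,
        event_type TEXT,
        time_fired TEXT
    );
    CREATE INDEX ix_events_event_type ON events (event_type);
    """
)

# Add the composite index first, then drop the index it makes redundant.
conn.executescript(
    """
    CREATE INDEX ix_events_event_type_time_fired
        ON events (event_type, time_fired);
    DROP INDEX ix_events_event_type;
    """
)

# Collect the names of the indexes that remain on the table.
index_names = {
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND tbl_name = 'events'"
    )
}
```

A composite index that leads with `event_type` can still serve queries that filter on `event_type` alone, which is why the single-column index becomes redundant.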
* adj
* time_fired_isoformat
* remove unused code
* tests for processing timestamps
* restore missing import lost in merge conflict
* test for None case
* Add old_state_id to states, remove old/new state data from events since it can now be found by a join
* remove state lookup on restart
* Ensure old_state is set for existing states
* Improve history api performance
A new option "minimal_response" reduces the amount of data
sent between the first and last history states to only the
"last_changed" and "state" fields.
Calling to_native is now avoided where possible and only
done at the end for rows that will be returned in the response.
Testing:
History API response time for 1 day
Average of 10 runs with minimal_response
Before: 19.89s (content length: 3427428)
After: 8.44s (content length: 592199)
When sending the `minimal_response` option, the history
API now returns a JSON response similar to the following
for an entity:
```
[{
"attributes": {--TRUNCATED--},
"context": {--TRUNCATED--},
"entity_id": "binary_sensor.powerwall_status",
"last_changed": "2020-05-18T23:20:03.213000+00:00",
"last_updated": "2020-05-18T23:20:03.213000+00:00",
"state": "on"
},
...
{
"last_changed": "2020-05-19T00:41:08Z",
"state": "unavailable"
},
...
{
"attributes": {--TRUNCATED--},
"context": {--TRUNCATED--},
"entity_id": "binary_sensor.powerwall_status",
"last_changed": "2020-05-19T00:42:08.069698+00:00",
"last_updated": "2020-05-19T00:42:08.069698+00:00",
"state": "on"
}]
```
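The shape of that response can be sketched as a small function, assuming each state is already a plain dict. The function name `minimal_response` is hypothetical and only mirrors the option name. The actual implementation works on database rows and avoids building full state objects in the first place.

```python
from typing import Any, Dict, List


def minimal_response(states: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first and last state whole; trim the states in between."""
    if len(states) <= 2:
        # Nothing sits between the first and last state, so nothing to trim.
        return list(states)
    middle = [
        {"last_changed": state["last_changed"], "state": state["state"]}
        for state in states[1:-1]
    ]
    return [states[0], *middle, states[-1]]
```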
* Remove impossible state check
* Remove another impossible state check
* Update homeassistant/components/history/__init__.py
Co-authored-by: Paulus Schoutsen <paulus@home-assistant.io>
* Reorder to save some indent per review
* Make query response make sense with to_native=False
* Update test for 00:00 to Z change
* Update homeassistant/components/recorder/models.py
Co-authored-by: Paulus Schoutsen <paulus@home-assistant.io>
The database fields are timezoned via DateTime(timezone=True), so the
default value should be timezoned too. When using cockroachdb this is
fatal and results in the recorder crashing.
* Avoid a context switch in the history api
The history api was creating a job to fetch the
states and another job to convert the states to
json. This can be done in a single job which
decreases the overhead of the operation.
* Ensure there is only one sqlalchemy session created per history
query.
Most queries created three sqlalchemy sessions which was
especially slow with sqlite since it opens and closes the
database.
In testing the UI is noticeably faster at generating history
graphs for entities.
* Add additional coverage
* pass hass first to _states_to_json and _get_significant_states
Some providers have set their wait_timeout to 60s
in order to pack as many users as they can on a machine.
The mysql default is 28800 seconds (8 hours)
Since mysql connection build and tear down is relatively
expensive, we want to avoid being disconnected.
We now accommodate this scenario with the following:
1. Raise the mysql session wait_timeout to 28800 when we connect
2. The event session now does a 30 second keep alive to
ensure the connection stays open
If the database server disconnects there were exceptions
that were not trapped which would cause the recorder event
loop to collapse. As we never want the loop to end
we trap exceptions broadly.
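A rough sketch of the keep-alive and the broad exception handling, assuming events arrive on a `queue.Queue`. Every name here (`run_event_loop`, `handle_event`, `send_keepalive`, `STOP`) is hypothetical; the recorder's real loop is structured differently.

```python
import logging
import queue
from typing import Any, Callable

_LOGGER = logging.getLogger(__name__)

KEEPALIVE_TIME = 30  # seconds of idle time before a keep-alive is sent
STOP = object()  # sentinel that ends the loop


def run_event_loop(
    events: "queue.Queue[Any]",
    handle_event: Callable[[Any], None],
    send_keepalive: Callable[[], None],
    keepalive_time: float = KEEPALIVE_TIME,
) -> None:
    """Process events until STOP, pinging the database while idle."""
    while True:
        try:
            event = events.get(timeout=keepalive_time)
        except queue.Empty:
            event = None  # idle for keepalive_time seconds
        if event is STOP:
            return
        try:
            if event is None:
                send_keepalive()  # e.g. a cheap "SELECT 1"
            else:
                handle_event(event)
        except Exception:  # broad on purpose: the loop must never die
            _LOGGER.exception("Error while processing event %s", event)
```

Catching `Exception` this broadly is normally discouraged, but here a single failed write is cheaper than losing the whole recording loop.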
Fix a bug in the new commit interval setting which caused
it to always commit after 1s
* Add a commit interval setting to recorder
* Make the default every 1s instead of immediate
* See attached py-spy flamegraphs for why 1s
* This avoids disk thrashing during event storms
* Make Home Assistant significantly more responsive on busy systems
* remove debug
* Add commit forces for tests that expect commits to be immediate
* make sure _trigger_db_commit is in the right place (all effective "wait_recording_done" calls)
* De-duplicate wait_recording_done code
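The commit interval idea can be sketched as follows: events are buffered and written together once the interval has elapsed, instead of one commit per event. The class `BatchedCommitter` and its injectable `clock` parameter are hypothetical and exist only to make the behaviour easy to follow and test.

```python
import time
from typing import Callable, List


class BatchedCommitter:
    """Buffer events and commit them at most once per commit_interval."""

    def __init__(
        self,
        commit: Callable[[List[str]], None],
        commit_interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._commit = commit
        self._commit_interval = commit_interval
        self._clock = clock
        self._pending: List[str] = []
        self._last_commit = clock()

    def add(self, event: str) -> None:
        """Queue an event; commit only if the interval has elapsed."""
        self._pending.append(event)
        if self._clock() - self._last_commit >= self._commit_interval:
            self.flush()

    def flush(self) -> None:
        """Force a commit, as tests that expect immediate commits would."""
        if self._pending:
            self._commit(self._pending)
            self._pending = []
        self._last_commit = self._clock()
```

With `commit_interval=0` every `add` commits immediately, which matches the behaviour before the setting existed.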
* [recorder] Use orjson to parse json faster
* Remove from http manifest
* Bump to orjson 2.5.1
* Empty commit to trigger CI
Co-authored-by: Paulus Schoutsen <paulus@home-assistant.io>
* added recorder vars db_max_retries and db_retry_wait
* fixed test_recorder_setup_failure
It failed because it was missing the two new variables. I simply added these with default values.
* fixed syntax error in test_recorder_setup_failure
* fixed formatting error in test_init_py for recorder component
* fixed typo in test case
* Updated the way the default keys for db_max_retries and db_retry_wait are set
Implemented based on suggestions from @springstan
* Updated config_schema call to adhere to Black
* changed conf.get to conf[dict] for var retrieval
* removed 2 blank lines
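How the two variables interact can be sketched as a retry loop. The function `connect_with_retries`, the `ConnectionError` it catches, and the default values shown are placeholders for illustration, not the recorder's actual code or defaults.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retries(
    connect: Callable[[], T],
    db_max_retries: int = 10,  # placeholder default
    db_retry_wait: float = 3.0,  # placeholder default, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the retries are used up."""
    attempt = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            attempt += 1
            if attempt > db_max_retries:
                # Out of retries: re-raise without sleeping first.
                raise
            sleep(db_retry_wait)
```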
* move imports to top-level in recorder init
* move imports to top-level in recorder migration
* move imports to top-level in recorder models
* move imports to top-level in recorder purge
* move imports to top-level in recorder util
* fix pylint
* Restore states through a JSON store
* Accept entity_id directly in restore state helper
* Keep states stored between runs for a limited time
* Remove warning
* Don't treat typing as an "in-between" module for import order
That was a < 3.5 era thing.
* Tighten scope of some pylint unused-import disables
To avoid isort moving a top level one around, undesirably broadening its
scope.
* Upgrade pylint to 1.8.1
* Fix no-else-return
* Fix bad-whitespace
* Fix too-many-nested-blocks
* Fix raising-format-tuple
See https://github.com/PyCQA/pylint/blob/master/doc/whatsnew/1.8.rst
* Fix len-as-condition
* Fix logging-not-lazy
Not sure about that TEMP_CELSIUS though, but internally it's probably just like if you concatenated any other (variable) string
* Fix stop-iteration-return
* Fix useless-super-delegation
* Fix trailing-comma-tuple
Both of these seem to simply be bugs:
* Nest: The value of self._humidity never seems to be used anywhere
* Dovado: The called API method seems to expect a "normal" number
* Fix redefined-argument-from-local
* Fix consider-using-enumerate
* Fix wrong-import-order
* Fix arguments-differ
* Fix missed no-else-return
* Fix no-member and related
* Fix signatures-differ
* Revert "Upgrade pylint to 1.8.1"
This reverts commit af78aa00f125a7d34add97b9d50c14db48412211.
* Fix arguments-differ
* except for device_tracker
* Cleanup
* Fix test using positional argument
* Fix line too long
I forgot to run flake8 - shame on me... 🙃
* Fix bad-option-value for 1.6.5
* Fix arguments-differ for device_tracker
* Upgrade pylint to 1.8.2
* 👕 Fix missed no-member
* Lazy loading of service descriptions
* Fix tests
* Load YAML in executor
* Return a copy of available services to allow mutations
* Remove lint
* Add zha/services.yaml
* Only cache descriptions for known services
* Remove lint
* Remove description loading during service registration
* Remove description parameter from async_register
* Test async_get_all_descriptions
* Remove lint
* Fix typos from multi-edit
* Remove unused arguments
* Remove unused import os
* Remove unused import os, part 2
* Remove unneeded coroutine decorator
* Only use executor for loading files
* Cleanups suggested in review
* Increase test coverage
* Fix races in existing tests
* Add EntityFilter helper
* Changes in entityfilter after code review
* Convert recorder to use EntityFilter
* Fix flake/lint errors in recorder
* Update entity filter helper to return function
* Update recorder to use updated entity filter
* Better docstrings in entityfilter
* Update entityfilter.py
* Extra check to incoming connections
The incoming connection may not belong to self.db_url, because
a custom component could be opening connections of its own. If such a
connection is not a sqlite3 connection, an error is raised because it
lacks the `dbapi_connection.isolation_level` attribute.
* lint fix
* simplify check: isinstance test only
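The isinstance-only check can be sketched like this. The function name `configure_connection` and the specific PRAGMA are assumptions for illustration; the point is only that non-sqlite3 connections are skipped before `isolation_level` is touched.

```python
import sqlite3


def configure_connection(dbapi_connection: object) -> bool:
    """Apply SQLite settings only to real sqlite3 connections."""
    if not isinstance(dbapi_connection, sqlite3.Connection):
        # Another driver (e.g. opened by a custom component): leave it alone.
        return False
    old_isolation = dbapi_connection.isolation_level
    dbapi_connection.isolation_level = None  # run the PRAGMA in autocommit mode
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA journal_mode=WAL")
    cursor.close()
    dbapi_connection.isolation_level = old_isolation
    return True
```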
* Add recorder purge service
* Recorder test to match purge config
* Removed purge timer, move service handler to setup, add service description file
* Tests for recorder purge service
* Recorder purge timer rework, add purge service parameter, tests
* Purge service schema change
* Service description change value range
* First cleanup
* Fix name of config
* Add DEBUG-level log for db row to native object conversion
This is now the bottleneck (by a large margin) for big history queries, so I'm leaving this log feature in to help diagnose users with a slow history page
* Rewrite of the "first synthetic datapoint" query for multiple entities
The old method was written in a manner that prevented an index from being used in the inner-most GROUP BY statement, causing massive performance issues especially when querying for a large time period.
The new query does have one material change that will cause it to return different results than before: instead of using max(state_id) to get the latest entry, we now get the max(last_updated). This is more appropriate (primary key should not be assumed to be in order of event firing) and allows an index to be used on the inner-most query. I added another JOIN layer to account for cases where there are two entries on the exact same `last_updated` for a given entity. In this case we do use `state_id` as a tiebreaker.
For performance reasons the domain filters were also moved to the outermost query, as it's way more efficient to do it there than on the innermost query as before (due to indexing with GROUP BY problems)
The result is a query that only needs to do a filesort on the final result set, which will only be as many rows as there are entities.
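The structure of the rewritten query can be sketched with `sqlite3` on a reduced `states` table. The SQL below is an approximation written for illustration, not the query the recorder generates: the innermost query picks max(last_updated) per entity, the middle layer breaks ties with max(state_id), and the domain filter sits on the outermost query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE states ("
    "state_id INTEGER PRIMARY KEY, entity_id TEXT, domain TEXT, "
    "state TEXT, last_updated TEXT)"
)
conn.execute(
    "CREATE INDEX ix_states_entity_id_last_updated "
    "ON states (entity_id, last_updated)"
)
conn.executemany(
    "INSERT INTO states VALUES (?, ?, ?, ?, ?)",
    [
        (1, "light.a", "light", "off", "2020-01-01T00:00:00"),
        (2, "light.a", "light", "on", "2020-01-01T01:00:00"),
        (3, "light.a", "light", "dim", "2020-01-01T01:00:00"),  # tie with row 2
        (4, "sensor.b", "sensor", "5", "2020-01-01T00:30:00"),
        (5, "sensor.b", "sensor", "7", "2020-01-02T00:00:00"),  # after cutoff
        (6, "zone.home", "zone", "zoning", "2020-01-01T00:10:00"),
    ],
)

LATEST_STATES_QUERY = """
SELECT s.entity_id, s.state
FROM states AS s
JOIN (
    -- Middle layer: state_id breaks ties on identical last_updated values.
    SELECT t.entity_id, MAX(t.state_id) AS max_state_id
    FROM states AS t
    JOIN (
        -- Innermost layer: latest last_updated per entity before the cutoff.
        SELECT entity_id, MAX(last_updated) AS max_last_updated
        FROM states
        WHERE last_updated < ?
        GROUP BY entity_id
    ) AS latest
      ON t.entity_id = latest.entity_id
     AND t.last_updated = latest.max_last_updated
    GROUP BY t.entity_id
) AS picked ON s.state_id = picked.max_state_id
WHERE s.domain != ?  -- domain filter applied on the outermost query
ORDER BY s.entity_id
"""

rows = conn.execute(LATEST_STATES_QUERY, ("2020-01-01T12:00:00", "zone")).fetchall()
```

Only the final result set, one row per entity, is sorted here, which corresponds to the single filesort mentioned above.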
* Remove the ORDER BY entity_id when fetching states, and add logging
Having this ORDER BY in the query prevents it from using an index due to the range filter, so it has been removed.
We already do a `groupby` in the `states_to_json` method which accomplishes exactly what the ORDER BY in the query was trying to do anyway, so this change causes no functional difference.
Also added DEBUG-level logging to allow diagnosing a user's slow history page.
* Add DEBUG-level logging for the synthetic-first-datapoint query
For diagnosing a user's slow history page
* Missed a couple instances of `created` that should be `last_updated`
* Remove `entity_id` sorting from state_changes; match significant_update
This is the same change as 09b3498f41, but applied to the `state_changes_during_period` method which I missed before. This should give the same performance boost to the history sensor component!
* Bugfix in History query used for History Sensor
The date filter was using a different column for the upper and lower bounds. It would work, but it would be slow!
* Update Recorder purge script to use more appropriate columns
Two reasons:
1. The `created` column's meaning is fairly arbitrary and does not represent when an event or state change actually occurred. It seems more correct to purge based on the event date than the time the database row was written.
2. The new columns are indexed, which will speed up this purge script by orders of magnitude
* Updating db model to match new query optimizations
A few things here:
1. New schema version with a new index and several removed indexes
2. A new method in the migration script to drop old indexes
3. Added an INFO-level log message when a new index will be added, as this can take quite some time on a Raspberry Pi
* Try catch around database updates in recorder. Resolves 6919
* Fixing failed test for line length
* Catch only OperationalError and retry connections before giving up
* Including alchemy exceptions in single function
* New indexes for states table
* Added recorder_runs indexes
* Created a new function for compound indexes.
A new function was created because it makes it a little cleaner when creating
a single-field index since one doesn't have to create a list. This is mostly
when creating the name of the index so with a bit more logic it's possible
to combine it into one function. Given how often migration changes are run,
I thought that code bloat was probably a worthy trade-off for now.
* Adjusted indexes, POC for ref indexes by name.
* Corrected lint errors
* Fixed pydocstyle error
* Moved create_index function outside apply_update
* Moved to single line (just barely)
* Wait up to 9 seconds
* Set number of recorder retries to 8
* Do not sleep when reporting last connection error if no retries left
* Make sure we clean up old engine if connection is retrying
* Update __init__.py
* Restore states
* feedback
* Remove component move into recorder
* space
* helper
* Address my own comments
* Improve test coverage
* Add test for light restore state
* [recorder] Add tests for full schema migration
* Remove leftover code
* Fix duplicate creation of sqlalchemy Index object
* It's that kind of day...
* Improve models_original docstring