Commit Graph

366 Commits (1e69c517b462f544a6c6adbc3af473512f0d9be3)

Author SHA1 Message Date
Brandon Farmer bad4751709 fix(influxdb): authorizing system buckets 2019-10-21 15:49:10 -07:00
Brandon Farmer ea82dc3470 fix(tasks): tasks look up system bucket id 2019-10-21 14:48:47 -07:00
Alirie Gray 552168d3ae
refactor(tasks): use Go time objects for timestamps on task Runs (#15406) 2019-10-17 17:23:45 -07:00
George 975289fba1
refactor(tasks): separate run recording behaviour out from analytical storage (#15412) 2019-10-17 10:37:03 +01:00
Alirie Gray f096605327
fix(tasks): replace deactivation of unrecoverable errors with metric (#15430) 2019-10-16 16:00:58 -07:00
Jonathan A. Sternberg b73870d3ed
test(tasks): skip flaky test in the scheduler 2019-10-15 09:18:41 -05:00
docmerlin (j. Emrys Landivar) 0958c26382 feat(tasks): add scheduler release test 2019-10-14 14:02:25 -05:00
docmerlin (j. Emrys Landivar) b8b8422384 feat(tasks): update new scheduler in response to pr comments 2019-10-14 14:02:25 -05:00
docmerlin (j. Emrys Landivar) 4b732acb3b feat(tasks): switch the new scheduler to use clock instead of custom time mocker 2019-10-14 14:02:25 -05:00
j. Emrys Landivar (docmerlin) 3fd94cbb69 feat(task): new scheduler now with more tests 2019-10-14 14:02:25 -05:00
Lyon Hill 84bc9a8293 feat(task): add scheduler metrics (first pass) 2019-10-14 14:02:25 -05:00
j. Emrys Landivar (docmerlin) 4695eccda5 feat(tasks): new tree-based scheduler 2019-10-14 14:02:25 -05:00
Lyon Hill 3c6779f011
feat(task): Allow tasks to run more isolated from other task systems (#15384)
* feat(task): Allow tasks to run more isolated from other task systems

To allow the task internal system to be used for user created tasks as well
as checks, notification and other future additions we needed to take 2 actions:

1 - We need to use type as a first class citizen, meaning that tasks have a type
and each system that will be creating tasks will set the task type through the API.
This is a change to the previous assumption that any user could set task types. This change
will allow us to have other services white-label the task service for their own purposes and not
have to worry about collisions between the types.

2 - We needed to allow other systems to add data specific to the problem they are trying to solve.
For this purpose we are adding a `metadata` field to the internal task system, which should allow other systems to
use the task service.

These changes will allow the current check and notification implementations in the future
to create a task with metadata instead of creating a check object and a task object in the database.
With this new behavior, checks, notifications, and user tasks can all follow the same pattern:

Field an API request in a system-specific HTTP endpoint, use a small translation to the `TaskService` function call,
translate the results to what the API expects for this system, and return results.

* fix(task): undo additional check for ownerID because check is not ready
2019-10-11 08:53:38 -06:00
Alirie Gray 364e80bc94
fix(tasks): use go errors for scheduler metrics (#15374) 2019-10-10 09:55:30 -07:00
Alirie Gray be28de8fbc
feat(tasks): deactivate task on unrecoverable error (#15369) 2019-10-09 13:51:03 -07:00
Lyon Hill f5e9b5e04f
feat(task): add type to some specific metrics in new execution. (#15340) 2019-10-08 15:58:41 -06:00
Alirie Gray a9df93b1fd
refactor(tasks): create coordinator for new scheduler/executor (#15268) 2019-09-26 13:55:23 -07:00
Mustafa 4fcf4c4ad1
Merge pull request #15248 from influxdata/elbehery-fix#4300
fix(storage): remove level=0 from TSM disk bytes metrics.
2019-09-26 18:21:38 +02:00
Lyon Hill 7aa98ca84f
feat(task): add limit function for task concurrency (#15266)
* feat(task): add limit function for task concurrency

The new task executor handles limits differently than the old executor.
Instead of front-loading limits by creating a runner for every task that might run,
the new executor has a large worker pool and queue. This allows us to have unlimited
concurrency per task and helps us avoid a backlog of task executions based on an
arbitrary execution limit. This adds the ability to set an optional task execution limit
so a user can still have the advantages of limiting concurrency.
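
The optional per-task limit described above could be sketched roughly as follows. This is only an illustration, not the executor's actual code: `Limiter`, `Acquire`, `Release`, and `ErrLimitReached` are hypothetical names, and a limit of 0 is assumed to mean "unlimited".

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrLimitReached is a hypothetical error returned when a task already has
// the maximum number of runs in flight.
var ErrLimitReached = errors.New("task concurrency limit reached")

// Limiter counts in-flight runs per task ID. All names are illustrative.
type Limiter struct {
	mu      sync.Mutex
	running map[string]int
	max     int // assumption for this sketch: 0 means unlimited
}

// NewLimiter returns a Limiter that allows at most max concurrent runs per task.
func NewLimiter(max int) *Limiter {
	return &Limiter{running: make(map[string]int), max: max}
}

// Acquire reserves a slot for the task, or returns ErrLimitReached.
func (l *Limiter) Acquire(taskID string) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.max > 0 && l.running[taskID] >= l.max {
		return ErrLimitReached
	}
	l.running[taskID]++
	return nil
}

// Release frees a slot once a run has finished.
func (l *Limiter) Release(taskID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.running[taskID] > 0 {
		l.running[taskID]--
	}
}

func main() {
	l := NewLimiter(1)
	fmt.Println(l.Acquire("task-a")) // first run is admitted
	fmt.Println(l.Acquire("task-a")) // second concurrent run is rejected
	l.Release("task-a")
	fmt.Println(l.Acquire("task-a")) // admitted again after release
}
```

In this sketch the check happens before a run is handed to the worker pool, so an unlimited task never waits on a limit at all.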
2019-09-25 12:02:04 -06:00
elbehery 663d4bb901 test(tasks): skip flaky test 2019-09-25 18:17:59 +02:00
Brandon Farmer d83fabeabc feat(influxdb): user disabling 2019-09-23 11:57:16 -07:00
Lyon Hill 9d6b9555ac
feat(task): add functions to the task executor to allow for coordinator control (#15218)
We needed the coordinator to be able to execute manual runs and resume runs.
These two functions have been added, but we also needed to allow for the executor to be
mocked out. To do that we needed to return a Promise interface instead of an actual
struct. Both these changes are to facilitate coordinator work and testing.
2019-09-20 10:36:44 -06:00
Lorenzo Affetti 053836e5a5
Merge pull request #15203 from influxdata/flux-staging-v0.48.x
build(flux): update to Flux v0.48.0
2019-09-20 18:24:02 +02:00
Lyon Hill 11b9a6fb28
fix(task): Update task executor to match the expected executor interface. (#15205)
I chose to add an execute function that allows the task executor to match expectations from
the scheduler, but I left in the existing executor method that returns promises. This is
because I like to be able to have the accountability and visibility into what's happening
with each execution even though the promise isn't required for the scheduler. This function signature
will be used by the coordinator and potentially others that want to ensure an 'execution' is completed.
2019-09-20 08:20:05 -06:00
Lorenzo Affetti 3f50cd2af9 Merge branch 'master' into flux-staging-v0.48.x 2019-09-19 17:20:40 +02:00
Jonathan A. Sternberg cbd04f2884
refactor: http error serialization matches the new error schema (#15196)
The http error schema has been changed to simplify the outward facing
API. The `op` and `error` attributes have been dropped because they
confused people. The `error` attribute will likely be readded in some
form in the future, but only as additional context and will not be
required or even suggested for the UI to use.

Errors are now output differently both when they are serialized to JSON
and when they are output as strings. The `op` is no longer used if it is
present. It will only appear as an optional attribute if at all. The
`message` attribute for an error is always output and it will be the
prefix for any nested error. When this is serialized to JSON, the
message is automatically flattened. A nested error such as:

    influxdb.Error{
        Msg: "something bad happened",
        Err: io.EOF,
    }

would be written to the message as:

    something bad happened: EOF

This matches a developer's expectations much more easily, as most
programmers assume that wrapping an error will act as a prefix for the
inner error.

This is flattened when written out to HTTP in order to make this logic
immaterial to a frontend developer.

The code is still present and plays an important role in categorizing
the error type. On the other hand, the code will not be output as part
of the message as it commonly plays a redundant and confusing role when
humans read it. The human readable message usually gives more context
and a message with the code acting as a prefix is generally not
desired. But the code plays a very important role in helping to
identify categories of errors and so it is very important as part of the
return response.
2019-09-19 10:06:47 -05:00
Lorenzo Affetti ab835c8e0e
refactor(dependencies): use new dependency injection framework (#15174)
refactor(dependencies): use new dependency injection framework
2019-09-19 17:01:17 +02:00
docmerlin (j. Emrys Landivar) baaf93dbbc fix(tasks): fixes duration validation for every and offset, so people will get feedback if they are using durations that tasks don't currently support 2019-09-18 09:07:43 -05:00
Stuart Carnie 7240d21e20
fix(task): PR feedback to fix docs 2019-09-17 12:02:04 -07:00
Stuart Carnie 57a710bb9c
fix(task): Improve Executor#Execute error consistency
Implementations of the backend.Executor produce errors limited to
querying the KV store. The remainder of the errors will be processed
in the implementation of a `RunPromise`.

Fixes #15161
2019-09-16 17:10:02 -07:00
Alirie Gray aef199bcc1
fix(tasks): use influxdb errors in scheduler (#15145) 2019-09-16 13:55:39 -07:00
Stuart Carnie a8d1fd0deb
Merge pull request #15123 from influxdata/sgc/scheduler
feat(task): Change interfaces defining scheduler and executor behavior
2019-09-12 15:10:17 -07:00
Alirie Gray 067305c148
feat(tasks): add WithMaxConcurrency to configure scheduler (#15121) 2019-09-12 11:47:27 -07:00
Stuart Carnie 9389b41c6e
feat(task): Change interfaces defining scheduler and executor behavior
See #14183
2019-09-11 17:02:28 -07:00
Lyon Hill 243e946697
fix(task): execution metric now shows correct data (#15112)
The first pass failed to save the correct execution metrics;
it will now compare the difference between start and finish.
2019-09-11 13:22:22 -06:00
Alirie Gray 21e14de7aa
feat(tasks): use env variable for concurrency (#15110) 2019-09-10 16:05:50 -07:00
Alirie Gray 645df57102
feat(tasks): use influxdb errors for executor metrics (#14926) 2019-09-10 12:48:55 -07:00
Lyon Hill cc84a43cea
feat(task): add run duration to task metrics (#15102)
We need to be able to see how long it's taking tasks to run, as well
as be able to see the start delta time per task.
2019-09-10 12:30:17 -06:00
docmerlin (j. Emrys Landivar) 03215c0028 chore(tasks): up concurrency to 11, this is a temporary workaround till we get a new scheduler 2019-09-09 17:13:14 -05:00
docmerlin (j. Emrys Landivar) c91ef8e398 fix(alerts and notifications): updates latest completed when status goes from inactive->active 2019-09-06 17:36:16 -05:00
Lyon Hill 2b75d20570
fix(task): Update task in scheduler to show updated logs. (#15008)
The current behavior is that the update is pushed into the scheduler,
and the scheduler cherry-picks what it needs. This leaves the task itself out,
meaning any logging the scheduler did was not going to have the new task information in it.
2019-09-06 10:38:56 -06:00
Lyon Hill 5d6bb3fced
fix(task): clean up offset when removed in script (#14961)
When the flux script removes an offset it should be removed from the task.
2019-09-06 08:26:50 -06:00
Stuart Carnie 15aaae5dd4
fix(task): Create tags using NewTags to ensure they are sorted 2019-09-04 15:25:44 -07:00
Lyon Hill 5fe3600126
feat(task): Task execution will accurately measure queue delta (#14913)
When a task is told to execute it can be enqueued waiting for a worker.
This statistic will be superior to the existing delta based on scheduled for;
the current system can be affected by a user having slow queries or a long "delay" on the task.
This new way of measuring the same thing should allow us to accurately measure when it is the task system's fault.
2019-09-03 12:55:33 -06:00
Lyon Hill 5d1c4d814b
fix(task): Remove allowance for duplicate runs in run list (#14875)
If we are caching runs in the kv storage system it is possible to get
the cached version from the kv store and the recently completed run
from the analytical store. We just need to only show analytical results if
we find a duplicate.
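
The deduplication rule above could look roughly like this. It is a sketch under assumptions: `Run` and `mergeRuns` are hypothetical names, and runs are assumed to be identified by a string ID.

```go
package main

import "fmt"

// Run is a hypothetical, minimal run record used only for this sketch.
type Run struct {
	ID     string
	Status string
}

// mergeRuns returns every analytical run, followed by each cached run whose
// ID does not also appear in the analytical results. When the same ID is in
// both lists, only the analytical version is kept.
func mergeRuns(cached, analytical []Run) []Run {
	seen := make(map[string]struct{}, len(analytical))
	merged := make([]Run, 0, len(cached)+len(analytical))
	for _, r := range analytical {
		seen[r.ID] = struct{}{}
		merged = append(merged, r)
	}
	for _, r := range cached {
		if _, dup := seen[r.ID]; dup {
			continue // the analytical store already has this run
		}
		merged = append(merged, r)
	}
	return merged
}

func main() {
	cached := []Run{{ID: "1", Status: "started"}, {ID: "2", Status: "started"}}
	analytical := []Run{{ID: "1", Status: "success"}}
	for _, r := range mergeRuns(cached, analytical) {
		fmt.Println(r.ID, r.Status)
	}
}
```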
2019-08-29 14:56:55 -06:00
Michael Desa 7c8988f4f7
Merge pull request #14777 from influxdata/fix/misc-check-flux
fix(checks): generate exact flux for threshold checks and notifications
2019-08-27 17:17:16 -04:00
Lyon Hill f8f8a3cf55
fix(task): fix panic when failing to update on startup (#14825)
* fix(task): fix panic when failing to update on startup

Additionally we have no need to claim tasks that are inactive
2019-08-27 13:16:18 -06:00
Michael Desa b26ed76d6a
fix(notification/check): ensure cloud integration works
fix(notification/check): include tags in check object in generated flux

Closes https://github.com/influxdata/influxdb/issues/14769

fix(notification/check): use selected field in threshold functions

Closes https://github.com/influxdata/influxdb/issues/14776

fix(testing): add selected field for check tests

fix(check): use real flux for threshold check

feat(notification/check): generate flux for deadman checks

chore(endpoint): rename webhook endpoint to http endpoint

fix(notification/rule): fetch url for flux script off of endpoint

fix(notification/rule): clean up slack and http rules

fix(notification/rule): change MessageTemp to MessageTemplate

fix(rules): pass endpoint in to rule during create

fix(ui): rename webhook to http

feat(notification/check): namespace deadman under alerts

fix(notification/check): nest tags under tags key in data object in flux

wip

feat(kv): log error if urm cannot be deleted for notification rule

fix(notification/rule): remove name from notify call in slack rule

chore(ui/cypress/e2e): skip rule create test
2019-08-27 15:02:53 -04:00
Lyon Hill ee9e622c6d
feat(task): Add task middlewares for checks and notifications (#14809)
To have checks and notifications happen transactionally we need to be
able to alert the task system when a new task was created using the checks and notifications systems.
These two new middlewares allow us to inform the task system of an update
to a task that was created through the check or notification systems.
2019-08-26 16:54:52 -06:00
Nathaniel Cook dfc28335ea refactor(query/dependencies): update to new Flux dependencies defaults 2019-08-26 16:46:17 -06:00