An accidental omission of status a level check resulted in all statuses
triggering notifications for pagerduty rules. This was not caught due
to insufficient testing of the pagerduty rule.
noticed that I had not used the http server as the entry point for server tests.
This was work to make that happen. Along the way, found a bunch of issues I hadn't
seen before 🤦. There are a number of changes tucked away inside the
other types, that make it possible to encode/decode a type with zero value for
influxdb.ID.
Closes https://github.com/influxdata/influxdb/issues/15154
Previously, we niavely incremented the unit associated with the leading
duration. Now, if the unit is `s` we double it. For every other unit, we
increase the magnitude by 1.
fix(notification/rule): change panic to t.Fatal
chore(notification/rule): add support for durations less than s
Closes https://github.com/influxdata/influxdb/issues/15042
Previously, there was an optional URL provided for the pagerduty
endpoint. However, the pagerduty API url does not change and as a result
it should not have been a parameter. The Pagerduty API does require a
`clientURL` that is presented in the pagerduty UI when an alert is
triggered. Currently that value will default to the alerts history page
for the organization.
Semantically, we've done the following:
// name of the client sending the alert.
client: "influxdata"
// url of the client sending the alert.
client_url: the endpoint URL for now (needs to change to rule)
// The class/type of the event, for example ping failure or cpu load
class: check's name
// Logical grouping of components of a service, for example app-stack
group: source measurement
Co-authored-by: Alirie Gray <alirie.gray@gmail.com>
This adds the _version: 1 correctly to the body of the HTTP POST.
Additionally, this fixes the imports when using secrets.
The POSTed JSON body now is:
```json
{
"_check_id": "046cac59e2aa3000",
"_check_name": "High CPU User Usage",
"_level": "crit",
"_measurement": "notifications",
"_message": "High CPU User Usage: rsavage.prod is crit",
"_notification_endpoint_id": "046cad0c83aec000",
"_notification_endpoint_name": "HTTP Endpoint",
"_notification_rule_id": "046dff53d4183000",
"_notification_rule_name": "HTTP Notification",
"_source_measurement": "cpu",
"_source_timestamp": 1567797375000000000,
"_start": "2019-09-06T19:15:59Z",
"_status_timestamp": 1567797376416632300,
"_stop": "2019-09-06T19:16:20.362006739Z",
"_time": "2019-09-06T19:16:20.609629338Z",
"_type": "threshold",
"_version": 1,
"cpu": "cpu-total",
"host": "rsavage.prod",
"usage_user": 91.12278069517379
}
```
The basic auth headers use a function that will soon be in flux that
renders usernames and passwords into `Basic <b64 u and p>`
This work assumes that AuthMethod has been set as well as the
appropriate secrets
We chose pretty arbitrary data from monitor to add to the pagerduty
schema.
We'll need to see how this renders and make adjustments.
Co-authored-by: Alirie Gray <alirie.gray@gmail.com>
fix(notification/rule): add overlap to windows so that state changes works
Since stateChanges cannot detect a state change for the initial point,
returned, we generate a task that queries for more data than required
and filters out records that are before the every. This way we will not
miss state changes that take place on the initial value.
fix(notification/rule): prevent union of tables for single status rule
fix(testing): add status rules to notifications
* WIP
* Fix UI linter errors from swagger changes to Level Rule
* Prevent same level selection on changes from
* Remove unused get
* Fix prettier error
* chore(notification/rule): change level rule to check level for rules
fix(notification/check): include tags in check object in generated flux
Closes https://github.com/influxdata/influxdb/issues/14769
fix(notification/check): use selected field in threshold functions
Closes https://github.com/influxdata/influxdb/issues/14776
fix(testing): add selected field for check tests
fix(check): use real flux for threshold check
feat(notification/check): generate flux for deadman checks
chore(endpoint): rename webhook endpoint to http endpoint
fix(notification/rule): fetch url for flux script off of endpoint
fix(notification/rule): clean up slack and http rules
fix(notification/rule): change MessageTemp to MessageTemplate
fix(rules): pass endpoint in to rule during create
fix(ui): rename webhook to http
feat(notification/check): namespace deadman under alerts
fix(notification/check): nest tags under tags key in data object in flux
wip
feat(kv): log error if urm cannot be deleted for notification rule
fix(notification/rule): remove name from notify call in slack rule
chore(ui/cypress/e2e): skip rule create test