---
title: Suppress Kapacitor alerts based on hierarchy
description: >
  Kapacitor's '.inhibit()' allows you to create hierarchical alerting architectures by suppressing alerts with matching tags in a specified alert category.
menu:
  kapacitor_v1:
    name: Hierarchical alert suppression
    identifier: hierarchical_alert_suppression
    weight: 30
    parent: guides
---
Kapacitor allows you to build out a robust monitoring and alerting solution with
multiple "levels" or "tiers" of alerts.
However, an issue arises when an event triggers both high-level and low-level alerts
and you end up getting multiple alerts from different contexts.
The [AlertNode's `.inhibit()`](/kapacitor/v1/reference/nodes/alert_node/#inhibit) method
allows you to suppress other alerts when an alert is triggered.

For example, suppose you are monitoring a cluster of servers.
As part of your alerting architecture, you have host-level alerts such as CPU usage
alerts, RAM usage alerts, disk I/O alerts, etc.
You also have cluster-level alerts that monitor network health, host uptime, etc.
If a CPU spike on a host in your cluster takes the machine offline, rather than
getting a host-level alert for the CPU spike _**and**_ a cluster-level alert for
the offline node, you'd get a single alert: the alert that the node is offline.
The cluster-level alert would suppress the host-level alert.

## Using the `.inhibit()` method to suppress alerts
The `.inhibit()` method uses alert categories and tags to inhibit or suppress other alerts.
```js
// ...
  |alert()
    .inhibit('<category>', '<tags>')
```

- `category`: The category for which this alert inhibits or suppresses alerts.
- `tags`: A comma-delimited list of tags that must be matched in order for alerts to be
  inhibited or suppressed.

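
As a sketch of how the two arguments fit together, the snippet below inhibits alerts in a `system_alerts` category only when both the `cluster` and `host` tags match.
The `cluster` tag and the `uptime` measurement are illustrative assumptions, and passing each tag key as its own quoted argument is assumed from the parameter description above; check the AlertNode reference before relying on it.

```js
// Sketch only: the 'cluster' tag and 'uptime' measurement are assumptions.
// The deadman alert suppresses 'system_alerts' events whose 'cluster' and
// 'host' tags both match the tags of this alert.
stream
  |from()
    .measurement('uptime')
    .groupBy('cluster', 'host')
  |deadman(0.0, 1m)
    .inhibit('system_alerts', 'cluster', 'host')
```

Listing more tags narrows the match, so an alert from the same host in a different cluster would not be suppressed under these assumptions.
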
### Example hierarchical alert suppression
The following TICKscripts represent two alerts in a layered alert architecture.
The first is a host-specific CPU alert that sends an alert to the `system_alerts`
category whenever idle CPU usage is less than 10%.
Streamed data points are grouped by the `host` tag, which identifies the host the
data point is coming from.

_**cpu\_alert.tick**_
```js
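stream
  // Read CPU metrics and process each host independently.
  |from()
    .measurement('cpu')
    .groupBy('host')
  // Send a critical alert to the 'system_alerts' category
  // when idle CPU drops below 10%.
  |alert()
    .category('system_alerts')
    .crit(lambda: "usage_idle" < 10.0)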
```
The following TICKscript is a cluster-level alert that monitors the uptime of hosts in the cluster.
It uses the [`deadman()`](/kapacitor/v1/reference/nodes/alert_node/#deadman) function to
create an alert when a host is unresponsive or offline.
The `.inhibit()` method in the deadman alert suppresses all alerts in the `system_alerts`
category that include a matching `host` tag, meaning they come from the same host.

_**host\_alert.tick**_
```js
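stream
  // Read uptime metrics and process each host independently.
  |from()
    .measurement('uptime')
    .groupBy('host')
  // Alert when a host reports no points within 1m, and suppress
  // 'system_alerts' events that carry the same 'host' tag.
  |deadman(0.0, 1m)
    .inhibit('system_alerts', 'host')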
```
With this alert architecture, a host may be unresponsive due to a CPU bottleneck,
but because the deadman alert inhibits system alerts from the same host, you won't
get alert notifications for both the deadman and the high CPU usage.
You get only the deadman alert for that specific host.
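
The same suppression would apply to any other host-level alert that reports to the `system_alerts` category and is grouped by `host`.
A minimal sketch of such an alert follows, assuming a `mem` measurement with a `used_percent` field (as reported by Telegraf's memory input) and a hypothetical 90% threshold:

_**mem\_alert.tick**_ (hypothetical)
```js
// Sketch only: the measurement, field, and threshold are assumptions.
// This alert uses the same category as cpu_alert.tick, so the deadman alert's
// .inhibit('system_alerts', 'host') would also cover it.
stream
  |from()
    .measurement('mem')
    .groupBy('host')
  |alert()
    .category('system_alerts')
    .crit(lambda: "used_percent" > 90.0)
```

Because it shares the category and the `host` grouping, events from this task should be suppressed together with the CPU alert when the deadman alert fires for the same host.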