Use Cases
Chronograf works with the other components of the TICK stack to provide a user interface for monitoring and alerting on your infrastructure. This document describes common setup use cases for Chronograf.
Use Case 1: Setup for Monitoring Several Servers
Suppose you want to use Chronograf to monitor several servers. This section describes a simple setup for monitoring CPU, disk, and memory usage on three servers.
Architecture Overview
Each of the three servers has its own Telegraf instance. Those instances are configured to collect CPU, disk, and memory data using Telegraf's system stats input plugin. Each Telegraf instance is also configured to send those data to a single InfluxDB instance. When Telegraf sends data to InfluxDB, it automatically tags those data with the relevant server's hostname.
The single InfluxDB instance is connected to Chronograf.
Chronograf uses the host tag in the Telegraf data to populate the HOST LIST page and provide other hostname-specific information in the user interface.
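To make the role of the host tag concrete, here is a minimal Python sketch of what a single tagged CPU point could look like in InfluxDB line protocol. The helper names escape_tag and cpu_point are hypothetical, and the point is deliberately simplified: real Telegraf CPU points carry more tags and fields than the one shown here.

```python
from datetime import datetime, timezone


def escape_tag(value: str) -> str:
    """Backslash-escape commas, equals signs, and spaces in a tag value."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def cpu_point(host: str, usage_idle: float, timestamp_ns: int) -> str:
    """Build one simplified line-protocol point: measurement,tags fields timestamp."""
    return f"cpu,host={escape_tag(host)} usage_idle={usage_idle} {timestamp_ns}"


# Timestamp in nanoseconds since the Unix epoch for 2016-11-29T22:41:00Z.
ts = int(datetime(2016, 11, 29, 22, 41, tzinfo=timezone.utc).timestamp()) * 10**9

line = cpu_point("server-01", 99.7, ts)
print(line)
```

Because host is stored as a tag rather than a field, InfluxDB indexes it, which is what lets a per-host listing or a query grouped by hostname work without scanning field values.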
Setup Description
To start out, we install and start InfluxDB on a separate server. We recommend installing InfluxDB on its own machine for performance purposes. InfluxDB's default configuration doesn't require any adjustments for this particular use case.
Next, we install Telegraf on each server that we want to monitor.
Before starting the three Telegraf services we need to make some edits to Telegraf's configuration file (/etc/telegraf/telegraf.conf).
First, we configure each instance to use the system stats plugin to collect CPU, disk, and memory data.
The system stats plugin is actually enabled by default so there's no additional work to do here.
We just double check that [[inputs.cpu]], [[inputs.disk]], and [[inputs.mem]] are uncommented in the INPUT PLUGINS section of Telegraf's configuration file:
###############################################################################
# INPUT PLUGINS #
###############################################################################
# Read metrics about cpu usage
[[inputs.cpu]] #✅
## Whether to report per-cpu stats or not
percpu = true
## Whether to report total system cpu stats or not
totalcpu = true
## If true, collect raw CPU time metrics.
collect_cpu_time = false
# Read metrics about disk usage by mount point
[[inputs.disk]] #✅
## By default, telegraf gather stats for all mountpoints.
## Setting mountpoints will restrict the stats to the specified mountpoints.
# mount_points = ["/"]
## Ignore some mountpoints by filesystem type. For example (dev)tmpfs (usually
## present on /run, /var/run, /dev/shm or /dev).
ignore_fs = ["tmpfs", "devtmpfs"]
[...]
# Read metrics about memory usage
[[inputs.mem]] #✅
# no configuration
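The double check described above can also be scripted. The following is a rough Python sketch, not a Telegraf feature: it scans configuration text for [[inputs.NAME]] headers that are not commented out and reports which of the three required plugins are missing. The function names and the sample text are assumptions made for illustration.

```python
import re

# The three plugins this use case relies on.
REQUIRED_INPUTS = ("cpu", "disk", "mem")

# Matches a [[inputs.NAME]] header at the start of a line (optionally indented).
# A header preceded by "#" does not match, so commented-out plugins are skipped.
INPUT_HEADER = re.compile(r"^[ \t]*\[\[inputs\.([A-Za-z0-9_]+)\]\]", re.MULTILINE)


def enabled_inputs(config_text: str) -> set:
    """Return the names of input plugins whose header is uncommented."""
    return set(INPUT_HEADER.findall(config_text))


def missing_inputs(config_text: str, required=REQUIRED_INPUTS) -> list:
    """Return the required plugin names that are not enabled, in order."""
    enabled = enabled_inputs(config_text)
    return [name for name in required if name not in enabled]


# Hypothetical sample in which the disk plugin has been commented out.
sample = """
[[inputs.cpu]]
  percpu = true
# [[inputs.disk]]
[[inputs.mem]]
"""

print(missing_inputs(sample))
```

To check a real file, read /etc/telegraf/telegraf.conf into a string and pass it to missing_inputs; an empty list means all three plugins are enabled.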
Our next edit to Telegraf's configuration file ensures that each Telegraf instance sends data to our single InfluxDB instance.
To do this, we edit the urls setting in the OUTPUT PLUGINS section to point to the IP of our InfluxDB instance:
###############################################################################
# OUTPUT PLUGINS #
###############################################################################
# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
## The full HTTP or UDP endpoint URL for your InfluxDB instance.
## Multiple urls can be specified as part of the same cluster,
## this means that only ONE of the urls will be written to each interval.
# urls = ["udp://localhost:8089"] # UDP endpoint example
urls = ["http://<InfluxDB-IP>:8086"] # 💥 Edit here!💥
## The target database for metrics (telegraf will create it if not exists).
database = "telegraf" # required
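Before starting Telegraf, it can help to confirm that the address in urls is reachable from each monitored server. The sketch below assumes an InfluxDB 1.x instance, whose HTTP API answers GET /ping with status 204 when the server is up. The function names are hypothetical, and 192.0.2.10 is a documentation-only placeholder address, not a real server.

```python
import http.client
import urllib.request


def ping_url(base_url: str) -> str:
    """Return the /ping endpoint for an InfluxDB base URL."""
    return base_url.rstrip("/") + "/ping"


def influxdb_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if the server answers /ping with 204 No Content."""
    try:
        with urllib.request.urlopen(ping_url(base_url), timeout=timeout) as response:
            return response.status == 204
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, DNS failures, and HTTP errors.
        return False


# Placeholder address; substitute the value used in the urls setting.
print(ping_url("http://192.0.2.10:8086"))
```

Calling influxdb_reachable with your own InfluxDB address from each of the three servers gives a quick yes or no before any metrics are sent.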
Now that we've configured our inputs and outputs, we start the Telegraf service on all three servers.
Telegraf begins by creating a database in InfluxDB called telegraf (that name is configurable) and then starts writing system stats data to that database.
Note that Telegraf automatically creates a host tag that records the hostname of the server that sent the data.
Here's a sample of some CPU usage data in InfluxDB:
name: cpu
time                   usage_idle          host        <--- Telegraf's auto-generated tag
----                   ----------          ----
2016-11-29T22:41:00Z   99.70000000000253   server-01
2016-11-29T22:41:00Z   99.79959919839698   server-02
2016-11-29T22:41:00Z   98.1037924151472    server-03
2016-11-29T22:41:10Z   99.60000000000036   server-01
2016-11-29T22:41:10Z   99.49698189131892   server-02
2016-11-29T22:41:10Z   99.6996996996977    server-03
2016-11-29T22:41:20Z   98.89889889889365   server-01
2016-11-29T22:41:20Z   99.40119760479097   server-02
2016-11-29T22:41:20Z   99.60039960039995   server-03
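As an illustration of how the host tag separates the three servers, the following Python sketch groups the sample rows above by hostname and averages usage_idle for each. It works on a hard-coded copy of the sample and does not talk to InfluxDB; the function name mean_idle_by_host is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# (time, usage_idle, host) rows copied from the sample above.
SAMPLE = [
    ("2016-11-29T22:41:00Z", 99.70000000000253, "server-01"),
    ("2016-11-29T22:41:00Z", 99.79959919839698, "server-02"),
    ("2016-11-29T22:41:00Z", 98.1037924151472, "server-03"),
    ("2016-11-29T22:41:10Z", 99.60000000000036, "server-01"),
    ("2016-11-29T22:41:10Z", 99.49698189131892, "server-02"),
    ("2016-11-29T22:41:10Z", 99.6996996996977, "server-03"),
    ("2016-11-29T22:41:20Z", 98.89889889889365, "server-01"),
    ("2016-11-29T22:41:20Z", 99.40119760479097, "server-02"),
    ("2016-11-29T22:41:20Z", 99.60039960039995, "server-03"),
]


def mean_idle_by_host(rows):
    """Group rows by the host tag and average usage_idle per host."""
    by_host = defaultdict(list)
    for _time, usage_idle, host in rows:
        by_host[host].append(usage_idle)
    return {host: mean(values) for host, values in by_host.items()}


for host, value in sorted(mean_idle_by_host(SAMPLE).items()):
    print(f"{host}: {value:.2f}% idle on average")
```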
Finally, we install and start Chronograf.
Once we connect Chronograf to our InfluxDB instance, Chronograf uses Telegraf's host tag to populate the HOST LIST page.
The system stats dashboard template shows the CPU, disk, and memory metrics for the selected hostname.
You can also create queries in the Data Explorer that graph results per hostname.
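A per-hostname query of the kind the Data Explorer builds might look like the InfluxQL string produced by this Python sketch. It is an assumption about a reasonable query shape, not a copy of what Chronograf generates: the function names, the one-hour window, and the one-minute interval are illustrative choices, and 192.0.2.10 is a placeholder address. The second helper shows how the same query could be sent to the InfluxDB 1.x /query HTTP endpoint.

```python
from urllib.parse import urlencode


def per_host_query(measurement="cpu", field="usage_idle", window="1h", interval="1m"):
    """Build an InfluxQL query that averages a field per interval and per host."""
    return (
        f'SELECT mean("{field}") FROM "{measurement}" '
        f"WHERE time > now() - {window} "
        f'GROUP BY time({interval}), "host"'
    )


def query_url(base_url, database, query):
    """Build a URL for the InfluxDB 1.x /query endpoint with db and q parameters."""
    return base_url.rstrip("/") + "/query?" + urlencode({"db": database, "q": query})


query = per_host_query()
print(query)
# Placeholder address; the database name matches the telegraf default used above.
print(query_url("http://192.0.2.10:8086", "telegraf", query))
```

Grouping by the host tag returns one series per server, which is what allows a single graph to show a separate line for each hostname.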
Use Case 2: Set Up the TICK Stack in a Kubernetes Instance
Check out our 20-minute webinar to learn how to spin up the TICK Stack in a Kubernetes instance.