The System Metrics Plugin provides comprehensive system monitoring capabilities for {{% product-name %}}, collecting CPU, memory, disk, and network metrics from the host system. It reports per-core CPU statistics, memory usage breakdowns, disk I/O performance, and network interface statistics. Each metric category can be enabled or disabled independently, and failed collections are retried up to a configurable limit so that monitoring stays reliable.
Configuration
Plugin parameters may be specified as key-value pairs in the --trigger-arguments flag (CLI) or in the trigger_arguments field (API) when creating a trigger. Some plugins support TOML configuration files, which can be specified using the plugin's config_file_path parameter.
If a plugin supports multiple trigger specifications, some parameters may depend on the trigger specification that you use.
Plugin metadata
This plugin includes a JSON metadata schema in its docstring that defines supported trigger types and configuration parameters. This metadata enables the InfluxDB 3 Explorer UI to display and configure the plugin.
Optional parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| hostname | string | localhost | Hostname to tag all metrics with for system identification |
| include_cpu | boolean | true | Include comprehensive CPU metrics collection (overall and per-core statistics) |
| include_memory | boolean | true | Include memory metrics collection (RAM usage, swap statistics, page faults) |
| include_disk | boolean | true | Include disk metrics collection (partition usage, I/O statistics, performance) |
| include_network | boolean | true | Include network metrics collection (interface statistics and error counts) |
| max_retries | integer | 3 | Maximum retry attempts on failure with graceful error handling |
Note: This plugin has no required parameters. All parameters have sensible defaults.
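As an illustration of how these defaults might be applied, the following is a minimal sketch of a `parse_config` helper. The function name appears in the code overview, but this body is an assumption and not the plugin's actual implementation. It assumes trigger argument values arrive as strings and need converting to the types listed in the table.

```python
# Hypothetical sketch: the defaults mirror the parameter table above, but the
# parsing rules are assumptions, not the plugin's actual code.
DEFAULTS = {
    "hostname": "localhost",
    "include_cpu": True,
    "include_memory": True,
    "include_disk": True,
    "include_network": True,
    "max_retries": 3,
}

_TRUE = {"true", "1", "yes", "on"}
_FALSE = {"false", "0", "no", "off"}


def to_bool(value):
    """Convert a trigger-argument value (usually a string) to a bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def parse_config(args):
    """Merge trigger arguments over the documented defaults."""
    config = dict(DEFAULTS)
    for key, value in (args or {}).items():
        if key not in DEFAULTS:
            continue  # ignore keys this sketch does not know, such as config_file_path
        default = DEFAULTS[key]
        if isinstance(default, bool):  # check bool first: bool is a subclass of int
            config[key] = to_bool(value)
        elif isinstance(default, int):
            config[key] = int(value)
        else:
            config[key] = str(value)
    return config
```

For example, `parse_config({"include_disk": "false", "max_retries": "5"})` keeps every other default and returns `include_disk` as `False` and `max_retries` as the integer `5`.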
TOML configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| config_file_path | string | none | TOML config file path relative to PLUGIN_DIR (required for TOML configuration) |
To use a TOML configuration file, set the PLUGIN_DIR environment variable and specify the config_file_path in the trigger arguments. This is in addition to the --plugin-dir flag when starting {{% product-name %}}.
Example TOML configuration
system_metrics_config_scheduler.toml
For more information on using TOML configuration files, see the Using TOML Configuration Files section in the influxdb3_plugins/README.md.
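The path resolution described above can be sketched as follows. This is an illustration under stated assumptions, not the plugin's code: the helper name `load_toml_config` is hypothetical, and the sketch assumes the TOML keys use the same names as the trigger arguments.

```python
import os
from pathlib import Path

try:
    import tomllib  # standard library in Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for older Python versions


def load_toml_config(args):
    """Return settings from the file named by config_file_path, or {} if unset.

    Hypothetical helper: resolves the path relative to the PLUGIN_DIR
    environment variable, as the documentation describes.
    """
    relative_path = (args or {}).get("config_file_path")
    if not relative_path:
        return {}
    plugin_dir = os.environ.get("PLUGIN_DIR")
    if not plugin_dir:
        raise RuntimeError("PLUGIN_DIR must be set to use config_file_path")
    config_path = Path(plugin_dir) / relative_path
    with config_path.open("rb") as handle:  # tomllib requires binary mode
        return tomllib.load(handle)
```

Raising an error when `PLUGIN_DIR` is missing is a design choice in this sketch. It surfaces the misconfiguration immediately instead of silently falling back to defaults.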
Software Requirements
- {{% product-name %}}: with the Processing Engine enabled.
- Python packages:
  - psutil (for system metrics collection)
Installation steps
- Start {{% product-name %}} with the Processing Engine enabled (--plugin-dir /path/to/plugins):
  influxdb3 serve \
  --node-id node0 \
  --object-store file \
  --data-dir ~/.influxdb3 \
  --plugin-dir ~/.plugins
- Install required Python packages:
  influxdb3 install package psutil
Trigger setup
Basic Scheduled Trigger
influxdb3 create trigger \
--database system_monitoring \
--path "gh:influxdata/system_metrics/system_metrics.py" \
--trigger-spec "every:30s" \
system_metrics_trigger
Using Configuration File
influxdb3 create trigger \
--database system_monitoring \
--path "gh:influxdata/system_metrics/system_metrics.py" \
--trigger-spec "every:1m" \
--trigger-arguments config_file_path=system_metrics_config_scheduler.toml \
system_metrics_config_trigger
Custom Configuration
influxdb3 create trigger \
--database system_monitoring \
--path "gh:influxdata/system_metrics/system_metrics.py" \
--trigger-spec "every:30s" \
--trigger-arguments hostname=web-server-01,include_disk=false,max_retries=5 \
system_metrics_custom_trigger
Example usage
Monitor Web Server Performance
# Create trigger for web server monitoring every 15 seconds
influxdb3 create trigger \
--database web_monitoring \
--path "gh:influxdata/system_metrics/system_metrics.py" \
--trigger-spec "every:15s" \
--trigger-arguments hostname=web-server-01,include_network=true \
web_server_metrics
Database Server Monitoring
# Focus on CPU and disk metrics for database server
influxdb3 create trigger \
--database db_monitoring \
--path "gh:influxdata/system_metrics/system_metrics.py" \
--trigger-spec "every:30s" \
--trigger-arguments hostname=db-primary,include_disk=true,include_cpu=true,include_network=false \
database_metrics
High-Frequency System Monitoring
# Collect all metrics every 10 seconds with higher retry tolerance
influxdb3 create trigger \
--database system_monitoring \
--path "gh:influxdata/system_metrics/system_metrics.py" \
--trigger-spec "every:10s" \
--trigger-arguments hostname=critical-server,max_retries=10 \
high_freq_metrics
Query collected metrics
This plugin collects system metrics automatically. After the trigger runs, query to view the collected data:
influxdb3 query \
--database system_monitoring \
"SELECT * FROM system_cpu WHERE time >= now() - interval '5 minutes' LIMIT 5"
Expected output
+------+--------+-------+--------+------+--------+-------+--------+-------+-------+------------+------------------+
| host | cpu    | user  | system | idle | iowait | nice  | irq    | load1 | load5 | load15     | time             |
+------+--------+-------+--------+------+--------+-------+--------+-------+-------+------------+------------------+
| srv1 | total  | 12.5  | 5.3    | 81.2 | 0.8    | 0.0   | 0.2    | 0.85  | 0.92  | 0.88       | 2024-01-15 10:00 |
| srv1 | total  | 13.1  | 5.5    | 80.4 | 0.7    | 0.0   | 0.3    | 0.87  | 0.93  | 0.88       | 2024-01-15 10:01 |
| srv1 | total  | 11.8  | 5.1    | 82.0 | 0.9    | 0.0   | 0.2    | 0.83  | 0.91  | 0.88       | 2024-01-15 10:02 |
| srv1 | total  | 14.2  | 5.8    | 79.0 | 0.8    | 0.0   | 0.2    | 0.89  | 0.92  | 0.88       | 2024-01-15 10:03 |
| srv1 | total  | 12.9  | 5.4    | 80.6 | 0.9    | 0.0   | 0.2    | 0.86  | 0.92  | 0.88       | 2024-01-15 10:04 |
+------+--------+-------+--------+------+--------+-------+--------+-------+-------+------------+------------------+
Code overview
Main Functions
process_scheduled_call()
The main entry point for scheduled triggers. Collects system metrics based on configuration and writes them to InfluxDB.
def process_scheduled_call(influxdb3_local, call_time, args):
    # Parse configuration
    config = parse_config(args)
    # Collect metrics based on configuration
    if config['include_cpu']:
        collect_cpu_metrics(influxdb3_local, config['hostname'])
    if config['include_memory']:
        collect_memory_metrics(influxdb3_local, config['hostname'])
    # ... additional metric collections
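The `max_retries` parameter implies that a failing collection step is attempted again rather than aborting the whole run. A minimal sketch of that pattern is shown below. The helper name `run_with_retries`, the `delay_seconds` parameter, and the `log` callback are assumptions made for illustration and do not come from the plugin source.

```python
import time


def run_with_retries(collect, max_retries=3, delay_seconds=0.0, log=print):
    """Call collect() until it succeeds or max_retries attempts are used.

    Hypothetical helper. Returns True on success and False if every attempt
    failed, so one failing collector does not stop the others.
    """
    for attempt in range(1, max_retries + 1):
        try:
            collect()
            return True
        except Exception as error:  # broad on purpose: keep the trigger alive
            log(f"attempt {attempt}/{max_retries} failed: {error}")
            if attempt < max_retries:
                time.sleep(delay_seconds)
    return False
```

Under this sketch, a collector call such as `collect_cpu_metrics(influxdb3_local, config['hostname'])` would be wrapped as `run_with_retries(lambda: collect_cpu_metrics(influxdb3_local, config['hostname']), config['max_retries'])`.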
Measurements and Fields
system_cpu
Overall CPU statistics and metrics:
- Tags: host, cpu=total
- Fields: user, system, idle, iowait, nice, irq, softirq, steal, guest, guest_nice, frequency_current, frequency_min, frequency_max, ctx_switches, interrupts, soft_interrupts, syscalls, load1, load5, load15
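To make the tag and field layout concrete, the sketch below renders a `system_cpu` point as InfluxDB line protocol using only the standard library. It is an illustration, not the plugin's write path; the function names are hypothetical, and the sketch assumes measurement names contain no characters that need escaping.

```python
def escape_key(text):
    """Escape commas, equals signs, and spaces in tag keys, tag values, and field keys."""
    return str(text).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field(value):
    """Render a Python value as a line protocol field value."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement,tags fields [timestamp]."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    tag_part = "".join(
        f",{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_key(k)}={format_field(v)}" for k, v in fields.items()
    )
    line = f"{measurement}{tag_part} {field_part}"
    return line if timestamp_ns is None else f"{line} {timestamp_ns}"


line = to_line_protocol(
    "system_cpu",
    {"host": "srv1", "cpu": "total"},
    {"user": 12.5, "idle": 81.2, "ctx_switches": 1024},
)
```

Tags are sorted by key here because sorted tag sets are generally recommended for line protocol; whether the plugin does the same is not stated in this document.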
system_cpu_cores
Per-core CPU statistics:
- Tags: host, core (core number)
- Fields: usage, user, system, idle, iowait, nice, irq, softirq, steal, guest, guest_nice, frequency_current, frequency_min, frequency_max
system_memory
System memory statistics:
- Tags: host
- Fields: total, available, used, free, active, inactive, buffers, cached, shared, slab, percent
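The `percent` field is easy to misread next to `free` and `available`. Assuming the plugin passes through the value that psutil reports, it is derived from `available` rather than `free`, as in this small sketch (the function name is hypothetical):

```python
def memory_percent(total, available):
    """Percentage of memory in use, following psutil's definition:
    (total - available) / total * 100, rounded to one decimal place."""
    if total <= 0:
        return 0.0  # avoid dividing by zero on platforms that report no total
    return round((total - available) / total * 100, 1)


# 16 GB total with 4 GB available means 75% in use, regardless of "free".
usage = memory_percent(16_000_000_000, 4_000_000_000)
```

Because `available` includes memory that can be reclaimed from caches, `percent` is usually lower than a figure computed from `free` alone.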
system_swap
Swap memory statistics:
- Tags: host
- Fields: total, used, free, percent, sin, sout
system_memory_faults
Memory page fault information (when available):
- Tags: host
- Fields: page_faults, major_faults, minor_faults, rss, vms, dirty, uss, pss
system_disk_usage
Disk partition usage:
- Tags: host, device, mountpoint, fstype
- Fields: total, used, free, percent
system_disk_io
Disk I/O statistics:
- Tags: host, device
- Fields: reads, writes, read_bytes, write_bytes, read_time, write_time, busy_time, read_merged_count, write_merged_count
system_disk_performance
Calculated disk performance metrics:
- Tags: host, device
- Fields: read_bytes_per_sec, write_bytes_per_sec, read_iops, write_iops, avg_read_latency_ms, avg_write_latency_ms, util_percent
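These calculated fields can plausibly be derived from two successive snapshots of the cumulative `system_disk_io` counters. The sketch below shows one way to do that; the formulas, the `DiskCounters` class, and the `disk_performance` function are assumptions for illustration and may differ from what the plugin computes. Times are assumed to be in milliseconds, as psutil reports them.

```python
from dataclasses import dataclass


@dataclass
class DiskCounters:
    """Cumulative counters, named after the system_disk_io fields."""
    reads: int
    writes: int
    read_bytes: int
    write_bytes: int
    read_time: int   # milliseconds spent reading
    write_time: int  # milliseconds spent writing
    busy_time: int   # milliseconds the device was busy


def disk_performance(previous, current, elapsed_seconds):
    """Turn two counter snapshots into per-second rates and averages."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    reads = current.reads - previous.reads
    writes = current.writes - previous.writes
    busy_ms = current.busy_time - previous.busy_time
    return {
        "read_bytes_per_sec": (current.read_bytes - previous.read_bytes) / elapsed_seconds,
        "write_bytes_per_sec": (current.write_bytes - previous.write_bytes) / elapsed_seconds,
        "read_iops": reads / elapsed_seconds,
        "write_iops": writes / elapsed_seconds,
        # Average latency is time spent divided by operations completed.
        "avg_read_latency_ms": (current.read_time - previous.read_time) / reads if reads else 0.0,
        "avg_write_latency_ms": (current.write_time - previous.write_time) / writes if writes else 0.0,
        # Utilization is busy time as a share of wall-clock time, capped at 100.
        "util_percent": min(100.0, busy_ms * 100 / (elapsed_seconds * 1000)),
    }


before = DiskCounters(1000, 500, 10_000_000, 5_000_000, 2000, 1000, 3000)
after = DiskCounters(1300, 650, 13_000_000, 8_000_000, 2600, 1450, 9000)
rates = disk_performance(before, after, elapsed_seconds=30)
```

With a 30-second interval, 300 additional reads give `read_iops` of 10, and 6,000 ms of additional busy time gives `util_percent` of 20.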
system_network
Network interface statistics:
- Tags: host, interface
- Fields: bytes_sent, bytes_recv, packets_sent, packets_recv, errin, errout, dropin, dropout
Troubleshooting
Common issues
Issue: Permission errors for disk I/O metrics
Solution: The plugin will continue collecting other metrics even if some require elevated permissions. Run InfluxDB with appropriate permissions if disk I/O metrics are required.
Issue: Missing psutil library
Solution: Install the psutil package:
influxdb3 install package psutil
Issue: High CPU usage from plugin
Solution: Increase the trigger interval (for example, from every:10s to every:30s). Disable unnecessary metric types. Reduce the number of disk partitions monitored.
Viewing Logs
Logs are stored in the trigger's database in the system.processing_engine_logs table:
influxdb3 query \
--database YOUR_DATABASE \
"SELECT * FROM system.processing_engine_logs WHERE trigger_name = 'system_metrics_trigger' ORDER BY event_time DESC LIMIT 10"
Verifying Data Collection
Check that metrics are being collected:
# List all system metric measurements
influxdb3 query \
--database system_monitoring \
--language influxql \
"SHOW MEASUREMENTS WITH MEASUREMENT =~ /^system_/"
# Check recent CPU metrics
influxdb3 query \
--database system_monitoring \
"SELECT COUNT(*) FROM system_cpu WHERE time >= now() - interval '1 hour'"
Logging
Logs are stored in the _internal database (or the database where the trigger is created) in the system.processing_engine_logs table. To view logs:
influxdb3 query --database _internal "SELECT * FROM system.processing_engine_logs WHERE trigger_name = 'your_trigger_name'"
Log columns:
- event_time: Timestamp of the log event
- trigger_name: Name of the trigger that generated the log
- log_level: Severity level (INFO, WARN, ERROR)
- log_text: Message describing the action or error
Report an issue
For plugin issues, see the Plugins repository issues page.
Find support for {{% product-name %}}
The InfluxDB Discord server is the best place to find support for InfluxDB 3 Core and InfluxDB 3 Enterprise. For other InfluxDB versions, see the Support and feedback options.