Merge pull request #6422 from influxdata:influxdb3-monitor-metrics

Influxdb3 monitor metrics

commit f14c244316

111
TESTING.md
@@ -121,6 +121,117 @@ Potential causes:
# This is ignored
```

### Metrics Endpoint Testing

The metrics testing suite validates InfluxDB 3 Core and Enterprise metrics in two phases:

1. **Phase 1: Direct metrics validation** - Validates metric format, existence, and types by directly querying InfluxDB endpoints
2. **Phase 2: Prometheus integration** - Validates that Prometheus configuration, scraping, and relabeling work as documented

#### Phase 1: Direct Metrics Validation

The `test/influxdb3/metrics_endpoint_test.py` suite validates that InfluxDB 3 metrics endpoints expose all documented metrics in correct Prometheus format.

**Basic Usage:**

```bash
# Using the wrapper script (recommended)
./test/run-metrics-tests.sh

# Direct execution with Docker Compose
docker compose run --rm influxdb3-core-pytest test/influxdb3/metrics_endpoint_test.py

# Run a specific test
docker compose run --rm influxdb3-core-pytest test/influxdb3/metrics_endpoint_test.py -k test_http_grpc_metrics
```

**Verbose Output:**

Set `VERBOSE_METRICS_TEST=true` to see detailed output showing which metrics are searched for and the actual matching lines from the Prometheus endpoint:

```bash
# With the wrapper script
VERBOSE_METRICS_TEST=true ./test/run-metrics-tests.sh

# With Docker Compose
VERBOSE_METRICS_TEST=true docker compose run --rm \
  -e VERBOSE_METRICS_TEST \
  influxdb3-core-pytest \
  test/influxdb3/metrics_endpoint_test.py
```

Example verbose output:

```
TEST: HTTP/gRPC Metrics
================================================================================

✓ Searching for: http_requests_total
  Found 12 total occurrences
  Matches:
    # HELP http_requests_total accumulated total requests
    # TYPE http_requests_total counter
    http_requests_total{method="GET",path="/metrics",status="aborted"} 0
```

#### Phase 2: Prometheus Integration Testing

The `test/influxdb3/prometheus_integration_test.py` suite validates that Prometheus can scrape InfluxDB metrics and that the documented relabeling configuration works correctly.

**What it validates:**
- The Prometheus service discovers InfluxDB targets
- The scrape configuration works with authentication
- Relabeling adds `node_name` and `node_role` labels correctly
- Regex patterns in `relabel_configs` match the documentation
- PromQL queries using relabeled metrics work
- Example queries from the documentation execute successfully

**Basic Usage:**

```bash
# Using the wrapper script (recommended)
./test/run-metrics-tests.sh --prometheus

# Run both direct and Prometheus tests
./test/run-metrics-tests.sh --all

# Direct execution
./test/influxdb3/run-prometheus-tests.sh

# With verbose output
VERBOSE_PROMETHEUS_TEST=true ./test/influxdb3/run-prometheus-tests.sh
```

**What happens during Prometheus tests:**

1. Starts the Prometheus service with the documented configuration from `test/influxdb3/prometheus.yml`
2. Waits for Prometheus to discover and scrape both InfluxDB instances
3. Validates that relabeling adds the `node_name` label (extracted from `__address__`)
4. Validates that relabeling adds the `node_role` label (based on the node name pattern)
5. Tests that PromQL queries can filter by node labels
6. Validates that the example rate() and histogram_quantile() queries work

**Prerequisites:**
- All Phase 1 prerequisites (see below)
- Prometheus service enabled with: `docker compose --profile monitoring up -d`

#### Authentication

Tests require authentication tokens for the InfluxDB 3 instances. Store tokens in:
- `~/.env.influxdb3-core-admin-token` (for Core)
- `~/.env.influxdb3-enterprise-admin-token` (for Enterprise)

Or set environment variables directly:
- `INFLUXDB3_CORE_TOKEN`
- `INFLUXDB3_ENTERPRISE_TOKEN`
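If you keep the tokens in the dotfiles, a small shell sketch can load them into the environment variables the tests expect. This assumes each dotfile contains only the raw token string (a detail not specified above):

```shell
# Hedged sketch: export admin tokens from the dotfiles listed above.
# Assumes each file holds only the raw token string, not a KEY=value line.
if [ -f ~/.env.influxdb3-core-admin-token ]; then
  INFLUXDB3_CORE_TOKEN="$(cat ~/.env.influxdb3-core-admin-token)"
  export INFLUXDB3_CORE_TOKEN
fi
if [ -f ~/.env.influxdb3-enterprise-admin-token ]; then
  INFLUXDB3_ENTERPRISE_TOKEN="$(cat ~/.env.influxdb3-enterprise-admin-token)"
  export INFLUXDB3_ENTERPRISE_TOKEN
fi
```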
#### Prerequisites

- Docker and Docker Compose installed
- Running InfluxDB 3 Core container (`influxdb3-core:8181`)
- Running InfluxDB 3 Enterprise container (`influxdb3-enterprise:8181`)
- Valid authentication tokens
- For Phase 2: Prometheus service (`docker compose --profile monitoring up -d`)

## Link Validation with Link-Checker

Link validation uses the `link-checker` tool to validate internal and external links in documentation files.
117
compose.yaml
@@ -306,8 +306,8 @@ services:
     working_dir: /app
   influxdb3-core:
     container_name: influxdb3-core
-    image: influxdb:3-core
-    pull_policy: always
+    image: influxdb:3.5.0-core-arm64
+    pull_policy: never
     # Set variables (except your auth token) for Core in the .env.3core file.
     env_file:
       - .env.3core

@@ -338,8 +338,8 @@ services:
       - influxdb3-core-admin-token
   influxdb3-enterprise:
     container_name: influxdb3-enterprise
-    image: influxdb:3-enterprise
-    pull_policy: always
+    image: influxdb:3.5.0-enterprise-arm64
+    pull_policy: never
     # Set license email and other variables (except your auth token) for Enterprise in the .env.3ent file.
     env_file:
       - .env.3ent

@@ -369,6 +369,74 @@ services:
         target: /var/lib/influxdb3/plugins/custom
     secrets:
       - influxdb3-enterprise-admin-token
+  influxdb3-enterprise-write:
+    container_name: influxdb3-enterprise-write
+    image: influxdb:3.5.0-enterprise-arm64
+    pull_policy: never
+    # Set license email and other variables (except your auth token) for Enterprise in the .env.3ent file.
+    env_file:
+      - .env.3ent
+    ports:
+      - 8183:8181
+    command:
+      - influxdb3
+      - serve
+      - --node-id=writer-0
+      - --mode=ingest
+      - --cluster-id=cluster0
+      - --object-store=file
+      - --data-dir=/var/lib/influxdb3/data
+      - --plugin-dir=/var/lib/influxdb3/plugins
+      - --log-filter=debug
+      - --verbose
+    environment:
+      - INFLUXDB3_AUTH_TOKEN=/run/secrets/influxdb3-enterprise-admin-token
+    volumes:
+      - type: bind
+        source: test/.influxdb3/enterprise/data
+        target: /var/lib/influxdb3/data
+      - type: bind
+        source: test/.influxdb3/plugins/influxdata
+        target: /var/lib/influxdb3/plugins
+      - type: bind
+        source: test/.influxdb3/enterprise/plugins
+        target: /var/lib/influxdb3/plugins/custom
+    secrets:
+      - influxdb3-enterprise-admin-token
+  influxdb3-enterprise-query:
+    container_name: influxdb3-enterprise-query
+    image: influxdb:3.5.0-enterprise-arm64
+    pull_policy: never
+    # Set license email and other variables (except your auth token) for Enterprise in the .env.3ent file.
+    env_file:
+      - .env.3ent
+    ports:
+      - 8184:8181
+    command:
+      - influxdb3
+      - serve
+      - --node-id=querier-0
+      - --mode=query
+      - --cluster-id=cluster0
+      - --object-store=file
+      - --data-dir=/var/lib/influxdb3/data
+      - --plugin-dir=/var/lib/influxdb3/plugins
+      - --log-filter=debug
+      - --verbose
+    environment:
+      - INFLUXDB3_AUTH_TOKEN=/run/secrets/influxdb3-enterprise-admin-token
+    volumes:
+      - type: bind
+        source: test/.influxdb3/enterprise/data
+        target: /var/lib/influxdb3/data
+      - type: bind
+        source: test/.influxdb3/plugins/influxdata
+        target: /var/lib/influxdb3/plugins
+      - type: bind
+        source: test/.influxdb3/enterprise/plugins
+        target: /var/lib/influxdb3/plugins/custom
+    secrets:
+      - influxdb3-enterprise-admin-token
   telegraf-pytest:
     container_name: telegraf-pytest
     image: influxdata/docs-pytest

@@ -499,7 +567,7 @@ services:
   remark-lint:
     container_name: remark-lint
     build:
-      context: .
+      context: .
       dockerfile: .ci/Dockerfile.remark
     profiles:
       - lint

@@ -510,6 +578,39 @@ services:
       - type: bind
         source: ./CONTRIBUTING.md
         target: /app/CONTRIBUTING.md
+  prometheus:
+    container_name: prometheus
+    image: prom/prometheus:latest
+    ports:
+      - "9090:9090"
+    environment:
+      - INFLUXDB3_CORE_TOKEN=${INFLUXDB3_CORE_TOKEN}
+      - INFLUXDB3_ENTERPRISE_TOKEN=${INFLUXDB3_ENTERPRISE_TOKEN}
+    volumes:
+      - type: bind
+        source: ./test/influxdb3/prometheus.yml
+        target: /etc/prometheus/prometheus.yml
+        read_only: true
+      - type: volume
+        source: prometheus-data
+        target: /prometheus
+    entrypoint:
+      - /bin/sh
+      - -c
+      - |
+        echo "$$INFLUXDB3_CORE_TOKEN" > /tmp/core-token
+        echo "$$INFLUXDB3_ENTERPRISE_TOKEN" > /tmp/enterprise-token
+        exec /bin/prometheus \
+          --config.file=/etc/prometheus/prometheus.yml \
+          --storage.tsdb.path=/prometheus \
+          --web.console.libraries=/usr/share/prometheus/console_libraries \
+          --web.console.templates=/usr/share/prometheus/consoles \
+          --web.enable-lifecycle
+    depends_on:
+      - influxdb3-core
+      - influxdb3-enterprise
+    profiles:
+      - monitoring
 volumes:
   test-content:
   cloud-tmp:

@@ -517,4 +618,8 @@ volumes:
   cloud-serverless-tmp:
   clustered-tmp:
   telegraf-tmp:
   v2-tmp:
+  influxdb3-core-tmp:
+  influxdb2-data:
+  influxdb2-config:
+  prometheus-data:
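The `$$` in the Prometheus entrypoint above is Compose's escape for a literal `$`, so the token variables expand inside the container's shell rather than at compose time. A self-contained sketch of the resulting behavior (the token value here is illustrative, not real):

```shell
# Compose passes "$$INFLUXDB3_CORE_TOKEN" to the container shell as
# "$INFLUXDB3_CORE_TOKEN", so the token is written to a file at startup.
INFLUXDB3_CORE_TOKEN="example-token"   # illustrative value
echo "$INFLUXDB3_CORE_TOKEN" > /tmp/core-token
cat /tmp/core-token
# prints: example-token
```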
@@ -0,0 +1,24 @@
---
title: Monitor metrics
seotitle: Monitor InfluxDB 3 Core metrics
description: >
  Access and understand Prometheus-format metrics exposed by {{< product-name >}}
  to monitor system performance, resource usage, and operational health.
menu:
  influxdb3_core:
    parent: Administer InfluxDB
    name: Monitor metrics
weight: 110
influxdb3/core/tags: [monitoring, metrics, prometheus, observability, operations]
related:
  - /influxdb3/core/reference/internals/runtime-architecture/
  - /influxdb3/core/admin/performance-tuning/
  - /influxdb3/core/plugins/library/, InfluxDB 3 Core plugins
  - /influxdb3/core/write-data/use-telegraf/
  - /influxdb3/core/reference/telemetry/
source: /shared/influxdb3-admin/monitor-metrics.md
---

<!--
//SOURCE - content/shared/influxdb3-admin/monitor-metrics.md
-->
@@ -0,0 +1,22 @@
---
title: Metrics
seotitle: InfluxDB 3 Core metrics reference
description: >
  Reference for the Prometheus-format metrics exposed by InfluxDB 3 Core,
  including descriptions, types, and labels for monitoring and observability.
menu:
  influxdb3_core:
    parent: Reference
weight: 106
influxdb3/core/tags: [metrics, prometheus, monitoring, reference, observability]
related:
  - /influxdb3/core/admin/monitor-metrics/
  - /influxdb3/core/reference/telemetry/
  - /influxdb3/core/reference/internals/runtime-architecture/
source: /shared/influxdb3-reference/metrics.md
---

<!--
The content of this file is located at
//SOURCE - content/shared/influxdb3-reference/metrics.md
-->
@@ -9,7 +9,7 @@ description: >
 menu:
   influxdb3_enterprise:
     parent: Administer InfluxDB
-weight: 105
+weight: 106
 influxdb3/enterprise/tags: [cache]
 related:
   - /influxdb3/enterprise/reference/sql/functions/cache/#last_cache, last_cache SQL function
@@ -0,0 +1,26 @@
---
title: Monitor metrics
seotitle: Monitor {{< product-name >}} metrics
description: >
  Access and understand Prometheus-format metrics exposed by {{< product-name >}}
  to monitor distributed cluster performance, resource usage, and operational health.
menu:
  influxdb3_enterprise:
    parent: Administer InfluxDB
    name: Monitor metrics
weight: 110
influxdb3/enterprise/tags: [monitoring, metrics, prometheus, observability, operations, clustering]
related:
  - /influxdb3/enterprise/admin/clustering/
  - /influxdb3/enterprise/reference/internals/runtime-architecture/
  - /influxdb3/enterprise/admin/performance-tuning/
  - /influxdb3/enterprise/plugins/library/, InfluxDB 3 Enterprise plugins
  - /influxdb3/enterprise/write-data/use-telegraf/
  - /influxdb3/enterprise/reference/telemetry/
source: /shared/influxdb3-admin/monitor-metrics.md
---

<!--
The content of this file is located at
//SOURCE - content/shared/influxdb3-admin/monitor-metrics.md
-->
@@ -402,12 +402,24 @@ In your terminal, run the `influxdb3 create token --permission` command and prov
 - `health`: The specific system resource to grant permissions to.
 - `read`: The permission to grant to the token (system tokens are always read-only).
 
-{{% code-placeholders "System health token|1y" %}}
+The following example shows how to create specific system tokens:
+
+{{% code-placeholders "(System [a-z]*\s?token)|1y" %}}
 ```bash
 influxdb3 create token \
   --permission "system:health:read" \
   --name "System health token" \
   --expiry 1y
+
+influxdb3 create token \
+  --permission "system:metrics:read" \
+  --name "System metrics token" \
+  --expiry 1y
+
+influxdb3 create token \
+  --permission "system:ping:read" \
+  --name "System ping token" \
+  --expiry 1y
 ```
 {{% /code-placeholders %}}

@@ -444,9 +456,9 @@ In the request body, provide the following parameters:
 - `permissions`: an array of token permission actions (only `"read"` for system tokens)
 - `expiry_secs`: Specify the token expiration time in seconds.
 
-The following example shows how to use the HTTP API to create a system token:
+The following example shows how to use the HTTP API to create specific system tokens:
 
-{{% code-placeholders "AUTH_TOKEN|System health token|300000" %}}
+{{% code-placeholders "AUTH_TOKEN|(System [a-z]*\s?token)|1y|300000" %}}
 
 ```bash
 curl \

@@ -463,6 +475,36 @@ curl \
   }],
   "expiry_secs": 300000
 }'
+
+curl \
+  "http://{{< influxdb/host >}}/api/v3/enterprise/configure/token" \
+  --header 'Accept: application/json' \
+  --header 'Content-Type: application/json' \
+  --header "Authorization: Bearer AUTH_TOKEN" \
+  --data '{
+    "token_name": "System metrics token",
+    "permissions": [{
+      "resource_type": "system",
+      "resource_identifier": ["metrics"],
+      "actions": ["read"]
+    }],
+    "expiry_secs": 300000
+  }'
+
+curl \
+  "http://{{< influxdb/host >}}/api/v3/enterprise/configure/token" \
+  --header 'Accept: application/json' \
+  --header 'Content-Type: application/json' \
+  --header "Authorization: Bearer AUTH_TOKEN" \
+  --data '{
+    "token_name": "System ping token",
+    "permissions": [{
+      "resource_type": "system",
+      "resource_identifier": ["ping"],
+      "actions": ["read"]
+    }],
+    "expiry_secs": 300000
+  }'
 ```
 {{% /code-placeholders %}}
@@ -12,4 +12,4 @@ related:
 source: /shared/influxdb3-plugins/plugins-library/official/system-metrics.md
 ---
 
-<!-- //SOURCE - content/shared/influxdb3-plugins/plugins-library/official/system-metrics.md -->
+<!-- //SOURCE - content/shared/influxdb3-plugins/plugins-library/official/system-metrics.md -->
@@ -0,0 +1,23 @@
---
title: Metrics
seotitle: InfluxDB 3 Enterprise metrics reference
description: >
  Reference for the Prometheus-format metrics exposed by InfluxDB 3 Enterprise,
  including descriptions, types, and labels for monitoring and observability.
menu:
  influxdb3_enterprise:
    parent: Reference
weight: 106
influxdb3/enterprise/tags: [metrics, prometheus, monitoring, reference, observability, clustering]
related:
  - /influxdb3/enterprise/admin/monitor-metrics/
  - /influxdb3/enterprise/admin/clustering/
  - /influxdb3/enterprise/reference/telemetry/
  - /influxdb3/enterprise/reference/internals/runtime-architecture/
source: /shared/influxdb3-reference/metrics.md
---

<!--
The content of this file is located at
//SOURCE - content/shared/influxdb3-reference/metrics.md
-->
@@ -0,0 +1,823 @@
Use InfluxDB metrics to monitor {{% show-in "enterprise" %}}distributed cluster {{% /show-in %}}system performance, resource usage, and operational health
with monitoring tools like Prometheus, Grafana, or other observability platforms.

{{% show-in "core" %}}
- [Access metrics](#access-metrics)
- [Metric categories](#metric-categories)
- [Key metrics for monitoring](#key-metrics-for-monitoring)
- [Example monitoring queries](#example-monitoring-queries)
- [Integration with monitoring tools](#integration-with-monitoring-tools)
- [Best practices](#best-practices)
{{% /show-in %}}

{{% show-in "enterprise" %}}
- [Access metrics](#access-metrics)
- [Metric categories](#metric-categories)
- [Cluster-specific metrics](#cluster-specific-metrics)
- [Node-specific monitoring](#node-specific-monitoring)
- [Example monitoring queries](#example-monitoring-queries)
- [Distributed monitoring setup](#distributed-monitoring-setup)
- [Best practices](#best-practices)
{{% /show-in %}}

## Access metrics

An {{< product-name >}} node exposes metrics at the `/metrics` endpoint on the HTTP port (default: 8181).

{{% api-endpoint method="GET" endpoint="http://localhost:8181/metrics" api-ref="/influxdb3/version/api/v3/#operation/GetMetrics" %}}

{{% show-in "core" %}}
### View metrics

```bash
# View all metrics
curl -s http://{{< influxdb/host >}}/metrics

# View specific metric patterns
curl -s http://{{< influxdb/host >}}/metrics | grep 'http_requests_total'
curl -s http://{{< influxdb/host >}}/metrics | grep 'influxdb3_'

# View metrics with authentication (if required)
curl -s -H "Authorization: Token AUTH_TOKEN" http://node:8181/metrics
```
{{% /show-in %}}

{{% show-in "enterprise" %}}
### View metrics from specific nodes

> [!Note]
> {{< product-name >}} supports two token types for the `/metrics` endpoint:
> - {{% token-link "Admin" %}}: Full access to all metrics
> - {{% token-link "Fine-grained" "resource/" %}} with `system:metrics:read` permission: Read-only access to metrics

```bash { placeholders="AUTH_TOKEN" }
# View metrics from specific nodes
curl -s http://ingester-01:8181/metrics
curl -s http://query-01:8181/metrics
curl -s http://compactor-01:8181/metrics

# View metrics with authentication (if required)
curl -s -H "Authorization: Token AUTH_TOKEN" http://node:8181/metrics
```
{{% /show-in %}}

Replace {{% code-placeholder-key %}}`AUTH_TOKEN`{{% /code-placeholder-key %}} with your {{< product-name >}} {{% token-link %}} that has read access to the `/metrics` endpoint.

{{% show-in "enterprise" %}}
### Aggregate metrics across the cluster

```bash
# Get metrics from all nodes in the cluster
for node in ingester-01 query-01 compactor-01; do
  echo "=== Node: $node ==="
  curl -s http://$node:8181/metrics | grep 'http_requests_total.*status="ok"'
done
```
{{% /show-in %}}

### Metrics format

InfluxDB exposes metrics in the [Prometheus exposition format](https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format), a format supported by many tools, including [Telegraf](#collect-metrics-with-telegraf).
Each metric follows this structure:

```
# HELP metric_name Description of the metric
# TYPE metric_name counter|gauge|histogram
metric_name{label1="value1",label2="value2"} 42.0
```
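As a quick illustration of the structure above, the `# TYPE` lines can be pulled out of any exposition-format payload with standard text tools. This sketch hardcodes sample input rather than fetching a live `/metrics` endpoint:

```shell
# Extract metric names and types from Prometheus exposition-format text.
metrics='# HELP http_requests_total accumulated total requests
# TYPE http_requests_total counter
http_requests_total{method="GET",path="/metrics",status="ok"} 42'

# On "# TYPE <name> <type>" lines, field 3 is the name and field 4 the type.
printf '%s\n' "$metrics" | awk '/^# TYPE/ { print $3, $4 }'
# prints: http_requests_total counter
```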
### Node identification in metrics

{{< product-name >}} metrics don't include automatic node identification labels.
To identify which node produced each metric, you must configure your monitoring tool to add node labels during scraping.

{{% show-in "enterprise" %}}
For multi-node clusters, node identification is essential for troubleshooting and monitoring individual node performance.
Many monitoring tools support adding static or dynamic labels during the scrape process (Prometheus calls this "relabeling"). For example:

1. **Extract the node hostname or IP** from the scrape target address
2. **Add the extracted labels to metrics** during the scrape process

| Hostname                   | Node identification | Recommended?                           |
| -------------------------- | ------------------- | -------------------------------------- |
| `ingester-01`, `query-02`  | Extract role and ID | Yes                                    |
| `node-01.cluster.internal` | Extract ID          | Consider adding role information       |
| `192.168.1.10`             | IP address only     | No, consider renaming with ID and role |

For configuration examples, see [Add node identification with Prometheus](#add-node-identification-with-prometheus).
{{% /show-in %}}

## Metric categories

{{< product-name >}} exposes the following{{% show-in "enterprise" %}} base{{% /show-in %}} categories of metrics{{% show-in "enterprise" %}}, plus additional cluster-aware metrics{{% /show-in %}}:

{{% show-in "enterprise" %}}
> [!Note]
> #### Metrics reporting across node modes
> All nodes in an {{< product-name >}} cluster report the same set of metrics regardless of their configured [mode](/influxdb3/enterprise/reference/config-options/#mode) (ingest, query, compact, process, or all).
> The difference between nodes is in the metric _values_ and labels, which reflect the actual activity on each node.
> For example, an ingest-only node reports query-related metrics with minimal or zero values.
{{% /show-in %}}

### HTTP and gRPC metrics

Monitor API request patterns{{% show-in "enterprise" %}} across the cluster{{% /show-in %}}:

- **`http_requests_total`**: Total HTTP requests by method, path, and status{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`http_request_duration_seconds`**: HTTP request latency distribution{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`http_response_body_size_bytes`**: HTTP response size distribution
- **`grpc_requests_total`**: Total gRPC requests{{% show-in "enterprise" %}} for inter-node communication{{% /show-in %}}
- **`grpc_request_duration_seconds`**: gRPC request latency distribution

> [!Note]
> Monitor all write endpoints (`/api/v3/write_lp`, `/api/v2/write`, `/write`) and query endpoints (`/api/v3/query_sql`, `/api/v3/query_influxql`, `/query`) for comprehensive request tracking.

### Database operations

Monitor database{{% show-in "enterprise" %}}-specific and distributed cluster{{% /show-in %}} operations:

- **`influxdb3_catalog_operations_total`**: Catalog operations by type (create_database, create_admin_token, etc.){{% show-in "enterprise" %}} across the cluster{{% /show-in %}}
- **`influxdb3_catalog_operation_retries_total`**: Failed catalog operations that required retries{{% show-in "enterprise" %}} due to conflicts between nodes{{% /show-in %}}

{{% show-in "enterprise" %}}
### Node specialization metrics

Different metrics are more relevant depending on the node's [mode configuration](/influxdb3/version/admin/clustering/#configure-node-modes):

#### Ingest nodes (mode: ingest)
- **`http_requests_total{path=~"/api/v3/write_lp|/api/v2/write|/write"}`**: Write request volume (all endpoints)
- **`object_store_transfer_bytes_total`**: WAL-to-Parquet snapshot activity
- **`datafusion_mem_pool_bytes`**: Memory usage for snapshot operations

#### Query nodes (mode: query)
- **`influxdb_iox_query_log_*`**: Query execution performance
- **`influxdb3_parquet_cache_*`**: Cache performance for query acceleration
- **`http_requests_total{path=~"/api/v3/query_sql|/api/v3/query_influxql|/query"}`**: Query request patterns (all endpoints)

#### Compactor nodes (mode: compact)
- **`object_store_op_duration_seconds`**: Compaction operation performance
- **`object_store_transfer_*`**: File consolidation activity

#### Process nodes (mode: process)
- **`tokio_runtime_*`**: Plugin execution runtime metrics
- Custom plugin metrics (varies by installed plugins)
{{% /show-in %}}

### Memory and caching

Monitor memory usage{{% show-in "enterprise" %}} across specialized nodes{{% /show-in %}}:

- **`datafusion_mem_pool_bytes`**: DataFusion memory pool{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`influxdb3_parquet_cache_access_total`**: Parquet cache hits, misses, and fetch status{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **`influxdb3_parquet_cache_size_bytes`**: Current size of the in-memory Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **`influxdb3_parquet_cache_size_number_of_files`**: Number of files in the Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **`jemalloc_memstats_bytes`**: Memory allocation statistics{{% show-in "enterprise" %}} per node{{% /show-in %}}

### Query performance

Monitor{{% show-in "enterprise" %}} distributed{{% /show-in %}} query execution{{% show-in "enterprise" %}} and performance{{% /show-in %}}:

- **`influxdb_iox_query_log_*`**: Comprehensive query execution metrics, including:
  - `compute_duration_seconds`: CPU time spent on computation
  - `execute_duration_seconds`: Total query execution time
  - `plan_duration_seconds`: Time spent planning queries
  - `end2end_duration_seconds`: Complete query duration from request to response
  - `max_memory`: Peak memory usage per query
  - `parquet_files`: Number of Parquet files accessed
  - `partitions`: Number of partitions processed

{{% show-in "enterprise" %}}
- **`influxdb_iox_query_log_ingester_latency_*`**: Inter-node query coordination latency
- **`influxdb_iox_query_log_ingester_partition_count`**: Data distribution across nodes
- **`influxdb_iox_query_log_parquet_files`**: File access patterns per query
{{% /show-in %}}

### Object storage

Monitor{{% show-in "enterprise" %}} shared{{% /show-in %}} object store operations and performance:

- **`object_store_op_duration_seconds`**: Object store operation latency{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`object_store_transfer_bytes_total`**: Cumulative bytes transferred{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`object_store_transfer_objects_total`**: Cumulative objects transferred{{% show-in "enterprise" %}} per node{{% /show-in %}}

### Runtime and system

Monitor runtime health and resource usage:

- **`process_start_time_seconds`**: Process start time
- **`thread_panic_count_total`**: Thread panic occurrences
- **`query_datafusion_query_execution_ooms_total`**: Out-of-memory events in the query engine
- **`tokio_runtime_*`**: Async runtime metrics (task scheduling, worker threads, queue depths)

{{% show-in "enterprise" %}}
## Cluster-specific metrics

### Load distribution

Monitor workload distribution across nodes:

```bash
# Write load across ingest nodes (all write endpoints)
for node in ingester-01 ingester-02; do
  echo "Node $node:"
  curl -s http://$node:8181/metrics | grep 'http_requests_total.*\(api/v3/write_lp\|api/v2/write\|/write\).*status="ok"'
done

# Query load across query nodes
for node in query-01 query-02; do
  echo "Node $node:"
  curl -s http://$node:8181/metrics | grep 'influxdb_iox_query_log_execute_duration_seconds_count'
done
```

## Node-specific monitoring

### Monitor ingest node health

Monitor data ingestion performance:

```bash
# Ingest throughput (all write endpoints)
curl -s http://ingester-01:8181/metrics | grep 'http_requests_total.*\(api/v3/write_lp\|api/v2/write\|/write\)'

# Snapshot creation activity
curl -s http://ingester-01:8181/metrics | grep 'object_store_transfer_bytes_total.*put'

# Memory pressure
curl -s http://ingester-01:8181/metrics | grep 'datafusion_mem_pool_bytes'
```

### Monitor query node performance

Monitor query execution:

```bash
# Query latency
curl -s http://query-01:8181/metrics | grep 'influxdb_iox_query_log_execute_duration_seconds'

# Cache effectiveness
curl -s http://query-01:8181/metrics | grep 'influxdb3_parquet_cache_access_total'

# Inter-node coordination time
curl -s http://query-01:8181/metrics | grep 'influxdb_iox_query_log_ingester_latency'
```

### Monitor compactor node activity

Monitor data optimization:

```bash
# Compaction operations
curl -s http://compactor-01:8181/metrics | grep 'object_store_op_duration_seconds.*put'

# File processing volume
curl -s http://compactor-01:8181/metrics | grep 'object_store_transfer_objects_total'
```
{{% /show-in %}}

{{% show-in "core" %}}
## Key metrics for monitoring

### Write throughput

Monitor data ingestion:

```bash
# HTTP requests to write endpoints (all endpoints)
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*\(api/v3/write_lp\|api/v2/write\|/write\)'

# Object store writes (Parquet file creation)
curl -s http://localhost:8181/metrics | grep 'object_store_transfer.*total.*put'
```

### Query performance

Monitor query execution:

```bash
# Query latency percentiles
curl -s http://localhost:8181/metrics | grep 'influxdb_iox_query_log_execute_duration_seconds'

# Query memory usage
curl -s http://localhost:8181/metrics | grep 'influxdb_iox_query_log_max_memory'

# Query errors and failures
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*status="server_error"'
```

### Resource utilization

Monitor system resources:

```bash
# Memory pool usage
curl -s http://localhost:8181/metrics | grep 'datafusion_mem_pool_bytes'

# Cache efficiency
curl -s http://localhost:8181/metrics | grep 'influxdb3_parquet_cache_access_total'

# Runtime task health
curl -s http://localhost:8181/metrics | grep 'tokio_runtime_num_alive_tasks'
```
|
||||
### Error rates
|
||||
|
||||
Monitor system health:
|
||||
|
||||
```bash
|
||||
# HTTP error rates
|
||||
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*status="client_error"\|http_requests_total.*status="server_error"'
|
||||
|
||||
# Thread panics
|
||||
curl -s http://localhost:8181/metrics | grep 'thread_panic_count_total'
|
||||
|
||||
# Query OOMs
|
||||
curl -s http://localhost:8181/metrics | grep 'query_datafusion_query_execution_ooms_total'
|
||||
```
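
To turn the raw counters above into a single error-rate figure, a short `awk` pass over the endpoint output is enough. This is a minimal sketch, not part of the InfluxDB tooling; the inline `metrics` variable with hypothetical values stands in for the output of `curl -s http://localhost:8181/metrics`:

```shell
# Hypothetical sample standing in for the /metrics response.
metrics='http_requests_total{method="POST",path="/write",status="ok"} 90
http_requests_total{method="POST",path="/write",status="server_error"} 10'

# Sum all http_requests_total samples and the server_error subset,
# then print the server-error percentage.
printf '%s\n' "$metrics" | awk '
  /^http_requests_total\{/ {
    total += $2
    if ($0 ~ /status="server_error"/) errors += $2
  }
  END { if (total > 0) printf "server error rate: %.2f%%\n", 100 * errors / total }
'
```

Because `http_requests_total` is a cumulative counter, this gives the error rate since process start; use a PromQL `rate()` expression for a windowed rate.
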
{{% /show-in %}}

## Example monitoring queries

### Prometheus queries{{% show-in "enterprise" %}} for clusters{{% /show-in %}}

Use these queries in Prometheus or Grafana dashboards:

{{% show-in "enterprise" %}}
#### Cluster-wide request rate

```promql
# Total requests per second across all nodes
sum(rate(http_requests_total[5m])) by (instance)

# Write requests per second by ingest node (all write endpoints)
sum(rate(http_requests_total{path=~"/api/v3/write_lp|/api/v2/write|/write"}[5m])) by (instance)
```

#### Query performance across nodes

```promql
# 95th percentile query latency by query node
histogram_quantile(0.95,
  sum(rate(influxdb_iox_query_log_execute_duration_seconds_bucket[5m])) by (instance, le)
)

# Average inter-node coordination time
avg(rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_sum[5m]) /
  rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_count[5m])) by (instance)
```

#### Load balancing effectiveness

```promql
# Request distribution balance (coefficient of variation)
stddev(sum(rate(http_requests_total[5m])) by (instance)) /
avg(sum(rate(http_requests_total[5m])) by (instance))

# Cache hit rate by query node
sum(rate(influxdb3_parquet_cache_access_total{status="cached"}[5m])) by (instance) /
sum(rate(influxdb3_parquet_cache_access_total[5m])) by (instance)
```

#### Cluster health indicators

```promql
# Node availability (any recent metrics)
up{job="influxdb3-enterprise"}

# Catalog operation conflicts
rate(influxdb3_catalog_operation_retries_total[5m])

# Cross-node error rates
sum(rate(http_requests_total{status=~"server_error|client_error"}[5m])) by (instance, status)
```
{{% /show-in %}}

{{% show-in "core" %}}
#### Request rate

```promql
# Requests per second
rate(http_requests_total[5m])

# Error rate percentage
rate(http_requests_total{status=~"client_error|server_error"}[5m]) / rate(http_requests_total[5m]) * 100
```

#### Query performance

```promql
# 95th percentile query latency
histogram_quantile(0.95, rate(influxdb_iox_query_log_execute_duration_seconds_bucket[5m]))

# Average query memory usage
rate(influxdb_iox_query_log_max_memory_sum[5m]) / rate(influxdb_iox_query_log_max_memory_count[5m])
```

#### Cache performance

```promql
# Cache hit rate
rate(influxdb3_parquet_cache_access_total{status="cached"}[5m]) / rate(influxdb3_parquet_cache_access_total[5m]) * 100

# Cache size in MB
influxdb3_parquet_cache_size_bytes / 1024 / 1024
```

#### Object store throughput

```promql
# Bytes per second to object store
rate(object_store_transfer_bytes_total[5m])

# Objects per second to object store
rate(object_store_transfer_objects_total[5m])
```
{{% /show-in %}}

{{% show-in "enterprise" %}}
## Distributed monitoring setup

### Collect metrics with Telegraf

Use Telegraf to collect metrics from all cluster nodes and store them in a separate {{< product-name >}} instance for centralized monitoring.

#### Configure Telegraf

Create a Telegraf configuration file (`telegraf.conf`) to scrape metrics from your cluster nodes:

```toml { placeholders="MONITORING_AUTH_TOKEN|INGESTER_AUTH_TOKEN|QUERY_AUTH_TOKEN|COMPACTOR_AUTH_TOKEN" }
# Telegraf configuration for InfluxDB 3 Enterprise monitoring

# Output to monitoring InfluxDB instance
[[outputs.influxdb_v2]]
  urls = ["http://monitoring-influxdb:8181"]
  token = "MONITORING_AUTH_TOKEN"
  organization = ""
  bucket = "monitoring"

# Scrape metrics from ingest nodes
[[inputs.prometheus]]
  urls = [
    "http://ingester-01:8181/metrics",
    "http://ingester-02:8181/metrics"
  ]
  metric_version = 2

  # Authentication for metrics endpoint
  [inputs.prometheus.headers]
    Authorization = "Token INGESTER_AUTH_TOKEN"

  [inputs.prometheus.tags]
    cluster = "production"

# Scrape metrics from query nodes
[[inputs.prometheus]]
  urls = [
    "http://query-01:8181/metrics",
    "http://query-02:8181/metrics"
  ]
  metric_version = 2

  [inputs.prometheus.headers]
    Authorization = "Token QUERY_AUTH_TOKEN"

  [inputs.prometheus.tags]
    cluster = "production"

# Scrape metrics from compactor nodes
[[inputs.prometheus]]
  urls = ["http://compactor-01:8181/metrics"]
  metric_version = 2

  [inputs.prometheus.headers]
    Authorization = "Token COMPACTOR_AUTH_TOKEN"

  [inputs.prometheus.tags]
    cluster = "production"

# Extract node name and role from URL
[[processors.regex]]
  namepass = ["*"]

  [[processors.regex.tags]]
    key = "url"
    pattern = "^http://([^:]+):.*"
    replacement = "${1}"
    result_key = "node_name"

  [[processors.regex.tags]]
    key = "node_name"
    pattern = "^(ingester|query|compactor|processor)-.*"
    replacement = "${1}"
    result_key = "node_role"
```

Replace the following:

- {{% code-placeholder-key %}}`MONITORING_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} for the monitoring InfluxDB instance
- {{% code-placeholder-key %}}`INGESTER_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} with `system:metrics:read` permission for the ingest nodes
- {{% code-placeholder-key %}}`QUERY_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} with `system:metrics:read` permission for the query nodes
- {{% code-placeholder-key %}}`COMPACTOR_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} with `system:metrics:read` permission for the compactor node

#### Start Telegraf

```bash
# Start Telegraf with the configuration
telegraf --config telegraf.conf

# Run as a service (systemd example)
sudo systemctl start telegraf
sudo systemctl enable telegraf
```

#### Query collected metrics

Query the monitoring database using SQL:

```sql
-- Request rate by node
SELECT
  node_name,
  node_role,
  COUNT(*) as request_count
FROM http_requests_total
WHERE time >= now() - INTERVAL '5 minutes'
GROUP BY node_name, node_role
ORDER BY request_count DESC;

-- Query latency percentiles by node
SELECT
  node_name,
  APPROX_PERCENTILE_CONT(value, 0.95) as p95_latency_seconds
FROM http_request_duration_seconds
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY node_name;
```

<!--TODO - Add example Grafana dashboards
### Grafana dashboards

Create role-specific dashboards with the following suggested metrics for each dashboard:

#### Cluster Overview Dashboard
- Node status and availability
- Request rates across all nodes
- Error rates by node and operation type
- Resource utilization summary

#### Ingest Performance Dashboard
- Write throughput by ingest node
- Snapshot creation rates
- Memory usage and pressure
- WAL-to-Parquet conversion metrics

#### Query Performance Dashboard
- Query latency percentiles by query node
- Cache hit rates and efficiency
- Inter-node coordination times
- Memory usage during query execution

#### Operations Dashboard
- Compaction progress and performance
- Object store operation success rates
- Processing engine trigger rates
- System health indicators

-->

<!--TODO - Use the processing engine for alerting

### Alerting for clusters

Set up cluster-aware alerting rules:

```yaml
# Sample Prometheus alerting rules
groups:
  - name: influxdb3_enterprise_cluster
    rules:
      - alert: NodeDown
        expr: up{job="influxdb3-enterprise"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "InfluxDB 3 Enterprise node {{ $labels.instance }} is down"

      - alert: HighCatalogConflicts
        expr: rate(influxdb3_catalog_operation_retries_total[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High catalog operation conflicts in cluster"

      - alert: UnbalancedLoad
        expr: |
          (
            stddev(sum(rate(http_requests_total[5m])) by (instance)) /
            avg(sum(rate(http_requests_total[5m])) by (instance))
          ) > 0.5
        for: 10m
        labels:
          severity: info
        annotations:
          summary: "Unbalanced load distribution across cluster nodes"

      - alert: SlowInterNodeCommunication
        expr: |
          avg(rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_sum[5m]) /
              rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_count[5m])) > 1.0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Slow inter-node communication detected"
```
-->

### Add node identification with Prometheus

If using Prometheus instead of Telegraf, add node identification through _relabeling_:

```yaml
# prometheus.yml
scrape_configs:
  - job_name: 'influxdb3-enterprise'
    static_configs:
      - targets:
        - 'ingester-01:8181'
        - 'query-01:8181'
    relabel_configs:
      # Extract node name from address
      - source_labels: [__address__]
        target_label: node_name
        regex: '([^:]+):.*'
        replacement: '${1}'

      # Assign node role based on hostname pattern
      - source_labels: [node_name]
        target_label: node_role
        regex: 'ingester-.*'
        replacement: 'ingest'
      - source_labels: [node_name]
        target_label: node_role
        regex: 'query-.*'
        replacement: 'query'
      - source_labels: [node_name]
        target_label: node_role
        regex: 'compactor-.*'
        replacement: 'compact'
```
{{% /show-in %}}

{{% show-in "core" %}}
## Integration with monitoring tools

### Collect metrics with Telegraf

Use Telegraf to collect metrics and store them in a separate {{< product-name >}} instance for monitoring.

#### Configure Telegraf

Create a Telegraf configuration file (`telegraf.conf`):

```toml { placeholders="MONITORING_AUTH_TOKEN|AUTH_TOKEN" }
# Telegraf configuration for InfluxDB 3 Core monitoring

# Output to monitoring InfluxDB instance
[[outputs.influxdb_v2]]
  urls = ["http://monitoring-influxdb:8181"]
  token = "MONITORING_AUTH_TOKEN"
  organization = ""
  bucket = "monitoring"

# Scrape metrics from InfluxDB 3 Core
[[inputs.prometheus]]
  urls = ["http://localhost:8181/metrics"]
  metric_version = 2

  # Authentication for metrics endpoint (if required)
  # [inputs.prometheus.headers]
  #   Authorization = "Token AUTH_TOKEN"
```

Replace the following:

- {{% code-placeholder-key %}}`MONITORING_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} for the monitoring InfluxDB instance
- {{% code-placeholder-key %}}`AUTH_TOKEN`{{% /code-placeholder-key %}} (if uncommented): your {{% token-link %}} for accessing the `/metrics` endpoint

#### Start Telegraf

```bash
# Start Telegraf with the configuration
telegraf --config telegraf.conf

# Run as a service (systemd example)
sudo systemctl start telegraf
sudo systemctl enable telegraf
```

#### Query collected metrics

Query the monitoring database using SQL:

```sql
-- Request rate over time
SELECT
  date_bin(INTERVAL '5 minutes', time) as time_bucket,
  COUNT(*) as request_count
FROM http_requests_total
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY time_bucket
ORDER BY time_bucket DESC;

-- Error rate
SELECT
  status,
  COUNT(*) as error_count
FROM http_requests_total
WHERE time >= now() - INTERVAL '1 hour'
  AND status IN ('client_error', 'server_error')
GROUP BY status;
```

<!--TODO - Add example Grafana dashboards
### Grafana dashboard

Create dashboards with key metrics:

1. **System Overview**: Request rates, error rates, memory usage
2. **Query Performance**: Query latency, throughput, memory per query
3. **Storage**: Object store operations, cache hit rates, file counts
4. **Runtime Health**: Task counts, worker utilization, panic rates
-->

<!--TODO - Use the processing engine for alerting
### Alerting rules

Set up alerts for critical conditions:

```yaml
# Prometheus alerting rules
groups:
  - name: influxdb3_core
    rules:
      - alert: HighErrorRate
        expr: rate(http_requests_total{status=~"server_error"}[5m]) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High error rate in InfluxDB 3 Core"

      - alert: HighQueryLatency
        expr: histogram_quantile(0.95, rate(influxdb_iox_query_log_execute_duration_seconds_bucket[5m])) > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High query latency in InfluxDB 3 Core"

      - alert: LowCacheHitRate
        expr: rate(influxdb3_parquet_cache_access_total{status="cached"}[5m]) / rate(influxdb3_parquet_cache_access_total[5m]) < 0.5
        for: 10m
        labels:
          severity: info
        annotations:
          summary: "Low cache hit rate in InfluxDB 3 Core"
```
-->
{{% /show-in %}}

### Extend monitoring with InfluxDB 3 plugins

Use {{< product-name >}} plugins to extend monitoring and alerting capabilities:

- [Notifier plugin](/influxdb3/version/plugins/library/official/notifier/): Send alerts to external systems based on custom logic.
- [Threshold deadman checks plugin](/influxdb3/version/plugins/library/official/threshold-deadman-checks/): Monitor metrics and trigger alerts when thresholds are breached.
- [System metrics plugin](/influxdb3/version/plugins/library/official/system-metrics/): Collect and visualize system-level metrics.

## Best practices

### General monitoring practices

1. **Monitor key metrics**: Focus on request rates, error rates, latency, and resource usage
2. **Set appropriate scrape intervals**: 15-30 seconds for most metrics
3. **Create meaningful alerts**: Alert on trends and thresholds that indicate real issues
4. **Use labels effectively**: Leverage metric labels for filtering and grouping
5. **Monitor long-term trends**: Track performance over time to identify patterns
6. **Correlate metrics**: Combine multiple metrics to understand system behavior

{{% show-in "enterprise" %}}
### Cluster monitoring practices

1. **Monitor each node type differently**: Focus on write metrics for ingest nodes, query metrics for query nodes
2. **Track load distribution**: Ensure work is balanced across nodes of the same type
3. **Monitor inter-node coordination**: Watch for communication delays between nodes
4. **Set up node-specific alerts**: Different thresholds for different node roles
5. **Use node labels**: Tag metrics with node roles and purposes
6. **Monitor shared resources**: Object store performance affects all nodes
7. **Track catalog conflicts**: High retry rates indicate coordination issues
8. **Regularly review dashboards and alerts**: Adjust as cluster usage patterns evolve
{{% /show-in %}}

InfluxDB exposes operational metrics in [Prometheus format](#prometheus-format) at the `/metrics` endpoint{{% show-in "enterprise" %}} on each cluster node{{% /show-in %}}.

- [Access metrics](#access-metrics)
- [HTTP and gRPC metrics](#http-and-grpc-metrics)
- [Database operations](#database-operations)
- [Query performance](#query-performance)
- [Memory and caching](#memory-and-caching)
- [Object storage](#object-storage)
- [Runtime and system](#runtime-and-system)
{{% show-in "enterprise" %}}
- [Cluster-specific considerations](#cluster-specific-considerations)
{{% /show-in %}}
- [Prometheus format](#prometheus-format)

## Access metrics

{{% show-in "core" %}}
Metrics are available at `http://localhost:8181/metrics` by default.

```bash
curl -s http://localhost:8181/metrics
```
{{% /show-in %}}

{{% show-in "enterprise" %}}
Metrics are available at `http://NODE_HOST:8181/metrics` on each cluster node.

```bash
# Access metrics from specific nodes
curl -s http://ingester-01:8181/metrics
curl -s http://query-01:8181/metrics
curl -s http://compactor-01:8181/metrics
```
{{% /show-in %}}

## HTTP and gRPC metrics

### http_requests_total
- **Type:** Counter
- **Description:** Total number of HTTP requests processed{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `method`: HTTP method (GET, POST, etc.)
  - `method_path`: Method and path combination
  - `path`: Request path
  - `status`: Response status (ok, client_error, server_error, aborted, unexpected_response)

{{% show-in "enterprise" %}}
**Cluster considerations:** Track per-node to monitor load distribution
{{% /show-in %}}

```
# Write endpoints
http_requests_total{method="POST",method_path="POST /api/v3/write_lp",path="/api/v3/write_lp",status="ok"} 1
http_requests_total{method="POST",method_path="POST /api/v2/write",path="/api/v2/write",status="ok"} 1
http_requests_total{method="POST",method_path="POST /write",path="/write",status="ok"} 1

# Query endpoints
http_requests_total{method="POST",method_path="POST /api/v3/query_sql",path="/api/v3/query_sql",status="ok"} 1
http_requests_total{method="POST",method_path="POST /api/v3/query_influxql",path="/api/v3/query_influxql",status="ok"} 1
http_requests_total{method="GET",method_path="GET /query",path="/query",status="ok"} 1
```

> [!Note]
> Monitor all write endpoints (`/api/v3/write_lp`, `/api/v2/write`, `/write`) and query endpoints (`/api/v3/query_sql`, `/api/v3/query_influxql`, `/query`) for comprehensive request tracking.
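
Because write traffic is split across three endpoints, a grep on a single path undercounts total writes. The following sketch sums the three write-endpoint counters from a snapshot; the inline sample and its values are hypothetical stand-ins for a live scrape:

```shell
# Hypothetical snapshot covering all three write endpoints plus a query endpoint.
metrics='http_requests_total{method="POST",path="/api/v3/write_lp",status="ok"} 12
http_requests_total{method="POST",path="/api/v2/write",status="ok"} 7
http_requests_total{method="POST",path="/write",status="ok"} 3
http_requests_total{method="POST",path="/api/v3/query_sql",status="ok"} 5'

# Sum only the counters whose path label is one of the write endpoints.
printf '%s\n' "$metrics" | awk '
  /path="\/api\/v3\/write_lp"|path="\/api\/v2\/write"|path="\/write"/ { writes += $2 }
  END { printf "total write requests: %d\n", writes }
'
```
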

### http_request_duration_seconds
- **Type:** Histogram
- **Description:** Distribution of HTTP request latencies{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#http_requests_total"><code>http_requests_total</code></a>

{{% show-in "enterprise" %}}
**Cluster considerations:** Compare latencies across nodes to identify performance bottlenecks
{{% /show-in %}}

### http_response_body_size_bytes
- **Type:** Histogram
- **Description:** Distribution of HTTP response body sizes{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#http_requests_total"><code>http_requests_total</code></a>

### grpc_requests_total
- **Type:** Counter
- **Description:** Total number of gRPC requests processed{{% show-in "enterprise" %}} (includes inter-node communication){{% /show-in %}}
- **Labels:**
  - `path`: gRPC method path
  - `status`: Response status

{{% show-in "enterprise" %}}
**Cluster considerations:** High gRPC volumes indicate active inter-node communication
{{% /show-in %}}

### grpc_request_duration_seconds
- **Type:** Histogram
- **Description:** Distribution of gRPC request latencies{{% show-in "enterprise" %}} (includes inter-node communication){{% /show-in %}}
- **Labels:** Same as <a href="#grpc_requests_total"><code>grpc_requests_total</code></a>

{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor for network latency between cluster nodes
{{% /show-in %}}

### grpc_response_body_size_bytes
- **Type:** Histogram
- **Description:** Distribution of gRPC response body sizes
- **Labels:** Same as <a href="#grpc_requests_total"><code>grpc_requests_total</code></a>

## Database operations

### influxdb3_catalog_operations_total
- **Type:** Counter
- **Description:** Total catalog operations by type{{% show-in "enterprise" %}} across the cluster{{% /show-in %}}
- **Labels:**
  - `type`: Operation type (create_database, create_admin_token, register_node, etc.)

{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor for catalog coordination across nodes
{{% /show-in %}}

```
influxdb3_catalog_operations_total{type="create_database"} 5
influxdb3_catalog_operations_total{type="create_admin_token"} 2
{{% show-in "enterprise" %}}
influxdb3_catalog_operations_total{type="register_node"} 6
{{% /show-in %}}
```

### influxdb3_catalog_operation_retries_total
- **Type:** Counter
- **Description:** Catalog updates that had to be retried due to conflicts{{% show-in "enterprise" %}} between nodes{{% /show-in %}}
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** High retry rates indicate coordination issues or high contention
{{% /show-in %}}

## Query performance

### influxdb_iox_query_log_compute_duration_seconds
- **Type:** Histogram
- **Description:** CPU duration spent for query computation{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Compare compute times across query nodes
{{% /show-in %}}

### influxdb_iox_query_log_execute_duration_seconds
- **Type:** Histogram
- **Description:** Total time to execute queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Track query performance across different query nodes
{{% /show-in %}}

### influxdb_iox_query_log_plan_duration_seconds
- **Type:** Histogram
- **Description:** Time spent planning queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_end2end_duration_seconds
- **Type:** Histogram
- **Description:** Complete query duration from issue time to completion{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_permit_duration_seconds
- **Type:** Histogram
- **Description:** Time to acquire a semaphore permit for query execution{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_max_memory
- **Type:** Histogram
- **Description:** Peak memory allocated for processing queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_parquet_files
- **Type:** Histogram
- **Description:** Number of Parquet files processed by queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_partitions
- **Type:** Histogram
- **Description:** Number of partitions processed by queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_deduplicated_parquet_files
- **Type:** Histogram
- **Description:** Number of files held under a DeduplicateExec operator{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_deduplicated_partitions
- **Type:** Histogram
- **Description:** Number of partitions held under a DeduplicateExec operator{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### influxdb_iox_query_log_phase_current
- **Type:** Gauge
- **Description:** Number of queries currently in each execution phase{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `phase`: Query execution phase

### influxdb_iox_query_log_phase_entered_total
- **Type:** Counter
- **Description:** Total number of queries that entered each execution phase{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `phase`: Query execution phase

### influxdb_iox_query_log_ingester_latency_to_full_data_seconds
- **Type:** Histogram
- **Description:** Time from initial request until querier has all data from ingesters
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Measures inter-node coordination efficiency in distributed queries
{{% /show-in %}}

### influxdb_iox_query_log_ingester_latency_to_plan_seconds
- **Type:** Histogram
- **Description:** Time until querier can proceed with query planning
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Indicates how quickly query nodes can coordinate with ingest nodes
{{% /show-in %}}

### influxdb_iox_query_log_ingester_partition_count
- **Type:** Histogram
- **Description:** Number of ingester partitions involved in queries
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Shows data distribution across ingest nodes
{{% /show-in %}}

### influxdb_iox_query_log_ingester_response_rows
- **Type:** Histogram
- **Description:** Number of rows in ingester responses
- **Labels:** None

### influxdb_iox_query_log_ingester_response_size
- **Type:** Histogram
- **Description:** Size of ingester record batches in bytes
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor network traffic between query and ingest nodes
{{% /show-in %}}

### query_datafusion_query_execution_ooms_total
- **Type:** Counter
- **Description:** Number of out-of-memory errors encountered by the query engine{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Track OOM events across query nodes to identify resource constraints
{{% /show-in %}}

## Memory and caching

### datafusion_mem_pool_bytes
- **Type:** Gauge
- **Description:** Number of bytes within the DataFusion memory pool{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor memory usage across different node types
{{% /show-in %}}

### influxdb3_parquet_cache_access_total
- **Type:** Counter
- **Description:** Track accesses to the in-memory Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **Labels:**
  - `status`: Access result (cached, miss, miss_while_fetching)

{{% show-in "enterprise" %}}
**Cluster considerations:** Compare cache effectiveness across query nodes
{{% /show-in %}}

```
influxdb3_parquet_cache_access_total{status="cached"} 1500
influxdb3_parquet_cache_access_total{status="miss"} 200
```
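
From these counters, the cache hit rate is `cached / (cached + misses)`. A minimal sketch, reusing the example values above as a stand-in for live `/metrics` output:

```shell
# Hypothetical snapshot mirroring the example counters above.
metrics='influxdb3_parquet_cache_access_total{status="cached"} 1500
influxdb3_parquet_cache_access_total{status="miss"} 200'

# Hit rate = cached accesses / all accesses, as a percentage.
printf '%s\n' "$metrics" | awk '
  /^influxdb3_parquet_cache_access_total\{/ {
    total += $2
    if ($0 ~ /status="cached"/) hits += $2
  }
  END { if (total > 0) printf "cache hit rate: %.2f%%\n", 100 * hits / total }
'
```
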

### influxdb3_parquet_cache_size_bytes
- **Type:** Gauge
- **Description:** Current size of in-memory Parquet cache in bytes{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **Labels:** None

### influxdb3_parquet_cache_size_number_of_files
- **Type:** Gauge
- **Description:** Number of files in the in-memory Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **Labels:** None

### jemalloc_memstats_bytes
- **Type:** Gauge
- **Description:** Memory allocation statistics from jemalloc{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `type`: Memory statistic type (active, allocated, mapped, etc.)

## Object storage

### object_store_op_duration_seconds
- **Type:** Histogram
- **Description:** Duration of object store operations{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `op`: Operation type (get, put, delete, list, etc.)
  - `result`: Operation result (success, error)

{{% show-in "enterprise" %}}
**Cluster considerations:** All nodes access shared object store; monitor for hotspots
{{% /show-in %}}

### object_store_op_headers_seconds
- **Type:** Histogram
- **Description:** Time to response headers for object store operations{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#object_store_op_duration_seconds"><code>object_store_op_duration_seconds</code></a>

### object_store_op_ttfb_seconds
- **Type:** Histogram
- **Description:** Time to first byte for object store operations{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#object_store_op_duration_seconds"><code>object_store_op_duration_seconds</code></a>

### object_store_transfer_bytes_total
- **Type:** Counter
- **Description:** Cumulative bytes transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `op`: Operation type (get, put)

{{% show-in "enterprise" %}}
**Cluster considerations:** Ingest nodes show high `put` activity; query nodes show high `get` activity
{{% /show-in %}}
|
||||
|
||||
### object_store_transfer_bytes_hist
|
||||
- **Type:** Histogram
|
||||
- **Description:** Distribution of bytes transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
|
||||
- **Labels:**
|
||||
- `op`: Operation type (get, put)
|
||||
|
||||
### object_store_transfer_objects_total
|
||||
- **Type:** Counter
|
||||
- **Description:** Cumulative count of objects transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
|
||||
- **Labels:**
|
||||
- `op`: Operation type (get, put)
|
||||
|
||||
### object_store_transfer_objects_hist
|
||||
- **Type:** Histogram
|
||||
- **Description:** Distribution of objects transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
|
||||
- **Labels:**
|
||||
- `op`: Operation type (get, put)
|
||||
|
||||
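Counters such as `object_store_transfer_bytes_total` are cumulative, so throughput comes from the delta between two scrapes, which is what PromQL's `rate()` computes. A minimal sketch in Python, with hypothetical sample values:

```python
def counter_rate(prev: float, curr: float, interval_seconds: float) -> float:
    """Per-second rate between two scrapes of a cumulative counter.

    A drop (curr < prev) indicates a process restart; in that case treat
    curr itself as the delta, mirroring Prometheus counter-reset handling.
    """
    delta = curr - prev if curr >= prev else curr
    return delta / interval_seconds

# Two hypothetical scrapes of object_store_transfer_bytes_total{op="put"}, 15 s apart:
print(counter_rate(prev=1_250_000, curr=1_400_000, interval_seconds=15))  # 10000.0 bytes/s
```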
## Runtime and system

### process_start_time_seconds

- **Type:** Gauge
- **Description:** Start time of the process, in seconds since the Unix epoch{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### thread_panic_count_total

- **Type:** Counter
- **Description:** Number of thread panics observed{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### iox_async_semaphore_acquire_duration_seconds

- **Type:** Histogram
- **Description:** Duration to acquire async semaphore permits{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

### iox_async_semaphore_holders_acquired

- **Type:** Gauge
- **Description:** Number of currently acquired semaphore holders{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

### iox_async_semaphore_holders_cancelled_while_pending_total

- **Type:** Counter
- **Description:** Number of pending semaphore holders cancelled while waiting{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

### iox_async_semaphore_holders_pending

- **Type:** Gauge
- **Description:** Number of pending semaphore holders{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

### iox_async_semaphore_permits_acquired

- **Type:** Gauge
- **Description:** Number of currently acquired permits{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

### iox_async_semaphore_permits_cancelled_while_pending_total

- **Type:** Counter
- **Description:** Permits cancelled while waiting for the semaphore{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

### iox_async_semaphore_permits_pending

- **Type:** Gauge
- **Description:** Number of pending permits{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

### iox_async_semaphore_permits_total

- **Type:** Gauge
- **Description:** Total number of permits in the semaphore{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `semaphore`: Semaphore identifier

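Prometheus histograms such as `iox_async_semaphore_acquire_duration_seconds` also expose `_sum` and `_count` series, so a mean wait time can be derived from them. An illustrative sketch with made-up sample values:

```python
def histogram_mean(sum_seconds: float, count: float) -> float:
    """Mean observation from a Prometheus histogram's _sum and _count series."""
    return sum_seconds / count if count else 0.0

# Hypothetical samples:
#   iox_async_semaphore_acquire_duration_seconds_sum{semaphore="query"}   12.5
#   iox_async_semaphore_acquire_duration_seconds_count{semaphore="query"} 500
print(histogram_mean(12.5, 500))  # 0.025 (25 ms average wait)
```

In practice you would apply this to `rate()` of both series over a window rather than raw cumulative values.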
### tokio_runtime_num_alive_tasks

- **Type:** Gauge
- **Description:** Current number of alive tasks in the Tokio runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_blocking_queue_depth

- **Type:** Gauge
- **Description:** Number of tasks in the blocking thread pool queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_budget_forced_yield_count_total

- **Type:** Counter
- **Description:** Number of times tasks were forced to yield after exhausting their budgets{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_global_queue_depth

- **Type:** Gauge
- **Description:** Number of tasks in the runtime's global queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_io_driver_ready_count_total

- **Type:** Counter
- **Description:** Number of ready events processed by the I/O driver{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_io_driver_fd_deregistered_count_total

- **Type:** Counter
- **Description:** Number of file descriptors deregistered by the I/O driver{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_io_driver_fd_registered_count_total

- **Type:** Counter
- **Description:** Number of file descriptors registered with the I/O driver{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_num_blocking_threads

- **Type:** Gauge
- **Description:** Number of additional threads spawned by the runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_num_idle_blocking_threads

- **Type:** Gauge
- **Description:** Number of idle threads spawned for `spawn_blocking` calls{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_num_workers

- **Type:** Gauge
- **Description:** Number of worker threads used by the runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_runtime_remote_schedule_count_total

- **Type:** Counter
- **Description:** Number of tasks scheduled from outside the runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_worker_local_queue_depth

- **Type:** Gauge
- **Description:** Number of tasks in each worker's local queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_local_schedule_count_total

- **Type:** Counter
- **Description:** Tasks scheduled from within the runtime onto local queues{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_mean_poll_time_seconds

- **Type:** Gauge
- **Description:** Exponentially weighted moving average of task poll duration{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_noop_count_total

- **Type:** Counter
- **Description:** Number of times worker threads unparked but performed no work{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_overflow_count_total

- **Type:** Counter
- **Description:** Number of times worker threads saturated their local queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_park_count_total

- **Type:** Counter
- **Description:** Total number of times worker threads have parked{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_poll_count_total

- **Type:** Counter
- **Description:** Number of tasks polled by worker threads{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_steal_count_total

- **Type:** Counter
- **Description:** Number of tasks stolen by worker threads from other workers{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_steal_operations_total

- **Type:** Counter
- **Description:** Number of steal operations performed by worker threads{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_worker_total_busy_duration_seconds_total

- **Type:** Counter
- **Description:** Total time worker threads have been busy{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
  - `worker`: Worker thread identifier

### tokio_watchdog_hangs_total

- **Type:** Counter
- **Description:** Number of hangs detected by the Tokio watchdog{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

### tokio_watchdog_response_time_seconds

- **Type:** Histogram
- **Description:** Response time of the Tokio watchdog task{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None

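One way to read the per-worker counters is to combine deltas of `tokio_worker_total_busy_duration_seconds_total` across all workers into a single utilization figure. A hypothetical sketch (the deltas below are made up, not from a live server):

```python
def worker_utilization(busy_deltas: list[float], interval_seconds: float) -> float:
    """Fraction of available worker time spent busy, computed from per-worker
    deltas of tokio_worker_total_busy_duration_seconds_total over one interval."""
    available = interval_seconds * len(busy_deltas)
    return sum(busy_deltas) / available if available else 0.0

# Hypothetical 15 s deltas for a 4-worker runtime:
print(f"{worker_utilization([9.0, 8.5, 9.2, 8.8], 15):.1%}")  # 59.2%
```

Sustained utilization near 100% together with growing queue depths suggests the runtime is CPU-bound.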
{{% show-in "enterprise" %}}
## Cluster-specific considerations

### Metrics reporting across node modes

All nodes in an InfluxDB 3 Enterprise cluster report the same set of metrics regardless of their configured [mode](/influxdb3/enterprise/reference/config-options/#mode) (ingest, query, compact, process, or all).
Metrics are not filtered based on node specialization.
The difference between nodes is in the metric _values_ and labels, which reflect the actual activity on each node.

For example:

- An ingest-only node reports query-related metrics, but with minimal or zero values
- A query-only node reports write-related metrics, but with minimal or zero values

### Node identification

For information on enriching metrics with node identification using Telegraf or Prometheus relabeling, see [Node identification in Monitor metrics](/influxdb3/enterprise/admin/monitor-metrics/#node-identification).

### Key cluster metrics

Focus on these metrics for cluster health:

- **Load distribution**: `sum by (node_name) (rate(http_requests_total[5m]))`
- **Catalog conflicts**: `rate(influxdb3_catalog_operation_retries_total[5m])`
- **Inter-node latency**: `influxdb_iox_query_log_ingester_latency_to_full_data_seconds`
- **Node availability**: `up{job="influxdb3-enterprise"}`

### Performance by node type

Monitor different metrics based on [node specialization](/influxdb3/enterprise/admin/clustering/):

- **Ingest nodes or all-in-one nodes handling writes**:
  - `http_requests_total{path=~"/api/v3/write_lp|/api/v2/write|/write"}` - Write operations via HTTP (all endpoints)
  - `grpc_requests_total{path="/api/v3/write_lp"}` - Write operations via gRPC
  - `grpc_request_duration_seconds{path="/api/v3/write_lp"}` - Write operation latency
  - `object_store_transfer_bytes_total{op="put"}` - Data written to object storage
- **Query nodes or all-in-one nodes handling queries**:
  - `http_requests_total{path=~"/api/v3/query_sql|/api/v3/query_influxql|/query"}` - Query requests (all endpoints)
  - `influxdb_iox_query_log_execute_duration_seconds` - Query execution time
  - `influxdb3_parquet_cache_access_total` - Parquet cache performance
- **All nodes (configuration and management)**:
  - `http_requests_total{path="/api/v3/configure/database"}` - Database configuration operations
  - `http_requests_total{path="/api/v3/configure/token/admin"}` - Token management operations
  - `influxdb3_catalog_operations_total` - Catalog operations (create_database, create_admin_token, register_node)
- **Compactor nodes or all-in-one nodes handling compaction**:
  - `object_store_op_duration_seconds{op="put"}` - Compaction write performance
  - `object_store_transfer_objects_total` - Files processed during compaction
{{% /show-in %}}

## Prometheus format

InfluxDB exposes metrics in the Prometheus exposition format, a text-based format that includes metric names, labels, and values. Each metric follows this structure:

```
metric_name{label1="value1",label2="value2"} metric_value timestamp
```
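To make the structure concrete, here is a deliberately minimal Python sketch that parses a line in this shape. It ignores label-value escapes, exemplars, and `# HELP`/`# TYPE` comment lines that the full exposition format allows:

```python
import re

# Minimal parser for a single Prometheus exposition line (illustrative only;
# label values containing commas or escaped quotes are not handled).
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional {label="value",...}
    r'\s+(?P<value>\S+)'                     # numeric value
    r'(?:\s+(?P<timestamp>\d+))?$'           # optional timestamp
)

def parse_line(line: str):
    """Return (name, labels, value) for a sample line, or None for comments
    and malformed input."""
    stripped = line.strip()
    m = LINE_RE.match(stripped)
    if m is None or stripped.startswith('#'):
        return None
    labels = {}
    if m.group('labels'):
        for pair in m.group('labels').split(','):
            key, _, value = pair.partition('=')
            labels[key.strip()] = value.strip().strip('"')
    return m.group('name'), labels, float(m.group('value'))

print(parse_line('http_requests_total{method="GET",status="200"} 1027'))
# ('http_requests_total', {'method': 'GET', 'status': '200'}, 1027.0)
```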

**Key characteristics:**

- **Metric names**: Use underscores and describe what is being measured
- **Labels**: Key-value pairs in curly braces that add dimensionality
- **Values**: Numeric measurements (integers or floats)
- **Timestamps**: Optional Unix timestamps (usually omitted for current time)

**Metric types:**

- **Counter**: Cumulative values that only increase (for example, `http_requests_total`)
- **Gauge**: Values that can go up and down (for example, `tokio_runtime_num_alive_tasks`)
- **Histogram**: Samples observations and counts them in configurable buckets (for example, `http_request_duration_seconds`)

For complete specification details, see the [Prometheus exposition format documentation](https://prometheus.io/docs/instrumenting/exposition_formats/).

@@ -0,0 +1,187 @@
/**
 * InfluxDB Version Detector Component
 *
 * Helps users identify which InfluxDB product they're using through a
 * guided questionnaire with URL detection and scoring-based recommendations.
 *
 * DECISION TREE LOGIC (from .context/drafts/influxdb-version-detector/influxdb-decision-tree.md):
 *
 * ## Primary Detection Flow
 *
 * START: User enters URL
 * │
 * ├─→ URL matches known cloud patterns?
 * │   │
 * │   ├─→ YES: Contains "influxdb.io" → **InfluxDB Cloud Dedicated** ✓
 * │   ├─→ YES: Contains "cloud2.influxdata.com" regions → **InfluxDB Cloud Serverless** ✓
 * │   ├─→ YES: Contains "influxcloud.net" → **InfluxDB Cloud 1** ✓
 * │   └─→ YES: Contains other cloud2 regions → **InfluxDB Cloud (TSM)** ✓
 * │
 * └─→ NO: Check port and try /ping endpoint
 *     │
 *     ├─→ Port 8181 detected? → Strong indicator of v3 (Core/Enterprise)
 *     │   ├─→ Returns 200 (auth successful or disabled)?
 *     │   │   ├─→ `x-influxdb-build: Enterprise` → **InfluxDB 3 Enterprise** ✓ (definitive)
 *     │   │   └─→ `x-influxdb-build: Core` → **InfluxDB 3 Core** ✓ (definitive)
 *     │   │
 *     │   ├─→ Returns 401 Unauthorized (default - auth required)?
 *     │   │
 *     │   └─→ Ask "Paid or Free?"
 *     │       ├─→ Paid → **InfluxDB 3 Enterprise** ✓ (definitive)
 *     │       └─→ Free → **InfluxDB 3 Core** ✓ (definitive)
 *     │
 *     ├─→ Port 8086 detected? → Strong indicator of legacy (OSS/Enterprise)
 *     │   │ ⚠️ NOTE: v1.x ping auth optional (ping-auth-enabled), v2.x always open
 *     │   │
 *     │   ├─→ Returns 401 Unauthorized?
 *     │   │   │ Could be v1.x with ping-auth-enabled=true OR Enterprise
 *     │   │   │
 *     │   │   └─→ Ask "Paid or Free?" → Show ranked results
 *     │   │
 *     │   ├─→ Returns 200/204 (accessible)?
 *     │   │   │ Likely v2.x OSS (always open) or v1.x with ping-auth-enabled=false
 *     │   │   │
 *     │   │   └─→ Continue to questionnaire
 *     │
 *     └─→ Blocked/Can't detect?
 *         │
 *         └─→ Start questionnaire
 *
 * ## Questionnaire Flow (No URL or after detection)
 *
 * Q1: Which type of license do you have?
 * ├─→ Paid/Commercial License
 * ├─→ Free/Open Source (including free cloud tiers)
 * └─→ I'm not sure
 *
 * Q2: Is your InfluxDB hosted by InfluxData (cloud) or self-hosted?
 * ├─→ Cloud service (hosted by InfluxData)
 * ├─→ Self-hosted (on your own servers)
 * └─→ I'm not sure
 *
 * Q3: How long has your server been in place?
 * ├─→ Recently installed (less than 1 year)
 * ├─→ 1-5 years
 * ├─→ More than 5 years
 * └─→ I'm not sure
 *
 * Q4: Which query language(s) do you use?
 * ├─→ SQL
 * ├─→ InfluxQL
 * ├─→ Flux
 * ├─→ Multiple languages
 * └─→ I'm not sure
 *
 * ## Definitive Determinations (Stop immediately, no more questions)
 *
 * 1. **401 + Port 8181 + Paid** → InfluxDB 3 Enterprise ✓
 * 2. **401 + Port 8181 + Free** → InfluxDB 3 Core ✓
 * 3. **URL matches cloud pattern** → Specific cloud product ✓
 * 4. **x-influxdb-build header** → Definitive product identification ✓
 *
 * ## Scoring System (When not definitive)
 *
 * ### Elimination Rules
 * - **Free + Self-hosted** → Eliminates all cloud products
 * - **Free** → Eliminates: 3 Enterprise, Enterprise, Clustered, Cloud Dedicated, Cloud 1
 * - **Paid + Self-hosted** → Eliminates all cloud products
 * - **Paid + Cloud** → Eliminates all self-hosted products
 * - **Free + Cloud** → Eliminates all self-hosted products, favors Serverless/TSM
 *
 * ### Strong Signals (High points)
 * - **401 Response**: +50 for v3 products, +30 for Clustered
 * - **Port 8181**: +30 for v3 products
 * - **Port 8086**: +20 for legacy products
 * - **SQL Language**: +40 for v3 products, eliminates v1/v2
 * - **Flux Language**: +30 for v2 era, eliminates v1 and v3
 * - **Server Age 5+ years**: +30 for v1 products, -50 for v3
 *
 * ### Ranking Display Rules
 * - Only show "Most Likely" if:
 *   - Top score > 30 (not low confidence)
 *   - AND difference between #1 and #2 is ≥ 15 points
 * - Show manual verification commands only if:
 *   - Confidence is not high (score < 60)
 *   - AND it's a self-hosted product
 *   - AND user didn't say it's cloud
 */
interface ComponentOptions {
  component: HTMLElement;
}
declare global {
  interface Window {
    gtag?: (_event: string, _action: string, _parameters?: Record<string, unknown>) => void;
  }
}
declare class InfluxDBVersionDetector {
  private container;
  private products;
  private influxdbUrls;
  private answers;
  private initialized;
  private questionFlow;
  private currentQuestionIndex;
  private questionHistory;
  private progressBar;
  private resultDiv;
  private restartBtn;
  private currentContext;
  constructor(options: ComponentOptions);
  private parseComponentData;
  private init;
  private setupPlaceholders;
  private setupPingHeadersPlaceholder;
  private setupDockerOutputPlaceholder;
  private getCurrentPageSection;
  private trackAnalyticsEvent;
  private initializeForModal;
  private getBasicUrlSuggestion;
  private getProductDisplayName;
  private generateConfigurationGuidance;
  private getHostExample;
  private usesDatabaseTerminology;
  private getAuthenticationInfo;
  private detectEnterpriseFeatures;
  private analyzeUrlPatterns;
  private render;
  private attachEventListeners;
  private updateProgress;
  private showQuestion;
  private enhanceUrlInputWithSuggestions;
  private getCurrentProduct;
  private handleUrlKnown;
  private goBack;
  private detectByUrl;
  private detectContext;
  private detectPortFromUrl;
  private startQuestionnaire;
  private startQuestionnaireWithCloudContext;
  private answerQuestion;
  private handleAuthorizationHelp;
  private showRankedResults;
  /**
   * Gets the Grafana documentation link for a given product
   */
  private getGrafanaLink;
  /**
   * Generates a unified product result block with characteristics and Grafana link
   */
  private generateProductResult;
  /**
   * Maps simple product keys (used in URL detection) to full product names (used in scoring)
   */
  private mapProductKeyToFullName;
  private applyScoring;
  private displayRankedResults;
  private analyzePingHeaders;
  private showResult;
  private analyzeDockerOutput;
  private showPingTestSuggestion;
  private showOSSVersionCheckSuggestion;
  private showMultipleCandidatesSuggestion;
  private showDetectedVersion;
  private restart;
}
export default function initInfluxDBVersionDetector(options: ComponentOptions): InfluxDBVersionDetector;
export {};
//# sourceMappingURL=influxdb-version-detector.d.ts.map
@@ -0,0 +1 @@
{"version":3,"file":"influxdb-version-detector.d.ts","sourceRoot":"","sources":["../assets/js/influxdb-version-detector.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA0GG;AAuCH,UAAU,gBAAgB;IACxB,SAAS,EAAE,WAAW,CAAC;CACxB;AAaD,OAAO,CAAC,MAAM,CAAC;IACb,UAAU,MAAM;QACd,IAAI,CAAC,EAAE,CACL,MAAM,EAAE,MAAM,EACd,OAAO,EAAE,MAAM,EACf,WAAW,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,OAAO,CAAC,KAClC,IAAI,CAAC;KACX;CACF;AAED,cAAM,uBAAuB;IAC3B,OAAO,CAAC,SAAS,CAAc;IAC/B,OAAO,CAAC,QAAQ,CAAW;IAC3B,OAAO,CAAC,YAAY,CAA0B;IAC9C,OAAO,CAAC,OAAO,CAAe;IAC9B,OAAO,CAAC,WAAW,CAAkB;IACrC,OAAO,CAAC,YAAY,CAAgB;IACpC,OAAO,CAAC,oBAAoB,CAAK;IACjC,OAAO,CAAC,eAAe,CAAgB;IACvC,OAAO,CAAC,WAAW,CAA4B;IAC/C,OAAO,CAAC,SAAS,CAA4B;IAC7C,OAAO,CAAC,UAAU,CAA4B;IAC9C,OAAO,CAAC,cAAc,CAA+C;gBAEzD,OAAO,EAAE,gBAAgB;IAoBrC,OAAO,CAAC,kBAAkB;IAqC1B,OAAO,CAAC,IAAI;IAcZ,OAAO,CAAC,iBAAiB;IAKzB,OAAO,CAAC,2BAA2B;IAoCnC,OAAO,CAAC,4BAA4B;IA+BpC,OAAO,CAAC,qBAAqB;IAe7B,OAAO,CAAC,mBAAmB;IAsG3B,OAAO,CAAC,kBAAkB;IAqC1B,OAAO,CAAC,qBAAqB;IAK7B,OAAO,CAAC,qBAAqB;IAgC7B,OAAO,CAAC,6BAA6B;IA4FrC,OAAO,CAAC,cAAc;IAgCtB,OAAO,CAAC,uBAAuB;IAU/B,OAAO,CAAC,qBAAqB;IA2B7B,OAAO,CAAC,wBAAwB;IAehC,OAAO,CAAC,kBAAkB;IAmJ1B,OAAO,CAAC,MAAM;IAsNd,OAAO,CAAC,oBAAoB;IA8G5B,OAAO,CAAC,cAAc;IAQtB,OAAO,CAAC,YAAY;IAsBpB,OAAO,CAAC,8BAA8B;IAyCtC,OAAO,CAAC,iBAAiB;IAMzB,OAAO,CAAC,cAAc;IAuBtB,OAAO,CAAC,MAAM;YA2BA,WAAW;IA6DzB,OAAO,CAAC,aAAa;IAgBrB,OAAO,CAAC,iBAAiB;IAgBzB,OAAO,CAAC,kBAAkB;IAa1B,OAAO,CAAC,kCAAkC;IAU1C,OAAO,CAAC,cAAc;IA0BtB,OAAO,CAAC,uBAAuB;IAqG/B,OAAO,CAAC,iBAAiB;IAoCzB;;OAEG;IACH,OAAO,CAAC,cAAc;IAmBtB;;OAEG;IACH,OAAO,CAAC,qBAAqB;IAgE7B;;OAEG;IACH,OAAO,CAAC,uBAAuB;IAiB/B,OAAO,CAAC,YAAY;IAmKpB,OAAO,CAAC,oBAAoB;IAiJ5B,OAAO,CAAC,kBAAkB;IAmG1B,OAAO,CAAC,UAAU;IAUlB,OAAO,CAAC,mBAAmB;IA2F3B,OAAO,CAAC,sBAAsB;IA4D9B,OAAO,CAAC,6BAA6B;IAiDrC,OAAO,CAAC,gCAAgC;IA2CxC,OAAO,CAAC,mBAAmB;IAgB3B,OAAO,CAAC,OAAO;CA2ChB;AAGD,MAAM,CAAC,OAAO,UAAU,2BAA2B,CACjD,OAAO,EAAE,gBAAgB,GACxB,uBAAuB,CAEzB"}
@@ -0,0 +1,2 @@
export const influxdbUrls: any;
//# sourceMappingURL=influxdb-urls.d.ts.map
@@ -0,0 +1 @@
{"version":3,"file":"influxdb-urls.d.ts","sourceRoot":"","sources":["../../assets/js/services/influxdb-urls.js"],"names":[],"mappings":"AAEA,+BAAoD"}
@@ -0,0 +1,3 @@
import { influxdb_urls as influxdbUrlsParam } from '@params';
export const influxdbUrls = influxdbUrlsParam || {};
//# sourceMappingURL=influxdb-urls.js.map
@@ -0,0 +1 @@
{"version":3,"file":"influxdb-urls.js","sourceRoot":"","sources":["../../assets/js/services/influxdb-urls.js"],"names":[],"mappings":"AAAA,OAAO,EAAE,aAAa,IAAI,iBAAiB,EAAE,MAAM,SAAS,CAAC;AAE7D,MAAM,CAAC,MAAM,YAAY,GAAG,iBAAiB,IAAI,EAAE,CAAC"}
@@ -0,0 +1,30 @@
export namespace DEFAULT_STORAGE_URLS {
    let oss: any;
    let cloud: any;
    let serverless: any;
    let core: any;
    let enterprise: any;
    let dedicated: any;
    let clustered: any;
    let prev_oss: any;
    let prev_cloud: any;
    let prev_core: any;
    let prev_enterprise: any;
    let prev_serverless: any;
    let prev_dedicated: any;
    let prev_clustered: any;
    let custom: string;
}
export const defaultUrls: {};
export function initializeStorageItem(storageKey: any, defaultValue: any): void;
export function getPreference(prefName: any): any;
export function setPreference(prefID: any, prefValue: any): void;
export function getPreferences(): any;
export function getInfluxDBUrls(): any;
export function getInfluxDBUrl(product: any): any;
export function setInfluxDBUrls(updatedUrlsObj: any): void;
export function removeInfluxDBUrl(product: any): void;
export function getNotifications(): any;
export function notificationIsRead(notificationID: any, notificationType: any): any;
export function setNotificationAsRead(notificationID: any, notificationType: any): void;
//# sourceMappingURL=local-storage.d.ts.map
@@ -0,0 +1 @@
{"version":3,"file":"local-storage.d.ts","sourceRoot":"","sources":["../../assets/js/services/local-storage.js"],"names":[],"mappings":";;;;;;;;;;;;;;;;;AAqFA,6BAAuB;AAhEvB,gFAOC;AAwBD,kDAYC;AAGD,iEAOC;AAGD,sCAEC;AAkCD,uCAOC;AAGD,kDAYC;AAOD,2DAOC;AAGD,sDAOC;AAgBD,wCAeC;AAYD,oFAKC;AAWD,wFAWC"}
@@ -0,0 +1,187 @@
/*
This represents an API for managing user and client-side settings for the
InfluxData documentation. It uses the local browser storage.

These functions manage the following InfluxDB settings:

- influxdata_docs_preferences: Docs UI/UX-related preferences (obj)
- influxdata_docs_urls: User-defined InfluxDB URLs for each product (obj)
- influxdata_docs_notifications:
  - messages: Messages (data/notifications.yaml) that have been seen (array)
  - callouts: Feature callouts that have been seen (array)
*/
import { influxdbUrls } from './influxdb-urls.js';

// Prefix for all InfluxData docs local storage
const storagePrefix = 'influxdata_docs_';

/*
Initialize data in local storage with a default value.
*/
function initializeStorageItem(storageKey, defaultValue) {
  const fullStorageKey = storagePrefix + storageKey;

  // Check if the data exists before initializing the data
  if (localStorage.getItem(fullStorageKey) === null) {
    localStorage.setItem(fullStorageKey, defaultValue);
  }
}

/*
////////////////////////////////////////////////////////////////////////////////
////////////////////////// INFLUXDATA DOCS PREFERENCES /////////////////////////
////////////////////////////////////////////////////////////////////////////////
*/
const prefStorageKey = storagePrefix + 'preferences';

// Default preferences
const defaultPrefObj = {
  api_lib: null,
  influxdb_url: 'cloud',
  sidebar_state: 'open',
  theme: 'light',
  sample_get_started_date: null,
  v3_wayfinding_show: true,
};

/*
Retrieve a preference from the preference key.
If the key doesn't exist, initialize it with default values.
*/
function getPreference(prefName) {
  // Initialize preference data if it doesn't already exist
  if (localStorage.getItem(prefStorageKey) === null) {
    initializeStorageItem('preferences', JSON.stringify(defaultPrefObj));
  }

  // Retrieve and parse preferences as JSON
  const prefString = localStorage.getItem(prefStorageKey);
  const prefObj = JSON.parse(prefString);

  // Return the value of the specified preference
  return prefObj[prefName];
}

// Set a preference in the preferences key
function setPreference(prefID, prefValue) {
  const prefString = localStorage.getItem(prefStorageKey);
  const prefObj = JSON.parse(prefString);

  prefObj[prefID] = prefValue;

  localStorage.setItem(prefStorageKey, JSON.stringify(prefObj));
}

// Return an object containing all preferences
function getPreferences() {
  return JSON.parse(localStorage.getItem(prefStorageKey));
}

////////////////////////////////////////////////////////////////////////////////
//////////// MANAGE INFLUXDATA DOCS URLS IN LOCAL STORAGE //////////////////////
////////////////////////////////////////////////////////////////////////////////

const defaultUrls = {};
Object.entries(influxdbUrls).forEach(([product, { providers }]) => {
  defaultUrls[product] =
    providers.filter((provider) => provider.name === 'Default')[0]?.regions[0]
      ?.url || 'https://cloud2.influxdata.com';
});

export const DEFAULT_STORAGE_URLS = {
  oss: defaultUrls.oss,
  cloud: defaultUrls.cloud,
  serverless: defaultUrls.serverless,
  core: defaultUrls.core,
  enterprise: defaultUrls.enterprise,
  dedicated: defaultUrls.cloud_dedicated,
  clustered: defaultUrls.clustered,
  prev_oss: defaultUrls.oss,
  prev_cloud: defaultUrls.cloud,
  prev_core: defaultUrls.core,
  prev_enterprise: defaultUrls.enterprise,
  prev_serverless: defaultUrls.serverless,
  prev_dedicated: defaultUrls.cloud_dedicated,
  prev_clustered: defaultUrls.clustered,
  custom: '',
};

const urlStorageKey = storagePrefix + 'urls';

// Return an object that contains all InfluxDB urls stored in the urls key
function getInfluxDBUrls() {
  // Initialize urls data if it doesn't already exist
  if (localStorage.getItem(urlStorageKey) === null) {
    initializeStorageItem('urls', JSON.stringify(DEFAULT_STORAGE_URLS));
  }

  return JSON.parse(localStorage.getItem(urlStorageKey));
}

// Get the current or previous URL for a specific product or a custom url
function getInfluxDBUrl(product) {
  // Initialize urls data if it doesn't already exist
  if (localStorage.getItem(urlStorageKey) === null) {
    initializeStorageItem('urls', JSON.stringify(DEFAULT_STORAGE_URLS));
  }

  // Retrieve and parse the URLs as JSON
  const urlsString = localStorage.getItem(urlStorageKey);
  const urlsObj = JSON.parse(urlsString);

  // Return the URL of the specified product
  return urlsObj[product];
}

/*
Set multiple product URLs in the urls key.
Input should be an object where the key is the product and the value is the
URL to set for that product.
*/
function setInfluxDBUrls(updatedUrlsObj) {
  const urlsString = localStorage.getItem(urlStorageKey);
  const urlsObj = JSON.parse(urlsString);

  const newUrlsObj = { ...urlsObj, ...updatedUrlsObj };

  localStorage.setItem(urlStorageKey, JSON.stringify(newUrlsObj));
}

// Set an InfluxDB URL to an empty string in the urls key
function removeInfluxDBUrl(product) {
  const urlsString = localStorage.getItem(urlStorageKey);
  const urlsObj = JSON.parse(urlsString);

  urlsObj[product] = '';

  localStorage.setItem(urlStorageKey, JSON.stringify(urlsObj));
}

/*
////////////////////////////////////////////////////////////////////////////////
///////////////////////// INFLUXDATA DOCS NOTIFICATIONS ////////////////////////
////////////////////////////////////////////////////////////////////////////////
*/
const notificationStorageKey = storagePrefix + 'notifications';

// Default notifications
const defaultNotificationsObj = {
  messages: [],
  callouts: [],
};

function getNotifications() {
  // Initialize notifications data if it doesn't already exist
  if (localStorage.getItem(notificationStorageKey) === null) {
    initializeStorageItem('notifications', JSON.stringify(defaultNotificationsObj));
  }

  // Retrieve and parse the notifications data as JSON
  const notificationString = localStorage.getItem(notificationStorageKey);
  const notificationObj = JSON.parse(notificationString);

  // Return the notifications object
  return notificationObj;
}

/*
Checks if a notification is read. Provide the notification ID and one of the
following notification types:

- message
- callout

If the notification ID exists in the array assigned to the specified type, the
notification has been read.
*/
function notificationIsRead(notificationID, notificationType) {
  const notificationsObj = getNotifications();
  const readNotifications = notificationsObj[`${notificationType}s`];

  return readNotifications.includes(notificationID);
}

/*
Sets a notification as read. Provide the notification ID and one of the
following notification types:

- message
- callout
|
||||
The notification ID is added to the array assigned to the specified type.
|
||||
*/
|
||||
function setNotificationAsRead(notificationID, notificationType) {
|
||||
const notificationsObj = getNotifications();
|
||||
const readNotifications = notificationsObj[`${notificationType}s`];
|
||||
readNotifications.push(notificationID);
|
||||
notificationsObj[notificationType + 's'] = readNotifications;
|
||||
localStorage.setItem(notificationStorageKey, JSON.stringify(notificationsObj));
|
||||
}
|
||||
// Export functions as a module and make the file backwards compatible for non-module environments until all remaining dependent scripts are ported to modules
|
||||
export { defaultUrls, initializeStorageItem, getPreference, setPreference, getPreferences, getInfluxDBUrls, getInfluxDBUrl, setInfluxDBUrls, removeInfluxDBUrl, getNotifications, notificationIsRead, setNotificationAsRead, };
|
||||
//# sourceMappingURL=local-storage.js.map
|

@ -0,0 +1 @@
(generated source map for local-storage.js omitted)
@ -0,0 +1,71 @@
"""Debug test to show actual metrics output."""

import os

import requests


def test_show_actual_metrics():
    """Display actual metrics from the Core instance."""

    # Get the token
    token = os.environ.get("INFLUXDB3_CORE_TOKEN")
    headers = {"Authorization": f"Token {token}"} if token else {}

    # Fetch the metrics
    url = "http://influxdb3-core:8181"
    response = requests.get(f"{url}/metrics", headers=headers, timeout=5)

    print(f"\n{'=' * 80}")
    print(f"ACTUAL METRICS FROM {url}")
    print(f"Status Code: {response.status_code}")
    print(f"Using Auth: {'Yes' if token else 'No'}")
    print(f"{'=' * 80}\n")

    if response.status_code == 200:
        lines = response.text.split('\n')
        print(f"Total lines: {len(lines)}\n")

        # Show the first 100 lines
        print("First 100 lines of actual output:\n")
        for i, line in enumerate(lines[:100], 1):
            print(f"{i:4d} | {line}")

        # Show examples of documented metrics
        print(f"\n{'=' * 80}")
        print("SEARCHING FOR DOCUMENTED METRICS:")
        print(f"{'=' * 80}\n")

        documented_metrics = [
            "http_requests_total",
            "grpc_requests_total",
            "influxdb3_catalog_operations_total",
            "influxdb_iox_query_log_compute_duration_seconds",
            "datafusion_mem_pool_bytes",
            "object_store_op_duration_seconds",
            "jemalloc_memstats_bytes",
        ]

        for metric in documented_metrics:
            # Find the TYPE and HELP comment lines
            type_line = next((line for line in lines if f"# TYPE {metric}" in line), None)
            help_line = next((line for line in lines if f"# HELP {metric}" in line), None)

            # Find the first few data lines
            data_lines = [
                line for line in lines
                if line.startswith(metric) and not line.startswith("#")
            ][:3]

            if type_line or help_line or data_lines:
                print(f"\n✓ {metric}:")
                if help_line:
                    print(f"  {help_line}")
                if type_line:
                    print(f"  {type_line}")
                for data in data_lines:
                    print(f"  {data}")
            else:
                print(f"\n✗ {metric}: NOT FOUND")
    else:
        print(f"ERROR: Status {response.status_code}")
        print(response.text[:500])

    # Always pass so the output is visible
    assert True

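The debug test above locates `# HELP`/`# TYPE` comment lines and sample data lines by scanning raw Prometheus exposition text. A minimal standalone sketch of that scan, run against an illustrative payload (not a live endpoint):

```python
# Illustrative exposition-format payload; not taken from a real instance.
SAMPLE = """\
# HELP http_requests_total accumulated total requests
# TYPE http_requests_total counter
http_requests_total{method="GET",path="/metrics",status="ok"} 12
http_requests_total{method="POST",path="/api/v3/write_lp",status="ok"} 7
"""


def describe_metric(text, metric):
    """Return the HELP line, TYPE line, and data lines for one metric."""
    lines = text.splitlines()
    help_line = next((l for l in lines if l.startswith(f"# HELP {metric}")), None)
    type_line = next((l for l in lines if l.startswith(f"# TYPE {metric}")), None)
    # Data lines start with the metric name and are not comments
    data = [l for l in lines if l.startswith(metric) and not l.startswith("#")]
    return help_line, type_line, data


help_line, type_line, data = describe_metric(SAMPLE, "http_requests_total")
print(type_line)   # "# TYPE http_requests_total counter"
print(len(data))   # 2
```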
@ -0,0 +1,320 @@
"""Test the InfluxDB 3 metrics endpoint for PR #6422.

This test suite validates that the metrics documentation in PR #6422 is
accurate by checking that all documented metrics are actually exposed by the
InfluxDB 3 Core and Enterprise instances.

Usage:
    # Basic test execution
    docker compose run --rm influxdb3-core-pytest test/metrics_endpoint_test.py

    # With verbose output (shows actual metrics and matches)
    VERBOSE_METRICS_TEST=true docker compose run --rm influxdb3-core-pytest test/metrics_endpoint_test.py

    # Using the wrapper script (recommended)
    ./test/run-metrics-tests.sh

    # With verbose output using the wrapper script
    VERBOSE_METRICS_TEST=true ./test/run-metrics-tests.sh

Verbose Output:
    Set VERBOSE_METRICS_TEST=true to see detailed output showing:
    - Which metrics are being searched for
    - Actual matching lines from the Prometheus metrics endpoint
    - Total occurrence counts (for tests that include comments)
    - A clear indication when metrics are not found

    Example verbose output:
        TEST: HTTP/gRPC Metrics
        ================================================================================

        ✓ Searching for: http_requests_total
          Found 12 total occurrences
          Matches:
            # HELP http_requests_total accumulated total requests
            # TYPE http_requests_total counter
            http_requests_total{method="GET",path="/metrics",status="aborted"} 0

Authentication:
    These tests require authentication tokens for InfluxDB 3 Core and
    Enterprise. If you get 401 errors, set the following environment variables:
    - INFLUXDB3_CORE_TOKEN: Admin token for the InfluxDB 3 Core instance
    - INFLUXDB3_ENTERPRISE_TOKEN: Admin token for the InfluxDB 3 Enterprise instance

Prerequisites:
    - Docker and Docker Compose installed
    - Running InfluxDB 3 Core and Enterprise containers
    - Valid authentication tokens stored in ~/.env.influxdb3-core-admin-token
      and ~/.env.influxdb3-enterprise-admin-token (for the wrapper script)
"""

import os
import re

import pytest
import requests

# Set to True to see detailed output of what's being checked
VERBOSE_OUTPUT = os.environ.get("VERBOSE_METRICS_TEST", "false").lower() == "true"


class MetricsHelper:
    """Helper class for metrics endpoint testing."""

    @staticmethod
    def get_auth_headers(token_env_var):
        """Get authorization headers if a token is set."""
        token = os.environ.get(token_env_var)
        if token:
            return {"Authorization": f"Token {token}"}
        return {}

    @staticmethod
    def get_metrics(url, token_env_var):
        """Get metrics from an endpoint with optional authentication."""
        headers = MetricsHelper.get_auth_headers(token_env_var)
        response = requests.get(f"{url}/metrics", headers=headers, timeout=5)

        if response.status_code == 401:
            pytest.skip(f"Authentication required. Set the {token_env_var} environment variable.")

        assert response.status_code == 200, f"Metrics endpoint returned {response.status_code}"
        return response.text

    @staticmethod
    def print_metric_search(test_name, metrics, text, include_comments=False):
        """Print verbose output showing searched metrics and matches."""
        if not VERBOSE_OUTPUT:
            return

        print("\n" + "=" * 80)
        print(f"TEST: {test_name}")
        print("=" * 80)

        lines = text.split("\n")
        for metric in metrics:
            if include_comments:
                matches = [line for line in lines if metric in line][:3]
            else:
                matches = [line for line in lines if metric in line and not line.startswith("#")][:3]

            print(f"\n✓ Searching for: {metric}")
            if matches:
                if include_comments:
                    print(f"  Found {len([l for l in lines if metric in l])} total occurrences")
                print("  Matches:")
                for match in matches:
                    print(f"    {match}")
            else:
                print("  ✗ NOT FOUND")


def test_core_metrics_endpoint_accessible():
    """Test that the Core metrics endpoint is accessible."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
    assert len(text) > 0, "Core metrics response is empty"


def test_enterprise_metrics_endpoint_accessible():
    """Test that the Enterprise metrics endpoint is accessible."""
    text = MetricsHelper.get_metrics("http://influxdb3-enterprise:8181", "INFLUXDB3_ENTERPRISE_TOKEN")
    assert len(text) > 0, "Enterprise metrics response is empty"


def test_prometheus_format():
    """Test that metrics follow the Prometheus exposition format."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    # Check for HELP comments
    assert "# HELP" in text, "Missing HELP comments"

    # Check for TYPE comments
    assert "# TYPE" in text, "Missing TYPE comments"

    # Check for valid metric lines (name{labels} value or name value)
    metric_pattern = r"^[a-zA-Z_][a-zA-Z0-9_]*(\{[^}]*\})?\s+[\d\.\+\-eE]+(\s+\d+)?$"
    lines = [line for line in text.split("\n") if line and not line.startswith("#")]
    assert any(
        re.match(metric_pattern, line) for line in lines
    ), "No valid metric lines found"


def test_http_grpc_metrics():
    """Test that HTTP and gRPC metrics exist."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    metrics = [
        "http_requests_total",
        "http_request_duration_seconds",
        "http_response_body_size_bytes",
        "grpc_requests_total",
        "grpc_request_duration_seconds",
    ]

    MetricsHelper.print_metric_search("HTTP/gRPC Metrics", metrics, text, include_comments=True)

    missing = [m for m in metrics if m not in text]
    assert not missing, f"Missing HTTP/gRPC metrics: {missing}"


def test_database_operation_metrics():
    """Test that database operation metrics exist."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    metrics = [
        "influxdb3_catalog_operations_total",
        "influxdb3_catalog_operation_retries_total",
    ]

    MetricsHelper.print_metric_search("Database Operation Metrics", metrics, text)

    missing = [m for m in metrics if m not in text]
    assert not missing, f"Missing database operation metrics: {missing}"


def test_query_performance_metrics():
    """Test that query performance metrics exist."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    metrics = [
        "influxdb_iox_query_log_compute_duration_seconds",
        "influxdb_iox_query_log_execute_duration_seconds",
        "influxdb_iox_query_log_plan_duration_seconds",
        "influxdb_iox_query_log_end2end_duration_seconds",
        "influxdb_iox_query_log_max_memory",
        "influxdb_iox_query_log_parquet_files",
    ]

    MetricsHelper.print_metric_search("Query Performance Metrics", metrics, text)

    missing = [m for m in metrics if m not in text]
    assert not missing, f"Missing query performance metrics: {missing}"


def test_memory_caching_metrics():
    """Test that memory and caching metrics exist."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    metrics = [
        "datafusion_mem_pool_bytes",
        "influxdb3_parquet_cache_access_total",
        "influxdb3_parquet_cache_size_bytes",
        "influxdb3_parquet_cache_size_number_of_files",
        "jemalloc_memstats_bytes",
    ]

    MetricsHelper.print_metric_search("Memory & Caching Metrics", metrics, text)

    missing = [m for m in metrics if m not in text]
    assert not missing, f"Missing memory/caching metrics: {missing}"


def test_object_storage_metrics():
    """Test that object storage metrics exist."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    metrics = [
        "object_store_op_duration_seconds",
        "object_store_transfer_bytes_total",
        "object_store_transfer_objects_total",
    ]

    MetricsHelper.print_metric_search("Object Storage Metrics", metrics, text)

    missing = [m for m in metrics if m not in text]
    assert not missing, f"Missing object storage metrics: {missing}"


def test_runtime_system_metrics():
    """Test that runtime and system metrics exist."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    metrics = [
        "process_start_time_seconds",
        "thread_panic_count_total",
        "tokio_runtime_num_alive_tasks",
    ]

    MetricsHelper.print_metric_search("Runtime & System Metrics", metrics, text)

    missing = [m for m in metrics if m not in text]
    assert not missing, f"Missing runtime/system metrics: {missing}"


def test_metric_types():
    """Test that key metrics have the correct types."""
    text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")

    # Check for expected types (case-insensitive match on the TYPE line)
    type_checks = [
        ("http_requests_total", "counter"),
        ("http_request_duration_seconds", "histogram"),
        ("datafusion_mem_pool_bytes", "gauge"),
    ]

    if VERBOSE_OUTPUT:
        print("\n" + "=" * 80)
        print("TEST: Metric Type Validation")
        print("=" * 80)
        for metric_name, expected_type in type_checks:
            type_pattern = rf"# TYPE {metric_name}\s+{expected_type}"
            match = re.search(type_pattern, text, re.IGNORECASE)
            print(f"\n✓ Checking: {metric_name} should be {expected_type}")
            if match:
                print(f"  Match: {match.group()}")
            else:
                print("  ✗ NOT FOUND or WRONG TYPE")

    for metric_name, expected_type in type_checks:
        # Look for the TYPE line for this metric
        type_pattern = rf"# TYPE {metric_name}\s+{expected_type}"
        assert re.search(
            type_pattern, text, re.IGNORECASE
        ), f"Metric {metric_name} should be type {expected_type}"


def test_enterprise_cluster_metrics():
    """Test that Enterprise-specific cluster metrics exist."""
    text = MetricsHelper.get_metrics("http://influxdb3-enterprise:8181", "INFLUXDB3_ENTERPRISE_TOKEN")

    # These metrics are mentioned in the Enterprise documentation
    metrics = [
        "influxdb3_catalog_operation_retries_total",
        "influxdb_iox_query_log_ingester_latency",
    ]

    MetricsHelper.print_metric_search("Enterprise Cluster Metrics", metrics, text)

    missing = [m for m in metrics if m not in text]
    assert not missing, f"Missing Enterprise cluster metrics: {missing}"


@pytest.mark.parametrize("url,token_env,instance", [
    ("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN", "Core"),
    ("http://influxdb3-enterprise:8181", "INFLUXDB3_ENTERPRISE_TOKEN", "Enterprise"),
])
def test_metrics_have_labels(url, token_env, instance):
    """Test that metrics have properly formatted labels."""
    text = MetricsHelper.get_metrics(url, token_env)

    # Find a metric with labels (look for http_requests_total)
    label_pattern = r'http_requests_total\{[^}]+\}'
    matches = re.findall(label_pattern, text)

    if VERBOSE_OUTPUT:
        print("\n" + "=" * 80)
        print(f"TEST: Metric Label Validation ({instance})")
        print("=" * 80)
        print(f"\n✓ Searching for labeled metrics using pattern: {label_pattern}")
        print(f"  Found {len(matches)} labeled metrics")
        if matches:
            print("  Sample matches:")
            for match in matches[:3]:
                print(f"    {match}")

    assert len(matches) > 0, f"{instance}: No metrics with labels found"

    # Check that labels are properly formatted
    for match in matches:
        assert '="' in match, f"{instance}: Labels should use = and quotes"
        assert match.endswith("}"), f"{instance}: Labels should end with }}"

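`test_prometheus_format` above validates data lines with a single regex. A standalone sketch showing how that same pattern classifies a few illustrative lines:

```python
import re

# Same pattern as test_prometheus_format: metric name, optional {labels},
# a numeric value, and an optional trailing timestamp.
METRIC_PATTERN = r"^[a-zA-Z_][a-zA-Z0-9_]*(\{[^}]*\})?\s+[\d\.\+\-eE]+(\s+\d+)?$"

samples = [
    'http_requests_total{method="GET"} 42',  # labeled counter sample
    'process_start_time_seconds 1.7e+09',    # bare sample, scientific notation
    '# TYPE http_requests_total counter',    # comment: not a data line
    'bad line without a value',              # malformed
]

for line in samples:
    ok = bool(re.match(METRIC_PATTERN, line))
    print(f"{ok!s:5} {line}")
# The first two lines match; the comment and the malformed line do not.
```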
@ -0,0 +1,87 @@
# Prometheus configuration for testing InfluxDB 3 metrics
# Based on the documentation in content/shared/influxdb3-admin/monitor-metrics.md
# This configuration matches the examples provided in PR #6422

# NOTE: If your InfluxDB instance requires authentication for the /metrics
# endpoint, configure bearer_token or bearer_token_file in the scrape configs
# below. See:
# https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config

global:
  scrape_interval: 30s
  evaluation_interval: 30s
  external_labels:
    monitor: 'influxdb3-test'

# Scrape configurations
scrape_configs:
  # InfluxDB 3 Core
  # Documentation reference: lines 563-571 in monitor-metrics.md
  - job_name: 'influxdb3-core'
    static_configs:
      - targets: ['influxdb3-core:8181']
        labels:
          environment: 'test'
          product: 'core'
    metrics_path: '/metrics'
    scrape_interval: 30s
    scrape_timeout: 10s
    # Authentication uses a credential file.
    # The token is written to /tmp/core-token by the docker-compose entrypoint.
    authorization:
      credentials_file: /tmp/core-token
    # Fallback protocol for targets that don't send a Content-Type header
    fallback_scrape_protocol: 'PrometheusText0.0.4'

    # Relabeling to add node identification (same as Enterprise)
    relabel_configs:
      # Extract the node name from the address
      - source_labels: [__address__]
        target_label: node_name
        regex: '([^:]+):.*'
        replacement: '${1}'
      # Add the node role based on the name pattern
      - source_labels: [node_name]
        target_label: node_role
        regex: '.*core.*'
        replacement: 'all-in-one-core'
      - source_labels: [node_name]
        target_label: node_role
        regex: '.*enterprise.*'
        replacement: 'all-in-one-enterprise'

  # InfluxDB 3 Enterprise
  # Documentation reference: lines 399-418 in monitor-metrics.md
  # Includes the relabeling from lines 536-553
  - job_name: 'influxdb3-enterprise'
    static_configs:
      - targets: ['influxdb3-enterprise:8181']
        labels:
          environment: 'test'
          product: 'enterprise'
    metrics_path: '/metrics'
    scrape_interval: 30s
    scrape_timeout: 10s
    # Authentication uses a credential file.
    # The token is written to /tmp/enterprise-token by the docker-compose entrypoint.
    authorization:
      credentials_file: /tmp/enterprise-token
    # Fallback protocol for targets that don't send a Content-Type header
    fallback_scrape_protocol: 'PrometheusText0.0.4'

    # Relabeling to add node identification
    # Documentation reference: lines 536-553
    relabel_configs:
      # Extract the node name from the address
      - source_labels: [__address__]
        target_label: node_name
        regex: '([^:]+):.*'
        replacement: '${1}'
      # Add the node role based on the name pattern
      - source_labels: [node_name]
        target_label: node_role
        regex: '.*core.*'
        replacement: 'all-in-one-core'
      - source_labels: [node_name]
        target_label: node_role
        regex: '.*enterprise.*'
        replacement: 'all-in-one-enterprise'

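The `relabel_configs` above derive `node_name` from `__address__` using the regex `([^:]+):.*`. Prometheus anchors relabeling regexes to the full string, so Python's `re.fullmatch` mirrors the behavior; a small sketch checking the pattern against the test targets:

```python
import re

# Mirror of the relabeling rule: capture the host part of __address__.
ADDRESS_REGEX = re.compile(r"([^:]+):.*")


def node_name(address):
    """Return the host portion of an address, or the address unchanged."""
    m = ADDRESS_REGEX.fullmatch(address)
    return m.group(1) if m else address


print(node_name("influxdb3-core:8181"))        # influxdb3-core
print(node_name("influxdb3-enterprise:8181"))  # influxdb3-enterprise
```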
@ -0,0 +1,391 @@
"""Test Prometheus integration and relabeling for PR #6422.

This test suite validates that the Prometheus configuration and relabeling
examples documented in PR #6422 actually work correctly.

Unlike metrics_endpoint_test.py, which directly queries InfluxDB endpoints,
this test:
1. Starts Prometheus with the documented configuration
2. Validates that Prometheus can scrape the InfluxDB endpoints
3. Verifies that relabeling rules add node_name and node_role labels
4. Tests PromQL queries against the relabeled metrics

Usage:
    # Start Prometheus and run the integration tests
    docker compose --profile monitoring up -d
    docker compose run --rm influxdb3-core-pytest test/prometheus_integration_test.py

    # Or use the wrapper script
    ./test/run-prometheus-tests.sh

Prerequisites:
    - Docker and Docker Compose installed
    - Running InfluxDB 3 Core and Enterprise containers
    - Prometheus service started with --profile monitoring
    - Valid authentication tokens (if required)
"""

import os
import time

import pytest
import requests

# Prometheus API endpoint
PROMETHEUS_URL = os.environ.get("PROMETHEUS_URL", "http://prometheus:9090")

# Set to True to see detailed output
VERBOSE_OUTPUT = os.environ.get("VERBOSE_PROMETHEUS_TEST", "false").lower() == "true"


class PrometheusHelper:
    """Helper class for Prometheus integration testing."""

    @staticmethod
    def wait_for_prometheus(timeout=30):
        """Wait for Prometheus to be ready."""
        start_time = time.time()
        while time.time() - start_time < timeout:
            try:
                response = requests.get(f"{PROMETHEUS_URL}/-/ready", timeout=5)
                if response.status_code == 200:
                    return True
            except requests.exceptions.RequestException:
                pass
            time.sleep(1)
        return False

    @staticmethod
    def wait_for_targets(timeout=60):
        """Wait for Prometheus to discover and scrape its targets."""
        start_time = time.time()
        while time.time() - start_time < timeout:
            try:
                response = requests.get(
                    f"{PROMETHEUS_URL}/api/v1/targets",
                    timeout=5
                )
                if response.status_code == 200:
                    data = response.json()
                    active_targets = data.get("data", {}).get("activeTargets", [])

                    # Check whether all targets are up
                    all_up = all(
                        target.get("health") == "up"
                        for target in active_targets
                    )

                    if all_up and len(active_targets) >= 2:
                        if VERBOSE_OUTPUT:
                            print(f"\n✓ All {len(active_targets)} targets are up")
                        return True

                    if VERBOSE_OUTPUT:
                        up_count = sum(
                            1 for t in active_targets
                            if t.get("health") == "up"
                        )
                        print(f"  Waiting for targets: {up_count}/{len(active_targets)} up")
            except requests.exceptions.RequestException as e:
                if VERBOSE_OUTPUT:
                    print(f"  Error checking targets: {e}")
            time.sleep(2)
        return False

    @staticmethod
    def query_prometheus(query):
        """Execute a PromQL query."""
        response = requests.get(
            f"{PROMETHEUS_URL}/api/v1/query",
            params={"query": query},
            timeout=10
        )
        assert response.status_code == 200, f"Query failed: {response.text}"
        return response.json()

    @staticmethod
    def print_query_result(query, result):
        """Print a verbose query result."""
        if not VERBOSE_OUTPUT:
            return

        print(f"\n✓ Query: {query}")
        data = result.get("data", {})
        result_type = data.get("resultType")
        results = data.get("result", [])

        print(f"  Result type: {result_type}")
        print(f"  Number of results: {len(results)}")

        if results:
            print("  Sample results:")
            for item in results[:3]:
                metric = item.get("metric", {})
                value = item.get("value", [None, None])
                print(f"    {metric} => {value[1]}")


def test_prometheus_is_ready():
    """Test that the Prometheus service is ready."""
    assert PrometheusHelper.wait_for_prometheus(), (
        "Prometheus not ready after 30 seconds. "
        "Ensure Prometheus is running: docker compose --profile monitoring up -d"
    )


def test_prometheus_targets_discovered():
    """Test that Prometheus has discovered the InfluxDB targets."""
    response = requests.get(f"{PROMETHEUS_URL}/api/v1/targets", timeout=10)
    assert response.status_code == 200, "Failed to get targets"

    data = response.json()
    targets = data.get("data", {}).get("activeTargets", [])

    if VERBOSE_OUTPUT:
        print("\n" + "=" * 80)
        print("TEST: Prometheus Target Discovery")
        print("=" * 80)
        for target in targets:
            health = target.get("health")
            job = target.get("labels", {}).get("job")
            address = target.get("scrapeUrl")
            print(f"\n✓ Target: {job}")
            print(f"  Health: {health}")
            print(f"  Address: {address}")

    # Should have at least 2 targets (core and enterprise)
    assert len(targets) >= 2, f"Expected at least 2 targets, found {len(targets)}"

    # Check for the expected job names
    job_names = {target.get("labels", {}).get("job") for target in targets}
    assert "influxdb3-core" in job_names, "Missing influxdb3-core target"
    assert "influxdb3-enterprise" in job_names, "Missing influxdb3-enterprise target"


def test_prometheus_targets_up():
    """Test that all Prometheus targets are healthy."""
    assert PrometheusHelper.wait_for_targets(), (
        "Targets not healthy after 60 seconds. "
        "Check that the InfluxDB instances are running and accessible."
    )

    response = requests.get(f"{PROMETHEUS_URL}/api/v1/targets", timeout=10)
    data = response.json()
    targets = data.get("data", {}).get("activeTargets", [])

    unhealthy = [
        target for target in targets
        if target.get("health") != "up"
    ]

    assert not unhealthy, (
        f"Found {len(unhealthy)} unhealthy targets: "
        f"{[t.get('labels', {}).get('job') for t in unhealthy]}"
    )


def test_relabeling_adds_node_name():
    """Test that relabeling adds the node_name label.

    Documentation reference: monitor-metrics.md lines 536-540.
    Relabeling extracts the hostname from __address__ and adds it as node_name.
    """
    # Wait for metrics to be scraped
    time.sleep(5)

    # Query for any metric with a node_name label
    query = 'http_requests_total{node_name!=""}'
    result = PrometheusHelper.query_prometheus(query)

    PrometheusHelper.print_query_result(query, result)

    data = result.get("data", {})
    results = data.get("result", [])

    assert len(results) > 0, (
        "No metrics found with the node_name label. "
        "Relabeling may not be working correctly."
    )

    # Verify that node_name values match the expected patterns
    node_names = {
        item.get("metric", {}).get("node_name")
        for item in results
    }

    if VERBOSE_OUTPUT:
        print(f"\n✓ Found node_name labels: {node_names}")

    # Should have node names for both core and enterprise
    assert any("core" in name for name in node_names), (
        "No node_name containing 'core' found"
    )
    assert any("enterprise" in name for name in node_names), (
        "No node_name containing 'enterprise' found"
)
|
||||
|
||||
|
||||
def test_relabeling_adds_node_role():
|
||||
"""Test that relabeling adds node_role label.
|
||||
|
||||
Documentation reference: monitor-metrics.md lines 541-553
|
||||
Relabeling assigns node_role based on node_name pattern.
|
||||
"""
|
||||
# Wait for metrics to be scraped
|
||||
time.sleep(5)
|
||||
|
||||
# Query for metrics with node_role label
|
||||
query = 'http_requests_total{node_role!=""}'
|
||||
result = PrometheusHelper.query_prometheus(query)
|
||||
|
||||
PrometheusHelper.print_query_result(query, result)
|
||||
|
||||
data = result.get("data", {})
|
||||
results = data.get("result", [])
|
||||
|
||||
assert len(results) > 0, (
|
||||
"No metrics found with node_role label. "
|
||||
"Relabeling may not be working correctly."
|
||||
)
|
||||
|
||||
# Verify node_role values
|
||||
node_roles = {
|
||||
result.get("metric", {}).get("node_role")
|
||||
for result in results
|
||||
}
|
||||
|
||||
if VERBOSE_OUTPUT:
|
||||
print(f"\n✓ Found node_role labels: {node_roles}")
|
||||
|
||||
# Based on test/prometheus.yml relabeling rules
|
||||
expected_roles = {"all-in-one-core", "all-in-one-enterprise"}
|
||||
assert node_roles & expected_roles, (
|
||||
f"Expected roles {expected_roles}, found {node_roles}"
|
||||
)
|
||||
|
||||
|
||||
def test_query_metrics_by_node():
|
||||
"""Test that metrics can be queried by node labels.
|
||||
|
||||
This validates that users can filter metrics by node_name and node_role
|
||||
as documented in the monitoring guide.
|
||||
"""
|
||||
# Wait for metrics to be scraped
|
||||
time.sleep(5)
|
||||
|
||||
# Query metrics for specific node
|
||||
queries = [
|
||||
'http_requests_total{node_name="influxdb3-core"}',
|
||||
'http_requests_total{node_name="influxdb3-enterprise"}',
|
||||
'http_requests_total{node_role="all-in-one-core"}',
|
||||
'http_requests_total{node_role="all-in-one-enterprise"}',
|
||||
]
|
||||
|
||||
if VERBOSE_OUTPUT:
|
||||
print("\n" + "="*80)
|
||||
print("TEST: Query Metrics by Node Labels")
|
||||
print("="*80)
|
||||
|
||||
for query in queries:
|
||||
result = PrometheusHelper.query_prometheus(query)
|
||||
PrometheusHelper.print_query_result(query, result)
|
||||
|
||||
data = result.get("data", {})
|
||||
results = data.get("result", [])
|
||||
|
||||
assert len(results) > 0, f"No results for query: {query}"
|
||||
|
||||
|
||||
def test_promql_rate_query():
|
||||
"""Test rate() query from documentation examples.
|
||||
|
||||
Documentation commonly shows rate queries for counters.
|
||||
"""
|
||||
# Wait for enough data
|
||||
time.sleep(10)
|
||||
|
||||
query = 'rate(http_requests_total[1m])'
|
||||
result = PrometheusHelper.query_prometheus(query)
|
||||
|
||||
PrometheusHelper.print_query_result(query, result)
|
||||
|
||||
data = result.get("data", {})
|
||||
results = data.get("result", [])
|
||||
|
||||
# Should have results (may be 0 if no recent requests)
|
||||
assert isinstance(results, list), "Expected list of results"
|
||||
|
||||
|
||||
def test_promql_histogram_quantile():
|
||||
"""Test histogram_quantile() query from documentation examples.
|
||||
|
||||
Documentation reference: Example queries for query duration metrics.
|
||||
"""
|
||||
# Wait for enough data
|
||||
time.sleep(10)
|
||||
|
||||
query = 'histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[1m]))'
|
||||
result = PrometheusHelper.query_prometheus(query)
|
||||
|
||||
PrometheusHelper.print_query_result(query, result)
|
||||
|
||||
# Query should execute without error
|
||||
assert result.get("status") == "success", (
|
||||
f"Query failed: {result.get('error')}"
|
||||
)
|
||||
|
||||
|
||||
def test_enterprise_metrics_queryable():
|
||||
"""Test that Enterprise-specific metrics are queryable via Prometheus."""
|
||||
# Wait for metrics to be scraped
|
||||
time.sleep(5)
|
||||
|
||||
# Query Enterprise-specific metrics
|
||||
queries = [
|
||||
'influxdb3_catalog_operation_retries_total',
|
||||
'influxdb_iox_query_log_ingester_latency',
|
||||
]
|
||||
|
||||
if VERBOSE_OUTPUT:
|
||||
print("\n" + "="*80)
|
||||
print("TEST: Enterprise-Specific Metrics")
|
||||
print("="*80)
|
||||
|
||||
for query in queries:
|
||||
result = PrometheusHelper.query_prometheus(query)
|
||||
PrometheusHelper.print_query_result(query, result)
|
||||
|
||||
# Query should execute (may have no results if no activity)
|
||||
assert result.get("status") == "success", (
|
||||
f"Query failed: {result.get('error')}"
|
||||
)
|
||||
|
||||
|
||||
def test_prometheus_config_matches_docs():
|
||||
"""Verify Prometheus configuration matches documented examples.
|
||||
|
||||
This test validates that test/prometheus.yml matches the configuration
|
||||
examples in the documentation.
|
||||
"""
|
||||
response = requests.get(f"{PROMETHEUS_URL}/api/v1/status/config", timeout=10)
|
||||
assert response.status_code == 200, "Failed to get Prometheus config"
|
||||
|
||||
config = response.json()
|
||||
config_yaml = config.get("data", {}).get("yaml", "")
|
||||
|
||||
if VERBOSE_OUTPUT:
|
||||
print("\n" + "="*80)
|
||||
print("TEST: Prometheus Configuration")
|
||||
print("="*80)
|
||||
print("\nConfiguration (first 500 chars):")
|
||||
print(config_yaml[:500])
|
||||
|
||||
# Verify key configuration elements from documentation
|
||||
assert "influxdb3-core" in config_yaml, "Missing influxdb3-core job"
|
||||
assert "influxdb3-enterprise" in config_yaml, "Missing influxdb3-enterprise job"
|
||||
assert "relabel_configs" in config_yaml, "Missing relabel_configs"
|
||||
assert "node_name" in config_yaml, "Missing node_name in relabeling"
|
||||
assert "node_role" in config_yaml, "Missing node_role in relabeling"
|
||||
|
||||
# Verify scrape settings
|
||||
assert "/metrics" in config_yaml, "Missing /metrics path"
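Several of the tests above (target discovery, target health) reduce to walking the `activeTargets` list in the payload that Prometheus's `/api/v1/targets` endpoint returns. A minimal, self-contained sketch of that extraction; the sample payload below is hypothetical, shaped like Prometheus's documented response:

```python
# Hypothetical payload shaped like Prometheus's /api/v1/targets response.
SAMPLE_TARGETS = {
    "status": "success",
    "data": {
        "activeTargets": [
            {
                "health": "up",
                "labels": {"job": "influxdb3-core"},
                "scrapeUrl": "http://influxdb3-core:8181/metrics",
            },
            {
                "health": "down",
                "labels": {"job": "influxdb3-enterprise"},
                "scrapeUrl": "http://influxdb3-enterprise:8181/metrics",
            },
        ]
    },
}


def unhealthy_jobs(payload):
    """Return job names of active targets whose health is not 'up'."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        t.get("labels", {}).get("job")
        for t in targets
        if t.get("health") != "up"
    ]


print(unhealthy_jobs(SAMPLE_TARGETS))  # → ['influxdb3-enterprise']
```

The same `payload → data → activeTargets` traversal backs both the "at least 2 targets" and the "no unhealthy targets" assertions in the tests.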
|
|
@ -0,0 +1,50 @@
|
|||
#!/bin/bash
# Run metrics endpoint tests with authentication
#
# Usage:
#   ./test/run-metrics-tests.sh               # Run direct metrics tests
#   ./test/run-metrics-tests.sh --prometheus  # Run Prometheus integration tests
#   ./test/run-metrics-tests.sh --all         # Run both test suites

set -e

# Read tokens from secret files
INFLUXDB3_CORE_TOKEN=$(cat ~/.env.influxdb3-core-admin-token)
INFLUXDB3_ENTERPRISE_TOKEN=$(cat ~/.env.influxdb3-enterprise-admin-token)

# Export for docker compose
export INFLUXDB3_CORE_TOKEN
export INFLUXDB3_ENTERPRISE_TOKEN
export VERBOSE_METRICS_TEST

# Parse arguments
RUN_DIRECT=true
RUN_PROMETHEUS=false

if [[ "$1" == "--prometheus" ]]; then
    RUN_DIRECT=false
    RUN_PROMETHEUS=true
    shift
elif [[ "$1" == "--all" ]]; then
    RUN_DIRECT=true
    RUN_PROMETHEUS=true
    shift
fi

# Run direct metrics tests
if [[ "$RUN_DIRECT" == "true" ]]; then
    echo "Running direct metrics endpoint tests..."
    docker compose run --rm \
        -e INFLUXDB3_CORE_TOKEN \
        -e INFLUXDB3_ENTERPRISE_TOKEN \
        -e VERBOSE_METRICS_TEST \
        influxdb3-core-pytest \
        "test/influxdb3/metrics_endpoint_test.py" "$@"
    echo ""
fi

# Run Prometheus integration tests
if [[ "$RUN_PROMETHEUS" == "true" ]]; then
    echo "Running Prometheus integration tests..."
    ./test/influxdb3/run-prometheus-tests.sh "$@"
fi
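The wrapper scripts read the admin tokens with a bare `cat`; under `set -e` a missing secret file aborts with an unhelpful error. A hedged variant of that step — the `read_token` helper name is ours, not part of this PR:

```shell
#!/bin/bash
# Sketch only: fail with a clear message when a secret file is missing,
# instead of letting `cat` abort the script under `set -e`.
read_token() {
    local file="$1"
    if [[ ! -f "$file" ]]; then
        echo "Missing token file: $file" >&2
        return 1
    fi
    cat "$file"
}

# Example usage (path mirrors the scripts above):
# INFLUXDB3_CORE_TOKEN=$(read_token ~/.env.influxdb3-core-admin-token)
```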
@@ -0,0 +1,48 @@
#!/bin/bash
# Run Prometheus integration tests with authentication
# This script validates that Prometheus can scrape InfluxDB metrics
# and that relabeling configuration works as documented.

set -e

# Read tokens from secret files
INFLUXDB3_CORE_TOKEN=$(cat ~/.env.influxdb3-core-admin-token)
INFLUXDB3_ENTERPRISE_TOKEN=$(cat ~/.env.influxdb3-enterprise-admin-token)

# Export for docker compose
export INFLUXDB3_CORE_TOKEN
export INFLUXDB3_ENTERPRISE_TOKEN
export VERBOSE_PROMETHEUS_TEST

echo "Starting Prometheus integration tests..."
echo ""
echo "This will:"
echo "  1. Start Prometheus with documented configuration"
echo "  2. Wait for Prometheus to scrape InfluxDB endpoints"
echo "  3. Validate relabeling adds node_name and node_role labels"
echo "  4. Test PromQL queries with relabeled metrics"
echo ""

# Start Prometheus if not already running
if ! docker ps | grep -q prometheus; then
    echo "Starting Prometheus service..."
    docker compose --profile monitoring up -d prometheus
    echo "Waiting for Prometheus to start..."
    sleep 5
fi

# Run tests
echo "Running Prometheus integration tests..."
docker compose run --rm \
    -e INFLUXDB3_CORE_TOKEN \
    -e INFLUXDB3_ENTERPRISE_TOKEN \
    -e VERBOSE_PROMETHEUS_TEST \
    -e PROMETHEUS_URL=http://prometheus:9090 \
    influxdb3-core-pytest \
    "test/influxdb3/prometheus_integration_test.py" "$@"

echo ""
echo "Tests complete!"
echo ""
echo "To view Prometheus UI, visit: http://localhost:9090"
echo "To stop Prometheus: docker compose --profile monitoring down"
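The script falls back on a fixed `sleep 5` before the tests take over, and the tests rely on a `PrometheusHelper.wait_for_prometheus` helper that is not shown in this diff. A generic readiness poll it might resemble; the function name and the idea of hitting Prometheus's `/-/ready` endpoint are our assumptions, and `fetch` is injectable so the loop can be exercised without a live server:

```python
import time
import urllib.request


def wait_for_ready(url, timeout=30, interval=2, fetch=urllib.request.urlopen):
    """Poll `url` until it answers HTTP 200 or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # e.g. url = "http://localhost:9090/-/ready" for Prometheus
            if fetch(url).status == 200:
                return True
        except OSError:
            pass  # server not up yet; retry after the interval
        time.sleep(interval)
    return False
```

Polling with a deadline replaces the fixed sleep: it returns as soon as the service is up and fails loudly when it never comes up.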
@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""Display sample metrics output from InfluxDB 3 instances."""

import os
import sys
import requests


def show_metrics_sample(url, token_env_var, instance_name, num_lines=150):
    """Fetch and display sample metrics."""
    print(f"\n{'='*80}")
    print(f"{instance_name} Metrics Sample (first {num_lines} lines)")
    print(f"URL: {url}/metrics")
    print(f"{'='*80}\n")

    # Get auth headers
    headers = {}
    token = os.environ.get(token_env_var)
    if token:
        headers = {"Authorization": f"Token {token}"}
        print(f"✓ Using authentication token from {token_env_var}\n")
    else:
        print(f"⚠ No token found in {token_env_var} - trying without auth\n")

    try:
        response = requests.get(f"{url}/metrics", headers=headers, timeout=5)

        if response.status_code == 401:
            print("✗ Authentication required but no valid token provided")
            return

        if response.status_code != 200:
            print(f"✗ Unexpected status code: {response.status_code}")
            return

        # Display first N lines
        lines = response.text.split('\n')
        print(f"Total lines: {len(lines)}\n")

        for i, line in enumerate(lines[:num_lines], 1):
            print(f"{i:4d} | {line}")

        if len(lines) > num_lines:
            print(f"\n... ({len(lines) - num_lines} more lines)")

        # Show some interesting metrics
        print(f"\n{'='*80}")
        print("Sample Metric Searches:")
        print(f"{'='*80}\n")

        metrics_to_show = [
            "http_requests_total",
            "grpc_requests_total",
            "influxdb3_catalog_operations_total",
            "influxdb_iox_query_log_compute_duration_seconds",
            "datafusion_mem_pool_bytes",
            "object_store_op_duration_seconds",
        ]

        for metric in metrics_to_show:
            matching = [line for line in lines if metric in line and not line.startswith("#")]
            if matching:
                print(f"✓ Found '{metric}' - showing first 3 values:")
                for match in matching[:3]:
                    print(f"  {match}")
            else:
                print(f"✗ Metric '{metric}' not found")

    except Exception as e:
        print(f"✗ Error fetching metrics: {e}")


if __name__ == "__main__":
    # Show Core metrics
    show_metrics_sample(
        "http://influxdb3-core:8181",
        "INFLUXDB3_CORE_TOKEN",
        "InfluxDB 3 Core",
        num_lines=100
    )

    print("\n\n")

    # Show Enterprise metrics
    show_metrics_sample(
        "http://influxdb3-enterprise:8181",
        "INFLUXDB3_ENTERPRISE_TOKEN",
        "InfluxDB 3 Enterprise",
        num_lines=100
    )
@@ -5,8 +5,8 @@
 python_files = *_test.py *_test_sh.py
 # Collect classes with names ending in Test.
 python_classes = *Test
-# Collect all functions.
-python_functions = *
+# Collect test functions (exclude helpers).
+python_functions = test_*
 
 filterwarnings = ignore::pytest.PytestReturnNotNoneWarning
 # Log settings.