Merge pull request #6422 from influxdata:influxdb3-monitor-metrics

Influxdb3 monitor metrics
6403-influxdb3-perf-tuning
Jason Stirnaman 2025-10-02 17:02:19 -05:00 committed by GitHub
commit f14c244316
31 changed files with 5357 additions and 13 deletions

View File

@ -121,6 +121,117 @@ Potential causes:
# This is ignored
```
### Metrics Endpoint Testing
The metrics testing suite validates InfluxDB 3 Core and Enterprise metrics in two phases:
1. **Phase 1: Direct metrics validation** - Validates metric format, existence, and types by directly querying InfluxDB endpoints
2. **Phase 2: Prometheus integration** - Validates Prometheus configuration, scraping, and relabeling work as documented
#### Phase 1: Direct Metrics Validation
The `test/influxdb3/metrics_endpoint_test.py` suite validates that InfluxDB 3 metrics endpoints expose all documented metrics in correct Prometheus format.
**Basic Usage:**
```bash
# Using the wrapper script (recommended)
./test/run-metrics-tests.sh
# Direct execution with Docker Compose
docker compose run --rm influxdb3-core-pytest test/influxdb3/metrics_endpoint_test.py
# Run specific test
docker compose run --rm influxdb3-core-pytest test/influxdb3/metrics_endpoint_test.py -k test_http_grpc_metrics
```
**Verbose Output:**
Set `VERBOSE_METRICS_TEST=true` to see detailed output showing which metrics are searched for and the matching lines from the Prometheus endpoint:
```bash
# With wrapper script
VERBOSE_METRICS_TEST=true ./test/run-metrics-tests.sh
# With Docker Compose
VERBOSE_METRICS_TEST=true docker compose run --rm \
-e VERBOSE_METRICS_TEST \
influxdb3-core-pytest \
test/influxdb3/metrics_endpoint_test.py
```
Example verbose output:
```
TEST: HTTP/gRPC Metrics
================================================================================
✓ Searching for: http_requests_total
Found 12 total occurrences
Matches:
# HELP http_requests_total accumulated total requests
# TYPE http_requests_total counter
http_requests_total{method="GET",path="/metrics",status="aborted"} 0
```
#### Phase 2: Prometheus Integration Testing
The `test/influxdb3/prometheus_integration_test.py` suite validates that Prometheus can scrape InfluxDB metrics and that the documented relabeling configuration works correctly.
**What it validates:**
- Prometheus service discovers InfluxDB targets
- Scrape configuration works with authentication
- Relabeling adds `node_name` and `node_role` labels correctly
- Regex patterns in `relabel_configs` match the documented configuration
- PromQL queries using relabeled metrics work
- Example queries from the documentation execute successfully
**Basic Usage:**
```bash
# Using the wrapper script (recommended)
./test/run-metrics-tests.sh --prometheus
# Run both direct and Prometheus tests
./test/run-metrics-tests.sh --all
# Direct execution
./test/influxdb3/run-prometheus-tests.sh
# With verbose output
VERBOSE_PROMETHEUS_TEST=true ./test/influxdb3/run-prometheus-tests.sh
```
**What happens during Prometheus tests:**
1. Starts Prometheus service with documented configuration from `test/influxdb3/prometheus.yml`
2. Waits for Prometheus to discover and scrape both InfluxDB instances
3. Validates relabeling adds `node_name` label (extracted from `__address__`)
4. Validates relabeling adds `node_role` label (based on node name pattern)
5. Tests PromQL queries can filter by node labels
6. Validates example rate() and histogram_quantile() queries work
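After the tests pass, you can spot-check the relabeled metrics yourself through the Prometheus HTTP API. This is a sketch that assumes Prometheus is published on `localhost:9090` (as in the `monitoring` profile) and that `jq` is installed:
```bash
# List active scrape targets and the labels applied to them
curl -s 'http://localhost:9090/api/v1/targets' | jq '.data.activeTargets[].labels'

# Confirm node_name and node_role appear on scraped series
curl -s 'http://localhost:9090/api/v1/query' --data-urlencode 'query=up' | jq '.data.result[].metric'
```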
**Prerequisites:**
- All Phase 1 prerequisites (see below)
- Prometheus service enabled with: `docker compose --profile monitoring up -d`
#### Authentication
Tests require authentication tokens for InfluxDB 3 instances. Store tokens in:
- `~/.env.influxdb3-core-admin-token` (for Core)
- `~/.env.influxdb3-enterprise-admin-token` (for Enterprise)
Or set environment variables directly:
- `INFLUXDB3_CORE_TOKEN`
- `INFLUXDB3_ENTERPRISE_TOKEN`
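For example, if each token file contains only the raw token string, you can export the variables directly (a sketch; adjust if your files use `KEY=value` format):
```bash
# Load stored admin tokens into the environment
export INFLUXDB3_CORE_TOKEN="$(cat ~/.env.influxdb3-core-admin-token)"
export INFLUXDB3_ENTERPRISE_TOKEN="$(cat ~/.env.influxdb3-enterprise-admin-token)"
```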
#### Prerequisites
- Docker and Docker Compose installed
- Running InfluxDB 3 Core container (`influxdb3-core:8181`)
- Running InfluxDB 3 Enterprise container (`influxdb3-enterprise:8181`)
- Valid authentication tokens
- For Phase 2: Prometheus service (`docker compose --profile monitoring up -d`)
## Link Validation with Link-Checker
Link validation uses the `link-checker` tool to validate internal and external links in documentation files.

View File

@ -306,8 +306,8 @@ services:
working_dir: /app
influxdb3-core:
container_name: influxdb3-core
image: influxdb:3-core
pull_policy: always
image: influxdb:3.5.0-core-arm64
pull_policy: never
# Set variables (except your auth token) for Core in the .env.3core file.
env_file:
- .env.3core
@ -338,8 +338,8 @@ services:
- influxdb3-core-admin-token
influxdb3-enterprise:
container_name: influxdb3-enterprise
image: influxdb:3-enterprise
pull_policy: always
image: influxdb:3.5.0-enterprise-arm64
pull_policy: never
# Set license email and other variables (except your auth token) for Enterprise in the .env.3ent file.
env_file:
- .env.3ent
@ -369,6 +369,74 @@ services:
target: /var/lib/influxdb3/plugins/custom
secrets:
- influxdb3-enterprise-admin-token
influxdb3-enterprise-write:
container_name: influxdb3-enterprise-write
image: influxdb:3.5.0-enterprise-arm64
pull_policy: never
# Set license email and other variables (except your auth token) for Enterprise in the .env.3ent file.
env_file:
- .env.3ent
ports:
- 8183:8181
command:
- influxdb3
- serve
- --node-id=writer-0
- --mode=ingest
- --cluster-id=cluster0
- --object-store=file
- --data-dir=/var/lib/influxdb3/data
- --plugin-dir=/var/lib/influxdb3/plugins
- --log-filter=debug
- --verbose
environment:
- INFLUXDB3_AUTH_TOKEN=/run/secrets/influxdb3-enterprise-admin-token
volumes:
- type: bind
source: test/.influxdb3/enterprise/data
target: /var/lib/influxdb3/data
- type: bind
source: test/.influxdb3/plugins/influxdata
target: /var/lib/influxdb3/plugins
- type: bind
source: test/.influxdb3/enterprise/plugins
target: /var/lib/influxdb3/plugins/custom
secrets:
- influxdb3-enterprise-admin-token
influxdb3-enterprise-query:
container_name: influxdb3-enterprise-query
image: influxdb:3.5.0-enterprise-arm64
pull_policy: never
# Set license email and other variables (except your auth token) for Enterprise in the .env.3ent file.
env_file:
- .env.3ent
ports:
- 8184:8181
command:
- influxdb3
- serve
- --node-id=querier-0
- --mode=query
- --cluster-id=cluster0
- --object-store=file
- --data-dir=/var/lib/influxdb3/data
- --plugin-dir=/var/lib/influxdb3/plugins
- --log-filter=debug
- --verbose
environment:
- INFLUXDB3_AUTH_TOKEN=/run/secrets/influxdb3-enterprise-admin-token
volumes:
- type: bind
source: test/.influxdb3/enterprise/data
target: /var/lib/influxdb3/data
- type: bind
source: test/.influxdb3/plugins/influxdata
target: /var/lib/influxdb3/plugins
- type: bind
source: test/.influxdb3/enterprise/plugins
target: /var/lib/influxdb3/plugins/custom
secrets:
- influxdb3-enterprise-admin-token
telegraf-pytest:
container_name: telegraf-pytest
image: influxdata/docs-pytest
@ -499,7 +567,7 @@ services:
remark-lint:
container_name: remark-lint
build:
context: .
context: .
dockerfile: .ci/Dockerfile.remark
profiles:
- lint
@ -510,6 +578,39 @@ services:
- type: bind
source: ./CONTRIBUTING.md
target: /app/CONTRIBUTING.md
prometheus:
container_name: prometheus
image: prom/prometheus:latest
ports:
- "9090:9090"
environment:
- INFLUXDB3_CORE_TOKEN=${INFLUXDB3_CORE_TOKEN}
- INFLUXDB3_ENTERPRISE_TOKEN=${INFLUXDB3_ENTERPRISE_TOKEN}
volumes:
- type: bind
source: ./test/influxdb3/prometheus.yml
target: /etc/prometheus/prometheus.yml
read_only: true
- type: volume
source: prometheus-data
target: /prometheus
entrypoint:
- /bin/sh
- -c
- |
echo "$$INFLUXDB3_CORE_TOKEN" > /tmp/core-token
echo "$$INFLUXDB3_ENTERPRISE_TOKEN" > /tmp/enterprise-token
exec /bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/prometheus \
--web.console.libraries=/usr/share/prometheus/console_libraries \
--web.console.templates=/usr/share/prometheus/consoles \
--web.enable-lifecycle
depends_on:
- influxdb3-core
- influxdb3-enterprise
profiles:
- monitoring
volumes:
test-content:
cloud-tmp:
@ -517,4 +618,8 @@ volumes:
cloud-serverless-tmp:
clustered-tmp:
telegraf-tmp:
v2-tmp:
v2-tmp:
influxdb3-core-tmp:
influxdb2-data:
influxdb2-config:
prometheus-data:

View File

@ -0,0 +1,24 @@
---
title: Monitor metrics
seotitle: Monitor InfluxDB 3 Core metrics
description: >
Access and understand Prometheus-format metrics exposed by {{< product-name >}}
to monitor system performance, resource usage, and operational health.
menu:
influxdb3_core:
parent: Administer InfluxDB
name: Monitor metrics
weight: 110
influxdb3/core/tags: [monitoring, metrics, prometheus, observability, operations]
related:
- /influxdb3/core/reference/internals/runtime-architecture/
- /influxdb3/core/admin/performance-tuning/
- /influxdb3/core/plugins/library/, InfluxDB 3 Core plugins
- /influxdb3/core/write-data/use-telegraf/
- /influxdb3/core/reference/telemetry/
source: /shared/influxdb3-admin/monitor-metrics.md
---
<!--
//SOURCE - content/shared/influxdb3-admin/monitor-metrics.md
-->

View File

@ -0,0 +1,22 @@
---
title: Metrics
seotitle: InfluxDB 3 Core metrics reference
description: >
Reference for Prometheus-format metrics exposed by InfluxDB 3 Core,
including metric descriptions, types, and labels for monitoring and observability.
menu:
influxdb3_core:
parent: Reference
weight: 106
influxdb3/core/tags: [metrics, prometheus, monitoring, reference, observability]
related:
- /influxdb3/core/admin/monitor-metrics/
- /influxdb3/core/reference/telemetry/
- /influxdb3/core/reference/internals/runtime-architecture/
source: /shared/influxdb3-reference/metrics.md
---
<!--
The content of this file is located at
//SOURCE - content/shared/influxdb3-reference/metrics.md
-->

View File

@ -9,7 +9,7 @@ description: >
menu:
influxdb3_enterprise:
parent: Administer InfluxDB
weight: 105
weight: 106
influxdb3/enterprise/tags: [cache]
related:
- /influxdb3/enterprise/reference/sql/functions/cache/#last_cache, last_cache SQL function

View File

@ -0,0 +1,26 @@
---
title: Monitor metrics
seotitle: Monitor {{< product-name >}} metrics
description: >
Access and understand Prometheus-format metrics exposed by {{< product-name >}}
to monitor distributed cluster performance, resource usage, and operational health.
menu:
influxdb3_enterprise:
parent: Administer InfluxDB
name: Monitor metrics
weight: 110
influxdb3/enterprise/tags: [monitoring, metrics, prometheus, observability, operations, clustering]
related:
- /influxdb3/enterprise/admin/clustering/
- /influxdb3/enterprise/reference/internals/runtime-architecture/
- /influxdb3/enterprise/admin/performance-tuning/
- /influxdb3/enterprise/plugins/library/, InfluxDB 3 Enterprise plugins
- /influxdb3/enterprise/write-data/use-telegraf/
- /influxdb3/enterprise/reference/telemetry/
source: /shared/influxdb3-admin/monitor-metrics.md
---
<!--
The content of this file is located at
//SOURCE - content/shared/influxdb3-admin/monitor-metrics.md
-->

View File

@ -402,12 +402,24 @@ In your terminal, run the `influxdb3 create token --permission` command and prov
- `health`: The specific system resource to grant permissions to.
- `read`: The permission to grant to the token (system tokens are always read-only).
{{% code-placeholders "System health token|1y" %}}
The following example shows how to create tokens for specific system resources:
{{% code-placeholders "(System [a-z]*\s?token)|1y" %}}
```bash
influxdb3 create token \
--permission "system:health:read" \
--name "System health token" \
--expiry 1y
influxdb3 create token \
--permission "system:metrics:read" \
--name "System metrics token" \
--expiry 1y
influxdb3 create token \
--permission "system:ping:read" \
--name "System ping token" \
--expiry 1y
```
{{% /code-placeholders %}}
@ -444,9 +456,9 @@ In the request body, provide the following parameters:
- `permissions`: an array of token permission actions (only `"read"` for system tokens)
- `expiry_secs`: Specify the token expiration time in seconds.
The following example shows how to use the HTTP API to create a system token:
The following example shows how to use the HTTP API to create tokens for specific system resources:
{{% code-placeholders "AUTH_TOKEN|System health token|300000" %}}
{{% code-placeholders "AUTH_TOKEN|(System [a-z]*\s?token)|1y|300000" %}}
```bash
curl \
@ -463,6 +475,36 @@ curl \
}],
"expiry_secs": 300000
}'
curl \
"http://{{< influxdb/host >}}/api/v3/enterprise/configure/token" \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer AUTH_TOKEN" \
--data '{
"token_name": "System metrics token",
"permissions": [{
"resource_type": "system",
"resource_identifier": ["metrics"],
"actions": ["read"]
}],
"expiry_secs": 300000
}'
curl \
"http://{{< influxdb/host >}}/api/v3/enterprise/configure/token" \
--header 'Accept: application/json' \
--header 'Content-Type: application/json' \
--header "Authorization: Bearer AUTH_TOKEN" \
--data '{
"token_name": "System ping token",
"permissions": [{
"resource_type": "system",
"resource_identifier": ["ping"],
"actions": ["read"]
}],
"expiry_secs": 300000
}'
```
{{% /code-placeholders %}}
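After creating a token, you can verify it against the corresponding endpoint (a sketch; `SYSTEM_METRICS_TOKEN` and `SYSTEM_PING_TOKEN` are hypothetical placeholders for the tokens created above):
```bash
# Verify the system metrics token can read the /metrics endpoint
curl -s "http://{{< influxdb/host >}}/metrics" \
  --header "Authorization: Bearer SYSTEM_METRICS_TOKEN" | head -n 5

# Verify the system ping token
curl -s "http://{{< influxdb/host >}}/ping" \
  --header "Authorization: Bearer SYSTEM_PING_TOKEN"
```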

View File

@ -12,4 +12,4 @@ related:
source: /shared/influxdb3-plugins/plugins-library/official/system-metrics.md
---
<!-- //SOURCE - content/shared/influxdb3-plugins/plugins-library/official/system-metrics.md -->
<!-- //SOURCE - content/shared/influxdb3-plugins/plugins-library/official/system-metrics.md -->

View File

@ -0,0 +1,23 @@
---
title: Metrics
seotitle: InfluxDB 3 Enterprise metrics reference
description: >
Reference for Prometheus-format metrics exposed by InfluxDB 3 Enterprise,
including metric descriptions, types, and labels for monitoring and observability.
menu:
influxdb3_enterprise:
parent: Reference
weight: 106
influxdb3/enterprise/tags: [metrics, prometheus, monitoring, reference, observability, clustering]
related:
- /influxdb3/enterprise/admin/monitor-metrics/
- /influxdb3/enterprise/admin/clustering/
- /influxdb3/enterprise/reference/telemetry/
- /influxdb3/enterprise/reference/internals/runtime-architecture/
source: /shared/influxdb3-reference/metrics.md
---
<!--
The content of this file is located at
//SOURCE - content/shared/influxdb3-reference/metrics.md
-->

View File

@ -0,0 +1,823 @@
Use InfluxDB metrics to monitor {{% show-in "enterprise" %}}distributed cluster {{% /show-in %}}system performance, resource usage, and operational health
with monitoring tools like Prometheus, Grafana, or other observability platforms.
{{% show-in "core" %}}
- [Access metrics](#access-metrics)
- [Metric categories](#metric-categories)
- [Key metrics for monitoring](#key-metrics-for-monitoring)
- [Example monitoring queries](#example-monitoring-queries)
- [Integration with monitoring tools](#integration-with-monitoring-tools)
- [Best practices](#best-practices)
{{% /show-in %}}
{{% show-in "enterprise" %}}
- [Access metrics](#access-metrics)
- [Metric categories](#metric-categories)
- [Cluster-specific metrics](#cluster-specific-metrics)
- [Node-specific monitoring](#node-specific-monitoring)
- [Example monitoring queries](#example-monitoring-queries)
- [Distributed monitoring setup](#distributed-monitoring-setup)
- [Best practices](#best-practices)
{{% /show-in %}}
## Access metrics
An {{< product-name >}} node exposes metrics at the `/metrics` endpoint on the HTTP port (default: 8181).
{{% api-endpoint method="GET" endpoint="http://localhost:8181/metrics" api-ref="/influxdb3/version/api/v3/#operation/GetMetrics" %}}
{{% show-in "core" %}}
### View metrics
```bash { placeholders="AUTH_TOKEN" }
# View all metrics
curl -s http://{{< influxdb/host >}}/metrics
# View specific metric patterns
curl -s http://{{< influxdb/host >}}/metrics | grep 'http_requests_total'
curl -s http://{{< influxdb/host >}}/metrics | grep 'influxdb3_'
# View metrics with authentication (if required)
curl -s -H "Authorization: Token AUTH_TOKEN" http://node:8181/metrics
```
{{% /show-in %}}
{{% show-in "enterprise" %}}
### View metrics from specific nodes
> [!Note]
> {{< product-name >}} supports two token types for the `/metrics` endpoint:
> - {{% token-link "Admin" %}}: Full access to all metrics
> - {{% token-link "Fine-grained" "resource/" %}} with `system:metrics:read` permission: Read-only access to metrics
```bash { placeholders="AUTH_TOKEN" }
# View metrics from specific nodes
curl -s http://ingester-01:8181/metrics
curl -s http://query-01:8181/metrics
curl -s http://compactor-01:8181/metrics
# View metrics with authentication (if required)
curl -s -H "Authorization: Token AUTH_TOKEN" http://node:8181/metrics
```
{{% /show-in %}}
Replace {{% code-placeholder-key %}}`AUTH_TOKEN`{{% /code-placeholder-key %}} with your {{< product-name >}} {{% token-link %}} that has read access to the `/metrics` endpoint.
{{% show-in "enterprise" %}}
### Aggregate metrics across the cluster
```bash
# Get metrics from all nodes in cluster
for node in ingester-01 query-01 compactor-01; do
echo "=== Node: $node ==="
curl -s http://$node:8181/metrics | grep 'http_requests_total.*status="ok"'
done
```
{{% /show-in %}}
### Metrics format
InfluxDB exposes metrics in [Prometheus exposition format](https://prometheus.io/docs/instrumenting/exposition_formats/#text-based-format), a format supported by many tools, including [Telegraf](#collect-metrics-with-telegraf).
Each metric follows this structure:
```
# HELP metric_name Description of the metric
# TYPE metric_name counter|gauge|histogram
metric_name{label1="value1",label2="value2"} 42.0
```
### Node identification in metrics
{{< product-name >}} metrics don't include automatic node identification labels.
To identify which node produced each metric, you must configure your monitoring tool to add node labels during scraping.
{{% show-in "enterprise" %}}
For multi-node clusters, node identification is essential for troubleshooting and monitoring individual node performance.
Many monitoring tools support adding static or dynamic labels during the scrape process (Prometheus calls this "relabeling"). For example:
1. **Extract node hostname or IP** from the scrape target address
2. **Add extracted labels to metrics** during the scrape process
| Hostname | Node Identification | Recommended? |
| -------------------------- | ------------------- | ----------------- |
| `ingester-01`, `query-02` | Extract role and ID | Yes |
| `node-01.cluster.internal` | Extract ID | Consider adding role information |
| `192.168.1.10` | IP address only | No, consider renaming with ID and role |
For configuration examples, see [Add node identification with Prometheus](#add-node-identification-with-prometheus).
{{% /show-in %}}
## Metric categories
{{< product-name >}} exposes the following{{% show-in "enterprise" %}} base{{% /show-in %}} categories of metrics{{% show-in "enterprise" %}}, plus additional cluster-aware metrics{{% /show-in %}}:
{{% show-in "enterprise" %}}
> [!Note]
> #### Metrics reporting across node modes
> All nodes in an {{< product-name >}} cluster report the same set of metrics regardless of their configured [mode](/influxdb3/enterprise/reference/config-options/#mode) (ingest, query, compact, process, or all).
> The difference between nodes is in the metric _values_ and labels, which reflect the actual activity on each node.
> For example, an ingest-only node reports query-related metrics with minimal or zero values.
{{% /show-in %}}
### HTTP and gRPC metrics
Monitor API request patterns{{% show-in "enterprise" %}} across the cluster{{% /show-in %}}:
- **`http_requests_total`**: Total HTTP requests by method, path, and status{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`http_request_duration_seconds`**: HTTP request latency distribution{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`http_response_body_size_bytes`**: HTTP response size distribution
- **`grpc_requests_total`**: Total gRPC requests{{% show-in "enterprise" %}} for inter-node communication{{% /show-in %}}
- **`grpc_request_duration_seconds`**: gRPC request latency distribution
> [!Note]
> Monitor all write endpoints (`/api/v3/write_lp`, `/api/v2/write`, `/write`) and query endpoints (`/api/v3/query_sql`, `/api/v3/query_influxql`, `/query`) for comprehensive request tracking.
### Database operations
Monitor database{{% show-in "enterprise" %}}-specific and distributed cluster{{% /show-in %}} operations:
- **`influxdb3_catalog_operations_total`**: Catalog operations by type (create_database, create_admin_token, etc.){{% show-in "enterprise" %}} across the cluster{{% /show-in %}}
- **`influxdb3_catalog_operation_retries_total`**: Failed catalog operations that required retries{{% show-in "enterprise" %}} due to conflicts between nodes{{% /show-in %}}
{{% show-in "enterprise" %}}
### Node specialization metrics
The most relevant metrics depend on a node's [mode configuration](/influxdb3/version/admin/clustering/#configure-node-modes):
#### Ingest nodes (mode: ingest)
- **`http_requests_total{path=~"/api/v3/write_lp|/api/v2/write|/write"}`**: Write request volume (all endpoints)
- **`object_store_transfer_bytes_total`**: WAL-to-Parquet snapshot activity
- **`datafusion_mem_pool_bytes`**: Memory usage for snapshot operations
#### Query nodes (mode: query)
- **`influxdb_iox_query_log_*`**: Query execution performance
- **`influxdb3_parquet_cache_*`**: Cache performance for query acceleration
- **`http_requests_total{path=~"/api/v3/query_sql|/api/v3/query_influxql|/query"}`**: Query request patterns (all endpoints)
#### Compactor nodes (mode: compact)
- **`object_store_op_duration_seconds`**: Compaction operation performance
- **`object_store_transfer_*`**: File consolidation activity
#### Process nodes (mode: process)
- **`tokio_runtime_*`**: Plugin execution runtime metrics
- Custom plugin metrics (varies by installed plugins)
{{% /show-in %}}
### Memory and caching
Monitor memory usage{{% show-in "enterprise" %}} across specialized nodes{{% /show-in %}}:
- **`datafusion_mem_pool_bytes`**: DataFusion memory pool{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`influxdb3_parquet_cache_access_total`**: Parquet cache hits, misses, and fetch status{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **`influxdb3_parquet_cache_size_bytes`**: Current size of in-memory Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **`influxdb3_parquet_cache_size_number_of_files`**: Number of files in Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **`jemalloc_memstats_bytes`**: Memory allocation statistics{{% show-in "enterprise" %}} per node{{% /show-in %}}
### Query performance
Monitor{{% show-in "enterprise" %}} distributed{{% /show-in %}} query execution{{% show-in "enterprise" %}} and performance{{% /show-in %}}:
- **`influxdb_iox_query_log_*`**: Comprehensive query execution metrics including:
- `compute_duration_seconds`: CPU time spent on computation
- `execute_duration_seconds`: Total query execution time
- `plan_duration_seconds`: Time spent planning queries
- `end2end_duration_seconds`: Complete query duration from request to response
- `max_memory`: Peak memory usage per query
- `parquet_files`: Number of Parquet files accessed
- `partitions`: Number of partitions processed
{{% show-in "enterprise" %}}
- **`influxdb_iox_query_log_ingester_latency_*`**: Inter-node query coordination latency
- **`influxdb_iox_query_log_ingester_partition_count`**: Data distribution across nodes
- **`influxdb_iox_query_log_parquet_files`**: File access patterns per query
{{% /show-in %}}
### Object storage
Monitor{{% show-in "enterprise" %}} shared{{% /show-in %}} object store operations and performance:
- **`object_store_op_duration_seconds`**: Object store operation latency{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`object_store_transfer_bytes_total`**: Cumulative bytes transferred{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **`object_store_transfer_objects_total`**: Cumulative objects transferred{{% show-in "enterprise" %}} per node{{% /show-in %}}
### Runtime and system
Monitor runtime health and resource usage:
- **`process_start_time_seconds`**: Process start time
- **`thread_panic_count_total`**: Thread panic occurrences
- **`query_datafusion_query_execution_ooms_total`**: Out-of-memory events in query engine
- **`tokio_runtime_*`**: Async runtime metrics (task scheduling, worker threads, queue depths)
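For example, to spot-check runtime health (a sketch, using the default local port as in the examples above):
```bash
# Async runtime task and worker counts
curl -s http://localhost:8181/metrics | grep 'tokio_runtime_num'

# Thread panics and query out-of-memory events (normally zero)
curl -s http://localhost:8181/metrics | grep 'thread_panic_count_total\|query_datafusion_query_execution_ooms_total'
```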
{{% show-in "enterprise" %}}
## Cluster-specific metrics
### Load distribution
Monitor workload distribution across nodes:
```bash
# Write load across ingest nodes (all write endpoints)
for node in ingester-01 ingester-02; do
echo "Node $node:"
curl -s http://$node:8181/metrics | grep 'http_requests_total.*\(api/v3/write_lp\|api/v2/write\|/write\).*status="ok"'
done
# Query load across query nodes
for node in query-01 query-02; do
echo "Node $node:"
curl -s http://$node:8181/metrics | grep 'influxdb_iox_query_log_execute_duration_seconds_count'
done
```
## Node-specific monitoring
### Monitor ingest node health
Monitor data ingestion performance:
```bash
# Ingest throughput (all write endpoints)
curl -s http://ingester-01:8181/metrics | grep 'http_requests_total.*\(api/v3/write_lp\|api/v2/write\|/write\)'
# Snapshot creation activity
curl -s http://ingester-01:8181/metrics | grep 'object_store_transfer_bytes_total.*put'
# Memory pressure
curl -s http://ingester-01:8181/metrics | grep 'datafusion_mem_pool_bytes'
```
### Monitor query node performance
Monitor query execution:
```bash
# Query latency
curl -s http://query-01:8181/metrics | grep 'influxdb_iox_query_log_execute_duration_seconds'
# Cache effectiveness
curl -s http://query-01:8181/metrics | grep 'influxdb3_parquet_cache_access_total'
# Inter-node coordination time
curl -s http://query-01:8181/metrics | grep 'influxdb_iox_query_log_ingester_latency'
```
### Monitor compactor node activity
Monitor data optimization:
```bash
# Compaction operations
curl -s http://compactor-01:8181/metrics | grep 'object_store_op_duration_seconds.*put'
# File processing volume
curl -s http://compactor-01:8181/metrics | grep 'object_store_transfer_objects_total'
```
{{% /show-in %}}
{{% show-in "core" %}}
## Key metrics for monitoring
### Write throughput
Monitor data ingestion:
```bash
# HTTP requests to write endpoints (all endpoints)
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*\(api/v3/write_lp\|api/v2/write\|/write\)'
# Object store writes (Parquet file creation)
curl -s http://localhost:8181/metrics | grep 'object_store_transfer.*total.*put'
```
### Query performance
Monitor query execution:
```bash
# Query latency percentiles
curl -s http://localhost:8181/metrics | grep 'influxdb_iox_query_log_execute_duration_seconds'
# Query memory usage
curl -s http://localhost:8181/metrics | grep 'influxdb_iox_query_log_max_memory'
# Query errors and failures
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*status="server_error"'
```
### Resource utilization
Monitor system resources:
```bash
# Memory pool usage
curl -s http://localhost:8181/metrics | grep 'datafusion_mem_pool_bytes'
# Cache efficiency
curl -s http://localhost:8181/metrics | grep 'influxdb3_parquet_cache_access_total'
# Runtime task health
curl -s http://localhost:8181/metrics | grep 'tokio_runtime_num_alive_tasks'
```
### Error rates
Monitor system health:
```bash
# HTTP error rates
curl -s http://localhost:8181/metrics | grep 'http_requests_total.*status="client_error"\|http_requests_total.*status="server_error"'
# Thread panics
curl -s http://localhost:8181/metrics | grep 'thread_panic_count_total'
# Query OOMs
curl -s http://localhost:8181/metrics | grep 'query_datafusion_query_execution_ooms_total'
```
{{% /show-in %}}
## Example monitoring queries
### Prometheus queries{{% show-in "enterprise" %}} for clusters{{% /show-in %}}
Use these queries in Prometheus or Grafana dashboards:
{{% show-in "enterprise" %}}
#### Cluster-wide request rate
```promql
# Total requests per second across all nodes
sum(rate(http_requests_total[5m])) by (instance)
# Write requests per second by ingest node (all write endpoints)
sum(rate(http_requests_total{path=~"/api/v3/write_lp|/api/v2/write|/write"}[5m])) by (instance)
```
#### Query performance across nodes
```promql
# 95th percentile query latency by query node
histogram_quantile(0.95,
sum(rate(influxdb_iox_query_log_execute_duration_seconds_bucket[5m])) by (instance, le)
)
# Average inter-node coordination time
avg(rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_sum[5m]) /
rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_count[5m])) by (instance)
```
#### Load balancing effectiveness
```promql
# Request distribution balance (coefficient of variation)
stddev(sum(rate(http_requests_total[5m])) by (instance)) /
avg(sum(rate(http_requests_total[5m])) by (instance))
# Cache hit rate by query node
sum(rate(influxdb3_parquet_cache_access_total{status="cached"}[5m])) by (instance) /
sum(rate(influxdb3_parquet_cache_access_total[5m])) by (instance)
```
#### Cluster health indicators
```promql
# Node availability (any recent metrics)
up{job="influxdb3-enterprise"}
# Catalog operation conflicts
rate(influxdb3_catalog_operation_retries_total[5m])
# Cross-node error rates
sum(rate(http_requests_total{status=~"server_error|client_error"}[5m])) by (instance, status)
```
{{% /show-in %}}
{{% show-in "core" %}}
#### Request rate
```promql
# Requests per second
rate(http_requests_total[5m])
# Error rate percentage
rate(http_requests_total{status=~"client_error|server_error"}[5m]) / rate(http_requests_total[5m]) * 100
```
#### Query performance
```promql
# 95th percentile query latency
histogram_quantile(0.95, rate(influxdb_iox_query_log_execute_duration_seconds_bucket[5m]))
# Average query memory usage
rate(influxdb_iox_query_log_max_memory_sum[5m]) / rate(influxdb_iox_query_log_max_memory_count[5m])
```
#### Cache performance
```promql
# Cache hit rate
rate(influxdb3_parquet_cache_access_total{status="cached"}[5m]) / rate(influxdb3_parquet_cache_access_total[5m]) * 100
# Cache size in MB
influxdb3_parquet_cache_size_bytes / 1024 / 1024
```
#### Object store throughput
```promql
# Bytes per second to object store
rate(object_store_transfer_bytes_total[5m])
# Objects per second to object store
rate(object_store_transfer_objects_total[5m])
```
{{% /show-in %}}
{{% show-in "enterprise" %}}
## Distributed monitoring setup
### Collect metrics with Telegraf
Use Telegraf to collect metrics from all cluster nodes and store them in a separate {{< product-name >}} instance for centralized monitoring.
#### Configure Telegraf
Create a Telegraf configuration file (`telegraf.conf`) to scrape metrics from your cluster nodes:
```toml { placeholders="MONITORING_AUTH_TOKEN|INGESTER_AUTH_TOKEN|QUERY_AUTH_TOKEN|COMPACTOR_AUTH_TOKEN" }
# Telegraf configuration for InfluxDB 3 Enterprise monitoring
# Output to monitoring InfluxDB instance
[[outputs.influxdb_v2]]
urls = ["http://monitoring-influxdb:8181"]
token = "MONITORING_AUTH_TOKEN"
organization = ""
bucket = "monitoring"
# Scrape metrics from ingest nodes
[[inputs.prometheus]]
urls = [
"http://ingester-01:8181/metrics",
"http://ingester-02:8181/metrics"
]
metric_version = 2
# Authentication for metrics endpoint
[inputs.prometheus.headers]
Authorization = "Token INGESTER_AUTH_TOKEN"
[inputs.prometheus.tags]
cluster = "production"
# Scrape metrics from query nodes
[[inputs.prometheus]]
urls = [
"http://query-01:8181/metrics",
"http://query-02:8181/metrics"
]
metric_version = 2
[inputs.prometheus.headers]
Authorization = "Token QUERY_AUTH_TOKEN"
[inputs.prometheus.tags]
cluster = "production"
# Scrape metrics from compactor nodes
[[inputs.prometheus]]
urls = ["http://compactor-01:8181/metrics"]
metric_version = 2
[inputs.prometheus.headers]
Authorization = "Token COMPACTOR_AUTH_TOKEN"
[inputs.prometheus.tags]
cluster = "production"
# Extract node name and role from URL
[[processors.regex]]
namepass = ["*"]
[[processors.regex.tags]]
key = "url"
pattern = "^http://([^:]+):.*"
replacement = "${1}"
result_key = "node_name"
[[processors.regex.tags]]
key = "node_name"
pattern = "^(ingester|query|compactor|processor)-.*"
replacement = "${1}"
result_key = "node_role"
```
Replace the following:
- {{% code-placeholder-key %}}`MONITORING_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} for the monitoring InfluxDB instance
- {{% code-placeholder-key %}}`INGESTER_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} with `system:metrics:read` permission for the ingest nodes
- {{% code-placeholder-key %}}`QUERY_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} with `system:metrics:read` permission for the query nodes
- {{% code-placeholder-key %}}`COMPACTOR_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} with `system:metrics:read` permission for the compactor node
#### Start Telegraf
```bash
# Start Telegraf with the configuration
telegraf --config telegraf.conf
# Run as a service (systemd example)
sudo systemctl start telegraf
sudo systemctl enable telegraf
```
#### Query collected metrics
Query the monitoring database using SQL:
```sql
-- Request rate by node
SELECT
node_name,
node_role,
COUNT(*) as request_count
FROM http_requests_total
WHERE time >= now() - INTERVAL '5 minutes'
GROUP BY node_name, node_role
ORDER BY request_count DESC;
-- Query latency percentiles by node
SELECT
node_name,
APPROX_PERCENTILE_CONT(value, 0.95) as p95_latency_seconds
FROM http_request_duration_seconds
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY node_name;
```
<!--TODO - Add example Grafana dashboards
### Grafana dashboards
Create role-specific dashboards with the following suggested metrics for each dashboard:
#### Cluster Overview Dashboard
- Node status and availability
- Request rates across all nodes
- Error rates by node and operation type
- Resource utilization summary
#### Ingest Performance Dashboard
- Write throughput by ingest node
- Snapshot creation rates
- Memory usage and pressure
- WAL-to-Parquet conversion metrics
#### Query Performance Dashboard
- Query latency percentiles by query node
- Cache hit rates and efficiency
- Inter-node coordination times
- Memory usage during query execution
#### Operations Dashboard
- Compaction progress and performance
- Object store operation success rates
- Processing engine trigger rates
- System health indicators
-->
<!--TODO - Use the processing engine for alerting
### Alerting for clusters
Set up cluster-aware alerting rules:
```yaml
# Sample Prometheus alerting rules
groups:
- name: influxdb3_enterprise_cluster
rules:
- alert: NodeDown
expr: up{job="influxdb3-enterprise"} == 0
for: 1m
labels:
severity: critical
annotations:
summary: "InfluxDB 3 Enterprise node {{ $labels.instance }} is down"
- alert: HighCatalogConflicts
expr: rate(influxdb3_catalog_operation_retries_total[5m]) > 0.1
for: 5m
labels:
severity: warning
annotations:
summary: "High catalog operation conflicts in cluster"
- alert: UnbalancedLoad
expr: |
(
stddev(sum(rate(http_requests_total[5m])) by (instance)) /
avg(sum(rate(http_requests_total[5m])) by (instance))
) > 0.5
for: 10m
labels:
severity: info
annotations:
summary: "Unbalanced load distribution across cluster nodes"
- alert: SlowInterNodeCommunication
expr: |
avg(rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_sum[5m]) /
rate(influxdb_iox_query_log_ingester_latency_to_full_data_seconds_count[5m])) > 1.0
for: 5m
labels:
severity: warning
annotations:
summary: "Slow inter-node communication detected"
```
-->
### Add node identification with Prometheus
If using Prometheus instead of Telegraf, add node identification through _relabeling_:
```yaml
# prometheus.yml
scrape_configs:
- job_name: 'influxdb3-enterprise'
static_configs:
- targets:
- 'ingester-01:8181'
- 'query-01:8181'
relabel_configs:
# Extract node name from address
- source_labels: [__address__]
target_label: node_name
regex: '([^:]+):.*'
replacement: '${1}'
# Assign node role based on hostname pattern
- source_labels: [node_name]
target_label: node_role
regex: 'ingester-.*'
replacement: 'ingest'
- source_labels: [node_name]
target_label: node_role
regex: 'query-.*'
replacement: 'query'
- source_labels: [node_name]
target_label: node_role
regex: 'compactor-.*'
replacement: 'compact'
```
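With these labels in place, PromQL queries can aggregate and filter by node, for example:
```promql
# Request rate per node, grouped by role
sum(rate(http_requests_total[5m])) by (node_name, node_role)

# Write throughput on ingest nodes only (all write endpoints)
sum(rate(http_requests_total{node_role="ingest", path=~"/api/v3/write_lp|/api/v2/write|/write"}[5m])) by (node_name)
```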
{{% /show-in %}}
{{% show-in "core" %}}
## Integration with monitoring tools
### Collect metrics with Telegraf
Use Telegraf to collect metrics and store them in a separate {{< product-name >}} instance for monitoring.
#### Configure Telegraf
Create a Telegraf configuration file (`telegraf.conf`):
```toml { placeholders="MONITORING_AUTH_TOKEN|AUTH_TOKEN" }
# Telegraf configuration for InfluxDB 3 Core monitoring
# Output to monitoring InfluxDB instance
[[outputs.influxdb_v2]]
urls = ["http://monitoring-influxdb:8181"]
token = "MONITORING_AUTH_TOKEN"
organization = ""
bucket = "monitoring"
# Scrape metrics from InfluxDB 3 Core
[[inputs.prometheus]]
urls = ["http://localhost:8181/metrics"]
metric_version = 2
# Authentication for metrics endpoint (if required)
# [inputs.prometheus.headers]
# Authorization = "Token AUTH_TOKEN"
```
Replace the following:
- {{% code-placeholder-key %}}`MONITORING_AUTH_TOKEN`{{% /code-placeholder-key %}}: your {{% token-link %}} for the monitoring InfluxDB instance
- {{% code-placeholder-key %}}`AUTH_TOKEN`{{% /code-placeholder-key %}} (if uncommented): your {{% token-link %}} for accessing the `/metrics` endpoint
#### Start Telegraf
```bash
# Start Telegraf with the configuration
telegraf --config telegraf.conf
# Run as a service (systemd example)
sudo systemctl start telegraf
sudo systemctl enable telegraf
```
#### Query collected metrics
Query the monitoring database using SQL:
```sql
-- Request rate over time
SELECT
date_bin(INTERVAL '5 minutes', time) as time_bucket,
COUNT(*) as request_count
FROM http_requests_total
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY time_bucket
ORDER BY time_bucket DESC;
-- Error rate
SELECT
status,
COUNT(*) as error_count
FROM http_requests_total
WHERE time >= now() - INTERVAL '1 hour'
AND status IN ('client_error', 'server_error')
GROUP BY status;
```
<!--TODO - Add example Grafana dashboards
### Grafana dashboard
Create dashboards with key metrics:
1. **System Overview**: Request rates, error rates, memory usage
2. **Query Performance**: Query latency, throughput, memory per query
3. **Storage**: Object store operations, cache hit rates, file counts
4. **Runtime Health**: Task counts, worker utilization, panic rates
-->
<!--TODO - Use the processing engine for alerting
### Alerting rules
Set up alerts for critical conditions:
```yaml
# Prometheus alerting rules
groups:
- name: influxdb3_core
rules:
- alert: HighErrorRate
expr: rate(http_requests_total{status=~"server_error"}[5m]) > 0.1
for: 5m
labels:
severity: warning
annotations:
summary: "High error rate in InfluxDB 3 Core"
- alert: HighQueryLatency
expr: histogram_quantile(0.95, rate(influxdb_iox_query_log_execute_duration_seconds_bucket[5m])) > 10
for: 5m
labels:
severity: warning
annotations:
summary: "High query latency in InfluxDB 3 Core"
- alert: LowCacheHitRate
expr: rate(influxdb3_parquet_cache_access_total{status="cached"}[5m]) / rate(influxdb3_parquet_cache_access_total[5m]) < 0.5
for: 10m
labels:
severity: info
annotations:
summary: "Low cache hit rate in InfluxDB 3 Core"
```
-->
{{% /show-in %}}
### Extend monitoring with InfluxDB 3 plugins
Use {{< product-name >}} plugins to extend monitoring and alerting capabilities:
- [Notifier plugin](/influxdb3/version/plugins/library/official/notifier/): Send alerts to external systems based on custom logic.
- [Threshold deadman checks plugin](/influxdb3/version/plugins/library/official/threshold-deadman-checks/): Monitor metrics and trigger alerts when thresholds are breached.
- [System metrics plugin](/influxdb3/version/plugins/library/official/system-metrics/): Collect and visualize system-level metrics.
## Best practices
### General monitoring practices
1. **Monitor key metrics**: Focus on request rates, error rates, latency, and resource usage
2. **Set appropriate scrape intervals**: 15-30 seconds for most metrics
3. **Create meaningful alerts**: Alert on trends and thresholds that indicate real issues
4. **Use labels effectively**: Leverage metric labels for filtering and grouping
5. **Monitor long-term trends**: Track performance over time to identify patterns
6. **Correlate metrics**: Combine multiple metrics to understand system behavior
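For example, a minimal Prometheus scrape configuration that reflects these practices (a sketch; adjust the target, interval, and token file for your deployment):
```yaml
scrape_configs:
  - job_name: 'influxdb3'
    # 15-30 second intervals work well for most metrics
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:8181']
    # If authentication is enabled, supply a token with read access to /metrics
    authorization:
      type: Bearer
      credentials_file: /etc/prometheus/influxdb3-token
```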
{{% show-in "enterprise" %}}
### Cluster monitoring practices
1. **Monitor each node type differently**: Focus on write metrics for ingest nodes, query metrics for query nodes
2. **Track load distribution**: Ensure work is balanced across nodes of the same type
3. **Monitor inter-node coordination**: Watch for communication delays between nodes
4. **Set up node-specific alerts**: Different thresholds for different node roles
5. **Use node labels**: Tag metrics with node roles and purposes
6. **Monitor shared resources**: Object store performance affects all nodes
7. **Track catalog conflicts**: High retry rates indicate coordination issues
8. **Regularly review dashboards and alerts**: Adjust as cluster usage patterns evolve
{{% /show-in %}}

View File

@ -0,0 +1,602 @@
InfluxDB exposes operational metrics in [Prometheus format](#prometheus-format) at the `/metrics` endpoint{{% show-in "enterprise" %}} on each cluster node{{% /show-in %}}.
- [Access metrics](#access-metrics)
- [HTTP and gRPC metrics](#http-and-grpc-metrics)
- [Database operations](#database-operations)
- [Query performance](#query-performance)
- [Memory and caching](#memory-and-caching)
- [Object storage](#object-storage)
- [Runtime and system](#runtime-and-system)
{{% show-in "enterprise" %}}
- [Cluster-specific considerations](#cluster-specific-considerations)
{{% /show-in %}}
- [Prometheus format](#prometheus-format)
## Access metrics
{{% show-in "core" %}}
Metrics are available at `http://localhost:8181/metrics` by default.
```bash
curl -s http://localhost:8181/metrics
```
{{% /show-in %}}
{{% show-in "enterprise" %}}
Metrics are available at `http://NODE_HOST:8181/metrics` on each cluster node.
```bash
# Access metrics from specific nodes
curl -s http://ingester-01:8181/metrics
curl -s http://query-01:8181/metrics
curl -s http://compactor-01:8181/metrics
```
{{% /show-in %}}
## HTTP and gRPC metrics
### http_requests_total
- **Type:** Counter
- **Description:** Total number of HTTP requests processed{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `method`: HTTP method (GET, POST, etc.)
- `method_path`: Method and path combination
- `path`: Request path
- `status`: Response status (ok, client_error, server_error, aborted, unexpected_response)
{{% show-in "enterprise" %}}
**Cluster considerations:** Track per-node to monitor load distribution
{{% /show-in %}}
```
# Write endpoints
http_requests_total{method="POST",method_path="POST /api/v3/write_lp",path="/api/v3/write_lp",status="ok"} 1
http_requests_total{method="POST",method_path="POST /api/v2/write",path="/api/v2/write",status="ok"} 1
http_requests_total{method="POST",method_path="POST /write",path="/write",status="ok"} 1
# Query endpoints
http_requests_total{method="POST",method_path="POST /api/v3/query_sql",path="/api/v3/query_sql",status="ok"} 1
http_requests_total{method="POST",method_path="POST /api/v3/query_influxql",path="/api/v3/query_influxql",status="ok"} 1
http_requests_total{method="GET",method_path="GET /query",path="/query",status="ok"} 1
```
> [!Note]
> Monitor all write endpoints (`/api/v3/write_lp`, `/api/v2/write`, `/write`) and query endpoints (`/api/v3/query_sql`, `/api/v3/query_influxql`, `/query`) for comprehensive request tracking.
### http_request_duration_seconds
- **Type:** Histogram
- **Description:** Distribution of HTTP request latencies{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#http_requests_total"><code>http_requests_total</code></a>
{{% show-in "enterprise" %}}
**Cluster considerations:** Compare latencies across nodes to identify performance bottlenecks
{{% /show-in %}}
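As a Prometheus histogram, this metric exposes `_bucket`, `_sum`, and `_count` series, so you can compute latency percentiles, for example:
```promql
# 95th percentile request latency by path over the last 5 minutes
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le, path))
```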
### http_response_body_size_bytes
- **Type:** Histogram
- **Description:** Distribution of HTTP response body sizes{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#http_requests_total"><code>http_requests_total</code></a>
### grpc_requests_total
- **Type:** Counter
- **Description:** Total number of gRPC requests processed{{% show-in "enterprise" %}} (includes inter-node communication){{% /show-in %}}
- **Labels:**
- `path`: gRPC method path
- `status`: Response status
{{% show-in "enterprise" %}}
**Cluster considerations:** High gRPC volumes indicate active inter-node communication
{{% /show-in %}}
### grpc_request_duration_seconds
- **Type:** Histogram
- **Description:** Distribution of gRPC request latencies{{% show-in "enterprise" %}} (includes inter-node communication){{% /show-in %}}
- **Labels:** Same as <a href="#grpc_requests_total"><code>grpc_requests_total</code></a>
{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor for network latency between cluster nodes
{{% /show-in %}}
### grpc_response_body_size_bytes
- **Type:** Histogram
- **Description:** Distribution of gRPC response body sizes
- **Labels:** Same as <a href="#grpc_requests_total"><code>grpc_requests_total</code></a>
## Database operations
### influxdb3_catalog_operations_total
- **Type:** Counter
- **Description:** Total catalog operations by type{{% show-in "enterprise" %}} across the cluster{{% /show-in %}}
- **Labels:**
- `type`: Operation type (create_database, create_admin_token, register_node, etc.)
{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor for catalog coordination across nodes
{{% /show-in %}}
```
influxdb3_catalog_operations_total{type="create_database"} 5
influxdb3_catalog_operations_total{type="create_admin_token"} 2
{{% show-in "enterprise" %}}
influxdb3_catalog_operations_total{type="register_node"} 6
{{% /show-in %}}
```
### influxdb3_catalog_operation_retries_total
- **Type:** Counter
- **Description:** Catalog updates that had to be retried due to conflicts{{% show-in "enterprise" %}} between nodes{{% /show-in %}}
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** High retry rates indicate coordination issues or high contention
{{% /show-in %}}
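To detect sustained contention, track the retry rate over time, for example:
```promql
# Catalog operation retries per second (sustained non-zero values indicate contention)
rate(influxdb3_catalog_operation_retries_total[5m])
```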
## Query performance
### influxdb_iox_query_log_compute_duration_seconds
- **Type:** Histogram
- **Description:** CPU duration spent for query computation{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Compare compute times across query nodes
{{% /show-in %}}
### influxdb_iox_query_log_execute_duration_seconds
- **Type:** Histogram
- **Description:** Total time to execute queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Track query performance across different query nodes
{{% /show-in %}}
### influxdb_iox_query_log_plan_duration_seconds
- **Type:** Histogram
- **Description:** Time spent planning queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_end2end_duration_seconds
- **Type:** Histogram
- **Description:** Complete query duration from issue time to completion{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_permit_duration_seconds
- **Type:** Histogram
- **Description:** Time to acquire a semaphore permit for query execution{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_max_memory
- **Type:** Histogram
- **Description:** Peak memory allocated for processing queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_parquet_files
- **Type:** Histogram
- **Description:** Number of Parquet files processed by queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_partitions
- **Type:** Histogram
- **Description:** Number of partitions processed by queries{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_deduplicated_parquet_files
- **Type:** Histogram
- **Description:** Number of files held under a DeduplicateExec operator{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_deduplicated_partitions
- **Type:** Histogram
- **Description:** Number of partitions held under a DeduplicateExec operator{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### influxdb_iox_query_log_phase_current
- **Type:** Gauge
- **Description:** Number of queries currently in each execution phase{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `phase`: Query execution phase
### influxdb_iox_query_log_phase_entered_total
- **Type:** Counter
- **Description:** Total number of queries that entered each execution phase{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `phase`: Query execution phase
### influxdb_iox_query_log_ingester_latency_to_full_data_seconds
- **Type:** Histogram
- **Description:** Time from initial request until querier has all data from ingesters
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Measures inter-node coordination efficiency in distributed queries
{{% /show-in %}}
### influxdb_iox_query_log_ingester_latency_to_plan_seconds
- **Type:** Histogram
- **Description:** Time until querier can proceed with query planning
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Indicates how quickly query nodes can coordinate with ingest nodes
{{% /show-in %}}
### influxdb_iox_query_log_ingester_partition_count
- **Type:** Histogram
- **Description:** Number of ingester partitions involved in queries
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Shows data distribution across ingest nodes
{{% /show-in %}}
### influxdb_iox_query_log_ingester_response_rows
- **Type:** Histogram
- **Description:** Number of rows in ingester responses
- **Labels:** None
### influxdb_iox_query_log_ingester_response_size
- **Type:** Histogram
- **Description:** Size of ingester record batches in bytes
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor network traffic between query and ingest nodes
{{% /show-in %}}
### query_datafusion_query_execution_ooms_total
- **Type:** Counter
- **Description:** Number of out-of-memory errors encountered by the query engine{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Track OOM events across query nodes to identify resource constraints
{{% /show-in %}}
## Memory and caching
### datafusion_mem_pool_bytes
- **Type:** Gauge
- **Description:** Number of bytes within the DataFusion memory pool{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
{{% show-in "enterprise" %}}
**Cluster considerations:** Monitor memory usage across different node types
{{% /show-in %}}
### influxdb3_parquet_cache_access_total
- **Type:** Counter
- **Description:** Track accesses to the in-memory Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **Labels:**
- `status`: Access result (cached, miss, miss_while_fetching)
{{% show-in "enterprise" %}}
**Cluster considerations:** Compare cache effectiveness across query nodes
{{% /show-in %}}
```
influxdb3_parquet_cache_access_total{status="cached"} 1500
influxdb3_parquet_cache_access_total{status="miss"} 200
```
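A common derived value is the cache hit rate, for example:
```promql
# Cache hit rate over the last 5 minutes
sum(rate(influxdb3_parquet_cache_access_total{status="cached"}[5m])) /
sum(rate(influxdb3_parquet_cache_access_total[5m]))
```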
### influxdb3_parquet_cache_size_bytes
- **Type:** Gauge
- **Description:** Current size of in-memory Parquet cache in bytes{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **Labels:** None
### influxdb3_parquet_cache_size_number_of_files
- **Type:** Gauge
- **Description:** Number of files in the in-memory Parquet cache{{% show-in "enterprise" %}} per query node{{% /show-in %}}
- **Labels:** None
### jemalloc_memstats_bytes
- **Type:** Gauge
- **Description:** Memory allocation statistics from jemalloc{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `type`: Memory statistic type (active, allocated, mapped, etc.)
## Object storage
### object_store_op_duration_seconds
- **Type:** Histogram
- **Description:** Duration of object store operations{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `op`: Operation type (get, put, delete, list, etc.)
- `result`: Operation result (success, error)
{{% show-in "enterprise" %}}
**Cluster considerations:** All nodes access shared object store; monitor for hotspots
{{% /show-in %}}
### object_store_op_headers_seconds
- **Type:** Histogram
- **Description:** Time to response headers for object store operations{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#object_store_op_duration_seconds"><code>object_store_op_duration_seconds</code></a>
### object_store_op_ttfb_seconds
- **Type:** Histogram
- **Description:** Time to first byte for object store operations{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** Same as <a href="#object_store_op_duration_seconds"><code>object_store_op_duration_seconds</code></a>
### object_store_transfer_bytes_total
- **Type:** Counter
- **Description:** Cumulative bytes transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `op`: Operation type (get, put)
{{% show-in "enterprise" %}}
**Cluster considerations:** Ingest nodes show high 'put' activity; query nodes show high 'get' activity
{{% /show-in %}}
### object_store_transfer_bytes_hist
- **Type:** Histogram
- **Description:** Distribution of bytes transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `op`: Operation type (get, put)
### object_store_transfer_objects_total
- **Type:** Counter
- **Description:** Cumulative count of objects transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `op`: Operation type (get, put)
### object_store_transfer_objects_hist
- **Type:** Histogram
- **Description:** Distribution of objects transferred to/from object store{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `op`: Operation type (get, put)
## Runtime and system
### process_start_time_seconds
- **Type:** Gauge
- **Description:** Start time of the process since Unix epoch{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### thread_panic_count_total
- **Type:** Counter
- **Description:** Number of thread panics observed{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### iox_async_semaphore_acquire_duration_seconds
- **Type:** Histogram
- **Description:** Duration to acquire async semaphore permits{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### iox_async_semaphore_holders_acquired
- **Type:** Gauge
- **Description:** Number of currently acquired semaphore holders{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### iox_async_semaphore_holders_cancelled_while_pending_total
- **Type:** Counter
- **Description:** Number of pending semaphore holders cancelled while waiting{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### iox_async_semaphore_holders_pending
- **Type:** Gauge
- **Description:** Number of pending semaphore holders{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### iox_async_semaphore_permits_acquired
- **Type:** Gauge
- **Description:** Number of currently acquired permits{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### iox_async_semaphore_permits_cancelled_while_pending_total
- **Type:** Counter
- **Description:** Permits cancelled while waiting for semaphore{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### iox_async_semaphore_permits_pending
- **Type:** Gauge
- **Description:** Number of pending permits{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### iox_async_semaphore_permits_total
- **Type:** Gauge
- **Description:** Total number of permits in semaphore{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `semaphore`: Semaphore identifier
### tokio_runtime_num_alive_tasks
- **Type:** Gauge
- **Description:** Current number of alive tasks in the Tokio runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_blocking_queue_depth
- **Type:** Gauge
- **Description:** Number of tasks in the blocking thread pool queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_budget_forced_yield_count_total
- **Type:** Counter
- **Description:** Number of times tasks were forced to yield after exhausting budgets{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_global_queue_depth
- **Type:** Gauge
- **Description:** Number of tasks in the runtime's global queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_io_driver_ready_count_total
- **Type:** Counter
- **Description:** Number of ready events processed by the I/O driver{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_io_driver_fd_deregistered_count_total
- **Type:** Counter
- **Description:** Number of file descriptors deregistered by the I/O driver{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_io_driver_fd_registered_count_total
- **Type:** Counter
- **Description:** Number of file descriptors registered with the I/O driver{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_num_blocking_threads
- **Type:** Gauge
- **Description:** Number of additional threads spawned by the runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_num_idle_blocking_threads
- **Type:** Gauge
- **Description:** Number of idle threads spawned for spawn_blocking calls{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_num_workers
- **Type:** Gauge
- **Description:** Number of worker threads used by the runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_runtime_remote_schedule_count_total
- **Type:** Counter
- **Description:** Number of tasks scheduled from outside the runtime{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_worker_local_queue_depth
- **Type:** Gauge
- **Description:** Number of tasks in each worker's local queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_local_schedule_count_total
- **Type:** Counter
- **Description:** Tasks scheduled from within the runtime on local queues{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_mean_poll_time_seconds
- **Type:** Gauge
- **Description:** Exponentially weighted moving average of task poll duration{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_noop_count_total
- **Type:** Counter
- **Description:** Times worker threads unparked but performed no work{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_overflow_count_total
- **Type:** Counter
- **Description:** Times worker threads saturated their local queue{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_park_count_total
- **Type:** Counter
- **Description:** Total times worker threads have parked{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_poll_count_total
- **Type:** Counter
- **Description:** Number of tasks polled by worker threads{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_steal_count_total
- **Type:** Counter
- **Description:** Tasks stolen by worker threads from other workers{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_steal_operations_total
- **Type:** Counter
- **Description:** Number of steal operations performed by worker threads{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_worker_total_busy_duration_seconds_total
- **Type:** Counter
- **Description:** Time worker threads have been busy{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:**
- `worker`: Worker thread identifier
### tokio_watchdog_hangs_total
- **Type:** Counter
- **Description:** Number of hangs detected by the Tokio watchdog{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
### tokio_watchdog_response_time_seconds
- **Type:** Histogram
- **Description:** Response time of the Tokio watchdog task{{% show-in "enterprise" %}} per node{{% /show-in %}}
- **Labels:** None
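Like other histograms in this reference, this metric can be summarized with `histogram_quantile()`. For example (a sample query, not from the reference):

```
# Estimated 99th-percentile watchdog response time over 5 minutes
histogram_quantile(0.99, rate(tokio_watchdog_response_time_seconds_bucket[5m]))
```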
{{% show-in "enterprise" %}}
## Cluster-specific considerations
### Metrics reporting across node modes
All nodes in an InfluxDB 3 Enterprise cluster report the same set of metrics regardless of their configured [mode](/influxdb3/enterprise/reference/config-options/#mode) (ingest, query, compact, process, or all).
Metrics are not filtered based on node specialization.
The difference between nodes is in the metric _values_ and labels, which reflect the actual activity on each node.
For example:
- An ingest-only node reports query-related metrics, but with minimal or zero values
- A query-only node reports write-related metrics, but with minimal or zero values
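For example, the following sample query compares write-endpoint traffic across nodes; query-only nodes should report values at or near zero. It assumes the `node_name` label added by the relabeling described under Node identification below:

```
sum by (node_name) (rate(http_requests_total{path="/api/v3/write_lp"}[5m]))
```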
### Node identification
For information on enriching metrics with node identification using Telegraf or Prometheus relabeling, see [Node identification in Monitor metrics](/influxdb3/enterprise/admin/monitor-metrics/#node-identification).
### Key cluster metrics
Focus on these metrics for cluster health:
- **Load distribution**: `sum by (node_name) (rate(http_requests_total[5m]))`
- **Catalog conflicts**: `rate(influxdb3_catalog_operation_retries_total[5m])`
- **Inter-node latency**: `influxdb_iox_query_log_ingester_latency_to_full_data_seconds`
- **Node availability**: `up{job="influxdb3-enterprise"}`
### Performance by node type
Monitor different metrics based on [node specialization](/influxdb3/enterprise/admin/clustering/):
- **Ingest nodes or all-in-one nodes handling writes**:
- `http_requests_total{path=~"/api/v3/write_lp|/api/v2/write|/write"}` - Write operations via HTTP (all endpoints)
- `grpc_requests_total{path="/api/v3/write_lp"}` - Write operations via gRPC
- `grpc_request_duration_seconds{path="/api/v3/write_lp"}` - Write operation latency
- `object_store_transfer_bytes_total{op="put"}` - Data written to object storage
- **Query nodes or all-in-one nodes handling queries**:
- `http_requests_total{path=~"/api/v3/query_sql|/api/v3/query_influxql|/query"}` - Query requests (all endpoints)
- `influxdb_iox_query_log_execute_duration_seconds` - Query execution time
- `influxdb3_parquet_cache_access_total` - Parquet cache performance
- **All nodes (configuration and management)**:
- `http_requests_total{path="/api/v3/configure/database"}` - Database configuration operations
- `http_requests_total{path="/api/v3/configure/token/admin"}` - Token management operations
- `influxdb3_catalog_operations_total` - Catalog operations (create_database, create_admin_token, register_node)
- **Compactor nodes or all-in-one nodes handling compaction**:
- `object_store_op_duration_seconds{op="put"}` - Compaction write performance
- `object_store_transfer_objects_total` - Files processed during compaction
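As a sketch of combining these signals, the following sample query estimates 99th-percentile compaction write latency for compactor nodes. The `node_role="compactor"` value is hypothetical and depends on your relabeling configuration:

```
histogram_quantile(0.99, sum by (le) (rate(object_store_op_duration_seconds_bucket{op="put", node_role="compactor"}[5m])))
```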
{{% /show-in %}}
## Prometheus format
InfluxDB exposes metrics in the text-based Prometheus exposition format, which includes metric names, labels, and values. Each metric follows this structure:
```
metric_name{label1="value1",label2="value2"} metric_value timestamp
```
**Key characteristics:**
- **Metric names**: Use underscores and describe what is being measured
- **Labels**: Key-value pairs in curly braces that add dimensionality
- **Values**: Numeric measurements (integers or floats)
- **Timestamps**: Optional Unix timestamps in milliseconds (usually omitted, in which case Prometheus uses the scrape time)
**Metric types:**
- **Counter**: Cumulative values that only increase (for example, `http_requests_total`)
- **Gauge**: Values that can go up and down (for example, `tokio_runtime_num_alive_tasks`)
- **Histogram**: Samples observations and counts them in configurable buckets (for example, `http_request_duration_seconds`)
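For example, a histogram is exposed as cumulative `_bucket` series plus `_sum` and `_count` series (an illustrative snippet; actual label sets vary):

```
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{method="GET",le="0.025"} 4
http_request_duration_seconds_bucket{method="GET",le="+Inf"} 5
http_request_duration_seconds_sum{method="GET"} 0.081
http_request_duration_seconds_count{method="GET"} 5
```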
For complete specification details, see the [Prometheus exposition format documentation](https://prometheus.io/docs/instrumenting/exposition_formats/).

dist/influxdb-version-detector.d.ts vendored Normal file
@ -0,0 +1,187 @@
/**
* InfluxDB Version Detector Component
*
* Helps users identify which InfluxDB product they're using through a
* guided questionnaire with URL detection and scoring-based recommendations.
*
* DECISION TREE LOGIC (from .context/drafts/influxdb-version-detector/influxdb-decision-tree.md):
*
* ## Primary Detection Flow
*
* START: User enters URL
* |
* URL matches known cloud patterns?
*
* YES: Contains "influxdb.io" **InfluxDB Cloud Dedicated**
* YES: Contains "cloud2.influxdata.com" regions **InfluxDB Cloud Serverless**
* YES: Contains "influxcloud.net" **InfluxDB Cloud 1**
* YES: Contains other cloud2 regions **InfluxDB Cloud (TSM)**
*
* NO: Check port and try /ping endpoint
*
* Port 8181 detected? -> Strong indicator of v3 (Core/Enterprise)
* | Returns 200 (auth successful or disabled)?
* | --> `x-influxdb-build: Enterprise` -> **InfluxDB 3 Enterprise** (definitive)
* | --> `x-influxdb-build: Core` -> **InfluxDB 3 Core** (definitive)
*
* Returns 401 Unauthorized (default - auth required)?
*
* Ask "Paid or Free?"
* Paid -> **InfluxDB 3 Enterprise** (definitive)
* Free -> **InfluxDB 3 Core** (definitive)
* |
* Port 8086 detected? -> Strong indicator of legacy (OSS/Enterprise)
* NOTE: v1.x ping auth optional (ping-auth-enabled), v2.x always open
*
* Returns 401 Unauthorized?
* Could be v1.x with ping-auth-enabled=true OR Enterprise
*
* Ask "Paid or Free?" Show ranked results
*
* Returns 200/204 (accessible)?
* Likely v2.x OSS (always open) or v1.x with ping-auth-enabled=false
*
* Continue to questionnaire
*
* Blocked/Can't detect?
*
* Start questionnaire
*
* ## Questionnaire Flow (No URL or after detection)
*
* Q1: Which type of license do you have?
* Paid/Commercial License
* Free/Open Source (including free cloud tiers)
* I'm not sure
*
* Q2: Is your InfluxDB hosted by InfluxData (cloud) or self-hosted?
* Cloud service (hosted by InfluxData)
* Self-hosted (on your own servers)
* I'm not sure
*
* Q3: How long has your server been in place?
* Recently installed (less than 1 year)
* 1-5 years
* More than 5 years
* I'm not sure
*
* Q4: Which query language(s) do you use?
* SQL
* InfluxQL
* Flux
* Multiple languages
* I'm not sure
*
* ## Definitive Determinations (Stop immediately, no more questions)
*
* 1. **401 + Port 8181 + Paid** -> InfluxDB 3 Enterprise
* 2. **401 + Port 8181 + Free** -> InfluxDB 3 Core
* 3. **URL matches cloud pattern** -> Specific cloud product
* 4. **x-influxdb-build header** -> Definitive product identification
*
* ## Scoring System (When not definitive)
*
* ### Elimination Rules
* - **Free + Self-hosted** -> Eliminates all cloud products
* - **Free** -> Eliminates: 3 Enterprise, Enterprise, Clustered, Cloud Dedicated, Cloud 1
* - **Paid + Self-hosted** -> Eliminates all cloud products
* - **Paid + Cloud** -> Eliminates all self-hosted products
* - **Free + Cloud** -> Eliminates all self-hosted products, favors Serverless/TSM
*
* ### Strong Signals (High points)
* - **401 Response**: +50 for v3 products, +30 for Clustered
* - **Port 8181**: +30 for v3 products
* - **Port 8086**: +20 for legacy products
* - **SQL Language**: +40 for v3 products, eliminates v1/v2
* - **Flux Language**: +30 for v2 era, eliminates v1 and v3
* - **Server Age 5+ years**: +30 for v1 products, -50 for v3
*
* ### Ranking Display Rules
* - Only show "Most Likely" if:
* - Top score > 30 (not low confidence)
* - AND difference between #1 and #2 is >= 15 points
* - Show manual verification commands only if:
* - Confidence is not high (score < 60)
* - AND it's a self-hosted product
* - AND user didn't say it's cloud
*/
interface ComponentOptions {
component: HTMLElement;
}
declare global {
interface Window {
gtag?: (_event: string, _action: string, _parameters?: Record<string, unknown>) => void;
}
}
declare class InfluxDBVersionDetector {
private container;
private products;
private influxdbUrls;
private answers;
private initialized;
private questionFlow;
private currentQuestionIndex;
private questionHistory;
private progressBar;
private resultDiv;
private restartBtn;
private currentContext;
constructor(options: ComponentOptions);
private parseComponentData;
private init;
private setupPlaceholders;
private setupPingHeadersPlaceholder;
private setupDockerOutputPlaceholder;
private getCurrentPageSection;
private trackAnalyticsEvent;
private initializeForModal;
private getBasicUrlSuggestion;
private getProductDisplayName;
private generateConfigurationGuidance;
private getHostExample;
private usesDatabaseTerminology;
private getAuthenticationInfo;
private detectEnterpriseFeatures;
private analyzeUrlPatterns;
private render;
private attachEventListeners;
private updateProgress;
private showQuestion;
private enhanceUrlInputWithSuggestions;
private getCurrentProduct;
private handleUrlKnown;
private goBack;
private detectByUrl;
private detectContext;
private detectPortFromUrl;
private startQuestionnaire;
private startQuestionnaireWithCloudContext;
private answerQuestion;
private handleAuthorizationHelp;
private showRankedResults;
/**
* Gets the Grafana documentation link for a given product
*/
private getGrafanaLink;
/**
* Generates a unified product result block with characteristics and Grafana link
*/
private generateProductResult;
/**
* Maps simple product keys (used in URL detection) to full product names (used in scoring)
*/
private mapProductKeyToFullName;
private applyScoring;
private displayRankedResults;
private analyzePingHeaders;
private showResult;
private analyzeDockerOutput;
private showPingTestSuggestion;
private showOSSVersionCheckSuggestion;
private showMultipleCandidatesSuggestion;
private showDetectedVersion;
private restart;
}
export default function initInfluxDBVersionDetector(options: ComponentOptions): InfluxDBVersionDetector;
export {};
//# sourceMappingURL=influxdb-version-detector.d.ts.map

dist/influxdb-version-detector.d.ts.map vendored Normal file

@ -0,0 +1 @@
{"version":3,"file":"influxdb-version-detector.d.ts","sourceRoot":"","sources":["../assets/js/influxdb-version-detector.ts"],"names":[],"mappings":"AAAA;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;GA0GG;AAuCH,UAAU,gBAAgB;IACxB,SAAS,EAAE,WAAW,CAAC;CACxB;AAaD,OAAO,CAAC,MAAM,CAAC;IACb,UAAU,MAAM;QACd,IAAI,CAAC,EAAE,CACL,MAAM,EAAE,MAAM,EACd,OAAO,EAAE,MAAM,EACf,WAAW,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,OAAO,CAAC,KAClC,IAAI,CAAC;KACX;CACF;AAED,cAAM,uBAAuB;IAC3B,OAAO,CAAC,SAAS,CAAc;IAC/B,OAAO,CAAC,QAAQ,CAAW;IAC3B,OAAO,CAAC,YAAY,CAA0B;IAC9C,OAAO,CAAC,OAAO,CAAe;IAC9B,OAAO,CAAC,WAAW,CAAkB;IACrC,OAAO,CAAC,YAAY,CAAgB;IACpC,OAAO,CAAC,oBAAoB,CAAK;IACjC,OAAO,CAAC,eAAe,CAAgB;IACvC,OAAO,CAAC,WAAW,CAA4B;IAC/C,OAAO,CAAC,SAAS,CAA4B;IAC7C,OAAO,CAAC,UAAU,CAA4B;IAC9C,OAAO,CAAC,cAAc,CAA+C;gBAEzD,OAAO,EAAE,gBAAgB;IAoBrC,OAAO,CAAC,kBAAkB;IAqC1B,OAAO,CAAC,IAAI;IAcZ,OAAO,CAAC,iBAAiB;IAKzB,OAAO,CAAC,2BAA2B;IAoCnC,OAAO,CAAC,4BAA4B;IA+BpC,OAAO,CAAC,qBAAqB;IAe7B,OAAO,CAAC,mBAAmB;IAsG3B,OAAO,CAAC,kBAAkB;IAqC1B,OAAO,CAAC,qBAAqB;IAK7B,OAAO,CAAC,qBAAqB;IAgC7B,OAAO,CAAC,6BAA6B;IA4FrC,OAAO,CAAC,cAAc;IAgCtB,OAAO,CAAC,uBAAuB;IAU/B,OAAO,CAAC,qBAAqB;IA2B7B,OAAO,CAAC,wBAAwB;IAehC,OAAO,CAAC,kBAAkB;IAmJ1B,OAAO,CAAC,MAAM;IAsNd,OAAO,CAAC,oBAAoB;IA8G5B,OAAO,CAAC,cAAc;IAQtB,OAAO,CAAC,YAAY;IAsBpB,OAAO,CAAC,8BAA8B;IAyCtC,OAAO,CAAC,iBAAiB;IAMzB,OAAO,CAAC,cAAc;IAuBtB,OAAO,CAAC,MAAM;YA2BA,WAAW;IA6DzB,OAAO,CAAC,aAAa;IAgBrB,OAAO,CAAC,iBAAiB;IAgBzB,OAAO,CAAC,kBAAkB;IAa1B,OAAO,CAAC,kCAAkC;IAU1C,OAAO,CAAC,cAAc;IA0BtB,OAAO,CAAC,uBAAuB;IAqG/B,OAAO,CAAC,iBAAiB;IAoCzB;;OAEG;IACH,OAAO,CAAC,cAAc;IAmBtB;;OAEG;IACH,OAAO,CAAC,qBAAqB;IAgE7B;;OAEG;IACH,OAAO,CAAC,uBAAuB;IAiB/B,OAAO,CAAC,YAAY;IAmKpB,OAAO,CAAC,oBAAoB;IAiJ5B,OAAO,CAAC,kBAAkB;IAmG1B,OAAO,CAAC,UAAU;IAUlB,OAAO,CAAC,mBAAmB;IA2F3B,OAAO,CAAC,sBAAsB;IA4D9B,OAAO,CAAC,6BAA6B;IAiDrC,OAAO,CAAC,gCAAgC;IA2CxC,OAAO,CAAC,mBAAmB;IAgB3B,OAAO,CAAC,OAAO;CA2ChB;AAGD,MAAM,CAAC,OAAO,UAAU,2BAA2B,CACjD,OAAO,EAAE,gBAAgB,GACxB,uBAAuB,CAEzB"}

dist/influxdb-version-detector.js vendored Normal file
File diff suppressed because it is too large.

dist/influxdb-version-detector.js.map vendored Normal file
File diff suppressed because one or more lines are too long.

dist/services/influxdb-urls.d.ts vendored Normal file

@ -0,0 +1,2 @@
export const influxdbUrls: any;
//# sourceMappingURL=influxdb-urls.d.ts.map

dist/services/influxdb-urls.d.ts.map vendored Normal file

@ -0,0 +1 @@
{"version":3,"file":"influxdb-urls.d.ts","sourceRoot":"","sources":["../../assets/js/services/influxdb-urls.js"],"names":[],"mappings":"AAEA,+BAAoD"}

dist/services/influxdb-urls.js vendored Normal file

@ -0,0 +1,3 @@
import { influxdb_urls as influxdbUrlsParam } from '@params';
export const influxdbUrls = influxdbUrlsParam || {};
//# sourceMappingURL=influxdb-urls.js.map

dist/services/influxdb-urls.js.map vendored Normal file

@ -0,0 +1 @@
{"version":3,"file":"influxdb-urls.js","sourceRoot":"","sources":["../../assets/js/services/influxdb-urls.js"],"names":[],"mappings":"AAAA,OAAO,EAAE,aAAa,IAAI,iBAAiB,EAAE,MAAM,SAAS,CAAC;AAE7D,MAAM,CAAC,MAAM,YAAY,GAAG,iBAAiB,IAAI,EAAE,CAAC"}

dist/services/local-storage.d.ts vendored Normal file

@ -0,0 +1,30 @@
export namespace DEFAULT_STORAGE_URLS {
let oss: any;
let cloud: any;
let serverless: any;
let core: any;
let enterprise: any;
let dedicated: any;
let clustered: any;
let prev_oss: any;
let prev_cloud: any;
let prev_core: any;
let prev_enterprise: any;
let prev_serverless: any;
let prev_dedicated: any;
let prev_clustered: any;
let custom: string;
}
export const defaultUrls: {};
export function initializeStorageItem(storageKey: any, defaultValue: any): void;
export function getPreference(prefName: any): any;
export function setPreference(prefID: any, prefValue: any): void;
export function getPreferences(): any;
export function getInfluxDBUrls(): any;
export function getInfluxDBUrl(product: any): any;
export function setInfluxDBUrls(updatedUrlsObj: any): void;
export function removeInfluxDBUrl(product: any): void;
export function getNotifications(): any;
export function notificationIsRead(notificationID: any, notificationType: any): any;
export function setNotificationAsRead(notificationID: any, notificationType: any): void;
//# sourceMappingURL=local-storage.d.ts.map

dist/services/local-storage.d.ts.map vendored Normal file

@ -0,0 +1 @@
{"version":3,"file":"local-storage.d.ts","sourceRoot":"","sources":["../../assets/js/services/local-storage.js"],"names":[],"mappings":";;;;;;;;;;;;;;;;;AAqFA,6BAAuB;AAhEvB,gFAOC;AAwBD,kDAYC;AAGD,iEAOC;AAGD,sCAEC;AAkCD,uCAOC;AAGD,kDAYC;AAOD,2DAOC;AAGD,sDAOC;AAgBD,wCAeC;AAYD,oFAKC;AAWD,wFAWC"}

dist/services/local-storage.js vendored Normal file

@ -0,0 +1,187 @@
/*
This represents an API for managing user and client-side settings for the
InfluxData documentation. It uses the local browser storage.
These functions manage the following InfluxDB settings:
- influxdata_docs_preferences: Docs UI/UX-related preferences (obj)
- influxdata_docs_urls: User-defined InfluxDB URLs for each product (obj)
- influxdata_docs_notifications:
- messages: Messages (data/notifications.yaml) that have been seen (array)
- callouts: Feature callouts that have been seen (array)
*/
import { influxdbUrls } from './influxdb-urls.js';
// Prefix for all InfluxData docs local storage
const storagePrefix = 'influxdata_docs_';
/*
Initialize data in local storage with a default value.
*/
function initializeStorageItem(storageKey, defaultValue) {
const fullStorageKey = storagePrefix + storageKey;
// Check if the data exists before initializing the data
if (localStorage.getItem(fullStorageKey) === null) {
localStorage.setItem(fullStorageKey, defaultValue);
}
}
/*
////////////////////////////////////////////////////////////////////////////////
////////////////////////// INFLUXDATA DOCS PREFERENCES /////////////////////////
////////////////////////////////////////////////////////////////////////////////
*/
const prefStorageKey = storagePrefix + 'preferences';
// Default preferences
const defaultPrefObj = {
api_lib: null,
influxdb_url: 'cloud',
sidebar_state: 'open',
theme: 'light',
sample_get_started_date: null,
v3_wayfinding_show: true,
};
/*
Retrieve a preference from the preference key.
If the key doesn't exist, initialize it with default values.
*/
function getPreference(prefName) {
// Initialize preference data if it doesn't already exist
if (localStorage.getItem(prefStorageKey) === null) {
initializeStorageItem('preferences', JSON.stringify(defaultPrefObj));
}
// Retrieve and parse preferences as JSON
const prefString = localStorage.getItem(prefStorageKey);
const prefObj = JSON.parse(prefString);
// Return the value of the specified preference
return prefObj[prefName];
}
// Set a preference in the preferences key
function setPreference(prefID, prefValue) {
const prefString = localStorage.getItem(prefStorageKey);
const prefObj = JSON.parse(prefString);
prefObj[prefID] = prefValue;
localStorage.setItem(prefStorageKey, JSON.stringify(prefObj));
}
// Return an object containing all preferences
function getPreferences() {
return JSON.parse(localStorage.getItem(prefStorageKey));
}
////////////////////////////////////////////////////////////////////////////////
//////////// MANAGE INFLUXDATA DOCS URLS IN LOCAL STORAGE //////////////////////
////////////////////////////////////////////////////////////////////////////////
const defaultUrls = {};
Object.entries(influxdbUrls).forEach(([product, { providers }]) => {
defaultUrls[product] =
providers.filter((provider) => provider.name === 'Default')[0]?.regions[0]
?.url || 'https://cloud2.influxdata.com';
});
export const DEFAULT_STORAGE_URLS = {
oss: defaultUrls.oss,
cloud: defaultUrls.cloud,
serverless: defaultUrls.serverless,
core: defaultUrls.core,
enterprise: defaultUrls.enterprise,
dedicated: defaultUrls.cloud_dedicated,
clustered: defaultUrls.clustered,
prev_oss: defaultUrls.oss,
prev_cloud: defaultUrls.cloud,
prev_core: defaultUrls.core,
prev_enterprise: defaultUrls.enterprise,
prev_serverless: defaultUrls.serverless,
prev_dedicated: defaultUrls.cloud_dedicated,
prev_clustered: defaultUrls.clustered,
custom: '',
};
const urlStorageKey = storagePrefix + 'urls';
// Return an object that contains all InfluxDB urls stored in the urls key
function getInfluxDBUrls() {
// Initialize urls data if it doesn't already exist
if (localStorage.getItem(urlStorageKey) === null) {
initializeStorageItem('urls', JSON.stringify(DEFAULT_STORAGE_URLS));
}
return JSON.parse(localStorage.getItem(urlStorageKey));
}
// Get the current or previous URL for a specific product or a custom url
function getInfluxDBUrl(product) {
// Initialize urls data if it doesn't already exist
if (localStorage.getItem(urlStorageKey) === null) {
initializeStorageItem('urls', JSON.stringify(DEFAULT_STORAGE_URLS));
}
// Retrieve and parse the URLs as JSON
const urlsString = localStorage.getItem(urlStorageKey);
const urlsObj = JSON.parse(urlsString);
// Return the URL of the specified product
return urlsObj[product];
}
/*
Set multiple product URLs in the urls key.
Input should be an object where the key is the product and the value is the
URL to set for that product.
*/
function setInfluxDBUrls(updatedUrlsObj) {
const urlsString = localStorage.getItem(urlStorageKey);
const urlsObj = JSON.parse(urlsString);
const newUrlsObj = { ...urlsObj, ...updatedUrlsObj };
localStorage.setItem(urlStorageKey, JSON.stringify(newUrlsObj));
}
// Set an InfluxDB URL to an empty string in the urls key
function removeInfluxDBUrl(product) {
const urlsString = localStorage.getItem(urlStorageKey);
const urlsObj = JSON.parse(urlsString);
urlsObj[product] = '';
localStorage.setItem(urlStorageKey, JSON.stringify(urlsObj));
}
/*
////////////////////////////////////////////////////////////////////////////////
///////////////////////// INFLUXDATA DOCS NOTIFICATIONS ////////////////////////
////////////////////////////////////////////////////////////////////////////////
*/
const notificationStorageKey = storagePrefix + 'notifications';
// Default notifications
const defaultNotificationsObj = {
messages: [],
callouts: [],
};
function getNotifications() {
// Initialize notifications data if it doesn't already exist
if (localStorage.getItem(notificationStorageKey) === null) {
initializeStorageItem('notifications', JSON.stringify(defaultNotificationsObj));
}
// Retrieve and parse the notifications data as JSON
const notificationString = localStorage.getItem(notificationStorageKey);
const notificationObj = JSON.parse(notificationString);
// Return the notifications object
return notificationObj;
}
/*
Checks if a notification is read. Provide the notification ID and one of the
following notification types:
- message
- callout
If the notification ID exists in the array assigned to the specified type, the
notification has been read.
*/
function notificationIsRead(notificationID, notificationType) {
const notificationsObj = getNotifications();
const readNotifications = notificationsObj[`${notificationType}s`];
return readNotifications.includes(notificationID);
}
/*
Sets a notification as read. Provide the notification ID and one of the
following notification types:
- message
- callout
The notification ID is added to the array assigned to the specified type.
*/
function setNotificationAsRead(notificationID, notificationType) {
const notificationsObj = getNotifications();
const readNotifications = notificationsObj[`${notificationType}s`];
readNotifications.push(notificationID);
notificationsObj[notificationType + 's'] = readNotifications;
localStorage.setItem(notificationStorageKey, JSON.stringify(notificationsObj));
}
// Export functions as a module and make the file backwards compatible for non-module environments until all remaining dependent scripts are ported to modules
export { defaultUrls, initializeStorageItem, getPreference, setPreference, getPreferences, getInfluxDBUrls, getInfluxDBUrl, setInfluxDBUrls, removeInfluxDBUrl, getNotifications, notificationIsRead, setNotificationAsRead, };
//# sourceMappingURL=local-storage.js.map

dist/services/local-storage.js.map vendored Normal file

@ -0,0 +1 @@
{"version":3,"file":"local-storage.js","sourceRoot":"","sources":["../../assets/js/services/local-storage.js"],"names":[],"mappings":"AAAA;;;;;;;;;;;EAWE;AAEF,OAAO,EAAE,YAAY,EAAE,MAAM,oBAAoB,CAAC;AAElD,+CAA+C;AAC/C,MAAM,aAAa,GAAG,kBAAkB,CAAC;AAEzC;;EAEE;AACF,SAAS,qBAAqB,CAAC,UAAU,EAAE,YAAY;IACrD,MAAM,cAAc,GAAG,aAAa,GAAG,UAAU,CAAC;IAElD,wDAAwD;IACxD,IAAI,YAAY,CAAC,OAAO,CAAC,cAAc,CAAC,KAAK,IAAI,EAAE,CAAC;QAClD,YAAY,CAAC,OAAO,CAAC,cAAc,EAAE,YAAY,CAAC,CAAC;IACrD,CAAC;AACH,CAAC;AAED;;;;EAIE;AAEF,MAAM,cAAc,GAAG,aAAa,GAAG,aAAa,CAAC;AAErD,sBAAsB;AACtB,MAAM,cAAc,GAAG;IACrB,OAAO,EAAE,IAAI;IACb,YAAY,EAAE,OAAO;IACrB,aAAa,EAAE,MAAM;IACrB,KAAK,EAAE,OAAO;IACd,uBAAuB,EAAE,IAAI;IAC7B,kBAAkB,EAAE,IAAI;CACzB,CAAC;AAEF;;;EAGE;AACF,SAAS,aAAa,CAAC,QAAQ;IAC7B,yDAAyD;IACzD,IAAI,YAAY,CAAC,OAAO,CAAC,cAAc,CAAC,KAAK,IAAI,EAAE,CAAC;QAClD,qBAAqB,CAAC,aAAa,EAAE,IAAI,CAAC,SAAS,CAAC,cAAc,CAAC,CAAC,CAAC;IACvE,CAAC;IAED,yCAAyC;IACzC,MAAM,UAAU,GAAG,YAAY,CAAC,OAAO,CAAC,cAAc,CAAC,CAAC;IACxD,MAAM,OAAO,GAAG,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,CAAC;IAEvC,+CAA+C;IAC/C,OAAO,OAAO,CAAC,QAAQ,CAAC,CAAC;AAC3B,CAAC;AAED,0CAA0C;AAC1C,SAAS,aAAa,CAAC,MAAM,EAAE,SAAS;IACtC,MAAM,UAAU,GAAG,YAAY,CAAC,OAAO,CAAC,cAAc,CAAC,CAAC;IACxD,MAAM,OAAO,GAAG,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,CAAC;IAEvC,OAAO,CAAC,MAAM,CAAC,GAAG,SAAS,CAAC;IAE5B,YAAY,CAAC,OAAO,CAAC,cAAc,EAAE,IAAI,CAAC,SAAS,CAAC,OAAO,CAAC,CAAC,CAAC;AAChE,CAAC;AAED,8CAA8C;AAC9C,SAAS,cAAc;IACrB,OAAO,IAAI,CAAC,KAAK,CAAC,YAAY,CAAC,OAAO,CAAC,cAAc,CAAC,CAAC,CAAC;AAC1D,CAAC;AAED,gFAAgF;AAChF,gFAAgF;AAChF,gFAAgF;AAEhF,MAAM,WAAW,GAAG,EAAE,CAAC;AACvB,MAAM,CAAC,OAAO,CAAC,YAAY,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,OAAO,EAAE,EAAE,SAAS,EAAE,CAAC,EAAE,EAAE;IAChE,WAAW,CAAC,OAAO,CAAC;QAClB,SAAS,CAAC,MAAM,CAAC,CAAC,QAAQ,EAAE,EAAE,CAAC,QAAQ,CAAC,IAAI,KAAK,SAAS,CAAC,CAAC,CAAC,CAAC,EAAE,OAAO,CAAC,CAAC,CAAC;YACxE,EAAE,GAAG,IAAI,+BAA+B,CAAC;AAC/C,CAAC,CAAC,CAAC;AAEH,MAAM,CAAC,MAAM,oBAAoB,GAAG;IAClC,GAAG,EAAE,WAAW,CAAC,GAAG;IACpB,KAAK,EAAE,WAAW,CAAC,KAAK;IACxB,UAAU,EAAE,WAAW,CAAC,UAAU;IAClC,IAAI,EAAE,WAAW,CAAC,IAAI;IACtB,UAAU,EAAE,WAAW,CAAC,UAAU;IAClC,SAAS,EAAE,WAAW,CAAC,eAAe;IACtC,SAAS,EAAE,WAAW,CAAC,SAAS;IAChC,QAAQ,EAAE,WAAW,CAAC,GAAG;IACzB,UAAU,EAAE,WAAW,CAAC,KAAK;IAC7B,SAAS,EAAE,WAAW,CAAC,IAAI;IAC3B,eAAe,EAAE,WAAW,CAAC,UAAU;IACvC,eAAe,EAAE,WAAW,CAAC,UAAU;IACvC,cAAc,EAAE,WAAW,CAAC,eAAe;IAC3C,cAAc,EAAE,WAAW,CAAC,SAAS;IACrC,MAAM,EAAE,EAAE;CACX,CAAC;AAEF,MAAM,aAAa,GAAG,aAAa,GAAG,MAAM,CAAC;AAE7C,0EAA0E;AAC1E,SAAS,eAAe;IACtB,mDAAmD;IACnD,IAAI,YAAY,CAAC,OAAO,CAAC,aAAa,CAAC,KAAK,IAAI,EAAE,CAAC;QACjD,qBAAqB,CAAC,MAAM,EAAE,IAAI,CAAC,SAAS,CAAC,oBAAoB,CAAC,CAAC,CAAC;IACtE,CAAC;IAED,OAAO,IAAI,CAAC,KAAK,CAAC,YAAY,CAAC,OAAO,CAAC,aAAa,CAAC,CAAC,CAAC;AACzD,CAAC;AAED,yEAAyE;AACzE,SAAS,cAAc,CAAC,OAAO;IAC7B,mDAAmD;IACnD,IAAI,YAAY,CAAC,OAAO,CAAC,aAAa,CAAC,KAAK,IAAI,EAAE,CAAC;QACjD,qBAAqB,CAAC,MAAM,EAAE,IAAI,CAAC,SAAS,CAAC,oBAAoB,CAAC,CAAC,CAAC;IACtE,CAAC;IAED,sCAAsC;IACtC,MAAM,UAAU,GAAG,YAAY,CAAC,OAAO,CAAC,aAAa,CAAC,CAAC;IACvD,MAAM,OAAO,GAAG,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,CAAC;IAEvC,0CAA0C;IAC1C,OAAO,OAAO,CAAC,OAAO,CAAC,CAAC;AAC1B,CAAC;AAED;;;;EAIE;AACF,SAAS,eAAe,CAAC,cAAc;IACrC,MAAM,UAAU,GAAG,YAAY,CAAC,OAAO,CAAC,aAAa,CAAC,CAAC;IACvD,MAAM,OAAO,GAAG,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,CAAC;IAEvC,MAAM,UAAU,GAAG,EAAE,GAAG,OAAO,EAAE,GAAG,cAAc,EAAE,CAAC;IAErD,YAAY,CAAC,OAAO,CAAC,aAAa,EAAE,IAAI,CAAC,SAAS,CAAC,UAAU,CAAC,CAAC,CAAC;AAClE,CAAC;AAED,yDAAyD;AACzD,SAAS,iBAAiB,CAAC,OAAO;IAChC,MAAM,UAAU,GAAG,YAAY,CAAC,OAAO,CAAC,aAAa,CAAC,CAAC;IACvD,MAAM,OAAO,GAAG,IAAI,CAAC,KAAK,CAAC,UAAU,CAAC,CAAC;IAEvC,OAAO,CAAC,OAAO,CAAC,GAAG,EAAE,CAAC;IAEtB,YAAY,CAAC,OAAO,CAAC,aAAa,EAAE,IA
AI,CAAC,SAAS,CAAC,OAAO,CAAC,CAAC,CAAC;AAC/D,CAAC;AAED;;;;EAIE;AAEF,MAAM,sBAAsB,GAAG,aAAa,GAAG,eAAe,CAAC;AAE/D,wBAAwB;AACxB,MAAM,uBAAuB,GAAG;IAC9B,QAAQ,EAAE,EAAE;IACZ,QAAQ,EAAE,EAAE;CACb,CAAC;AAEF,SAAS,gBAAgB;IACvB,4DAA4D;IAC5D,IAAI,YAAY,CAAC,OAAO,CAAC,sBAAsB,CAAC,KAAK,IAAI,EAAE,CAAC;QAC1D,qBAAqB,CACnB,eAAe,EACf,IAAI,CAAC,SAAS,CAAC,uBAAuB,CAAC,CACxC,CAAC;IACJ,CAAC;IAED,oDAAoD;IACpD,MAAM,kBAAkB,GAAG,YAAY,CAAC,OAAO,CAAC,sBAAsB,CAAC,CAAC;IACxE,MAAM,eAAe,GAAG,IAAI,CAAC,KAAK,CAAC,kBAAkB,CAAC,CAAC;IAEvD,kCAAkC;IAClC,OAAO,eAAe,CAAC;AACzB,CAAC;AAED;;;;;;;;;EASE;AACF,SAAS,kBAAkB,CAAC,cAAc,EAAE,gBAAgB;IAC1D,MAAM,gBAAgB,GAAG,gBAAgB,EAAE,CAAC;IAC5C,MAAM,iBAAiB,GAAG,gBAAgB,CAAC,GAAG,gBAAgB,GAAG,CAAC,CAAC;IAEnE,OAAO,iBAAiB,CAAC,QAAQ,CAAC,cAAc,CAAC,CAAC;AACpD,CAAC;AAED;;;;;;;;EAQE;AACF,SAAS,qBAAqB,CAAC,cAAc,EAAE,gBAAgB;IAC7D,MAAM,gBAAgB,GAAG,gBAAgB,EAAE,CAAC;IAC5C,MAAM,iBAAiB,GAAG,gBAAgB,CAAC,GAAG,gBAAgB,GAAG,CAAC,CAAC;IAEnE,iBAAiB,CAAC,IAAI,CAAC,cAAc,CAAC,CAAC;IACvC,gBAAgB,CAAC,gBAAgB,GAAG,GAAG,CAAC,GAAG,iBAAiB,CAAC;IAE7D,YAAY,CAAC,OAAO,CAClB,sBAAsB,EACtB,IAAI,CAAC,SAAS,CAAC,gBAAgB,CAAC,CACjC,CAAC;AACJ,CAAC;AAED,8JAA8J;AAC9J,OAAO,EACL,WAAW,EACX,qBAAqB,EACrB,aAAa,EACb,aAAa,EACb,cAAc,EACd,eAAe,EACf,cAAc,EACd,eAAe,EACf,iBAAiB,EACjB,gBAAgB,EAChB,kBAAkB,EAClB,qBAAqB,GACtB,CAAC"}


@ -0,0 +1,71 @@
"""Debug test to show actual metrics output."""
import os
import requests
def test_show_actual_metrics():
"""Display actual metrics from Core instance."""
# Get token
token = os.environ.get("INFLUXDB3_CORE_TOKEN")
headers = {"Authorization": f"Token {token}"} if token else {}
# Fetch metrics
url = "http://influxdb3-core:8181"
response = requests.get(f"{url}/metrics", headers=headers, timeout=5)
print(f"\n{'='*80}")
print(f"ACTUAL METRICS FROM {url}")
print(f"Status Code: {response.status_code}")
print(f"Using Auth: {'Yes' if token else 'No'}")
print(f"{'='*80}\n")
if response.status_code == 200:
lines = response.text.split('\n')
print(f"Total lines: {len(lines)}\n")
# Show first 100 lines
print("First 100 lines of actual output:\n")
for i, line in enumerate(lines[:100], 1):
print(f"{i:4d} | {line}")
# Show examples of documented metrics
print(f"\n{'='*80}")
print("SEARCHING FOR DOCUMENTED METRICS:")
print(f"{'='*80}\n")
documented_metrics = [
"http_requests_total",
"grpc_requests_total",
"influxdb3_catalog_operations_total",
"influxdb_iox_query_log_compute_duration_seconds",
"datafusion_mem_pool_bytes",
"object_store_op_duration_seconds",
"jemalloc_memstats_bytes",
]
for metric in documented_metrics:
# Find TYPE and HELP lines
type_line = next((line for line in lines if f"# TYPE {metric}" in line), None)
help_line = next((line for line in lines if f"# HELP {metric}" in line), None)
# Find first few data lines
data_lines = [line for line in lines if line.startswith(metric) and not line.startswith("#")][:3]
if type_line or help_line or data_lines:
print(f"\n{metric}:")
if help_line:
print(f" {help_line}")
if type_line:
print(f" {type_line}")
for data in data_lines:
print(f" {data}")
else:
print(f"\n{metric}: NOT FOUND")
else:
print(f"ERROR: Status {response.status_code}")
print(response.text[:500])
# Always pass so we can see the output
assert True

test/influxdb3/metrics_endpoint_test.py Normal file

@ -0,0 +1,320 @@
"""Test InfluxDB 3 metrics endpoint for PR #6422.
This test suite validates that the metrics documentation in PR #6422 is accurate
by checking that all documented metrics are actually exposed by the
InfluxDB 3 Core and Enterprise instances.
Usage:
# Basic test execution
docker compose run --rm influxdb3-core-pytest test/influxdb3/metrics_endpoint_test.py
# With verbose output (shows actual metrics and matches)
VERBOSE_METRICS_TEST=true docker compose run --rm influxdb3-core-pytest test/influxdb3/metrics_endpoint_test.py
# Using the wrapper script (recommended)
./test/run-metrics-tests.sh
# With verbose output using wrapper script
VERBOSE_METRICS_TEST=true ./test/run-metrics-tests.sh
Verbose Output:
Set VERBOSE_METRICS_TEST=true to see detailed output showing:
- Which metrics are being searched for
- Actual matching lines from the Prometheus metrics endpoint
- Total occurrence counts (for tests that include comments)
- Clear indication when metrics are not found
Example verbose output:
TEST: HTTP/gRPC Metrics
================================================================================
Searching for: http_requests_total
Found 12 total occurrences
Matches:
# HELP http_requests_total accumulated total requests
# TYPE http_requests_total counter
http_requests_total{method="GET",path="/metrics",status="aborted"} 0
Authentication:
These tests require authentication tokens for InfluxDB 3 Core and Enterprise.
If you get 401 errors, set the following environment variables:
- INFLUXDB3_CORE_TOKEN: Admin token for InfluxDB 3 Core instance
- INFLUXDB3_ENTERPRISE_TOKEN: Admin token for InfluxDB 3 Enterprise instance
Prerequisites:
- Docker and Docker Compose installed
- Running InfluxDB 3 Core and Enterprise containers
- Valid authentication tokens stored in ~/.env.influxdb3-core-admin-token
and ~/.env.influxdb3-enterprise-admin-token (for wrapper script)
"""
import os
import re
import pytest
import requests
# Set to True to see detailed output of what's being checked
VERBOSE_OUTPUT = os.environ.get("VERBOSE_METRICS_TEST", "false").lower() == "true"
class MetricsHelper:
"""Helper class for metrics endpoint testing."""
@staticmethod
def get_auth_headers(token_env_var):
"""Get authorization headers if token is set."""
token = os.environ.get(token_env_var)
if token:
return {"Authorization": f"Token {token}"}
return {}
@staticmethod
def get_metrics(url, token_env_var):
"""Get metrics from endpoint with optional authentication."""
headers = MetricsHelper.get_auth_headers(token_env_var)
response = requests.get(f"{url}/metrics", headers=headers, timeout=5)
if response.status_code == 401:
pytest.skip(f"Authentication required. Set {token_env_var} environment variable.")
assert response.status_code == 200, f"Metrics returned {response.status_code}"
return response.text
@staticmethod
def print_metric_search(test_name, metrics, text, include_comments=False):
"""Print verbose output showing searched metrics and matches."""
if not VERBOSE_OUTPUT:
return
print("\n" + "="*80)
print(f"TEST: {test_name}")
print("="*80)
for metric in metrics:
lines = text.split('\n')
if include_comments:
matches = [line for line in lines if metric in line][:3]
else:
matches = [line for line in lines if metric in line and not line.startswith("#")][:3]
print(f"\n✓ Searching for: {metric}")
if matches:
if include_comments:
print(f" Found {len([l for l in lines if metric in l])} total occurrences")
print(" Matches:")
for match in matches:
print(f" {match}")
else:
print(f" ✗ NOT FOUND")
def test_core_metrics_endpoint_accessible():
"""Test that Core metrics endpoint is accessible."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
assert len(text) > 0, "Core metrics response is empty"
def test_enterprise_metrics_endpoint_accessible():
"""Test that Enterprise metrics endpoint is accessible."""
text = MetricsHelper.get_metrics("http://influxdb3-enterprise:8181", "INFLUXDB3_ENTERPRISE_TOKEN")
assert len(text) > 0, "Enterprise metrics response is empty"
def test_prometheus_format():
"""Test that metrics follow Prometheus exposition format."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
# Check for HELP comments
assert "# HELP" in text, "Missing HELP comments"
# Check for TYPE comments
assert "# TYPE" in text, "Missing TYPE comments"
# Check for valid metric lines (name{labels} value or name value)
metric_pattern = r"^[a-zA-Z_][a-zA-Z0-9_]*(\{[^}]*\})?\s+[\d\.\+\-eE]+(\s+\d+)?$"
lines = [line for line in text.split("\n") if line and not line.startswith("#")]
assert any(
re.match(metric_pattern, line) for line in lines
), "No valid metric lines found"
def test_http_grpc_metrics():
"""Test HTTP and gRPC metrics exist."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
metrics = [
"http_requests_total",
"http_request_duration_seconds",
"http_response_body_size_bytes",
"grpc_requests_total",
"grpc_request_duration_seconds",
]
MetricsHelper.print_metric_search("HTTP/gRPC Metrics", metrics, text, include_comments=True)
missing = [m for m in metrics if m not in text]
assert not missing, f"Missing HTTP/gRPC metrics: {missing}"
def test_database_operation_metrics():
"""Test database operation metrics exist."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
metrics = [
"influxdb3_catalog_operations_total",
"influxdb3_catalog_operation_retries_total",
]
MetricsHelper.print_metric_search("Database Operation Metrics", metrics, text)
missing = [m for m in metrics if m not in text]
assert not missing, f"Missing database operation metrics: {missing}"
def test_query_performance_metrics():
"""Test query performance metrics exist."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
metrics = [
"influxdb_iox_query_log_compute_duration_seconds",
"influxdb_iox_query_log_execute_duration_seconds",
"influxdb_iox_query_log_plan_duration_seconds",
"influxdb_iox_query_log_end2end_duration_seconds",
"influxdb_iox_query_log_max_memory",
"influxdb_iox_query_log_parquet_files",
]
MetricsHelper.print_metric_search("Query Performance Metrics", metrics, text)
missing = [m for m in metrics if m not in text]
assert not missing, f"Missing query performance metrics: {missing}"
def test_memory_caching_metrics():
"""Test memory and caching metrics exist."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
metrics = [
"datafusion_mem_pool_bytes",
"influxdb3_parquet_cache_access_total",
"influxdb3_parquet_cache_size_bytes",
"influxdb3_parquet_cache_size_number_of_files",
"jemalloc_memstats_bytes",
]
MetricsHelper.print_metric_search("Memory & Caching Metrics", metrics, text)
missing = [m for m in metrics if m not in text]
assert not missing, f"Missing memory/caching metrics: {missing}"
def test_object_storage_metrics():
"""Test object storage metrics exist."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
metrics = [
"object_store_op_duration_seconds",
"object_store_transfer_bytes_total",
"object_store_transfer_objects_total",
]
MetricsHelper.print_metric_search("Object Storage Metrics", metrics, text)
missing = [m for m in metrics if m not in text]
assert not missing, f"Missing object storage metrics: {missing}"
def test_runtime_system_metrics():
"""Test runtime and system metrics exist."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
metrics = [
"process_start_time_seconds",
"thread_panic_count_total",
"tokio_runtime_num_alive_tasks",
]
MetricsHelper.print_metric_search("Runtime & System Metrics", metrics, text)
missing = [m for m in metrics if m not in text]
assert not missing, f"Missing runtime/system metrics: {missing}"
def test_metric_types():
"""Test that key metrics have correct types."""
text = MetricsHelper.get_metrics("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN")
# Check for expected types (case-insensitive partial match)
type_checks = [
("http_requests_total", "counter"),
("http_request_duration_seconds", "histogram"),
("datafusion_mem_pool_bytes", "gauge"),
]
if VERBOSE_OUTPUT:
print("\n" + "="*80)
print("TEST: Metric Type Validation")
print("="*80)
for metric_name, expected_type in type_checks:
type_pattern = rf"# TYPE {metric_name}\s+{expected_type}"
match = re.search(type_pattern, text, re.IGNORECASE)
print(f"\n✓ Checking: {metric_name} should be {expected_type}")
if match:
print(f" Match: {match.group()}")
else:
print(f" ✗ NOT FOUND or WRONG TYPE")
for metric_name, expected_type in type_checks:
# Look for TYPE line for this metric
type_pattern = rf"# TYPE {metric_name}\s+{expected_type}"
assert re.search(
type_pattern, text, re.IGNORECASE
), f"Metric {metric_name} should be type {expected_type}"
def test_enterprise_cluster_metrics():
"""Test Enterprise-specific cluster metrics exist."""
text = MetricsHelper.get_metrics("http://influxdb3-enterprise:8181", "INFLUXDB3_ENTERPRISE_TOKEN")
# These metrics are mentioned in Enterprise documentation
metrics = [
"influxdb3_catalog_operation_retries_total",
"influxdb_iox_query_log_ingester_latency",
]
MetricsHelper.print_metric_search("Enterprise Cluster Metrics", metrics, text)
missing = [m for m in metrics if m not in text]
assert not missing, f"Missing Enterprise cluster metrics: {missing}"
@pytest.mark.parametrize("url,token_env,instance", [
("http://influxdb3-core:8181", "INFLUXDB3_CORE_TOKEN", "Core"),
("http://influxdb3-enterprise:8181", "INFLUXDB3_ENTERPRISE_TOKEN", "Enterprise")
])
def test_metrics_have_labels(url, token_env, instance):
"""Test that metrics have proper labels."""
text = MetricsHelper.get_metrics(url, token_env)
# Find a metric with labels (look for http_requests_total)
label_pattern = r'http_requests_total\{[^}]+\}'
matches = re.findall(label_pattern, text)
if VERBOSE_OUTPUT:
print("\n" + "="*80)
print(f"TEST: Metric Label Validation ({instance})")
print("="*80)
print(f"\n✓ Searching for labeled metrics using pattern: {label_pattern}")
print(f" Found {len(matches)} labeled metrics")
if matches:
print(" Sample matches:")
for match in matches[:3]:
print(f" {match}")
assert len(matches) > 0, f"{instance}: No metrics with labels found"
# Check that labels are properly formatted
for match in matches:
assert '="' in match, f"{instance}: Labels should use = and quotes"
assert match.endswith("}"), f"{instance}: Labels should end with }}"

test/influxdb3/prometheus.yml Normal file

@ -0,0 +1,87 @@
# Prometheus configuration for testing InfluxDB 3 metrics
# Based on documentation in content/shared/influxdb3-admin/monitor-metrics.md
# This configuration matches the examples provided in PR #6422
# NOTE: If your InfluxDB instance requires authentication for the /metrics endpoint,
# you'll need to configure bearer_token or bearer_token_file in the scrape configs below.
# See: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
global:
scrape_interval: 30s
evaluation_interval: 30s
external_labels:
monitor: 'influxdb3-test'
# Scrape configurations
scrape_configs:
# InfluxDB 3 Core
# Documentation reference: lines 563-571 in monitor-metrics.md
- job_name: 'influxdb3-core'
static_configs:
- targets: ['influxdb3-core:8181']
labels:
environment: 'test'
product: 'core'
metrics_path: '/metrics'
scrape_interval: 30s
scrape_timeout: 10s
# Authentication - uses credential file
# Token is written to /tmp/core-token by docker-compose entrypoint
authorization:
credentials_file: /tmp/core-token
# Fallback protocol for targets that don't send Content-Type
fallback_scrape_protocol: 'PrometheusText0.0.4'
# Relabeling to add node identification (same as Enterprise)
relabel_configs:
# Extract node name from address
- source_labels: [__address__]
target_label: node_name
regex: '([^:]+):.*'
replacement: '${1}'
# Add node role based on name pattern
- source_labels: [node_name]
target_label: node_role
regex: '.*core.*'
replacement: 'all-in-one-core'
- source_labels: [node_name]
target_label: node_role
regex: '.*enterprise.*'
replacement: 'all-in-one-enterprise'
# InfluxDB 3 Enterprise
# Documentation reference: lines 399-418 in monitor-metrics.md
# Includes relabeling from lines 536-553
- job_name: 'influxdb3-enterprise'
static_configs:
- targets: ['influxdb3-enterprise:8181']
labels:
environment: 'test'
product: 'enterprise'
metrics_path: '/metrics'
scrape_interval: 30s
scrape_timeout: 10s
# Authentication - uses credential file
# Token is written to /tmp/enterprise-token by docker-compose entrypoint
authorization:
credentials_file: /tmp/enterprise-token
# Fallback protocol for targets that don't send Content-Type
fallback_scrape_protocol: 'PrometheusText0.0.4'
# Relabeling to add node identification
# Documentation reference: lines 536-553
relabel_configs:
# Extract node name from address
- source_labels: [__address__]
target_label: node_name
regex: '([^:]+):.*'
replacement: '${1}'
# Add node role based on name pattern
- source_labels: [node_name]
target_label: node_role
regex: '.*core.*'
replacement: 'all-in-one-core'
- source_labels: [node_name]
target_label: node_role
regex: '.*enterprise.*'
replacement: 'all-in-one-enterprise'

test/influxdb3/prometheus_integration_test.py Normal file

@ -0,0 +1,391 @@
"""Test Prometheus integration and relabeling for PR #6422.
This test suite validates that the Prometheus configuration and relabeling
examples documented in PR #6422 actually work correctly.
Unlike metrics_endpoint_test.py which directly queries InfluxDB endpoints,
this test:
1. Starts Prometheus with the documented configuration
2. Validates Prometheus can scrape InfluxDB endpoints
3. Verifies relabeling rules add node_name and node_role labels
4. Tests PromQL queries with the relabeled metrics
Usage:
# Start Prometheus and run integration tests
docker compose --profile monitoring up -d
docker compose run --rm influxdb3-core-pytest test/influxdb3/prometheus_integration_test.py
# Or use the wrapper script
./test/influxdb3/run-prometheus-tests.sh
Prerequisites:
- Docker and Docker Compose installed
- Running InfluxDB 3 Core and Enterprise containers
- Prometheus service started with --profile monitoring
- Valid authentication tokens (if required)
"""
import os
import time
import pytest
import requests
# Prometheus API endpoint
PROMETHEUS_URL = os.environ.get("PROMETHEUS_URL", "http://prometheus:9090")
# Set to True to see detailed output
VERBOSE_OUTPUT = os.environ.get("VERBOSE_PROMETHEUS_TEST", "false").lower() == "true"
class PrometheusHelper:
"""Helper class for Prometheus integration testing."""
@staticmethod
def wait_for_prometheus(timeout=30):
"""Wait for Prometheus to be ready."""
start_time = time.time()
while time.time() - start_time < timeout:
try:
response = requests.get(f"{PROMETHEUS_URL}/-/ready", timeout=5)
if response.status_code == 200:
return True
except requests.exceptions.RequestException:
pass
time.sleep(1)
return False
@staticmethod
def wait_for_targets(timeout=60):
"""Wait for Prometheus to discover and scrape targets."""
start_time = time.time()
while time.time() - start_time < timeout:
try:
response = requests.get(
f"{PROMETHEUS_URL}/api/v1/targets",
timeout=5
)
if response.status_code == 200:
data = response.json()
active_targets = data.get("data", {}).get("activeTargets", [])
# Check if all targets are up
all_up = all(
target.get("health") == "up"
for target in active_targets
)
if all_up and len(active_targets) >= 2:
if VERBOSE_OUTPUT:
print(f"\n✓ All {len(active_targets)} targets are up")
return True
if VERBOSE_OUTPUT:
up_count = sum(
1 for t in active_targets
if t.get("health") == "up"
)
print(f" Waiting for targets: {up_count}/{len(active_targets)} up")
except requests.exceptions.RequestException as e:
if VERBOSE_OUTPUT:
print(f" Error checking targets: {e}")
time.sleep(2)
return False
@staticmethod
def query_prometheus(query):
"""Execute a PromQL query."""
response = requests.get(
f"{PROMETHEUS_URL}/api/v1/query",
params={"query": query},
timeout=10
)
assert response.status_code == 200, f"Query failed: {response.text}"
return response.json()
@staticmethod
def print_query_result(query, result):
"""Print verbose query result."""
if not VERBOSE_OUTPUT:
return
print(f"\n✓ Query: {query}")
data = result.get("data", {})
result_type = data.get("resultType")
results = data.get("result", [])
print(f" Result type: {result_type}")
print(f" Number of results: {len(results)}")
if results:
print(" Sample results:")
for sample in results[:3]:
metric = sample.get("metric", {})
value = sample.get("value", [None, None])
print(f" {metric} => {value[1]}")
def test_prometheus_is_ready():
"""Test that Prometheus service is ready."""
assert PrometheusHelper.wait_for_prometheus(), (
"Prometheus not ready after 30 seconds. "
"Ensure Prometheus is running: docker compose --profile monitoring up -d"
)
def test_prometheus_targets_discovered():
"""Test that Prometheus has discovered InfluxDB targets."""
response = requests.get(f"{PROMETHEUS_URL}/api/v1/targets", timeout=10)
assert response.status_code == 200, "Failed to get targets"
data = response.json()
targets = data.get("data", {}).get("activeTargets", [])
if VERBOSE_OUTPUT:
print("\n" + "="*80)
print("TEST: Prometheus Target Discovery")
print("="*80)
for target in targets:
health = target.get("health")
job = target.get("labels", {}).get("job")
address = target.get("scrapeUrl")
print(f"\n✓ Target: {job}")
print(f" Health: {health}")
print(f" Address: {address}")
# Should have at least 2 targets (core and enterprise)
assert len(targets) >= 2, f"Expected at least 2 targets, found {len(targets)}"
# Check for expected job names
job_names = {target.get("labels", {}).get("job") for target in targets}
assert "influxdb3-core" in job_names, "Missing influxdb3-core target"
assert "influxdb3-enterprise" in job_names, "Missing influxdb3-enterprise target"
def test_prometheus_targets_up():
"""Test that all Prometheus targets are healthy."""
assert PrometheusHelper.wait_for_targets(), (
"Targets not healthy after 60 seconds. "
"Check that InfluxDB instances are running and accessible."
)
response = requests.get(f"{PROMETHEUS_URL}/api/v1/targets", timeout=10)
data = response.json()
targets = data.get("data", {}).get("activeTargets", [])
unhealthy = [
target for target in targets
if target.get("health") != "up"
]
assert not unhealthy, (
f"Found {len(unhealthy)} unhealthy targets: "
f"{[t.get('labels', {}).get('job') for t in unhealthy]}"
)
def test_relabeling_adds_node_name():
"""Test that relabeling adds node_name label.
Documentation reference: monitor-metrics.md lines 536-540
Relabeling extracts hostname from __address__ and adds as node_name.
"""
# Wait for metrics to be scraped
time.sleep(5)
# Query for any metric with node_name label
query = 'http_requests_total{node_name!=""}'
result = PrometheusHelper.query_prometheus(query)
PrometheusHelper.print_query_result(query, result)
data = result.get("data", {})
results = data.get("result", [])
assert len(results) > 0, (
"No metrics found with node_name label. "
"Relabeling may not be working correctly."
)
# Verify node_name values match expected patterns
node_names = {
result.get("metric", {}).get("node_name")
for result in results
}
if VERBOSE_OUTPUT:
print(f"\n✓ Found node_name labels: {node_names}")
# Should have node names for both core and enterprise
assert any("core" in name for name in node_names), (
"No node_name containing 'core' found"
)
assert any("enterprise" in name for name in node_names), (
"No node_name containing 'enterprise' found"
)
def test_relabeling_adds_node_role():
"""Test that relabeling adds node_role label.
Documentation reference: monitor-metrics.md lines 541-553
Relabeling assigns node_role based on node_name pattern.
"""
# Wait for metrics to be scraped
time.sleep(5)
# Query for metrics with node_role label
query = 'http_requests_total{node_role!=""}'
result = PrometheusHelper.query_prometheus(query)
PrometheusHelper.print_query_result(query, result)
data = result.get("data", {})
results = data.get("result", [])
assert len(results) > 0, (
"No metrics found with node_role label. "
"Relabeling may not be working correctly."
)
# Verify node_role values
node_roles = {
result.get("metric", {}).get("node_role")
for result in results
}
if VERBOSE_OUTPUT:
print(f"\n✓ Found node_role labels: {node_roles}")
# Based on test/prometheus.yml relabeling rules
expected_roles = {"all-in-one-core", "all-in-one-enterprise"}
assert node_roles & expected_roles, (
f"Expected roles {expected_roles}, found {node_roles}"
)
def test_query_metrics_by_node():
"""Test that metrics can be queried by node labels.
This validates that users can filter metrics by node_name and node_role
as documented in the monitoring guide.
"""
# Wait for metrics to be scraped
time.sleep(5)
# Query metrics for specific node
queries = [
'http_requests_total{node_name="influxdb3-core"}',
'http_requests_total{node_name="influxdb3-enterprise"}',
'http_requests_total{node_role="all-in-one-core"}',
'http_requests_total{node_role="all-in-one-enterprise"}',
]
if VERBOSE_OUTPUT:
print("\n" + "="*80)
print("TEST: Query Metrics by Node Labels")
print("="*80)
for query in queries:
result = PrometheusHelper.query_prometheus(query)
PrometheusHelper.print_query_result(query, result)
data = result.get("data", {})
results = data.get("result", [])
assert len(results) > 0, f"No results for query: {query}"
def test_promql_rate_query():
"""Test rate() query from documentation examples.
Documentation commonly shows rate queries for counters.
"""
# Wait for enough data
time.sleep(10)
query = 'rate(http_requests_total[1m])'
result = PrometheusHelper.query_prometheus(query)
PrometheusHelper.print_query_result(query, result)
data = result.get("data", {})
results = data.get("result", [])
# Should have results (may be 0 if no recent requests)
assert isinstance(results, list), "Expected list of results"
def test_promql_histogram_quantile():
"""Test histogram_quantile() query from documentation examples.
Documentation reference: Example queries for query duration metrics.
"""
# Wait for enough data
time.sleep(10)
query = 'histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[1m]))'
result = PrometheusHelper.query_prometheus(query)
PrometheusHelper.print_query_result(query, result)
# Query should execute without error
assert result.get("status") == "success", (
f"Query failed: {result.get('error')}"
)
def test_enterprise_metrics_queryable():
"""Test that Enterprise-specific metrics are queryable via Prometheus."""
# Wait for metrics to be scraped
time.sleep(5)
# Query Enterprise-specific metrics
queries = [
'influxdb3_catalog_operation_retries_total',
'influxdb_iox_query_log_ingester_latency',
]
if VERBOSE_OUTPUT:
print("\n" + "="*80)
print("TEST: Enterprise-Specific Metrics")
print("="*80)
for query in queries:
result = PrometheusHelper.query_prometheus(query)
PrometheusHelper.print_query_result(query, result)
# Query should execute (may have no results if no activity)
assert result.get("status") == "success", (
f"Query failed: {result.get('error')}"
)
def test_prometheus_config_matches_docs():
"""Verify Prometheus configuration matches documented examples.
This test validates that test/prometheus.yml matches the configuration
examples in the documentation.
"""
response = requests.get(f"{PROMETHEUS_URL}/api/v1/status/config", timeout=10)
assert response.status_code == 200, "Failed to get Prometheus config"
config = response.json()
config_yaml = config.get("data", {}).get("yaml", "")
if VERBOSE_OUTPUT:
print("\n" + "="*80)
print("TEST: Prometheus Configuration")
print("="*80)
print("\nConfiguration (first 500 chars):")
print(config_yaml[:500])
# Verify key configuration elements from documentation
assert "influxdb3-core" in config_yaml, "Missing influxdb3-core job"
assert "influxdb3-enterprise" in config_yaml, "Missing influxdb3-enterprise job"
assert "relabel_configs" in config_yaml, "Missing relabel_configs"
assert "node_name" in config_yaml, "Missing node_name in relabeling"
assert "node_role" in config_yaml, "Missing node_role in relabeling"
# Verify scrape settings
assert "/metrics" in config_yaml, "Missing /metrics path"

View File: test/run-metrics-tests.sh

@@ -0,0 +1,50 @@
#!/bin/bash
# Run metrics endpoint tests with authentication
#
# Usage:
#   ./test/run-metrics-tests.sh                 # Run direct metrics tests
#   ./test/run-metrics-tests.sh --prometheus    # Run Prometheus integration tests
#   ./test/run-metrics-tests.sh --all           # Run both test suites

set -e

# Read tokens from secret files
INFLUXDB3_CORE_TOKEN=$(cat ~/.env.influxdb3-core-admin-token)
INFLUXDB3_ENTERPRISE_TOKEN=$(cat ~/.env.influxdb3-enterprise-admin-token)

# Export for docker compose
export INFLUXDB3_CORE_TOKEN
export INFLUXDB3_ENTERPRISE_TOKEN
export VERBOSE_METRICS_TEST

# Parse arguments
RUN_DIRECT=true
RUN_PROMETHEUS=false

if [[ "$1" == "--prometheus" ]]; then
  RUN_DIRECT=false
  RUN_PROMETHEUS=true
  shift
elif [[ "$1" == "--all" ]]; then
  RUN_DIRECT=true
  RUN_PROMETHEUS=true
  shift
fi

# Run direct metrics tests
if [[ "$RUN_DIRECT" == "true" ]]; then
  echo "Running direct metrics endpoint tests..."
  docker compose run --rm \
    -e INFLUXDB3_CORE_TOKEN \
    -e INFLUXDB3_ENTERPRISE_TOKEN \
    -e VERBOSE_METRICS_TEST \
    influxdb3-core-pytest \
    "test/influxdb3/metrics_endpoint_test.py" "$@"
  echo ""
fi

# Run Prometheus integration tests
if [[ "$RUN_PROMETHEUS" == "true" ]]; then
  echo "Running Prometheus integration tests..."
  ./test/influxdb3/run-prometheus-tests.sh "$@"
fi
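
Both wrapper scripts read the admin tokens with `cat` under `set -e`, so a missing token file aborts with a terse error. A hedged pre-flight sketch in Python that fails with a clearer message (the file paths match the script above; the check itself is illustrative and not part of the repo):

```python
# Pre-flight check for the token files the wrapper scripts read.
# Sketch only; not part of the test suite.
import sys
from pathlib import Path

TOKEN_FILES = [
    Path.home() / ".env.influxdb3-core-admin-token",
    Path.home() / ".env.influxdb3-enterprise-admin-token",
]


def check_token_files() -> None:
    """Exit with a readable message if a token file is missing or empty."""
    for path in TOKEN_FILES:
        if not path.is_file() or not path.read_text().strip():
            sys.exit(f"Missing or empty token file: {path}")


if __name__ == "__main__":
    check_token_files()
    print("Token files look OK")
```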

View File: test/influxdb3/run-prometheus-tests.sh

@@ -0,0 +1,48 @@
#!/bin/bash
# Run Prometheus integration tests with authentication
#
# This script validates that Prometheus can scrape InfluxDB metrics
# and that relabeling configuration works as documented.

set -e

# Read tokens from secret files
INFLUXDB3_CORE_TOKEN=$(cat ~/.env.influxdb3-core-admin-token)
INFLUXDB3_ENTERPRISE_TOKEN=$(cat ~/.env.influxdb3-enterprise-admin-token)

# Export for docker compose
export INFLUXDB3_CORE_TOKEN
export INFLUXDB3_ENTERPRISE_TOKEN
export VERBOSE_PROMETHEUS_TEST

echo "Starting Prometheus integration tests..."
echo ""
echo "This will:"
echo " 1. Start Prometheus with documented configuration"
echo " 2. Wait for Prometheus to scrape InfluxDB endpoints"
echo " 3. Validate relabeling adds node_name and node_role labels"
echo " 4. Test PromQL queries with relabeled metrics"
echo ""

# Start Prometheus if not already running
if ! docker ps | grep -q prometheus; then
  echo "Starting Prometheus service..."
  docker compose --profile monitoring up -d prometheus
  echo "Waiting for Prometheus to start..."
  sleep 5
fi

# Run tests
echo "Running Prometheus integration tests..."
docker compose run --rm \
  -e INFLUXDB3_CORE_TOKEN \
  -e INFLUXDB3_ENTERPRISE_TOKEN \
  -e VERBOSE_PROMETHEUS_TEST \
  -e PROMETHEUS_URL=http://prometheus:9090 \
  influxdb3-core-pytest \
  "test/influxdb3/prometheus_integration_test.py" "$@"

echo ""
echo "Tests complete!"
echo ""
echo "To view Prometheus UI, visit: http://localhost:9090"
echo "To stop Prometheus: docker compose --profile monitoring down"

View File

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
"""Display sample metrics output from InfluxDB 3 instances."""
import os

import requests


def show_metrics_sample(url, token_env_var, instance_name, num_lines=150):
"""Fetch and display sample metrics."""
print(f"\n{'='*80}")
print(f"{instance_name} Metrics Sample (first {num_lines} lines)")
print(f"URL: {url}/metrics")
print(f"{'='*80}\n")
# Get auth headers
headers = {}
token = os.environ.get(token_env_var)
if token:
headers = {"Authorization": f"Token {token}"}
print(f"✓ Using authentication token from {token_env_var}\n")
else:
print(f"⚠ No token found in {token_env_var} - trying without auth\n")
try:
response = requests.get(f"{url}/metrics", headers=headers, timeout=5)
if response.status_code == 401:
print(f"✗ Authentication required but no valid token provided")
return
if response.status_code != 200:
print(f"✗ Unexpected status code: {response.status_code}")
return
# Display first N lines
lines = response.text.split('\n')
print(f"Total lines: {len(lines)}\n")
for i, line in enumerate(lines[:num_lines], 1):
print(f"{i:4d} | {line}")
if len(lines) > num_lines:
print(f"\n... ({len(lines) - num_lines} more lines)")
# Show some interesting metrics
print(f"\n{'='*80}")
print("Sample Metric Searches:")
print(f"{'='*80}\n")
metrics_to_show = [
"http_requests_total",
"grpc_requests_total",
"influxdb3_catalog_operations_total",
"influxdb_iox_query_log_compute_duration_seconds",
"datafusion_mem_pool_bytes",
"object_store_op_duration_seconds",
]
for metric in metrics_to_show:
matching = [line for line in lines if metric in line and not line.startswith("#")]
if matching:
print(f"✓ Found '{metric}' - showing first 3 values:")
for match in matching[:3]:
print(f" {match}")
else:
print(f"✗ Metric '{metric}' not found")
except Exception as e:
print(f"✗ Error fetching metrics: {e}")


if __name__ == "__main__":
# Show Core metrics
show_metrics_sample(
"http://influxdb3-core:8181",
"INFLUXDB3_CORE_TOKEN",
"InfluxDB 3 Core",
num_lines=100
)
print("\n\n")
# Show Enterprise metrics
show_metrics_sample(
"http://influxdb3-enterprise:8181",
"INFLUXDB3_ENTERPRISE_TOKEN",
"InfluxDB 3 Enterprise",
num_lines=100
)
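
The display script matches metrics by substring. If you need the parts of a matched sample (name, labels, value), the Prometheus text exposition format is regular enough for a small parser; a sketch (the regexes and the `parse_sample` name are illustrative, not part of the script):

```python
# Parse one Prometheus text-format sample line into (name, labels, value).
# Sketch only; the display script above uses substring matching.
import re

SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str):
    """Return (name, labels, value) for a sample line, or None for other lines."""
    match = SAMPLE_RE.match(line)
    if not match:
        return None  # comment, blank, or malformed line
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


# Example:
# parse_sample('http_requests_total{method="GET",path="/metrics",status="aborted"} 0')
# -> ('http_requests_total', {'method': 'GET', 'path': '/metrics', 'status': 'aborted'}, 0.0)
```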

View File

@@ -5,8 +5,8 @@
python_files = *_test.py *_test_sh.py
# Collect classes with names ending in Test.
python_classes = *Test
-# Collect all functions.
-python_functions = *
+# Collect test functions (exclude helpers).
+python_functions = test_*
filterwarnings = ignore::pytest.PytestReturnNotNoneWarning
# Log settings.
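
The `python_functions = test_*` change narrows collection so module-level helpers are no longer picked up as tests. For example, under the old `*` pattern pytest would collect both functions in this hypothetical module; under `test_*` it collects only the second:

```python
def build_query(metric: str, node: str) -> str:
    """Helper: matched by the old `*` pattern, skipped by `test_*`."""
    return f'{metric}{{node_name="{node}"}}'


def test_build_query():
    """Matched by `test_*`, so still collected and run."""
    expected = 'http_requests_total{node_name="influxdb3-core"}'
    assert build_query("http_requests_total", "influxdb3-core") == expected
```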