Configure thread allocation, memory settings, and other parameters to optimize {{% product-name %}} performance based on your workload characteristics.

Best practices
General monitoring principles
Essential settings for performance
Common performance issues
Configuration examples by workload
Thread allocation details
Enterprise mode-specific tuning
Memory tuning
Advanced tuning options
Monitoring and validation
Common performance issues

Best practices

Start with monitoring: Understand your current bottlenecks before tuning
Change one parameter at a time: Isolate the impact of each change
Test with production-like workloads: Use realistic data and query patterns
Document your configuration: Keep track of what works for your workload
Plan for growth: Leave headroom for traffic increases
Review regularly: Reassess your configuration periodically as workloads evolve

General monitoring principles

Before tuning performance, establish baseline metrics to identify bottlenecks:

Key metrics to monitor

CPU usage per core
- Monitor individual core utilization to identify thread pool imbalances
- Watch for cores at 100% while others are idle (indicates thread allocation issues)
- Use top -H or htop to view per-thread CPU usage
Memory consumption
- Track heap usage vs available RAM
- Monitor query execution memory pool utilization
- Watch for OOM errors or excessive swapping
IO and network
- Measure write throughput (points/second)
- Track query response times
- Monitor object store latency for cloud deployments
- Check disk IO wait times with iostat

Establish baselines

# Monitor CPU per thread
top -H -p $(pgrep influxdb3)

# Track memory usage
free -h
watch -n 1 "free -h"

# Check IO wait
iostat -x 1

[!Tip] For comprehensive metrics monitoring, see Monitor metrics.

Essential settings for performance

{{% show-in "enterprise" %}} Use the following to tune performance in all-in-one deployments:

[!Note] For specialized cluster nodes (ingest-only, query-only, etc.), see Configure specialized cluster nodes for mode-specific optimizations. {{% /show-in %}}

Thread allocation (--num-io-threads{{% show-in "enterprise" %}}, --num-datafusion-threads{{% /show-in %}})

IO threads handle HTTP requests and line protocol parsing. Default: 2 (often insufficient). {{% show-in "enterprise" %}}DataFusion threads process queries and snapshots.{{% /show-in %}}

{{% show-in "core" %}} [!Note] {{% product-name %}} automatically allocates remaining cores to DataFusion after reserving IO threads. You can only configure --num-io-threads. {{% /show-in %}}

{{% show-in "enterprise" %}} [!Note] {{% product-name %}} lets you configure both thread pools explicitly with --num-io-threads and --num-datafusion-threads. {{% /show-in %}}

# Write-heavy: More IO threads (Core)
influxdb3 --num-io-threads=12 serve \
  --node-id=node0 \
  --object-store=file --data-dir=~/.influxdb3

# Query-heavy: Fewer IO threads (Core)
influxdb3 --num-io-threads=4 serve \
  --node-id=node0 \
  --object-store=file --data-dir=~/.influxdb3

# Write-heavy: More IO threads, adequate DataFusion (Enterprise)
influxdb3 --num-io-threads=12 --num-datafusion-threads=20 serve \
  --node-id=node0 --cluster-id=cluster0 \
  --object-store=file --data-dir=~/.influxdb3

# Query-heavy: Fewer IO threads, more DataFusion (Enterprise)
influxdb3 --num-io-threads=4 --num-datafusion-threads=28 serve \
  --node-id=node0 --cluster-id=cluster0 \
  --object-store=file --data-dir=~/.influxdb3

[!Warning]

Increase IO threads for concurrent writers

If you have multiple concurrent writers (for example, Telegraf agents), the default of 2 IO threads can bottleneck write performance.

Memory pool (--exec-mem-pool-bytes)

Controls memory for query execution. Default: {{% show-in "core" %}}70%{{% /show-in %}}{{% show-in "enterprise" %}}20%{{% /show-in %}} of RAM.

# Increase for query-heavy workloads (Core)
influxdb3 --exec-mem-pool-bytes=90% serve \
  --node-id=node0 \
  --object-store=file --data-dir=~/.influxdb3

# Decrease if experiencing memory pressure (Core)
influxdb3 --exec-mem-pool-bytes=60% serve \
  --node-id=node0 \
  --object-store=file --data-dir=~/.influxdb3

# Increase for query-heavy workloads (Enterprise)
influxdb3 --exec-mem-pool-bytes=90% serve \
  --node-id=node0 --cluster-id=cluster0 \
  --object-store=file --data-dir=~/.influxdb3

# Decrease if experiencing memory pressure (Enterprise)
influxdb3 --exec-mem-pool-bytes=60% serve \
  --node-id=node0 --cluster-id=cluster0 \
  --object-store=file --data-dir=~/.influxdb3

Parquet cache ({{% show-in "core" %}}--parquet-mem-cache-size-mb{{% /show-in %}}{{% show-in "enterprise" %}}--parquet-mem-cache-size{{% /show-in %}})

Caches frequently accessed data files in memory.

# Enable caching for better query performance (Core, size in MB)
influxdb3 --parquet-mem-cache-size-mb=4096 serve \
  --node-id=node0 \
  --object-store=file --data-dir=~/.influxdb3

# Enable caching for better query performance (Enterprise)
influxdb3 --parquet-mem-cache-size=4GB serve \
  --node-id=node0 --cluster-id=cluster0 \
  --object-store=file --data-dir=~/.influxdb3

WAL flush interval (--wal-flush-interval)

Sets how often buffered writes are flushed to the WAL. Shorter intervals reduce write latency; longer intervals favor throughput. Default: 1s.

# Reduce latency for real-time data (Core)
influxdb3 --wal-flush-interval=100ms serve \
  --node-id=node0 \
  --object-store=file --data-dir=~/.influxdb3

# Reduce latency for real-time data (Enterprise)
influxdb3 --wal-flush-interval=100ms serve \
  --node-id=node0 --cluster-id=cluster0 \
  --object-store=file --data-dir=~/.influxdb3

Common performance issues

High write latency

Symptoms: Increasing write response times, timeouts, points dropped

Solutions:

Increase IO threads (default is only 2)
Reduce WAL flush interval (from 1s to 100ms)
Check disk IO performance

Slow query performance

Symptoms: Long execution times, high memory usage, query timeouts

Solutions:

{{% show-in "enterprise" %}}Increase DataFusion threads{{% /show-in %}}
Increase execution memory pool (to 90%)
Enable Parquet caching

Memory pressure

Symptoms: OOM errors, swapping, high memory usage

Solutions:

Reduce execution memory pool (to 60%)
Lower snapshot threshold (--force-snapshot-mem-threshold=70%)

CPU bottlenecks

Symptoms: 100% CPU utilization, uneven thread usage (only 2 cores for writes)

Solutions:

Rebalance thread allocation
Check if only 2 cores are used for write parsing (increase IO threads)

[!Important]

"My ingesters are only using 2 cores"

Increase --num-io-threads to 8-16+ for ingest nodes.{{% show-in "enterprise" %}} For dedicated ingest nodes with --mode=ingest, see Configure ingest nodes.{{% /show-in %}}

Configuration examples by workload

Write-heavy workloads (>100k points/second)

# 32-core system, high ingest rate (Core)
influxdb3 --num-io-threads=12 \
  --exec-mem-pool-bytes=80% \
  --wal-flush-interval=100ms \
  serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3

# 32-core system, high ingest rate (Enterprise)
influxdb3 --num-io-threads=12 \
  --num-datafusion-threads=20 \
  --exec-mem-pool-bytes=80% \
  --wal-flush-interval=100ms \
  serve \
  --node-id=node0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

Query-heavy workloads (complex analytics)

# 32-core system, analytical queries (Core)
influxdb3 --num-io-threads=4 \
  --exec-mem-pool-bytes=90% \
  --parquet-mem-cache-size-mb=2048 \
  serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3

# 32-core system, analytical queries (Enterprise)
influxdb3 --num-io-threads=4 \
  --num-datafusion-threads=28 \
  --exec-mem-pool-bytes=90% \
  --parquet-mem-cache-size=2GB \
  serve \
  --node-id=node0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

Mixed workloads (real-time dashboards)

# 32-core system, balanced operations (Core)
influxdb3 --num-io-threads=8 \
  --exec-mem-pool-bytes=70% \
  --parquet-mem-cache-size-mb=1024 \
  serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3

# 32-core system, balanced operations (Enterprise)
influxdb3 --num-io-threads=8 \
  --num-datafusion-threads=24 \
  --exec-mem-pool-bytes=70% \
  --parquet-mem-cache-size=1GB \
  serve \
  --node-id=node0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

Thread allocation details

Calculate optimal thread counts

Use this formula as a starting point:

Total cores = N
Concurrent writers = W
Query complexity factor = Q (1-10, where 10 is most complex)

IO threads = min(W + 2, N * 0.4)
DataFusion threads = N - IO threads
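
As a rough sketch of the formula above, the following Bash function computes both pool sizes from the core count and the number of concurrent writers. The function name `suggest_threads` is hypothetical and not part of the influxdb3 CLI. `N * 0.4` is rounded down with integer arithmetic, and the floor of one IO thread is an added assumption. `Q` is omitted because the formula as written does not reference it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the influxdb3 CLI).
# Usage: suggest_threads <total_cores> <concurrent_writers>
suggest_threads() {
  local cores=$1 writers=$2
  local by_writers=$(( writers + 2 ))      # W + 2
  local by_cores=$(( cores * 4 / 10 ))     # N * 0.4, rounded down
  local io=$(( by_writers < by_cores ? by_writers : by_cores ))
  if (( io < 1 )); then io=1; fi           # assumption: keep at least one IO thread
  echo "io=${io} datafusion=$(( cores - io ))"
}

suggest_threads 32 10   # 32 cores, 10 concurrent writers
```

For 32 cores and 10 writers this prints `io=12 datafusion=20`. Treat the result as a starting point and adjust it against measured load.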

Example configurations by system size

Small system (4 cores, 16 GB RAM)

# Balanced configuration (Core)
influxdb3 --num-io-threads=2 \
  --exec-mem-pool-bytes=10GB \
  --parquet-mem-cache-size-mb=500 \
  serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3

# Balanced configuration (Enterprise)
influxdb3 --num-io-threads=2 \
  --exec-mem-pool-bytes=10GB \
  --parquet-mem-cache-size=500MB \
  serve \
  --node-id=node0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

Medium system (16 cores, 64 GB RAM)

# Write-optimized configuration (Core)
influxdb3 --num-io-threads=6 \
  --exec-mem-pool-bytes=45GB \
  --parquet-mem-cache-size-mb=2048 \
  serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3

# Write-optimized configuration (Enterprise)
influxdb3 --num-io-threads=6 \
  --num-datafusion-threads=10 \
  --exec-mem-pool-bytes=45GB \
  --parquet-mem-cache-size=2GB \
  serve \
  --node-id=node0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

Large system (64 cores, 256 GB RAM)

# Query-optimized configuration (Core)
influxdb3 --num-io-threads=8 \
  --exec-mem-pool-bytes=200GB \
  --parquet-mem-cache-size-mb=10240 \
  --object-store-connection-limit=200 \
  serve \
  --node-id=node0 \
  --object-store=file \
  --data-dir=~/.influxdb3

# Query-optimized configuration (Enterprise)
influxdb3 --num-io-threads=8 \
  --num-datafusion-threads=56 \
  --exec-mem-pool-bytes=200GB \
  --parquet-mem-cache-size=10GB \
  --object-store-connection-limit=200 \
  serve \
  --node-id=node0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3
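
In each example above that sets both flags, --num-io-threads plus --num-datafusion-threads equals the core count. A minimal sketch for checking a planned split against that budget follows. The function name `check_thread_budget` is hypothetical, and treating a total above the core count as a problem is an assumption drawn from those examples rather than a documented limit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the influxdb3 CLI).
# Usage: check_thread_budget <total_cores> <io_threads> <datafusion_threads>
check_thread_budget() {
  local cores=$1 io=$2 df=$3
  local total=$(( io + df ))
  # Assumption: the two pools together should not exceed the core count.
  if (( total > cores )); then
    echo "over budget: ${total} threads for ${cores} cores"
    return 1
  fi
  echo "ok: ${total} of ${cores} cores assigned"
}

check_thread_budget 64 8 56   # the large-system split shown above
```

The non-zero return code on an over-budget split makes the check usable as a guard in a startup script.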

Enterprise mode-specific tuning

Ingest mode optimization

Dedicated ingest nodes require significant IO threads:

# High-throughput ingester (96 cores)
influxdb3 --mode=ingest \
  --num-cores=96 \
  --num-io-threads=24 \
  --num-datafusion-threads=72 \
  --force-snapshot-mem-threshold=90% \
  serve \
  --node-id=ingester0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

[!Warning] Without explicitly setting --num-io-threads, a 96-core ingester uses only 2 cores for parsing line protocol, leaving the other 94 cores (about 98% of the CPU) unused for line protocol parsing.

Query mode optimization

Query nodes should maximize DataFusion threads:

# Query-optimized node (64 cores)
influxdb3 --mode=query \
  --num-cores=64 \
  --num-io-threads=4 \
  --num-datafusion-threads=60 \
  --exec-mem-pool-bytes=90% \
  --parquet-mem-cache-size=4GB \
  serve \
  --node-id=query0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

Compactor mode optimization

Compaction is DataFusion-intensive:

# Dedicated compactor (32 cores)
influxdb3 --mode=compact \
  --num-cores=32 \
  --num-io-threads=2 \
  --num-datafusion-threads=30 \
  --compaction-row-limit=1000000 \
  serve \
  --node-id=compactor0 \
  --cluster-id=cluster0 \
  --object-store=file \
  --data-dir=~/.influxdb3

Memory tuning

Execution memory pool

Configure the query execution memory pool:

# Absolute value in bytes
--exec-mem-pool-bytes=8589934592  # 8GB

# Percentage of available RAM
--exec-mem-pool-bytes=80%  # 80% of system RAM

Guidelines:

Write-heavy: 60-70% (leave room for OS cache)
Query-heavy: 80-90% (maximize query memory)
Mixed: 70% (balanced approach)
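
To make the two value forms concrete, here is a small Bash sketch that converts a size in GiB to bytes and estimates how many bytes a percentage covers on a machine of a given size. Both function names are hypothetical. The percentage estimate assumes the percentage is taken of total system RAM, as the comment above describes; how influxdb3 resolves percentages internally is not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for sizing --exec-mem-pool-bytes.

# gib_to_bytes <GiB>: binary gigabytes to bytes
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# pct_of_ram_bytes <percent> <total_ram_gib>: bytes covered by a percentage
# Assumption: the percentage applies to total system RAM.
pct_of_ram_bytes() {
  echo $(( $(gib_to_bytes "$2") * $1 / 100 ))
}

gib_to_bytes 8            # 8589934592, the absolute value shown above
pct_of_ram_bytes 80 64    # 80% of a 64 GiB machine, in bytes
```

Comparing the two outputs shows how much larger a percentage-based pool is than a fixed value on the same host.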

Parquet cache configuration

Cache frequently accessed Parquet files:

# Set cache size
--parquet-mem-cache-size=2147483648  # 2GB

# Configure cache behavior
--parquet-mem-cache-prune-interval=1m \
--parquet-mem-cache-prune-percentage=20

WAL and snapshot tuning

Control memory pressure from write buffers:

# Force snapshot when memory usage exceeds threshold
--force-snapshot-mem-threshold=80%

# Configure WAL flush interval and snapshot size
--wal-flush-interval=10s \
--wal-snapshot-size=100  # number of WAL files per snapshot, not a byte size

Advanced tuning options

{{% show-in "enterprise" %}}
Specialized cluster nodes

For performance optimizations using dedicated ingest, query, compaction, or processing nodes, see Configure specialized cluster nodes. {{% /show-in %}}

For less common performance optimizations and detailed configuration options, see:

DataFusion engine tuning

Advanced DataFusion runtime parameters:

--datafusion-config

HTTP and network tuning

Request size and network optimization:

--max-http-request-size - For large batches (default: 10 MB)
--http-bind - Bind address

Object store optimization

Performance tuning for cloud object stores:

--object-store-connection-limit - Connection pool size
--object-store-max-retries - Retry configuration
--object-store-http2-only - Force HTTP/2

Complete configuration reference

For all available configuration options, see:

Monitoring and validation

Monitor thread utilization

# Linux: View per-thread CPU usage
top -H -p $(pgrep influxdb3)

# Monitor specific threads
watch -n 1 "ps -eLf | grep influxdb3 | head -20"

Check performance metrics

Monitor key indicators:

-- Query system.threads table (Enterprise)
SELECT * FROM system.threads
WHERE cpu_usage > 90
ORDER BY cpu_usage DESC;

-- Check write throughput
SELECT
  count(*) as points_written,
  max(time) - min(time) as time_range
FROM your_measurement
WHERE time > now() - INTERVAL '1 minute';

Validate configuration

Verify your tuning changes:

# List the thread options and their defaults (help text, not a running server's values)
influxdb3 serve --help-all | grep -E "num-io-threads|num-datafusion-threads"

# Monitor memory usage
free -h
watch -n 1 "free -h"

# Check IO wait
iostat -x 1

Common performance issues

High write latency

Symptoms:

Increasing write response times
Timeouts from write clients
Points dropped or rejected

Solutions:

Increase IO threads: --num-io-threads=16
Reduce batch sizes in writers
Increase WAL flush frequency
Check disk IO performance

Slow query performance

Symptoms:

Long query execution times
High memory usage during queries
Query timeouts

Solutions:

{{% show-in "core" %}}
1. Increase execution memory pool: --exec-mem-pool-bytes=90%
2. Enable Parquet caching: --parquet-mem-cache-size-mb=4096
3. Optimize query patterns (smaller time ranges, fewer fields)
{{% /show-in %}}

{{% show-in "enterprise" %}}
1. Increase DataFusion threads: --num-datafusion-threads=30
2. Increase execution memory pool: --exec-mem-pool-bytes=90%
3. Enable Parquet caching: --parquet-mem-cache-size=4GB
4. Optimize query patterns (smaller time ranges, fewer fields)
{{% /show-in %}}

Memory pressure

Symptoms:

Out of memory errors
Sustained high memory usage
System swapping

Solutions:

Reduce execution memory pool: --exec-mem-pool-bytes=60%
Lower snapshot threshold: --force-snapshot-mem-threshold=70%
Decrease cache sizes
Add more RAM or reduce workload

CPU bottlenecks

Symptoms:

100% CPU utilization
Uneven thread pool usage
Performance plateaus

Solutions:

Rebalance thread allocation based on workload
Add more CPU cores
Optimize client batching
{{% show-in "enterprise" %}}Distribute workload across specialized nodes{{% /show-in %}}
