fix(influxdb3): document exec-mem-pool-bytes usage for data persistence (#6394)

* fix(influxdb3): document exec-mem-pool-bytes usage for data persistence

Updates documentation to reflect that exec-mem-pool-bytes is used for both
query processing and parquet persistence operations, not just queries.

Based on source code analysis showing the memory pool is used by:
- Query executor for processing queries
- Persister for converting WAL data to Parquet format

Changes:
- Updated config description to include "data operations"
- Added memory usage info to durability docs for Parquet storage
- Added troubleshooting section for memory-related write performance
- Fixed capitalization of "object store" throughout

Addresses DAR #499

Source analysis: influxdb3/src/commands/serve.rs:772-798
Shows separate executors for queries and write path operations,
both using memory pools for data processing.

* fix(influxdb3): broken links to no-sync
Jason Stirnaman 2025-11-05 12:05:13 -05:00 committed by GitHub
parent df06c64fb5
commit 7f2178afa5
GPG Key ID: B5690EEEBB952194
3 changed files with 48 additions and 4 deletions

@@ -1088,7 +1088,9 @@ Defines the address on which InfluxDB serves HTTP API requests.
#### exec-mem-pool-bytes
-Specifies the size of memory pool used during query execution.
+Specifies the size of the memory pool used for query processing and data operations.
+This memory pool is used when {{% product-name %}} processes queries and performs
+internal data management tasks.
Can be given as an absolute value in bytes or as a percentage of the total available memory--for
example: `8000000000` or `10%`.

@@ -19,14 +19,14 @@ As written data moves through {{% product-name %}}, it follows a structured path
- **Process**: InfluxDB validates incoming data before accepting it into the system.
- **Impact**: Prevents malformed or unsupported data from entering the database.
-- **Details**: The database validates incoming data and stores it in the write buffer (in memory). If [`no_sync=true`](#no-sync-write-option), the server sends a response to acknowledge the write.
+- **Details**: The database validates incoming data and stores it in the write buffer (in memory). If `no_sync=true`, the server sends a response to acknowledge the write [without waiting for persistence](/influxdb3/version/reference/cli/influxdb3/write/#write-line-protocol-and-immediately-return-a-response).
### Write-ahead log (WAL) persistence
- **Process**: The database flushes the write buffer to the WAL every second (default).
- **Impact**: Ensures durability by persisting data to object storage.
- **Tradeoff**: More frequent flushing improves durability but increases I/O overhead.
-- **Details**: Every second (default), the database flushes the write buffer to the Write-Ahead Log (WAL) for persistence in the Object store. If [`no_sync=false`](#no-sync-write-option) (default), the server sends a response to acknowledge the write.
+- **Details**: Every second (default), the database flushes the write buffer to the Write-Ahead Log (WAL) for persistence in the object store. If `no_sync=false` (default), the server sends a response to acknowledge the write.
### Query availability
@@ -40,7 +40,8 @@ As written data moves through {{% product-name %}}, it follows a structured path
- **Process**: Every ten minutes (default), data is persisted to Parquet files in object storage.
- **Impact**: Provides durable, long-term storage.
- **Tradeoff**: More frequent persistence reduces reliance on the WAL but increases I/O costs.
-- **Details**: Every ten minutes (default), the {{% product-name %}} persists the oldest data from the queryable buffer to the Object store in Parquet format, and keeps the remaining data (the most recent 5 minutes) in memory.
+- **Memory usage**: The persistence process uses memory from the configured memory pool ([`exec-mem-pool-bytes`](/influxdb3/version/reference/config-options/#exec-mem-pool-bytes)) when converting data to Parquet format. For write-heavy workloads, ensure adequate memory is allocated.
+- **Details**: Every ten minutes (default), {{% product-name %}} persists the oldest data from the queryable buffer to the object store in Parquet format, and keeps the remaining data (the most recent 5 minutes) in memory.
### In-memory cache

@@ -6,6 +6,7 @@ Learn how to avoid unexpected results and recover from errors when writing to
- [Review HTTP status codes](#review-http-status-codes)
- [Troubleshoot failures](#troubleshoot-failures)
- [Troubleshoot rejected points](#troubleshoot-rejected-points)
{{% show-in "core,enterprise" %}}- [Troubleshoot write performance issues](#troubleshoot-write-performance-issues){{% /show-in %}}
## Handle write responses
@@ -65,3 +66,43 @@ InfluxDB rejects points that don't match the schema of existing data.
Check for [field data type](/influxdb3/version/reference/syntax/line-protocol/#data-types-and-format)
differences between the rejected data point and points within the same
database--for example, did you attempt to write `string` data to an `int` field?
{{% show-in "core,enterprise" %}}
## Troubleshoot write performance issues
If you experience slow write performance or timeouts during high-volume ingestion,
consider the following:
### Memory configuration
{{% product-name %}} uses memory for both query processing and internal data operations,
including converting data to Parquet format during persistence.
For write-heavy workloads, insufficient memory allocation can cause performance issues.
**Symptoms of memory-related write issues:**
- Slow write performance during data persistence (typically every 10 minutes)
- Increased response times during high-volume ingestion
- Memory-related errors in server logs
**Solutions:**
- Increase the [`exec-mem-pool-bytes`](/influxdb3/version/reference/config-options/#exec-mem-pool-bytes)
configuration to allocate more memory for data operations.
For write-heavy workloads, consider setting this to 30-40% of available memory.
- Monitor memory usage during peak write periods to identify bottlenecks.
- Adjust the [`gen1-duration`](/influxdb3/version/reference/config-options/#gen1-duration)
to control how frequently data is persisted to Parquet format.
### Example configuration for write-heavy workloads
```bash { placeholders="PERCENTAGE" }
influxdb3 serve \
--exec-mem-pool-bytes PERCENTAGE \
--gen1-duration 15m \
# ... other options
```
Replace {{% code-placeholder-key %}}`PERCENTAGE`{{% /code-placeholder-key %}} with the percentage
of available memory to allocate (for example, `35%` for write-heavy workloads).
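
Because `exec-mem-pool-bytes` also accepts an absolute value in bytes, you may prefer to compute the byte count yourself. The following is a minimal sketch, not part of the `influxdb3` CLI: `percent_to_bytes` is a hypothetical helper, and reading `MemTotal` from `/proc/meminfo` assumes a Linux host.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not provided by influxdb3): converts a percentage
# of a total memory size into an absolute byte count.
# Usage: percent_to_bytes TOTAL_BYTES PERCENT
percent_to_bytes() {
  local total_bytes=$1 percent=$2
  # Integer arithmetic; the result is rounded down.
  echo $(( total_bytes * percent / 100 ))
}

# Assumption: a Linux host, where /proc/meminfo reports MemTotal in KiB.
if [ -r /proc/meminfo ]; then
  total_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  pool_bytes=$(percent_to_bytes $(( total_kib * 1024 )) 35)
  # Print the flag only; combine it with your other serve options.
  echo "--exec-mem-pool-bytes ${pool_bytes}"
fi
```

The script only prints the flag with the computed value; it does not start the server. Passing the percentage directly (for example, `35%`) remains the simpler choice when you do not need a fixed byte count.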
{{% /show-in %}}