* feat: Add CompactPointsPerBlock config opt
This PR adds an additional parameter for influxd,
CompactPointsPerBlock, and adjusts DefaultAggressiveMaxPointsPerBlock
to 10,000. We had discovered that with the points per block set to
100,000, compacted TSM files were increasing in size. After lowering the
points per block to 10,000 we noticed that the file sizes decreased.
The value is exposed as a parameter that administrators can adjust,
which allows some tuning if compression problems are encountered.
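As a sketch, assuming the option is exposed in the [data] section of
influxdb.conf under the key compact-points-per-block (the key name is an
assumption based on the Go option name), an administrator could tune it
like this:

    [data]
      # Assumed key name; lowers the aggressive compaction points per
      # block from the previous 100,000 default down to 10,000.
      compact-points-per-block = 10000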
* feat: Modify optimized compaction to cover edge cases
This PR changes the compaction algorithm to account for the following
cases that were not previously handled:
- Many generations with a group size over 2 GB
- A single generation with many files and a group size under 2 GB
- Shards that may have a group size over 2 GB but many fragmented
  files (each under 2 GB and under the aggressive points-per-block
  count)
Here, group size means the total size of the TSM files in the shard
directory.
closes https://github.com/influxdata/influxdb/issues/25666
Currently, deletions of series or measurements are
serialized. This new feature adds
max-concurrent-deletes to the [data] section of the
configuration file. Legal values are any positive
number; the default is 1, which preserves the current behavior.
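A minimal sketch of the new setting in influxdb.conf (the value shown
is illustrative, not a recommendation):

    [data]
      # Allow up to 4 series/measurement deletes to run concurrently;
      # the default of 1 keeps the current serialized behavior.
      max-concurrent-deletes = 4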
closes https://github.com/influxdata/influxdb/issues/23054
When a SELECT INTO query generates an illegal value that cannot be inserted,
like +/- Inf, it should return an error rather than failing silently.
This adds a boolean parameter to the [data] section of influxdb.conf:
* strict-error-handling
When false (the default), the old behavior is preserved. When true,
unsupported values will return an error from SELECT INTO queries.
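For example, in influxdb.conf:

    [data]
      # When true, SELECT INTO returns an error for unsupported values
      # such as +/- Inf instead of failing silently.
      strict-error-handling = true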
Fixes https://github.com/influxdata/influxdb/issues/20426
This commit changes `DefaultSeriesIDSetCacheSize` to zero so that the
tag value cache is disabled by default. There is a rare known bug where
the cache can cause a segfault that crashes the process. The cache
is being disabled rather than removed because some users may still need
it for performance reasons.
This commit adds a config option to the tsdb Config allowing the size of
the bitset cached in the TSI index to be specified.
Setting the cache size to 0 will disable the cache.
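A sketch of the setting in the [data] section of influxdb.conf, assuming
the TOML key is series-id-set-cache-size:

    [data]
      # Size of the TSI series ID set (bitset) cache; 0 disables it.
      series-id-set-cache-size = 0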
This PR adds a configuration option that can be used to inform the
kernel that we intend to page in much of the TSM files.
This madvise value has been problematic in the past when it has been set,
so this option defaults to off. It may be useful to some users with slow
disks.
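A sketch of how this could be enabled in influxdb.conf, assuming the key
is tsm-use-madv-willneed:

    [data]
      # Advise the kernel (MADV_WILLNEED) that TSM files will be paged
      # in soon. Off by default; may help on systems with slow disks.
      tsm-use-madv-willneed = true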
This commit adds the `max-index-log-file-size` configuration flag so
that users can restrict the maximum size of log files before compaction.
The default limit was also lowered from `5MB` to `1MB`. The original
size was set before we partitioned the index, so the change reflects this.
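For example, in influxdb.conf (assuming the option accepts a
human-readable size string, as described further below):

    [data]
      # Maximum TSI log file size before it is compacted into an index
      # file. The new default is 1MB (previously 5MB).
      max-index-log-file-size = "1m"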
With the recent changes to compactions and snapshotting, the current
default can create lots of small level 1 TSM files. This increases
the default in order to create larger level 1 files and reduce disk
utilization.
Update support in the `toml` package for parsing human-readable byte sizes.
Supported size suffixes are "k" or "K" for kibibytes, "m" or "M" for
mebibytes, and "g" or "G" for gibibytes. If a size suffix isn't specified
then bytes are assumed.
In the config, `cache-max-memory-size` and `cache-snapshot-memory-size` are
now typed as `toml.Size` and support the new syntax.
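For example, with illustrative values:

    [data]
      # Plain numbers are bytes; "k"/"K", "m"/"M", "g"/"G" suffixes are
      # kibibytes, mebibytes, and gibibytes respectively.
      cache-max-memory-size = "1g"
      cache-snapshot-memory-size = "25m"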
This limit allows the number of concurrent level and full compactions
to be throttled. Snapshot compactions are not affected by this limit
as they need to run continuously.
This limit can be used to control how much CPU is consumed by compactions.
The default is to limit to the number of CPU available.
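A sketch in influxdb.conf, assuming the option is named
max-concurrent-compactions (the value shown is illustrative):

    [data]
      # Cap concurrent level and full compactions to limit CPU usage;
      # snapshot compactions are not affected by this limit.
      max-concurrent-compactions = 4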
Fsyncs to the WAL can cause higher IO with lots of small writes or
slower disks. This reworks the previous WAL fsyncing to remove the
extra goroutine and remove the hard-coded 100ms delay. Writes to
the WAL still maintain the invariant that they do not return to the
caller until the write is fsync'd.
This also adds a new config option, wal-fsync-delay (default 0s),
which can be increased if a delay is desired. This is somewhat useful
for systems with slower disks, but the current default works well as
is.
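For example, on a system with slow disks the delay could be raised from
the 0s default:

    [data]
      # Wait up to this long before fsyncing WAL writes so that syncs
      # can be batched; 0s (the default) fsyncs every write immediately.
      wal-fsync-delay = "100ms"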
These were all b1/bz1 settings that no longer have any effect:
- {Default,}MaxWALSize
- {Default,}WALFlushInterval
- {Default,}WALPartitionFlushDelay
- {Default,WAL}ReadySeriesSize
- {Default,WAL}CompactionThreshold
- {Default,WAL}MaxSeriesSize
- {Default,WAL}FlushColdInterval
- {Default,WAL}PartitionSizeThreshold