The buffer allocation in bz1 was unused, and I'm fairly certain it would have hurt performance if it had been used. For queries that run through a bz1 block, holding on to a 64KB block is expensive. It's better to churn on the allocator and have blocks released when they are unused than to keep 64KB around for each series regardless of size.
Thanks to @jwilder for brainstorming this issue with me.
We no longer insert the `upgrade_artifiact`. Instead, we now use the entire series name as the measurement name.
Add info on throttling as well as status messages.
* Update the store to remove the WAL directories associated with a shard or database when they are deleted.
* Fix the Store so that it creates separate WAL directories for databases and retention policies.
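For illustration only, a minimal Go sketch of the layout this implies: WAL data namespaced by database and retention policy, so dropping a shard or database becomes a directory removal. The walShardPath helper and the paths are hypothetical, not the actual Store API.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// walShardPath is a hypothetical helper showing a WAL layout namespaced by
// database and retention policy, so deleting a database or shard can simply
// remove its own subtree.
func walShardPath(walDir, database, retentionPolicy string, shardID uint64) string {
	return filepath.Join(walDir, database, retentionPolicy, fmt.Sprintf("%d", shardID))
}

func main() {
	p := walShardPath("/var/lib/influxdb/wal", "mydb", "default", 1)
	fmt.Println(p) // /var/lib/influxdb/wal/mydb/default/1

	// Dropping the shard (or the whole database) becomes a directory removal.
	_ = os.RemoveAll(p)
}
```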
This func shows up in profiling. It's called frequently from multiple places and
can be made more efficient. The previous implementation looped over the input
slice four times, updating and returning a new slice each time. This change makes it
loop once and build a single result slice.
With influx_stress
Before:
Wrote 10000000 points at average rate of 241750
Average response time: 187.78968ms
After:
Wrote 10000000 points at average rate of 254618
Average response time: 172.235028ms
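A sketch of the single-pass rewrite described above. The escape characters and function names are illustrative, not the actual function in the codebase; the point is replacing several full passes (each allocating a new slice) with one loop into one result slice.

```go
package main

import (
	"bytes"
	"fmt"
)

// Before (illustrative): one full pass and one new slice per escape character.
func unescapeMultiPass(in []byte) []byte {
	for _, c := range []string{`,`, `"`, ` `, `=`} {
		in = bytes.Replace(in, []byte(`\`+c), []byte(c), -1)
	}
	return in
}

// After (illustrative): a single pass appending into one result slice.
func unescapeSinglePass(in []byte) []byte {
	out := make([]byte, 0, len(in))
	for i := 0; i < len(in); i++ {
		if in[i] == '\\' && i+1 < len(in) {
			switch in[i+1] {
			case ',', '"', ' ', '=':
				out = append(out, in[i+1])
				i++
				continue
			}
		}
		out = append(out, in[i])
	}
	return out
}

func main() {
	fmt.Printf("%s\n", unescapeSinglePass([]byte(`cpu\,host\=a`))) // cpu,host=a
}
```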
Profiling of writes showed that point.Fields() and point.Name() were called
repeatedly in PointsWriter and the Shard. These calls are somewhat expensive
when writing large batches, so we cache their results to avoid wasting CPU cycles.
Using influx_stress with default settings
Before:
Wrote 10000000 points at average rate of 202570
Average response time: 235.450355ms
After:
Wrote 10000000 points at average rate of 246120
Average response time: 182.881008ms
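A minimal sketch of the caching idea, assuming a much-simplified point type; the fields and parse helpers here are hypothetical, not the real Point implementation.

```go
package main

import "fmt"

// point is a simplified stand-in for the real Point type; the caching idea is
// what matters, not these exact fields or methods.
type point struct {
	key          []byte
	data         []byte
	cachedName   string
	cachedFields map[string]interface{}
}

// Name parses the measurement name once and reuses it on later calls.
func (p *point) Name() string {
	if p.cachedName == "" {
		p.cachedName = parseName(p.key) // hypothetical parse helper
	}
	return p.cachedName
}

// Fields parses the field data once and reuses the map on later calls.
func (p *point) Fields() map[string]interface{} {
	if p.cachedFields == nil {
		p.cachedFields = parseFields(p.data) // hypothetical parse helper
	}
	return p.cachedFields
}

func parseName(key []byte) string { return string(key) }

func parseFields(data []byte) map[string]interface{} {
	return map[string]interface{}{"value": string(data)}
}

func main() {
	p := &point{key: []byte("cpu"), data: []byte("42")}
	// The parse happens on the first call; later calls hit the cache.
	fmt.Println(p.Name(), p.Fields())
}
```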
This commit changes the bz1 append to check for a small
trailing block first. If that block is below the block-size
threshold, it is rewritten with the new data
points instead of writing a new block.
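A rough sketch of the append logic described above, under the assumption of a simplified in-memory block representation; the function and its inputs are illustrative, not the actual bz1 engine code or on-disk format.

```go
package main

import "fmt"

// appendBlock sketches the idea behind the change: if the last block is
// smaller than the block-size threshold, the new points are folded into it
// and it is rewritten as a single block; otherwise a new block is appended.
func appendBlock(blocks [][]byte, newPoints []byte, blockSize int) [][]byte {
	if n := len(blocks); n > 0 && len(blocks[n-1]) < blockSize {
		// Rewrite the small trailing block with the new points merged in.
		merged := append(append([]byte{}, blocks[n-1]...), newPoints...)
		blocks[n-1] = merged
		return blocks
	}
	// The trailing block is already full-sized; write a new block.
	return append(blocks, newPoints)
}

func main() {
	blocks := [][]byte{[]byte("tiny")}
	blocks = appendBlock(blocks, []byte("+more"), 4096)
	fmt.Printf("%d block(s), last = %q\n", len(blocks), blocks[len(blocks)-1])
	// 1 block(s), last = "tiny+more"
}
```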
If a flush is happening and you bring up a cursor for a series, and that series didn't have any data in the cache (after the flush started), the cursor would return no data. What it should have done instead is return the data in the flush cache, which is held in a separate area of memory until it is committed to the index.
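A minimal sketch of the fix's intent, assuming a much-simplified cache layout: the cursor must consult the flush cache as well as the live cache. All names here are hypothetical, not the actual WAL types.

```go
package main

import "fmt"

// engine is a stripped-down stand-in: "cache" holds new writes, and
// "flushCache" holds values that are mid-flush and not yet in the index.
type engine struct {
	cache      map[string][]string
	flushCache map[string][]string
}

// Cursor must consider both maps; returning only e.cache[key] would drop the
// points sitting in the flush cache while a flush is in progress.
func (e *engine) Cursor(key string) []string {
	values := append([]string{}, e.flushCache[key]...)
	return append(values, e.cache[key]...)
}

func main() {
	e := &engine{
		cache:      map[string][]string{},                   // nothing written since the flush began
		flushCache: map[string][]string{"cpu": {"v1", "v2"}}, // data currently being flushed
	}
	fmt.Println(e.Cursor("cpu")) // [v1 v2] -- visible even though the live cache is empty
}
```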