Deduplicate is called from various places in the engine and can cause
a lot of garbage to get created. It first creates a map and then
adds each value to the map in order (1st alloc). It then creates a
new slice (2nd alloc) and appends everything from the map to the slice.
Finally, it sorted the new slice (3rd alloc).
This switches the algorithm to use stable sorting and resuing the existing
slice to avoid allocations.
Due to a bug in compactions, it's possible some blocks may have duplicate
points stored. If those blocks are decoded and re-compacted, an assertion
panic could trigger.
We now dedup those blocks if necessary to remove the duplicate points
and avoid the panic.
This fixes a pathalogical query condition cause by and problematic
structuring of TSM files based on how points were written. The
condition can occur when there are multiple TSM files and a large
number of points are written into the past. The earlier existing
TSM files must also have points in the past and close to the present
causing their time range to eclipse the later files.
When this condition occurs, some queries can spend an excessive amount
of time merge all the overlapping blocks.
The fix was to constrain the window of overlapping blocks based on
the first one we ran into. There was also a simple case in the Merge
where we could skip the binary search path and just append the two
inputs.