The cache defaulted to entry capacity size of 32. This default
is fine for lower cardinalities, but causes big spikes in InUse
heap with higher cardinalities that can OOM the process. Since
the hints had to be removed previously due to increased memory usage,
they are now completely removed. For lower cardinalities, we do
grow the slice, but this has a small performance penalty compared
to the large memory usage/OOMs with larger cardinalities.
The series file compaction previously did not snapshot the max
offset before compacting and would keep compacting until it reached
the end of segment file. This caused more entries than expected into
the RHH map and this map gets exponentially slower as it gets close
to full.
The string `node <n>` can be used to specify which data node the data
should be retrieved from. This uses the `node_id=X` query parameter that
is supported, but wasn't exposed anywhere in the client library.
We use this feature enough internally when attempting to find
inconsistencies or network errors that it is easier if this is just
supported. Otherwise, I continue having to recompile the CLI program
every time I need to do this.
To clear a previously set node, you can use `node 0` or `node clear`.
There is a strange race condition where a query can be killed and finish
at approximately the same time. If this happens, the query gets
retrieved by the killing task, the query gets closed by the normal
processing thread, and then the killing task attempts to kill the query
afterwards. Since the close doesn't mark the query as already killed
(since it's not killed, just merely unused), the killing thread attempts
to close the channel again.
Mark the query as killed whenever it is closed to prevent a double close
from happening. This should never cause the status to be erroneously
reported since the query status is removed from the query table within
the same lock scope.