The new algorithm uses only one formula and needs no additional bias corrections for the entire range of cardinalities;
therefore, it is more efficient and simpler to implement. Our simulations show that the accuracy provided by the new
algorithm is as good as or better than the accuracy provided by either HyperLogLog or HyperLogLog++. The sparse
representation was kept to provide better low-cardinality accuracy. However, the linear counting and range estimations
have been replaced.
This fixes the case where log files are compacted out of order
and cause non-contiguous sets of index files to be compacted.
Previously, the compaction planner would fetch a list of index files
for each level and compact them in order starting with the oldest
ones. This can be a problem for level 1 because level 0 files (log
files) are compacted individually, and in some cases a log file can
finish compacting before older log files have finished. This leaves a
gap in the list of level 1 files that is ignored when fetching a list
of index files.
Now, the planner reads the list of index files starting from the
oldest but stops once it hits a log file. This prevents that gap
from being ignored.
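The stop-at-first-log-file rule can be sketched as follows. This is a minimal illustration, not the planner's actual code: the `file` type, its `IsLog` field, and `contiguousIndexFiles` are hypothetical names, and the input is assumed to be ordered oldest first.

```go
package main

import "fmt"

// file is a hypothetical stand-in for an on-disk file tracked by the planner.
type file struct {
	ID    int
	IsLog bool // true for a level 0 log file, false for an index file
}

// contiguousIndexFiles returns the leading run of index files in files,
// which is assumed to be ordered oldest first. It stops at the first log
// file, so index files that sit behind a still-compacting log file are
// not grouped with the older ones.
func contiguousIndexFiles(files []file) []file {
	var out []file
	for _, f := range files {
		if f.IsLog {
			break
		}
		out = append(out, f)
	}
	return out
}

func main() {
	// File 3 is a log file that has not finished compacting yet.
	files := []file{{1, false}, {2, false}, {3, true}, {4, false}}
	for _, f := range contiguousIndexFiles(files) {
		fmt.Println(f.ID)
	}
}
```

With this input only files 1 and 2 are selected; file 4 is left for a later pass, once file 3 has been compacted.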
* off by default, enabled by `query-stats-enabled`
* writes to cq_query measurement of configured monitor database
* see CHANGELOG for schema of individual points
This check was previously in a different section of code which
was lost during a refactor to the new compaction strategy. The
compaction planning now makes a check to ensure at least two
files are available for compaction in a level.
When `SELECT ... INTO ...` is used with a `top()` or `bottom()` call that
selects tags, the points are written with the tags still intact instead
of being converted to fields.
The previous version of `top()` and `bottom()` would gather all of the
points to use in a slice, filter them (if necessary), then use a
slightly modified heap sort to retrieve the top or bottom values.
This performed horrendously from the standpoint of memory. Since it
consumed so much memory and spent so much time in allocations (along
with sorting a potentially very large slice), this affected speed too.
These calls have now been modified so they keep the top or bottom points
in a min or max heap. For `top()`, each new point is compared against the
minimum point in the heap. If the new point is greater than the minimum
point, it replaces the minimum point and the heap is fixed up around the
new value. If the new point is smaller, it is discarded. For `bottom()`,
the process is the opposite.
It will then sort the final result to ensure the correct ordering of the
selected points.
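The bounded-heap approach described above can be sketched with the standard `container/heap` package. This is an illustration under assumptions, not the query engine's code: the `point` type and the `top` function are hypothetical, and the final sort here orders by descending value, which the text does not specify.

```go
package main

import (
	"container/heap"
	"fmt"
	"sort"
)

// point is a hypothetical, simplified point with only a time and a value.
type point struct {
	Time  int64
	Value float64
}

// minHeap keeps the point with the smallest value at index 0.
type minHeap []point

func (h minHeap) Len() int            { return len(h) }
func (h minHeap) Less(i, j int) bool  { return h[i].Value < h[j].Value }
func (h minHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *minHeap) Push(x interface{}) { *h = append(*h, x.(point)) }
func (h *minHeap) Pop() interface{} {
	old := *h
	n := len(old)
	p := old[n-1]
	*h = old[:n-1]
	return p
}

// top returns the n points with the largest values. Memory use is bounded
// by n rather than by the number of input points.
func top(points []point, n int) []point {
	if n <= 0 {
		return nil
	}
	h := make(minHeap, 0, n)
	for _, p := range points {
		if len(h) < n {
			heap.Push(&h, p)
			continue
		}
		// Replace the current minimum only if the new point beats it.
		if p.Value > h[0].Value {
			h[0] = p
			heap.Fix(&h, 0)
		}
	}
	// Sort the survivors; descending value is an assumption of this sketch.
	out := []point(h)
	sort.Slice(out, func(i, j int) bool { return out[i].Value > out[j].Value })
	return out
}

func main() {
	pts := []point{{1, 3}, {2, 9}, {3, 1}, {4, 7}, {5, 5}}
	for _, p := range top(pts, 2) {
		fmt.Println(p.Time, p.Value)
	}
}
```

`bottom()` would mirror this with a max heap and the comparison reversed.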
When `top()` or `bottom()` contain a tag to select, they have now been
modified so this query:
SELECT top(value, host, 2) FROM cpu
Essentially becomes this query:
SELECT top(value, 2), host FROM (
    SELECT max(value) FROM cpu GROUP BY host
)
This should drastically increase the performance of all `top()` and
`bottom()` queries.
Points written outside of a retention policy's range were silently
dropped. They are dropped to prevent creating a shard that would be
immediately deleted, but the drop was silent and no error response was
returned to the caller.
Fixes #8392
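One way to surface the drop to the caller is sketched below. It assumes the fix is to return an error alongside the successful write; the error value, the `point` type, and `writePoints` are hypothetical names, and timestamps are plain Unix nanoseconds for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// errPointsDropped is a hypothetical error value; the real message may differ.
var errPointsDropped = errors.New("points beyond retention policy dropped")

// point is a simplified point carrying only a Unix-nanosecond timestamp.
type point struct{ Time int64 }

// writePoints skips points older than now-retention and reports the drop
// to the caller instead of staying silent. A retention of 0 means "keep
// forever". It returns the number of points that would be written.
func writePoints(points []point, now, retention int64) (int, error) {
	if retention <= 0 {
		return len(points), nil
	}
	cutoff := now - retention
	kept, dropped := 0, 0
	for _, p := range points {
		if p.Time < cutoff {
			dropped++ // writing it would create a shard that is deleted at once
			continue
		}
		kept++ // a real implementation would map the point to a shard here
	}
	if dropped > 0 {
		return kept, fmt.Errorf("%w: dropped=%d", errPointsDropped, dropped)
	}
	return kept, nil
}

func main() {
	const hour = int64(3600e9) // one hour in nanoseconds
	now := 10 * hour
	pts := []point{{Time: now - 5*hour}, {Time: now - hour/2}}
	n, err := writePoints(pts, now, hour)
	fmt.Println(n, err)
}
```

Wrapping the sentinel with `%w` lets callers detect the condition with `errors.Is` while still seeing how many points were dropped.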
Currently, when debugging issues with InfluxDB we often ask for the
following profiles:
curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1"
curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1"
curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
curl -o cpu.txt "http://localhost:8086/debug/pprof/profile"
This can be bothersome for users, or even difficult if they're
unfamiliar with cURL (or it's not on their system).
This commit adds a new endpoint: /debug/pprof/all which will return a
single compressed archive of all of the above profiles. The CPU profile
is optional, and not returned by default. To include a CPU profile the
URL to request should be: /debug/pprof/all?cpu=true. It's also possible
to vary the length of the CPU profile by adding a `seconds=x` parameter,
where x defaults to 30, if absent.
The new command for gathering profiles from users should now be:
curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all"
Or, if we need to see a CPU profile:
curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all?cpu=true"
It's important to remember that a CPU profile is a blocking operation
and by default it will take 30 seconds for the response to be returned
to the user.
Finally, if the user is unfamiliar with cURL, they will now be able to
visit http://localhost:8086/debug/pprof/all in a web browser, and the
archive will be downloaded to their machine.
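A handler of this kind could be sketched roughly as below, using only the Go standard library. This is not the handler added by the commit: `handleAllProfiles`, `addFile`, `archiveNames`, and the file names inside the archive are assumptions made for illustration.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"runtime/pprof"
	"strconv"
	"time"
)

// addFile appends one in-memory file to the tar archive.
func addFile(tw *tar.Writer, name string, data []byte) error {
	hdr := &tar.Header{Name: name, Mode: 0600, Size: int64(len(data))}
	if err := tw.WriteHeader(hdr); err != nil {
		return err
	}
	_, err := tw.Write(data)
	return err
}

// handleAllProfiles is a hypothetical handler that streams a gzipped tar
// archive of the block, goroutine, and heap profiles, plus an optional
// CPU profile when cpu=true is set.
func handleAllProfiles(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/gzip")
	w.Header().Set("Content-Disposition", `attachment; filename="profiles.tar.gz"`)

	gz := gzip.NewWriter(w)
	defer gz.Close() // runs after tw.Close, flushing the compressed stream
	tw := tar.NewWriter(gz)
	defer tw.Close()

	for _, name := range []string{"block", "goroutine", "heap"} {
		var buf bytes.Buffer
		// debug=1 matches the text form requested by the curl commands above.
		if err := pprof.Lookup(name).WriteTo(&buf, 1); err != nil {
			return
		}
		if err := addFile(tw, name+".txt", buf.Bytes()); err != nil {
			return
		}
	}

	if r.FormValue("cpu") != "true" {
		return
	}
	seconds, err := strconv.Atoi(r.FormValue("seconds"))
	if err != nil || seconds <= 0 {
		seconds = 30 // default CPU profile length
	}
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return
	}
	time.Sleep(time.Duration(seconds) * time.Second) // blocks the response
	pprof.StopCPUProfile()
	_ = addFile(tw, "cpu.txt", buf.Bytes())
}

// archiveNames calls the handler in-process and lists the archive entries.
func archiveNames(target string) ([]string, error) {
	rec := httptest.NewRecorder()
	handleAllProfiles(rec, httptest.NewRequest("GET", target, nil))
	gz, err := gzip.NewReader(rec.Body)
	if err != nil {
		return nil, err
	}
	tr := tar.NewReader(gz)
	var names []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

func main() {
	names, err := archiveNames("/debug/pprof/all")
	fmt.Println(names, err)
}
```

The `Content-Disposition` header is what makes a browser download the archive rather than try to render it.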
When a client requests `/debug/requests`, the request blocks for 30
seconds (configurable by specifying `seconds=` in the query parameters)
while the HTTP handler tracks every incoming query and write request to
the system. After that time period has passed, it outputs a JSON blob,
very similar to that of `/debug/vars`, that shows every IP address and
user account (if authentication is used) that connected to the host
during that time.
In the future, we can add more metrics to track. This is a starting
point to aid with debugging machines that connect too often by looking
at a sample of time (like `/debug/pprof`).
Tombstone files would be written to all TSM files even if the deleted
keys or time range did not exist in the TSM file. This had the side
effect of causing shards to get recompacted back to the same state. If
any shards or large numbers of TSM files existed, disk usage and CPU
utilization would spike causing issues.
This prevents tombstones from being written for TSM files that could
not possibly contain the series keys being deleted, or when the deleted
time range is outside the range of the file.
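The two skip conditions can be sketched as a single overlap check. The `tsmFile` type and `mayContain` method are hypothetical; the sketch assumes each file exposes its minimum and maximum series key and its minimum and maximum timestamp.

```go
package main

import "fmt"

// tsmFile is a hypothetical summary of a TSM file: the range of series
// keys it holds and the time range its data covers.
type tsmFile struct {
	MinKey, MaxKey   string
	MinTime, MaxTime int64
}

// mayContain reports whether the file could hold data for any of keys
// within [minT, maxT]. A false result means no tombstone is needed.
func (f tsmFile) mayContain(keys []string, minT, maxT int64) bool {
	// The deleted time range lies entirely outside the file's range.
	if maxT < f.MinTime || minT > f.MaxTime {
		return false
	}
	// At least one key must fall inside the file's key range.
	for _, k := range keys {
		if k >= f.MinKey && k <= f.MaxKey {
			return true
		}
	}
	return false
}

func main() {
	f := tsmFile{MinKey: "cpu,host=a", MaxKey: "cpu,host=m", MinTime: 100, MaxTime: 200}
	fmt.Println(f.mayContain([]string{"cpu,host=z"}, 100, 200)) // key out of range
	fmt.Println(f.mayContain([]string{"cpu,host=b"}, 300, 400)) // time out of range
	fmt.Println(f.mayContain([]string{"cpu,host=b"}, 150, 400)) // overlaps
}
```

A key inside the range is only a possible match, so the check can skip files safely but never proves that a key is present.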
This change refactors the subquery code into a separate builder class to
help allow for more reuse and make the functions smaller and easier to
read.
The previous function that handled most of the code was too big and
impossible to reason through.
This also goes and replaces the complicated logic of aggregates that had
a subquery source with the simpler IteratorMapper. I think the overhead
from the IteratorMapper will be higher, but I also believe that the actual
code is simpler, more robust, and more likely to produce accurate answers. It
might be a future project to optimize that section of code, but I don't
have any actual numbers for the efficiency of one method and I believe
accuracy and code clarity may be more important at the moment since I am
otherwise incapable of reading my own code.
When LIMIT and OFFSET were used with any functions that were not handled
directly by the query engine (anything other than count, max, min, mean,
first, or last), the input to the function would be limited instead of
receiving the full stream of values it was supposed to receive.
This also fixes a bug that caused the server to hang when LIMIT and
OFFSET were used with a selector. When using a selector, the limit and
offset should be handled before the points go to the auxiliary iterator
to be split into different iterators. Limiting happened afterwards which
caused the auxiliary iterator to hang forever.
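The ordering issue in the first paragraph can be shown with a toy example in which the function sees every input value and LIMIT and OFFSET apply only to its output rows. All names here are hypothetical, and `sum` merely stands in for any function not handled directly by the query engine.

```go
package main

import "fmt"

// limitOffset skips offset values and returns at most limit of the rest.
// A limit of 0 means "no limit"; offset is assumed to be non-negative.
func limitOffset(vals []float64, offset, limit int) []float64 {
	if offset >= len(vals) {
		return nil
	}
	vals = vals[offset:]
	if limit > 0 && limit < len(vals) {
		vals = vals[:limit]
	}
	return vals
}

// sum stands in for any function not handled directly by the query engine.
func sum(vals []float64) float64 {
	var total float64
	for _, v := range vals {
		total += v
	}
	return total
}

// sumPerBucket applies sum to every bucket in full and only then applies
// LIMIT and OFFSET to the resulting rows.
func sumPerBucket(buckets [][]float64, offset, limit int) []float64 {
	out := make([]float64, 0, len(buckets))
	for _, b := range buckets {
		out = append(out, sum(b))
	}
	return limitOffset(out, offset, limit)
}

func main() {
	buckets := [][]float64{{1, 2, 3}, {4, 5}, {6}}
	// Comparable to: SELECT sum(value) ... GROUP BY time(...) LIMIT 1 OFFSET 1
	fmt.Println(sumPerBucket(buckets, 1, 1))
}
```

Applying `limitOffset` to each bucket's input instead would make the second bucket sum to 5 rather than 9, which is the kind of wrong result the fix removes.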
`top()` and `bottom()` will now organize the points by time and also
keep the points' original time even when a time grouping is used. At the
same time, `top()` and `bottom()` will no longer honor any fill options
that are present since they don't really make sense for these specific
functions.
This also fixes the aggregates and selectors to honor the ordered
iterator option so iterators remain ordered and to also respect the
buckets that are created by the final dimensions of the query so that
two buckets don't overlap each other within the same reducer. A test has
been added for this situation. This should clarify and encourage the use
of the ordered attribute within the query engine.