When merging streams of system iterators we don't use tags or time.
Instead we add series keys (in the case of, for example, `SHOW SERIES`)
to the `Aux` field of the iterators' elements. This is because we only
emit merged and sorted sets of series key to the client.
We currently use `SortedMergeHeap`s to merge together multiple
iterators, and the comparitor function did not consider `Aux` fields
when determining which heap to pop the next item off during a merge. As
such, `SHOW SERIES` and `SHOW TAG KEYS` (any meta query that gets
converted into a special type of `SELECT`) were returning results in
arbitrary order.
This issue was never noticed on the `inmem` index because the streams
are always duplicates of each other, and of course it doesn't matter if
you arbitrarily merge together two idential, sorted streams...
The issue first manifested itself on the `tsi1` index, but this fix will
apply to both indexes.
Other applications or services sometimes expose a header containing a
unique ID, which can then be included in logging or response information
to allow an operator to link inter-service requests. The most common
header name used by services in the wild appears to be `X-Request-ID`,
but `Request-Id` is also used.
This commit adds support for specifying either `X-Request-ID` or
`Request-Id` headers, which will then be used by InfluxDB when logging
request information, and also in the `X-Request-ID` and `Request-Id`
response headers.
We populate both `X-Request-ID` and `Request-Id` to maintain backwards
compatibility with previous version, and to support the more common
`X-Request-ID` header name.
If both `X-Request-ID` and `Request-Id` are specified, then
`X-Request-ID` is used.
If neither header is specified, then in line with previous behaviour, we
generate a v1 UUID.
If a large compaction was running and was aborted. It could would leave
some tmp files around for files that it had fully written. The current
active file was cleaned up, but already completed ones would not. This
would occur when a TSM file needed to rollover due to size.
The ConditionExpr function is more accurate because it parses the
condition and ensures that time conditions are actually used correctly.
That means that attempting to combine conditions with OR will not result
in the query silently pretending it's an AND and nested conditions work
correctly so there is only one way to read the query.
It also extracts the non-time conditions into a separate condition so we
can stop attempting to parse around the time conditions in lower layers
of the storage engine. This change does not remove those hacks, but a
following commit should be able to sanitize the condition and remove
them.
This commit provides more insight into server errors by both setting
the error on a response header, and, in the case of server errors (5xx),
logging those error messages to the HTTPD log, if [http] log_enabled =
true.
Previously pseudo iterators could be created for meta data such
as series, measurement, and tag data. These iterators were created
at a higher level and lacked a lot of the power of the query engine.
This commit moves system iterators down to the series level and
supports the following:
- _name
- _seriesKey
- _tagKey
- _tagValue
- _fieldKey
These can be used as normal fields such as:
SELECT _seriesKey FROM cpu
This will return all the series keys for `cpu`.
There was a change to speed up deleting and dropping measurements
that executed the deletes in parallel for all shards at once. #7015
When TSI was merged in #7618, the series keys passed into Shard.DeleteMeasurement
were removed and were expanded lower down. This causes memory to blow up
when a delete across many shards occurs as we now expand the set of series
keys N times instead of just once as before.
While running the deletes in parallel would be ideal, there have been a number
of optimizations in the delete path that make running deletes serially pretty
good. This change just limits the concurrency of the deletes which keeps memory
more stable.
When snapshots and compactions are disabled, the check to see if
the compaction should be aborted occurs in between writing to the
next TSM file. If a large compaction is running, it might take
a while for the file to be finished writing causing long delays.
This now interrupts compactions while iterating over the blocks to
write which allows them to abort immediately.
When a time zone shift happened at the edge of a time bucket, the logic
was incorrect. When we added an hour to the day with a time grouping of
2 hours, we took the hour away from the previous bucket instead of the
next bucket so the first bucket of the day would be from midnight to 1
AM and the second bucket would include 1 AM to 4 AM (2 AM to 3 AM does
not exist when shifting forwards). The correct buckets should have been
12 AM to 2 AM for the first bucket of the day and 3 AM to 4 AM for the
second (since 2 AM to 3 AM does not exist).
This also modifies the tests to use table tests and sub tests.
This calculates the start and end time along with any time zones shifts
so that continuous queries are run at the correct time when a time zone
is included in the query.
Removing the forced `Connection: close` header from the `/query`
endpoint. This was originally added because of golang/go#13165, but it
seems like it's possible to use pipelining with go 1.8 and http 1.1,
just not recommended.
After some testing, it appears that the channel returned by
`ResponseWriter.CloseNotify()` will not send a value if the connection
was not interrupted. We already account for this in /query by exiting
from the goroutine if the request has finished by signaling another
channel.
Since the handler already accounts for the possibility that the channel
will not signal and since `CloseNotify()` doesn't interfere with a
pipelined request, we can remove the forced `Connection: close` that was
added to force clients to establish a new connection.
The Go timestamp leads Truncate to start a week on Monday, but the query
engine uses unix timestamps which has the week start on a Thursday.
Updating the service so it uses a custom truncate method that uses the
unix timestamp instead of `time.Time`.
Fixes#8569.
The sortedSeriesIds slice was not getting reset to 0 which caused
the same series ids to exist in the slice more than once. Since
the size of the slice never matched the size of the seriesID map,
it kept appendending to the slice and sorting it which cause multiple
cursor to get created for the same series.
Fixes#8531