95 Commits (c2315b208cda29c2ec44180079e22095eabe611e)

Author SHA1 Message Date
Edd Robinson ed41122ade Pre-allocate map for performance 2016-09-15 18:28:46 +01:00
Jonathan A. Sternberg dc2527ce86 Merge branch '1.0' 2016-08-31 14:45:57 -05:00
Jonathan A. Sternberg 964341eb20 Optimize queries that compare a tag value to an empty string
The behavior for querying tag values with an empty string was originally
fixed in #6283, but it also added a performance problem when the
cardinality of the tag was high. Since a call to `Union()` or `Reject()`
would happen for every series key and it would be called N times for N
cardinality, the comparisons against a blank string were unnecessarily
slow with large memory allocations.

This optimizes these queries so it doesn't use those methods anymore.
Those methods are still useful and used when combining AND and OR
clauses, but they aren't useful when finding the series ids for a single
clause. These methods were unnecessary because the series ids for
the tags were already unique and didn't have to be merged as a set.
2016-08-31 14:03:23 -05:00
Ben Johnson a30f9b6c70 Merge pull request #7196 from benbjohnson/mmap-fix
Fix mmap dereferencing
2016-08-24 10:48:28 -06:00
Ben Johnson cc628a1097
Fix mmap dereferencing
Adds a missing dereference call to `Close()` as well as fixes
a tag copy issue.
2016-08-24 10:48:07 -06:00
Edd Robinson 6cafdbc604 Ensure we don't mutate provided statistics tags 2016-08-24 11:40:13 +01:00
Edd Robinson 90ff713f21 Fix base64 encoding issue in stats
Fixes #7177.
2016-08-22 15:21:31 +01:00
Ben Johnson 65536676a4 Merge pull request #7138 from benbjohnson/optimize-shard-open
Reduce memory allocations in index
2016-08-17 15:27:33 -06:00
Ben Johnson 8aa224b22d
reduce memory allocations in index
This commit changes the index to point to index data in the shards
instead of keeping it in-memory on the heap.
2016-08-16 14:09:00 -06:00
Jonathan A. Sternberg 6b5b24a3e3 Decrement number of measurements only once when deleting the last series from a measurement 2016-08-15 13:57:08 -05:00
Mark Rushakoff f34a7430e3 Fix length of (*DatabaseIndex).SeriesKeys()
Previously, it would return as many empty strings in the first half of
the slice as valid values at the end of the slice.
2016-07-27 16:07:39 -07:00
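The symptom described in this commit matches a common Go pitfall: creating a slice with a non-zero length and then calling `append` on it. The following is a minimal sketch of that pitfall and its fix; `keysBuggy` and `keysFixed` are hypothetical helpers for illustration, not the actual `(*DatabaseIndex).SeriesKeys()` code.

```go
package main

import "fmt"

// keysBuggy mimics the described symptom: the slice starts with len(m)
// empty strings, and append then adds the real keys after them.
func keysBuggy(m map[string]int) []string {
	keys := make([]string, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

// keysFixed starts at length zero with capacity len(m), so append
// fills the slice from index 0 without reallocating.
func keysFixed(m map[string]int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	return keys
}

func main() {
	m := map[string]int{"cpu,host=a": 1, "cpu,host=b": 2}
	fmt.Println(len(keysBuggy(m)), len(keysFixed(m))) // 4 2
}
```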
Jason Wilder c31f0c25b4 Fix duplicate series getting created
There was a race where the same series would get added to the in-memory
index for a measurement more than once.  This would result in the same
series being returned more than once during queries causing duplicate
results.  The issue was that we checked for the series under the read
lock but did not check again under the write lock, leaving a small
window where the series could be added by another goroutine.

We now check for the series under the write lock.

Fixes #6946
2016-07-18 16:46:36 -06:00
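The fix described in this commit is a re-check under the write lock. A minimal sketch of that pattern follows, assuming a simplified index keyed by series key; the `index` type and its fields are hypothetical and much smaller than the real measurement index.

```go
package main

import (
	"fmt"
	"sync"
)

// index is a hypothetical stand-in for a measurement's in-memory series index.
type index struct {
	mu     sync.RWMutex
	seen   map[string]struct{}
	series []string
}

func newIndex() *index {
	return &index{seen: make(map[string]struct{})}
}

// AddSeries reports whether key was newly added.
func (i *index) AddSeries(key string) bool {
	// Fast path: most calls find the series under the cheaper read lock.
	i.mu.RLock()
	_, ok := i.seen[key]
	i.mu.RUnlock()
	if ok {
		return false
	}

	i.mu.Lock()
	defer i.mu.Unlock()
	// Re-check: another goroutine may have added key between RUnlock and Lock.
	if _, ok := i.seen[key]; ok {
		return false
	}
	i.seen[key] = struct{}{}
	i.series = append(i.series, key)
	return true
}

func main() {
	idx := newIndex()
	var wg sync.WaitGroup
	for n := 0; n < 100; n++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			idx.AddSeries("cpu,host=server01")
		}()
	}
	wg.Wait()
	fmt.Println(len(idx.series)) // 1
}
```

Without the second check, two goroutines that both miss under the read lock would each append the same key.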
Jonathan A. Sternberg 837a9804cf Refactoring the monitor service to avoid expvar
Truncate the time interval output of the monitor service to be on even
time intervals rather than on every minute based on the start time. This
normalizes the output from the monitor service.
2016-07-07 11:13:58 -05:00
Jonathan A. Sternberg 497db2a6d3 Removing dead code from every package except influxql
The tsdb package had a substantial amount of dead code related to the
old query engine still in there. It is no longer used, so it was removed
since it was left unmaintained. There is likely still more dead code of
the same kind that wasn't found as part of this code cleanup.

influxql has dead code show up because of the code generation so it is
not included in this pruning.
2016-06-20 22:41:07 -05:00
Ben Johnson 1b94cd2686
optimize SHOW TAG VALUES
This commit optimizes `SHOW TAG VALUES` so that it avoids the
`SELECT` query engine execution and iterator creation. There
are also optimizations to reduce individual memory allocations
and to reduce in-memory heap size by only operating on one
measurement at a time.

Execution time has been reduced to approximately 900ms for
500,000 rows. This is about 2µs per row. Of this time,
approximately 1µs is spent retrieving and sorting the row
and 1µs is spent encoding into JSON and writing to the
response body.
2016-06-06 15:50:53 -06:00
Jason Wilder 579923d95f Fix sporadic write failures with influx_stress
This Unlock was moved which seems to create a deadlock situation
sometimes under high write load.  This deadlock causes writes to
fail with timeouts.
2016-06-01 17:25:47 -06:00
Jason Wilder ff1447202c Reduce lock contention in Measurement.AddSeries 2016-05-27 10:30:08 -06:00
Jason Wilder f1ab89561a Reload series count stat at startup 2016-05-18 15:21:57 -06:00
Jonathan A. Sternberg 23f6a706bb Support cast syntax for selecting a specific type
Casting syntax is done with the PostgreSQL syntax `field1::float` to
specify which type should be used when selecting a field. You can also
do `field1::field` or `tag1::tag` to specify that a field or tag should
be selected.

This makes it possible to select a tag when a field key and a tag key
conflict with each other in a measurement. It also means it's possible
to choose a field with a specific type if multiple shards disagree. If
no types are given, the same ordering for how a type is chosen is used
to determine which type to return.

The FieldDimensions method has been updated to return the data type for
the fields that get returned. The SeriesKeys function has also been
removed since it is no longer needed. SeriesKeys was originally used for
the fill iterator, but then expanded to be used by auxiliary iterators
for determining the channel iterator types. The fill iterator doesn't
need it anymore and the auxiliary types are better served by
FieldDimensions implementing that functionality, so SeriesKeys is no
longer needed.

Fixes #6519.
2016-05-16 12:08:29 -04:00
Jonathan A. Sternberg a17f3d960a SHOW TAG VALUES accepts != and !~ in WHERE clause
Fixes #6607.
2016-05-16 08:51:09 -04:00
Ben Johnson 49eb3b8d04
optimize show series iterator
This commit changes the `SeriesIterator` to process one measurement
at a time and uses a `floatFastDedupeIterator` to avoid point
encoding during deduplication.
2016-05-03 08:52:44 -06:00
Jason Wilder d82aa98951 Reduce indentation in filter func 2016-05-02 11:38:25 -06:00
Jason Wilder 3a7429886e Optimize Measurement.DropSeries 2016-05-02 11:36:04 -06:00
Jason Wilder 8082fc61ba Fix parsing keys when loading database index
The code for parsing a key out of the WAL or TSM files in the engine
was naive and didn't account for measurements with escape chars. This
uses the correct parsing code to parse and load them correctly.

Fixes #6496
2016-04-30 14:47:19 -06:00
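To illustrate why naive key parsing breaks on escape characters, here is a minimal sketch. It assumes line-protocol style keys in which a comma inside a measurement name is escaped with a backslash; `measurementName` is a hypothetical helper, not the parsing function the commit switched to.

```go
package main

import (
	"fmt"
	"strings"
)

// measurementName returns the part of key before the first unescaped comma.
func measurementName(key string) string {
	for i := 0; i < len(key); i++ {
		switch key[i] {
		case '\\':
			i++ // skip the escaped character that follows the backslash
		case ',':
			return key[:i]
		}
	}
	return key
}

func main() {
	key := `cpu\,load,host=server01`
	naive := strings.SplitN(key, ",", 2)[0]
	fmt.Println(naive)                // cpu\
	fmt.Println(measurementName(key)) // cpu\,load
}
```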
Jason Wilder abcb559b09 Remove index meta data when series and measurements are gone
This removes the dropMeta param from the tsdb.Store.DeleteSeries and
lets the shard determine when to remove the meta data from the index
based on what series still have data in the shard.

This uncovered a nasty bug in compactions where a fully deleted series would
prematurely end the compactions and not carry forward the rest of the data
in the TSM file.  This is now fixed as well.
2016-04-29 16:31:57 -06:00
Edd Robinson 4d1cfa887c Ensure measurement dropped when no more series 2016-04-29 00:05:42 +01:00
Jason Wilder 2bd5880d7a Remove series from index when shard is closed
When a shard is closed and removed due to retention policy enforcement,
the series contained in the shard would still exist in the index causing
a memory leak.  Restarting the server would cause them not to be loaded.

Fixes #6457
2016-04-28 12:34:46 -06:00
Jonathan A. Sternberg d26e4e3650 Pass binary expressions to the underlying query
Binary math inside of a where condition was previously disallowed. Now,
these types of queries are just passed verbatim down to the underlying
query engine which can handle it.

We may want to revisit this when it comes to tags at some point as it
prevents the more efficient filtering of tags that a simple expression
allows, but it allows a query like this to be done:

    SELECT * FROM cpu WHERE value + 2 < 5

So while it can be better, this is a good initial implementation to
provide this functionality. There are very rare situations where a tag
may be used appropriately in one of these circumstances.

Fixes #3558.
2016-04-22 11:30:36 -04:00
Jonathan A. Sternberg 09c46a451a Sort the series keys inside of a tag set so the output is deterministic
The series keys within a tag set were previously not sorted which would
cause the output to be non-deterministic. This sorts the output series
by their keys so it has a consistent output especially when using
limits.

Fixes #3166.
2016-04-18 17:45:31 -04:00
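Go randomizes map iteration order, so any output built from a map needs an explicit sort to be stable. The sketch below shows the general idea with a hypothetical `sortedKeys` helper; the real tag set structures differ.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the series keys of a set in ascending order.
func sortedKeys(set map[string]struct{}) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	set := map[string]struct{}{
		"cpu,host=b": {},
		"cpu,host=a": {},
		"cpu,host=c": {},
	}
	limit := 2
	// With sorting, a limit always keeps the same series.
	fmt.Println(sortedKeys(set)[:limit]) // [cpu,host=a cpu,host=b]
}
```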
Jonathan A. Sternberg ea6262b712 Enhance comparing tags and fields in the where clause
Now it is possible to compare tags and fields and it is also now
possible to compare tags and tags. Previously, it was only possible to
compare fields with fields and tags with a string or a regex.

Fixes #3371.
2016-04-11 18:10:08 -04:00
Jonathan A. Sternberg 5bdd61bde7 Support empty tags for all WHERE equality operations
A missing tag on a point was sometimes treated as `""` and sometimes
treated as a separate `null` entity. This change modifies the equality
operations to always treat a missing tag as an empty string.

Empty tags are *not* indexed and do not have the same performance as a
tag that exists.

Fixes #3773.
2016-04-11 12:01:35 -04:00
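The "missing tag equals empty string" rule maps naturally onto Go's map semantics, where a lookup for an absent key returns the zero value. A minimal sketch, assuming tags are held in a plain `map[string]string`; `tagEquals` is a hypothetical helper and ignores the indexing concerns the commit mentions.

```go
package main

import "fmt"

// tagEquals compares a tag to value, treating a missing tag as "".
func tagEquals(tags map[string]string, key, value string) bool {
	// A missing key yields "", the zero value for string.
	return tags[key] == value
}

func main() {
	tags := map[string]string{"host": "server01"}
	fmt.Println(tagEquals(tags, "region", ""))   // true
	fmt.Println(tagEquals(tags, "host", ""))     // false
	fmt.Println(tagEquals(tags, "region", "us")) // false
}
```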
Edd Robinson 5327a75a6f Merge pull request #6216 from influxdata/er-scope-proto
Change protobuf package names to avoid clashes
2016-04-07 16:38:21 +01:00
Edd Robinson 184257a10d Scope all internal protobuf packages 2016-04-05 13:54:21 +01:00
Jason Wilder 3f4c5a5585 Fix race on measurementFields
Both Shard and Engine had the same reference to the measurementField map,
but they each protected it with their own locks.  This causes a race when
writes and queries are occurring because writes can add new fields to the
map while queries are reading from it.

The fix moves the ownership to the Engine and provides protected accessors
that Shard now uses.  For the most part, the accesses on Shard were old
dead code.

Fixing the measurementFields map race created a new race on the internal
fields map.  This is now unexported and protected via MeasurementFields
exported funcs.

Fixes #6188
2016-04-01 18:57:01 -06:00
Jason Wilder 07e3215d11 Remove unused Series.match func 2016-03-31 10:19:46 -06:00
Jason Wilder 40c4973423 Remove per measurement stats collection
The stats setup ends up creating a lot of lock contention which significantly
impacts write throughput when a large number of measurements are used.

Fixes #6131
2016-03-31 10:19:27 -06:00
Jason Wilder f1bb87d4f8 Convert index write lock to series lock 2016-03-31 10:19:27 -06:00
Jason Wilder 9f41acba2f Move shard mapping logic into index 2016-03-29 12:59:27 -06:00
Jason Wilder 3f0e871425 Reduce lock contention when loading database index 2016-03-29 12:59:26 -06:00
Jason Wilder 03ced4cc90 Load shards concurrently 2016-03-29 12:58:52 -06:00
Jonathan A. Sternberg a35d9602cd Fix where filters when an OR is used and when a tag does not exist
If an OR was used, merging filters between different expressions would
not work correctly. If one of the sides had a set of series ids with a
condition and the other side had no series ids associated with the
expression, all of the series from the side with a condition would have
the condition ignored. Instead of defaulting a nonexistent series
filter to true, it should just be false and the evaluation of the one
side that does exist should take care of determining if the series id
should be included or not. The AND condition used false correctly so did
not have to be changed.

If a tag did not exist and `!=` or `!~` were used, it would return false
even though neither a field nor a tag equaled those values. This has
now been modified to return the correct series ids and the
correct condition.

Also fixed a panic that would occur when a tag caused a field access to
become unnecessary. The filter using the field access still got created
and used even though it was unnecessary, resulting in an attempted
access to a non-initialized map.

Fixes #5152 and a bunch of other miscellaneous issues.
2016-03-22 12:19:06 -04:00
Jonathan A. Sternberg d75428f79f Rename the special condition "name" to "_name" to reduce conflicts
Fixes #6034.
2016-03-16 17:17:04 -04:00
Ben Johnson f692621ef5 allow querying of system-like series
Internal system series start with an underscore prefix, but
restricting queries on that prefix locks out users who already use an
underscore prefix in their own series names.

Fixes #5870
2016-03-14 13:50:52 -06:00
Jason Wilder c44195d999 Convert measurementToRegex to exported func
Make it consistent with other conventions where exported funcs
take a lock.
2016-03-09 17:45:37 -07:00
Jason Wilder ae2360df7c Use read lock to expand sources
A write-lock was taken which locks the whole store during a query
that needs to expand sources.  Under load, writes can start to fail.
2016-03-09 17:22:57 -07:00
Ben Johnson 41dde61226 SHOW SERIES 2016-03-08 11:47:57 -07:00
Jonathan A. Sternberg 2f0e246757 Implemented the tag values iterator for `SHOW TAG VALUES`
`SHOW TAG VALUES` output has been modified to print the measurement name
for every measurement and to return the output in two columns: key and
value. An example output might be:

    > SHOW TAG VALUES WITH KEY IN (host, region)
    name: cpu
    ---------
    key     value
    host    server01
    region  useast

    name: mem
    ---------
    key     value
    host    server02
    region  useast

`measurementsByExpr` has been taught how to handle reserved keys (ones
with an underscore at the beginning) to allow reusing that function and
skipping over expressions that don't matter to the call.

Fixes #5593.
2016-03-06 09:52:34 -05:00
Mark Rushakoff fb83374389 Track stats for number of series, measurements
Per database: track number of series and measurements
Per measurement: track number of series
2016-02-24 08:10:16 -08:00
Mark Rushakoff fc9ab7a46f Miscellaneous cleanup in tsdb package
* When possible, initialize maps/slices to exact length/capacity
  * See slice benchmarks at
    https://gist.github.com/mark-rushakoff/b5650bd8f06bece0b9fd
* Fixed some typos
* Removed an unnecessary loop in stringset.intersect
2016-02-10 18:00:47 -08:00
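The first cleanup item in this commit, sizing maps and slices up front, can be sketched as follows. `indexOf` is a hypothetical helper; the point is the size hint passed to `make`, which lets the runtime allocate once instead of growing the map as entries are added.

```go
package main

import "fmt"

// indexOf maps each key to its position in keys.
func indexOf(keys []string) map[string]int {
	// The second argument to make is a capacity hint, not a length:
	// the map starts empty but has room for len(keys) entries.
	m := make(map[string]int, len(keys))
	for i, k := range keys {
		m[k] = i
	}
	return m
}

func main() {
	m := indexOf([]string{"host", "region", "dc"})
	fmt.Println(len(m), m["region"]) // 3 1
}
```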
Justin Nuß 82c276756a Lint tsdb and tsdb/engine package 2016-02-10 21:33:46 +01:00