A new sorted slice was created by the monitor func every 10s. The
tag keys don't need to be sorted, so this avoids the allocation of the
slice and the additional allocation made during sorting.
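Roughly, the change amounts to ranging over the tag map directly instead of building and sorting a key slice first. A minimal sketch in Go (the statistic type and walkTags helper are illustrative, not the actual monitor code):

```go
package main

import "fmt"

// statistic is a stand-in for the monitor's per-statistic tag map.
type statistic struct {
	Tags map[string]string
}

// Before: build and sort a []string of keys on every pass just to walk the map.
// After: range over the map directly, since callers don't depend on ordering.
func walkTags(s statistic, fn func(k, v string)) {
	for k, v := range s.Tags { // no slice allocation, no sort
		fn(k, v)
	}
}

func main() {
	s := statistic{Tags: map[string]string{"host": "server01", "region": "useast"}}
	walkTags(s, func(k, v string) { fmt.Println(k, "=", v) })
}
```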
Previously, we would return a full tag set for every shard and the tag
set would include all series that existed in the database index
including series that didn't physically exist within that shard. This
led to the returned tag sets being extremely large when we had high
cardinality but sparse data. Since the data was sparse, most people did
not expect it to place such a large strain on the system.
Now we filter out the series ids that are not assigned to the current
shard when computing a tag set for that shard. This lowers the memory
usage for high cardinality sparse data drastically and allows queries on
those to complete successfully.
This does not resolve issues for high cardinality data that exists in every
shard and is also spread out over a long span of time. That situation isn't
nearly as common as the one above, though.
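The per-shard filtering is, in spirit, an intersection of the index's series ids with the ids the shard actually contains. A hedged sketch (filterShardSeries and the shardSeries map are hypothetical stand-ins, not the real tsdb API):

```go
package main

import "fmt"

// filterShardSeries keeps only the series ids that physically exist in the
// shard, so sparse shards no longer inherit the full database-wide tag set.
func filterShardSeries(indexIDs []uint64, shardSeries map[uint64]struct{}) []uint64 {
	ids := indexIDs[:0] // reuse the backing array; callers must not keep indexIDs
	for _, id := range indexIDs {
		if _, ok := shardSeries[id]; ok {
			ids = append(ids, id)
		}
	}
	return ids
}

func main() {
	inShard := map[uint64]struct{}{1: {}, 3: {}}
	fmt.Println(filterShardSeries([]uint64{1, 2, 3, 4}, inShard)) // [1 3]
}
```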
Instead of assigning a boolean value of true to the filter expressions
when there was no meaningful expression, this drops a boolean expression
of true from the filter expressions so we don't have to perform a map
assignment. This allows us to reduce allocations and assignments when a
`WHERE` clause only contains tag comparisons and no field comparisons.
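A minimal sketch of the idea, using a stand-in Expr type rather than the real influxql expressions: the map assignment is skipped whenever the expression is just a literal true.

```go
package main

import "fmt"

// Expr is a stand-in for influxql.Expr.
type Expr interface{ String() string }

type boolLit bool

func (b boolLit) String() string { return fmt.Sprint(bool(b)) }

// addFilter records a filter for a series id, but skips the map assignment
// entirely when the expression is a literal true, since the absence of a
// filter already means "no additional condition".
func addFilter(filters map[uint64]Expr, id uint64, expr Expr) {
	if b, ok := expr.(boolLit); ok && bool(b) {
		return // drop `true`; avoids a map assignment per series
	}
	filters[id] = expr
}

func main() {
	filters := map[uint64]Expr{}
	addFilter(filters, 1, boolLit(true)) // not stored
	fmt.Println(len(filters))            // 0
}
```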
The TagSets function was creating a lot of intermediate maps and
slices to calculate the sorted tag sets. It first created a map
to group tag sets with their series, then created an equally
sized slice of the tag keys and sorted them. Finally, it created
a new slice and added the tag sets from the original map in the order
of the sorted keys. It was also recreating the tags map multiple times,
creating extra garbage in the loop.
This simplifies the code to create one map for grouping and then add
the distinct sets to a slice, which is then sorted. It also fixes the
multiple tag maps getting created.
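A simplified sketch of the new grouping approach (the series and tagSet types here are illustrative; the real function works with influxql tag sets): one map groups series by their tag-set key, and the distinct sets go into a single slice that is sorted once.

```go
package main

import (
	"fmt"
	"sort"
)

// series pairs a series key with the marshaled tags it is grouped by.
type series struct {
	Key     string // series key, e.g. "cpu,host=a"
	TagsKey string // grouping tags, e.g. "region=useast"
}

// tagSet is a simplified stand-in for the query engine's TagSet.
type tagSet struct {
	Key        string
	SeriesKeys []string
}

// groupTagSets builds the distinct tag sets in a single pass: one map for
// grouping and one slice that is sorted once at the end.
func groupTagSets(seriesList []series) []*tagSet {
	sets := make(map[string]*tagSet)
	var out []*tagSet
	for _, s := range seriesList {
		ts, ok := sets[s.TagsKey]
		if !ok {
			ts = &tagSet{Key: s.TagsKey}
			sets[s.TagsKey] = ts
			out = append(out, ts) // record each distinct set once
		}
		ts.SeriesKeys = append(ts.SeriesKeys, s.Key)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Key < out[j].Key })
	return out
}

func main() {
	got := groupTagSets([]series{
		{Key: "cpu,host=a", TagsKey: "region=useast"},
		{Key: "cpu,host=c", TagsKey: "region=uswest"},
		{Key: "cpu,host=b", TagsKey: "region=useast"},
	})
	for _, ts := range got {
		fmt.Println(ts.Key, ts.SeriesKeys)
	}
}
```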
The behavior for querying tag values with an empty string was originally
fixed in #6283, but it also added a performance problem when the
cardinality of the tag was high. Since a call to `Union()` or `Reject()`
would happen for every series key and it would be called N times for N
cardinality, the comparisons against a blank string were unnecessarily
slow with large memory allocations.
This optimizes these queries so it doesn't use those methods anymore.
Those methods are still useful and used when combining AND and OR
clauses, but they aren't useful when finding the series ids for a single
clause. These methods were unnecessary here because the series ids for
the tags were already unique and didn't have to be merged as a set.
There was a race where the same series would get added to the in-memory
index for a measurement more than once. This would result in the same
series being returned more than once during queries causing duplicate
results. The issue was that we checked for the series under the read
lock, but did not check again under the write lock, leaving a small
window where the series could be added by another goroutine. We now
check for the series again under the write lock.
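This is the standard read-lock check followed by a re-check under the write lock; a minimal sketch with a hypothetical measurement type (not the actual index code):

```go
package main

import (
	"fmt"
	"sync"
)

// measurement is a simplified stand-in for the in-memory index measurement.
type measurement struct {
	mu     sync.RWMutex
	series map[string]struct{}
}

// addSeriesIfNotExists re-checks under the write lock, closing the window in
// which another goroutine could add the same series between the read-lock
// check and the write.
func (m *measurement) addSeriesIfNotExists(key string) bool {
	m.mu.RLock()
	_, ok := m.series[key]
	m.mu.RUnlock()
	if ok {
		return false // fast path: already present
	}

	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.series[key]; ok {
		return false // added by another goroutine while we were unlocked
	}
	m.series[key] = struct{}{}
	return true
}

func main() {
	m := &measurement{series: map[string]struct{}{}}
	fmt.Println(m.addSeriesIfNotExists("cpu,host=a")) // true
	fmt.Println(m.addSeriesIfNotExists("cpu,host=a")) // false
}
```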
Fixes #6946
Truncate the time interval output of the monitor service to be on even
time intervals rather than on every minute based on the start time. This
normalizes the output from the monitor service.
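For example, truncation to an even boundary can be expressed with time.Truncate; a small sketch assuming a fixed reporting interval:

```go
package main

import (
	"fmt"
	"time"
)

// alignedTimestamp truncates t to an even interval boundary, so two monitors
// started at different times still report on the same timestamps.
func alignedTimestamp(t time.Time, interval time.Duration) time.Time {
	return t.Truncate(interval)
}

func main() {
	t := time.Date(2016, 7, 1, 12, 34, 56, 0, time.UTC)
	fmt.Println(alignedTimestamp(t, 10*time.Second)) // 2016-07-01 12:34:50 +0000 UTC
}
```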
The tsdb package had a substantial amount of dead code related to the
old query engine still in there. It is no longer used, so it was removed
since it was left unmaintained. There is likely still more code like
this, but it wasn't found as part of this cleanup.
influxql has dead code that shows up because of the code generation, so it is
not included in this pruning.
This commit optimizes `SHOW TAG VALUES` so that it avoids the
`SELECT` query engine execution and iterator creation. There
are also optimizations to reduce individual memory allocations
and to reduce in-memory heap size by only operating on one
measurement at a time.
Execution time has been reduced to approximately 900ms for
500,000 rows. This is about 2µs per row. Of this time,
approximately 1µs is spent retrieving and sorting the row
and 1µs is spent encoding into JSON and writing to the
response body.
Casting syntax is done with the PostgreSQL syntax `field1::float` to
specify which type should be used when selecting a field. You can also
do `field1::field` or `tag1::tag` to specify that a field or tag should
be selected.
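For example (cpu, value, and host are just illustrative names):
SELECT value::float FROM cpu
SELECT host::tag, value::field FROM cpu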
This makes it possible to select a tag when a field key and a tag key
conflict with each other in a measurement. It also means it's possible
to choose a field with a specific type if multiple shards disagree on the
type. If no types are given, the existing ordering for how a type is chosen
is used to determine which type to return.
The FieldDimensions method has been updated to return the data type for
the fields that get returned. The SeriesKeys function has also been
removed since it is no longer needed. SeriesKeys was originally used for
the fill iterator, but then expanded to be used by auxiliary iterators
for determining the channel iterator types. The fill iterator doesn't
need it anymore and the auxiliary types are better served by
FieldDimensions implementing that functionality, so SeriesKeys is no
longer needed.
Fixes #6519.
This commit changes the `SeriesIterator` to process one measurement
at a time and uses a `floatFastDedupeIterator` to avoid point
encoding during deduplication.
The code for parsing a key out of the WAL or TSM files in the engine
was naive and didn't account for measurements with escaped characters. This
now uses the correct parsing code to parse and load them correctly.
Fixes #6496
This removes the dropMeta param from tsdb.Store.DeleteSeries and
lets the shard determine when to remove the metadata from the index
based on which series still have data in the shard.
This uncovered a nasty bug in compactions where a fully deleted series would
prematurely end the compactions and not carry forward the rest of the data
in the TSM file. This is now fixed as well.
When a shard is closed and removed due to retention policy enforcement,
the series contained in the shard would still exist in the index, causing
a memory leak. Restarting the server would cause them not to be loaded.
Fixes #6457
Binary math inside of a `WHERE` condition was previously disallowed. Now,
these types of queries are just passed verbatim down to the underlying
query engine which can handle it.
We may want to revisit this when it comes to tags at some point as it
prevents the more efficient filtering of tags that a simple expression
allows, but it allows a query like this to be done:
SELECT * FROM cpu WHERE value + 2 < 5
So while it can be better, this is a good initial implementation to
provide this functionality. There are very rare situations where a tag
may be used appropriately in one of these circumstances.
Fixes #3558.
The series keys within a tag set were previously not sorted which would
cause the output to be non-deterministic. This sorts the output series
by their keys so it has a consistent output especially when using
limits.
Fixes #3166.
Now it is possible to compare tags and fields and it is also now
possible to compare tags and tags. Previously, it was only possible to
compare fields with fields and tags with a string or a regex.
Fixes #3371.
A missing tag on a point was sometimes treated as `""` and sometimes
treated as a separate `null` entity. This change modifies the equality
operations to always treat a missing tag as an empty string.
Empty tags are *not* indexed and do not have the same performance as a
tag that exists.
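In Go terms the new behavior matches a plain map lookup, where a missing key yields the zero value; a rough sketch (tagMatches is illustrative, not the engine's actual matching code):

```go
package main

import "fmt"

// tagMatches treats a missing tag the same as an empty string, so
// `WHERE sometag = ''` matches points that never set the tag at all.
func tagMatches(tags map[string]string, key, want string) bool {
	return tags[key] == want // missing key -> "" (the zero value)
}

func main() {
	pt := map[string]string{"host": "server01"} // no "region" tag
	fmt.Println(tagMatches(pt, "region", ""))   // true
	fmt.Println(tagMatches(pt, "region", "us")) // false
}
```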
Fixes #3773.
Both Shard and Engine held a reference to the same measurementFields map,
but each protected it with its own lock. This caused a race when writes
and queries were occurring concurrently, because writes can add new fields
to the map while queries are reading from it.
The fix moves ownership to the Engine and provides protected accessors
that Shard now uses. For the most part, the accesses on Shard were old
dead code.
Fixing the measurementFields map race created a new race on the internal
fields map. That map is now unexported and protected via exported funcs on
MeasurementFields.
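The shape of the fix is roughly the following (simplified types; field types are represented as plain ints here rather than influxql data types): the map is unexported and every access goes through methods that take the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// MeasurementFields owns its fields map; callers can no longer touch the map
// directly, so writes adding fields and queries reading them cannot race.
type MeasurementFields struct {
	mu     sync.RWMutex
	fields map[string]int // field name -> type (simplified)
}

func NewMeasurementFields() *MeasurementFields {
	return &MeasurementFields{fields: make(map[string]int)}
}

// CreateFieldIfNotExists adds the field under the write lock.
func (m *MeasurementFields) CreateFieldIfNotExists(name string, typ int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, ok := m.fields[name]; !ok {
		m.fields[name] = typ
	}
}

// Field reads the field type under the read lock.
func (m *MeasurementFields) Field(name string) (int, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	typ, ok := m.fields[name]
	return typ, ok
}

func main() {
	mf := NewMeasurementFields()
	mf.CreateFieldIfNotExists("value", 1)
	fmt.Println(mf.Field("value"))
}
```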
Fixes #6188
The stats setup ends up creating a lot of lock contention which significantly
impacts write throughput when a large number of measurements are used.
Fixes #6131
If an OR was used, merging filters between different expressions would
not work correctly. If one of the sides had a set of series ids with a
condition and the other side had no series ids associated with the
expression, all of the series from the side with a condition would have
the condition ignored. Instead of defaulting a non-existent series
filter to true, it should just be false and the evaluation of the one
side that does exist should take care of determining if the series id
should be included or not. The AND condition used false correctly so did
not have to be changed.
If a tag did not exist and `!=` or `!~` were used, it would return false
even though neither a field nor a tag equaled those values. This has
now been modified to return the correct series ids and the
correct condition.
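A sketch of the corrected OR merge, using strings as stand-ins for influxql expressions: when only one side has a filter for a series id, the missing side contributes false, so the existing condition is kept rather than being wiped out by an implicit true.

```go
package main

import "fmt"

// mergeOrFilters ORs the per-series filters of two sides of an OR expression.
// A side with no entry for a series id contributes false, so the other side's
// condition is preserved instead of being collapsed to `true`.
func mergeOrFilters(lhs, rhs map[uint64]string) map[uint64]string {
	out := make(map[uint64]string, len(lhs)+len(rhs))
	for id, l := range lhs {
		if r, ok := rhs[id]; ok {
			out[id] = "(" + l + ") OR (" + r + ")"
		} else {
			out[id] = l // rhs is false for this id: false OR l == l
		}
	}
	for id, r := range rhs {
		if _, ok := lhs[id]; !ok {
			out[id] = r // lhs is false for this id
		}
	}
	return out
}

func main() {
	lhs := map[uint64]string{1: "value > 5"}
	rhs := map[uint64]string{2: "value < 2"}
	fmt.Println(mergeOrFilters(lhs, rhs)) // map[1:value > 5 2:value < 2]
}
```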
Also fixed a panic that would occur when a tag caused a field access to
become unnecessary. The filter using the field access still got created
and used even though it was unnecessary, resulting in an attempted
access to a non-initialized map.
Fixes #5152 and a bunch of other miscellaneous issues.
Internal system series start with an underscore prefix, but restricting
that prefix breaks users who already use an underscore prefix in their
series names.
Fixes #5870
`SHOW TAG VALUES` output has been modified to print the measurement name
for every measurement and to return the output in two columns: key and
value. An example output might be:
> SHOW TAG VALUES WITH KEY IN (host, region)
name: cpu
---------
key value
host server01
region useast
name: mem
---------
key value
host server02
region useast
`measurementsByExpr` has been taught how to handle reserved keys (ones
with an underscore at the beginning) to allow reusing that function and
skipping over expressions that don't matter to the call.
Fixes #5593.
Start of a lower-level file inspection tool. This currently dumps
summary statistics for the shards, index and WAL that can be used to
understand the shape of the data in the local shards. This util
operates on the shards directly rather than through the server and is
intended more for debugging/troubleshooting.
Writes could time out when adding new measurement names to the
index if the sort took a long time. The names slice was never
actually used (except by a test), so keeping it in the index wastes memory
and sorting it wastes CPU and increases lock contention. The sorting
was happening while the shard held a write lock.
Fixes #3869
* All metadata for each shard is now stored in a single key with compressed value
* Creation of new metadata no longer requires a synchronous write to Bolt. It is passed to the WAL and written to Bolt periodically outside the write path
* Added DeleteSeries to WAL and updated bz1 to remove series there when DeleteSeries or DropMeasurement are called
The series map on Measurement was updated and deleted from but never
actually used. Series keys can be very big since they are the
string representation of the measurement plus sorted tags.
Locally I see 20%-30% reduction in memory usage with 1M series.
ValidateGroupBy was returning an error if a tag did not exist,
but it appears that function was supposed to be validating that
a field name was not used as a group by field.
Fixes #3326
With this change, the query engine code gathers information about
shards and tagsets by working with individual shards, collating the
information, and returning that to the client. It does not assume that any
particular shard is local, and accesses all shards through abstracted
Mappers, of which there are two types -- a Mapper type for Raw queries
and a second type for Aggregate queries. There are corresponding
Executors for each type of Mapper, but both types of Executors share the
same interface.