influxdb

Commit Graph

Author	SHA1	Message	Date
Ben Johnson	9fb8f1ec1d	Fix database and tag limits.	2017-03-24 09:48:10 -06:00
Ben Johnson	358b1e0b05	Merge remote-tracking branch 'upstream/master' into tsi	2017-03-15 10:13:32 -06:00
Jason Wilder	675d7c9d65	Merge branch '1.2' into jw-merge12	2017-03-06 11:09:05 -07:00
Jason Wilder	29f8d8de76	Fix race in WALEntry.Encode and Value.Deduplicate Under high query load, a race exists in the cache and the WAL. Since writes currently hit the cache first, they are available for query before they hit the WAL. If the WAL is writing and accessing the Value slice at the same time that a query is run that needs to dedup the same slice, a race occurs. To fix this, the cache now just copies the values instead of storing the slice passed in. Another way to fix this might be to have the writes go to the WAL before the cache. I think the latter would be better, but it introduces some larger write path issues that we'd need to also address. e.g. if the cache was full, writes to the WAL would need to be rejected to avoid filling the disk. Copying the slice in the cache is simpler for now and does not appear to dramatically affect performance.	2017-03-06 09:38:22 -07:00
Jason Wilder	a024003f2c	Merge branch '1.2' into jw-merge-12	2017-02-22 12:13:29 -07:00
Ben Johnson	78a9bb2527	Remove Tags.shouldCopy, replace with forceCopy on series creation. Previously, tags had a `shouldCopy` flag to indicate if those tags referenced an underlying buffer and should be copied to allow GC. Unfortunately, this prevented tags from being copied that were created and referenced the mmap which caused segfaults. This change removes the `shouldCopy` flag and replaces it with a `forceCopy` argument in `CreateSeriesIfNotExists()`. This allows the write path to indicate that tags must be cloned on insert.	2017-02-21 11:13:35 -07:00
Mark Rushakoff	601cbcd084	Merge branch '1.2' into mr-merge-12	2017-02-17 16:14:22 -08:00
Jonathan A. Sternberg	2fe48d6781	Rename zap import back to github.com/uber-go/zap They rebased a revision we were previously relying upon that allowed us to use the vanity name so we are reverting back to an older version with the old import path.	2017-02-17 17:17:22 -06:00
Ben Johnson	047c21f4d9	Merge remote-tracking branch 'upstream/master' into tsi	2017-01-24 09:28:58 -07:00
Edd Robinson	feb7a2842c	Use unbuffered error channels in tests	2017-01-17 10:53:15 -08:00
Edd Robinson	292b30b82b	Fix subtle bugs and remove dead code from tsdb	2017-01-17 09:47:34 -08:00
Jonathan A. Sternberg	d7c8c7ca4f	Support subquery execution in the query language This adds query syntax support for subqueries and adds support to the query engine to execute queries on subqueries. Subqueries act as a source for another query. It is the equivalent of writing the results of a query to a temporary database, executing a query on that temporary database, and then deleting the database (except this is all performed in-memory). The syntax is like this: SELECT sum(derivative) FROM (SELECT derivative(mean(value)) FROM cpu GROUP BY *) This will execute derivative and then sum the result of those derivatives. Another example: SELECT max(min) FROM (SELECT min(value) FROM cpu GROUP BY host) This would let you find the maximum minimum value of each host. There is complete freedom to mix subqueries with auxiliary fields. The only caveat is that the following two queries: SELECT mean(value) FROM cpu SELECT mean(value) FROM (SELECT value FROM cpu) Have different performance characteristics. The first will calculate `mean(value)` at the shard level and will be faster, especially when it comes to clustered setups. The second will process the mean at the top level and will not include that optimization.	2017-01-07 13:00:48 -06:00
Ben Johnson	f9efcb3365	Re-add shared in-memory index.	2017-01-05 10:17:09 -07:00
Edd Robinson	0f9b2bfe6a	Fix tests	2017-01-05 10:16:15 -07:00
Ben Johnson	9f8b206b51	Fix measurement system queries.	2017-01-05 10:15:34 -07:00
Ben Johnson	cb93f10120	Remove per-shard in-memory index.	2017-01-05 10:11:09 -07:00
Ben Johnson	409b0165f5	shared in-memory index	2017-01-05 10:09:57 -07:00
Ben Johnson	a812502ea3	reintegrating in-memory index	2017-01-05 10:07:35 -07:00
Ben Johnson	1ac067e53b	intermediate	2017-01-05 10:03:09 -07:00
Ben Johnson	62d2b3ebe9	Series filtering.	2017-01-05 10:02:42 -07:00
Ben Johnson	62269c3cea	intermediate	2017-01-05 10:02:41 -07:00
Edd Robinson	da63b349a4	Fix bad rebase	2017-01-05 09:59:44 -07:00
Edd Robinson	149b1cef1d	Fix 32bit overflow; limit capacity	2017-01-05 09:59:10 -07:00
Edd Robinson	d19fbf5ab4	Wire in HLL estimator	2017-01-05 09:54:03 -07:00
Edd Robinson	05bc4dec00	Refactor	2017-01-05 09:50:23 -07:00
Edd Robinson	2171d9471b	Initialise index in shards	2017-01-05 09:42:48 -07:00
Jonathan A. Sternberg	ec57108520	Use proper uber-go/zap import path It looks like the real import path to the project is go.uber.org/zap instead of github.com/uber-go/zap since the example in the project references that path.	2016-12-15 08:54:14 -06:00
Jonathan A. Sternberg	21502a39e8	Switch logging to use structured logging everywhere The logging library has been switched to use uber-go/zap. While the logging has been changed to use structured logging, this commit does not change any of the logging statements to take advantage of the new structured log or new log levels. Those changes will come in future commits.	2016-12-14 10:45:15 -06:00
Edd Robinson	28ba8ced74	Fixes #7625	2016-11-17 16:31:36 +00:00
Jason Wilder	0b6f5441b9	Add config option to messages when limits exceeded When a limit is exceeded, we return errors and sometimes log (if appropriate) that a limit was exceeded. The messages don't always provide an indication as to where or how they are configured. Instead, return the config option (easily searchable for) as well as the limit currently set and the value that exceeded it when possible.	2016-10-28 14:54:45 -06:00
Jason Wilder	873189e0c2	Fix panic: interface conversion: tsm1.Value is tsm1.FloatValue, not tsm1.StringValue If concurrent writes to the same shard occur, it's possible for different types to be added to the cache for the same series. The way the measurementFields map on the shard is updated is racy in this scenario which would normally prevent this from occurring. When this occurs, the snapshot compaction panics because it can't encode different types in the same series. To prevent this, we have the cache return an error if a different type is added to existing values in the cache. Fixes #7498	2016-10-28 12:15:50 -06:00
Jason Wilder	bbecb3f03d	Drop points that would exceed limits This changes the behavior of the max-series-per-database and max-values-per-tag limits to drop points that would exceed the limits and allow the remaining points to be written. Previously, the whole batch would fail and return a 500 error to the client. This now will write the allowed points and return a `partial write` error indicating some of the points were dropped, how many were dropped and one of the problem measurement and tags.	2016-10-10 11:42:15 -06:00
Ben Johnson	8aa224b22d	reduce memory allocations in index This commit changes the index to point to index data in the shards instead of keeping it in-memory on the heap.	2016-08-16 14:09:00 -06:00
Jonathan A. Sternberg	9621bee195	Drop time when used as a tag or field key The "time" field and tags are unqueryable so we prevent those from being written so we don't have unreadable data.	2016-08-10 10:02:01 -05:00
David Norton	0c4559722c	feat #6679 : add series limit config setting	2016-08-01 08:28:46 -04:00
Michael Desa	517d8d5881	Move benchmarks beneath other NewSeries	2016-06-23 10:15:37 -07:00
Michael Desa	0c867e4b2c	Fix benchmark test names Previously the test names included an `s` for the name of a singular component.	2016-06-16 08:45:36 -07:00
Michael Desa	9dfaa182a7	Add additional benchmarks for various schemas Anecdotally, the relationship between memory consumption and series cardinality was thought to be exponential. I suspect that this is false. The intent of the added benchmarks is to verify my suspicion. Eventually these benchmarks will run nightly to serve as a basis to evaluate the memory performance in a controlled environment. https://github.com/influxdata/docs.influxdata.com/issues/392	2016-06-15 14:54:14 -07:00
Jason Wilder	1ff8ecf4fb	Add ability to disable shards Disabling a shard causes all writes and queries to a shard to return an error. This also disables compactions for the shard.	2016-05-31 10:51:54 -06:00
Jonathan A. Sternberg	5e7e0bd19b	Filter out sources that do not match the shard database/retention policy If you use a statement like this: SELECT value FROM one..cpu, two..cpu It will access both the `one` and `two` databases as if you had selected the `cpu` measurement twice for both of them. Updated the `tsdb.Shard` create iterator function to filter out any sources that do not apply to that shard so this duplication doesn't happen. Fixes #6701.	2016-05-23 17:05:33 -04:00
Jonathan A. Sternberg	23f6a706bb	Support cast syntax for selecting a specific type Casting syntax is done with the PostgreSQL syntax `field1::float` to specify which type should be used when selecting a field. You can also do `field1::field` or `tag1::tag` to specify that a field or tag should be selected. This makes it possible to select a tag when a field key and a tag key conflict with each other in a measurement. It also means it's possible to choose a field with a specific type if multiple shards disagree. If no types are given, the same ordering for how a type is chosen is used to determine which type to return. The FieldDimensions method has been updated to return the data type for the fields that get returned. The SeriesKeys function has also been removed since it is no longer needed. SeriesKeys was originally used for the fill iterator, but then expanded to be used by auxiliary iterators for determining the channel iterator types. The fill iterator doesn't need it anymore and the auxiliary types are better served by FieldDimensions implementing that functionality, so SeriesKeys is no longer needed. Fixes #6519.	2016-05-16 12:08:29 -04:00
Jason Wilder	2bd5880d7a	Remove series from index when shard is closed When a shard is closed and removed due to retention policy enforcement, the series contained in the shard would still exist in the index causing a memory leak. Restarting the server would cause them not to be loaded. Fixes #6457	2016-04-28 12:34:46 -06:00
Jonathan A. Sternberg	7ec2a991d5	Modify all of the iterators to allow returning an error on Next() This also switches the remaining iterators to be lazy so they can return errors properly. They needed to be converted to lazy initialization anyway, which has the side effect of making it much easier for us to propagate the underlying error during initialization. Updated the Emitter to return errors when it cannot read properly from the iterators.	2016-04-18 11:17:55 -04:00
Jonathan A. Sternberg	94ec92d669	Handle nil values from the tsm1 cursor correctly Send nil values from the tsm1 cursor at the end of the cursor. After the cursor reached tsm1, the `nextAt()` call would always return the default value rather than a nil value. Descending also didn't work correctly because the seeking functionality for tsm1 iterators would always act like they were ascending instead of descending when choosing which value to select. This resulted in very strange output from the emitter since it couldn't figure out if it was ascending or descending. Fixes #6206.	2016-04-06 09:27:02 -04:00
Edd Robinson	8e2d1e48c7	Check if engine closed. Fixes #6140	2016-03-31 15:59:04 +01:00
Mark Rushakoff	cdcb079769	Tag TSM stats with database, retention policy ... by extracting the db/rp from the given path. Now that the code has "standardized" on extracting db/rp this way, the ShardLocation struct is no longer necessary and thus has been removed. We're back on the previous style of passing the path and walPath to NewShard.	2016-02-29 09:17:34 -08:00
Mark Rushakoff	40a98e0d55	Add database, RP as tags on shard stats This commit updates tsdb.Shard to contain a ShardConfig and updates tsdb.Store to directly reference a map of tsdb.Shard rather than the previous tsdb.shardLocation abstraction.	2016-02-25 13:41:55 -08:00
Mark Rushakoff	fb83374389	Track stats for number of series, measurements Per database: track number of series and measurements Per measurement: track number of series	2016-02-24 08:10:16 -08:00
Ben Johnson	f7e04abef7	remove NaN from query engine This commit removes `math.NaN` returns from float iterators.	2016-02-17 14:11:31 -07:00
Ben Johnson	5a0d1ab7c1	rename influxdb/influxdb to influxdata/influxdb This commit changes all the import and URL references from: github.com/influxdb/influxdb to: github.com/influxdata/influxdb	2016-02-10 10:26:18 -07:00
Ben Johnson	47c2bab74b	add SHOW TAG KEYS support	2016-02-10 09:40:28 -07:00
Ben Johnson	00806de9b8	refactor query engine	2016-02-10 09:40:25 -07:00
Ben Johnson	cde973f409	refactor query engine	2016-02-10 09:40:24 -07:00
Jason Wilder	0926b19e6b	Prevent creating points with NaN float values Float values are not supported in the existing engine and the tsm1 engines. This changes NewPoint to return an error if a field value contains a NaN field. It also allows us to validate fields to prevent other unsupported types from sneaking in through other input plugins.	2015-10-27 17:12:52 -06:00
Cory LaNou	d19a510ad2	refactor Points and Rows to dedicated packages	2015-09-16 15:33:08 -05:00
Jason Wilder	a4c1d9a9a7	Remove unused Database index names and sorting Writes could time out when adding new measurement names to the index if the sort took a long time. The names slice was never actually used (except a test) so keeping it in index wastes memory and sorting it wastes CPU and increases lock contention. The sorting was happening while the shard held a write-lock. Fixes #3869	2015-08-27 11:57:20 -06:00
Paul Dix	73f3dc1e14	Update store to properly manage WAL create/delete. * Update the store to remove the WAL directories associated with a shard or database when they are deleted. * Fix the Store so that it creates separate WAL directories for databases and retention policies.	2015-08-21 11:22:04 -04:00
Paul Dix	9df3b7d828	Add WAL configuration options	2015-08-18 16:59:54 -04:00
Paul Dix	3348dab4e0	Fix bug with new shards not getting series data persisted.	2015-08-16 15:45:09 -04:00
Paul Dix	b583b896ce	Integrate WAL and BZ1 and make BZ1 the default engine.	2015-08-16 12:46:50 -04:00
Ben Johnson	a7f50ae03c	refactor storage to engine	2015-07-22 11:08:10 -06:00
Ben Johnson	de1f9a3736	refactor tsdb tests into test package	2015-07-22 11:07:06 -06:00
Philip O'Toole	ca86fa2633	Allow WAL inter-flush time to be configurable	2015-07-02 10:40:26 -04:00
Joseph Crail	5fccee3d16	Fix spelling errors in comments and strings.	2015-06-28 02:54:34 -04:00
Ben Johnson	b574e2f755	Add write ahead log This commit adds a write ahead log to the shard. Entries are cached in memory and periodically flushed back into the index. The WAL and the cache are both partitioned into buckets so that flushing doesn't stop the world as long.	2015-06-25 15:47:13 -06:00
Jason Wilder	bc7e1f6fd6	Fix panic when adding new fields Fixes #2869 When adding a new field to an existing measurement, Shard.validateSeriesAndFields would also encode the fields as a side-effect. In the case of a new field that needed to be created, the encoding would fail because the field type had not been created for the measurement yet. The fields are re-encoded after validateSeriesAndFields returns and after the field encodings have been set up properly so this additional encoding during validation isn't necessary.	2015-06-10 10:30:14 -06:00
Todd Persen	0ee71b9755	Merge pull request #2743 from influxdb/tsdb-benchmarks add shard & index benchmarks	2015-06-05 13:15:38 -07:00
Paul Dix	408bc3f81e	Ensure proper locking of index structures on writes and queries.	2015-06-04 14:50:32 -04:00
David Norton	938ad2ef85	add Store Open benchmark test	2015-06-03 10:09:50 -04:00
David Norton	31bb8e70a9	don't build index before benchmarking WritePoints	2015-06-02 17:17:31 -04:00
David Norton	97c84a6d4f	add benchmark tests for shard WritePoints	2015-06-02 17:00:25 -04:00
Jason Wilder	9a9bb736f7	Add text protocol parsing and serialization for points This changes the implementation of point to minimize the extra processing needed to parse and marshal point data through the system.	2015-05-29 11:18:40 -06:00
Jason Wilder	21bfb150a1	Wire up new write path This allows the new write path to be hooked up if you start the server with `INFLUXDB_ALPHA1=1`. When set, writes will go through the coordinator and be stubbed out to write to a single local data node with one shard. The write will be logged and written to disk. The env var is used so that the current write path is not completely broken which would break many of the tests that depend on writes. Note that queries are not currently working w/ this change.	2015-05-26 12:07:56 -06:00
Paul Dix	6c80108f63	Change Database to DatabaseIndex, remove leftover warn statement	2015-05-24 07:39:45 -04:00
Paul Dix	c3ab88a715	Make the metadata index shared across shards while keeping field types and encoding local to each shard.	2015-05-23 18:06:07 -04:00
Jason Wilder	1076153a00	Convert Point to interface Should be possible to replace the implementation with a more optimized version now.	2015-05-22 15:39:55 -06:00
Jason Wilder	f8d599cda9	Convert Point.Tags to Point.Tags()	2015-05-22 15:12:34 -06:00
Jason Wilder	5dcab443dc	Move data.Point to tsdb.Point	2015-05-22 15:00:51 -06:00
Paul Dix	8f937cae87	Initial implementation for writing data to a shard.	2015-05-22 16:11:18 -04:00

129 Commits (88d6487f4a95dfd4d699bdc155f1fff57a792efb)