influxdb

Commit Graph

Author	SHA1	Message	Date
Carol (Nichols \|\| Goulding)	24af745b23	chore: Upgrade to flatbuffers 0.8	2021-03-22 09:38:58 -04:00
Andrew Lamb	6e1795fda0	refactor: Move some types (not yet exposed to clients) into internal_types (#1015 ) * refactor: Move some types (not yet exposed to clients) into internal_types * docs: Add README.md explaining the rationale * refactor: remove some stragglers * fix: fix benches * fix: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * fix: add clippy lints * fix: fmt * docs: Apply suggestions from code review fix typos Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-19 16:27:57 +00:00
Raphael Taylor-Davies	7e6c6d67b4	feat: graceful shutdown (#827 ) (#1018 ) * feat: graceful shutdown (#827) * chore: additional docs * chore: further docs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-19 15:36:49 +00:00
Raphael Taylor-Davies	dd94a33bc7	feat: retain limited tracker history (#1005 )	2021-03-17 16:32:34 +00:00
Andrew Lamb	72eff5eed5	chore: update deps (including arrow)	2021-03-16 18:15:44 -04:00
Raphael Taylor-Davies	3fe1b8c5b7	feat: add longrunning operations client (#981 ) refactor: add separate format feature influxdb_iox_client Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-16 13:19:44 +00:00
Raphael Taylor-Davies	65f7a1ac5b	fix: use consistent crate versions (#989 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-15 15:42:19 +00:00
kodiakhq[bot]	fcd4419702	Merge branch 'main' into pd-routing-rules	2021-03-12 20:02:53 +00:00
Raphael Taylor-Davies	7e25c4e896	feat: add fanout task tracking (#956 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-12 15:01:27 +00:00
Marko Mikulicic	9df4131e60	feat: Add server remote [set\|remove\|list] commands	2021-03-12 10:41:18 +00:00
Paul Dix	0606203b40	feat: add configuration for routing rules This is a strawman for what routing rules might look like in DatabaseRules. Once there's a chance for discussion, I'd move next to looking at how the Server would split up an incoming write into separate FB blobs to be sent to remote IOx servers. That might change what the API/configuration looks like as that's how it would be used (at least for writes). After that it would make sense to move to adding the proto definitions with conversions and gRPC and CLI CRUD to configure routing rules.	2021-03-11 15:25:57 -05:00
Raphael Taylor-Davies	d2859a99d0	feat: add google longrunning operations stubs (#959 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-10 17:34:07 +00:00
Andrew Lamb	f568c083a4	fix: Do not leave child processes around after the end-to-end test (#955 )	2021-03-10 14:25:27 +00:00
Andrew Lamb	1af5cf8b7c	refactor: Move end-to-end test server fixture into its own module (#945 ) * refactor: Move test server fixture into its own module * fix: Update tests/end-to-end.rs * fix: better error handling and display * fix: tweak startup message	2021-03-09 19:08:55 +00:00
Andrew Lamb	746373a687	refactor: Remove mutable_buffer crate dependency on query crate (#927 )	2021-03-05 11:34:27 +00:00
Andrew Lamb	3abfb5f089	chore: Update arrow deps, turn off optional datafusion features (#930 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-04 22:41:54 +00:00
Raphael Taylor-Davies	5c9dd68bf8	feat: MVP Management CLI (#907 ) * feat: MVP CLI implementation * feat: multiline database description	2021-03-03 17:37:55 +00:00
Nga Tran	957e05ef25	chore: use newly added Arrow's Expr::is_not_null function	2021-03-03 11:46:49 -05:00
Raphael Taylor-Davies	51981c92f5	feat: implement gRPC API and migrate influxdb_iox_client to use it (#853 ) * feat: implement gRPC management API * feat: migrate influxdb_iox_client to use gRPC API * fix: review comments * refactor: separate influxdb_iox_client error types Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-03-02 17:51:46 +00:00
Andrew Lamb	86c82be7f1	chore: Update arrow deps, remove custom JsonArrayWriter (#888 ) * chore: update dependencies * refactor: Use Arrow json::ArrayWriter	2021-02-27 10:18:43 +00:00
kodiakhq[bot]	76921bdef9	Merge branch 'main' into alamb/update_deps	2021-02-25 13:24:32 +00:00
Raphael Taylor-Davies	ffc20fa821	feat: add basic gRPC health service (#862 ) * feat: add basic gRPC health service * feat: update README.md add /health to HTTP API * feat: add health client to influxdb_iox_client feat: end-to-end test health check service Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-02-25 13:24:12 +00:00
Andrew Lamb	d29d7efa8c	chore: Update arrow/datafusion deps again	2021-02-25 07:39:01 -05:00
kodiakhq[bot]	3502f96ab9	Merge branch 'main' into cn/google-list-with-delimiter	2021-02-22 19:42:49 +00:00
Jake Goulding	6e6cc616a0	refactor: Switch to parking_lot::Mutex	2021-02-22 13:51:31 -05:00
Jake Goulding	2bd09612bd	refactor: Enable parking lot for dependencies	2021-02-22 13:37:35 -05:00
Jake Goulding	6603ecd758	refactor: Unify on once_cell This is closest in API to what is likely to be added to the standard library at some point, so let's use it consistently.	2021-02-22 13:37:35 -05:00
Raphael Taylor-Davies	dd8b41cdb0	feat: encoding of standard gRPC error details payloads (#846 ) * feat: encoding of standard gRPC error details payloads * feat: return unknown error on failure to encode error details Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-02-22 18:03:17 +00:00
Carol (Nichols \|\| Goulding)	cff12da3a1	fix: Upgrade to released version of cloud_storage Fixes #801.	2021-02-22 13:01:06 -05:00
Carol (Nichols \|\| Goulding)	a42103f436	Merge remote-tracking branch 'origin/main' into cn/google-list-with-delimiter	2021-02-22 12:53:46 -05:00
Carol (Nichols \|\| Goulding)	57942b51b7	feat: Update to latest Azure sdk to get delimiter support Needed these PRs: - https://github.com/Azure/azure-sdk-for-rust/pull/176 - https://github.com/Azure/azure-sdk-for-rust/pull/179 Also needed to enable the queue feature to get the azure_storage crate compiling; at the moment, the code is still being reorganized and the features aren't independent yet: https://github.com/Azure/azure-sdk-for-rust/issues/177	2021-02-18 14:59:06 -05:00
Marko Mikulicic	536c1724bd	feat: Allow to put streams of unknown length to objectstore Addresses the API aspect of #818 Adds a utility module that helps compute the length of a stream while buffering it for later replay (in memory or spilled to a temporary file).	2021-02-18 16:49:18 +00:00
NGA TRAN	213094f8f7	chore: update Arrow dependencies	2021-02-18 10:02:57 -05:00
Carol (Nichols \|\| Goulding)	ef54131afb	feat: Gets google cloud list_with_delimiter tests passing	2021-02-17 14:23:33 -05:00
Andrew Lamb	071b13b939	chore: Update dependencies (#821 ) * chore: Update dependencies * fix: update udf implementation for DataFusion update Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-02-16 23:27:36 +00:00
Andrew Lamb	150baec84c	chore: Update arrow dependencies	2021-02-14 05:47:23 -05:00
Raphael Taylor-Davies	6d3c21b952	feat: Use Bytes instead of Vec<u8> in prost generated code Also add google.rpc.* Protobuf definitions	2021-02-12 17:10:58 +00:00
Andrew Lamb	b8f85967dd	feat: Enable/Disable logging in tests via RUST_LOG environment variable (#793 ) * feat: Enable/Disable logging in tests via RUST_LOG environment variable * docs: Add section to contributing * docs: tweak readme * fix: Use same logging system in tests as in influxdb_ioxd	2021-02-12 13:43:12 +00:00
Andrew Lamb	a03598dfe2	feat: Implement Cross Chunk Schema / RecordBatch merging at query time (#783 ) * feat: feat: Implement Cross Chunk Schema / RecordBatch merging at query time * docs: update comments about NullArray::new_with-type * docs: Update comments based on code review Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-02-11 18:26:38 +00:00
kodiakhq[bot]	a242fc39aa	Merge branch 'main' into jg/flight-client	2021-02-11 14:29:54 +00:00
Raphael Taylor-Davies	7debe94ee6	feat: add background task tracking (#655 )	2021-02-11 10:30:19 +00:00
Jake Goulding	699f4a577f	feat: Add an optional Flight client to the IOx client library	2021-02-10 10:30:05 -05:00
Raphael Taylor-Davies	143488fae9	feat: add WAL metadata endpoint (#724 )	2021-02-08 16:21:34 +00:00
Raphael Taylor-Davies	29314a6118	feat: consistent global error handling and logging	2021-02-04 13:15:17 +00:00
Andrew Lamb	d5ebf9c3da	chore: Update deps again (#738 )	2021-02-04 06:02:05 -05:00
Jake Goulding	a5e09366b0	feat: Export arrow-flight from arrow-deps	2021-02-03 09:56:56 -05:00
Carol (Nichols \|\| Goulding)	0f8ef9c7d5	Merge branch 'main' into cn+jg/osp-types	2021-02-03 09:09:04 -05:00
Andrew Lamb	abc26a33c1	chore: Update dependencies (again) (#718 ) * chore: Update dependencies (again) * refactor: update for changes in DataFusion API * fix: fmt * fix: clippy	2021-02-02 18:33:01 -05:00
Andrew Lamb	485a59b2f8	feat: Implement logfmt (Heroku) formatted log output (#716 ) * feat: add option to output logs formatted via logfmt * refactor: Apply suggestions from code review Co-authored-by: Edd Robinson <me@edd.io> * fix: add tests for span inclusion * feat: Also log spans * fix: bug in normalizer Co-authored-by: Edd Robinson <me@edd.io>	2021-02-01 16:43:01 -05:00
Carol (Nichols \|\| Goulding)	ff6955a433	refactor: Extract a trait for ObjectStoreApi with associated path This is the promised cleanup. This structure gets rid of a lot of intermediate structures and encodes through associated types how the object stores and path types are related. The enums are still necessary to avoid having generics leak all over the place, but the object store variants and path variants should always match because they'll always come from the object store trait implementations that use the associated types.	2021-02-01 14:56:47 -05:00
Andrew Lamb	f3bd8bd0e3	chore: update deps (tokio 1.0 and ecosystem) (#707 ) * chore: Update arrow + tokio deps * chore: Use bleeding edge azure * chore: Update aws + other deps * fix: fmt * fix: Switch to in-house version of routerify * fix: Upgrade to hyper 0.14 The hyper::error module is now private; hyper::Error is the public re-export * fix: Upgrade cloud storage to get tokio upgrade * fix: Upgrade open_telemetry * fix: Do not call `panic::set_hook` during another panic Doing so leads to a double panic which aborts the process. * fix: new h2 error who dis Co-authored-by: Carol (Nichols \|\| Goulding) <carol.nichols@integer32.com> Co-authored-by: Jake Goulding <jake.goulding@integer32.com>	2021-01-29 16:11:55 -05:00
Andrew Lamb	8308fad188	chore: update arrow deps again (#699 )	2021-01-26 07:55:30 -05:00
Andrew Lamb	9b6fbae7f5	chore: Bump arrow deps	2021-01-23 08:09:46 -05:00
Andrew Lamb	a967e2f1dd	fix: disallow control characters in Database names (#684 )	2021-01-21 17:55:55 -05:00
Andrew Lamb	c50f9b1baf	Merge branch 'main' into alamb/underscore_in_bucket_names_2	2021-01-21 15:53:27 -05:00
Andrew Lamb	747b96d801	chore: Upgrade arrow dependencies, reduce duplication with upstream (#676 )	2021-01-21 08:58:11 -05:00
Andrew Lamb	a6b9ff9c91	fix: allow arbitrary characters in org/bucket names	2021-01-20 17:58:15 -05:00
Andrew Lamb	7969808f09	feat: Chunk Migration APIs and query data in the read buffer via SQL (#668 ) * feat: Chunk Migration APIs and query data in the read buffer via SQL * fix: Make code more consistent * fix: fmt / clippy * chore: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * refactor: Remove unnecessary Result and make chunks() infallible * chore: Apply more suggestions from code review Co-authored-by: Edd Robinson <me@edd.io> Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> Co-authored-by: Edd Robinson <me@edd.io>	2021-01-19 13:28:26 -05:00
Andrew Lamb	71627120b9	refactor: consolidate line protocol schema creation into data_types and port code to use it (#663 ) * refactor: consolidate line protocol schema creation into data_types, and port code to use it refactor: Port mutable buffer to use SchemaBuilder * fix: doctest * refactor: remove unnecessary clippyisms * docs: Improve comments via suggestions from code review Co-authored-by: Edd Robinson <me@edd.io> * refactor: use more idiomatic try_ naming and TryInto trait * docs: Change from line protocol data model to InfluxDB data model * refactor: rename LP --> Influx in code * feat: add support for UInteger type Co-authored-by: Edd Robinson <me@edd.io>	2021-01-15 17:29:30 -05:00
Carol (Nichols \|\| Goulding)	813092649d	fix: Make file behave the same as other object stores with paths	2021-01-15 10:25:05 -05:00
Dom	a2c0434554	Merge branch 'main' into dom/iox-api-client	2021-01-14 14:35:17 +00:00
Paul Dix	9377dc1943	chore: address pr feedback	2021-01-13 18:15:35 -05:00
Paul Dix	3b1f045f44	feat: add serialization to wal buffer segments Adds serialization with compression and checksum for WAL buffer segments. This required a weird structure where the flatbuffer bytes of ReplicatedWrite were kept as a raw payload. I did this because otherwise each of the replicated writes would have been rebuilt in the segment. The other thing that isn't ideal is that deserializing a segment actually marshals it into a Rust struct as opposed to keeping the entire thing as raw flatbuffers. We could update this later to have a concept of an open segment (regular Rust struct) and closed segments that are just the flatbuffers.	2021-01-13 18:15:34 -05:00
Dom	5647bfeb6f	refactor: error handling & typed errors Refactors the API method errors. The user of the API client needs to be able to distinguish between various error states when an API request fails. The most ergonomic way of exposing this information is by returning an error enum that is specific to each API method (or at least the important ones with well defined failure modes) - currently only the `create_database()` method has significant error states, so this is the only one with a specific error type in this impl. This change defines a bunch of API error codes in the API client, adds them to the IOx API error response body, and maps them in the API client. Due to error wrapping the error code mapping in the IOx server is less exhaustive than I had hoped however.	2021-01-13 17:32:12 +00:00
Andrew Lamb	38abe9735f	Merge branch 'main' into dom/iox-api-client	2021-01-12 13:19:49 -05:00
Andrew Lamb	2938c8f8fc	feat: implement chunk listing and snapshotting in mutable buffer (#641 ) * feat: implement chunk listing and snapshotting in mutable buffer * fix: update to use latest version of string interner and remove custom clone * docs: fix comment	2021-01-12 12:46:18 -05:00
Dom	e06b989fa6	Merge branch 'main' into dom/iox-api-client	2021-01-12 17:10:51 +00:00
Andrew Lamb	c1a7778d85	refactor: move id and deps out of query crate (#646 )	2021-01-12 11:47:43 -05:00
Dom	62349edb94	feat: create IOx API client Initialises a new library crate and implements a basic IOx API client. The API client supports: - ping - create database Care has been taken to abstract away the underlying HTTP client used (reqwest) and avoid leaking it into the public API (error types are a common leak!) This makes updating the HTTP client and/or swapping it for something else a backwards compatible change for end users of the crate. Outstanding items: - move shared API types into a sensible location - discriminate between various IOx error responses The former doesn't need doing until we publish the crate and will likely be rather invasive / conflict-prone so aiming to merge this PR and then move things around in a follow-up. The latter would allow us to expose error conditions to the user such that they can take actions to remedy the situation / know if the request can or should be retried / etc. Currently we expose a string error message when requests fail, requiring string matching and/or passing the string higher in the stack (and thus punting the problem to the caller). It would be very nice to have typed errors, but a detail I have left for later.	2021-01-12 16:38:33 +00:00
Dom	bdc832d040	refactor: replace config system with structopt Replaces the hand-rolled config system with a StructOpt managed config struct. I've got most of it ported across, but the interaction between all the logging config bits is complex! I've left what is there and hooked in the value from the config struct (which directly replaces the env var in usage, as it also sources from the env).	2021-01-11 18:43:14 +00:00
Carol (Nichols \|\| Goulding)	b66ad643d5	refactor: Extract panic logging to its own crate for ease of reuse	2021-01-08 12:36:56 -05:00
Edd Robinson	4ce6821d90	feat: implement table_names on	2021-01-08 16:19:19 +00:00
Karsten Jeschkies	2cd383af6f	feat: Azure support for object store Closes #528 This patch adds support for Microsoft Azure Blob storage. The implementation requires an account, a key, and a container name. They can be configured via the environment variables `AZURE_STORAGE_ACCOUNT`, `AZURE_STORAGE_MASTER_KEY` and `AZURE_STORAGE_CONTAINER`.	2021-01-08 16:27:17 +01:00
Andrew Lamb	8219403fab	feat: Instantiate ReadBuffer as part of server creation (#620 ) * feat: Instantiate ReadBuffer as part of server creation * refactor: remove Store from read_buffer	2021-01-07 13:25:42 -05:00
Andrew Lamb	c672bb341d	feat: Extract SQL planning out of databases (#618 )	2021-01-07 13:13:30 -05:00
Carol (Nichols \|\| Goulding)	18ee1b561b	feat: Use ObjectStorePath everywhere to feel out the API needed	2021-01-07 10:48:22 -05:00
Paul Dix	cf56c1ba9e	feat: Add object store path abstraction	2021-01-07 09:19:50 -05:00
Paul Dix	4b40d11e60	feat: Add list_with_delimiter to object store This adds a new function list_with_delimiter to the object store. This commit contains just the implementation for S3, leaving the others to be completed in follow-on commits. This has a fixed delimiter to ensure a directory structure is created. This delimiter should be dependent on platform and which object store is used. For any of the cloud object stores or in memory, the delimiter should be /. For the future disk based implementation it should be dependent on whether you're running on Windows or Linux. I didn't use Stream for the return type because I found it difficult to work with and I don't think it actually added anything useful. The return ListResult struct has the next token and I prefer that the caller explicitly makes calls that go over the network so they're more aware of what's going on, where a Stream abstracts that away so it's hidden behind the scenes. We can easily add a Stream based version on top of this existing API if we want.	2021-01-07 09:19:15 -05:00
Andrew Lamb	9f0ff678f1	feat: Formalizes the config system for IOx, including tests (#608 ) * feat: Create configuration system, port IOx to use it * docs: Apply suggestions from code review Co-authored-by: Paul Dix <paul@influxdata.com> * fix: fix test for setting values Co-authored-by: Paul Dix <paul@influxdata.com>	2020-12-31 07:02:31 -05:00
Paul Dix	db6ce0503c	chore: Benchmark ReplicatedWrite (#607 ) This adds benchmarks to the data_types crate for ReplicatedWrite. This is the first in a series to test benchmarking Flatbuffers vs. JSON for the WAL Segment format.	2020-12-30 12:44:32 -05:00
Andrew Lamb	0d0ec0ce69	chore: Upgrade arrow dependencies (#603 ) * chore: Update arrow dependencies to latest * refactor: Update code to conform to new arrow api	2020-12-28 16:08:09 -05:00
Andrew Lamb	5fa77c32cc	feat: Add "Chunks" to the Mutable Buffer (#596 ) * refactor: Update docs, remove unused field * refactor: rename partition -> chunk * feat: Introduce new partition, which is a holder for Chunks * refactor: Remove use of wal from mutable database * refactor: cleanups, remove last direct use of chunks * fix: delete old benchmarks * fix: clippy sacrifice * docs: tidy up comments * refactor: remove unused error types * chore: remove commented out tests	2020-12-28 07:10:25 -05:00
Paul Dix	1d200c5c77	chore: move http API over to Routerify This moves the HTTP API over to Routerify, which has the basic route parsing logic that will enable the API design for IOx. I had a little trouble with the error handling in Routerify so I ended up creating a macro for constructing error responses in the HTTP API. I'm not sure what I think of this pattern so I'm interested in what others think. Another option would be to have two functions for each API endpoint. One which is x_handler with a Routerify function signature. Then another which is just x that has the Result<Response<Body>, ApplicationError> return type, which would make using the ? operator work in those functions. That would eliminate the need for the return_err macro. I'm happy to refactor to that if people prefer it.	2020-12-24 16:45:20 -06:00
Edd Robinson	199ba68769	refactor: rename segment_store crate to read_buffer	2020-12-22 21:26:04 +00:00
Andrew Lamb	48c43b136c	refactor: rename write_buffer --> mutable_buffer (#595 ) * refactor: git mv write_buffer mutable_buffer * refactor: update crate name references * refactor: update some more references	2020-12-22 10:49:53 -05:00
Andrew Lamb	bb96142564	chore: Update arrow dependencies, remove custom min/max implementation (#585 ) * chore: Update arrow dependency * fix: Update code for changes in datafusion * fix: use arrow version of min_boolean	2020-12-21 12:31:39 -05:00
Edd Robinson	7a40bd5971	perf: use hashbrown raw_entry API This commit swaps out the std library `HashMap` for the implementation provided by the `hashbrown` crate. Not only does this allow us to use the raw entry API, but it increases performance through the use of a faster non-cryptographically safe hashing function. We do not need an expensive hash function for this code path. Benchmark improvements are roughly 20-40%. Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20_columns_2_rows_500000 Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20_columns_2_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20_columns_2_rows_500000: Collecting 100 samples in estimated 6.5961 s (400 iterations) Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20_columns_2_rows_500000: Analyzing segment_read_group_all_time_vary_cardinality/cardinality_20_columns_2_rows_500000 time: [16.502 ms 16.527 ms 16.558 ms] thrpt: [1.2079 Kelem/s 1.2101 Kelem/s 1.2120 Kelem/s] change: time: [-40.808% -40.616% -40.428%] (p = 0.00 < 0.05) thrpt: [+67.863% +68.394% +68.942%] Performance has improved. 
Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_200_columns_2_rows_500000 Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_200_columns_2_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_200_columns_2_rows_500000: Collecting 100 samples in estimated 5.0698 s (300 iterations) Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_200_columns_2_rows_500000: Analyzing segment_read_group_all_time_vary_cardinality/cardinality_200_columns_2_rows_500000 time: [16.531 ms 16.542 ms 16.555 ms] thrpt: [12.081 Kelem/s 12.090 Kelem/s 12.099 Kelem/s] change: time: [-43.304% -43.047% -42.810%] (p = 0.00 < 0.05) thrpt: [+74.856% +75.582% +76.378%] Performance has improved. Found 8 outliers among 100 measurements (8.00%) 5 (5.00%) high mild 3 (3.00%) high severe Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_2000_columns_2_rows_500000 Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_2000_columns_2_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_2000_columns_2_rows_500000: Collecting 100 samples in estimated 5.2590 s (300 iterations) Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_2000_columns_2_rows_500000: Analyzing segment_read_group_all_time_vary_cardinality/cardinality_2000_columns_2_rows_500000 time: [17.497 ms 17.568 ms 17.648 ms] thrpt: [113.33 Kelem/s 113.84 Kelem/s 114.30 Kelem/s] change: time: [-38.468% -38.188% -37.880%] (p = 0.00 < 0.05) thrpt: [+60.978% +61.782% +62.518%] Performance has improved. 
Found 12 outliers among 100 measurements (12.00%) 12 (12.00%) high severe Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20000_columns_3_rows_500000 Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20000_columns_3_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20000_columns_3_rows_500000: Collecting 100 samples in estimated 7.0471 s (300 iterations) Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20000_columns_3_rows_500000: Analyzing segment_read_group_all_time_vary_cardinality/cardinality_20000_columns_3_rows_500000 time: [23.305 ms 23.320 ms 23.336 ms] thrpt: [857.05 Kelem/s 857.64 Kelem/s 858.20 Kelem/s] change: time: [-35.933% -35.778% -35.648%] (p = 0.00 < 0.05) thrpt: [+55.396% +55.711% +56.087%] Performance has improved. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_2_rows_500000 Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_2_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_2_rows_500000: Collecting 100 samples in estimated 6.8058 s (300 iterations) Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_2_rows_500000: Analyzing segment_read_group_all_time_vary_columns/cardinality_20000_columns_2_rows_500000 time: [22.475 ms 22.540 ms 22.622 ms] thrpt: [884.10 Kelem/s 887.31 Kelem/s 889.87 Kelem/s] change: time: [-34.249% -34.051% -33.768%] (p = 0.00 < 0.05) thrpt: [+50.984% +51.633% +52.089%] Performance has improved. 
Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) high mild 9 (9.00%) high severe Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_3_rows_500000 Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_3_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_3_rows_500000: Collecting 100 samples in estimated 7.0631 s (300 iterations) Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_3_rows_500000: Analyzing segment_read_group_all_time_vary_columns/cardinality_20000_columns_3_rows_500000 time: [23.683 ms 23.724 ms 23.779 ms] thrpt: [841.08 Kelem/s 843.02 Kelem/s 844.49 Kelem/s] change: time: [-34.575% -34.419% -34.241%] (p = 0.00 < 0.05) thrpt: [+52.070% +52.482% +52.847%] Performance has improved. Found 9 outliers among 100 measurements (9.00%) 6 (6.00%) high mild 3 (3.00%) high severe Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_4_rows_500000 Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_4_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_4_rows_500000: Collecting 100 samples in estimated 5.1007 s (200 iterations) Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_4_rows_500000: Analyzing segment_read_group_all_time_vary_columns/cardinality_20000_columns_4_rows_500000 time: [25.379 ms 25.456 ms 25.545 ms] thrpt: [782.93 Kelem/s 785.67 Kelem/s 788.06 Kelem/s] change: time: [-37.254% -36.988% -36.701%] (p = 0.00 < 0.05) thrpt: [+57.981% +58.699% +59.373%] Performance has improved. 
Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) high mild 8 (8.00%) high severe Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_250000 Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_250000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_250000: Collecting 100 samples in estimated 5.7756 s (400 iterations) Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_250000: Analyzing segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_250000 time: [14.404 ms 14.411 ms 14.419 ms] thrpt: [1.3870 Melem/s 1.3878 Melem/s 1.3885 Melem/s] change: time: [-28.007% -27.893% -27.798%] (p = 0.00 < 0.05) thrpt: [+38.500% +38.683% +38.903%] Performance has improved. Found 7 outliers among 100 measurements (7.00%) 3 (3.00%) high mild 4 (4.00%) high severe Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_500000 Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_500000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_500000: Collecting 100 samples in estimated 6.9256 s (300 iterations) Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_500000: Analyzing segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_500000 time: [23.191 ms 23.299 ms 23.419 ms] thrpt: [854.02 Kelem/s 858.42 Kelem/s 862.40 Kelem/s] change: time: [-32.647% -32.302% -31.912%] (p = 0.00 < 0.05) thrpt: [+46.868% +47.715% +48.471%] Performance has improved. 
Found 11 outliers among 100 measurements (11.00%) 11 (11.00%) high severe Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_750000 Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_750000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_750000: Collecting 100 samples in estimated 6.1544 s (200 iterations) Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_750000: Analyzing segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_750000 time: [30.813 ms 30.859 ms 30.916 ms] thrpt: [646.92 Kelem/s 648.10 Kelem/s 649.07 Kelem/s] change: time: [-37.155% -36.779% -36.436%] (p = 0.00 < 0.05) thrpt: [+57.322% +58.174% +59.121%] Performance has improved. Found 12 outliers among 100 measurements (12.00%) 5 (5.00%) high mild 7 (7.00%) high severe Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_1000000 Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_1000000: Warming up for 3.0000 s Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_1000000: Collecting 100 samples in estimated 7.8548 s (200 iterations) Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_1000000: Analyzing segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_1000000 time: [39.303 ms 39.349 ms 39.405 ms] thrpt: [507.55 Kelem/s 508.27 Kelem/s 508.86 Kelem/s] change: time: [-36.857% -36.699% -36.576%] (p = 0.00 < 0.05) thrpt: [+57.669% +57.975% +58.371%] Performance has improved. Found 14 outliers among 100 measurements (14.00%) 8 (8.00%) high mild 6 (6.00%) high severe	2020-12-17 17:15:49 +00:00
Edd Robinson	0d60102c74	feat: make group keys comparable and results sortable This commit provides functionality on top of the `GroupKey` type (a vector of materialised values), which allows them to be comparable by implementing `Ord`. Then, using the `permutation` crate, it is possible to sort all rows in a result set based on the group keys, which will be useful for testing.	2020-12-17 11:10:26 +00:00
Andrew Lamb	a6d2c13888	chore: Update arrow + other dependencies (#540 ) * chore: Update arrow + other dependencies * chore: Update write_buffer and query crate	2020-12-15 08:46:27 -05:00
Dom	2d29b985b4	chore(deps): remove env_logger from ingest Already using tracing!	2020-12-14 12:06:53 +00:00
Dom	60ee7e1dbb	chore(deps): remove unused env_logger	2020-12-14 12:06:53 +00:00
Dom	9d7389dec2	feat(tracing): add Jaeger tracing sink Adds telemetry / tracing with support for a Jaeger backend, and changes the logger from env_logger to a tracing subscriber to collect the log entries. Events are batched and then emitted asynchronously via UDP to the Jaeger collector using the tokio runtime. There's a bunch of settings (env vars) related to batch sizes and flush frequency etc - they're all using their default values at the moment (if it ain't broke...) See the docs for more info: https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/sdk-environment-variables.md#opentelemetry-environment-variable-specification This is only part 1 of telemetry - it does NOT propagate traces across RPC boundaries as we're still defining how all this should work. I've created #541 to track this. Closes #202 and closes #203.	2020-12-14 12:06:52 +00:00
Edd Robinson	5e138bcded	refactor: return groups as vectors	2020-12-10 15:15:34 +00:00
Edd Robinson	fe27690ca8	test: add benchmarks for specific read_group path This commit adds benchmarks to track the performance of `read_group` when aggregating across columns that support pre-computed bit-sets of row_ids for each distinct column value. Currently this is limited to the RLE columns, and only makes sense when grouping by low-cardinality columns. The benchmarks are in three groups: * one group fixes the number of rows in the segment but varies the cardinality (that is, how many groups the query produces). * another group fixes the cardinality and the number of rows but varies the number of columns needed to be grouped to produce the fixed cardinality. * a final group fixes the number of columns being grouped, the cardinality, and instead varies the number of rows in the segment. Some initial results from my development box are as follows: ``` time: [51.099 ms 51.119 ms 51.140 ms] thrpt: [39.108 Kelem/s 39.125 Kelem/s 39.140 Kelem/s] Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe segment_read_group_pre_computed_groups_no_predicates_group_cols/1 time: [93.162 us 93.219 us 93.280 us] thrpt: [10.720 Kelem/s 10.727 Kelem/s 10.734 Kelem/s] Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe segment_read_group_pre_computed_groups_no_predicates_group_cols/2 time: [571.72 us 572.31 us 572.98 us] thrpt: [3.4905 Kelem/s 3.4946 Kelem/s 3.4982 Kelem/s] Found 12 outliers among 100 measurements (12.00%) 5 (5.00%) high mild 7 (7.00%) high severe Benchmarking segment_read_group_pre_computed_groups_no_predicates_group_cols/3: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.9s, enable flat sampling, or reduce sample count to 50. 
segment_read_group_pre_computed_groups_no_predicates_group_cols/3 time: [1.7292 ms 1.7313 ms 1.7340 ms] thrpt: [1.7301 Kelem/s 1.7328 Kelem/s 1.7349 Kelem/s] Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low mild 6 (6.00%) high mild 1 (1.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/250000 time: [562.29 us 565.19 us 568.80 us] thrpt: [439.52 Melem/s 442.33 Melem/s 444.61 Melem/s] Found 18 outliers among 100 measurements (18.00%) 6 (6.00%) high mild 12 (12.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/500000 time: [561.32 us 561.85 us 562.47 us] thrpt: [888.93 Melem/s 889.92 Melem/s 890.76 Melem/s] Found 11 outliers among 100 measurements (11.00%) 5 (5.00%) high mild 6 (6.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/750000 time: [573.75 us 574.27 us 574.85 us] thrpt: [1.3047 Gelem/s 1.3060 Gelem/s 1.3072 Gelem/s] Found 13 outliers among 100 measurements (13.00%) 5 (5.00%) high mild 8 (8.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/1000000 time: [586.36 us 586.74 us 587.19 us] thrpt: [1.7030 Gelem/s 1.7043 Gelem/s 1.7054 Gelem/s] Found 9 outliers among 100 measurements (9.00%) 4 (4.00%) high mild 5 (5.00%) high severe ```	2020-12-10 15:15:34 +00:00
Edd Robinson	8c45170a15	feat: read group aggregates on RLE columns	2020-12-10 15:15:34 +00:00
Paul Dix	fa3ecbd4ed	feat: Implement write buffer to Parquet snapshotting (#526 ) * feat: Implement write buffer to Parquet snapshotting This introduces snapshot to the server packages to manage snapshotting. It also introduces a new trait for representing a Partition. There is a very crude API wired up in http_routes for testing purposes. Follow on work will bring the server package into http_routes and rework the snapshot API.	2020-12-08 14:20:43 -05:00
Dom	c3a0e893ae	test: use flate2	2020-12-01 11:01:10 +00:00
Dom	867aba847a	perf(convert): use flate2 for gzip decompression Switches from `libflate` to `flate2` for the top-level commands (specifically TSM conversion).	2020-11-30 15:18:25 +00:00
Andrew Lamb	3a9ee88f00	chore: update to latest version of arrow + update code (#486 ) * chore: update to latest version of arrow + update code * chore: Update rust toolchain to match arrow * fix: clippy	2020-11-25 14:46:35 -05:00
Matt Freitas-Stavola	7e2df1fc59	chore(server): add logs for dropped WAL segments (#478 ) * chore(server): add logs for dropped WAL segments Added logging for dropped writes and old segments in rollover scenarios Also including a dep on tracing and dev-dep on test_helpers Refs: #466 * chore(server): Add more context to logs Minor cleanup around remove_oldest_segment usage Suggestions from @alamb's review	2020-11-24 16:37:09 -05:00
Andrew Lamb	cdb26e60e4	refactor: rename `storage` crate to `query` to better reflect what it is (#475 ) * refactor: rename storage --> query * refactor: update a few more references	2020-11-24 14:19:29 -05:00
Paul Dix	5101e52434	Merge pull request #464 from influxdata/pd-wal_buffer-main feat: Implement WAL in-memory buffer	2020-11-20 11:16:30 -05:00
Paul Dix	0deee2c0db	feat: Implement WAL in-memory buffer This splits the cluster package out into server and buffer modules. The WAL buffer is in-memory and split into segments. Follow on commits will implement it in the server and add persistence to object storage.	2020-11-19 19:35:17 -05:00
Andrew Lamb	cad5f9166b	feat: Port Duration and Window logic to support window aggregates (#460 ) * feat: Port enough of Window and Duration to implement window_bounds * fix: clippy * fix: Add a few more source links * fix: Eust --> Rust in comments :( * fix: add comments about remainder, and add test demonstrating behavior * fix: Apply suggestions from code review	2020-11-18 09:49:59 -05:00
Andrew Lamb	831a0875d6	chore: update to latest arrow + Rust nightly-2020-11-14 (#454 ) * chore: update to latest arrow + Rust nightly-2020-11-14 * chore: update ci * fix: update for clippy lints * fix: Allow redundant_field_names in generated types crate * fix: clippy about try_for_each * fix: clippy unneeded-collect * fix: clippy about default values * fix: clippy matches --> matches! * fix: clippy sort --> sort_by_key * fix: clippy about default values again	2020-11-16 11:48:42 -05:00
Andrew Lamb	2fa0e03162	fix: Use datafusion optimizer in IOx query plans (#439 ) * chore: update arrow dep to `8e4d9ebef3` * fix: checkin Cargo.lock * fix: Enable datafusion optimizer, use display_indent_schema	2020-11-11 18:06:21 -05:00
Edd Robinson	0958849956	chore: rename the segment store crate	2020-11-10 16:35:17 +00:00
Edd Robinson	ab458b5f17	refactor: address PR feedback	2020-11-06 17:28:27 +00:00
Andrew Lamb	5bb530ccc6	refactor: rename tsm --> influxdb_tsm (#418 )	2020-11-05 14:35:38 -05:00
Andrew Lamb	b745a180a4	refactor: rename delorean --> InfluxDB IOx (#417 )	2020-11-05 13:51:04 -05:00
Andrew Lamb	a52e0001c5	refactor: rename all crates that start with `delorean_` in preparation for rename (#415 ) * refactor: rename delorean_cluster --> cluster * refactor: rename delorean_generated_types --> generated_types * refactor: rename delorean_write_buffer --> write_buffer * refactor: rename delorean_ingest --> ingest * refactor: rename delorean_storage --> storage * refactor: rename delorean_tsm --> tsm * refactor: rename delorean_test_helpers --> test_helpers * refactor: rename delorean_arrow --> arrow_deps * refactor: rename delorean_line_parser --> influxdb_line_protocol	2020-11-05 13:44:36 -05:00
Andrew Lamb	9df6c24493	refactor: rename delorean_mem_qe --> mem_qe (#414 )	2020-11-05 09:36:46 -05:00
Andrew Lamb	4f348836fe	refactor: remove delorean_parquet by combining with delorean_ingest (#412 )	2020-11-05 09:29:59 -05:00
Andrew Lamb	ff824a5477	refactor: rename delorean_wal --> wal, consolidate wal_writer (#411 )	2020-11-05 09:25:29 -05:00
Andrew Lamb	a3b88d5506	refactor: rename delorean_object_store --> object_store (#413 )	2020-11-05 08:56:30 -05:00
Andrew Lamb	8399d2a159	refactor: rename delorean_table to packers (#409 )	2020-11-05 08:52:22 -05:00
Andrew Lamb	075ba0d8d1	refactor: remove delorean_table_schema crate and fold it into data_types (#408 )	2020-11-05 06:17:20 -05:00
Carol (Nichols \|\| Goulding)	7d25dc8487	fix: Remove unused arrow dependency in delorean_ingest This wasn't really causing any problems, just confusion, because the old arrow and its deps were in the Cargo.lock.	2020-11-04 15:34:34 -05:00
Andrew Lamb	bf0c58698e	refactor: rename delorean_data_types crate to data_type (#407 ) * refactor: rename delorean_data_types crate to data_type - #401 * fix: fmt	2020-11-04 12:33:41 -05:00
Andrew Lamb	9f36914351	chore: Upgrade version of Arrow / DataFusion (3 of 3) + update code for new interfaces (#395 )	2020-11-02 11:20:44 -05:00
Paul Dix	1e966b5153	feat: implement API for storing the server configuration in object storage This adds basic API calls for persisting and loading the server configuration of database rules and host groups to and from object storage. It stores all the data in a single JSON file.	2020-10-26 13:43:43 -06:00
Andrew Lamb	ef501871bb	feat: remove partition_store (#387 )	2020-10-26 14:39:38 -04:00
Andrew Lamb	4e1e8dbf79	chore: Upgrade version of Arrow/DataFusion (2 of 3) (#391 ) * chore: Upgrade version of Arrow/DataFusion (2 of 3) * fix: Fixup error type usage and use async stream interface * fix: post merge fixups	2020-10-26 13:49:16 -04:00
Andrew Lamb	88b9f43110	chore: Upgrade version of Arrow/DataFusion (1 of 3) (#390 ) * chore: Upgrade version of Arrow/DataFusion * fix: update code for deps	2020-10-26 11:46:02 -04:00
Andrew Lamb	1004854403	refactor: remove unneeded dependencies, switch to tracing from log (#388 )	2020-10-26 06:15:47 -04:00
Andrew Lamb	0ef76db208	feat: implement series_query for write buffer database, tests for same (#360 ) * feat: implement series_query for write buffer database, tests for same * fix: fixup comments * fix: sort field columns too	2020-10-15 17:23:14 -04:00
Paul Dix	262a988207	Merge pull request #357 from influxdata/pd-cluster_replicate chore: refactor cluster to use in memory write buffer	2020-10-14 09:43:02 -04:00
Paul Dix	9a345e226c	chore: refactor cluster to use in memory write buffer This refactors cluster to use the in memory write buffer. It removes the injected DatabaseStore as it is no longer needed.	2020-10-14 08:36:49 -04:00
Edd Robinson	6091963d50	test: skip NaN test for now	2020-10-14 13:21:15 +01:00
Edd Robinson	74ed1904c9	feat: fixed encoding for non-null numerics	2020-10-14 13:18:42 +01:00
Andrew Lamb	206df6a325	feat: implement data fusion execution and conversion to series sets (#353 )	2020-10-13 16:53:00 -04:00
Paul Dix	a80eb0fed3	feat: Store replicated writes This commit refactors the flatbuffers data types from the wal to a new crate where they can be used by storage, write buffer, and cluster. It also refactors cluster to move the configuration types out to the data types crate so they can be used across storage and elsewhere. Finally, it adds a new method to store replicated writes on a database in the database trait and implements it.	2020-10-11 15:45:08 -04:00
Paul Dix	996f8905b6	feat: Implement partition templates and key generation This commit implements partition templates as a struct that can be serialized and deserialized. It is comprised of parts that can include the table name, a column name and its value, a formatted time, or a string column and regex captures of its value.	2020-10-10 11:32:17 -04:00
Paul Dix	cceeebb317	Merge pull request #342 from influxdata/pd-cluster-updates feat: Update cluster with replication and subscriptions	2020-10-09 07:41:32 -04:00
Andrew Lamb	2b8c04f2b4	chore: Update arrow (again) to pick up latest changes to datafusion (#345 )	2020-10-09 07:17:02 -04:00
Andrew Lamb	a72e608810	feat: enable simd in arrow (#343 )	2020-10-08 11:21:22 -04:00
Paul Dix	05dcbd7236	feat: Update cluster with replication and subscriptions This updates cluster so that the concept of replication and subscriptions for handling queries are separated. It also adds flatbuffer structure that can be used as a common format for replication.	2020-10-08 08:40:13 -04:00
Andrew Lamb	bc5378c7fe	chore: Update arrow to latest version (#335 ) * chore: Update arrow to latest version * fix: Updates needed by new version of datafusion	2020-10-02 14:46:07 -04:00
Andrew Lamb	ff29610e44	refactor: Switch back to https://github.com/apache/arrow (#333 )	2020-10-01 16:57:12 -04:00
Andrew Lamb	2b98da593b	feat: write_database support for predicates (#326 ) * feat: write_database support for predicates * fix: temporarily pull in arrow fork to pick up fix for ARROW-10136 * fix: Update mutex usage based on PR feedback * fix: more mutex polish and use OptionExt * fix: update comments * fix: rust-fu the table lookup * fix: update docs * fix: more idiomatic rust types * fix: better usage of reference types	2020-10-01 14:34:53 -04:00
Edd Robinson	a2287acb7c	Merge pull request #330 from influxdata/er/feat/segment-store-shell feat: Segment Store shell	2020-10-01 14:01:45 +01:00
Edd Robinson	bd6b0db691	refactor: address PR feedback	2020-10-01 13:13:32 +01:00
Paul Dix	fdc86fd186	feat: add some initial framework for clustering (#329 )	2020-09-30 14:41:42 -04:00
Andrew Lamb	8a14896487	chore: update version of datafusion (#324 ) * chore: update version of datafusion * chore: Update interfaces to be async	2020-09-30 08:02:15 -04:00
Edd Robinson	2470bdb975	feat: segment store shell	2020-09-30 11:25:59 +01:00
Andrew Lamb	da5c74d3c6	feat: storage interface plans + executor (#318 ) * feat: storage interface plans + executor * refactor: less `expect` * fix: use more idiomatic rust From	2020-09-28 11:41:10 -04:00
Andrew Lamb	0236522dfa	feat: Send panic information to tracing events (#313 ) * feat: Send panic information to tracing events * fix: PR Review improvements * fix: PR comments * fix: Apply suggestions from code review Co-authored-by: Jake Goulding <jake.goulding@integer32.com> * fix: more fixes * fix: clarify /cleanup drop more Co-authored-by: Jake Goulding <jake.goulding@integer32.com>	2020-09-25 14:55:58 -04:00
Edd Robinson	ec1aaa3a47	chore: update dependencies	2020-09-25 17:22:48 +01:00
Edd Robinson	9eee0c2852	refactor: make clippy happy	2020-09-25 10:12:46 +01:00
Edd Robinson	c42d2dcd79	refactor: rebase with delorean_arrow	2020-09-25 10:12:46 +01:00
Edd Robinson	d0f3cae9b3	feat: add tag values schema API	2020-09-25 10:12:46 +01:00
Edd Robinson	47b2f7940b	refactor: spike on arrow encoding	2020-09-25 10:12:46 +01:00
Edd Robinson	e5f9c7c574	refactor: add encoding trait	2020-09-25 10:12:46 +01:00
alamb	54e9d38589	chore: update the refs to github	2020-09-25 10:12:46 +01:00
alamb	41899203d9	refactor: implement a prototype datafusion integration layer demonstration	2020-09-25 10:12:46 +01:00
alamb	820277a529	feat: load segments from parquet	2020-09-25 10:12:46 +01:00
alamb	acfef35a0e	feat: load segments from parquet	2020-09-25 10:12:46 +01:00
alamb	7f815099d0	feat: Read from parquet rather than arrow	2020-09-25 10:12:46 +01:00
Edd Robinson	a5a8667a42	feat: group by sorting	2020-09-25 10:12:46 +01:00
Edd Robinson	231f429a56	feat: sort group by measurement	2020-09-25 10:12:46 +01:00
Edd Robinson	2387b7c849	feat: add support for group by aggregate	2020-09-25 10:12:46 +01:00
Edd Robinson	aba02cb731	feat: basic store	2020-09-25 10:12:46 +01:00
Andrew Lamb	77f58efca7	chore: update Arrow/Parquet/DataFusion versions, consolidate references into new crate (#309 ) * chore: consolidate all arrow/parquet/datafusion dependencies * chore: update datafusion version	2020-09-24 08:46:54 -04:00
Andrew Lamb	498478c066	refactor: rename delorean_storage_interface to delorean_storage (#308 )	2020-09-22 17:18:53 -04:00
Andrew Lamb	d0f2902c8d	feat: implement tag_keys and measurement_tag_keys (#307 ) * feat: implement tag_keys and measurement_tag_keys * fix: fix timestamp bound evaluation	2020-09-22 16:42:45 -04:00
Jake Goulding	648d42568d	feat: Add a benchmark for restoring the WAL	2020-09-18 16:45:01 -04:00
alamb	2418ee5ab0	refactor: move partitioned_store into its own module	2020-09-18 08:12:19 -04:00
Andrew Lamb	642b1b4370	refactor: move write_buffer to delorean_write_buffer crate (#299 )	2020-09-18 08:11:48 -04:00
Andrew Lamb	d2c24ef7af	refactor: pull storage interface into delorean_storage_interface (#298 )	2020-09-18 07:58:19 -04:00
Andrew Lamb	5fe3bfd53c	refactor: extract WalDetails into delorean_wal_writer crate (#297 )	2020-09-18 07:47:37 -04:00
Carol (Nichols \|\| Goulding)	596c987956	feat: Compress WAL entries with Snappy Fixes #276.	2020-09-14 09:42:54 -04:00
Andrew Lamb	82d5f485c3	test: traits for database and tests for http handler (#284 ) * test: traits for database and tests for http handler * refactor: Use generics and trait bounds instead of trait objects * refactor: Replace trait objects with an associated type * refactor: Extract an associated Error type on the Database traits * refactor: Remove some explicit conversions to_string that Snafu takes care of * docs: add comments * refactor: move traits into storage module Co-authored-by: Carol (Nichols \|\| Goulding) <carol.nichols@integer32.com>	2020-09-11 17:42:00 -04:00
alamb	9b9ff484bb	fix: implement escaping	2020-09-11 17:14:35 -04:00
Paul Dix	8ed3a1b440	feat: Initial prototype of WriteBuffer and WAL (#271 ) This is the initial prototype of the WriteBuffer and WAL. This does the following: * accepts a slice of ParsedLine into the DB * writes those into an in memory structure with tags represented as u32 dictionaries and all field types supported * persists those writes into the WAL as Flatbuffer blobs (one WAL entry per slice of lines written, or WriteBatch) * has a method to return a table from the buffer as an Arrow RecordBatch * recovers the WAL after the database is closed and opened back up again * has a single test that covers the end-to-end from the DB side * It doesn't include partitioning yet. Although the write_lines method does actually try to do partitions on time. That'll get changed to be something more general defined by a per database configuration. * hooked up to the v2 HTTP write API * hooked up to a read API which will execute a SQL query against the data in the buffer This includes a refactor of the WAL: Refactors the WAL to remove async and threading so that it can be moved higher up. This simplifies the API while keeping just about the same amount of code in PartitionStore to handle the asynchronous writes. This also modifies the WAL to remove the SideFile implementation, which was causing significant performance problems and write amplification. The downside is that WAL writes are no longer guaranteed atomic. Further, this modifies the WAL to keep the active segment file handle open. Appends now don't have to list the directory contents and look for the latest file and open the file handle to do appends, which should also improve performance and reduce iops.	2020-09-08 14:12:16 -04:00
Carol (Nichols \|\| Goulding)	d59702ec79	feat: Make the create bucket HTTP API match the Influx 2.0 API The `/api/v2/create_bucket` API was delorean-specific for testing purposes. This change makes it match the [Influx 2.0 API][influx] and adds a method to the client for creating buckets. The client will always send an empty array of `retentionRules` because that is a required parameter for the Influx API. Delorean always ignores `retentionRules`. The `description` and `rp` parameters are optional and are never sent. [influx]: https://v2.docs.influxdata.com/v2.0/api/#operation/PostBuckets I believe the gRPC create bucket is also delorean-specific and perhaps not needed, but I'm leaving it in for now with a note.	2020-08-12 10:08:32 -04:00
Edd Robinson	21c0155271	fix: improve pivot for certain sorts	2020-08-04 21:33:58 +01:00
Carol (Nichols \|\| Goulding)	19159138cc	fix: Turn off default features of parquet so arrow-flight doesn't repeatedly rebuild Fixes #261	2020-07-30 09:43:12 -04:00
alamb	f946e84a12	chore: revert upgrade parquet dependency to 1.0.0 This reverts commit `25259b4c99`.	2020-07-30 07:02:53 -04:00
alamb	25259b4c99	chore: upgrade parquet dependency to 1.0.0	2020-07-28 15:11:35 -04:00
Carol (Nichols \|\| Goulding)	0709f90040	test: Add a mock server test in the client crate for the newline bug	2020-07-27 14:10:54 -04:00
Jake Goulding	b72c2ffd73	Merge pull request #253 from influxdata/client-dynamic-data-point	2020-07-24 09:50:11 -04:00
Carol (Nichols \|\| Goulding)	c179a7e8b2	fix: Remove generate/seed utilities These are going to be redone in the fusion repo.	2020-07-22 17:15:30 -04:00
Jake Goulding	f8304e6e6b	feat: Add a dynamic type to construct data points for ingestion	2020-07-22 17:03:29 -04:00
Andrew Lamb	143c350ecb	Merge pull request #250 from influxdata/alamb/feat-multi-col-stats feat: Update stats command to handle directories of files	2020-07-20 16:48:31 -04:00
alamb	ca1bd79902	feat: Update stats command to handle directories of files	2020-07-17 16:47:11 -04:00
Carol (Nichols \|\| Goulding)	668aefae9b	feat: Implement a rudimentary write API in the influx client	2020-07-17 10:28:19 -04:00
Carol (Nichols \|\| Goulding)	7ed24241b5	feat: Set up an InfluxData 2.0 client crate	2020-07-17 10:27:33 -04:00
Carol (Nichols \|\| Goulding)	b3a16c080f	feat: Update croaring Jake dug into why the end-to-end tests fail with delorean running in the Docker image I built, and it appears to be a crash with an illegal instruction from CRoaring. We think it's this issue: https://github.com/saulius/croaring-rs/pull/62 which was merged and released, so let's try updating CRoaring.	2020-07-08 08:49:28 -04:00
Edd Robinson	831f647b9d	feat: implement escaped tsm key parsing	2020-07-04 08:46:45 -04:00
Edd Robinson	06e9fae845	fix: ignore conflicting field types Fixes #205.	2020-06-30 18:08:05 +01:00
Andrew Lamb	97a5eb7e19	Merge pull request #197 from influxdata/alamb/log-requests feat: Log gRPC calls using trace crate, allow custom log levels	2020-06-30 10:47:11 -04:00
alamb	283d6691c6	feat: enable rpc debug tracing, tweaked logging levels, respect RUST_FMT env var	2020-06-29 09:59:22 -04:00
Jake Goulding	ad1e3d04bb	feat: Add a local filesystem implementation of the object store	2020-06-29 08:48:48 -04:00
Edd Robinson	d15256e0e7	refactor: address PR feedback	2020-06-26 12:08:42 +01:00
Edd Robinson	9d889828c3	fix: ensure all rows are emitted for each column	2020-06-26 11:50:37 +01:00
alamb	68ce351a3a	refactor: remove direct parquet dependency from delorean_ingest	2020-06-23 16:58:31 -04:00
Carol (Nichols \|\| Goulding)	d7dbf061cb	feat: Implement String encoding/decoding Fixes #148.	2020-06-22 15:15:34 -04:00
Edd Robinson	106bd69b5a	feat: support converting from TSM->Parquet	2020-06-22 18:56:17 +01:00
Edd Robinson	85e0b4ec16	refactor: hoist tsm reader into own crate	2020-06-22 18:56:17 +01:00
Edd Robinson	ac7bb6bf68	refactor: make Packer generic	2020-06-22 11:24:29 +01:00
Jake Goulding	bfb0213ac3	feat: Update Rusoto to allow streaming data on uploads	2020-06-19 09:18:44 -04:00
Andrew Lamb	8185c80c03	fix: fix logical merge conflict (#169 )	2020-06-18 18:51:25 -04:00
Andrew Lamb	ae37548980	feat: Add support for parsing string values in line protocol parser (#155 ) * feat: add debug logging on parser error * feat: Add support for parsing string values in line protocol parser * fix: Fix comment * fix: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com>	2020-06-18 12:44:17 -04:00
Andrew Lamb	cf248f2143	feat: upgrade to latest arrow / byteorder (#154 )	2020-06-17 12:50:23 -04:00
Carol (Nichols \|\| Goulding)	d83c410a5c	feat: Update to the released version of cloud-storage My submitted API improvements got merged in!	2020-06-10 17:23:52 -04:00
Carol (Nichols \|\| Goulding)	d3283b1096	feat: Object storage in S3 and GCS	2020-06-10 17:23:52 -04:00
Andrew Lamb	faf3f534ac	refactor: move all dstool code into delorean binary (#131 ) * refactor: move all dstool code into delorean binary * fix: Move code/mods to make it compile and run * fix: warn if db dir does not exist * refactor: Match argument subcommands w/ more idiomatic rust * fix: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> fix: restore hyper logging fix: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * fix: update expected code Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com>	2020-06-10 16:04:46 -04:00
Andrew Lamb	0415b233ec	refactor: Instantiate the table writer on demand (#128 ) * refactor: instantiate ParquetWriter on demand, prep for multi measurements * fix: doc test * fix: update names	2020-06-09 16:11:42 -04:00
Andrew Lamb	986e12d62a	refactor: Rename crate line_protocol_schema --> delorean_table_schema (#129 ) * refactor: Rename crate line_protocol_schema --> delorean_table_schema * fix: fmt	2020-06-09 11:56:16 -04:00
Andrew Lamb	f1a3058b24	feat: Add file / metadata inspection + dumping with dstool (#112 ) * feat: Add file / metadata inspection + dumping * fix: apply some PR review comments * fix: apply suggestions from code review Co-authored-by: Jake Goulding <jake.goulding@integer32.com> * feat: Add tests, rearrange code into modules, add gzip aware interface * fix: fix comment and test * fix: test output and fmt Co-authored-by: Jake Goulding <jake.goulding@integer32.com>	2020-06-09 10:10:55 -04:00
Andrew Lamb	8475b6d183	feat: Add parquet writer, hook up conversion in dstool (#124 ) * feat: Add parquet writer, hook up conversion in dstool * fix: use bigger executor for test * fix: less cloning * fix: make unsupported messages less pejorative * fix: fmt * fix: Rename writer and do not require std::File, add example * fix: clippy and fmt * fix: remove unnecessary module in end to end tests * fix: remove strange use of tempfile * fix: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * fix: Apply suggestions from code review Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * fix: cleanup use * fix: Use more specific error messages * fix: comment tweak * fix: touchup temp path creation * fix: clippy! Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com>	2020-06-08 16:25:24 -04:00
Andrew Lamb	ca9f9d4cae	feat: Add column packing code (#114 ) * feat: Add column packing code * fix: remove dependency on assert_approx_equal in favor of delorean_test_helpers * fix: Cleanups from pr comments * fix: Apply suggestions from code review Co-authored-by: Jake Goulding <jake.goulding@integer32.com> Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * fix: more cleanup per code review * fix: pr comments * fix: remove explicit string creation from caller Co-authored-by: Jake Goulding <jake.goulding@integer32.com> Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com>	2020-06-06 06:04:41 -04:00
Andrew Lamb	2200def8ea	feat: Use rust nightly (#123 )	2020-06-05 17:45:44 -04:00
Jake Goulding	68fb580b43	style: Re-enable the elided lifetimes lint and move generated types to their own crate (#119 ) * refactor: rename the module containing generated types The nested `delorean` was confusing anyway, and this will make more sense when we extract a new crate. * refactor: Move the generated types to their own crate This allows us to have more lax warnings in that crate alone, keeping the main crate more strict. * style: Re-enable elided lifetimes lint in the main crate	2020-06-05 16:22:27 -04:00
Edd Robinson	887ffd5977	refactor: remove lifetime to make index re-usable	2020-06-04 14:36:43 +01:00
Edd Robinson	e3db077121	feat: add API for series key information	2020-06-04 14:36:43 +01:00
Edd Robinson	413738a264	feat: support org and bucket ID in entries	2020-06-04 14:36:43 +01:00
Andrew Lamb	234b2f5752	feat: Line Protocol Schema extractor (#108 ) * feat: schema inference from iterator of parsed lines * fix: Clean up error handing even more * fix: fmt * fix: make a sacrifice to the clippy gods	2020-06-03 18:29:57 -04:00
Andrew Lamb	5d2c5de39d	feat: Structs to represent line protocol schema (#103 ) * feat: Structs to represent line protocol schema Co-authored-by: Jake Goulding <jake.goulding@integer32.com>	2020-06-03 08:39:35 -04:00
Andrew Lamb	18b05ce9ef	fix: move test of dstool to its delorean_storage_tool package (#107 )	2020-06-02 16:10:30 -04:00
Andrew Lamb	1a2efdfd71	feat: Add dstool command line tool (#102 ) * feat: Add dstool command line tool * clippy * Update delorean_storage_tool/src/main.rs Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * Update delorean_storage_tool/src/main.rs Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com> * Add in tests + PR comments * fmt * build first then run tests * actually build before test Co-authored-by: Carol (Nichols \|\| Goulding) <193874+carols10cents@users.noreply.github.com>	2020-06-02 07:33:43 -04:00
Jake Goulding	924f20fd50	build: update semver-compatible versions	2020-05-29 13:40:44 -04:00
Jake Goulding	d4af54c3de	refactor: extract the line protocol parser to a separate crate This will facilitate reusing the parser for other tasks.	2020-05-26 13:22:34 -04:00
Carol (Nichols \|\| Goulding)	f2823ccecd	feat: WAL file rollover based on size of file	2020-05-18 14:08:24 -04:00
Carol (Nichols \|\| Goulding)	e25a4e1e83	feat: Integrate the WAL with delorean	2020-05-11 15:38:47 -04:00
Jake Goulding	4dd7a8cea8	feat: Introduce a WAL tailored for delorean	2020-05-11 15:38:44 -04:00
Jake Goulding	bff4d2d6d9	refactor: Move temporary directory creation to test helpers	2020-05-11 15:26:00 -04:00
Jake Goulding	22136d5431	build: update semver-compatible versions	2020-05-11 15:25:54 -04:00
Jake Goulding	e369ada35a	refactor: extract a crate with our custom assertions There's probably an existing crate that we should use directly, but I haven't found an exact match yet.	2020-05-01 13:04:24 -04:00
Carol (Nichols \|\| Goulding)	7f9eaf51d5	Merge pull request #74 from influxdata/cn-generate-points	2020-04-24 08:08:32 -04:00
Edd Robinson	f1d5d50b92	Merge pull request #68 from influxdata/er-block-writer feat: add Block Type	2020-04-23 22:48:38 +01:00
Carol (Nichols \|\| Goulding)	fa69101945	refactor: Move the point utilities into a workspace crate	2020-04-23 11:26:37 -04:00
Carol (Nichols \|\| Goulding)	6791b9598c	feat: Utility to take line protocol and make write requests	2020-04-23 11:11:51 -04:00
Jake Goulding	93231c64e0	perf: Use a SmallVec for escaped strings and sets of tags and values This increases the performance from 56.531 MiB/s to 58.194 MiB/s.	2020-04-08 14:41:42 -04:00
Edd Robinson	9e20743b2c	feat: add Block Type This commit adds a new Block type, which is used to keep track of values associated with an individual block, and then serialise them.	2020-04-08 13:37:48 +01:00
Jake Goulding	974a142cc8	build: update semver-compatible versions	2020-04-05 16:35:27 -04:00
Jake Goulding	8629072508	build: Upgrade tonic to 0.2	2020-04-05 16:35:00 -04:00
Jake Goulding	4a28abd4de	build: Upgrade assert_cmd to 1.0 This requires that we opt into the serde `derive` feature that is no longer implicitly added from upstream.	2020-04-05 16:33:37 -04:00
Jake Goulding	48d5d16a1b	build: update semver-compatible dependency versions	2020-04-05 16:33:24 -04:00
Carol (Nichols \|\| Goulding)	df67b9715a	Merge remote-tracking branch 'origin/master' into pd-partiton-store	2020-04-02 11:15:26 -04:00
Jake Goulding	97d11633b8	feature: Use a unique directory per end-to-end test run	2020-04-02 11:06:36 -04:00
Carol (Nichols \|\| Goulding)	d9cf5c952a	fix: Remove RocksDB code	2020-04-02 09:41:30 -04:00
Jake Goulding	4fd0c6f210	feat: Error when parsing lines with duplicate tags	2020-03-11 22:43:09 -04:00
Jake Goulding	78a53aa391	refactor: Replace the hand-written parser with one built with nom	2020-03-06 10:00:29 -05:00
Jake Goulding	5d3f99da98	refactor: Remove unused failure crate	2020-02-28 16:54:28 -05:00
Edd Robinson	17051717e2	chore: remove dependency:	2020-02-28 12:55:28 +00:00
Edd Robinson	38f23ac07a	refactor: merge master in	2020-02-27 14:27:23 +00:00
Carol (Nichols \|\| Goulding)	c41652e45b	feature: Add the storage gRPC proto definitions	2020-02-24 08:26:28 -05:00
Jake Goulding	3438edd18b	feature: Switch from prost to tonic	2020-02-17 16:37:43 -05:00
Jake Goulding	68970f8ff3	build: Update bytes to latest version	2020-02-17 10:48:33 -05:00

469 Commits (c46c2a35fa17ea094a83e1ce15aa167266787a1e)