influxdb

Commit Graph

Author	SHA1	Message	Date
Edd Robinson	e1b57aaec4	perf: copy as needed	2020-12-10 15:15:34 +00:00
Edd Robinson	99003b0a6a	perf: check intersection cardinality before allocating Because `bitset.and()` allocates a new bitset regardless of the resulting cardinality, we will be allocating more bitsets than necessary. This change checks if we actually want to make the allocation. It improves `read_group` performance by ~2X. ``` segment_read_group_pre_computed_groups_no_predicates_cardinality/2000 time: [57.917 ms 58.286 ms 58.700 ms] thrpt: [34.072 Kelem/s 34.313 Kelem/s 34.532 Kelem/s] change: time: [-59.703% -59.357% -59.057%] (p = 0.00 < 0.05) thrpt: [+144.24% +146.05% +148.16%] Performance has improved. Found 6 outliers among 100 measurements (6.00%) 4 (4.00%) high mild 2 (2.00%) high severe ```	2020-12-10 15:15:34 +00:00
Edd Robinson	fe27690ca8	test: add benchmarks for specific read_group path This commit adds benchmarks to track the performance of `read_group` when aggregating across columns that support pre-computed bit-sets of row_ids for each distinct column value. Currently this is limited to the RLE columns, and only makes sense when grouping by low-cardinality columns. The benchmarks are in three groups: * one group fixes the number of rows in the segment but varies the cardinality (that is, how many groups the query produces). * another group fixes the cardinality and the number of rows but varies the number of columns needed to be grouped to produce the fixed cardinality. * a final group fixes the number of columns being grouped, the cardinality, and instead varies the number of rows in the segment. Some initial results from my development box are as follows: ``` time: [51.099 ms 51.119 ms 51.140 ms] thrpt: [39.108 Kelem/s 39.125 Kelem/s 39.140 Kelem/s] Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe segment_read_group_pre_computed_groups_no_predicates_group_cols/1 time: [93.162 us 93.219 us 93.280 us] thrpt: [10.720 Kelem/s 10.727 Kelem/s 10.734 Kelem/s] Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe segment_read_group_pre_computed_groups_no_predicates_group_cols/2 time: [571.72 us 572.31 us 572.98 us] thrpt: [3.4905 Kelem/s 3.4946 Kelem/s 3.4982 Kelem/s] Found 12 outliers among 100 measurements (12.00%) 5 (5.00%) high mild 7 (7.00%) high severe Benchmarking segment_read_group_pre_computed_groups_no_predicates_group_cols/3: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.9s, enable flat sampling, or reduce sample count to 50. 
segment_read_group_pre_computed_groups_no_predicates_group_cols/3 time: [1.7292 ms 1.7313 ms 1.7340 ms] thrpt: [1.7301 Kelem/s 1.7328 Kelem/s 1.7349 Kelem/s] Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low mild 6 (6.00%) high mild 1 (1.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/250000 time: [562.29 us 565.19 us 568.80 us] thrpt: [439.52 Melem/s 442.33 Melem/s 444.61 Melem/s] Found 18 outliers among 100 measurements (18.00%) 6 (6.00%) high mild 12 (12.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/500000 time: [561.32 us 561.85 us 562.47 us] thrpt: [888.93 Melem/s 889.92 Melem/s 890.76 Melem/s] Found 11 outliers among 100 measurements (11.00%) 5 (5.00%) high mild 6 (6.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/750000 time: [573.75 us 574.27 us 574.85 us] thrpt: [1.3047 Gelem/s 1.3060 Gelem/s 1.3072 Gelem/s] Found 13 outliers among 100 measurements (13.00%) 5 (5.00%) high mild 8 (8.00%) high severe segment_read_group_pre_computed_groups_no_predicates_rows/1000000 time: [586.36 us 586.74 us 587.19 us] thrpt: [1.7030 Gelem/s 1.7043 Gelem/s 1.7054 Gelem/s] Found 9 outliers among 100 measurements (9.00%) 4 (4.00%) high mild 5 (5.00%) high severe ```	2020-12-10 15:15:34 +00:00
Edd Robinson	596e20ac92	feat: add from String implementation	2020-12-10 15:15:34 +00:00
Edd Robinson	e400fb71bb	feat: add from conversion for String	2020-12-10 15:15:34 +00:00
Edd Robinson	10552eb51b	refactor: create collection of ReadGroupResult type	2020-12-10 15:15:34 +00:00
Edd Robinson	8c45170a15	feat: read group aggregates on RLE columns	2020-12-10 15:15:34 +00:00
Edd Robinson	8fd211798a	refactor: aggregate sum can return a Scalar	2020-12-10 15:15:34 +00:00
Edd Robinson	6d2b69d4a3	feat: add column properties Column properties can be used to determine what abilities a column has at runtime, which will vary depending on the encoding used.	2020-12-10 15:15:34 +00:00
Edd Robinson	e4b8fb3387	refactor: use Cow for group row ids	2020-12-10 15:15:34 +00:00
Edd Robinson	f7f87164b4	refactor: initial read_group skeleton	2020-12-10 15:15:34 +00:00
Edd Robinson	c199d59c04	refactor: improve aggregate support	2020-12-10 15:15:34 +00:00
Edd Robinson	c259a461c1	feat: extend dictionary column API Add methods for getting distinct row ids for values and for getting logical values.	2020-12-10 15:15:34 +00:00
Dom	756e7de867	Merge pull request #542 from ming535/ming chore: some minor comments and rename	2020-12-10 10:18:18 +00:00
huming	a5a3cd149d	chore: some minor comments and rename	2020-12-10 10:48:57 +08:00
Brandon Sov	146bf59d8d	test: simplify test error matching	2020-12-09 11:36:49 -08:00
Brandon Sov	d179fe68d3	refactor: replace bucket_name clones with references	2020-12-09 11:03:19 -08:00
Brandon Sov	af8569378f	test: move common variable and function to general test usage	2020-12-09 11:01:51 -08:00
Brandon Sov	625542c310	fix: Update s3 error function to correct pattern	2020-12-09 10:14:50 -08:00
Brandon Sov	4be47b1ccc	fix: Move functions to the conditional compilation flag to pass linter	2020-12-08 23:42:41 -08:00
Brandon Sov	62c14de2bc	fix: Update pattern match to detect String	2020-12-08 23:42:33 -08:00
Brandon Sov	989d0ecad8	refactor: set valid format for default s3 bucket name example	2020-12-08 23:42:27 -08:00
Brandon Sov	1a4b2eac26	fix: Report bucket/location when relevant with object store errors	2020-12-08 22:29:28 -08:00
Paul Dix	fa3ecbd4ed	feat: Implement write buffer to Parquet snapshotting (#526) * feat: Implement write buffer to Parquet snapshotting This introduces snapshot to the server packages to manage snapshotting. It also introduces a new trait for representing a Partition. There is a very crude API wired up in http_routes for testing purposes. Follow on work will bring the server package into http_routes and rework the snapshot API.	2020-12-08 14:20:43 -05:00
Edd Robinson	91bc7fbdd1	Merge pull request #525 from influxdata/er/chore/bench-debug chore: add debug symbols to benchmarks	2020-12-04 20:05:14 +00:00
Edd Robinson	f3af86ccb4	chore: add debug symbols to benchmarks	2020-12-04 16:36:05 +00:00
Dom	4346ad62cb	Merge pull request #521 from influxdata/dom/org-bucket-types fix: unambiguous bucket/org to DB mappings	2020-12-04 11:44:41 +00:00
Dom	ceea61a211	Merge branch 'main' into dom/org-bucket-types	2020-12-04 11:33:36 +00:00
Andrew Lamb	4ec75a4f22	fix: Fix gRPC panic when multiple field selections are provided (#523) * fix: do not assert when multiple fields are selected * fix: clippy * fix: write unit test, fix bug * fix: tweak comments	2020-12-03 12:31:02 -05:00
Dom	ffbeb4dbcc	docs: fix RangeInclusive	2020-12-03 16:10:16 +00:00
Dom	87573256a7	chore: fmt	2020-12-03 16:10:16 +00:00
Dom	d96ed66c32	refactor: clearer lifetime for org&bucket mapping Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>	2020-12-03 16:10:16 +00:00
Dom	13f391e2b9	refactor: ignore destructured fields I temporarily forgot I can do this. Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>	2020-12-03 16:10:16 +00:00
Dom	234df612ec	refactor: avoid clones for errors Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>	2020-12-03 16:10:16 +00:00
Dom	59f9665438	test: cover org_and_bucket_to_database	2020-12-03 16:10:16 +00:00
Dom	aa1c95401e	refactor: DB names 1..=64 Co-authored-by: Edd Robinson <me@edd.io>	2020-12-03 16:10:15 +00:00
Dom	b03de0e7ef	refactor: remove needless lifetimes	2020-12-03 16:10:15 +00:00
Dom	f90a95fd80	fix: unambiguous bucket/org to DB mappings Previously the $ORG and $BUCKET were joined as: $ORG + "_" + $BUCKET Which is fine unless either $ORG or $BUCKET includes a "_", such as: $ORG = "org_a" $BUCKET = "bucket" and $ORG = "org" $BUCKET = "a_bucket" This change continues to join $ORG and $BUCKET with an underscore, but disallows underscores in either $ORG or $BUCKET. It appears these values are non-zero u64s in the gRPC protocol converted to their base-10 string representations for the DB name, so this seems safe to enforce. In addition, this change introduces a `DatabaseName` type to avoid passing bare strings around, and allow consuming code to ensure only valid database names are provided at compile time. This type works with both owned & borrowed content so doesn't force a string copy where we can avoid it, and derefs to `str` to make it easier to use with existing code. I've been minimally invasive in pushing the `DatabaseName` through the existing code and figured I'd see what the sentiment is first. Candidates for conversion from `str` to `DatabaseName` that seem to make sense to me include: - `DatabaseStore` trait - `RemoteServer` trait - Others? Basically anywhere other than the "edge" API inputs Fixes #436 (thanks @zeebo)	2020-12-03 16:10:15 +00:00
Andrew Lamb	8c0e14e039	refactor: rename src/server/rpc/storage.rs to src/server/rpc/service.rs (#513) * refactor: rename src/server/rpc/storage.rs to src/server/rpc/service.rs * refactor: update references	2020-12-03 09:59:00 -05:00
Dom	592c5c3679	Merge pull request #522 from influxdata/dom/ci-reduce-size ci: remove IOx pre-building in rust build container	2020-12-03 13:25:08 +00:00
Dom	3589aec136	Merge branch 'main' into dom/ci-reduce-size	2020-12-03 13:14:52 +00:00
Edd Robinson	54ae680780	Merge pull request #520 from influxdata/er/refactor/read-filter-result refactor: encapsulate results from segment/table into nicer types	2020-12-03 12:51:20 +00:00
Dom	7136e5853a	ci: remove IOx pre-building in rust build container Stops adding the IOx source code and performing a cargo build/test/clippy each night. Previously this build would compile the IOx source & dependencies, populating the incremental build cache and allowing builds that used the same dependencies to complete quicker. This build caching was moved to per-dependency-set caching in #496, and this pre-build is no longer used. This should reduce the build image size substantially, making the whole CI process a bit faster.	2020-12-03 11:58:13 +00:00
Edd Robinson	254dfc14d8	refactor: apply suggestions from code review Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2020-12-03 11:47:41 +00:00
Edd Robinson	4f32778596	refactor: implement ReadFilterResults type The `ReadFilterResults` type encapsulates results from multiple segments. It implements `Display` to allow visualisation of results from segments in a `select` call.	2020-12-03 11:23:12 +00:00
Edd Robinson	7ad0b4ad9a	refactor: encapsulate read filter results in type This commit also adds `Display` and `Debug` implementations for `ReadFilterResult`. These can be used for visualising the contents of the result of a `read_filter` call on a segment. The former trait elides the column names.	2020-12-03 11:23:09 +00:00
Edd Robinson	a088f33c35	Merge pull request #519 from influxdata/er/refactor/time-predicate refactor: avoid requiring time predicate in Segment	2020-12-03 10:06:29 +00:00
Edd Robinson	05c420cc9e	Merge branch 'main' into er/refactor/time-predicate	2020-12-02 19:13:12 +00:00
Edd Robinson	381c3038aa	refactor: update segment_store/src/segment.rs Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2020-12-02 19:13:00 +00:00
Andrew Lamb	8cb8276819	fix: Update gRPC definitions so tag_key=_field requests work in IOx (#517) * fix: Update gRPC definitions so tag_key=_field requests work in IOx * docs: Update src/server/rpc.rs * fix: fixup test * fix: Apply suggestions from code review Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com> * fix: consistent type annotations * fix: refactor redundant test code into test_helpers Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>	2020-12-02 13:58:48 -05:00
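
Note on f90a95fd80 (unambiguous bucket/org to DB mappings): the idea described in that commit message can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the commit. The names `DatabaseName` and `org_and_bucket_to_database` and the 1..=64 length bound appear in the commit messages above; the `NameError` enum, the field layout, and every method body are hypothetical.

```rust
use std::borrow::Cow;
use std::ops::Deref;

/// Hypothetical error type for this sketch (not taken from the commit).
#[derive(Debug, PartialEq, Eq)]
pub enum NameError {
    /// The name was not between 1 and 64 bytes long.
    LengthConstraint(usize),
    /// The org or bucket contained the reserved "_" separator.
    ContainsSeparator,
}

/// Sketch of a validated database name. `Cow` lets it wrap either a
/// borrowed `&str` or an owned `String` without forcing a copy.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DatabaseName<'a>(Cow<'a, str>);

impl<'a> DatabaseName<'a> {
    /// Accepts borrowed or owned input; rejects names outside 1..=64 bytes.
    pub fn new<T: Into<Cow<'a, str>>>(name: T) -> Result<Self, NameError> {
        let name = name.into();
        if !(1..=64).contains(&name.len()) {
            return Err(NameError::LengthConstraint(name.len()));
        }
        Ok(Self(name))
    }
}

/// Deref to `str` so existing `&str`-based code keeps working.
impl<'a> Deref for DatabaseName<'a> {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}

/// Joins org and bucket with "_", refusing inputs that already contain
/// the separator, so ("org_a", "bucket") and ("org", "a_bucket") can no
/// longer both map to "org_a_bucket".
pub fn org_and_bucket_to_database(
    org: &str,
    bucket: &str,
) -> Result<DatabaseName<'static>, NameError> {
    const SEPARATOR: char = '_';
    if org.contains(SEPARATOR) || bucket.contains(SEPARATOR) {
        return Err(NameError::ContainsSeparator);
    }
    DatabaseName::new(format!("{}{}{}", org, SEPARATOR, bucket))
}
```

In this sketch, `org_and_bucket_to_database("org", "bucket")` yields a name that derefs to `org_bucket`, while both ambiguous pairs from the commit message are rejected with `NameError::ContainsSeparator`.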


1462 Commits (1d972e01c8929c7f5d408b791b92778f09b78129)