influxdb

Author	SHA1	Message	Date
Marco Neumann	b1af5b3f44	feat: query log system table for querier (#4157 ) * feat: query log system table for querier Closes #4084. * fix: typo Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * docs: extend Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-03-30 15:38:11 +00:00
Marco Neumann	2b76c31157	refactor: make statistics null counts optional (#4160 ) Min/max values and distinct counts are already optional, so let's make the null counts optional as well. This will be helpful for NG to deal w/ partial statistics (e.g. we only populate stats for the time column). Note that the total count is still mandatory, but we normally have the chunk/file-level row count at hand.	2022-03-29 17:47:57 +00:00
Marco Neumann	7d947c79d5	refactor: small query tests clean up (#4156 ) * refactor: make NG query test generation more flexible * refactor: rename OG-specfic query tests * docs: explain chunk stage generation in NG query tests * fix: typo	2022-03-29 14:00:34 +00:00
Andrew Lamb	58c630d709	chore: Update datafusion (#4133 ) * chore: Update datafusion * fix: typo * fix: Update explain plan output * fix: update Cargo.locl Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-03-25 15:08:39 +00:00
Marco Neumann	8ca5c337b2	refactor: port more query tests to NG, some code clean up (#4125 ) * refactor: inline function that is used once * refactor: generalize multi-chunk creation for NG * refactor: `TwoMeasurementsManyFieldsTwoChunks` is OG-specific * refactor: generalize `OneMeasurementTwoChunksDifferentTagSet` * refactor: port `OneMeasurementFourChunksWithDuplicates` to NG * refactor: `TwoMeasurementsManyFieldsLifecycle` is OG-specific * refactor: simplify NG chunk generation * refactor: port `ThreeDeleteThreeChunks` to NG Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-03-24 15:07:09 +00:00
Marco Neumann	cc7f744e8e	test: two-chunk scenarios for NG (#4113 ) Add the generic components to create two-chunk scenarios. Includes small scenario fixes for things like system tables that are not identical between OG and NG (also see #4111.) Ref #3934.	2022-03-24 09:50:57 +00:00
Marco Neumann	5ae1e2fecf	refactor: make query tests less OG-specific	2022-03-23 12:04:32 +01:00
Marco Neumann	89206e013c	test: run SOME query tests for querier (#4098 ) This includes some type changes to dispatch between OG and NG and allows some tests to be run against the NG querier. This only contains parquet files though, so it's somewhat a limited scope. For #3934.	2022-03-22 17:39:19 +00:00
Andrew Lamb	d9f331ba2a	chore: update datafusion, stop repartitioning so aggressively (#3633 ) * chore: update datafusion * fix: Update to use new datafusion api * chore: update expected plans * fix: support zero output partitions * fix: update test * fix: Update for new DataFusion API * fix: newly added system table * fix: update cargo lock	2022-02-09 19:53:41 +00:00
Andrew Lamb	85004831a3	refactor: extract predicate sql tests out of rust (#3683 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-02-09 11:24:34 +00:00
kodiakhq[bot]	a2ed6a1b75	Merge branch 'main' into combine-non-overlapping-chunks	2022-02-02 20:47:51 +00:00
Andrew Lamb	c4a234e83c	feat: Allow sql test runner to compare sorted output (#3618 ) * refactor: Add Query struct * feat: Implement sorted checking * refactor: port some sql tests over * fix: fmt * fix: Apply suggestions from code review Co-authored-by: Edd Robinson <me@edd.io> Co-authored-by: Edd Robinson <me@edd.io> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-02-02 19:59:52 +00:00
Raphael Taylor-Davies	8a8de19fb5	feat: combine non-overlapping chunks without deletes	2022-02-02 16:40:30 +00:00
Andrew Lamb	527885f7f8	chore: Update datafusion (#3413 ) * chore: Update datafusion and update code to handle timezone aware timestamps * fix: cargo hakari Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-12-23 14:52:12 +00:00
Andrew Lamb	61dd7e0ba0	chore: clean up `all_chunks_dropped.sql` (#3337 )	2021-12-08 14:51:56 +00:00
Andrew Lamb	9e8639f230	chore: Update DataFusion pin (#3279 ) * chore: Update DataFusion pin * fix: Update for new DF API * fix: update plan output Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-12-02 12:42:28 +00:00
Andrew Lamb	50e9e02ff7	chore: Update datafusion (#3188 ) * chore: Update datafusion * fix: Update for LogicalPlan changes * fix: Update explain plan output Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-11-23 12:23:23 +00:00
Nga Tran	71731524c4	test: add group by tests	2021-11-08 10:46:40 -05:00
Nga Tran	97206b13cb	fix: statistics for max/min(time) should have data type timstamp	2021-11-05 18:11:54 -04:00
Nga Tran	89699cf0de	chore: use latest DataFusion to fix the min/max(dictionary string) bug	2021-11-04 11:04:03 -04:00
Andrew Lamb	9974a5364c	chore(security): Replace prettytable with comfy-table (#2905 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-10-20 10:44:36 +00:00
Nga Tran	faf65f38cc	refactor: address review comments	2021-10-14 11:23:20 -04:00
Nga Tran	8dd9dcce01	test: verify if all scenarios are created correctly and add a few delete tests for read_filter	2021-10-13 17:21:03 -04:00
Nga Tran	144ce77e39	chore: merge main to branch	2021-10-12 15:59:57 -04:00
Nga Tran	459dd46ae9	refactor: move delete tests to .sql	2021-10-12 15:49:23 -04:00
Andrew Lamb	035654b4f9	refactor: do not rebuild query_test when .sql or .expected files change (#2816 ) * feat: Do not rebuild query_tests if .sql or .expected change * feat: Add CI check * refactor: move some sql tests to .sql files * tests: port tests / expected results to data files * fix: restore old name check-flatbuffers	2021-10-12 19:34:54 +00:00
Nga Tran	055e69439d	test: fix auto created tests	2021-10-05 18:11:27 -04:00
Nga Tran	aa64daca86	feat: dDisable using statistics to query data if there are soft deleted rows	2021-10-05 17:52:32 -04:00
Raphael Taylor-Davies	b402423e9e	feat: remove move lifecycle action (#2674 ) * feat: remove move_chunk lifecycle action * chore: review feedback Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-09-30 16:58:05 +00:00
Andrew Lamb	a55a21c644	chore: Update datafusion (#2635 ) * chore: Update datafusion and sqlparser * fix: remove STACK_SIZE workaround * chore: update datafusion_util * chore: update predicate * chore: update query_tests Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-09-27 14:13:19 +00:00
Marco Neumann	1b788732da	fix: order chunks correctly during query processing The query processing was implicitly relying on the order provided by the catalog. This had two issues: - this ordering was not defined in the API contract (neither via docs nor via typing) - the order was based on chunk IDs which is not adequate in some cases (e.g. when chunks are created while a persistence operations is in progress) Now we explicitly sort chunks by `(order, ID)`. Fixes #1963.	2021-09-14 13:00:55 +02:00
Andrew Lamb	5eef76c868	chore: Update dependencies (including datafusion) (#2521 ) * chore: Update datafusion deps to pre-release * refactor: Update IOx to use new datafusion Statistics Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-09-13 21:30:44 +00:00
Andrew Lamb	f975baba6b	chore: Update datafusion + other deps again (get baseline metrics) (#2422 ) * chore: Update datafusion reference * chore: cargo update * fix: update explain tests to show Union Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-08-26 13:13:00 +00:00
Andrew Lamb	e6cbd4d217	feat: Use statistics for count() queries (#2038 ) feat: Use statistics for count() queries docs: fix mangled comment * refactor: rewrite to use fold * refactor: use sort_by_cached_key * fix: set null count properly * fix: fmt + clippy	2021-07-28 19:39:41 +00:00
Andrew Lamb	387667330a	chore: Update datafusion deps (#2073 ) * chore: Update datafusion deps * fix: update tests	2021-07-21 08:27:03 +00:00
Raphael Taylor-Davies	091837420f	feat: add PersistenceWindows sytem table (#2030 ) (#2062 ) * feat: add PersistenceWindows sytem table (#2030) * chore: update log Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-07-20 13:10:57 +00:00
Andrew Lamb	1c16988a51	chore: Update datafusion references (#2056 )	2021-07-19 18:09:06 +00:00
Nga Tran	0b1f2b1fd0	chore: merge main to branch	2021-07-14 16:17:14 -04:00
Nga Tran	552e3fb691	fix: Padd stats compute deterministic order of sort key and update tests that got changed by the use of sort key	2021-07-14 14:06:41 -04:00
Andrew Lamb	0164cabbf3	refactor: do not use DataFrame DataFusion API / stop optimizing twice (#1982 ) * refactor: do not use DataFrame DataFusion API * fix: update output to reflect not running optimizer twice Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-07-13 16:29:43 +00:00
Marco Neumann	09e611deb7	refactor: lift query schema generation up to caller Do no longer scan chunks during query planning to determine the schema (except for the lifetime jobs where we have a good reason to do so). Instead pass the schema down to from whoever is triggering the query. For real SQL queries, we then just use the the table-wide schemas introduced in #1913. Apart from avoiding schema merges we now also don't crash any longer when no chunks are left in the table (aka columns are present but all rows are gone). Fixes #1768. Fixes #1884.	2021-07-09 09:24:21 +02:00
Andrew Lamb	7602bde850	chore: Update datafusion deps (#1799 ) * chore: Update datafusion deps + rework code * refactor: remove workaround as it has been contributed upstream * fix: Update query/src/exec/split.rs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-07-08 10:58:32 +00:00
Marco Neumann	d6cff911b6	test: ensure that query tests don't rebuild all the time Beforehand:
```text
❯ env CARGO_LOG=cargo::core::compiler::fingerprint=info cargo test -p query_tests
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] stale: changed "/home/mneumann/src/influxdb_iox/query_tests/cases"
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] (vs) "/home/mneumann/src/influxdb_iox/target/debug/build/query_tests-0e8f741dfb84437f/output"
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] FileTime { seconds: 1625474716, nanos: 436081357 } != FileTime { seconds: 1625474752, nanos: 52625167 }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Test/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/RunCustomBuild/TargetInner { ..: custom_build_target("build-script-build", "/home/mneumann/src/influxdb_iox/query_tests/build.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] err: current filesystem status shows we're outdated
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] fingerprint error for query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)/Build/TargetInner { ..: lib_target("query_tests", ["lib"], "/home/mneumann/src/influxdb_iox/query_tests/src/lib.rs", Edition2018) }
[2021-07-05T08:52:13Z INFO cargo::core::compiler::fingerprint] err: current filesystem status shows we're outdated
Compiling query_tests v0.1.0 (/home/mneumann/src/influxdb_iox/query_tests)
```
The issue is that both the input and the test output files are located under `cases/`. `build.rs` used `cargo:rerun-if-changed=cases` which per Cargo doc will scan ALL files in that directory. Note that the normal `exclude` directive in `Cargo.toml` does NOT work, see https://github.com/rust-lang/cargo/issues/4587 . So we need to split input and output files into separate directories (`cases/{in,out}`).	2021-07-05 15:30:10 +02:00
Andrew Lamb	07826306ed	fix: Always deduplicate data prior to insertion into the ReadBuffer (#1863 ) * fix: mark ReadBuffer as always deduplicated * fix: Use compact plans during merge * docs: Update server/src/db/chunk.rs Co-authored-by: Nga Tran <ntran@influxdata.com> Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> Co-authored-by: Nga Tran <ntran@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2021-07-01 16:23:37 +00:00
Andrew Lamb	fef160e24f	feat: Implement data driven query_tests and port explain tests (#1814 ) * feat: Implment data driven query testing and port explain tests * fix: do not fmt the auto generated cases * refactor: split setup and parser into separate modules * refactor: Add log to runner, add end to end tests * docs: fixu cpmments	2021-06-29 16:09:51 +00:00

45 Commits (7e5d71902722caaee257455921d4f372e58dc536)
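
The `build.rs` change described in commit d6cff911b6 could look roughly like the sketch below. It assumes the `cases/in` and `cases/out` layout named in that commit message. The helper name `rerun_if_changed` is hypothetical and not taken from the repository; only the `cargo:rerun-if-changed=PATH` directive itself is standard Cargo behaviour.

```rust
// Sketch of a build script that watches only the test inputs.
// Assumption: inputs live in `cases/in` and generated output in `cases/out`.
use std::path::Path;

/// Hypothetical helper: format the Cargo directive that watches one path.
fn rerun_if_changed(path: &Path) -> String {
    format!("cargo:rerun-if-changed={}", path.display())
}

fn main() {
    // Watching `cases` as a whole would also cover the output files, so every
    // test run would mark the crate as stale. Watching only the input
    // directory avoids that.
    println!("{}", rerun_if_changed(Path::new("cases/in")));
}
```

With this layout, files written under `cases/out` during a test run are not covered by the watched path, so they should no longer trigger a rebuild.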