influxdb

Commit Graph

Author	SHA1	Message	Date
Stuart Carnie	bf4fce3c77	feat: Implement `LiteralExpr` for `bool`	2023-06-02 13:38:37 +10:00
Stuart Carnie	d5719f9be2	refactor: Moved simplification of time range expressions to parser	2023-06-02 09:50:01 +10:00
Marco Neumann	86a2c249ec	refactor: faster PG `ParquetFileRepo` (#7907 ) * refactor: remove `ParquetFileRepo::flag_for_delete` * refactor: batch update parquet files in catalog * refactor: avoid data roundtrips through postgres * refactor: do not return ID from PG when we do not need it --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-01 16:17:28 +00:00
Marco Neumann	72ff001d33	feat: aggregator for DataFusion statistics (#7904 ) * feat: aggregator for DataFusion statistics Required to implement #7470, esp. to implement the statistics folding done within `RecordBatchesExec`. * docs: improve	2023-06-01 16:11:30 +00:00
Dom Dwyer	f0832818ee	test(router): invalid strftime partition template An integration test asserting that a router returns an error when attempting to partition a write with an invalid strftime partition formatter, rather than panicking.	2023-06-01 17:44:44 +02:00
Dom Dwyer	47214ec9a0	fix: prevent panics in partitioning logic Changes the partitioning logic to be fallible. This prevents an invalid partition template from causing a panic, previously possible through two known code paths: * TagValue formatter referencing a non-tag column * Time formatter using an invalid strftime format string If either occurs, the write attempt is now aborted and an error returned to the user with a HTTP 500 status code. Additionally unexpected partitioner errors now map to a catch-all error instead of panicking.	2023-06-01 17:44:44 +02:00
Dom Dwyer	6bb4f20d7c	refactor: remove redundant test test_partition_key was recreated below via a test generator.	2023-06-01 17:44:43 +02:00
Nga Tran	21752cfb69	test: reproducer for panic bug attempt to calculate the remainder with a divisor of zero (#7903 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-06-01 15:43:24 +00:00
Marco Neumann	551e838db3	refactor: remove unused PG indices (#7905 ) Similar to #7859. To test index usage, execute the following query on the writer replica: ```sql SELECT n.nspname AS namespace_name, t.relname AS table_name, pg_size_pretty(pg_relation_size(t.oid)) AS table_size, t.reltuples::bigint AS num_rows, psai.indexrelname AS index_name, pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size, CASE WHEN i.indisunique THEN 'Y' ELSE 'N' END AS "unique", psai.idx_scan AS number_of_scans, psai.idx_tup_read AS tuples_read, psai.idx_tup_fetch AS tuples_fetched FROM pg_index i INNER JOIN pg_class t ON t.oid = i.indrelid INNER JOIN pg_namespace n ON n.oid = t.relnamespace INNER JOIN pg_stat_all_indexes psai ON i.indexrelid = psai.indexrelid WHERE n.nspname = 'iox_catalog' AND t.relname = 'parquet_file' ORDER BY 1, 2, 5; ```` Data for eu-west-1 at `2023-05-31T16:30:00Z`: ```text namespace_name \| table_name \| table_size \| num_rows \| index_name \| index_size \| unique \| number_of_scans \| tuples_read \| tuples_fetched ----------------+--------------+------------+-----------+-----------------------------------+------------+--------+-----------------+----------------+---------------- iox_catalog \| parquet_file \| 38 GB \| 146489216 \| parquet_file_deleted_at_idx \| 6442 MB \| N \| 1693534991 \| 21602734184385 \| 21694365037 iox_catalog \| parquet_file \| 38 GB \| 146489216 \| parquet_file_partition_delete_idx \| 20 MB \| N \| 17854904 \| 3087700816 \| 384603858 iox_catalog \| parquet_file \| 38 GB \| 146489216 \| parquet_file_partition_idx \| 2325 MB \| N \| 1627977474 \| 12604272924323 \| 11088781876397 iox_catalog \| parquet_file \| 38 GB \| 146489216 \| parquet_file_pkey \| 8290 MB \| Y \| 480767174 \| 481021514 \| 480733966 iox_catalog \| parquet_file \| 38 GB \| 146489216 \| parquet_file_table_delete_idx \| 174 MB \| N \| 1006563 \| 24687617719 \| 385132581 iox_catalog \| parquet_file \| 38 GB \| 146489216 \| parquet_file_table_idx \| 1905 MB \| N \| 
9288042 \| 351240529272 \| 27551 iox_catalog \| parquet_file \| 38 GB \| 146489216 \| parquet_location_unique \| 6076 MB \| Y \| 385294957 \| 109448 \| 109445 ```` and at `2023-06-01T13:00:00Z`: ```text namespace_name \| table_name \| table_size \| num_rows \| index_name \| index_size \| unique \| number_of_scans \| tuples_read \| tuples_fetched ----------------+--------------+------------+-----------+-----------------------------------+------------+--------+-----------------+----------------+---------------- iox_catalog \| parquet_file \| 43 GB \| 152684560 \| parquet_file_deleted_at_idx \| 6976 MB \| N \| 1693535032 \| 21602834620294 \| 21736731439 iox_catalog \| parquet_file \| 43 GB \| 152684560 \| parquet_file_partition_delete_idx \| 21 MB \| N \| 31468423 \| 7397141567 \| 677909956 iox_catalog \| parquet_file \| 43 GB \| 152684560 \| parquet_file_partition_idx \| 2464 MB \| N \| 1627977474 \| 12604272924323 \| 11088781876397 iox_catalog \| parquet_file \| 43 GB \| 152684560 \| parquet_file_pkey \| 8785 MB \| Y \| 492762975 \| 493017342 \| 492729691 iox_catalog \| parquet_file \| 43 GB \| 152684560 \| parquet_file_table_delete_idx \| 241 MB \| N \| 1136317 \| 24735561304 \| 429892231 iox_catalog \| parquet_file \| 43 GB \| 152684560 \| parquet_file_table_idx \| 2058 MB \| N \| 9288042 \| 351240529272 \| 27551 iox_catalog \| parquet_file \| 43 GB \| 152684560 \| parquet_location_unique \| 6776 MB \| Y \| 399142416 \| 124810 \| 124807 ```` Due to #7842 and #7894, the following indices are no longer used: - `parquet_file_partition_idx` - `parquet_file_table_idx`	2023-06-01 13:45:05 +00:00
Dom	e7bb89e946	Merge pull request #7902 from influxdata/crepererum/issue7470a refactor: remove dead code	2023-06-01 13:46:18 +01:00
Marco Neumann	dd158de08b	refactor: remove dead code	2023-06-01 13:03:01 +02:00
Dom	cff6783241	Merge pull request #7893 from influxdata/dom/reversible-partition-key feat: unambiguously reversible partition keys	2023-05-31 16:09:55 +01:00
Dom	4b0753a800	Merge branch 'main' into dom/reversible-partition-key	2023-05-31 16:04:16 +01:00
Dom	c907916871	docs: fix comment Co-authored-by: Fraser Savage <fsavage@influxdata.com>	2023-05-31 16:04:08 +01:00
Andrew Lamb	a48f681e56	feat(parquet): reduce and limit buffering when writing parquet files (#7880 ) * feat: limit buffering when writing parquet files ("combined solution") * chore: Run cargo hakari tasks --------- Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-31 13:27:32 +00:00
Marco Neumann	e14305ac33	feat: add index for compactor (#7894 ) * fix: migration name * feat: add index for compactor	2023-05-31 12:29:00 +00:00
Marco Neumann	e1c1908a0b	refactor: add `parquet_file` PG index for querier (#7842 ) * refactor: add `parquet_file` PG index for querier Currently the `list_by_table_not_to_delete` catalog query is somewhat expensive: ```text iox_catalog_prod=> select table_id, sum((to_delete is NULL)::int) as n from parquet_file group by table_id order by n desc limit 5; table_id \| n ----------+------ 1489038 \| 7221 1489037 \| 7019 1491534 \| 5793 1491951 \| 5522 1513377 \| 5339 (5 rows) iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id, min_time, max_time, to_delete, file_size_bytes, row_count, compaction_level, created_at, column_set, max_l0_created_at FROM parquet_file WHERE table_id = 1489038 AND to_delete IS NULL; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on parquet_file (cost=46050.91..47179.26 rows=283 width=200) (actual time=464.368..472.514 rows=7221 loops=1) Recheck Cond: ((table_id = 1489038) AND (to_delete IS NULL)) Heap Blocks: exact=7152 -> BitmapAnd (cost=46050.91..46050.91 rows=283 width=0) (actual time=463.341..463.343 rows=0 loops=1) -> Bitmap Index Scan on parquet_file_table_idx (cost=0.00..321.65 rows=22545 width=0) (actual time=1.674..1.674 rows=7221 loops=1) Index Cond: (table_id = 1489038) -> Bitmap Index Scan on parquet_file_deleted_at_idx (cost=0.00..45728.86 rows=1525373 width=0) (actual time=460.717..460.717 rows=4772117 loops=1) Index Cond: (to_delete IS NULL) Planning Time: 0.092 ms Execution Time: 472.907 ms (10 rows) ``` I think this may also be because PostgreSQL kinda chooses the wrong strategy, because it could just look at the existing index and filter from there: ```text iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id, min_time, max_time, to_delete, file_size_bytes, row_count, compaction_level, 
created_at, column_set, max_l0_created_at FROM parquet_file WHERE table_id = 1489038; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Index Scan using parquet_file_table_idx on parquet_file (cost=0.57..86237.78 rows=22545 width=200) (actual time=0.057..6.994 rows=7221 loops=1) Index Cond: (table_id = 1489038) Planning Time: 0.094 ms Execution Time: 7.297 ms (4 rows) ``` However PostgreSQL doesn't know the cardinalities well enough. So let's add a dedicated index to make the querier faster. * feat: new migration system * docs: explain dirty migrations	2023-05-31 10:56:32 +00:00
Stuart Carnie	8c02f81456	chore: Add some docs to the execution_props API	2023-05-31 12:31:38 +10:00
Stuart Carnie	d9d7419693	refactor: move time range logic to separate module The `rewrite_expression` module was getting large, so made sense to move time range logic to its own module.	2023-05-31 12:28:03 +10:00
Stuart Carnie	0a7e162911	refactor: rename into planner submodule	2023-05-31 12:14:12 +10:00
Stuart Carnie	8f2c235b15	refactor: these are APIs for transforming InfluxQL expressions	2023-05-31 12:09:32 +10:00
kodiakhq[bot]	306171e714	Merge pull request #7889 from influxdata/sgc/issue/7829_time_bounds fix: gap filling with multiple lower or upper time bounds	2023-05-31 00:05:45 +00:00
Stuart Carnie	10e15f81fe	Merge branch 'main' into sgc/issue/7829_time_bounds	2023-05-31 09:42:21 +10:00
Joe-Blount	baaffd4445	Merge pull request #7896 from influxdata/jrb_45_deterministic_l0_order feat: Order L0s more deterministically	2023-05-30 17:10:38 -05:00
Joe-Blount	3b77929007	chore: lint fmt	2023-05-30 16:08:40 -05:00
Joe-Blount	c2423d8a5c	feat: Order L0s more deterministically	2023-05-30 15:52:03 -05:00
Fraser Savage	24f0dca838	test(cli): Add test to ensure `regenerate-lp` continues on minor errors This adds a test to the `wal regenerate-lp` command to ensure that non-fatal errors do not block regeneration of any other recoverable entries.	2023-05-30 18:01:28 +01:00
Dom Dwyer	37bb5e0585	test: arbitrary reversible partition keys This test constructs a partition key from an arbitrary selection of pre-defined parts, and uses the resulting template to partition a write containing an arbitrary selection of pre-defined tag columns. Once a partition key is derived, the test asserts build_column_values() reverses it into the original set of tag (column_name, value) tuples present in the write.	2023-05-30 15:58:26 +02:00
Dom Dwyer	27bef292a3	feat: unambiguously reversible partition keys This commit changes the format of partition keys when generated with non-default partition key templates ONLY. A prior fixture test is unchanged by this commit, ensuring the default partition keys remain the same. When a custom partition key template is provided, it may specify one or more parts, with the TagValue template causing values extracted from tag columns to appear in the derived partition key. This commit changes the generated partition key in the following ways: * The delimiter of multi-part partition keys; the character used to delimit partition key parts is changed from "/" to "\|" (the pipe character) as it is less likely to occur in user-provided input, reducing the encoding overhead. * The format of the extracted TagValue values (see below). Building on the work of custom partition key overrides, where an immutable partition template is resolved and set at table creation time, the changes in this PR enable the derived partition key to be unambiguously reversed into the set of tag (column_name, column_value) tuples it was generated from for use in query pruning logic. This is implemented by the build_column_values() method in this commit, which requires both the template, and the derived partition key. Prior to this commit, a partition key value extracted from a tag column was in the form "tagname_x" where "x" is the value and "tagname" is the name of the tag column it was extracted from. After this commit, the partition key value is in the form "x"; the column name is removed from the derived string to reduce the catalog storage overhead (a key driver of COGS). In the case of a NULL tag value, the sentinel value "!" is inserted instead of the prior "tagname_" marker. In the case of an empty string tag value (""), the sentinel "^" value is inserted instead of the "tagname_-" marker, ensuring the distinction between an empty value and a not-present tag is preserved. 
Additionally tag values utilise percent encoding to encode reserved characters (part delimiter, empty sentinel character, % itself) to eliminate deserialisation ambiguity. Examples of how this has changed derived partition keys, for a template of [Time(YYYY-MM-DD), TagValue(region), TagValue(bananas)]: Write: time=1970-01-01,region=west,other=ignored Old: "1970-01-01-region_west-bananas" New: "1970-01-01\|west\|!" Write: time=1970-01-01,other=ignored Old: "1970-01-01-region-bananas" New: "1970-01-01\|!\|!"	2023-05-30 15:58:25 +02:00
Dom Dwyer	57ba3c8cf5	test: default partition key fixture This test asserts the partition key of a write derived from the default partition key template (YYYY-MM-DD). This test ensures that the default partition keys do not change with subsequent changes, as these values are what are used today.	2023-05-30 15:55:08 +02:00
Andrew Lamb	a4789b0ad3	docs: Add ticket reference for other upstream implementations of flightsql (#7891 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-30 13:33:17 +00:00
kodiakhq[bot]	7d1d636358	Merge pull request #7890 from influxdata/dom/explicit-mod refactor: explicit submod for partition_template	2023-05-30 13:21:22 +00:00
Dom Dwyer	9e0570f2bf	refactor: explicit submod for partition_template Move the import into the submodule itself, rather than re-exporting it at the crate level. This will make it possible to link to the specific module/logic.	2023-05-30 15:13:20 +02:00
Dom	e6dc8f17c3	Merge pull request #7888 from influxdata/dependabot/cargo/once_cell-1.17.2 chore(deps): Bump once_cell from 1.17.1 to 1.17.2	2023-05-30 10:14:08 +01:00
Stuart Carnie	dbdb24e3dd	Merge branch 'main' into sgc/issue/7829_time_bounds	2023-05-30 15:46:53 +10:00
Stuart Carnie	600ed6652c	refactor: rewrite time-range expressions to a single range Fixes gap filling, which was confused by multiple lower or upper time bounds.	2023-05-30 15:46:45 +10:00
dependabot[bot]	840692e1f3	chore(deps): Bump once_cell from 1.17.1 to 1.17.2 Bumps [once_cell](https://github.com/matklad/once_cell) from 1.17.1 to 1.17.2. - [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md) - [Commits](https://github.com/matklad/once_cell/compare/v1.17.1...v1.17.2) --- updated-dependencies: - dependency-name: once_cell dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-05-30 02:58:12 +00:00
Andrew Lamb	9ceb3e117a	docs: Add upstream ticket link (#7881 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-29 16:31:18 +00:00
dependabot[bot]	bfc395a30d	chore(deps): Bump comfy-table from 6.1.4 to 6.2.0 (#7883 ) Bumps [comfy-table](https://github.com/nukesor/comfy-table) from 6.1.4 to 6.2.0. - [Release notes](https://github.com/nukesor/comfy-table/releases) - [Changelog](https://github.com/Nukesor/comfy-table/blob/main/CHANGELOG.md) - [Commits](https://github.com/nukesor/comfy-table/compare/v6.1.4...v6.2.0) --- updated-dependencies: - dependency-name: comfy-table dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-29 09:22:56 +00:00
dependabot[bot]	edbecd53af	chore(deps): Bump log from 0.4.17 to 0.4.18 (#7884 ) Bumps [log](https://github.com/rust-lang/log) from 0.4.17 to 0.4.18. - [Release notes](https://github.com/rust-lang/log/releases) - [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/log/compare/0.4.17...0.4.18) --- updated-dependencies: - dependency-name: log dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-29 09:17:08 +00:00
dependabot[bot]	e0720db138	chore(deps): Bump tokio from 1.28.1 to 1.28.2 (#7885 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.28.1 to 1.28.2. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.28.1...tokio-1.28.2) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-29 09:11:24 +00:00
kodiakhq[bot]	79f6615ab7	Merge pull request #7876 from influxdata/dom/frame-docs docs: describe what the spans capture	2023-05-29 09:05:17 +00:00
Dom	8bf834a7ea	Merge branch 'main' into dom/frame-docs	2023-05-29 09:59:37 +01:00
Dom	40ab025dc5	Merge pull request #7886 from influxdata/dependabot/cargo/criterion-0.5.1 chore(deps): Bump criterion from 0.5.0 to 0.5.1	2023-05-29 09:59:31 +01:00
Dom	8f6308fca3	Merge branch 'main' into dom/frame-docs	2023-05-29 09:59:27 +01:00
dependabot[bot]	e2b9beffad	chore(deps): Bump criterion from 0.5.0 to 0.5.1 Bumps [criterion](https://github.com/bheisler/criterion.rs) from 0.5.0 to 0.5.1. - [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md) - [Commits](https://github.com/bheisler/criterion.rs/compare/0.5.0...0.5.1) --- updated-dependencies: - dependency-name: criterion dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-05-29 02:02:12 +00:00
Christopher M. Wolff	2a07b53879	feat: add more tag predicate rewrite logic for InfluxQL (#7869 ) * feat: add more tag predicate rewrite logic for InfluxQL * chore: cargo fmt * chore: fmt * test: add more tests --------- Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-26 21:53:52 +00:00
Andrew Lamb	d3b8fa2c21	chore(docs): Update tracing.md with latest jaeger (#7878 ) Update instructions to use latest jaeger version Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-05-26 17:24:05 +00:00
Fraser Savage	bf031641c5	feat(cli): Add measurement name lookup to `wal regenerate-lp` command This commit adds support for the CLI to query the namespace and schema APIs to retrieve database and table names from the IDs found in WAL entries being regenerated.	2023-05-26 17:31:19 +01:00
Fraser Savage	51d59f8216	refactor(`wal_inspect`): Make `LineProtoWriter` namespace unaware Instead, the type responsible for initialising it handles namespaced `Write` initialisation and management, as well as the failure paths that may need handling. This commit introduces a `NamespaceDemultiplexer` type with a generic implementation allowing fallible `async` lazy init of any type from a given `NamespaceId`. This paves the way for catalog-aware initialisation of `LineProtoWriter`s.	2023-05-26 17:12:35 +01:00
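
The partition key format described in commit 27bef292a3 above can be illustrated with a minimal Rust sketch. This is not the IOx implementation: the function names (`encode_tag_value`, `decode_tag_value`, `build_partition_key`, `parse_tag_values`) are hypothetical, the time part is assumed to be pre-formatted and free of the delimiter, and percent-encoding `!` is an assumption added here so that the sketch round-trips (the commit message lists only the part delimiter, the empty sentinel, and `%` as reserved).

```rust
/// Delimiter between partition key parts (per the commit message).
const DELIMITER: char = '|';
/// Sentinel emitted for a tag that is absent (NULL) in the write.
const NULL_SENTINEL: char = '!';
/// Sentinel emitted for a tag whose value is the empty string.
const EMPTY_SENTINEL: char = '^';

/// Encode one tag value as a partition key part.
/// Hypothetical helper: reserved characters are percent-encoded.
fn encode_tag_value(value: Option<&str>) -> String {
    match value {
        None => NULL_SENTINEL.to_string(),
        Some("") => EMPTY_SENTINEL.to_string(),
        Some(v) => {
            let mut out = String::with_capacity(v.len());
            for c in v.chars() {
                match c {
                    // Assumption: '!' is escaped as well, to avoid
                    // confusing a literal "!" value with NULL.
                    DELIMITER | NULL_SENTINEL | EMPTY_SENTINEL | '%' => {
                        out.push_str(&format!("%{:02X}", c as u32));
                    }
                    _ => out.push(c),
                }
            }
            out
        }
    }
}

/// Reverse `encode_tag_value`. Panics on a malformed escape (sketch only).
fn decode_tag_value(part: &str) -> Option<String> {
    if part == "!" {
        return None; // NULL sentinel
    }
    if part == "^" {
        return Some(String::new()); // empty-string sentinel
    }
    let mut out = String::new();
    let mut chars = part.chars();
    while let Some(c) = chars.next() {
        if c == '%' {
            // The next two characters are the hex code of the escaped byte.
            let hex: String = chars.by_ref().take(2).collect();
            let code = u8::from_str_radix(&hex, 16).expect("valid percent escape");
            out.push(code as char);
        } else {
            out.push(c);
        }
    }
    Some(out)
}

/// Join a pre-formatted time part and the encoded tag values into a key.
fn build_partition_key(time_part: &str, tag_values: &[Option<&str>]) -> String {
    let mut parts = vec![time_part.to_string()];
    parts.extend(tag_values.iter().map(|v| encode_tag_value(*v)));
    parts.join(DELIMITER.to_string().as_str())
}

/// Recover the tag values from a key, skipping the leading time part.
fn parse_tag_values(key: &str) -> Vec<Option<String>> {
    key.split(DELIMITER).skip(1).map(decode_tag_value).collect()
}
```

For the template `[Time(YYYY-MM-DD), TagValue(region), TagValue(bananas)]` from the commit message, `build_partition_key("1970-01-01", &[Some("west"), None])` yields `1970-01-01|west|!`, matching the "New" example there. Unlike `build_column_values()` in the commit, this sketch returns values only and does not consult the template to recover column names.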

12633 Commits (3c0388fdea805a6794a52b1a1844ae712fb06bf3)