This avoids other tests' state bleeding into this one, and this one's state
bleeding into other tests, now that it tests some queries without
scoping by shard.
* feat: compactor ignores max file count for first file
chore: typo in comment in compactor
* feat: restore special first file in partition compaction logic; add limit
* fix: calculation in compaction max file count
chore: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: create namespace API call in router
Co-authored-by: Nga Tran <nga-tran@live.com>
* chore: treat retention as ns except in CLI
* fix: overflow in nanosecond calc
* fix: retention test after changing it from hours to ns
* chore: comment clarification in cli; better response type for error in ns API
* fix: correct some rebase mistakes
* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const
* fix: ns autocreation test was wrong after rebase
* fix: mem catalog has default 1hr retention, accidentally removed in rebase
* chore: remove mem catalog's default 1hr retention; make it settable in sets & router
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: reject writes that are outside the retention period
* feat: add retention validator into handler stack
* chore: Apply suggestions from code review
Co-authored-by: Dom <dom@itsallbroken.com>
* refactor: address review comments
* test: unit tests for retention validation
* chore: address review comments
* test: more unit tests and integration tests
* refactor: make time inside retention period for ephemeral_mode test
* fix: 2 hours
Co-authored-by: Dom <dom@itsallbroken.com>
* feat: flag partition for delete
* fix: compare the right date and time
* chore: Run cargo hakari tasks
* chore: cleanup
* fix: typos
* chore: rust style tidy ups in catalog
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
* feat: deletion flagging in GC based on retention policy
* chore: typo in comment
* fix: only soft delete parquet files that aren't yet soft deleted
* fix: guard against flakiness in catalog test
* chore: some better tests for parquet file delete flagging
Co-authored-by: Nga Tran <nga-tran@live.com>
* refactor: make namespace folder for all namespace's commands
* feat: WIP for add command to set retention period
* feat: more on updating retention period
* feat: grpc for update namespace retention period
* test: end to end test for namespace retention
* fix: lint proto
* chore: cleanup
* chore: kick CI run again
* fix: command hierarchy
* chore: fix comments
The checks for whether a column already exists with a different type
were relying on ordering of the input matching the ordering of the
columns returned from inserting the columns in Postgres.
Rather than trying to match the new ordering that is required to avoid
Postgres deadlocks, switch from a Vec to a HashMap and look up the
column type from the name.
This also reduces some allocations that weren't really needed.
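A minimal sketch of the name-based lookup, assuming the catalog hands back a `name -> type` map; the function and type signatures here are illustrative, not the real IOx code:
```rust
use std::collections::HashMap;

/// Check a write's columns against what the catalog returned, looking each
/// type up by name instead of relying on insertion order.
fn check_column_types(
    wanted: &[(&str, &str)],            // (column name, column type) from the write
    existing: &HashMap<String, String>, // name -> type, as returned by the catalog
) -> Result<(), String> {
    for (name, ty) in wanted {
        if let Some(got) = existing.get(*name) {
            if got.as_str() != *ty {
                return Err(format!(
                    "column {name} already exists with type {got}, write specified type {ty}"
                ));
            }
        }
    }
    Ok(())
}
```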
* fix: Avoid some allocations by collecting instead of inserting into a vec
* refactor: Encode that adding columns is for one table at a time
* test: Add another test of column limits
* test: Add below/above limit tests for create_or_get_many
* fix: Explicitly DO NOT check column limits when inserting many columns
* feat: Cache the max_columns_per_table on the NamespaceSchema
* feat: Add a function to validate column limits in-memory
* fix: Provide more useful information when over column limits
* fix: Swap types to remove intermediate allocation
* docs: Explain the interactions of the cache and the column limits
* test: Actually set up test that showcases column limit race condition
* fix: Allow writing to existing columns even if table is over column limit
Co-authored-by: Dom <dom@itsallbroken.com>
This commit removes tombstone support from the ingester, and deletes
associated code/helpers/tests. This commit does NOT remove tombstone
support from any other service, but MAY include removing overlapping
test coverage.
This also removes the tombstone support from the Ingester -> Querier RPC
response message.
This has the nice side effect of removing a whole lot of thread spawning
in the ingester tests for the Executor, speeding everything up!
This commit carries the SortKey in the PartitionData, and configures the
ingester to use deferred sort key lookups, smearing the lookups across a
fixed period of time after initialising the PartitionData, instead of
querying for the sort key at persist time.
This allows large numbers of PartitionData to be initialised without
causing an equally large spike in catalog load to resolve the sort key -
instead this load is spread out randomly to reduce peak query rps.
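A rough sketch of the smearing idea, assuming a tokio runtime; the helper name and the caller-supplied `fetch` closure are illustrative, not the actual ingester code:
```rust
use std::time::Duration;

use rand::Rng;

/// Wait a random amount of time within a fixed window before hitting the
/// catalog, so many PartitionData initialised at once don't all resolve
/// their sort keys at the same instant.
async fn deferred_sort_key_fetch<F, Fut>(max_smear: Duration, fetch: F) -> Option<Vec<String>>
where
    F: FnOnce() -> Fut,
    Fut: std::future::Future<Output = Option<Vec<String>>>,
{
    let jitter_ms = rand::thread_rng().gen_range(0..=max_smear.as_millis() as u64);
    tokio::time::sleep(Duration::from_millis(jitter_ms)).await;
    fetch().await
}
```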
* fix: always pick cold partitions in next cycle even if they have been partially compacted recently
* fix: comment
* fix: test output
* refactor: using var instead of literal
* fix: consider deleted L0s for recent writes
* chore: cleanup
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Adds a Partition::most_recent_n() method to the catalog interface,
returning the N most recent partitions for a given set of shards.
The most recently created partitions are likely to be currently "hot"
for writes, and are cheap to list.
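A sketch of what the addition could look like; the record fields, ID types, and error type are simplified stand-ins rather than the real `iox_catalog` definitions:
```rust
use async_trait::async_trait;

/// Simplified stand-in for the catalog's partition record.
pub struct PartitionRecord {
    pub id: i64,
    pub shard_id: i64,
    pub partition_key: String,
}

#[async_trait]
pub trait PartitionRepoExt {
    /// Return the `n` most recently created partitions for the given shards.
    async fn most_recent_n(
        &mut self,
        n: usize,
        shards: &[i64],
    ) -> Result<Vec<PartitionRecord>, Box<dyn std::error::Error + Send + Sync>>;
}
```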
* chore: Upgrade to Rust 1.64
* fix: Use iter find instead of a for loop, thanks clippy
* fix: Remove some needless borrows, thanks clippy
* fix: Use then_some rather than then with a closure, thanks clippy
* fix: Use iter retain rather than filter collect, thanks clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
The default statement logging verbosity of the `sqlx` crate is INFO, which
is frankly surprising.
The reason we didn't bother with lowering this before is that the `sqlx` crate
emits logs using the `log` crate, and we're using the `tracing` crate for logging too.
We did bridge the two logging ecosystems with https://docs.rs/tracing-log/latest/tracing_log/
but until https://github.com/influxdata/influxdb_iox/pull/5680 the bridge wasn't really working
so we didn't notice the *very* verbose logs of sqlx statement logging (which logs our whole multiline SQL statements as INFO logs...)
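For reference, a minimal sketch of turning the statement logging down, assuming the sqlx 0.6-era `ConnectOptions::log_statements` signature:
```rust
use log::LevelFilter;
use sqlx::{postgres::PgConnectOptions, ConnectOptions};

/// Lower sqlx's per-statement logging from its surprising INFO default to DEBUG.
fn quiet_statement_logging(mut opts: PgConnectOptions) -> PgConnectOptions {
    opts.log_statements(LevelFilter::Debug);
    opts
}
```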
Adds a migration to add a column "persisted_sequence_number" that
defines the inclusive upper-bound on sequencer writes materialised and
uploaded to object store for the partition.
* feat: garbage collector now cleans up old parquet files
* chore: clarifying comment in GC
* chore: typos in GC
* chore: typos in GC
* fix: cmdline arg in GC test needs updating after refactor
* fix: use select! on shutdown rx in GC
* fix: recalc cutoff in GC each loop
* chore: add delete_old that returns IDs only, for GC
* chore: use duration in GC args instead of usize days
* chore: GC lister runs forever w/ sleep; tests updated accordingly
* docs: fix link in GC comments to automatic link
* chore: test for delete_old_ids_only; refactor mem impl thereof
* chore: make GC test less flakey
* chore: make GC test less flakey
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* ci: use same feature set in `build_dev` and `build_release`
* ci: also enable unstable tokio for `build_dev`
* chore: update tokio to 1.21 (to fix console-subscriber 0.1.8)
* fix: "must use"
* feat: make compactors select candidates based on the last n minutes to reduce workload for the postgres catalog query
* refactor: remove 1-minute case per review comment
* feat: initial implementation of memory estimation for a compaction
* feat: estimate size of files and have the right actions for the needed budget
* feat: run candidates in parallel
* fix: have the right name for the column field of the output struct
* feat: add metrics for estimated budgets
* chore: cleanup
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fix syntax after applying review's suggestions
* refactor: Convert a Vec to VecDeque to go well with pop and push
* chore: remove max_concurrent_size_bytes and input_size_threshold_bytes
* chore: remove input_file_count_threshold
* test: tests for estimate_arrow_bytes_for_file
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: remove min_sequence_number
* fix: typos
* fix: remove min_sequencer_number from new files from merging main
* fix: add back throwing error if the compactor compacts files persisted by the ingester after the ingester sends max seq_num back to querier
* test: add test_compactor_collision back but modify the input to make it work with new changes
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: initial implementation for selecting compaction candidates
* feat: 2 catalog functions to choose the highest-throughput partitions to compact, plus the candidate-selection function itself
* test: tests for the new 2 queries
* feat: more tests and metrics for choosing compaction candidates
* chore: Apply self suggestions from self review
* chore: cleanup
* chore: fix doc comment
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: address review comments
* fix: get the right time provider for the tests
* refactor: remove the left over compaction_
* fix: typos
* fix: make the param name and env name consistent
* refactor: make relevant iSomething to uSomething
* fix: typo
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
In the all-in-one command, only one write buffer partition is supported,
and it's specified using Kafka Partition ID 0:
```
// All-in-one mode only supports one write buffer partition.
let write_buffer_partition_range_start = 0;
let write_buffer_partition_range_end = 0;
```
When using all-in-one mode with an ephemeral, in-memory catalog,
`create_or_get_default_records` is what puts records into the catalog
that need to match the write buffer configuration.
* refactor: change level 1 to level 2 preparing for next design changes
* fix: make level-2 consistent everywhere
* chore: remove unused comments
* refactor: rename all level_1 names to level_2 to completely replace 1 with 2 and make everything consistent
* chore: add corresponding constants for the compaction levels in the comments
Co-authored-by: Dom <dom@itsallbroken.com>
* refactor: avoid feeding sort key from struct into same struct
* feat: allow namespace schema query by ID
* refactor: do not use binary parquet file MD in compactor tests
* refactor: do not use in-parquet IOx metadata
* refactor: reduce number of catalog queries
* refactor: store per-file column set in catalog
Together with the table-wide schema and the partition-wide sort key, this should
be everything we need to read a parquet file directly into memory
without peeking any file-level metadata.
The querier will use this to directly load parquet files into the read
buffer.
**WARNING: This requires a catalog wipe!**
Ref #4124.
* refactor: use proper `ColumnSet` type
* feat: Log time spent requesting ingester partitions
Fixes #4558.
* feat: Record a metric for the duration queriers wait on ingesters
* fix: Use DurationHistogram instead of U64 Histogram
* test: Add a test for the ingester ms metric
* feat: Add back the logging to provide both logging and metrics for ingester duration
* refactor: Use sample_count method on metrics
* feat: Record ingester duration separately for success or failure
* fix: Create a separate test for the ingester metrics
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This commit changes the code base to use a new reference-counted
PartitionKey type wrapper, instead of passing a bare String around.
This allows the compiler to type check & verify usage of the partition
key, instead of passing a bare string around. By reference counting the
underlying string, we reduce memory usage for some use cases.
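A minimal sketch of such a reference-counted newtype (illustrative only, not the real type's full API):
```rust
use std::sync::Arc;

/// Cheap-to-clone, type-checked partition key wrapper.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PartitionKey(Arc<str>);

impl From<&str> for PartitionKey {
    fn from(s: &str) -> Self {
        Self(Arc::from(s))
    }
}

impl std::fmt::Display for PartitionKey {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(&self.0)
    }
}
```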
* feat: Change data type of catalog Postgres partition's sort_key from a string to an array of string
* test: add column with comma
* fix: use new protobuf field to avoid incompatibility
* fix: ensure sort_key is an empty array rather than NULL
* refactor: address review comments
* refactor: address more comments
* chore: clearer comments
* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql
* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql
* fix: Rename migration so it will be applied after
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
This PR is the first step where we add a new column sort_key_arr whose content we'll manually migrate from sort_key.
When we're done with this, we'll merge https://github.com/influxdata/influxdb_iox/pull/4801/ (whose migration script must be adapted slightly to rename the `sort_key_arr` column back to `sort_key`).
All this must be done while we shut down the ingesters and the compactors.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This is useful for local instances that run against a prod system,
because port forwarding can lead to long connection delays.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Add lookup of partitions by table id to catalog.
Add API to catalog to return partitions by table id.
Add to client to return partitions by table id.
Add CLI to pull remote schema, partition, and parquet files into a local catalog and object store.
Add method to catalog to get parquet file by object store id.
Add gRPC service for object store to get a file from by its uuid.
Add the object store service to router2 with object store config.
Fix the ingester to track the max persisted sequence number per partition.
Ensure replay takes in data from unpersisted partitions.
Simplify the table persist info to not return a max persisted sequence number for the table as that information isn't needed.
After checking the postgres workload for the catalog in prod, this
missing index was noted as the cause of unexpectedly expensive plans for
simple queries.
* fix: create_or_get_multi for column in catalog now enforces limits
fix: create_or_get_multi for column in catalog now enforces limits
chore: reorder catalog column create fns to be next to each other
test: add failing test for multi col insert w/ limits
test: bend catalog mem impl to match postgres for tests
fix: postgres column insert many column type error checks
chore: clippy
* test: assert column counts in partial column insert test
* chore: add some sql comments to the monster multicolumn insert query; s/RIGHT/INNER/ join
* chore: adding comments to clarify partial failure behaviour of multi col insert
* test: add tests for create_or_get_many columns in catalog
* test: forgot how macros work for a moment
* test: service limit test handles partial update of cols
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Add indexes so compactor can find candidate partitions and specific partition files quickly.
Limit number of level 0 files returned for determining candidates. This should ensure that if compaction is very backed up, it will be able to work through the backlog without evaluating the entire world.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Add configuration options for compactor for the max size of level 0 files and split percentage.
Add metrics for compaction to track the number of candidates, compactions, and durations.
Add functions to separate identifying partitions to compact from running compaction.
Make compaction run in smaller chunks, specifically per partition.
Update compaction to automatically promote level 0 files that are non-overlapping without waiting some period of time.
Closes #4120
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: remove fully processed tombstones
* test: first few tests
* fix: delete SQL
* fix: test how IN (...) works in PG
* fix: test how IN (?) works in PG
* fix: test how IN (?) works in PG
* fix: dynamically add IN (?, ?, ...)
* fix: dynamically add IN (?, ?, ...) & its dynamic values
* fix: add argument directly in the SQL
* test: more tests for catalog read and update functions
* chore: move a subfunction to make it easier to read
* test: first test for find_can_compact but disabled due to bug
* test: integration tests and a bug fix for find_and_compact
* chore: cleanup
* refactor: address review comments
* fix: put the 2 deletes (processed tombstones and tombstones) in a transaction
Lowercases the error messages in the big iox_catalog Error enum for
better composition of messages (no random capitalisation in
glued-together strings, which is common with wrapped errors).
Set to_delete to the time the file was marked as deleted rather than
true.
Fixes #4059.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: `TombstoneRepo::list_by_table`
* feat: `ParquetFileRepo::list_by_table_not_to_delete`
* refactor: `querier` w/o `db`
Get the `querier` to work w/o relying on `db`. A few notes:
- Testing is kinda shallow, we really need to get `query_tests` working
w/ `querier` (see #3934).
- We still run a sync loop for namespaces, tables and schemas. This will
be replaced by "update namespace incl. tables and schemas on demand".
Note however that we cannot fetch single tables and schemas on demand
at the moment, because DataFusion doesn't implement async schema
inspection (only `scan` / "give me all the chunks" is async). I think
that's OK for now and we can address this later.
- There is NO cache for parquet files and tombstones at the moment. For
correctness, they need to be fetched in a single transaction (or we
need a kinda tricky sequence number / logical clock tracking) and I am
not sure yet how this makes sense when we have the ingester data wired
up and predicates pushed down to the catalog (see next point). So
let's measure first and then decide on a caching strategy for this.
- Predicates are currently NOT pushed down to the catalog. I'll need to
figure out how to extract time range from generic DataFusion
expressions to make that work (it's easier for InfluxRPC queries, but
they are not tested at the moment, see first point).
Sorry that this commit is kinda huge. I initially planned to only
migrate the chunks away from `db` and leave the tables and schemas for a
follow-up PR, but the DataFusion trait structure (chunks are bound to
their tables) makes this kinda pointless.
Closes #3974.
* docs: explain what we're doing
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: mention tracking issues
* docs: explain what we're doing
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* feat: `TableRepo::get_by_namespace_and_name`
* refactor: rework `TableCache`
- dual cache that can also map table names to IDs
- deal w/ missing tables w/o panics
- set proper timeouts to missing data
For #3974.
* test: extend table cache tests
fix: refactor table & col limit enforcement in catalog into single SQL statement
fix: borked rebase
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Configure the postgres catalog to close unused connections after 1
minute, rather than 500s, to introduce a bit of fluidity to the pool of
connection acquires.
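A sketch of the knob involved, assuming sqlx's `PgPoolOptions::idle_timeout`; the function shape and DSN handling are placeholders:
```rust
use std::time::Duration;

use sqlx::postgres::PgPoolOptions;

async fn connect(dsn: &str) -> Result<sqlx::PgPool, sqlx::Error> {
    PgPoolOptions::new()
        // Close connections that have sat idle for a minute instead of ~500s,
        // so the pool cycles through connections a little more fluidly.
        .idle_timeout(Duration::from_secs(60))
        .connect(dsn)
        .await
}
```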
When created in the catalog, parquet files should always have compaction
level 0. Updating the compaction level should always happen in the
compactor.
Only the catalog should need to know about the initial compaction level
value.
This has the advantages of:
- Not needing to create fake parquet file IDs or fake deleted_at
values that aren't used by create before insertion
- Not needing too many arguments for create
- Naming the arguments so it's easier to see what value is what
argument, especially in tests
- Easier to reuse arguments or parts of arguments by using copies of
params, which makes it easier to see differences, especially in tests
Adds indexes to the JOINed fields to reduce execution cost, as the
TableRepo::get_table_persist_info() is currently by far the most
expensive catalog operation.
This includes a bit of a refactor in the locking structure of the buffer data. Locking at the partition collection and within the partition data was making things more complex than they needed to be. The partitions in the buffer are there only temporarily until they get persisted. Locking on the table simplifies things a bit and makes it more clear when the table state is being modified since it no longer has any interior mutability. Having access to separate partitions without the same lock isn't something we need because queries will hit all partitions and data is brought in sequentially, regardless of which partition it is hitting in a sequencer.
Fixes #3850
Uses the new ColumnRepo::create_or_get_many() catalog method to perform
a bulk upsert of (potentially) new columns to the catalog during schema
validation.
Removes openssl as a dependency, switching to rustls[1] as the TLS
implementation throughout.
It is important to note that this change brings with it a significant
behavioural difference - rustls does not currently support IP SANs in
certificates (instead only supporting fully-qualified names / DNS) and
this will manifest as a failure to connect to IP endpoints over TLS.
This might be a blocker that prevents us using rustls exclusively, but
there's no easy way to know without trying it. Fortunately the rustls
project has received funding to work on IP SAN support[2].
[1]: https://github.com/rustls/rustls
[2]: https://www.abetterinternet.org/post/preparing-rustls-for-wider-adoption/
* feat: changes needed to apply tombstones correctly on the life-cycle ingest batches
* refactor: adjust the design after discussing with Paul
* feat: apply the incoming tombstone on all data but the persisting one
* chore: fmt
* fix: build on buffer tombstone
* test: delete & write tests for a partition and some cleanup
* feat: No need to add processed tombstones for a newly created parquet file in the ingester because all deletes before that parquet file was created were already applied
* chore: cleanup
* feat: initial implementation for preparing data to send back to the Querier
* feat: full implementation of prepare_data_to_querier
* fix: apply filters for the batches
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* chore: cleanup
* fix: typos in comments
* fix: typos in comments
* fix: typos in comments
* test: create different scenarios and test them
* chore: fix typos
* test: add tests with deletes
* chore: make pub pub(crate)
* chore: Apply suggestions from code review
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
* refactor: address review comments
* fix: keep batches in their arrival order
* refactor: do not assign unnecessary values to the enum
* refactor: use bitflags enum
* fix: use bitflags correctly
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: avoid using `use` at the end of the function
* chore: merge main to branch
* fix: fix downgrade versions
* refactor: address review comments
* chore: remove unnecessary comments
* refactor: Make the whole test_utils module test-only and bring paths into module scope
Co-authored-by: Paul Dix <paul@pauldix.net>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
A quick change to perform the ColumnRepo::create_or_get() calls in
parallel (up to a maximum of 3 in-flight at any one time) in order to
mitigate the latency of the call and reduce the overall schema
validation call duration.
The in-flight limit is enforced to avoid starving the DB connection pool
of connections.
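A sketch of the capped fan-out using `futures::stream::buffer_unordered`; the `create_or_get` closure stands in for the real `ColumnRepo` call:
```rust
use futures::stream::{self, StreamExt, TryStreamExt};

/// Issue one catalog call per column, but keep at most three of them in
/// flight so the DB connection pool isn't starved.
async fn upsert_columns<F, Fut, C, E>(columns: Vec<String>, create_or_get: F) -> Result<Vec<C>, E>
where
    F: Fn(String) -> Fut,
    Fut: std::future::Future<Output = Result<C, E>>,
{
    stream::iter(columns)
        .map(|name| create_or_get(name))
        .buffer_unordered(3) // in-flight limit
        .try_collect()
        .await
}
```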
Wraps the postgres implementation of the catalog with a MetricDecorator.
This is slightly intrusive, with the metrics registry being pushed into
the PostgresCatalog type in order to decorate the impls returned when
calling the Catalog::repositories() and Catalog::start_transaction()
methods (rather than being a pure decorator) in order to use static
dispatch and let the compiler optimise away as much overhead as
possible.
Adds a MetricDecorator type that wraps a Catalog impl, delegating calls
to the underlying impl and recording per-method call latency broken down
by success/error state.
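The per-method wrapping boils down to something like the following sketch, where `observe` stands in for whatever histogram the metric registry provides:
```rust
use std::time::{Duration, Instant};

/// Time a delegated catalog call and record its latency, broken down by
/// success/error state.
async fn record_latency<T, E, Fut>(
    call: Fut,
    mut observe: impl FnMut(Duration, bool),
) -> Result<T, E>
where
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let started = Instant::now();
    let res = call.await;
    observe(started.elapsed(), res.is_ok());
    res
}
```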
Fixes #3702. This pulls the min sequence tracking into the LifecycleManager. Because the number requires looking at all other partitions in memory, this was the most efficient place to put it. The manager updates the sequencer state after it calls persist. The number is meant to be a lower bound on the sequence number. Issue #3783 will add functionality for the ingester to ignore replayed data that has already been persisted.
* feat: changes needed to apply tombstones correctly on the life-cycle ingest batches
* refactor: adjust the design after discussing with Paul
* feat: apply the incoming tombstone on all data but the persisting one
* chore: fmt
* fix: build on buffer tombstone
* test: delete & write tests for a partition and some cleanup
* feat: No need to add processed tombstones for a newly created parquet file in the ingester because all deletes before that parquet file was created were already applied
* chore: cleanup
Co-authored-by: Paul Dix <paul@pauldix.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
The _sql_migrations table cannot be created by the `catalog setup`
bootstrap command because it is created before the migrations run, one
of which creates the iox_catalog namespace the catalog operates in.
This change allows the _sql_migrations table to be created in the public
schema to start the migration process, with everything else living in
the iox_catalog schema.
If two callers call create_or_get() for a tombstone, providing the same
(table_id, sequencer_id, sequence_number) triplet but a different
predicate / timestamps the catalog MUST NOT silently continue.
As this is unexpected, this behaviour causes a panic.
If two callers call create_or_get() for a partition, providing the same
partition key & table ID, but different sequence numbers the catalog
MUST NOT continue silently.
As this is unexpected, this behaviour causes a panic.
This commit introduces test cases that ensure that for each of the
tombstone & partition repos:
* create_or_get() is idempotent
* create_or_get() does not silently drop conflicting writes
The former covers the expected use case: callers issuing potentially
multiple calls to create_or_get() with the same args, causing the
catalog impl to transparently turn the subsequent calls into NOPs.
The latter illustrates an issue where multiple calls to create_or_get()
with differing arguments is silently accepted by the catalog, causing
both callers to believe they have committed the (differing) details they
provided. This test is expected to pass but fails in this commit.
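The semantics under test can be illustrated with a toy, in-memory stand-in (not the real catalog API; the conflicting detail is simplified to a single ID field):
```rust
use std::collections::HashMap;

/// Toy model: identical calls are idempotent; conflicting calls for the same
/// (partition key, table) pair must not be silently accepted.
#[derive(Default)]
struct PartitionTable {
    // (partition key, table id) -> sequencer id recorded by the first call
    rows: HashMap<(String, i64), i64>,
}

impl PartitionTable {
    fn create_or_get(&mut self, key: &str, table_id: i64, sequencer_id: i64) -> i64 {
        let existing = *self
            .rows
            .entry((key.to_string(), table_id))
            .or_insert(sequencer_id);
        // A second caller with differing details must not be led to believe
        // its (differing) details were committed.
        assert_eq!(existing, sequencer_id, "conflicting create_or_get()");
        existing
    }
}
```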
This commit changes the iox_catalog test harness so that each test is
run in an independent, randomly generated schema to avoid concurrent
tests interfering with each other.
Each test creates a randomly-named schema, grants permissions to the new
schema, runs the full migration stack against the new schema, and then
executes the code under test.
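A sketch of the per-test setup, assuming sqlx; the helper shape, schema name format, and the elided GRANT step are assumptions:
```rust
use rand::Rng;
use sqlx::PgPool;

/// Create a randomly named schema for this test and point the connection's
/// search_path at it; migrations and the code under test run afterwards.
async fn create_test_schema(pool: &PgPool) -> Result<String, sqlx::Error> {
    let schema = format!("test_{:08x}", rand::thread_rng().gen::<u32>());
    sqlx::query(&format!(r#"CREATE SCHEMA IF NOT EXISTS "{schema}""#))
        .execute(pool)
        .await?;
    sqlx::query(&format!(r#"SET search_path TO "{schema}""#))
        .execute(pool)
        .await?;
    Ok(schema)
}
```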
Allow the caller to set the Postgres schema a migration should be
applied to, rather than restricting the migration to a specific,
hard-coded schema.
BREAKING CHANGE: manually adds a new migration that precedes the
existing migration to ensure the iox_catalog schema exists before
applying the migration. You'll probably have to drop any existing
databases and migrate from scratch:
sqlx database drop; sqlx database create;
* chore: port sqlx-hotswap-pool over from conductor
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
* chore: workspace hack fixes
* fix: unique schema per test db connection
* fix: adjust search path in catalog pg tests to see if it fixes test schema issue
* fix: actually fixed sqlx hotswap pool test
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: allow catalog access w/o a transaction
Now the caller has the full control if they want to use a transaction or
not.
* fix: remove non-transaction-safe `create_many`
* fix: remove unnecessary transactions
* feat: projection pushdown for QueryableBatch
* chore: clean up and remove unwrap
* fix: Add Sync to a Snafu source to have the code compile
* chore: cleanup and add comments for tests
* refactor: Add tests for scanning non existing columns and fix related bugs
* chore: modify comment to trigger auto check in github work
* refactor: catalog Unit of Work (= transaction)
Set up an interface to handle Units of Work within our catalog. Previously
both the Postgres and the in-mem backend used "mini-transactions on
demand". Now the caller has a clear way to establish boundaries and
gets read and write isolation. A single `Arc<dyn Catalog>` can create as
many `Box<dyn UnitOfWork>` as you like, but note that depending on the
backend you may not scale infinitely (postgres will likely impose
certain limits and the in-mem backend limits concurrency to 1 to keep
things simple). A rough sketch of the resulting commit/abort boundary is shown after this list.
* docs: improve wording
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: rename Unit of Work to Transaction
* test: improve `test_txn_isolation`
* feat: clarify transaction drop semantics
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
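As referenced above, a rough sketch of the resulting commit/abort boundary (illustrative only; the real trait also exposes the repo methods and uses its own error type):
```rust
use async_trait::async_trait;

type TxnError = Box<dyn std::error::Error + Send + Sync>;

#[async_trait]
pub trait Transaction: Send + Sync {
    /// Make all changes performed through this handle visible to other readers.
    async fn commit(self: Box<Self>) -> Result<(), TxnError>;
    /// Discard all changes performed through this handle.
    async fn abort(self: Box<Self>) -> Result<(), TxnError>;
}
```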
* feat: add ProcessedTombstoneRepo
* feat: add function add_parquet_file_with_tombstones
* fix: remove unnecessary use
* feat: handling transaction when adding parquet file and its processed tombstones
* feat: tests update catalog for parquet file and processed tombstones
* fix: make add parquet file & its processed tombstones fully transactional
* chore: cleanup
* test: add integration tests for new catalog update functions
* chore: remove catalog_update.rs
* chore: cleanup
* fix: assert the right values
* fix: create unique namespace
* fix: support non transaction create_many
* test: remove tests that do not work in a transaction
* fix: one more case with unique namespace
* chore: more verification for better understanding of why certain tests fail
* fix: compare the difference rather than the absolute value because the DB already has data
* fix: fix the argument provided to SQL
* fix: return non-empty processed tombstones
* fix: insert the right parquet file
* chore: remove unused file
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This adds the scaffolding for the ingester server to consume data from Kafka. This ingests data in an in-memory structure while creating records in the catalog for any partitions that don't yet exist.
I've removed catalog_update.rs in ingester for now. That was mostly a placeholder and will be going in a combination of handler.rs and data.rs on my next PR which will have some primitive lifecycle wired up.
There's one ugly bit here where the DML write is cloned because it's getting borrowed to output spans and metrics. I'll need to follow up with a refactor to make it so that the DML write's tables can be consumed without it gumming up the metrics stuff.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>