influxdb

Commit Graph

Author	SHA1	Message	Date
Dom Dwyer	85d6efafe1	refactor: snapshot_to_persisting redundant ID Partition::snapshot_to_persisting() passes the ID of the partition it is calling `snapshot_to_persisting()` on. The partition already knows what its ID is, so at best it's redundant, and at worst, inconsistent with the actual ID.	2022-09-16 17:08:08 +02:00
dependabot[bot]	099dda430e	chore(deps): Bump digest from 0.10.3 to 0.10.5 (#5655) Bumps [digest](https://github.com/RustCrypto/traits) from 0.10.3 to 0.10.5. - [Release notes](https://github.com/RustCrypto/traits/releases) - [Commits](https://github.com/RustCrypto/traits/compare/digest-v0.10.3...digest-v0.10.5) --- updated-dependencies: - dependency-name: digest dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Dom <dom@itsallbroken.com>	2022-09-16 14:03:54 +00:00
kodiakhq[bot]	2c9f9ec52b	Merge pull request #5657 from influxdata/dom/per-partition-read-offset perf: O(1) partition persist mark discovery	2022-09-16 12:13:31 +00:00
Dom Dwyer	ce0d189260	perf: O(1) partition persist mark discovery Changes the ingest code path to eliminate scanning the parquet_files table to discover the last persisted offset per partition, instead utilising the new persisted_sequence_number field on the Partition itself to read the same value. This lookup blocks ingest for the shard, so removing the expensive query from the ingest hot path should improve catch-up time after a restart/deployment.	2022-09-16 14:06:42 +02:00
kodiakhq[bot]	69c9e7b5ff	Merge pull request #5650 from influxdata/cn/partition-estimates-size refactor: Clear up responsibilities of different parts of the compactor	2022-09-15 18:50:50 +00:00
kodiakhq[bot]	1c0b6997c1	Merge branch 'main' into cn/partition-estimates-size	2022-09-15 18:43:36 +00:00
Carol (Nichols || Goulding)	f5497a3a3d	refactor: Extract a conversion for convenience in tests	2022-09-15 12:48:36 -04:00
kodiakhq[bot]	609707c2d5	Merge pull request #5652 from influxdata/dom/nullable-partition-persist refactor(db): NULLable persisted_sequence_number	2022-09-15 16:27:37 +00:00
kodiakhq[bot]	4fe5311d8b	Merge branch 'main' into dom/nullable-partition-persist	2022-09-15 16:20:54 +00:00
Dom Dwyer	66bf0ff272	refactor(db): NULLable persisted_sequence_number Makes the partition.persisted_sequence_number column in the catalog DB NULLable. 0 is a valid persisted sequence number.	2022-09-15 18:19:39 +02:00
Marco Neumann	e346433914	refactor: concurrent table scan for "table names" (#5649) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-15 15:39:00 +00:00
Carol (Nichols || Goulding)	dcab9d0ffc	refactor: Combine relevant data with the FilterResult state This encodes the result directly and has the FilterResult hold only the relevant data to the state. So no longer any need to create or check for empty vectors or 0 budget_bytes. Also creates a new type after checking the filter result state and handling the budget, as actual compaction doesn't need to care about that. This could still use more refactoring to become a clearer pipeline of different states, but I think this is a good start.	2022-09-15 11:13:18 -04:00
Carol (Nichols || Goulding)	e57387b8e4	refactor: Extract an inner function so partition isn't needed in tests	2022-09-15 11:10:14 -04:00
Carol (Nichols || Goulding)	a284cebb51	refactor: Store estimated bytes on the CompactorParquetFile	2022-09-15 11:10:14 -04:00
Carol (Nichols || Goulding)	70094aead0	refactor: Make estimating bytes a responsibility of the Partition Table columns for a partition don't change, so rather than carrying around table columns for the partition and parquet files to look up repeatedly, have the `PartitionCompactionCandidateWithInfo` keep track of its column types and be able to estimate bytes given a number of rows from a parquet file.	2022-09-15 11:10:14 -04:00
kodiakhq[bot]	f718cfd71c	Merge pull request #5648 from influxdata/dom/per-partition-persist-markers feat: store per partition persist markers	2022-09-15 14:59:33 +00:00
Dom Dwyer	f4cc9a6984	docs: partition persist visibility invariants Document the invariants (and non-invariants) of Partition.persisted_sequence_number.	2022-09-15 16:10:35 +02:00
Dom Dwyer	234d460fcb	chore: rename update_persisted_sequence_number fn	2022-09-15 16:10:35 +02:00
Dom Dwyer	f91d802107	feat: store per-partition persist markers Changes the ingester to record the per-partition, maximum persisted sequencer offsets to the catalog. This will enable quick O(1) lookup in the future, but the currently persisted value is only used to assert the per-partition monotonic persist ordering invariant.	2022-09-15 16:10:35 +02:00
Dom Dwyer	300938f858	refactor: assert partition persistence ordering Assert the per-shard / per-partition persistence watermarks monotonically increase, and document the invariant. NOTE: this is not a new invariant, just a new assertion to validate it.	2022-09-15 16:10:35 +02:00
Dom Dwyer	d199a83355	feat(catalog): per-partition persist mark API Adds the "persisted_sequence_number" field to the Partition model, and updates the catalog API to read & update it.	2022-09-15 16:10:35 +02:00
Dom Dwyer	c5ac17399a	refactor(db): persist marker for partition table Adds a migration to add a column "persisted_sequence_number" that defines the inclusive upper-bound on sequencer writes materialised and uploaded to object store for the partition.	2022-09-15 16:10:35 +02:00
Marco Neumann	159250e776	refactor: concurrent table planning in InfluxRPC (#5647) * refactor: concurrent table planning in InfluxRPC Some InfluxRPC can scan multiple tables. Prior to this PR we were always scanning the tables in sequence, adding up potential latencies (catalog, ingester, object store). There is no reason we need to do this, "ordinary" SQL queries would not serialize this way either. So let's scan tables concurrently. This add concurrency to: - read filter - read group - read window aggregate There are other query types that could benefit from a similar treatment. They will be changed in a follow-up. * docs: improve Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * test: explain `Send` assertion * refactor: change `CONCURRENT_TABLE_JOBS` to 10 Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-09-15 13:55:22 +00:00
Marco Neumann	513fdf1e26	feat: split "pruned" metric into "early" and "late" (#5645) * feat: split "pruned" metric into "early" and "late" * docs: improve Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * docs: explain `PruningMetrics` * test: try to test pruning Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-09-15 13:42:00 +00:00
Marco Neumann	f7b6f81fe1	feat: concurrent chunk creation (#5646) Create chunks in querier concurrently after we've pre-filtered them. Chunk creation still may require a bit of cached information (e.g. the partition sort key) and we can easily fetch these concurrently instead of in order. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-15 12:30:02 +00:00
kodiakhq[bot]	1bac7792db	Merge pull request #5644 from influxdata/dom/split-data refactor: hoist per-partition persistence watermark from buffer	2022-09-15 09:32:05 +00:00
Dom	f84ca2a44f	Merge branch 'main' into dom/split-data	2022-09-15 09:58:31 +01:00
Stuart Carnie	e5d8f23fcd	chore: Remove variants from Identifier and BindParameter types (#5642) * chore: Remove variants from Identifier and BindParameter types This simplifies usage of these types. Display traits have been updated to properly quote and escape the output, when necessary. * chore: Fix docs	2022-09-15 06:52:31 +00:00
Nga Tran	7c4c918636	chore: add parttion id into panic message (#5641) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-15 02:21:13 +00:00
Stuart Carnie	e6f2a105e5	feat: Improved InfluxQL error messages (#5632) * chore: Drive by to improve tests and coverage * chore: Make Error generic, so we can change it * chore: change visibility pub(crate) is superfluous, as we are yet to specify which APIs are public outside the crate in lib.rs * chore: Introduce crate IResult type In preparation of adding custom error type * feat: Initial implementation of custom error type * chore: Add module docs * chore: Rename IResult → ParseResult; syntax and expect errors * chore: ParserResult and error refactoring * chore: Drive by simplification * feat: Add custom errors to string parsing * feat: Added public API to parse a set of statements * chore: Errors are dyn Display to convey their intent Errors from the parser are only displayable messages. * chore: Separate SHOW for improved error handling By moving SHOW to a separate parser, we can display clearer error messages when consuming SHOW followed by an unexpected token. * chore: Docs and cleanup * chore: Add tests and a specific `ParseError` type The fields are intentionally not public yet, as we would like clients of the package to display the message only. * chore: PR feedback to improve the `ORDER BY` error message	2022-09-15 00:19:03 +00:00
kodiakhq[bot]	a5aa871ff8	Merge pull request #5639 from influxdata/cn/always-get-extra-info refactor: Move fetching of table columns, extra partition info into the method	2022-09-14 17:08:57 +00:00
kodiakhq[bot]	08e2523295	Merge branch 'main' into cn/always-get-extra-info	2022-09-14 17:01:59 +00:00
Dom Dwyer	fc17f2ec2d	refactor: hoist persistence watermark from buffer The maximum persisted sequence number is tracked to answer "up to where has this partition been persisted", used for querying and skipping writes that have already been applied (though I suspect this is redundant). This is a property of the partition, not the actual data buffer, so this commit hoists it up out of the data buffer and onto the per-partition data structure, internalising the field in the process (not pub).	2022-09-14 18:07:45 +02:00
Nga Tran	44e12aa512	feat: add needed budget and memory budget into the message for us to diagnose and increase our memory budget as needed (#5640)	2022-09-14 16:06:19 +00:00
Carol (Nichols || Goulding)	e16306d21c	refactor: Move fetching of extra partition info into the method because it's always needed	2022-09-14 11:14:17 -04:00
Andrew Lamb	8b273c2a7d	docs: Add comments about how to see debug logs via `cargo test` (#5627) * docs: Add documentation about how to see debug logs via `cargo test` * fix: Update test_helpers/src/lib.rs Co-authored-by: Marco Neumann <marco@crepererum.net> * fix: Update test_helpers/src/lib.rs Co-authored-by: Marco Neumann <marco@crepererum.net> * fix: Update test_helpers/src/lib.rs * fix: fmt Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 14:16:46 +00:00
Luke Bond	b52865e018	feat: garbage collector now cleans up old parquet files (#5588) * feat: garbage collector now cleans up old parquet files * chore: clarifying comment in GC * chore: typos in GC * chore: typos in GC * fix: cmdline arg in GC test needs updating after refactor * fix: use select! on shutdown rx in GC * fix: recalc cutoff in GD each loop * chore: add delete_old that returns IDs only, for GC * chore: use duration in GC args instead of usize days * chore: GC lister runs forever w/ sleep; tests updated accordingly * docs: fix link in GC comments to automatic link * chore: test for delete_old_ids_only; refactor mem impl thereof * chore: make GC test less flakey * chore: make GC test less flakey Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 14:09:28 +00:00
dependabot[bot]	7e1f013346	chore(deps): Bump itertools from 0.10.3 to 0.10.4 (#5631) Bumps [itertools](https://github.com/rust-itertools/itertools) from 0.10.3 to 0.10.4. - [Release notes](https://github.com/rust-itertools/itertools/releases) - [Changelog](https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-itertools/itertools/compare/v0.10.3...v0.10.4) --- updated-dependencies: - dependency-name: itertools dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 14:02:14 +00:00
Marco Neumann	2332e5de10	refactor: slightly increase querier namespace cache TTLs (#5635) This should lower catalog load and eliminate a few costly cache misses. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 13:54:51 +00:00
dependabot[bot]	1353a429d7	chore(deps): Bump tokio from 1.21.0 to 1.21.1 (#5630) * chore(deps): Bump tokio from 1.21.0 to 1.21.1 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.0 to 1.21.1. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.0...tokio-1.21.1) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-09-14 13:22:03 +00:00
dependabot[bot]	b4a25fdb0e	chore(deps): Bump thiserror from 1.0.34 to 1.0.35 (#5629) Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.34 to 1.0.35. - [Release notes](https://github.com/dtolnay/thiserror/releases) - [Commits](https://github.com/dtolnay/thiserror/compare/1.0.34...1.0.35) --- updated-dependencies: - dependency-name: thiserror dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 12:54:12 +00:00
Dom	9ed931271a	Merge pull request #5634 from influxdata/dom/split-data refactor(ingester): split data.rs into submodules	2022-09-14 13:44:01 +01:00
Dom Dwyer	ee8cdb48af	style(ingester): fmt imports & long strings Rewrite the imports to be a consistent order; std, external, crate and merge all crate-level imports into one use statement.	2022-09-14 14:20:19 +02:00
Dom Dwyer	074722eb3e	refactor(ingester): split data.rs into modules Breaks the gigantic data.rs file into sub-modules for Shard, Namespace, Table, Partition, and finally the actual data buffer used to store writes.	2022-09-14 14:20:19 +02:00
Andrew Lamb	45d795055a	feat: Support calling influxql/flux selector aggregates from IOx SQL (#5628) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 10:37:17 +00:00
kodiakhq[bot]	674670e17f	Merge pull request #5622 from influxdata/cn/infallible-estimated-bytes fix: Use `ColumnType` as an enum in more places; make `estimate_arrow_bytes_for_file` infallible	2022-09-14 01:07:05 +00:00
kodiakhq[bot]	85641efa6f	Merge branch 'main' into cn/infallible-estimated-bytes	2022-09-14 01:00:10 +00:00
Luke Bond	51dac55652	Merge pull request #5567 from influxdata/chore/parquetfile-size-trigger feat: parquetfile size trigger	2022-09-13 16:39:57 +01:00
Luke Bond	ee3f172d45	chore: renamed DB migration for billing trigger	2022-09-13 16:29:14 +01:00
Luke Bond	c8b545134e	chore: add index to speed up billing_summary upsert	2022-09-13 16:22:44 +01:00

9088 Commits (85d6efafe19627a0107a0d044490e9fcaaa4dbd6)