
8663 Commits (b21799acaea38be920b1cc5942a3243abd37dbbf)

Author SHA1 Message Date
kodiakhq[bot] c7911e0124
Merge pull request #5024 from influxdata/cn/s3-gc
feat: Automatically remove old / unreferenced objects from the object store
2022-07-07 14:10:04 +00:00
Carol (Nichols || Goulding) 6b4325642f
refactor: Rename inner_main to main as it's now public 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) 9a681e75cc
fix: Box a large enum variant 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) 70dd6009e8
fix: Integrate the object store garbage collector into the main binary 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) b6ff82c06e
fix: Correct copypasta'd doc comment 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) 5ef2298677
fix: Adding appropriate messages to log lines 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) d860a61bb2
fix: Remove unused error type 2022-07-07 09:48:05 -04:00
Carol (Nichols || Goulding) c8182eaf71
fix: Print to stdout when a file is deleted by the object store garbage collector 2022-07-07 09:48:05 -04:00
CircleCI[bot] 4369431e03
chore: Run cargo hakari tasks 2022-07-07 09:48:05 -04:00
Jake Goulding f55812ea86
fix: exclude the garbage collector from the workspace hack 2022-07-07 09:48:05 -04:00
Jake Goulding 9a827b8259
refactor: remove unneeded dependency features 2022-07-07 09:48:05 -04:00
Jake Goulding f1b0b4da93
refactor: eagerly validate garbage collector command line arguments 2022-07-07 09:48:05 -04:00
Jake Goulding 6fc17164bd
test: Add basic end-to-end tests of the garbage collector 2022-07-07 09:48:05 -04:00
Carol (Nichols || Goulding) 72d1ed9b73
feat: Support configurable cutoff time for too new object store files 2022-07-07 09:48:05 -04:00
Carol (Nichols || Goulding) 3fcaf34d54
feat: Add trogging logging config to object store garbage collector 2022-07-07 09:48:04 -04:00
Carol (Nichols || Goulding) 90d7b22d86
fix: Use dotenv and the right features in the garbage collector 2022-07-07 09:48:04 -04:00
Carol (Nichols || Goulding) c541ca68a2
fix: Test logic for whether to delete object store files
And fix the bugs the tests found
2022-07-07 09:48:04 -04:00
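The deletion decision these garbage collector commits refer to (old, unreferenced objects are removed; objects newer than a configurable cutoff are kept) might look roughly like the sketch below. The function name `should_delete` and its parameters are hypothetical and are not taken from the actual tool.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical decision function: an object store file is a deletion
/// candidate only if the catalog does not reference it AND it is at least
/// `min_age` old (the "too new" cutoff).
fn should_delete(
    last_modified: SystemTime,
    now: SystemTime,
    min_age: Duration,
    referenced_in_catalog: bool,
) -> bool {
    if referenced_in_catalog {
        return false;
    }
    match now.duration_since(last_modified) {
        Ok(age) => age >= min_age,
        // `last_modified` is in the future relative to `now`: treat as too new.
        Err(_) => false,
    }
}
```

Keeping the rule a pure function of its inputs makes it easy to unit test without touching an object store.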
Carol (Nichols || Goulding) d6f75e6767
refactor: Extract modules to files 2022-07-07 09:48:04 -04:00
Carol (Nichols || Goulding) 98e7133ebf
refactor: Reuse clap blocks rather than creating new ones 2022-07-07 09:48:04 -04:00
Carol (Nichols || Goulding) 5c6ff365af
fix: Use lints in the object store cleanup tool 2022-07-07 09:48:04 -04:00
Carol (Nichols || Goulding) 869e09f545
fix: Organize Cargo.toml 2022-07-07 09:48:04 -04:00
Jake Goulding 428f41f747
feat: Walking skeleton of the object store GC tool 2022-07-07 09:48:04 -04:00
Marco Neumann aacdeaca52
refactor: prep work for #5032 (#5060)
* refactor: move parquet chunk ID to `ChunkMeta`

* refactor: return `Arc` from `QueryChunk::summary`

This is similar to how we handle other chunk data like schemas. This
allows a chunk to change/refine its "belief" about its own payload while
it is passed around in the query stack.

Helps w/ #5032.
2022-07-07 13:21:48 +00:00
Andrew Lamb 8f5210ea3e
test: add test for "duration since production" in kafka `write_buffer` implementation (#5043)
* test: add test for timestamps in kafka write buffer

* refactor: move timestamp batching test to generic tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-07 10:27:27 +00:00
Marco Neumann d33525055b
ci: update CI machine images (#5061) 2022-07-07 09:14:02 +00:00
dependabot[bot] c443d07a5c
chore(deps): Bump criterion from 0.3.5 to 0.3.6 (#5059)
Bumps [criterion](https://github.com/bheisler/criterion.rs) from 0.3.5 to 0.3.6.
- [Release notes](https://github.com/bheisler/criterion.rs/releases)
- [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bheisler/criterion.rs/compare/0.3.5...0.3.6)

---
updated-dependencies:
- dependency-name: criterion
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-07 07:09:27 +00:00
Marco Neumann cec169b7ca
feat: add "peek" functionality for caches (#5049)
Required for #5032.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-07 07:00:44 +00:00
kodiakhq[bot] 3f549dd14e
Merge pull request #5056 from influxdata/cn/fix-all-in-one-querying
fix: Correctly set up querier-to-ingester config, sequencers, and catalog for all-in-one ephemeral mode
2022-07-06 18:14:08 +00:00
kodiakhq[bot] b4d3d806cc
Merge branch 'main' into cn/fix-all-in-one-querying 2022-07-06 18:08:08 +00:00
Marco Neumann 2e5366a62a
refactor: disable TTL (caching) for non-existing namespaces (#5053)
This is not relevant at the moment for prod since other layers
prevent/filter queries for non-existing namespaces.

However this messes up the flux integration tests, see
https://github.com/influxdata/conductor/issues/997

So let's disable this specific cache case until #4617 is implemented
which may be used by the flux tests.

Fixes https://github.com/influxdata/conductor/issues/997

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-06 15:22:58 +00:00
Carol (Nichols || Goulding) a96976db46
fix: Start Kafka Partition IDs for default records at 0, not 1
In the all-in-one command, only one write buffer partition is supported,
and it's specified using Kafka Partition ID 0:

```
        // All-in-one mode only supports one write buffer partition.
        let write_buffer_partition_range_start = 0;
        let write_buffer_partition_range_end = 0;
```

When using all-in-one mode with an ephemeral, in-memory catalog,
`create_or_get_default_records` is what puts records into the catalog
that need to match the write buffer configuration.
2022-07-06 11:00:55 -04:00
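The mismatch described in this commit can be illustrated with a small sketch. All function names here are hypothetical stand-ins; only the idea (the default catalog records must use the same Kafka Partition ID as the write buffer range, which is 0) comes from the commit message.

```rust
use std::ops::RangeInclusive;

/// Hypothetical helper: the inclusive partition range the write buffer
/// is configured with (0..=0 in all-in-one mode).
fn write_buffer_partition_range(start: i32, end: i32) -> RangeInclusive<i32> {
    start..=end
}

/// Hypothetical stand-in for the default catalog records: `count`
/// partition IDs beginning at `first_id`.
fn default_partition_ids(first_id: i32, count: i32) -> Vec<i32> {
    (first_id..first_id + count).collect()
}

/// True when the catalog has at least one partition and every catalog
/// partition ID falls inside the configured write buffer range.
fn catalog_matches_write_buffer(ids: &[i32], range: &RangeInclusive<i32>) -> bool {
    !ids.is_empty() && ids.iter().all(|id| range.contains(id))
}
```

With default records starting at 1, the check fails against the range `0..=0`; starting at 0 makes the two agree.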
Carol (Nichols || Goulding) 311d4c1f9a
fix: All-in-one mode only supports one partition/sequencer 2022-07-06 11:00:55 -04:00
Carol (Nichols || Goulding) 89f5091546
refactor: Don't require DSN for test config
This enables writing a test for all-in-one's ephemeral mode, which
currently isn't working
2022-07-06 11:00:29 -04:00
Marco Neumann e84e1f3de2
ci: clean up and fixes (#5054)
* ci: remove unused helper script

* ci: update CI machine images

Try to fix error a la:

```text
error: failed to solve: failed to solve with frontend dockerfile.v0: failed to solve with frontend gateway.v0: failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://docker-images-prod.s3.dualstack.us-east-1.amazonaws.com/registry-v2/docker/registry/v2/blobs/sha256/...": dial tcp 52.216.28.112:443: i/o timeout
```

See https://discuss.circleci.com/t/increased-rate-of-errors-when-pulling-docker-images-on-machine-executor/42094/9
2022-07-06 14:47:50 +00:00
Nga Tran 425b8a63cf
fix: avoid combining groups that overlap with other groups even if they are small (#5052)
* fix: avoid combining groups that overlap with other groups even if they are small

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-06 14:03:15 +00:00
Sam Arnold e193913ed3
fix: optimize field columns for all-time predicates (#5046)
* fix: optimize field columns for all-time predicates

Also fix timestamp range to allow selecting points at MAX_NANO_TIME

* fix: clamp end to MIN_NANO_TIME for safety

* refactor: add contains_all method to TimestampRange
2022-07-06 12:01:28 +00:00
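A minimal sketch of what a `contains_all` check on a timestamp range could look like. It assumes a half-open `[start, end)` range and illustrative values for `MIN_NANO_TIME` and `MAX_NANO_TIME`; the struct layout and the clamping in `new` are assumptions, not the project's actual code.

```rust
/// Assumed bounds of the valid timestamp domain (illustrative values).
const MIN_NANO_TIME: i64 = i64::MIN + 2;
const MAX_NANO_TIME: i64 = i64::MAX - 1;

/// Half-open range `[start, end)` of nanosecond timestamps (assumed semantics).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TimestampRange {
    start: i64,
    end: i64,
}

impl TimestampRange {
    /// Clamp both bounds so they never fall below `MIN_NANO_TIME`.
    /// `end` may exceed `MAX_NANO_TIME` by one so that a point at
    /// `MAX_NANO_TIME` is still selectable with an exclusive end.
    fn new(start: i64, end: i64) -> Self {
        Self {
            start: start.max(MIN_NANO_TIME),
            end: end.max(MIN_NANO_TIME),
        }
    }

    /// Is the timestamp `t` inside the range?
    fn contains(&self, t: i64) -> bool {
        self.start <= t && t < self.end
    }

    /// Does the range cover every valid timestamp ("all time")?
    fn contains_all(&self) -> bool {
        self.start <= MIN_NANO_TIME && self.end > MAX_NANO_TIME
    }
}
```

A predicate whose range satisfies `contains_all` places no real restriction on time, which is the case the commit title calls an "all-time" predicate.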
dependabot[bot] 2b527bbf64
chore(deps): Bump regex from 1.5.6 to 1.6.0 (#5048)
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.6 to 1.6.0.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.6...1.6.0)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-06 10:25:28 +00:00
dependabot[bot] b3522086a8
chore(deps): Bump regex-syntax from 0.6.26 to 0.6.27 (#5047)
Bumps [regex-syntax](https://github.com/rust-lang/regex) from 0.6.26 to 0.6.27.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits)

---
updated-dependencies:
- dependency-name: regex-syntax
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-06 10:19:07 +00:00
Marko Mikulicic 8854ee317e
chore: Add ACS trigger action (#5050) 2022-07-06 09:43:04 +00:00
Andrew Lamb 5944f27e77
refactor: avoid write buffer cloning in `store_operation` (#5042)
* refactor: avoid write buffer cloning in `store_operation`

* fix: update usage

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-06 06:57:03 +00:00
Marko Mikulicic 015bba8589
chore: c2updater to acs transition (#5045)
1. disable c2updater call
2. use full sha tags (required by `acs`)
2022-07-05 21:47:05 +00:00
Nga Tran d8b74f6af8
refactor: convert a panic into an error and throw a warning if we choose non-actionable compacting candidates (#5041)
* refactor: convert a panic into an error and throw a warning if we choose non-actionable candidates

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* chore: run fmt

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-05 18:53:52 +00:00
Nga Tran 1de022136c
feat: add max desired file size config param (#5025)
* feat: add max desired file size config param

* fix: comment typos

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-05 15:32:45 +00:00
Marco Neumann 16bd3e67c0
refactor: unify `apply_predicate_to_metadata` (#5030)
Instead of using some hand-rolled timestamp-based logic (or just
"unknown") all over the place, just use logic introduced in #5017.

This requires slightly improved table summaries within the querier that
at least have min/max for the timestamp column. For that, the former
`IngesterChunk`-specific `calculate_summary` method was extended to
`create_basic_summary` to include that data and is now also used by
`QuerierParquetChunk`.

Note: `QuerierRBChunk` already has detailed metrics that are provided
by the read buffer implementation.

Should we ever need even better pruning for `QuerierParquetChunk` (or
`IngesterChunk`) then we _only_ need to add extra data to the table
summaries.

Closes #4976.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-05 12:51:59 +00:00
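The pruning idea in this commit (decide from min/max statistics of the timestamp column whether a chunk can match a time predicate) can be sketched as below. `TimeStats`, `PruneDecision` and `prune_by_time_stats` are hypothetical names for illustration; the time range is assumed to be half-open `[start, end)`.

```rust
/// Hypothetical min/max statistics of a chunk's timestamp column.
#[derive(Debug, Clone, Copy)]
struct TimeStats {
    min: i64,
    max: i64,
}

/// Hypothetical outcome of testing a predicate against chunk metadata.
#[derive(Debug, PartialEq, Eq)]
enum PruneDecision {
    /// The chunk provably contains no matching rows and can be skipped.
    Zero,
    /// The metadata cannot rule the chunk out; it must be scanned.
    Unknown,
}

/// Test the time range `[start, end)` against the chunk's statistics.
/// Without statistics nothing can be proven, so the answer is `Unknown`.
fn prune_by_time_stats(stats: Option<TimeStats>, start: i64, end: i64) -> PruneDecision {
    match stats {
        Some(s) if s.max < start || s.min >= end => PruneDecision::Zero,
        _ => PruneDecision::Unknown,
    }
}
```

Under this scheme, better pruning only needs richer statistics in the summary; the decision function itself stays the same.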
Sam Arnold 03f456d8fd
fix: optimize tag_keys to go only to schema when predicate is empty (#4985)
* docs: fix comment

* test: add test for delete behaviour

* fix: tag_keys optimization for empty predicate

Also need to eliminate 'true' predicates from simplified predicate so
is_empty works correctly.

* refactor: use lit instead of spelling out literal true

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-05 12:45:25 +00:00
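Why literal `true` expressions must be dropped before calling `is_empty` can be shown with a toy predicate type. The `Expr` and `Predicate` types below are hypothetical simplifications, not the project's real predicate representation.

```rust
/// Hypothetical, heavily simplified filter expression.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Literal(bool),
    TagEquals { tag: String, value: String },
}

/// Hypothetical predicate: a conjunction (AND) of expressions.
#[derive(Debug, Default)]
struct Predicate {
    exprs: Vec<Expr>,
}

impl Predicate {
    /// Drop literal `true` terms: `true AND x` is equivalent to `x`.
    fn simplify(mut self) -> Self {
        self.exprs.retain(|e| *e != Expr::Literal(true));
        self
    }

    /// A predicate with no terms restricts nothing.
    fn is_empty(&self) -> bool {
        self.exprs.is_empty()
    }
}
```

Without the `simplify` step, a predicate consisting only of `true` would not count as empty, and a schema-only fast path guarded by `is_empty` would be skipped.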
dependabot[bot] 68eff79594
chore(deps): Bump once_cell from 1.12.0 to 1.13.0 (#5033)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.12.0...v1.13.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-05 08:54:51 +00:00
dependabot[bot] 1f5863ebcb
chore(deps): Bump serde from 1.0.137 to 1.0.138 (#5029)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.137 to 1.0.138.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.137...v1.0.138)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-04 10:07:46 +00:00
dependabot[bot] b8f2f3b9e8
chore(deps): Bump pin-project from 1.0.10 to 1.0.11 (#5028)
Bumps [pin-project](https://github.com/taiki-e/pin-project) from 1.0.10 to 1.0.11.
- [Release notes](https://github.com/taiki-e/pin-project/releases)
- [Changelog](https://github.com/taiki-e/pin-project/blob/main/CHANGELOG.md)
- [Commits](https://github.com/taiki-e/pin-project/compare/v1.0.10...v1.0.11)

---
updated-dependencies:
- dependency-name: pin-project
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-04 09:45:31 +00:00
dependabot[bot] 98ac454e88
chore(deps): Bump crypto-common from 0.1.3 to 0.1.4 (#5027)
Bumps [crypto-common](https://github.com/RustCrypto/traits) from 0.1.3 to 0.1.4.
- [Release notes](https://github.com/RustCrypto/traits/releases)
- [Commits](https://github.com/RustCrypto/traits/compare/crypto-common-v0.1.3...crypto-common-v0.1.4)

---
updated-dependencies:
- dependency-name: crypto-common
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-04 09:07:37 +00:00
Marco Neumann 6f445ccd94
feat: Prune chunks using table summary (stats) (#5017)
* feat: easy tests of table summary against predicate

Helps with #4976.

Alternative to #4995.

* refactor: address review comments

* refactor: address review comments

* refactor: address review comments
2022-07-04 09:01:34 +00:00