Commit Graph

73 Commits (44e266d00036005a0ece1d44d01bd2f0397ff903)

Author SHA1 Message Date
Carol (Nichols || Goulding) 4a9e76b8b7
feat: Make parquet_file.partition_id optional in the catalog (#8339)
* feat: Make parquet_file.partition_id optional in the catalog

This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>

This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both that are used now.

* fix: Support transition partition ID in the catalog service

* fix: Use transition partition ID in import/export

This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.

The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.

* feat: Support looking up Parquet files by either kind of Partition id

Regardless of which is actually stored on the Parquet file record.

That is, say there's a Partition in the catalog with:

Partition {
    id: 3,
    hash_id: abcdefg,
}

and a Parquet file that has:

ParquetFile {
    partition_hash_id: abcdefg,
}

calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.

This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.

* fix: Use and set new partition ID fields everywhere they want to be

---------

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-31 12:40:56 +00:00
Marco Neumann 004b401a05
chore: upgrade to sqlx 0.7.1 (#8266)
There are a bunch of dependencies in `Cargo.lock` that are related to
mysql. These are NOT compiled at all, and are also not part of `cargo
tree`. The reason for the inclusion is a bug in cargo:

https://github.com/rust-lang/cargo/issues/10801

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-19 12:18:57 +00:00
Carol (Nichols || Goulding) 22c17fb970
feat: Abstract over which partition ID type we're using to list Parquet files 2023-07-10 13:40:01 -04:00
Carol (Nichols || Goulding) 41420cb920
fix: Borrow transition partition ID when possible 2023-06-22 09:01:22 -04:00
Carol (Nichols || Goulding) 62ba18171a
feat: Add a new hash column on the partition and parquet file tables
This will hold the deterministic ID for partitions.

Until all existing partitions have this value, this is optional/nullable.

The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.

The hash_id has a unique index so that we can look up records based on
it (if it's available).

If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
2023-06-22 09:01:22 -04:00
Phil Bracikowski e34ec77e8d
feat(garbage-collector): batch parquet existence checks to catalog (#7964)
* feat(garbage-collector): batch parquet existence checks to catalog

The core feature of this PR is batching the existence checks of parquet
files in object store against the catalog. Before, there was 1 catalog
query per each parquet file in object store. This can be a lot of
requests.

This PR can perform one query of at most 100 parquet file uuids against
the catalog in one query. A hundred seems like a decent starting place.

The batch may not reach 100 because there is also a timeout on receiving
object store meta objects from the object store lister thread. That
timeout is set to 100 milliseconds. If more than 100 are received, they
are batched into 100 for the catalog.

Additionally, this PR includes surrounding code changes to make it more
idiomatic (but not perfect). It follows up some suggested work from
 #7652 for watching for shutdown on the threads.

* fixes #7784

* use hashset instead of vec to test for contains
* chore: add test for db failure path
* remove ParquetFileExistsByOSID and other single field structs that are
  just for sql deserialization; map to uuid explicitly
* fix the sqlite query by using a blob literal X'<hex>' for uuids
* comment clarifications
* adjust loggings to warn from debug for expected rare events

Many thanks to Carol for help implementing this!
2023-06-14 07:59:00 -07:00
Phil Bracikowski 92a83270f3
fix(garbage-collector): just test parquet file exists (#7948)
* fix(garbage-collector): just test parquet file existence

The GC, when checking files in object store against the catalog, only
cares if the parquet file for the given object store id exists in the
catalog. It doesn't need the full parquet file. Let's not transmit it
over the wire.

This PR uses a SELECT 1 and boolean to test for parquet file existing.

* helps #7784

* chore: use struct for from_row

* chore: satisfy clippy

* chore: fmt
2023-06-07 15:12:48 -07:00
Andrew Lamb 17c0d837b3
chore: Update DataFusion, arrow, object_store pins (#7942)
* chore: Update DataFusion, arrow, object_store pins

* chore: Update for hakari

* chore: Update for new APIs

* fix: update test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-07 17:08:31 +00:00
dependabot[bot] d8b06c59c4
chore(deps): Bump once_cell from 1.17.2 to 1.18.0
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.17.2 to 1.18.0.
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.17.2...v1.18.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-06-05 02:03:15 +00:00
Stuart Carnie ba27b7760c
chore: remove unused file 2023-06-05 06:57:42 +10:00
Stuart Carnie a4371254eb
chore: rustfmt 2023-06-04 06:56:28 +10:00
Andrew Lamb 1ff76b7bf2 chore: use workspace dependencies for `object_store` 2023-05-26 07:03:42 -04:00
Dom Dwyer 928a4d163e
build: remove unused dependencies from crates
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).

I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!

    https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html

This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
2023-05-23 14:55:43 +02:00
Andrew Lamb 6344fe8c3f
chore: Add rationale for `clippy::future_not_send` (#7822)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-18 16:58:56 +00:00
Andrew Lamb 1ff11d0856
refactor: Change catalog configuration so it is entirely dsn based / support end to end testing without postgres (#7736)
* refactor: Change catalog configuration so it is entirely dsn based / support end to end testing without postgres

Restores code from https://github.com/influxdata/influxdb_iox/pull/7708

Revert "revert: PR #7708"

This reverts commit c9cfe05f8d.

* fix: merge

* fix: Update new test
2023-05-17 13:36:25 +00:00
Carol (Nichols || Goulding) 7268ea5c29
refactor: Extract a test helper function to create a basic table 2023-05-15 14:31:24 -04:00
Carol (Nichols || Goulding) 57bedb1c2d
refactor: Extract a test helper function to create a basic namespace 2023-05-15 14:20:38 -04:00
Kaya Gökalp 5fe8affb18
refactor: accept NamespaceName with Namespace create (#7774)
Co-authored-by: Dom <dom@itsallbroken.com>
2023-05-15 10:03:55 +00:00
Phil Bracikowski 8b87a10fe0
fix(garbage collector): larger list batches and another tunable (#7738)
This PR increases the batch/page size of list operations in the gc 10x
to 10,000; it introduces a new cli config for the sleep interval between
batches. Previously a single sleep config was used between batches and
between entirely new list operations. This PR also moves to debug some
noisy logging.

* tag to #7689
2023-05-04 17:47:28 +00:00
Carol (Nichols || Goulding) b0959667d5
fix: Move topic and query pool within iox catalog (#7734)
Still insert them into the database and associate them with namespaces,
but don't ever query them back out.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-04 13:45:56 +00:00
Carol (Nichols || Goulding) 621caab2e9
fix: Remove unused parquet_max_sequence_number metadata 2023-05-03 10:57:27 -04:00
Dom Dwyer c9cfe05f8d
revert: PR #7708
This reverts commit 61abb58933.
2023-05-02 13:51:30 +02:00
Andrew Lamb 61abb58933
refactor: Change catalog configuration so it is entirely dsn based / support end to end testing without postgres (#7708)
* refactor: Change catalog configuration so it is entirely dsn based

* docs: Add documentation

* chore: update docs

* chore: review feedback

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-02 10:48:33 +00:00
Carol (Nichols || Goulding) cd4d961db9
fix: garbage_collector no longer uses chrono-english so it can use the workspace hack crate 2023-05-01 11:31:42 -04:00
Phil Bracikowski b7bd66195f
chore(garbage collector): improve logging in lister (#7695)
* follow up to #7689
2023-04-28 16:37:08 +00:00
Phil Bracikowski fb4083d993
fix(garbage collector): respect lister limit, display source errors (#7689)
GC object store lister adjustments: the previous take wasn't being
respected because of where it was; a chunked list is now used instead.
The lister specific errors will now display the source error too.

* follow up to #7562

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-28 13:25:39 +00:00
Phil Bracikowski 8609015821
chore(garbage collector): backoff first connect to objectstore (#7652)
* chore(garbage collector): backoff first connect to objectstore

This pr replaces the initial sleep added in #7562 with expential backoff
retry of connecting to objectstore. This avoids the issue of waiting the
configured sleep which can be quite long in a world where the service is
getting redeployed often.

* for #7562

* chore: rewrite lister::perform based on pr feedback

This commit redoes perform as a do..while loop, putting the call to list
from the object store at the top of the loop so the infinite backoff
retry and be used for each loop iteration - it might fail on more than
the first time!

There are 3 selects as there are 3 wait stages and each needs to check
for shutdown: os list, processing the list, and sleeping on the poll
interval.

* chore: hoist cancellation check higher; limit listing to 1000 files

Responding to PR feedback.

* chore: add error info message

* chore: make build. :|

* chore: linter
2023-04-27 17:51:52 +00:00
Carol (Nichols || Goulding) 038f8e9ce0
fix: Move shard concepts into only the catalog
This still inserts the shard id into the database, always set to the
TRANSITION_SHARD_ID, but never reads it back out again.
2023-04-26 11:42:32 -04:00
dependabot[bot] 0b9240cbbe
chore(deps): Bump tokio-util from 0.7.7 to 0.7.8 (#7665)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.7 to 0.7.8.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.7...tokio-util-0.7.8)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-26 09:24:39 +00:00
Phil Bracikowski 8e7e22d167
chore(garbage collector): info logging for deleting parquet file (#7638)
* for influxdata/idpe#17451
* follow up to #7562
2023-04-24 16:57:18 +00:00
Phil Bracikowski 02344af813
fix(garbage collector): decrease info msg; don't loop immediately (#7622)
* Don't print an info message for each deleted file. This can be 1000s
  at a time and many more in total.
* Even if there are more files to delete, sleep the interval to decrease
  catalog load.

* part of influxdata/idpe#17451
2023-04-21 23:14:33 +00:00
Phil Bracikowski ec87f356db chore: rearrange for linter 2023-04-21 12:24:32 -07:00
Phil Bracikowski 420e3d0a70 chore: Merge branch 'main' into pjb-17451-gc-s3-connection-refused
* fix conflicts
2023-04-21 11:45:57 -07:00
Phil Bracikowski f0b9a0b315 chore: respond to pr feedback
* remove dry-run catalog method
* improve info and debug messages
2023-04-21 10:59:25 -07:00
Armin Primadi dd54d8b7fe
fix: Garbage collector hangs indefinitely on shutdown (#7567)
* fix: Garbage collector hangs indefinitely on shutdown

* style(garbage_collector): conform to linter and fmt

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-19 08:27:38 +00:00
Dom Dwyer c5bb88e173
chore: remove unused dependencies
Some crates import dependencies they never use.
2023-04-18 12:07:13 +02:00
Marco Neumann 808a13cf40
chore: remove `time` 0.1 & fix RUSTSEC-2020-0071 (#7568)
`time` 0.1 suffers from [RUSTSEC-2020-0071] and many upstream crates
have tried to remove it for years. The last dependency is

1. `chrono-english`
2. `chrono` (default features)
3. `chrono` (oldtime)
4. `time` 0.1

`chrono-english` doesn't seem to be super well maintained, but I
couldn't find a nice replacement for it. Luckily the master branch of
`chrono-english` is already fixed, so let's just directly use that.

[RUSTSEC-2020-0071]: https://rustsec.org/advisories/RUSTSEC-2020-0071

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-17 12:36:10 +00:00
Phil Bracikowski 1d64cb1b1e fix(garbage_collector): delay initial s3 checker loop, fix dryrun
This PR makes 3 improvements.

* It adds the configured sleep interval at the start of the object store
  checker to avoid issues with making a remote list immediately at
startup. We see issues with the s3 api.
* the --dry-run flag was stopping deletes of objects from object store,
  but the retention flagger was still making updates to the catalog.
These writes to the catalog are surprising when the --dry-run flag is
provided. Now, with --dry-run the catalog is not updated. The logging
instead says how many records would be updated because of retention.
* It decreases logging in should_delete of the checker as it will be
  extremely noisey when reporting files it skips. An internal
environment has 3.8 million parquet files, most of which would be
skipped.

* related to #7363
* fixes influxdata/idpe#17451
2023-04-14 17:03:07 -07:00
dependabot[bot] 66982f988b
chore(deps): Bump object_store from 0.5.5 to 0.5.6 (#7433)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/commits)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-04 08:43:34 +00:00
dependabot[bot] 3256fcc72e
chore(deps): Bump object_store from 0.5.4 to 0.5.5
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.4...object_store_0.5.5)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-03-03 02:00:51 +00:00
dependabot[bot] 0cbd9f6a82
chore(deps): Bump tokio-util from 0.7.5 to 0.7.7 (#6964)
---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-13 10:10:53 +00:00
dependabot[bot] c0c9b51b9e
chore(deps): Bump tokio-util from 0.7.4 to 0.7.5 (#6941)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.4 to 0.7.5.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.4...tokio-util-0.7.5)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-10 09:42:00 +00:00
dependabot[bot] 0ecde75af5
chore(deps): Bump object_store from 0.5.3 to 0.5.4 (#6900)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.3...object_store_0.5.4)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-08 09:40:11 +00:00
Nga Tran b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692)
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier

* chore: cleanup

* feat: new column max_l0_created_at to order files for deduplication

* chore: cleanup

* chore: debug info for chnaging cpu.parquet

* fix: update test parquet file

Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
Andrew Lamb 6caf31acf3
chore: Move garbage collection configuration into clap_blocks (#6678)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-25 11:31:48 +00:00
Luke Bond 11fa648116
chore: clarify cli args for iox gc (#6628)
Signed-off-by: Luke Bond <luke.n.bond@gmail.com>

Signed-off-by: Luke Bond <luke.n.bond@gmail.com>
2023-01-19 07:43:29 +00:00
dependabot[bot] 0aacef3c59
chore(deps): Bump once_cell from 1.16.0 to 1.17.0 (#6473)
* chore(deps): Bump once_cell from 1.16.0 to 1.17.0

Bumps [once_cell](https://github.com/matklad/once_cell) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.16.0...v1.17.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Change once_cell version specifier to major.minor for less churn

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
2023-01-02 17:07:15 +00:00
dependabot[bot] 1d38d400f0
chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339)
* chore(deps): Bump object_store from 0.5.1 to 0.5.2

Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-06 07:53:54 +00:00
Nga Tran 49a9565240
feat: gRPC that creates namespace (#6103)
* feat: create namespace API call in router

Co-authored-by: Nga Tran <nga-tran@live.com>

* chore: treat retention as ns except in CLI

* fix: overflow in nanosecond calc

* fix: retention test after changing it from hours to ns

* chore: comment clarification in cli; better response type for error in ns API

* fix: correct some rebase mistakes

* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const

* fix: ns autocreation test was wrong after rebase

* fix: mem catalog has default 1hr retention, accidently removed in rebase

* chore: remove mem catalogs default 1hr retention; make it settable in sets & router

Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 13:02:12 +00:00
Nga Tran 9c4266c503
refactor: first step to remove unused retention_duration (#6113)
* refactor: first step to remove unused retention_duration

* refactor: remove retenion_duration from update catalog

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-11 15:21:06 +00:00