Commit Graph

812 Commits (5398a1b986ceb42d76a62dff43c7b005159774f1)

Author SHA1 Message Date
Carol (Nichols || Goulding) 43687a86d2
fix: Remove lots of needless borrows that Clippy can now identify
Except for in generated code that we don't control.
2022-11-09 10:54:18 -05:00
Carol (Nichols || Goulding) fa46951524
fix: Remove needless deref done by auto deref, thanks Clippy! 2022-11-09 10:54:18 -05:00
Carol (Nichols || Goulding) 5d9d1d9ee5
fix: Allow using an if to turn a boolean into 1 or 0
I'm not sure I agree with this clippy lint; I think the `if` is much
clearer

<https://rust-lang.github.io/rust-clippy/master/index.html#bool_to_int_with_if>
2022-11-09 10:54:18 -05:00
Dom d9c97795fc
feat: use IDs in ingester query API (#6093)
* refactor: NS+table ID (instead of name) in querier<>ingester

* feat(ingester): use IDs for query API

Changes the ingester to utilise the ID fields (instead of names) sent
over the query wire message wrapped within the Flight API.

BREAKING: this changes the "query-ingester" CLI command arguments which
now expects the namespace & table IDs, rather than their names.

* refactor(ingester): add more query logging context

Updates the log messages during query execution to include more context
fields.

* style: remove unused import

Co-authored-by: Marco Neumann <marco@crepererum.net>
2022-11-09 11:25:13 +00:00
Marco Neumann 903f7bafa7
refactor: expose `ParquetExec` directly to DataFusion phys. plan (#6072)
* refactor: expose `ParquetExec` directly to DataFusion phys. plan

Closes #5897.

* fix: update tracing tests

* refactor: use `EmptyExec`

* refactor: use `target_partitions`

* refactor: improve UUID normalization in query tests

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-11-08 12:19:28 +00:00
Luke Bond f9316decee
chore: expose compactor's hot compaction hours thresholds as cfg (#6060)
* chore: expose compactor's hot compaction hours thresholds as cfg

* fix: add missing compactor arg envar; fix some comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 15:29:17 +00:00
Andrew Lamb 034d9b371d
chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0` (#6061)
* chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0`

* fix: Update query_functions

* fix: update for TimestampNanosecondArray API changes

* fix: update for TimestampNanosecondArray API changes

* chore: Update flatbuffers and remove rustsec warning

* chore: Update text

* fix: update more test

* fix: Lock ahash to exactly 0.8.0

* fix: Update datafusion pin

* chore: Run cargo hakari tasks

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 11:01:58 +00:00
Nga Tran 9356f2a1b9
feat: grpc for updating namespace retention period (#6041)
* refactor: make namespace folder for all namesapce's commands

* feat: WIP for add command to set retention period

* feat: more on updating retention period

* feat: grpc for update namespace retention period

* test: end to end test fpr namespace retention

* fix: lint proto

* chore: cleanup

* chore: kick CI run again

* fix: command hierachy

* chore: fix comments
2022-11-04 20:58:11 +00:00
Nga Tran 654ed98d1f
feat: config param to set when partition is cold (#6044)
* feat: config param to set when partition is cold

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* fix: make default 8 hours and avoid using 8 * 60 becasue it is a string, not expression which makes a test fail

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-03 15:03:56 +00:00
Marco Neumann a38995ca0f
feat: add `MeasurementTagKeys` support to storage CLI (#6039)
Needed this to debug something.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 18:08:07 +00:00
Andrew Lamb 3ba0458653
feat: Add object_store handler to querier so `remote get-table` works (#6014)
* feat: Add object_store handler to querier

* test: end to end test for get-table from querier

* fix: doc links

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 14:20:26 +00:00
Nga Tran fba4408d05
refactor: move `influxdb_iox debug namespace` command to `influxdb_iox namespace` (#6031)
* refactor: move  command to

* docs: update the doc accordingly

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 10:57:58 +00:00
dependabot[bot] b1572c50a6
chore(deps): Bump once_cell from 1.15.0 to 1.16.0 (#6009)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.15.0 to 1.16.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.15.0...v1.16.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:23:40 +00:00
Carol (Nichols || Goulding) 69a2e6b871
feat: Last 2 bonus features of remote store get-table (#5991)
* feat: Only get files that aren't already on disk with the reported size

* feat: Stream Parquet file bytes to file on disk

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 11:03:08 +00:00
Carol (Nichols || Goulding) ace497d47c
fix: Rename database to namespace in the commands I just added 2022-10-27 10:40:39 -04:00
Carol (Nichols || Goulding) d65a6a86dd
fix: Make error output less repetitive/wordy 2022-10-27 10:30:58 -04:00
Carol (Nichols || Goulding) 47faca6843
feat: Allow specifying output dir for get-table 2022-10-27 10:30:57 -04:00
Carol (Nichols || Goulding) dc4adfeefb
feat: Add the partition ID to fetched parquet files 2022-10-27 10:30:57 -04:00
Carol (Nichols || Goulding) f720dcee36
docs: Clarifications suggested in code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-10-27 10:10:28 -04:00
Carol (Nichols || Goulding) de2ae6f557
feat: MVP of remote store get-table command 2022-10-26 13:50:03 -04:00
Carol (Nichols || Goulding) 8697ef4967
feat: Set up CLI args for new get-table command 2022-10-26 11:19:00 -04:00
Carol (Nichols || Goulding) 71770486af
refactor: Extract influxdb_iox remote CLI tests to their own file 2022-10-26 11:19:00 -04:00
Carol (Nichols || Goulding) 3145e2c05b
feat: Use workspace dep inheritance for the arrow crate 2022-10-26 10:34:29 -04:00
Carol (Nichols || Goulding) 44936f661a
feat: Use workspace dep inheritance for datafusion instead of shim crate 2022-10-26 10:33:56 -04:00
kodiakhq[bot] 48d806b326
Merge branch 'main' into cn/workspace-inheritance 2022-10-24 20:10:08 +00:00
Andrew Lamb 335dafa3f7
fix: Do not truncate data retrieved from `remote store get` command (#5966) 2022-10-24 17:45:56 +00:00
Carol (Nichols || Goulding) 2e83e04eab
feat: Use workspace package metadata to reduce differences and repetition 2022-10-24 13:04:09 -04:00
Marco Neumann 1d440ddb2d
refactor: `IOxReadFilterNode` can always accumulate statistics (#5954)
* refactor: `IOxReadFilterNode` can always accumulate statistics

`IOxReadFilterNode` used to not emit statistics if one chunk has
duplicates or delete predicates. This is wrong (or at least overly
conservative), because the node itself (or the chunks themselves) do NOT
perform dedup or delete predicate filtering. Instead this is done is
done by parent nodes (`DeduplicateExec` and `FilterExec`) and its their
job to propagate statistics correctly.

Helps w/ #5897.

* test: explain setup

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-10-24 13:34:22 +00:00
Marco Neumann e0062f2d40
refactor: do NOT use fake DF context for parquet reading (#5942)
Use the proper top-level DataFusion context and register the object
store there.

Note that we still hide the `ParquetExec` behind an opaque record batch
stream. Fixing that is next on my list.

Helps with #5897.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-24 08:20:26 +00:00
Carol (Nichols || Goulding) fe98e7a65c
fix: Compactor skipped_at is in seconds, not nanoseconds oops 2022-10-21 16:41:55 -04:00
Jake Goulding fa7fe2e9cf feat: Add a gRPC endpoint to delete a skipped compaction
Also add a CLI usage of it for convenience
2022-10-21 15:12:20 -04:00
Carol (Nichols || Goulding) 68e310f45d
feat: Display skipped compactions in a table instead of JSON 2022-10-21 13:59:19 -04:00
Carol (Nichols || Goulding) 0132a33946
fix: Rename SkippedCompactionService to CompactionService
To make a good place for other compactor-related gRPC actions in the
future.
2022-10-21 13:40:37 -04:00
Carol (Nichols || Goulding) ba25300b01
feat: Create compactor service to list skipped compactions 2022-10-21 13:40:31 -04:00
kodiakhq[bot] 9b67db3c06
Merge branch 'main' into cn/ingester-tracing 2022-10-21 13:13:13 +00:00
dependabot[bot] 6e6e180aad
chore(deps): Bump assert_cmd from 2.0.4 to 2.0.5 (#5937)
Bumps [assert_cmd](https://github.com/assert-rs/assert_cmd) from 2.0.4 to 2.0.5.
- [Release notes](https://github.com/assert-rs/assert_cmd/releases)
- [Changelog](https://github.com/assert-rs/assert_cmd/blob/master/CHANGELOG.md)
- [Commits](https://github.com/assert-rs/assert_cmd/compare/v2.0.4...v2.0.5)

---
updated-dependencies:
- dependency-name: assert_cmd
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-21 10:50:06 +00:00
Carol (Nichols || Goulding) 59e1c1d5b9
feat: Pass trace id through Flight requests from querier to ingester
Fixes #5723.
2022-10-20 08:55:30 -04:00
Andrew Lamb 3fd0c5e4c2
fix: improve error message for storage read group command (#5915)
* fix: improve error message for storage read group command

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-20 09:31:50 +00:00
dependabot[bot] bebb15d30f
chore(deps): Bump serde_json from 1.0.86 to 1.0.87
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.86 to 1.0.87.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.86...v1.0.87)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-10-20 07:52:33 +00:00
Andrew Lamb 82d6fc3bda
feat: support queries via influxrpc with periods in field names (#5919)
* feat: support queries via influxrpc with periods in field names

* fix: update comments

* fix: more tests

* fix: more tests
2022-10-19 20:09:55 +00:00
Andrew Lamb 1df7a0d4fb
refactor: remove outdated observer sql repl mode (#5918)
* refactor: remove Observer mode from repl

* chore: remove outdated SQL docs

* fix: more update of sql docs
2022-10-19 18:39:05 +00:00
Andrew Lamb d706f8221d
chore: Update datafusion and arrow / parquet / arrow-flight 25.0.0 (#5900)
* chore: Update datafusion and  `arrow` / `parquet` / `arrow-flight` 25.0.0

* chore: Update for structure changes

* chore: Update for new projection pushdown

* chore: Run cargo hakari tasks

* fix: fmt

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-18 20:58:47 +00:00
Carol (Nichols || Goulding) c28ac4a3c3
fix: Return an error for unsupported SQL queries (#5876)
* test: Failing tests for unsupported queries

* fix: Catch unsupported SQL operations and error rather than return nothing

* test: Document a few more error messages that come through DataFusion

* refactor: Extract a Step to make query error tests nicer to read and write

* fix: update tests for new error codes

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-18 19:27:29 +00:00
dependabot[bot] f3c27c5c71
chore(deps): Bump dotenvy from 0.15.5 to 0.15.6 (#5881)
Bumps [dotenvy](https://github.com/allan2/dotenvy) from 0.15.5 to 0.15.6.
- [Release notes](https://github.com/allan2/dotenvy/releases)
- [Changelog](https://github.com/allan2/dotenvy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/allan2/dotenvy/compare/v0.15.5...v0.15.6)

---
updated-dependencies:
- dependency-name: dotenvy
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-18 07:06:40 +00:00
Andrew Lamb d57c99638c
chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 (#5792)
* chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0

* fix: Update for coercion, fix explain plans for change in column name display

* chore: Update datafusion lock

* fix: Update for other API changes

* chore: Update to latest datafusion pin

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-12 16:19:14 +00:00
kodiakhq[bot] 266b8f2a58
Merge branch 'main' into dependabot/cargo/clap-4.0.2 2022-10-12 14:01:28 +00:00
Dom Dwyer c4f542bbe2 refactor(ingester): remove tombstone support
This commit removes tombstone support from the ingester, and deletes
associated code/helpers/tests. This commit does NOT remove tombstone
support from any other service, but MAY include removing overlapping
test coverage.

This also removes the tombstone support from the Ingester -> Querier RPC
response message.

This has the nice side effect of removing a whole lot of thread spawning
in the ingester tests for the Executor, speeding everything up!
2022-10-11 13:10:04 +02:00
dependabot[bot] 933493fab3
chore(deps): Bump object_store from 0.5.0 to 0.5.1
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.0...object_store_0.5.1)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-10-11 01:19:10 +00:00
dependabot[bot] 2277fcf08a
chore(deps): Bump serde_json from 1.0.85 to 1.0.86
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.85 to 1.0.86.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.85...v1.0.86)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-10-10 01:42:37 +00:00
Marco Neumann c4c83e0840
fix: query error propagation (#5801)
- treat OOM protection as "resource exhausted"
- use `DataFusionError` in more places instead of opaque `Box<dyn Error>`
- improve conversion from/into `DataFusionError` to preserve more
  semantics

Overall, this improves our error handling. DF can now return errors like
"resource exhausted" and gRPC should now automatically generate a
sensible status code for it.

Fixes #5799.
2022-10-06 08:54:01 +00:00