9573 Commits (45b3984aa30159e253c94e18bee24e518c65f63d)

Author SHA1 Message Date
Marco Neumann 45b3984aa3
refactor: simplify `QueryChunk` data access (#6015)
* refactor: simplify `QueryChunk` data access

We have only two types for chunks (now that the RUB is gone):

1. In-memory RecordBatches
2. Parquet files

Loads of logic is duplicated in the different `read_filter`
implementations. Also `read_filter` hides a solid amount of logic from
DataFusion, which will prevent certain (future) optimizations. To enable #5897
and to simplify the interface, let the chunks return the data (batches
or metadata for parquet files) directly and let `iox_query` perform the
actual heavy-lifting.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 08:18:33 +00:00
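The `QueryChunk` change in #6015 can be pictured with a minimal sketch: each chunk hands back either in-memory batches or parquet file metadata, and one central function (standing in for `iox_query`) decides how to scan it. All names below (`Batch`, `ParquetFileMeta`, `ChunkData`, `Chunk`, `describe_scan`) are hypothetical stand-ins chosen for illustration, not the actual IOx types.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arrow record batch.
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub rows: usize,
}

/// Hypothetical stand-in for the metadata needed to read a parquet file.
#[derive(Debug, Clone, PartialEq)]
pub struct ParquetFileMeta {
    pub object_store_path: String,
    pub file_size_bytes: u64,
}

/// The two kinds of data a chunk can return to the query layer.
#[derive(Debug, Clone, PartialEq)]
pub enum ChunkData {
    InMemory(Vec<Batch>),
    Parquet(Arc<ParquetFileMeta>),
}

/// Chunks only expose their data; they no longer filter or scan it themselves.
pub trait Chunk {
    fn data(&self) -> ChunkData;
}

pub struct BufferedChunk {
    pub batches: Vec<Batch>,
}

impl Chunk for BufferedChunk {
    fn data(&self) -> ChunkData {
        ChunkData::InMemory(self.batches.clone())
    }
}

pub struct PersistedChunk {
    pub meta: Arc<ParquetFileMeta>,
}

impl Chunk for PersistedChunk {
    fn data(&self) -> ChunkData {
        ChunkData::Parquet(Arc::clone(&self.meta))
    }
}

/// Single place that decides how each kind of chunk is scanned.
pub fn describe_scan(chunk: &dyn Chunk) -> String {
    match chunk.data() {
        ChunkData::InMemory(batches) => {
            format!("scan {} in-memory batches", batches.len())
        }
        ChunkData::Parquet(meta) => format!(
            "scan parquet file {} ({} bytes)",
            meta.object_store_path, meta.file_size_bytes
        ),
    }
}
```

Keeping the scan logic in one function, rather than in each chunk type, is what removes the duplication the commit message describes.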
Andrew Lamb 1eb0d64210
chore: remove unnecessary doc exclude (#6018)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-01 20:17:27 +00:00
Nga Tran 50061cd2fe
Merge pull request #6030 from influxdata/ntran/catalog-retention
feat: add catalog columns needed for retention policy
2022-11-01 15:56:33 -04:00
Nga Tran 3aa1b50b6f
Merge branch 'main' into ntran/catalog-retention 2022-11-01 15:39:17 -04:00
Rowan Hamilton 7f8d58e21a
fix: add missing content type in headers for `influxdb2_client` (#6021)
* fix: add missing content-type to buckets

* fix: add missing content-type to label

* fix: add missing content-type to setup
2022-11-01 19:35:45 +00:00
NGA-TRAN 498851eaf5 feat: add catalog columns needed for retention policy 2022-11-01 15:35:15 -04:00
Andrew Lamb 643fd58e02
docs: Document new CLI commands (#6013)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-01 16:44:14 +00:00
Cannon Palms 1c22216711
Merge pull request #6019 from influxdata/crepererum/fix_request_canceled
fix: be more conservative w/ "request canceled" messages
2022-11-01 12:22:31 -04:00
Marco Neumann d2399764c5 fix: be more conservative w/ "request canceled" messages 2022-11-01 17:08:59 +01:00
Cannon Palms ee92d28dfd
Merge pull request #6016 from influxdata/crepererum/issue5981
feat: enable ZSTD compression for write buffer payload
2022-11-01 09:45:58 -04:00
Marco Neumann 254be59856 feat: enable ZSTD compression for write buffer payload
Closes #5981.
2022-11-01 14:22:33 +01:00
Marco Neumann aa4eec9939 chore: update rskafka
Mostly upstream dependency updates (8678dfe049de05415929ffec7c1be8921bb057f7).
2022-11-01 14:19:32 +01:00
Marco Neumann d6cbae16ac
chore: update rskafka (#5998)
Includes additional logging to debug https://github.com/influxdata/idpe/issues/16278
2022-11-01 06:39:26 +00:00
dependabot[bot] 7785b20d7f
chore(deps): Bump hyper from 0.14.21 to 0.14.22 (#6012)
Bumps [hyper](https://github.com/hyperium/hyper) from 0.14.21 to 0.14.22.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/v0.14.22/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.21...v0.14.22)

---
updated-dependencies:
- dependency-name: hyper
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-01 05:53:09 +00:00
Andrew Lamb 9c1f0a3644
refactor: move SessionConfig creation into datafusion_utils (#6011)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 20:04:49 +00:00
Andrew Lamb 00953460fb
chore: Update datafusion pin (#6010) 2022-10-31 17:14:56 +00:00
Marco Neumann 072439e428
refactor: mandatory `QueryChunkMeta::summary` (#5997)
With #5963 merged, all chunks now provide a summary (even though it may
not contain data for all columns). So let's make it mandatory, which
also removes a few 🙈-style `.expect(...)` calls.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:38:02 +00:00
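The effect of making `QueryChunkMeta::summary` mandatory (#5997) can be sketched as follows. This is a simplified illustration: `TableSummary`, the two traits, and the `column_count_*` helpers are hypothetical and only show how an `Option` return type forces callers into `.expect(...)`.

```rust
/// Hypothetical, simplified table summary.
#[derive(Debug, Clone, PartialEq)]
pub struct TableSummary {
    pub columns: Vec<String>,
}

/// Before: the summary is optional, so every caller has to unwrap it.
pub trait ChunkMetaOptional {
    fn summary(&self) -> Option<&TableSummary>;
}

/// After: the summary is always present.
pub trait ChunkMeta {
    fn summary(&self) -> &TableSummary;
}

pub struct Chunk {
    summary: TableSummary,
}

impl Chunk {
    pub fn new(columns: Vec<String>) -> Self {
        Self {
            summary: TableSummary { columns },
        }
    }
}

impl ChunkMetaOptional for Chunk {
    fn summary(&self) -> Option<&TableSummary> {
        Some(&self.summary)
    }
}

impl ChunkMeta for Chunk {
    fn summary(&self) -> &TableSummary {
        &self.summary
    }
}

/// Old-style caller: can panic at runtime if a chunk has no summary.
pub fn column_count_old(chunk: &impl ChunkMetaOptional) -> usize {
    chunk
        .summary()
        .expect("chunk should have a summary")
        .columns
        .len()
}

/// New-style caller: no unwrap needed, the type guarantees a summary.
pub fn column_count_new(chunk: &impl ChunkMeta) -> usize {
    chunk.summary().columns.len()
}
```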
dependabot[bot] 62e51a5e06
chore(deps): Bump cc from 1.0.73 to 1.0.74 (#6008)
Bumps [cc](https://github.com/rust-lang/cc-rs) from 1.0.73 to 1.0.74.
- [Release notes](https://github.com/rust-lang/cc-rs/releases)
- [Commits](https://github.com/rust-lang/cc-rs/compare/1.0.73...1.0.74)

---
updated-dependencies:
- dependency-name: cc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:30:57 +00:00
dependabot[bot] b1572c50a6
chore(deps): Bump once_cell from 1.15.0 to 1.16.0 (#6009)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.15.0 to 1.16.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.15.0...v1.16.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:23:40 +00:00
Andrew Lamb ace3c11f12
chore: Update datafusion (#6004)
* chore: Update datafusion

* chore: change path

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:16:28 +00:00
kodiakhq[bot] 71dd3b5fa5
Merge pull request #6005 from influxdata/cn/add-catalog-service-everywhere
feat: Add the catalog service to ingester, querier, and compactor
2022-10-28 16:30:46 +00:00
Carol (Nichols || Goulding) dad1ad1318
feat: Add the catalog service to ingester, querier, and compactor
So that `remote get` that uses the catalog service can work no matter
what kind of server you contact.
2022-10-28 10:49:26 -04:00
Carol (Nichols || Goulding) 53445af25d
chore: Alphabetize some dependencies
I can't handle not knowing where to look for a dependency or knowing
where to add a new dependency.
2022-10-28 10:34:25 -04:00
Andrew Lamb e9d04ffcb5
feat: Log how long each persist plan takes to complete (#5989)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 13:52:39 +00:00
Dom e9f3b425f3
Merge pull request #6000 from influxdata/dom/router-namespaceid
refactor(router): pass NamespaceId through handler stack
2022-10-28 13:45:00 +01:00
Dom Dwyer d166de931d refactor: resolve namespace before DML dispatch
This commit introduces a new (composable) trait; a NamespaceResolver is
an abstraction responsible for taking a string namespace from a user
request, and mapping it to its catalog ID.

This allows the NamespaceId to be injected through the DmlHandler chain
in addition to the namespace name.

As part of this change, the NamespaceAutocreation layer was changed from
an implementor of the DmlHandler trait to a NamespaceResolver, as it
is a more appropriate abstraction for the functionality it provides.
2022-10-28 13:41:05 +02:00
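A rough sketch of the composable resolver idea from the commit above: a trait maps a namespace string to an ID, and an auto-creation layer wraps another resolver. The real router code is likely asynchronous and catalog-backed; everything here (the synchronous trait signature, `CatalogResolver`, `AutocreateResolver`, `handle_write`, the in-memory maps) is an assumption made to keep the example self-contained.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NamespaceId(pub i64);

#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    NotFound(String),
}

/// Maps a user-supplied namespace name to its catalog ID.
pub trait NamespaceResolver {
    fn resolve(&mut self, namespace: &str) -> Result<NamespaceId, ResolveError>;
}

/// Looks names up in an in-memory map that stands in for the catalog.
pub struct CatalogResolver {
    pub known: HashMap<String, NamespaceId>,
}

impl NamespaceResolver for CatalogResolver {
    fn resolve(&mut self, namespace: &str) -> Result<NamespaceId, ResolveError> {
        self.known
            .get(namespace)
            .copied()
            .ok_or_else(|| ResolveError::NotFound(namespace.to_string()))
    }
}

/// Composable layer: delegates to `inner` and creates missing namespaces.
pub struct AutocreateResolver<R> {
    pub inner: R,
    pub created: HashMap<String, NamespaceId>,
    pub next_id: i64,
}

impl<R: NamespaceResolver> NamespaceResolver for AutocreateResolver<R> {
    fn resolve(&mut self, namespace: &str) -> Result<NamespaceId, ResolveError> {
        match self.inner.resolve(namespace) {
            Ok(id) => Ok(id),
            Err(ResolveError::NotFound(_)) => {
                // Reuse an ID handed out earlier by this layer, if any.
                if let Some(id) = self.created.get(namespace) {
                    return Ok(*id);
                }
                let id = NamespaceId(self.next_id);
                self.next_id += 1;
                self.created.insert(namespace.to_string(), id);
                Ok(id)
            }
        }
    }
}

/// The handler receives both the name and the resolved ID.
pub fn handle_write<R: NamespaceResolver>(
    resolver: &mut R,
    namespace: &str,
    payload: &str,
) -> Result<String, ResolveError> {
    let id = resolver.resolve(namespace)?;
    Ok(format!("write {} to {} (id {})", payload, namespace, id.0))
}
```

Resolving once, before dispatch, means every later handler can rely on the ID being present instead of looking it up again.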
Dom Dwyer 0c5eb3f70f style: format imports
Re-order and re-format the imports so that they follow a consistent
pattern.

This helps eliminate conflicts due to imports.
2022-10-28 13:39:19 +02:00
Carol (Nichols || Goulding) 69a2e6b871
feat: Last 2 bonus features of remote store get-table (#5991)
* feat: Only get files that aren't already on disk with the reported size

* feat: Stream Parquet file bytes to file on disk

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 11:03:08 +00:00
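The two features listed in #5991 can be approximated with the standard library alone. This is a sketch under assumptions: `needs_download` and `stream_to_file` are hypothetical helper names, and the real command presumably streams from a network response rather than from an arbitrary `Read`.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::Path;

/// Returns `true` unless a regular file of exactly the reported size
/// already exists at `path`.
pub fn needs_download(path: &Path, reported_size: u64) -> bool {
    match fs::metadata(path) {
        Ok(meta) => !(meta.is_file() && meta.len() == reported_size),
        // Missing or unreadable file: fetch it.
        Err(_) => true,
    }
}

/// Copies bytes from `reader` to a file at `path` without buffering the
/// whole payload in memory; returns the number of bytes written.
pub fn stream_to_file<R: Read>(mut reader: R, path: &Path) -> io::Result<u64> {
    let mut file = File::create(path)?;
    io::copy(&mut reader, &mut file)
}
```

A size comparison is a cheap check, not an integrity check: a file with the right length but different contents would still be skipped.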
kodiakhq[bot] 16473f66e7
Merge pull request #5996 from influxdata/dom/require-partition-key
refactor(dml): PartitionKey required for writes
2022-10-28 10:38:17 +00:00
kodiakhq[bot] 1567227b49
Merge branch 'main' into dom/require-partition-key 2022-10-28 10:31:22 +00:00
Marco Neumann 8447d46093
refactor: remove `QueryChunkMeta::timestamp_min_max` (#5963)
Use the table summary instead. This allows us to have a single mechanism
that both IOx and DataFusion understand. This basically lifts the "basic
table summary" mechanism that the querier uses into `iox_query` and lets
the compactor and ingester use the same mechanism.

While not strictly necessary, simplifying the `QueryChunk[Meta]`
interface helps with #5897.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 10:29:16 +00:00
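To illustrate how a time range can be read from a table summary instead of a dedicated `timestamp_min_max` method (#5963), here is a small sketch. The `ColumnSummary` and `TableSummary` shapes are hypothetical simplifications, and the column name `"time"` is an assumption used for the example.

```rust
/// Hypothetical per-column statistics.
#[derive(Debug, Clone, PartialEq)]
pub struct ColumnSummary {
    pub name: String,
    pub min: Option<i64>,
    pub max: Option<i64>,
}

/// Hypothetical table summary: one entry per column with known statistics.
#[derive(Debug, Clone, PartialEq)]
pub struct TableSummary {
    pub columns: Vec<ColumnSummary>,
}

/// Derives the time range from the summary. Returns `None` when the time
/// column is missing or its statistics are incomplete.
pub fn timestamp_min_max(summary: &TableSummary) -> Option<(i64, i64)> {
    let col = summary.columns.iter().find(|c| c.name == "time")?;
    Some((col.min?, col.max?))
}
```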
Stuart Carnie 9f8c5856fc
chore: Keep types in their respective modules (#5993)
* chore: Keep types in their respective modules

Also adds required documentation now that the individual modules are
public.

* chore: Fix incomplete docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 10:06:49 +00:00
Andrew Lamb a0c0ae91ec
refactor: Simplify manipulations of BooleanArray (#5992)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 09:59:18 +00:00
Dom Dwyer 72a358e52f refactor(dml): PartitionKey required for writes
Changes the DmlWrite type to require a PartitionKey be specified,
instead of accepting an Option.

This requirement was already in place - the write buffer upheld an
invariant that all writes contained a partition key value (was not
"None") or it panicked at runtime when attempting to enqueue the write.

It is now possible to encode this invariant in the type system, which is
what this change does.
2022-10-28 10:57:30 +02:00
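The invariant described in the commit above can be shown in a few lines. This is a simplified sketch: `WriteBefore`, `WriteAfter`, and the `enqueue_*` functions are hypothetical, and only the `Option<PartitionKey>` to `PartitionKey` change mirrors the commit message.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PartitionKey(String);

impl PartitionKey {
    pub fn new(key: impl Into<String>) -> Self {
        Self(key.into())
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Before: the key is optional in the type, but required at runtime.
pub struct WriteBefore {
    pub partition_key: Option<PartitionKey>,
    pub lines: Vec<String>,
}

/// Panics if the key is missing, which is the runtime check being replaced.
pub fn enqueue_before(write: &WriteBefore) -> String {
    let key = write
        .partition_key
        .as_ref()
        .expect("write must have a partition key");
    format!("{}:{}", key.as_str(), write.lines.len())
}

/// After: a write without a key cannot be constructed at all.
pub struct WriteAfter {
    pub partition_key: PartitionKey,
    pub lines: Vec<String>,
}

/// No runtime check needed; the compiler enforces the invariant.
pub fn enqueue_after(write: &WriteAfter) -> String {
    format!("{}:{}", write.partition_key.as_str(), write.lines.len())
}
```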
kodiakhq[bot] 3568564d39
Merge pull request #5958 from influxdata/dom/buffer-state-machine
refactor(ingester): use partition buffer FSM
2022-10-28 08:52:56 +00:00
kodiakhq[bot] f24dec8ac7
Merge branch 'main' into dom/buffer-state-machine 2022-10-28 08:46:07 +00:00
dependabot[bot] 4f031b4abd
chore(deps): Bump comfy-table from 6.1.1 to 6.1.2 (#5994)
Bumps [comfy-table](https://github.com/nukesor/comfy-table) from 6.1.1 to 6.1.2.
- [Release notes](https://github.com/nukesor/comfy-table/releases)
- [Changelog](https://github.com/Nukesor/comfy-table/blob/main/CHANGELOG.md)
- [Commits](https://github.com/nukesor/comfy-table/compare/v6.1.1...v6.1.2)

---
updated-dependencies:
- dependency-name: comfy-table
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-28 06:00:48 +00:00
kodiakhq[bot] 6d3398045e
Merge pull request #5990 from influxdata/cn/remote-store-more
feat: Some bonus features for `remote store get-table`
2022-10-27 15:53:42 +00:00
Carol (Nichols || Goulding) ace497d47c
fix: Rename database to namespace in the commands I just added 2022-10-27 10:40:39 -04:00
Carol (Nichols || Goulding) d65a6a86dd
fix: Make error output less repetitive/wordy 2022-10-27 10:30:58 -04:00
Carol (Nichols || Goulding) 47faca6843
feat: Allow specifying output dir for get-table 2022-10-27 10:30:57 -04:00
Carol (Nichols || Goulding) dc4adfeefb
feat: Add the partition ID to fetched parquet files 2022-10-27 10:30:57 -04:00
kodiakhq[bot] e0722623d6
Merge pull request #5984 from influxdata/cn/remote-store
feat: MVP of remote store get-table command
2022-10-27 14:17:29 +00:00
kodiakhq[bot] 90c93cb06a
Merge branch 'main' into cn/remote-store 2022-10-27 14:11:07 +00:00
Carol (Nichols || Goulding) f720dcee36
docs: Clarifications suggested in code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-10-27 10:10:28 -04:00
Marco Neumann 6369d88633
refactor: enforce name of the one-and-only time column (#5982)
* refactor: enforce name of the one-and-only time column

We currently only support a single time dimension and some parts of
our stack rely on the name of the time column. So let's enforce the
name (note that `schema::try_from_arrow` already checks for duplicate
columns, so we are now left with a single dimension).

* refactor: mark a few errors as "internal"

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-27 12:42:49 +00:00
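The check described in #5982 might look roughly like the sketch below: reject duplicate column names, and reject any timestamp column that is not named as expected. The constant value `"time"`, the `ColumnKind` and `SchemaError` enums, and the `validate` function are assumptions for illustration; this is not the actual `schema::try_from_arrow` implementation.

```rust
use std::collections::HashSet;

/// Assumed name of the single supported time column.
pub const TIME_COLUMN_NAME: &str = "time";

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnKind {
    Tag,
    Field,
    Timestamp,
}

#[derive(Debug, PartialEq, Eq)]
pub enum SchemaError {
    DuplicateColumn(String),
    WrongTimeColumnName(String),
}

/// Validates a list of `(name, kind)` columns.
///
/// Because names must be unique and every timestamp column must be called
/// `TIME_COLUMN_NAME`, at most one time column can pass validation.
pub fn validate(columns: &[(&str, ColumnKind)]) -> Result<(), SchemaError> {
    let mut seen = HashSet::new();
    for (name, kind) in columns {
        if !seen.insert(*name) {
            return Err(SchemaError::DuplicateColumn((*name).to_string()));
        }
        if *kind == ColumnKind::Timestamp && *name != TIME_COLUMN_NAME {
            return Err(SchemaError::WrongTimeColumnName((*name).to_string()));
        }
    }
    Ok(())
}
```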
Marco Neumann 3d524baa9e
fix: update rskafka to support throttling (#5988)
See https://github.com/influxdata/rskafka/issues/182 .
2022-10-27 12:34:43 +00:00
Dom c1528f4d61
Merge branch 'main' into dom/buffer-state-machine 2022-10-27 09:23:02 +01:00
dependabot[bot] e6b45f8bde
chore(deps): Bump pbjson-types from 0.5.0 to 0.5.1 (#5987)
Bumps [pbjson-types](https://github.com/influxdata/pbjson) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson-types
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-27 08:21:08 +00:00
Dom Dwyer 5d2f4a0ad1 docs: fix issue URL for memory tracking bug 2022-10-27 10:15:15 +02:00