9866 Commits (651b7a1ce64df19c730e1fd721a1dcf4ecf618cf)

Author SHA1 Message Date
Marco Neumann d2399764c5 fix: be more conservative w/ "request canceled" messages 2022-11-01 17:08:59 +01:00
Cannon Palms ee92d28dfd
Merge pull request #6016 from influxdata/crepererum/issue5981
feat: enable ZSTD compression for write buffer payload
2022-11-01 09:45:58 -04:00
Marco Neumann 254be59856 feat: enable ZSTD compression for write buffer payload
Closes #5981.
2022-11-01 14:22:33 +01:00
Marco Neumann aa4eec9939 chore: update rskafka
Mostly upstream dependency updates (8678dfe049de05415929ffec7c1be8921bb057f7).
2022-11-01 14:19:32 +01:00
Marco Neumann d6cbae16ac
chore: update rskafka (#5998)
Includes additional logging to debug https://github.com/influxdata/idpe/issues/16278
2022-11-01 06:39:26 +00:00
dependabot[bot] 7785b20d7f
chore(deps): Bump hyper from 0.14.21 to 0.14.22 (#6012)
Bumps [hyper](https://github.com/hyperium/hyper) from 0.14.21 to 0.14.22.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/v0.14.22/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.21...v0.14.22)

---
updated-dependencies:
- dependency-name: hyper
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-11-01 05:53:09 +00:00
Andrew Lamb 9c1f0a3644
refactor: move SessionConfig creation into datafusion_utils (#6011)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 20:04:49 +00:00
Andrew Lamb 00953460fb
chore: Update datafusion pin (#6010) 2022-10-31 17:14:56 +00:00
Marco Neumann 072439e428
refactor: mandatory `QueryChunkMeta::summary` (#5997)
With #5963 merged, all chunks now provide a summary (even though it may
not contain data for all columns). So let's make it mandatory, which
also removes a few 🙈-style `.expect(...)` calls.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:38:02 +00:00
dependabot[bot] 62e51a5e06
chore(deps): Bump cc from 1.0.73 to 1.0.74 (#6008)
Bumps [cc](https://github.com/rust-lang/cc-rs) from 1.0.73 to 1.0.74.
- [Release notes](https://github.com/rust-lang/cc-rs/releases)
- [Commits](https://github.com/rust-lang/cc-rs/compare/1.0.73...1.0.74)

---
updated-dependencies:
- dependency-name: cc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:30:57 +00:00
dependabot[bot] b1572c50a6
chore(deps): Bump once_cell from 1.15.0 to 1.16.0 (#6009)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.15.0 to 1.16.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.15.0...v1.16.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:23:40 +00:00
Andrew Lamb ace3c11f12
chore: Update datafusion (#6004)
* chore: Update datafusion

* chore: change path

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:16:28 +00:00
Carol (Nichols || Goulding) 729ffffa3e
fix: Let's not all write to the same file at the same time
Fixes #6001.

The generator can create multiple agents that all write in parallel to
the same file, which results in garbage.

Share the same File instance with a Mutex around it and lock the file
until you've written one whole line.
2022-10-28 13:44:33 -04:00
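The fix described above (one shared writer behind a `Mutex`, locked for the duration of one whole line) can be sketched roughly as follows. This is a minimal illustration, not the generator's actual code: `write_line` and `run_agents` are hypothetical names, and the writer is generic over `Write` so that a `File` or an in-memory buffer can be used.

```rust
use std::io::{self, Write};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: write one complete line while holding the lock,
/// so that lines from different agents can never interleave.
fn write_line<W: Write>(out: &Mutex<W>, line: &str) -> io::Result<()> {
    // The guard is held until the end of this function, i.e. until the
    // line and its trailing newline have both been written.
    let mut guard = out.lock().expect("writer mutex poisoned");
    guard.write_all(line.as_bytes())?;
    guard.write_all(b"\n")
}

/// Hypothetical driver: spawn `agents` threads that all write to the
/// same shared writer in parallel.
fn run_agents<W: Write + Send + 'static>(
    out: Arc<Mutex<W>>,
    agents: usize,
    lines_per_agent: usize,
) {
    let handles: Vec<_> = (0..agents)
        .map(|agent| {
            let out = Arc::clone(&out);
            thread::spawn(move || {
                for seq in 0..lines_per_agent {
                    let line = format!("agent={} seq={}", agent, seq);
                    write_line(&out, &line).expect("write failed");
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("agent thread panicked");
    }
}
```

With a real file, the caller would pass something like `Arc::new(Mutex::new(File::create(path)?))`. The order of lines across agents is still nondeterministic; only the integrity of each line is protected.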
kodiakhq[bot] 71dd3b5fa5
Merge pull request #6005 from influxdata/cn/add-catalog-service-everywhere
feat: Add the catalog service to ingester, querier, and compactor
2022-10-28 16:30:46 +00:00
Carol (Nichols || Goulding) dad1ad1318
feat: Add the catalog service to ingester, querier, and compactor
So that `remote get` that uses the catalog service can work no matter
what kind of server you contact.
2022-10-28 10:49:26 -04:00
Carol (Nichols || Goulding) 53445af25d
chore: Alphabetize some dependencies
I can't handle not knowing where to look for a dependency or knowing
where to add a new dependency.
2022-10-28 10:34:25 -04:00
Andrew Lamb e9d04ffcb5
feat: Log how long each persist plan takes to complete (#5989)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 13:52:39 +00:00
Dom e9f3b425f3
Merge pull request #6000 from influxdata/dom/router-namespaceid
refactor(router): pass NamespaceId through handler stack
2022-10-28 13:45:00 +01:00
Dom Dwyer d166de931d refactor: resolve namespace before DML dispatch
This commit introduces a new (composable) trait; a NamespaceResolver is
an abstraction responsible for taking a string namespace from a user
request, and mapping it to its catalog ID.

This allows the NamespaceId to be injected through the DmlHandler chain
in addition to the namespace name.

As part of this change, the NamespaceAutocreation layer was changed from
an implementor of the DmlHandler trait, to a NamespaceResolver as it
is a more appropriate abstraction for the functionality it provides.
2022-10-28 13:41:05 +02:00
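As a rough illustration of the composable resolver idea in the commit above, the sketch below resolves a namespace name to an ID before a write is dispatched, with an auto-creation layer wrapping an inner resolver. Everything here is an assumption made for illustration: the trait signature, `CatalogLookup`, `AutoCreate`, and `dispatch_write` are hypothetical, and the real trait is likely asynchronous and backed by the catalog.

```rust
use std::collections::HashMap;

/// Stand-in for the catalog ID type (hypothetical definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NamespaceId(i64);

/// Hypothetical, synchronous version of the resolver abstraction:
/// map a namespace name from a user request to its ID.
trait NamespaceResolver {
    fn resolve(&mut self, name: &str) -> Option<NamespaceId>;
}

/// Innermost resolver: only knows namespaces that already exist.
struct CatalogLookup {
    known: HashMap<String, NamespaceId>,
}

impl NamespaceResolver for CatalogLookup {
    fn resolve(&mut self, name: &str) -> Option<NamespaceId> {
        self.known.get(name).copied()
    }
}

/// Composable layer: delegates to `inner`, and creates the namespace
/// when the inner resolver does not know it.
struct AutoCreate<R> {
    inner: R,
    created: HashMap<String, NamespaceId>,
    next_id: i64,
}

impl<R> AutoCreate<R> {
    fn new(inner: R, first_id: i64) -> Self {
        Self { inner, created: HashMap::new(), next_id: first_id }
    }
}

impl<R: NamespaceResolver> NamespaceResolver for AutoCreate<R> {
    fn resolve(&mut self, name: &str) -> Option<NamespaceId> {
        if let Some(id) = self.inner.resolve(name) {
            return Some(id);
        }
        if let Some(id) = self.created.get(name) {
            return Some(*id);
        }
        let id = NamespaceId(self.next_id);
        self.next_id += 1;
        self.created.insert(name.to_string(), id);
        Some(id)
    }
}

/// Resolve first, then hand both the name and the ID to the handler
/// (represented here by a formatted string).
fn dispatch_write<R: NamespaceResolver>(
    resolver: &mut R,
    namespace: &str,
    payload: &str,
) -> Result<String, String> {
    let id = resolver
        .resolve(namespace)
        .ok_or_else(|| format!("unknown namespace: {}", namespace))?;
    Ok(format!("namespace={} id={} payload={}", namespace, id.0, payload))
}
```

Because `AutoCreate` is itself a `NamespaceResolver`, it can be stacked on any other resolver without the write handlers knowing about it.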
Dom Dwyer 0c5eb3f70f style: format imports
Re-order and re-format the imports so that they follow a consistent
pattern.

This helps eliminate conflicts due to imports.
2022-10-28 13:39:19 +02:00
Carol (Nichols || Goulding) 69a2e6b871
feat: Last 2 bonus features of remote store get-table (#5991)
* feat: Only get files that aren't already on disk with the reported size

* feat: Stream Parquet file bytes to file on disk

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 11:03:08 +00:00
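The first bullet above (skip files that are already on disk with the reported size) amounts to a size check before each download. A minimal sketch, assuming a hypothetical `needs_download` helper and using only the standard library; the actual command may compare other metadata as well:

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper: return `true` when the file must be fetched,
/// i.e. it is missing, is not a regular file, or its size on disk
/// differs from the size reported by the remote side.
fn needs_download(path: &Path, reported_size: u64) -> bool {
    match fs::metadata(path) {
        Ok(meta) => !(meta.is_file() && meta.len() == reported_size),
        // Any error (most commonly "not found") means we should fetch it.
        Err(_) => true,
    }
}
```

A size match does not prove the contents are identical; it is a cheap heuristic for resuming an interrupted bulk download.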
kodiakhq[bot] 16473f66e7
Merge pull request #5996 from influxdata/dom/require-partition-key
refactor(dml): PartitionKey required for writes
2022-10-28 10:38:17 +00:00
kodiakhq[bot] 1567227b49
Merge branch 'main' into dom/require-partition-key 2022-10-28 10:31:22 +00:00
Marco Neumann 8447d46093
refactor: remove `QueryChunkMeta::timestamp_min_max` (#5963)
Use the table summary instead. This allows us to have a single mechanism
that both IOx and DataFusion understand. This basically lifts the "basic
table summary" mechanism that the querier uses into `iox_query` and lets
the compactor and ingester use the same mechanism.

While not strictly necessary, simplifying the `QueryChunk[Meta]`
interface helps with #5897.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 10:29:16 +00:00
Stuart Carnie 9f8c5856fc
chore: Keep types in their respective modules (#5993)
* chore: Keep types in their respective modules

Also adds required documentation now that the individual modules are
public.

* chore: Fix incomplete docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 10:06:49 +00:00
Andrew Lamb a0c0ae91ec
refactor: Simplify manipulations of BooleanArray (#5992)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 09:59:18 +00:00
Dom Dwyer 72a358e52f refactor(dml): PartitionKey required for writes
Changes the DmlWrite type to require a PartitionKey be specified,
instead of accepting an Option.

This requirement was already in place - the write buffer upheld an
invariant that all writes contained a partition key value (was not
"None") or it panicked at runtime when attempting to enqueue the write.

It is now possible to encode this invariant in the type system, which is
what this change does.
2022-10-28 10:57:30 +02:00
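The difference between checking the invariant at runtime and encoding it in the type system can be illustrated as below. This is a simplified sketch, not the real `DmlWrite` definition: `WriteBefore`, `WriteAfter`, and the `enqueue_*` functions are hypothetical names.

```rust
/// Stand-in for the partition key type (hypothetical definition).
#[derive(Debug, Clone, PartialEq, Eq)]
struct PartitionKey(String);

/// Before: the key is optional, so every consumer must handle `None`.
struct WriteBefore {
    table: String,
    partition_key: Option<PartitionKey>,
}

/// After: the key is required, so a write without one cannot be built.
struct WriteAfter {
    table: String,
    partition_key: PartitionKey,
}

/// Runtime check: panics if the invariant was violated by the caller.
fn enqueue_before(write: &WriteBefore) -> String {
    let key = write
        .partition_key
        .as_ref()
        .expect("write must carry a partition key");
    format!("{}@{}", write.table, key.0)
}

/// No check needed: the compiler guarantees the key is present.
fn enqueue_after(write: &WriteAfter) -> String {
    format!("{}@{}", write.table, write.partition_key.0)
}
```

With the second shape, the panic path disappears entirely, and forgetting the key becomes a compile error at the construction site instead of a failure at enqueue time.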
kodiakhq[bot] 3568564d39
Merge pull request #5958 from influxdata/dom/buffer-state-machine
refactor(ingester): use partition buffer FSM
2022-10-28 08:52:56 +00:00
kodiakhq[bot] f24dec8ac7
Merge branch 'main' into dom/buffer-state-machine 2022-10-28 08:46:07 +00:00
dependabot[bot] 4f031b4abd
chore(deps): Bump comfy-table from 6.1.1 to 6.1.2 (#5994)
Bumps [comfy-table](https://github.com/nukesor/comfy-table) from 6.1.1 to 6.1.2.
- [Release notes](https://github.com/nukesor/comfy-table/releases)
- [Changelog](https://github.com/Nukesor/comfy-table/blob/main/CHANGELOG.md)
- [Commits](https://github.com/nukesor/comfy-table/compare/v6.1.1...v6.1.2)

---
updated-dependencies:
- dependency-name: comfy-table
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-28 06:00:48 +00:00
kodiakhq[bot] 6d3398045e
Merge pull request #5990 from influxdata/cn/remote-store-more
feat: Some bonus features for `remote store get-table`
2022-10-27 15:53:42 +00:00
Carol (Nichols || Goulding) ace497d47c
fix: Rename database to namespace in the commands I just added 2022-10-27 10:40:39 -04:00
Carol (Nichols || Goulding) d65a6a86dd
fix: Make error output less repetitive/wordy 2022-10-27 10:30:58 -04:00
Carol (Nichols || Goulding) 47faca6843
feat: Allow specifying output dir for get-table 2022-10-27 10:30:57 -04:00
Carol (Nichols || Goulding) dc4adfeefb
feat: Add the partition ID to fetched parquet files 2022-10-27 10:30:57 -04:00
kodiakhq[bot] e0722623d6
Merge pull request #5984 from influxdata/cn/remote-store
feat: MVP of remote store get-table command
2022-10-27 14:17:29 +00:00
kodiakhq[bot] 90c93cb06a
Merge branch 'main' into cn/remote-store 2022-10-27 14:11:07 +00:00
Carol (Nichols || Goulding) f720dcee36
docs: Clarifications suggested in code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-10-27 10:10:28 -04:00
Marco Neumann 6369d88633
refactor: enforce name of the one-and-only time column (#5982)
* refactor: enforce name of the one-and-only time column

We currently only support a single time dimension and some parts of
our stack rely on the name of the time column. So let's enforce the
name (note that `schema::try_from_arrow` already checks for duplicate
columns, so we are now left with a single dimension).

* refactor: mark a few errors as "internal"

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-27 12:42:49 +00:00
Marco Neumann 3d524baa9e
fix: update rskafka to support throttling (#5988)
See https://github.com/influxdata/rskafka/issues/182 .
2022-10-27 12:34:43 +00:00
Dom c1528f4d61
Merge branch 'main' into dom/buffer-state-machine 2022-10-27 09:23:02 +01:00
dependabot[bot] e6b45f8bde
chore(deps): Bump pbjson-types from 0.5.0 to 0.5.1 (#5987)
Bumps [pbjson-types](https://github.com/influxdata/pbjson) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson-types
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-27 08:21:08 +00:00
Dom Dwyer 5d2f4a0ad1 docs: fix issue URL for memory tracking bug 2022-10-27 10:15:15 +02:00
Dom Dwyer f6416675c2 docs: mark hyperlink in rustdoc comments 2022-10-27 10:15:15 +02:00
Dom Dwyer 678fb81892 refactor(ingester): use partition buffer FSM
This commit makes use of the partition buffer state machine introduced
in https://github.com/influxdata/influxdb_iox/pull/5943.

This commit significantly changes the buffering, and querying, of data
from a partition, swapping out the existing "DataBuffer" for the new
state machine implementation (itself simplified due to temporary lack of
incremental snapshot generation, see #5944).

This commit simplifies the query path, removing multiple types that
wrapped one-another to pass around various state necessary to perform a
query, with various query functions needing different types or
combinations of types. The query path now operates using a single type
(named "QueryAdaptor") that provides a queryable interface over the set
of RecordBatch returned from a partition.

There is significantly increased testing of the PartitionData itself,
covering data in various states and the ordering of returned RecordBatch
(to ensure correct materialisation of updates). There are also
invariants upheld by the type system / compiler to minimise the
complexities of working with empty batches & states, and many asserts
that ensure (mostly existing!) invariants are upheld.
2022-10-27 10:15:15 +02:00
dependabot[bot] 5f35e88706
chore(deps): Bump pbjson-build from 0.5.0 to 0.5.1 (#5986)
Bumps [pbjson-build](https://github.com/influxdata/pbjson) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson-build
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-27 08:12:25 +00:00
dependabot[bot] d8baabba4b
chore(deps): Bump pbjson from 0.5.0 to 0.5.1 (#5985)
Bumps [pbjson](https://github.com/influxdata/pbjson) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-27 07:43:49 +00:00
Marco Neumann b438e4b18a
feat: log cancelled HTTP/gRPC requests (#5980)
This will be helpful to see when the querier or router is too slow and
we time out. In contrast to the existing metrics, this also helps w/ log
correlation (i.e. "when did we get stuck").

Closes #5975.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-27 07:33:22 +00:00
Carol (Nichols || Goulding) de2ae6f557
feat: MVP of remote store get-table command 2022-10-26 13:50:03 -04:00
Marco Neumann d466a04ad8
feat: allow deploys to set Kafka client ID (#5983) 2022-10-26 15:59:16 +00:00