Commit Graph

9324 Commits (7202dddab6d9ede46c74664c0675fe349da2fd13)

Author SHA1 Message Date
dependabot[bot] 7202dddab6
chore(deps): Bump tokio-stream from 0.1.10 to 0.1.11 (#5838)
Bumps [tokio-stream](https://github.com/tokio-rs/tokio) from 0.1.10 to 0.1.11.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-stream-0.1.10...tokio-stream-0.1.11)

---
updated-dependencies:
- dependency-name: tokio-stream
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-12 12:37:24 +00:00
Luke Bond 11900cea4d
chore: add some tracing logs to the ingester (#5839) 2022-10-12 12:10:20 +00:00
Nga Tran b7153862b0
refactor: due to the size limit on uploads to S3, we need to split the output file of cold compaction, too (#5834)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-11 17:22:19 +00:00
kodiakhq[bot] 990fa55e28
Merge pull request #5832 from influxdata/dom/move-query-types
refactor: move query types to query_handler
2022-10-11 16:14:01 +00:00
Dom Dwyer b294bb98aa refactor: move query types to query_handler
Moves types that are only used for handling queries to the query_handler
module.
2022-10-11 17:58:55 +02:00
kodiakhq[bot] 7c9a26849b
Merge pull request #5831 from influxdata/dom/remove-tombstones
refactor(ingester): remove tombstone support and delete tests from `query_tests`
2022-10-11 15:22:24 +00:00
kodiakhq[bot] 96ff3b020a
Merge branch 'main' into dom/remove-tombstones 2022-10-11 15:14:54 +00:00
Dom Dwyer c4f542bbe2 refactor(ingester): remove tombstone support
This commit removes tombstone support from the ingester, and deletes
associated code/helpers/tests. This commit does NOT remove tombstone
support from any other service, but MAY include removing overlapping
test coverage.

This also removes the tombstone support from the Ingester -> Querier RPC
response message.

This has the nice side effect of removing a whole lot of thread spawning
in the ingester tests for the Executor, speeding everything up!
2022-10-11 13:10:04 +02:00
kodiakhq[bot] a205c01b16
Merge pull request #5830 from influxdata/dom/revert-rdkafka
revert: rdkafka/rskafka swapping (#5800)
2022-10-11 11:10:01 +00:00
Dom Dwyer b77c3540e1 revert: rdkafka/rskafka swapping (#5800)
This reverts commit 33391af973.
2022-10-11 13:01:10 +02:00
Luke Bond fda1479db0
chore: add trace log to ingester to aid debugging (#5829) 2022-10-11 10:33:42 +00:00
Carol (Nichols || Goulding) 33391af973
feat: Swap Kafka Producer implementation back to rdkafka as diagnosis of latency problem (#5800)
* feat: Add back rdkafka dependency

* feat: Remove RSKafkaProducer

* feat: Remove write buffer RecordAggregator

* feat: Add back rdkafka producer

Using code from 58a2a0b9c8311303c796495db4f167c99a2ea3aa then getting it
to compile with the latest

* feat: Add a metric around enqueue

* fix: Remove unused imports

* fix: Increase Kafka timeout to 20s

* docs: Clarify that Kafka topics should only be created in test/dev envs

* fix: Remove metrics that aren't needed for this experiment

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-11 09:14:45 +00:00
Dom 2b5ca28374
Merge pull request #5827 from influxdata/dependabot/cargo/object_store-0.5.1
chore(deps): Bump object_store from 0.5.0 to 0.5.1
2022-10-11 10:05:31 +01:00
Dom d2467d0b63
Merge branch 'main' into dependabot/cargo/object_store-0.5.1 2022-10-11 09:56:27 +01:00
Dom 2b8958fc03
Merge pull request #5826 from influxdata/dom/table-name-type
refactor: use TableName, not Arc<str>
2022-10-11 09:54:29 +01:00
dependabot[bot] 933493fab3
chore(deps): Bump object_store from 0.5.0 to 0.5.1
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.0...object_store_0.5.1)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-10-11 01:19:10 +00:00
Dom Dwyer 97c6e0f8ce refactor: use TableName, not Arc<str>
Adds a type wrapper TableName, internally an Arc<str> to leverage the
type system instead of passing around untyped strings.
2022-10-10 19:09:43 +02:00
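The type wrapper described in 97c6e0f8ce can be sketched roughly as below. This is a minimal illustration, assuming only what the commit message states (a `TableName` that is internally an `Arc<str>`); the trait impls and the `describe` helper are hypothetical and not taken from the IOx source.

```rust
use std::fmt;
use std::ops::Deref;
use std::sync::Arc;

/// A table name backed by a reference-counted string, so clones are cheap.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct TableName(Arc<str>);

impl From<&str> for TableName {
    fn from(name: &str) -> Self {
        Self(Arc::from(name))
    }
}

impl Deref for TableName {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}

impl fmt::Display for TableName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

/// Hypothetical caller: the signature now asks for a table name
/// specifically, rather than accepting any string.
pub fn describe(table: &TableName) -> String {
    format!("table={table}")
}
```

With a wrapper like this, passing a namespace name where a table name is expected becomes a compile error instead of a silent mix-up, while cloning still only bumps a reference count.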
Dom c95bf8ff87
Merge pull request #5807 from influxdata/dom/deferred-sort-key-load
perf(ingester): deferred sort key load
2022-10-10 18:09:31 +01:00
Dom c1277fb15d
Merge branch 'main' into dom/deferred-sort-key-load 2022-10-10 16:05:14 +01:00
dependabot[bot] 0eac3812c8
chore(deps): Bump snafu from 0.7.1 to 0.7.2 (#5821)
Bumps [snafu](https://github.com/shepmaster/snafu) from 0.7.1 to 0.7.2.
- [Release notes](https://github.com/shepmaster/snafu/releases)
- [Changelog](https://github.com/shepmaster/snafu/blob/main/CHANGELOG.md)
- [Commits](https://github.com/shepmaster/snafu/compare/0.7.1...0.7.2)

---
updated-dependencies:
- dependency-name: snafu
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-10 13:15:29 +00:00
dependabot[bot] f8bc4d8881
chore(deps): Bump libc from 0.2.134 to 0.2.135 (#5822)
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.134 to 0.2.135.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.134...0.2.135)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-10 13:06:59 +00:00
Dom Dwyer 4518bd49d1 test: constify duration seconds 2022-10-10 14:39:35 +02:00
Dom Dwyer ab78f99ab2 refactor: eager background task abort
Changes the get() code path to abort the background load task when the
caller will resolve the sort key.

Note that an aborted future will leave the DeferredSortKey without a
background task to fetch the key, and the next caller will have to query
the catalog. Given the rarity of aborted futures, and desire to minimise
catalog load, this seems like a decent trade-off.

This commit also documents the many-readers eager loading problem.
2022-10-10 14:39:35 +02:00
Dom 3bdcf4dc7a
Merge pull request #5820 from influxdata/dependabot/cargo/serde_json-1.0.86
chore(deps): Bump serde_json from 1.0.85 to 1.0.86
2022-10-10 12:22:20 +01:00
dependabot[bot] 2277fcf08a
chore(deps): Bump serde_json from 1.0.85 to 1.0.86
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.85 to 1.0.86.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.85...v1.0.86)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-10-10 01:42:37 +00:00
Andrew Lamb d8a318eb57
docs: Tweak local run guide (#5787)
Update the instructions on how to run IOx locally

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-07 20:34:00 +00:00
Andrew Lamb 8013781ac2
feat: rewrite missing column references to NULL (#5818)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-07 18:05:54 +00:00
dependabot[bot] 02e3ab125c
chore(deps): Bump syn from 1.0.101 to 1.0.102 (#5813)
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.101 to 1.0.102.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.101...1.0.102)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-07 13:35:03 +00:00
dependabot[bot] 07361f5b40
chore(deps): Bump tracing from 0.1.36 to 0.1.37 (#5811)
Bumps [tracing](https://github.com/tokio-rs/tracing) from 0.1.36 to 0.1.37.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.36...tracing-0.1.37)

---
updated-dependencies:
- dependency-name: tracing
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-07 13:25:26 +00:00
Stuart Carnie 5bd6b43666
fix: Correct representation of 3-part measurement name (#5794)
Closes #5662
2022-10-07 00:01:22 +00:00
Nga Tran 95ed41f140
feat: Projection pushdown for querier -> ingester for rpc queries (#5782)
* feat: initial step to identify where the projection should be provided

* feat: start getting columns of all expressions

* chore: format

* test: test for the table_chunk_stream

* fix: fix a compile error. Thanks @alamb

* test: full tests for table_chunk_stream

* chore: cleanup

* fix: do not cut any columns in case all fields are needed

* test: add one more test case of reading all columns

* refactor: move code that identifies columns to push down to a function. Add the use of field_columns

* chore: cleanup

* refactor: make stream_from_batch support empty batches

* chore: cleanup

* chore: fix clippy after auto merge

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-06 17:21:23 +00:00
Dom Dwyer afcb96ae47 perf(ingester): deferred sort key lookup queries
This commit carries the SortKey in the PartitionData, and configures the
ingester to use deferred sort key lookups, smearing the lookups across a
fixed period of time after initialising the PartitionData, instead of
querying for the sort key at persist time.

This allows large numbers of PartitionData to be initialised without
causing an equally large spike in catalog load to resolve the sort key -
instead this load is spread out randomly to reduce peak query rps.
2022-10-06 16:39:54 +02:00
Dom Dwyer c022ab6786 feat: deferred partition sort key fetcher
Adds a new DeferredSortKey type that fetches a partition's sort key from
the catalog in the background, or on-demand if not yet pre-fetched.

From the caller's perspective, little has changed compared to reading it
from the catalog directly - the sort key is always returned when calling
get(), regardless of the mechanism, and retries are handled
transparently. Internally the sort key MAY have been pre-fetched in the
background between the DeferredSortKey being initialised, and the call
to get().

The background task waits a (uniformly) random duration of time before
issuing the catalog query to pre-fetch the sort key. This smears the
lookup queries over a large duration of time, so a large number of
DeferredSortKey can be initialised in a short period without creating an
equally large spike in queries against the catalog in the same period.
2022-10-06 16:37:04 +02:00
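The behaviour described in c022ab6786 (background pre-fetch after a random delay, with an on-demand fallback in get()) can be sketched as below. This is a simplified illustration under stated assumptions, not the IOx implementation: it uses OS threads and only the standard library, the `Fetcher` closure is a hypothetical stand-in for the catalog query, retries are omitted, and `random_delay` derives its randomness from `RandomState` purely to avoid an external RNG crate.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hasher};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for "query the catalog for the sort key".
pub type Fetcher = Arc<dyn Fn() -> String + Send + Sync>;

pub struct DeferredSortKey {
    value: Arc<Mutex<Option<String>>>,
    fetch: Fetcher,
}

impl DeferredSortKey {
    /// Spawn a background thread that waits `delay`, then pre-fetches the
    /// key unless a caller has already resolved it on demand.
    pub fn new(fetch: Fetcher, delay: Duration) -> Self {
        let value: Arc<Mutex<Option<String>>> = Arc::new(Mutex::new(None));
        let bg_value = Arc::clone(&value);
        let bg_fetch = Arc::clone(&fetch);
        thread::spawn(move || {
            thread::sleep(delay);
            let mut guard = bg_value.lock().unwrap();
            if guard.is_none() {
                *guard = Some(bg_fetch());
            }
        });
        Self { value, fetch }
    }

    /// Return the sort key, fetching it now if it was not pre-fetched.
    /// The lock is held across the fetch so only one lookup is issued.
    pub fn get(&self) -> String {
        let mut guard = self.value.lock().unwrap();
        if let Some(key) = guard.as_ref() {
            return key.clone();
        }
        let key = (self.fetch)();
        *guard = Some(key.clone());
        key
    }
}

/// Pick a pseudo-random delay in `[0, max)`. A std-only stand-in for a
/// proper RNG; the distribution is only approximately uniform.
pub fn random_delay(max: Duration) -> Duration {
    let r = RandomState::new().build_hasher().finish();
    let max_ms = max.as_millis() as u64;
    Duration::from_millis(if max_ms == 0 { 0 } else { r % max_ms })
}
```

A caller would construct it with something like `DeferredSortKey::new(fetch, random_delay(max))`. Because each instance draws its own delay, many instances created at once issue their lookups at different times rather than simultaneously.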
kodiakhq[bot] ace30b9d1d
Merge pull request #5798 from influxdata/dom/namespace-name
refactor: carry namespace name in NamespaceData
2022-10-06 14:06:37 +00:00
kodiakhq[bot] ffa1704d96
Merge branch 'main' into dom/namespace-name 2022-10-06 13:58:47 +00:00
Marco Neumann c4c83e0840
fix: query error propagation (#5801)
- treat OOM protection as "resource exhausted"
- use `DataFusionError` in more places instead of opaque `Box<dyn Error>`
- improve conversion from/into `DataFusionError` to preserve more
  semantics

Overall, this improves our error handling. DF can now return errors like
"resource exhausted" and gRPC should now automatically generate a
sensible status code for it.

Fixes #5799.
2022-10-06 08:54:01 +00:00
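The idea behind c4c83e0840 (typed errors let the gRPC layer pick a sensible status code, which an opaque `Box<dyn Error>` cannot) can be illustrated as below. This is a hedged sketch only: `QueryError`, `Code`, and `status_code` are hypothetical names, not the real `DataFusionError` or any gRPC library type, and the variant-to-code mapping is an assumption. The numeric values match the standard gRPC codes INVALID_ARGUMENT (3), RESOURCE_EXHAUSTED (8), and INTERNAL (13).

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical, simplified query error with typed variants.
#[derive(Debug)]
pub enum QueryError {
    /// For example, an out-of-memory protection limit was hit.
    ResourcesExhausted(String),
    /// The query itself was invalid.
    Plan(String),
    /// An opaque error: nothing more can be said about it.
    External(Box<dyn Error + Send + Sync>),
}

impl fmt::Display for QueryError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::ResourcesExhausted(msg) => write!(f, "resources exhausted: {msg}"),
            Self::Plan(msg) => write!(f, "plan error: {msg}"),
            Self::External(err) => write!(f, "external error: {err}"),
        }
    }
}

impl Error for QueryError {}

/// Hypothetical subset of gRPC status codes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Code {
    InvalidArgument = 3,
    ResourceExhausted = 8,
    Internal = 13,
}

/// Map an error to a status code. Only the typed variants carry enough
/// semantics to choose something more specific than `Internal`.
pub fn status_code(err: &QueryError) -> Code {
    match err {
        QueryError::ResourcesExhausted(_) => Code::ResourceExhausted,
        QueryError::Plan(_) => Code::InvalidArgument,
        QueryError::External(_) => Code::Internal,
    }
}
```

The `External` arm shows what is lost when an error is boxed opaquely: the only safe mapping left is a generic internal error.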
Dom Dwyer abb9122e2c refactor: carry namespace name in NamespaceData
Changes the ingester's NamespaceData to carry a ref-counted string
identifier as well as the ID.

The backing storage for the name in NamespaceData is shared with the
index map in ShardData, so it is effectively free!
2022-10-05 13:03:16 +02:00
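The "effectively free" sharing mentioned in abb9122e2c can be sketched as below, assuming the name is an `Arc<str>` used both as the map key and as a field of the value. The struct layouts and method names here are hypothetical simplifications, not the actual ingester types.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical, simplified NamespaceData carrying both the ID and the name.
pub struct NamespaceData {
    namespace_id: i64,
    namespace_name: Arc<str>,
}

impl NamespaceData {
    pub fn id(&self) -> i64 {
        self.namespace_id
    }

    pub fn name(&self) -> &Arc<str> {
        &self.namespace_name
    }
}

/// Hypothetical, simplified ShardData indexing namespaces by name.
#[derive(Default)]
pub struct ShardData {
    namespaces: HashMap<Arc<str>, Arc<NamespaceData>>,
}

impl ShardData {
    pub fn insert(&mut self, id: i64, name: &str) -> Arc<NamespaceData> {
        // One heap allocation for the string; the map key and the
        // NamespaceData field are two handles to the same bytes.
        let name: Arc<str> = Arc::from(name);
        let data = Arc::new(NamespaceData {
            namespace_id: id,
            namespace_name: Arc::clone(&name),
        });
        self.namespaces.insert(name, Arc::clone(&data));
        data
    }
}
```

Carrying the name in the value therefore costs one extra reference count increment, not a second copy of the string.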
kodiakhq[bot] e81dad972f
Merge pull request #5791 from influxdata/dom/remove-partition-queries
refactor: reference buffer tree nodes by ID
2022-10-05 10:54:19 +00:00
Dom c48aef27b4
Merge branch 'main' into dom/remove-partition-queries 2022-10-05 11:46:33 +01:00
dependabot[bot] c9a2445fd4
chore(deps): Bump handlebars from 4.3.4 to 4.3.5 (#5797)
* chore(deps): Bump handlebars from 4.3.4 to 4.3.5

Bumps [handlebars](https://github.com/sunng87/handlebars-rust) from 4.3.4 to 4.3.5.
- [Release notes](https://github.com/sunng87/handlebars-rust/releases)
- [Changelog](https://github.com/sunng87/handlebars-rust/blob/v4.3.5/CHANGELOG.md)
- [Commits](https://github.com/sunng87/handlebars-rust/compare/v4.3.4...v4.3.5)

---
updated-dependencies:
- dependency-name: handlebars
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-05 09:48:11 +00:00
Dom cc238c6a8f
Merge branch 'main' into dom/remove-partition-queries 2022-10-05 10:40:31 +01:00
dependabot[bot] 9bbbf86116
chore(deps): Bump sqlparser from 0.24.0 to 0.25.0 (#5795)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.24.0 to 0.25.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.24.0...v0.25.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-05 09:04:18 +00:00
Andrew Lamb a11aafe25b
chore: Update SQL repl to refer to `namespace` rather than `database` (#5788) 2022-10-04 12:53:17 +00:00
Dom Dwyer 1a7eb47b81 refactor: persist() passes all necessary IDs
This commit changes the persist() call so that it passes through all
relevant IDs so that the impl can locate the partition in the buffer
tree - this will enable elimination of many queries against the catalog
in the future.

This commit also cleans up the persist() impl, deferring queries until
the result will be used to avoid unnecessary load, improves logging &
error handling, and documents a TOCTOU bug in code:

    https://github.com/influxdata/influxdb_iox/issues/5777
2022-10-04 14:28:01 +02:00
Dom Dwyer f9bf86927d refactor: ref PartitionData by key & ID
Changes the TableData to hold a map of partition key -> PartitionData,
and partition ID -> PartitionData simultaneously. This allows for cheap
lookups when the caller holds an ID.

This commit also manages to internalise the partition map within the
TableData - one less pub / peeking!

This commit also switches from a BTreeMap to a HashMap as the backing
collection, as maintaining key ordering doesn't appear to be necessary.
2022-10-04 14:28:01 +02:00
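The dual-index layout described in f9bf86927d (partition key -> PartitionData and partition ID -> PartitionData held simultaneously, backed by HashMap) might look roughly like the sketch below. The field types, the `PartitionMap` name, and the use of `Arc` to share one value between both maps are assumptions made for illustration; the real TableData may differ.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for the real PartitionData.
#[derive(Debug)]
pub struct PartitionData {
    pub partition_id: i64,
    pub partition_key: String,
}

/// Holds each partition once, reachable by either key or ID.
/// The maps are private, so callers cannot update one without the other.
#[derive(Debug, Default)]
pub struct PartitionMap {
    by_key: HashMap<String, Arc<PartitionData>>,
    by_id: HashMap<i64, Arc<PartitionData>>,
}

impl PartitionMap {
    /// Insert into both indexes in one place.
    pub fn insert(&mut self, data: PartitionData) {
        let data = Arc::new(data);
        self.by_key
            .insert(data.partition_key.clone(), Arc::clone(&data));
        self.by_id.insert(data.partition_id, data);
    }

    /// Look up by partition key (a string hash and comparison).
    pub fn by_key(&self, key: &str) -> Option<Arc<PartitionData>> {
        self.by_key.get(key).cloned()
    }

    /// Look up by partition ID (an integer hash and comparison).
    pub fn by_id(&self, id: i64) -> Option<Arc<PartitionData>> {
        self.by_id.get(&id).cloned()
    }
}
```

Both lookups return handles to the same allocation, so a caller holding only an ID avoids the string lookup entirely. A HashMap suffices here because, as the commit notes, key ordering does not appear to be needed.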
Dom Dwyer 0847cc5458 refactor: PartitionData::id() -> partition_id()
Consistent naming is consistent - all the others are thing_id().
2022-10-04 14:28:01 +02:00
Dom Dwyer 66e05b5ea7 refactor: ref NamespaceData by name & ID
Changes the ShardData to hold a map of namespace name -> NamespaceData,
and namespace ID -> NamespaceData simultaneously.

This allows for cheap lookups when the caller holds an ID, and is part
of preparatory work to transition away from using string names in the
ingester for tables.

This commit also switches from a BTreeMap to a HashMap as the backing
collection, as maintaining key ordering doesn't appear to be necessary.
2022-10-04 14:28:01 +02:00
Dom Dwyer 9c0e4e98c4 refactor: ref TableData by name & ID
Changes the NamespaceData to hold a map of table name -> TableData, and
table ID -> TableData simultaneously.

This allows for cheap lookups when the caller holds an ID, and is part
of preparatory work to transition away from using string names in the
ingester for tables.

This commit also switches from a BTreeMap to a HashMap as the backing
collection, as maintaining key ordering doesn't appear to be necessary.
2022-10-04 14:28:01 +02:00
kodiakhq[bot] 75178e4591
Merge pull request #5786 from influxdata/dom/fix-mem-counting
fix(ingester): incorrect memory tracking of failed writes
2022-10-03 10:31:43 +00:00
Dom Dwyer 7efd81a63a docs: comment write record ordering 2022-10-03 12:23:30 +02:00