Commit Graph

7476 Commits (a053077a0501366f912d38661295851b3649eba0)

Author SHA1 Message Date
dependabot[bot] 6607c5e179
chore(deps): Bump arrow-flight from 11.0.0 to 11.1.0 ()
Bumps [arrow-flight](https://github.com/apache/arrow-rs) from 11.0.0 to 11.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/11.0.0...11.1.0)

---
updated-dependencies:
- dependency-name: arrow-flight
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-06 10:04:23 +00:00
Paul Dix a6f18e86fe
chore: add compactor logs () 2022-04-05 21:26:59 +00:00
dependabot[bot] bea49e7611
chore(deps): Bump arrow from 11.0.0 to 11.1.0 ()
Bumps [arrow](https://github.com/apache/arrow-rs) from 11.0.0 to 11.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/11.0.0...11.1.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-05 16:54:28 +00:00
kodiakhq[bot] a11aef69f8
Merge pull request from influxdata/dom/watermark-fetcher
feat: periodic high watermark fetcher
2022-04-05 16:15:59 +00:00
Dom 3134763ea2
Merge branch 'main' into dom/watermark-fetcher 2022-04-05 17:08:41 +01:00
Dom 39199b0305
Merge pull request from influxdata/dom/compactor-parquet-limit
refactor: reduce level_0 query limit
2022-04-05 15:29:55 +01:00
Dom Dwyer 02f87e8484 refactor: reduce level_0 query limit
Reduce the query limit from 10,000 to 1,000 to help reduce query
execution time.
2022-04-05 15:14:56 +01:00
Dom Dwyer 506cdebf38 refactor: remove manual Debug impl
Derive the debug impl so it prints all the fields (specifically the
"number of sequencers configured" is pretty helpful in a test).

Manual impls drift over time and are more effort than the derive!
2022-04-05 12:02:07 +01:00
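The point made in the commit above can be illustrated with a minimal sketch. The `SequencerConfig` type and its fields are hypothetical and are not taken from the repository; the sketch only shows that `#[derive(Debug)]` prints every field, so newly added fields cannot be forgotten the way they can in a hand-written impl.

```rust
// Hypothetical type for illustration only; not the type changed in the commit.
#[derive(Debug)]
struct SequencerConfig {
    topic: String,
    n_sequencers: u32,
}

// Formats the value with the derived Debug impl.
fn describe(cfg: &SequencerConfig) -> String {
    format!("{:?}", cfg)
}

fn main() {
    let cfg = SequencerConfig {
        topic: "iox".to_string(),
        n_sequencers: 4,
    };
    // Prints: SequencerConfig { topic: "iox", n_sequencers: 4 }
    println!("{}", describe(&cfg));
}
```

Adding a field to the struct changes the output automatically, with no second place to update.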
Dom Dwyer 891d2e1368 feat: periodic kafka max watermark offset fetcher
Adds the PeriodicWatermarkFetcher type responsible for querying write
buffer / Kafka for the maximum sequence number / offset, surfacing any
errors via both logs & metrics.

This high watermark / max offset value is used within the ingest
instrumentation metrics. This use case is tolerant of caching / stale
values, and as such the value is periodically updated to minimise load
on the write buffer.
2022-04-05 12:02:07 +01:00
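The caching idea described in the commit above might look roughly like the following sketch. It is not the `PeriodicWatermarkFetcher` implementation: it assumes a plain OS thread rather than an async task, and the `CachedWatermark` type, its methods and the `fetch` closure are all hypothetical stand-ins. Readers get a possibly stale value cheaply, while a background loop refreshes it and logs failures.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical cache of the last successfully fetched high watermark.
struct CachedWatermark {
    value: Arc<AtomicU64>,
    stop: Arc<AtomicBool>,
    handle: Option<thread::JoinHandle<()>>,
}

impl CachedWatermark {
    /// Starts a background thread that calls `fetch` once per `interval`.
    fn spawn<F>(interval: Duration, mut fetch: F) -> Self
    where
        F: FnMut() -> Result<u64, String> + Send + 'static,
    {
        let value = Arc::new(AtomicU64::new(0));
        let stop = Arc::new(AtomicBool::new(false));
        let (v, s) = (Arc::clone(&value), Arc::clone(&stop));
        let handle = thread::spawn(move || {
            while !s.load(Ordering::Relaxed) {
                match fetch() {
                    // Success: publish the new offset to readers.
                    Ok(offset) => v.store(offset, Ordering::Relaxed),
                    // Failure: keep the stale value and surface the error.
                    Err(e) => eprintln!("watermark fetch failed: {}", e),
                }
                thread::sleep(interval);
            }
        });
        Self { value, stop, handle: Some(handle) }
    }

    /// Returns the cached (possibly stale) watermark without any I/O.
    fn get(&self) -> u64 {
        self.value.load(Ordering::Relaxed)
    }

    /// Stops the background thread and waits for it to exit.
    fn shutdown(mut self) {
        self.stop.store(true, Ordering::Relaxed);
        if let Some(h) = self.handle.take() {
            let _ = h.join();
        }
    }
}

fn main() {
    // The closure stands in for a query against the write buffer.
    let cache = CachedWatermark::spawn(Duration::from_millis(5), || Ok(42));
    thread::sleep(Duration::from_millis(100));
    println!("cached watermark: {}", cache.get());
    cache.shutdown();
}
```

The trade-off is the one the commit message names: the metric may lag by up to one interval, in exchange for a bounded query rate against the write buffer.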
kodiakhq[bot] faa85a04ad
Merge pull request from influxdata/dom/ingester-stream-handler
refactor: handle errors in ingester stream handler
2022-04-05 10:41:13 +00:00
Dom Dwyer aaa677dec8 docs: describe graceful shutdown behaviour 2022-04-05 11:31:55 +01:00
Dom Dwyer 8edefc415d refactor: rename ttbr -> write_time in tests 2022-04-05 11:31:55 +01:00
Dom Dwyer a387ec361d refactor: use self.deref() instead of **self 2022-04-05 11:31:55 +01:00
Dom Dwyer f15275cf96 feat: expose ingest sequencer errors
Instruments the SequencedStreamHandler with a series of new metrics that
record the various error classes observable in the stream handler.

These metrics are labelled with potential_data_loss=true where relevant
to surface potential data loss events for alerting & further review.
2022-04-05 11:31:55 +01:00
Dom Dwyer 083ff1f8e3 refactor: ingest stream handler
Refactors the stream_in_sequenced_entries() into a new impl in the
SequencedStreamHandler type, decoupling the reading / decoding of ops
from Kafka (and associated error handling) from the "what happens to
those ops" concern to ease testing, encapsulate the specifics of "how to
get an op" and improve flexibility.

This is intended to provide robust error handling within what is
reasonably possible (unexpected errors are always unexpected!) while
retaining the existing metrics and functionality. I've also separated
out code that exists in the current impl specifically to drive tests
from the prod code path, instead driving those behaviours through mocks.

As of this commit, the handler is not used - this commit simply adds the
new impl.
2022-04-05 11:31:54 +01:00
Dom Dwyer 850308cdc9 feat(tests): future timeout helper
Adds a timeout test helper for futures - this lets us easily write tests
that await on futures for a bounded duration of time.

Optional feature to avoid dragging tokio into existing consumers of the
test_helpers crate that don't need it.
2022-04-05 11:30:47 +01:00
Andrew Lamb 5d66cd0a81
feat: Add WriteSummary serialization and deserialization to protobuf ()
* feat: Add WriteSummary serialization and deserialization to protobuf

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-05 09:57:32 +00:00
Andrew Lamb 756116b497
chore: update datafusion ()
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-05 09:36:36 +00:00
Andrew Lamb 266f12b494
refactor: delete dead code () 2022-04-05 00:10:43 +00:00
Paul Dix 81d41f81a1
fix: ingester replay logic ()
Fix the ingester to track the max persisted sequence number per partition.
Ensure replay takes in data from unpersisted partitions.
Simplify the table persist info to not return a max persisted sequence number for the table as that information isn't needed.
2022-04-04 18:04:34 +00:00
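The per-partition tracking described in the commit above can be sketched as follows. The `ReplayTracker` type, its method names and the use of `i64` partition IDs are assumptions made for illustration, not the ingester's actual data structures. An operation is replayed if its partition has never persisted anything, or if its sequence number is above that partition's maximum persisted one.

```rust
use std::collections::HashMap;

/// Hypothetical tracker: partition ID -> max persisted sequence number.
#[derive(Default)]
struct ReplayTracker {
    max_persisted: HashMap<i64, u64>,
}

impl ReplayTracker {
    /// Records that everything up to `seq` is persisted for the partition.
    fn mark_persisted(&mut self, partition_id: i64, seq: u64) {
        let entry = self.max_persisted.entry(partition_id).or_insert(seq);
        *entry = (*entry).max(seq);
    }

    /// True if an op with `seq` must be re-ingested during replay.
    fn needs_replay(&self, partition_id: i64, seq: u64) -> bool {
        match self.max_persisted.get(&partition_id) {
            Some(&max) => seq > max,
            // Nothing persisted for this partition yet: replay everything.
            None => true,
        }
    }
}

fn main() {
    let mut tracker = ReplayTracker::default();
    tracker.mark_persisted(1, 10);
    println!("partition 1, seq 10: {}", tracker.needs_replay(1, 10)); // false
    println!("partition 1, seq 11: {}", tracker.needs_replay(1, 11)); // true
    println!("partition 2, seq 3:  {}", tracker.needs_replay(2, 3)); // true
}
```

Keeping the maximum per partition rather than per table is what lets data for an unpersisted partition be replayed even when another partition of the same table has already persisted later sequence numbers.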
kodiakhq[bot] f1799d836f
Merge pull request from influxdata/cn/sort-key-catalog
feat: Add optional sort_key column to partition table in the catalog
2022-04-04 17:02:56 +00:00
kodiakhq[bot] e2439c0a4f
Merge branch 'main' into cn/sort-key-catalog 2022-04-04 16:54:48 +00:00
kodiakhq[bot] e10e63403b
Merge pull request from influxdata/dom/column_name-table_id-index
refactor: add table_id index on column_name
2022-04-04 12:22:49 +00:00
kodiakhq[bot] 7b1b8878d7
Merge branch 'main' into dom/column_name-table_id-index 2022-04-04 12:15:08 +00:00
dependabot[bot] 276449ee09
chore(deps): Bump pbjson from 0.2.3 to 0.3.0 ()
Bumps [pbjson](https://github.com/influxdata/pbjson) from 0.2.3 to 0.3.0.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/compare/0.2.3...0.3.0)

---
updated-dependencies:
- dependency-name: pbjson
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-04 12:05:46 +00:00
Dom Dwyer 61bc9c83ad refactor: add table_id index on column_name
After checking the postgres workload for the catalog in prod, this
missing index was noted as the cause of unexpectedly expensive plans for
simple queries.
2022-04-04 13:04:25 +01:00
dependabot[bot] 26f6a1721f
chore(deps): Bump tracing-core from 0.1.23 to 0.1.24 ()
Bumps [tracing-core](https://github.com/tokio-rs/tracing) from 0.1.23 to 0.1.24.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-core-0.1.23...tracing-core-0.1.24)

---
updated-dependencies:
- dependency-name: tracing-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 11:17:24 +00:00
dependabot[bot] d19b944ba5
chore(deps): Bump tracing-subscriber from 0.3.9 to 0.3.10 ()
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.3.9 to 0.3.10.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.9...tracing-subscriber-0.3.10)

---
updated-dependencies:
- dependency-name: tracing-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 11:09:29 +00:00
dependabot[bot] 4c052be568
chore(deps): Bump sqlparser from 0.15.0 to 0.16.0 ()
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.15.0 to 0.16.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.15.0...v0.16.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 11:01:14 +00:00
dependabot[bot] dc9632114c
chore(deps): Bump pretty_assertions from 1.2.0 to 1.2.1 ()
Bumps [pretty_assertions](https://github.com/colin-kiegel/rust-pretty-assertions) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/colin-kiegel/rust-pretty-assertions/releases)
- [Changelog](https://github.com/colin-kiegel/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/colin-kiegel/rust-pretty-assertions/compare/v1.2.0...v1.2.1)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 10:53:31 +00:00
dependabot[bot] 36dd6f26a3
chore(deps): Bump pbjson-build from 0.2.3 to 0.3.0 ()
Bumps [pbjson-build](https://github.com/influxdata/pbjson) from 0.2.3 to 0.3.0.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/compare/0.2.3...0.3.0)

---
updated-dependencies:
- dependency-name: pbjson-build
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 10:45:31 +00:00
dependabot[bot] 1edd89eb67
chore(deps): Bump clap from 3.1.7 to 3.1.8 ()
Bumps [clap](https://github.com/clap-rs/clap) from 3.1.7 to 3.1.8.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.7...v3.1.8)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-04 10:36:58 +00:00
Andrew Lamb edda409b19
refactor: Extract `ioxd_test`, `ioxd_compactor`, `ioxd_ingester`; remove `ioxd` ()
* refactor: Extract test, compactor, ingester, and test

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-03 10:42:22 +00:00
Paul Dix 0892ccf7fb
fix: compactor use join_all ()
I forgot to address this in . Have the compactor use join and make sure the error gets logged.
2022-04-02 14:23:33 -04:00
Andrew Lamb 833c10c083
feat: return write_token from HTTP writes to router2 ()
* feat: return write_token from HTTP writes to router2

* fix: Update router2/src/dml_handlers/instrumentation.rs

Co-authored-by: Dom <dom@itsallbroken.com>

* refactor: Use WriteSummary::default more vigorously

* fix: fix typo and add links to follow on issues

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-02 10:34:51 +00:00
Paul Dix 3aa3ebe0e8
chore: add compactor logging () 2022-04-01 18:51:01 -04:00
Carol (Nichols || Goulding) cbf7888435
feat: Add Partition update_sort_key method to catalog 2022-04-01 15:45:51 -04:00
Carol (Nichols || Goulding) c9bc70f03a
feat: Add optional sort_key column to partition table
Connects to .
2022-04-01 15:45:51 -04:00
kodiakhq[bot] 403ae51099
Merge pull request from influxdata/cn/sort-key
feat: Compute a sort key in the ingester
2022-04-01 19:34:36 +00:00
kodiakhq[bot] b561f06c9e
Merge branch 'main' into cn/sort-key 2022-04-01 19:26:58 +00:00
Nga Tran 77ad4a7dad
feat: replace a compactor constant with a CLI config param () 2022-04-01 17:50:43 +00:00
Carol (Nichols || Goulding) d41adf074f
test: Add assertions for sort keys 2022-04-01 13:13:04 -04:00
Nga Tran a6eb83d47d
feat: compact small contiguous files of the same partition even if they do not overlap ()
* feat: compact small contiguous files of the same partition even if they do not overlap

* test: more tests

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-04-01 15:26:43 +00:00
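One way to picture the grouping described in the commit above is the sketch below. It is an illustration under stated assumptions, not the compactor's algorithm: the `FileMeta` type, the `small_limit` threshold and the rule "runs of time-adjacent small files form one group" are all hypothetical.

```rust
/// Hypothetical file metadata for a single partition.
#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    id: u32,
    min_time: i64,
    size_bytes: u64,
}

/// Groups file IDs for compaction: consecutive (in time order) files below
/// `small_limit` share a group; any larger file forms a group of its own.
fn group_small_contiguous(mut files: Vec<FileMeta>, small_limit: u64) -> Vec<Vec<u32>> {
    files.sort_by_key(|f| f.min_time);
    let mut groups: Vec<Vec<u32>> = Vec::new();
    let mut current: Vec<u32> = Vec::new();
    for f in files {
        if f.size_bytes < small_limit {
            // Small file: extend the current run, overlap not required.
            current.push(f.id);
        } else {
            // Large file: close the current run and keep this file alone.
            if !current.is_empty() {
                groups.push(std::mem::take(&mut current));
            }
            groups.push(vec![f.id]);
        }
    }
    if !current.is_empty() {
        groups.push(current);
    }
    groups
}

fn main() {
    let files = vec![
        FileMeta { id: 3, min_time: 20, size_bytes: 500 },
        FileMeta { id: 1, min_time: 0, size_bytes: 10 },
        FileMeta { id: 4, min_time: 30, size_bytes: 5 },
        FileMeta { id: 2, min_time: 10, size_bytes: 20 },
    ];
    // Prints: [[1, 2], [3], [4]]
    println!("{:?}", group_small_contiguous(files, 100));
}
```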
Luke Bond ea865b63f4
fix: create_or_get_multi for column in catalog now enforces limits ()
* fix: create_or_get_multi for column in catalog now enforces limits

fix: create_or_get_multi for column in catalog now enforces limits
chore: reorder catalog column create fns to be next to each other
test: add failing test for multi col insert w/ limits

test: bend catalog mem impl to match postgres for tests

fix: postgres column insert many column type error checks

chore: clippy

* test: assert column counts in partial column insert test

* chore: add some sql comments to the monster multicolumn insert query; s/RIGHT/INNER/ join

* chore: adding comments to clarify partial failure behaviour of multi col insert

* test: add tests for create_or_get_many columns in catalog

* test: forgot how macros work for a moment

* test: service limit test handles partial update of cols

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-01 10:59:43 +00:00
Paul Dix 6479e1fc8e
fix: add indexes to parquet_file ()
Add indexes so compactor can find candidate partitions and specific partition files quickly.
Limit the number of level 0 files returned for determining candidates. This should ensure that if compaction is very backed up, it will be able to work through the backlog without evaluating the entire world.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-01 09:59:39 +00:00
dependabot[bot] e8b0655ac8
chore(deps): Bump clap from 3.1.6 to 3.1.7 ()
Bumps [clap](https://github.com/clap-rs/clap) from 3.1.6 to 3.1.7.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.6...v3.1.7)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-01 09:49:30 +00:00
Carol (Nichols || Goulding) f4b5fa1b5e
feat: Implement distinct counts in terms of distinct values
For one record batch.

Connects to .
2022-03-31 16:46:27 -04:00
Carol (Nichols || Goulding) 832495a7c9
feat: Implement ingester compute_sort_key similarly to query compute_sort_key
And add a test that currently fails because this implementation doesn't
include actually computing the cardinalities.

Connects to .
2022-03-31 16:35:16 -04:00
Carol (Nichols || Goulding) 9d83554f20
feat: Get the sort key from the schema and data in the QueryableBatch
Connects to .
2022-03-31 16:34:48 -04:00
Carol (Nichols || Goulding) 9043966443
docs: Fix some typos in comments as I noticed them 2022-03-31 16:34:47 -04:00