7407 Commits (81d41f81a1c55e0b6f842e5b1081246c4cae1002)

Author SHA1 Message Date
Paul Dix 81d41f81a1
fix: ingester replay logic (#4212)
Fix the ingester to track the max persisted sequence number per partition.
Ensure replay takes in data from unpersisted partitions.
Simplify the table persist info to not return a max persisted sequence number for the table as that information isn't needed.
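
A minimal sketch of the replay rule described above, assuming a hypothetical `should_replay` helper and a map keyed by partition key (neither name is taken from the ingester code). A write is skipped only if its partition has already persisted data up to that sequence number; a partition with no persisted marker always takes the write.

```rust
use std::collections::HashMap;

/// Hypothetical helper, not the ingester's actual API.
/// `max_persisted` maps a partition key to the highest sequence number
/// that has already been persisted for that partition.
fn should_replay(
    max_persisted: &HashMap<String, u64>,
    partition_key: &str,
    sequence_number: u64,
) -> bool {
    match max_persisted.get(partition_key) {
        // Persisted partition: replay only writes newer than the marker.
        Some(&max) => sequence_number > max,
        // Unpersisted partition: every write must be replayed.
        None => true,
    }
}
```

Keeping the marker per partition rather than per table means one partition's persistence cannot cause another partition's unpersisted writes to be skipped.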
2022-04-04 18:04:34 +00:00
kodiakhq[bot] f1799d836f
Merge pull request #4206 from influxdata/cn/sort-key-catalog
feat: Add optional sort_key column to partition table in the catalog
2022-04-04 17:02:56 +00:00
kodiakhq[bot] e2439c0a4f
Merge branch 'main' into cn/sort-key-catalog 2022-04-04 16:54:48 +00:00
kodiakhq[bot] e10e63403b
Merge pull request #4225 from influxdata/dom/column_name-table_id-index
refactor: add table_id index on column_name
2022-04-04 12:22:49 +00:00
kodiakhq[bot] 7b1b8878d7
Merge branch 'main' into dom/column_name-table_id-index 2022-04-04 12:15:08 +00:00
dependabot[bot] 276449ee09
chore(deps): Bump pbjson from 0.2.3 to 0.3.0 (#4215)
Bumps [pbjson](https://github.com/influxdata/pbjson) from 0.2.3 to 0.3.0.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/compare/0.2.3...0.3.0)

---
updated-dependencies:
- dependency-name: pbjson
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-04 12:05:46 +00:00
Dom Dwyer 61bc9c83ad refactor: add table_id index on column_name
After checking the postgres workload for the catalog in prod, this
missing index was noted as the cause of unexpectedly expensive plans for
simple queries.
2022-04-04 13:04:25 +01:00
dependabot[bot] 26f6a1721f
chore(deps): Bump tracing-core from 0.1.23 to 0.1.24 (#4217)
Bumps [tracing-core](https://github.com/tokio-rs/tracing) from 0.1.23 to 0.1.24.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-core-0.1.23...tracing-core-0.1.24)

---
updated-dependencies:
- dependency-name: tracing-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 11:17:24 +00:00
dependabot[bot] d19b944ba5
chore(deps): Bump tracing-subscriber from 0.3.9 to 0.3.10 (#4222)
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from 0.3.9 to 0.3.10.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.9...tracing-subscriber-0.3.10)

---
updated-dependencies:
- dependency-name: tracing-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 11:09:29 +00:00
dependabot[bot] 4c052be568
chore(deps): Bump sqlparser from 0.15.0 to 0.16.0 (#4219)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.15.0 to 0.16.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.15.0...v0.16.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 11:01:14 +00:00
dependabot[bot] dc9632114c
chore(deps): Bump pretty_assertions from 1.2.0 to 1.2.1 (#4213)
Bumps [pretty_assertions](https://github.com/colin-kiegel/rust-pretty-assertions) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/colin-kiegel/rust-pretty-assertions/releases)
- [Changelog](https://github.com/colin-kiegel/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/colin-kiegel/rust-pretty-assertions/compare/v1.2.0...v1.2.1)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 10:53:31 +00:00
dependabot[bot] 36dd6f26a3
chore(deps): Bump pbjson-build from 0.2.3 to 0.3.0 (#4220)
Bumps [pbjson-build](https://github.com/influxdata/pbjson) from 0.2.3 to 0.3.0.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/compare/0.2.3...0.3.0)

---
updated-dependencies:
- dependency-name: pbjson-build
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-04 10:45:31 +00:00
dependabot[bot] 1edd89eb67
chore(deps): Bump clap from 3.1.7 to 3.1.8 (#4221)
Bumps [clap](https://github.com/clap-rs/clap) from 3.1.7 to 3.1.8.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.7...v3.1.8)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-04 10:36:58 +00:00
Andrew Lamb edda409b19
refactor: Extract `ioxd_test`, `ioxd_compactor`, `ioxd_ingester`; remove `ioxd` (#4210)
* refactor: Extract test, compactor, ingester, and test

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-03 10:42:22 +00:00
Paul Dix 0892ccf7fb
fix: compactor use join_all (#4211)
I forgot to address this in #4139. Have the compactor use join_all and make sure the error gets logged.
2022-04-02 14:23:33 -04:00
Andrew Lamb 833c10c083
feat: return write_token from HTTP writes to router2 (#4202)
* feat: return write_token from HTTP writes to router2

* fix: Update router2/src/dml_handlers/instrumentation.rs

Co-authored-by: Dom <dom@itsallbroken.com>

* refactor: Use WriteSummary::default more vigorously

* fix: fix typo and add links to follow on issues

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-02 10:34:51 +00:00
Paul Dix 3aa3ebe0e8
chore: add compactor logging (#4207) 2022-04-01 18:51:01 -04:00
Carol (Nichols || Goulding) cbf7888435
feat: Add Partition update_sort_key method to catalog 2022-04-01 15:45:51 -04:00
Carol (Nichols || Goulding) c9bc70f03a
feat: Add optional sort_key column to partition table
Connects to #4195.
2022-04-01 15:45:51 -04:00
kodiakhq[bot] 403ae51099
Merge pull request #4138 from influxdata/cn/sort-key
feat: Compute a sort key in the ingester
2022-04-01 19:34:36 +00:00
kodiakhq[bot] b561f06c9e
Merge branch 'main' into cn/sort-key 2022-04-01 19:26:58 +00:00
Nga Tran 77ad4a7dad
feat: replace a compactor constant with a CLI config param (#4204) 2022-04-01 17:50:43 +00:00
Carol (Nichols || Goulding) d41adf074f
test: Add assertions for sort keys 2022-04-01 13:13:04 -04:00
Nga Tran a6eb83d47d
feat: compact small contiguous files of the same partition even if they do not overlap (#4197)
* feat: compact small contiguous files of the same partition even if they do not overlap

* test: more tests

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-04-01 15:26:43 +00:00
Luke Bond ea865b63f4
fix: create_or_get_multi for column in catalog now enforces limits (#4179)
* fix: create_or_get_multi for column in catalog now enforces limits

fix: create_or_get_multi for column in catalog now enforces limits
chore: reorder catalog column create fns to be next to each other
test: add failing test for multi col insert w/ limits

test: bend catalog mem impl to match postgres for tests

fix: postgres column insert many column type error checks

chore: clippy

* test: assert column counts in partial column insert test

* chore: add some sql comments to the monster multicolumn insert query; s/RIGHT/INNER/ join

* chore: adding comments to clarify partial failure behaviour of multi col insert

* test: add tests for create_or_get_many columns in catalog

* test: forgot how macros work for a moment

* test: service limit test handles partial update of cols

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-01 10:59:43 +00:00
Paul Dix 6479e1fc8e
fix: add indexes to parquet_file (#4198)
Add indexes so compactor can find candidate partitions and specific partition files quickly.
Limit the number of level 0 files returned for determining candidates. This should ensure that if compaction is very backed up, it will be able to work through the backlog without evaluating the entire world.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-01 09:59:39 +00:00
dependabot[bot] e8b0655ac8
chore(deps): Bump clap from 3.1.6 to 3.1.7 (#4199)
Bumps [clap](https://github.com/clap-rs/clap) from 3.1.6 to 3.1.7.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.6...v3.1.7)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-01 09:49:30 +00:00
Carol (Nichols || Goulding) f4b5fa1b5e
feat: Implement distinct counts in terms of distinct values
For one record batch.

Connects to #4194.
2022-03-31 16:46:27 -04:00
Carol (Nichols || Goulding) 832495a7c9
feat: Implement ingester compute_sort_key similarly to query compute_sort_key
And add a test that currently fails because this implementation doesn't
include actually computing the cardinalities.

Connects to #4194.
2022-03-31 16:35:16 -04:00
Carol (Nichols || Goulding) 9d83554f20
feat: Get the sort key from the schema and data in the QueryableBatch
Connects to #4194.
2022-03-31 16:34:48 -04:00
Carol (Nichols || Goulding) 9043966443
docs: Fix some typos in comments as I noticed them 2022-03-31 16:34:47 -04:00
Andrew Lamb d37af1a7f5
fix: include git sha (again) in release build (#4193)
* fix: error if git-sha can not be found

* refactor: move main to influxdb_iox

* fix: fmt
2022-03-31 19:14:21 +00:00
Andrew Lamb 532d227d11
refactor: extract router2 into ioxd_router2 (#4183)
* refactor: extract router2 from ioxd

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-03-31 17:39:05 +00:00
Andrew Lamb 367e926d35
refactor: extract querier into ioxd_querier (#4182)
* refactor: extract querier into ioxd_querier

* fix: dep
2022-03-31 16:03:31 +00:00
Andrew Lamb a384448b92
refactor: rename Sequence::id and Sequence::number field names (#4190)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-31 15:17:58 +00:00
Marco Neumann 5bebc73e3f
fix: consider "in-between" tombstones as processed (#4187)
Abstract
========
We need to be careful with tombstones that fall exactly within the sequence number range of a parquet file.

Current Bug
===========
Imagine the following order of events:

1. Router creates write at sequence number 1:
   - `table,selector=1 payload=1 1`
   - `table,selector=2 payload=2 2`
2. Ingester pulls write, waits a bit and persists it to parquet file 1:
   - `table,selector=1 payload=1 1`
   - `table,selector=2 payload=2 2`
3. Router creates write at sequence number 2:
   - `table,selector=1 payload=3 3`
   - `table,selector=2 payload=4 4`
4. Ingester pulls write
5. Router creates delete at sequence number 3: full time range, `selector=1`
6. Ingester pulls delete and creates tombstone 1
7. Router creates write at sequence number 4:
   - `table,selector=1 payload=5 5`
   - `table,selector=2 payload=6 6`
8. Ingester pulls write
9. Ingester persists parquet file 2:
   - `table,selector=2 payload=4 4`
   - `table,selector=1 payload=5 5`
   - `table,selector=2 payload=6 6`

When reading parquet file 2, the tombstone MUST NOT be applied. Otherwise `table,selector=1 payload=5 5` will be
deleted.
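
The rule implied by this example can be sketched as follows. This is a simplified illustration, not the querier's actual code: `ParquetFileRange`, `Tombstone`, `is_in_between`, and `tombstone_applies` are hypothetical names, and the sketch assumes a tombstone only needs to be applied at read time when its sequence number is greater than the file's maximum sequence number.

```rust
/// Hypothetical, simplified stand-in for the catalog's parquet file record.
#[derive(Debug, Clone, Copy)]
struct ParquetFileRange {
    min_sequence_number: u64,
    max_sequence_number: u64,
}

/// Hypothetical, simplified stand-in for a tombstone.
#[derive(Debug, Clone, Copy)]
struct Tombstone {
    sequence_number: u64,
}

/// True if the tombstone falls inside the file's sequence number range,
/// i.e. the ingester already materialized it while persisting the file.
fn is_in_between(file: &ParquetFileRange, tombstone: &Tombstone) -> bool {
    file.min_sequence_number <= tombstone.sequence_number
        && tombstone.sequence_number <= file.max_sequence_number
}

/// True if the tombstone still has to be applied when reading the file.
/// "In-between" tombstones are treated as processed, so only tombstones
/// newer than everything in the file qualify.
fn tombstone_applies(file: &ParquetFileRange, tombstone: &Tombstone) -> bool {
    tombstone.sequence_number > file.max_sequence_number
}
```

With parquet file 2 covering sequence numbers 2 to 4 and tombstone 1 at sequence number 3, `tombstone_applies` returns false, so `table,selector=1 payload=5 5` survives. For parquet file 1 (sequence number 1 only) it returns true.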

Notes
=====
Technically, this issue also applies to files created by the compactor; however, the compactor marks tombstones that
fall into the sequence number range as processed. It even does that in a single transaction:

fc4635a334/compactor/src/compact.rs (L821-L861)

Alternative
===========
An alternative solution would be for the ingester to mark tombstones that it materialized during persistence as
"processed" (tombstone 1 for parquet file 2 in the example above). However, "processed" markers are currently a mere
optimization and don't affect correctness, which is nice for caching on the querier side as well as for reasoning.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-31 15:09:58 +00:00
Nga Tran 9c50a4c9fb
test: replace find_and_compact with compact_partition in tests (#4185)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-31 13:51:22 +00:00
Andrew Lamb de6505c801
fix: retry catalog list operation (#4188) 2022-03-31 13:07:00 +00:00
Andrew Lamb a1df864283
feat: Support 'SHOW NAMESPACES' in sql repl (#4164)
* feat: Support `SHOW NAMESPACES` in sql repl

* feat: add basic support to clients

* fix: add get_namespaces service test

* fix: proper error handling

* test: end to end test for namespace client

* refactor: Use QuerierDatabase rather than Catalog

* refactor: remove unused function
2022-03-31 12:57:33 +00:00
dependabot[bot] fc4635a334
chore(deps): Bump lock_api from 0.4.6 to 0.4.7 (#4186)
Bumps [lock_api](https://github.com/Amanieu/parking_lot) from 0.4.6 to 0.4.7.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Amanieu/parking_lot/compare/lock_api-0.4.6...lock_api-0.4.7)

---
updated-dependencies:
- dependency-name: lock_api
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-31 07:45:03 +00:00
Nga Tran ddc2c8304f
fix: have the compaction level set correctly (#4184)
* fix: have the compaction level set correctly, especially for compacted file from the compactor

* fix: typo
2022-03-30 21:23:40 +00:00
Andrew Lamb d1940c0c47
fix: flaky `error_converting_data_from_write_buffer_to_sequenced_entry_is_reported` (#4181) 2022-03-30 19:00:35 +00:00
Paul Dix 04d961e70d
feat: wire up compactor scheduler and config (#4139)
Add configuration options for compactor for the max size of level 0 files and split percentage.
Add metrics for compaction to track the number of candidates, compactions, and durations.
Add functions to separate identifying partitions to compact from running compaction.
Make compaction run in smaller chunks, specifically per partition.
Update compaction to automatically promote level 0 files that are non-overlapping without waiting some period of time.

Closes #4120

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 17:45:24 +00:00
Raphael Taylor-Davies bfe24b3418
feat: add protobuf-compiler to CI image (#4180)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 17:36:27 +00:00
Andrew Lamb 92da65a065
feat: Add end to end tests for querier and schema client (#4178)
* refactor: split up ingester schema test, add mini cluster

* feat: add schema cli test

* feat: add end to end test for querier

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 17:07:32 +00:00
Andrew Lamb 22b24bdab3
chore: Update datafusion again (#4148)
* chore: update datafusion

* refactor: Update for DataFusion API changes

* chore: TEMP TEMP change df to local copy

* chore: Update to datafusion again

* fix: Update Cargo.lock

* fix: logical conflict
2022-03-30 16:51:48 +00:00
Marko Mikulicic 2c47d77a5b
fix: Backfill namespace_id in schema migration (#4177)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 16:31:26 +00:00
Edd Robinson 7a437387d9
feat: add KeySortCapability capability (#4176) 2022-03-30 15:57:03 +00:00
Marco Neumann b1af5b3f44
feat: query log system table for querier (#4157)
* feat: query log system table for querier

Closes #4084.

* fix: typo

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: extend

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 15:38:11 +00:00
Marco Neumann 5c6583e09a
feat: `CacheBackend::is_empty` (#4174)
* test: run generic tests for dual cache backend

* feat: `CacheBackend::is_empty`

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 15:28:59 +00:00