7255 Commits (0779f81b6bdb2d2d4ea9294ded3a434cd372bfe8)

Author SHA1 Message Date
Marco Neumann 01f4d3ed3b
fix: check arrow instead of storage gRPC health for SQL REPL (#3886)
There is no reason a query pod should support the storage API. Note that
some features like the observer mode or `show databases;` still need the
management API. We'll probably need to fix that for NG at some point.
2022-03-01 15:46:02 +00:00
Marco Neumann ace4af1b66
feat: `DedicatedExecutor` async `join` and job `detach`. (#3835)
* feat: detach dedicated exec jobs

* feat: async `DedicatedExecutor::join`

Now `DedicatedExecutor` follows the system we use for other server
components:

- `shutdown`: a quick sync call that signals the shutdown but doesn't
  drop
- `join`: async awaits until the executor has finished shutdown
- `drop`: warn but still try to shut down

* test: improve `detach_receiver` test

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-03-01 15:25:31 +00:00
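
Note on ace4af1b66: the `shutdown` / `join` / `drop` split described above can be sketched roughly as follows. This is not the real `DedicatedExecutor`; `Worker`, its fields, and `is_joined` are hypothetical names, and a plain thread with a blocking `join` stands in for the async `join` the commit describes, so that the sketch needs only the standard library.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::JoinHandle;

/// Hypothetical worker illustrating the shutdown / join / drop split.
/// Assumption: this only mirrors the lifecycle, not the real executor.
pub struct Worker {
    /// Dropping the sender tells the worker thread to stop.
    stop_tx: Option<Sender<()>>,
    /// Taken by `join` once the thread has been waited for.
    handle: Option<JoinHandle<&'static str>>,
}

impl Worker {
    pub fn new() -> Self {
        let (stop_tx, stop_rx) = channel::<()>();
        let handle = std::thread::spawn(move || {
            // Block until a message arrives or the sender is dropped.
            let _ = stop_rx.recv();
            "stopped"
        });
        Self {
            stop_tx: Some(stop_tx),
            handle: Some(handle),
        }
    }

    /// Quick and non-blocking: only signals the shutdown.
    pub fn shutdown(&mut self) {
        // Dropping the sender wakes the worker; a second call is a no-op.
        self.stop_tx.take();
    }

    /// Signals shutdown if needed, then waits until the thread has finished.
    /// Returns `None` if the worker was already joined.
    pub fn join(&mut self) -> Option<&'static str> {
        self.shutdown();
        self.handle.take().map(|h| h.join().expect("worker panicked"))
    }

    pub fn is_joined(&self) -> bool {
        self.handle.is_none()
    }
}

impl Drop for Worker {
    fn drop(&mut self) {
        if self.handle.is_some() {
            // Warn, but still try to shut down.
            eprintln!("warning: Worker dropped without join(); shutting down anyway");
            self.shutdown();
        }
    }
}
```

Calling `shutdown()` followed by `join()` is the clean path in this sketch; dropping a `Worker` that was never joined prints the warning and still signals the thread to stop.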
Marko Mikulicic 4a56fcdcab
fix: Use bigger executor for test job (#3885)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-01 14:42:55 +00:00
Marco Neumann e5c45aeab6
refactor: generic top-level flight implementation (#3882)
This allows us to implement flight for the NG querier by just
implementing a few traits and reuse all the existing glue code and
optimizations (like dictionary handling).
2022-03-01 14:33:08 +00:00
Raphael Taylor-Davies 43ada68f37
chore: reset release codegen-units to default (#3883) 2022-03-01 13:44:14 +00:00
Marco Neumann daf14f6506 refactor: clean up `querier` a bit
Before adding more and more features, here is a bit of a clean up and
prep work:

- Pull out caching into its own module and add proper tests for it.
- Start to build a test infrastructure so tests are shorter and easier
  to read. This doesn't fully pay off just yet but gets more and more
  important when we actually sync tables and chunks.
2022-03-01 13:24:20 +01:00
Marco Neumann 48722783f9
feat: offer metrics for in-mem catalog (#3876)
This can be quite helpful to test certain caching behavior w/o writing
yet-another abstraction layer.
2022-03-01 11:33:54 +00:00
Nga Tran 0e0dc500f6
feat: prepare data to send to querier (#3825)
* feat: changes needed to apply tombstones correctly on the life-cycle ingest batches

* refactor: adjust the design after discussing with Paul

* feat: apply the coming tombstone on all data but persisting one

* chore: fmt

* fix: build on buffer tombstone

* test: delete & write tests for a partition and some cleanup

* feat: No need to add processed tombstones for newly created parquet file in the ingester because all deletes before that parquet file is created were applied

* chore: cleanup

* feat: initial implementation for preparing data to send back to the Querier

* feat: full implementation of prepare_data_to_querier

* fix: apply filters for the batches

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* chore: cleanup

* fix: typos in comments

* fix: typos in comments

* fix: typos in comments

* test: create different scenarios and test them

* chore: fix typos

* test: add tests with deletes

* chore: make pub pub(crate)

* chore: Apply suggestions from code review

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>

* refactor: address review comments

* fix: keep batches in their arrival order

* refactor: do not assign unnecessary values to enum

* refactor: use bitflags enum

* fix: use bitflags correctly

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: avoid using use at the end of the function

* chore: merge main to branch

* fix: fix downgrade versions

* refactor: address review comments

* chore: remove unnecessary comments

* refactor: Make the whole test_utils module test-only and bring paths into module scope

Co-authored-by: Paul Dix <paul@pauldix.net>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
2022-03-01 01:00:45 +00:00
Raphael Taylor-Davies 792241c89d
feat: harden write buffer aggregator (#3805) (#3877)
* feat: harden write buffer aggregator (#3805)

* chore: more logs

* fix: build

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 18:31:24 +00:00
Marco Neumann 6e2470bf5f
feat: create CatalogChunk in querier (#3862)
* feat: `NamespaceRepo::get_by_id`

* feat: create `CatalogChunk` in querier
2022-02-28 17:20:38 +00:00
Marco Neumann 33851be3a5
chore: upgrade Rust to 1.59 (#3875)
Mostly a few new clippy lints around `flat_map`, `and_then`, and
"underscore locks" (!!!):
https://rust-lang.github.io/rust-clippy/master/index.html#let_underscore_lock

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 15:14:19 +00:00
Marco Neumann b213796c98
feat: sync namespaces in querier (#3865)
* feat: `NamespaceRepo::list`

* feat: sync namespaces in querier

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 15:01:28 +00:00
Marco Neumann 77a0c74830
chore: `cargo update` + less rustsec exceptions (#3873)
It seems that dependabot doesn't catch up with the flood of updates, so
this is a manual update run:

```console
$ cargo update
    Updating crates.io index
    Updating git repository `https://github.com/Azure/azure-sdk-for-rust.git`
    Updating git repository `https://github.com/apache/arrow-datafusion.git`
    Updating git repository `https://github.com/influxdata/rskafka.git`
    Updating git repository `https://github.com/mkmik/heappy`
    Updating anyhow v1.0.53 -> v1.0.55
    Updating autocfg v1.0.1 -> v1.1.0
    Updating clang-sys v1.3.0 -> v1.3.1
    Updating clap_derive v3.1.0 -> v3.1.2
    Updating core-foundation v0.9.2 -> v0.9.3
    Updating crossbeam-epoch v0.9.6 -> v0.9.7
    Updating crossbeam-queue v0.3.3 -> v0.3.4
    Updating crossbeam-utils v0.8.6 -> v0.8.7
    Updating debugid v0.7.2 -> v0.7.3
    Updating fd-lock v3.0.3 -> v3.0.4
    Updating inferno v0.10.10 -> v0.10.12
    Updating io-lifetimes v0.4.4 -> v0.5.3
    Updating kube-client v0.69.0 -> v0.69.1
    Updating kube-core v0.69.0 -> v0.69.1
    Updating libm v0.2.1 -> v0.2.2
    Updating linux-raw-sys v0.0.37 -> v0.0.42
    Removing memmap v0.7.0
      Adding memmap2 v0.5.3
    Updating ntapi v0.3.6 -> v0.3.7
    Updating output_vt100 v0.1.2 -> v0.1.3
    Updating rgb v0.8.31 -> v0.8.32
    Updating rustix v0.32.1 -> v0.33.3
    Updating rustls v0.20.2 -> v0.20.4
    Updating security-framework v2.6.0 -> v2.6.1
    Updating security-framework-sys v2.6.0 -> v2.6.1
    Updating semver v1.0.4 -> v1.0.6
    Updating symbolic-common v8.5.0 -> v8.6.1
    Updating symbolic-demangle v8.5.0 -> v8.6.1
    Updating tower-http v0.2.2 -> v0.2.3
    Updating unicode-segmentation v1.8.0 -> v1.9.0
    Updating zeroize v1.5.2 -> v1.5.3
```

This also allows us to remove a rustsec exception.
2022-02-28 11:24:32 +00:00
dependabot[bot] a9d52aaef3
chore(deps): Bump mockito from 0.30.0 to 0.31.0 (#3871)
* chore(deps): Bump mockito from 0.30.0 to 0.31.0

Bumps [mockito](https://github.com/lipanski/mockito) from 0.30.0 to 0.31.0.
- [Release notes](https://github.com/lipanski/mockito/releases)
- [Commits](https://github.com/lipanski/mockito/compare/0.30.0...0.31.0)

---
updated-dependencies:
- dependency-name: mockito
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: disallow `RUSTSEC-2020-0095`

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 10:05:24 +00:00
dependabot[bot] c289467239
chore(deps): Bump clap from 3.1.1 to 3.1.2 (#3870)
Bumps [clap](https://github.com/clap-rs/clap) from 3.1.1 to 3.1.2.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.1...v3.1.2)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 09:53:25 +00:00
Marco Neumann 6bb18672a4
fix: do not ignore failed persist tasks (#3866)
I'm seeing some panics in our test bench, but the ingester happily
continues and thinks it persisted tasks even though it didn't. Let's at
least bail out if a persist task fails.
2022-02-28 09:30:42 +00:00
Carol (Nichols || Goulding) d7bd46f086
fix: Update hakari files for cargo-hakari 0.9.12 (#3872) 2022-02-28 09:18:23 +00:00
Raphael Taylor-Davies 2a842fbb1a
feat: correctly sort data and store in catalog metadata (#3864)
* feat: respect sort order in ChunkTableProvider (#3214)

feat: persist sort order in catalog (#3845)

refactor: owned SortKey (#3845)

* fix: size tests

* refactor: immutable SortKey

* test: test sort order restart (#3845)

* chore: explicit None for sort key

* chore: test cleanup

* fix: handling of sort keys containing fields

* chore: remove unused selected_sort_key

* chore: more docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 17:56:27 +00:00
Nga Tran 8edc462c37
fix: while executing deduplication, do not return empty record batches as a result of deduplication (#3861)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 15:00:13 +00:00
kodiakhq[bot] c79186901d
Merge pull request #3863 from influxdata/dom/configurable-db-pool
refactor: configurable max catalog connections
2022-02-25 11:51:18 +00:00
Dom Dwyer 3579252682 refactor: configurable max catalog connections
Allow the maximum number of catalog (postgres) connections to be
specified as part of the catalog configuration.
2022-02-25 11:23:21 +00:00
Raphael Taylor-Davies a32b952104
fix: chunk overlap missing columns (#3844)
* fix: chunk overlap missing columns

* chore: more tests

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 09:46:03 +00:00
Marco Neumann f966f4c7a4
feat: create `ParquetChunk` in querier (#3857)
Adds a small adapter that is able to produce `ParquetChunk`s for NG.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 08:54:16 +00:00
Paul Dix c965221df1
feat: have ingester ignore already persisted data (#3849) 2022-02-25 00:08:33 +00:00
kodiakhq[bot] 2f868f2222
Merge pull request #3860 from influxdata/dom/parallel-column-resolution
refactor: parallel column resolution
2022-02-24 21:20:49 +00:00
Dom Dwyer b07f15bec7 refactor: parallel column resolution
A quick change to perform the ColumnRepo::create_or_get() calls in
parallel (up to a maximum of 3 in-flight at any one time) in order to
mitigate the latency of the call and reduce the overall schema
validation call duration.

The in-flight limit is enforced to avoid starving the DB connection pool
of connections.
2022-02-24 21:04:25 +00:00
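
Note on b07f15bec7: the commit body does not show how the in-flight limit is enforced. As a rough illustration of the idea only, the sketch below bounds parallelism with a fixed number of worker threads pulling from a shared queue. `bounded_parallel` is a hypothetical helper, and threads are an assumption standing in for whatever async mechanism the real code uses.

```rust
use std::sync::Mutex;

/// Runs `f` over `items` with at most `max_in_flight` calls active at once.
/// Results are returned in input order.
/// Hypothetical helper: a thread-based stand-in for bounded async concurrency.
pub fn bounded_parallel<T, R, F>(items: Vec<T>, max_in_flight: usize, f: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let n = items.len();
    let queue = Mutex::new(items.into_iter().enumerate());
    let results: Mutex<Vec<Option<R>>> = Mutex::new((0..n).map(|_| None).collect());

    std::thread::scope(|s| {
        // The number of workers is the in-flight limit (at least one worker
        // when there is work, never more workers than items).
        for _ in 0..max_in_flight.max(1).min(n) {
            s.spawn(|| loop {
                // Hold the queue lock only while fetching the next item.
                let next = queue.lock().unwrap().next();
                let Some((idx, item)) = next else { break };
                let out = f(item);
                results.lock().unwrap()[idx] = Some(out);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|r| r.expect("every slot is filled by a worker"))
        .collect()
}
```

With a limit of 3, as in the commit, no more than three calls to `f` run at the same time, which is what keeps a shared resource such as a connection pool from being exhausted.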
Carol (Nichols || Goulding) 723a0c659f
fix: Remove greater_than_sequence_number from IngesterQueryRequest (#3856) 2022-02-24 19:23:44 +00:00
Raphael Taylor-Davies a3a628c783
refactor: use ChunkMetadata in catalog interface (#3845) (#3858)
* refactor: use ChunkMetadata in catalog interface (#3845)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-24 18:05:07 +00:00
kodiakhq[bot] 26ac93334f
Merge pull request #3854 from influxdata/dom/ns-cache-content-metrics
feat: namespace cache table/column count metrics
2022-02-24 17:29:02 +00:00
kodiakhq[bot] a1b38b7102
Merge branch 'main' into dom/ns-cache-content-metrics 2022-02-24 16:31:09 +00:00
Luke Bond 34e06e8689
fix: compactor server stays up; removed unused delegates (#3855)
* fix: compactor server stays up; removed unused delegates

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-24 16:30:44 +00:00
Paul Dix 8571c132cc
feat: add method to get table persistence information from catalog (#3848) 2022-02-24 16:18:14 +00:00
Dom Dwyer 3d77cf5845 test: validate metrics for adding namespace 2022-02-24 16:07:02 +00:00
Marco Neumann 49d1be30e7
feat: wire up `ParquetFilePath` for NG (#3853)
It's a bit of a duck-type hack, but if we want to just use `ParquetFileChunk`
in the new architecture, we somehow need it to accept new-gen paths.
Also path handling should be somewhat centralized since
ingester/compactor/querier all need to construct them. So having a
`ParquetFilePath` that supports both path styles seems to be a
not-too-bad solution. This should obviously be cleaned up in some
not-too-distant future.
2022-02-24 16:05:38 +00:00
Luke Bond 4731913c44
feat: skeleton of querier CLI (#3843)
* feat: skeleton of querier CLI

* chore: wrap metrics in opt&arc in querier to satisfy new api

* chore: derive debug in querier handler

* chore: add join handles and their shutdown to nascent querier server

* chore: querier server http unimpl -> 404

* fix: join/shutdown fix in querier; removed unused delegates
2022-02-24 15:42:56 +00:00
Carol (Nichols || Goulding) 252ced7adf
feat: Add row count to the parquet_file record in the catalog (#3847)
Fixes #3842.
2022-02-24 15:20:50 +00:00
Dom Dwyer 0ddc35ce73 feat: instrument namespace cache contents
Adds two new metrics:

    * namespace_cache_table_count: total number of tables in cache
    * namespace_cache_column_count: total number of columns in cache

The metric decorator keeps a running total of each of the table and
column counts as namespaces are inserted into the cache, and adjusts the
value accordingly when an existing namespace is overwritten.
2022-02-24 15:11:14 +00:00
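
Note on 0ddc35ce73: the running-total bookkeeping described above might look something like the sketch below. `CountingCache` and `NamespaceSchema` are hypothetical, heavily simplified types (a schema is reduced to a map from table name to column count); they are assumptions for illustration, not the types used in the repository.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified schema: table name -> number of columns.
pub type NamespaceSchema = HashMap<String, usize>;

/// Hypothetical cache wrapper that keeps running totals of tables and columns.
#[derive(Default)]
pub struct CountingCache {
    namespaces: HashMap<String, NamespaceSchema>,
    table_count: usize,
    column_count: usize,
}

impl CountingCache {
    /// Inserts or overwrites a namespace and adjusts both totals.
    pub fn put(&mut self, name: &str, schema: NamespaceSchema) {
        let new_tables = schema.len();
        let new_columns: usize = schema.values().sum();

        // If a namespace is overwritten, remove its old contribution first.
        if let Some(old) = self.namespaces.insert(name.to_string(), schema) {
            self.table_count -= old.len();
            self.column_count -= old.values().sum::<usize>();
        }

        self.table_count += new_tables;
        self.column_count += new_columns;
    }

    pub fn table_count(&self) -> usize {
        self.table_count
    }

    pub fn column_count(&self) -> usize {
        self.column_count
    }
}
```

Keeping running totals means each insert costs work proportional to the namespace being written, rather than a scan over the whole cache whenever the metric is read.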
Dom Dwyer 4024e95ce9 refactor: borrow metric registry
There's no need for the namespace metrics to take (shared) ownership of
the metric registry, so lend it at the call site instead of cloning the
arc.
2022-02-24 15:04:49 +00:00
Dom Dwyer 77f649210d feat: inc/dec gauge metrics
Adds methods to increment, or decrement gauge metrics.
2022-02-24 15:04:49 +00:00
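
Note on 77f649210d: a gauge with increment and decrement methods could be sketched as below. `Gauge` and its method names are hypothetical and do not reflect the repository's metric API. Saturating at zero on decrement is a design choice of this sketch, not something the commit states.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical gauge: a value that can move up and down.
#[derive(Debug, Default)]
pub struct Gauge {
    value: AtomicU64,
}

impl Gauge {
    /// Overwrites the current value.
    pub fn set(&self, v: u64) {
        self.value.store(v, Ordering::Relaxed);
    }

    /// Adds `delta` to the current value.
    pub fn inc(&self, delta: u64) {
        self.value.fetch_add(delta, Ordering::Relaxed);
    }

    /// Subtracts `delta`, saturating at zero instead of wrapping around.
    pub fn dec(&self, delta: u64) {
        let _ = self
            .value
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| {
                Some(v.saturating_sub(delta))
            });
    }

    /// Reads the current value.
    pub fn fetch(&self) -> u64 {
        self.value.load(Ordering::Relaxed)
    }
}
```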
Marco Neumann d62a052394
feat: extend catalog so we can recover `ParquetChunk`s from it (#3852)
* refactor: less parquet data copying

* feat: `PartitionRepo::get_by_id`

* feat: `TableRepo::get_by_id`

* feat: `ParquetFile::file_size_bytes`

* feat: `ParquetFile::parquet_metadata`

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-24 13:16:15 +00:00
kodiakhq[bot] c8b964ebef
Merge pull request #3846 from influxdata/dom/schema-before-partition
refactor: early schema validation
2022-02-24 10:49:01 +00:00
Dom Dwyer d7eda88581 refactor: early schema validation
Changes the configuration of the router request pipeline to move schema
validation before partitioning.

This reduces the concurrency of calls into the schema validator when a
single write is split into one or more partitions, reducing contention
and cache thrashing. It also ensures we don't bother partitioning the
writes if the request will fail.
2022-02-23 18:59:14 +00:00
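
Note on d7eda88581: the effect of the reordering can be illustrated with a toy pipeline. Everything here (`Row`, `Validator`, the "empty column name" rule, and both pipeline functions) is hypothetical and only meant to show why validating first means a single validator call per write and no partitioning work for a write that fails.

```rust
use std::collections::BTreeMap;

/// Hypothetical row: a partition key plus a column name to validate.
pub struct Row {
    pub partition_key: String,
    pub column: String,
}

impl Row {
    pub fn new(partition_key: &str, column: &str) -> Self {
        Self {
            partition_key: partition_key.to_string(),
            column: column.to_string(),
        }
    }
}

/// Hypothetical validator that counts how often it is called.
#[derive(Default)]
pub struct Validator {
    pub calls: usize,
}

impl Validator {
    pub fn validate(&mut self, rows: &[Row]) -> Result<(), String> {
        self.calls += 1;
        match rows.iter().find(|r| r.column.is_empty()) {
            Some(_) => Err("empty column name".to_string()),
            None => Ok(()),
        }
    }
}

/// Groups rows by partition key.
pub fn partition(rows: Vec<Row>) -> BTreeMap<String, Vec<Row>> {
    let mut out: BTreeMap<String, Vec<Row>> = BTreeMap::new();
    for row in rows {
        out.entry(row.partition_key.clone()).or_default().push(row);
    }
    out
}

/// New order: one validation call per write; no partitioning on failure.
pub fn validate_then_partition(
    v: &mut Validator,
    rows: Vec<Row>,
) -> Result<BTreeMap<String, Vec<Row>>, String> {
    v.validate(&rows)?;
    Ok(partition(rows))
}

/// Old order: one validation call per partition.
pub fn partition_then_validate(
    v: &mut Validator,
    rows: Vec<Row>,
) -> Result<BTreeMap<String, Vec<Row>>, String> {
    let parts = partition(rows);
    for part in parts.values() {
        v.validate(part)?;
    }
    Ok(parts)
}
```

For a write that spans two partitions, the first function calls the validator once and the second calls it twice, which is the reduction in validator traffic the commit message refers to.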
Marco Neumann 9079e6ddb0
feat: backoff retries in ingester (#3841)
* feat: add `backoff` crate

* feat: backoff retries in ingester

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 17:58:16 +00:00
Raphael Taylor-Davies bc1cc8dc5c
chore: update database rebuild instructions (#3838)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 17:46:22 +00:00
kodiakhq[bot] 6d15bed6be
Merge pull request #3830 from influxdata/dom/router-traces
feat: emit trace spans for router stages
2022-02-23 17:34:28 +00:00
kodiakhq[bot] e1f54f67af
Merge branch 'main' into dom/router-traces 2022-02-23 17:23:17 +00:00
Dom Dwyer 9707d85e5e test: InstrumentationDecorator DML handler impls 2022-02-23 17:23:02 +00:00
kodiakhq[bot] c4dda758b2
Merge pull request #3837 from influxdata/dom/catalog-instrumentation
feat: catalog instrumentation
2022-02-23 16:27:15 +00:00
kodiakhq[bot] 3e69a5e1b4
Merge branch 'main' into dom/catalog-instrumentation 2022-02-23 16:14:40 +00:00
Carol (Nichols || Goulding) 71f62eee68
fix: Remove min_time and max_time from IngesterQueryRequest (#3839)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 15:46:31 +00:00