Commit Graph

22 Commits (9043966443dfa041fa503dbe1eb04fa638f7bc58)

Author SHA1 Message Date
dependabot[bot] 17af5fcbd1
chore(deps): Bump tokio-util from 0.7.0 to 0.7.1 (#4154)
* chore(deps): Bump tokio-util from 0.7.0 to 0.7.1

Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.0 to 0.7.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.0...tokio-util-0.7.1)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-29 08:39:02 +00:00
dependabot[bot] 4f9515ffba
chore(deps): Bump async-trait from 0.1.52 to 0.1.53 (#4141)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.52 to 0.1.53.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.52...0.1.53)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-28 08:55:24 +00:00
Andrew Lamb 5c69a3f43b
chore: Update deps: datafusion, arrow/arrow-flight/parquet to 11, zstd to 0.11 (#4119)
* chore: update datafusion

* chore(deps): Bump arrow from 10.0.0 to 11.0.0

Bumps [arrow](https://github.com/apache/arrow-rs) from 10.0.0 to 11.0.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/10.0.0...11.0.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore(deps): Bump arrow-flight from 10.0.0 to 11.0.0

Bumps [arrow-flight](https://github.com/apache/arrow-rs) from 10.0.0 to 11.0.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/10.0.0...11.0.0)

---
updated-dependencies:
- dependency-name: arrow-flight
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: update parquet to 11.0.0

* fix: error on create schema, test for same

* fix: upgrade zstd

* chore: Run cargo hakari tasks

* fix: fix logical merge conflict

* fix: hakari

* fix: hakari

* fix: update newly introduced dep

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-24 15:27:36 +00:00
Andrew Lamb 29b89aaca7
refactor: extract influxrpc, flight and testing gRPC out of influxdb_ioxd (#4106)
* refactor: extract grpc service implementations out of influxdb_ioxd

* chore: Run cargo hakari tasks

* refactor: rename server_common to service_common

* refactor: rename server_grpc_influxrpc to service_grpc_influxrpc

* refactor: rename server_grpc_flight to service_grpc_flight

* refactor: rename server_grpc_testing to service_grpc_testing

* fix: Cargo.toml

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-03-23 20:14:45 +00:00
Marco Neumann 55643945a1
refactor: `querier` w/o `db` (#4063)
* feat: `TombstoneRepo::list_by_table`

* feat: `ParquetFileRepo::list_by_table_not_to_delete`

* refactor: `querier` w/o `db`

Get the `querier` to work w/o relying on `db`. A few notes:

- Testing is kinda shallow, we really need to get `query_tests` working
  w/ `querier` (see #3934).
- We still run a sync loop for namespaces, tables and schemas. This will
  be a replaced by "update namespace incl. tables and schemas on demand".
  Note however that we cannot fetch single tables and schemas on demand
  at the moment, because DataFusion doesn't implement async schema
  inspection (only `scan` / "give me all the chunks" is async). I think
  that's OK for now and we can address this later.
- There is NO cache for parquet files and tombstones at the moment. For
  correctness, they need to be fetched in a single transaction (or we
  need a kinda tricky sequence number / logical clock tracking) and I am
  not sure yet how this makes sense when we have the ingester data wired
  up and predicates pushed down to the catalog (see next point). So
  let's measure first and then decide on a caching strategy for this.
- Predicates are currently NOT pushed down to the catalog. I'll need to
  figure out how to extract time range from generic DataFusion
  expressions to make that work (it's easier for InfluxRPC queries, but
  they are not tested at the moment, see first point).

Sorry that this commit is kinda huge. I initially planned to only
migrate the chunks away from `db` and leave the tables and schemas for a
follow-up PR, but the DataFusion trait structure (chunks are bound to
their tables) makes this kinda pointless.

Closes #3974.

* docs: explain what we're doing

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: mention tracking issues

* docs: explain what we're doing

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-03-21 16:58:00 +00:00
Marco Neumann 27efb66237
test: add proptest for `AddressableHeap` (#4025)
* test: add proptest for `AddressableHeap`

For #3985.

* refactor: simplify code

Co-authored-by: Edd Robinson <me@edd.io>

Co-authored-by: Edd Robinson <me@edd.io>
2022-03-14 12:38:30 +00:00
Nga Tran f03ebd79ab
refactor: move querier's test utils to a new crate to get reused by tests in other crates (#4013)
* refactor: move querier's test utils to a new crate to be able resued by tests in other crates

* chore: remove unused import
2022-03-10 18:17:58 +00:00
Andrew Lamb 2c3d30ca32
chore: Update datafusion, arrow, flight and parquet (#4000)
* chore: Update datafusion, arrow, flight and parquet

* fix: api change

* fix: fmt

* fix: update test metadata size

* fix: Update sizes in parquet test

* fix: more metadata size update
2022-03-10 12:24:47 +00:00
Marco Neumann 8d00aaba90
feat: sync chunks in querier (#3911)
* feat: `ParquetFileRepo::list_by_namespace_not_to_delete`

* feat: `ChunkAddr: Clone`

* test: ensure that querier keeps same partition objects

* test: improve `create_parquet_file` flexibility

* feat: sync chunks in querier

* test: improve `test_parquet_file`
2022-03-04 08:53:39 +00:00
Carol (Nichols || Goulding) 3f2a58b47f
refactor: pub use data_types from data_types2
So it's clearer which parts of data_types the NG design is using, and
which types can be cleaned up eventually.
2022-03-02 13:55:31 -05:00
Carol (Nichols || Goulding) 2a90841715
refactor: Move IngesterQueryRequest to data_types2
So that querier doesn't need to depend on ingester.
2022-03-02 13:52:13 -05:00
Carol (Nichols || Goulding) 8f3e44bf76
refactor: Extract a crate for shared data types in the new design 2022-03-02 12:16:15 -05:00
Carol (Nichols || Goulding) 141a6087d0
feat: Querier able to send Flight queries to Ingester
Fixes #3773.
2022-03-02 11:50:45 -05:00
Marco Neumann af57664f53
feat: wire up flight into the querier (#3889) 2022-03-02 09:20:26 +00:00
Marco Neumann 936f51013d
test: querier shutdown and background task handling (#3881)
This is similar to what we already have for the ingester.
2022-03-02 08:59:50 +00:00
Marco Neumann 6e2470bf5f
feat: create CatalogChunk in querier (#3862)
* feat: `NamespaceRepo::get_by_id`

* feat: create `CatalogChunk` in querier
2022-02-28 17:20:38 +00:00
Marco Neumann b213796c98
feat: sync namespaces in querier (#3865)
* feat: `NamespaceRepo::list`

* feat: sync namespaces in querier

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 15:01:28 +00:00
Marco Neumann f966f4c7a4
feat: create `ParquetChunk` in querier (#3857)
Adds a small adapter that is able to produce `ParquetChunk`s for NG.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 08:54:16 +00:00
Luke Bond 4731913c44
feat: skeleton of querier CLI (#3843)
* feat: skeleton of querier CLI

* chore: wrap metrics in opt&arc in querier to satisfy new api

* chore: derive debug in querier handler

* chore: add join handles and their shutdown to nascent querier server

* chore: querier server http unimpl -> 404

* fix: join/shutdown fix in querier; removed unused delegates
2022-02-24 15:42:56 +00:00
dependabot[bot] 5a79b3a68b
chore(deps): Bump arrow-flight from 9.0.2 to 9.1.0 (#3829)
Bumps [arrow-flight](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: arrow-flight
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 11:03:22 +00:00
Andrew Lamb a30803e692
chore: Update datafusion, update `arrow`/`parquet`/`arrow-flight` to 9.0 (#3733)
* chore: Update datafusion

* chore: Update arrow

* fix: missing updates

* chore: Update cargo.lock

* fix: update for smaller parquet size

* fix: update test for smaller parquet files

* test: ensure parquet_file tests write multiple row groups

* fix: update callsite

* fix: Update for tests

* fix: harkari

* fix: use IoxObjectStore::existing

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-15 12:10:24 +00:00
Carol (Nichols || Goulding) 73828323ac
feat: Ingester Flight gRPC API (#3623)
* feat: Add a way to run ingester with an in-memory catalog from the CLI

If you set the --catalog-dsn string to "mem", rather than using that as
a Postgres connection URL, create an in-memory catalog.

Planning on using this in tests, so not documenting.

* fix: Set default topic to the same value as SHARED_KAFKA_TOPIC

Namely, both should use an underscore. I don't think there's a way to
directly share these values between a constant and an annotation.

* feat: Add a flight API (handshake only) to ingester

* fix: Create partitions if using file-based write buffer

* fix: Change the server fixture to handle ingester server type

For now, the ingester doesn't implement the deployment API. Not sure if
it should or not.

* feat: Start implementing ingester do_get, namely decoding the query

Skip serialization of the predicate for the moment.

* refactor: Rename ingest protos to ingester to match crate name

* refactor: Rename QueryResults to QueryData

* feat: Move ingester flight client to new querier crate

* fix: Off by one error, different starting indexes in sequencers

* fix: Create new CLI argument to pick the catalog type

* fix: Create a CLI option to set the number of topics to auto-create in the write buffer

* fix: Check the arrow flight service's health to tell that the ingester gRPC is up

* fix: Set postgres as the default catalog type

* fix: Return an error rather than panicking if CLI args aren't right
2022-02-09 19:07:44 +00:00