Commit Graph

213 Commits (651b7a1ce64df19c730e1fd721a1dcf4ecf618cf)

Author SHA1 Message Date
Luke Bond 7c813c170a
feat: reintroduce compactor first file in partition exception (#6176)
* feat: compactor ignores max file count for first file

chore: typo in comment in compactor

* feat: restore special first file in partition compaction logic; add limit

* fix: calculation in compaction max file count

chore: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 15:58:59 +00:00
Christopher M. Wolff 6d3dfa781e
chore: marshal InfluxDbError into status details (#6161)
* chore: marshal InfluxDbError into status details

* chore: address feedback and CI issues

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-17 19:51:01 +00:00
Carol (Nichols || Goulding) 43687a86d2
fix: Remove lots of needless borrows that Clippy can now identify
Except for in generated code that we don't control.
2022-11-09 10:54:18 -05:00
Dom d9c97795fc
feat: use IDs in ingester query API (#6093)
* refactor: NS+table ID (instead of name) in querier<>ingester

* feat(ingester): use IDs for query API

Changes the ingester to utilise the ID fields (instead of names) sent
over the query wire message wrapped within the Flight API.

BREAKING: this changes the "query-ingester" CLI command arguments which
now expects the namespace & table IDs, rather than their names.

* refactor(ingester): add more query logging context

Updates the log messages during query execution to include more context
fields.

* style: remove unused import

Co-authored-by: Marco Neumann <marco@crepererum.net>
2022-11-09 11:25:13 +00:00
Carol (Nichols || Goulding) 44936f661a
feat: Use workspace dep inheritance for datafusion instead of shim crate 2022-10-26 10:33:56 -04:00
Carol (Nichols || Goulding) ba25300b01
feat: Create compactor service to list skipped compactions 2022-10-21 13:40:31 -04:00
Andrew Lamb 9134ccd6c3
chore: Update datafusion again (#5855)
* chore: Update datafusion

* chore: Updates for changes in datafusion

* chore: more updates

* fix: update doc example

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-13 19:18:57 +00:00
Carol (Nichols || Goulding) 74c9529062
fix: Rename KafkaPartition to ShardIndex 2022-08-29 14:07:18 -04:00
Carol (Nichols || Goulding) 240946d8f5
fix: Deprecate proto sequencer_id fields; add shard_id fields 2022-08-29 14:06:44 -04:00
Dom Dwyer c6d4109e07 build: generate gRPC bindings for ShardService
Builds the ShardService proto file in the generated_types package.
2022-08-24 11:39:59 +02:00
Carol (Nichols || Goulding) b982bdaf2f
fix: Derive Eq when we derive PartialEq and members can derive Eq
Allow this in generated code that we don't control, though.

Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
2022-08-11 15:04:06 -04:00
Marco Neumann 1993448abf
refactor: remove `Predicat::partition_key` (#5016)
There is no way a user can filter for partition keys (neither via
InfluxRPC nor via SQL) and the query engine doesn't use this field at
all. So let's remove it.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-01 17:17:29 +00:00
Dom Dwyer 75c425f375 refactor(schema-api): column data type enum
Previously the column data type was exposed using an internal i32 value.
This commit changes the Schema API to use a self-descriptive proto enum
for the column data type.
2022-06-27 16:14:49 +01:00
Andrew Lamb 2ec7764fdd
refactor: rename builder like predicate methods to be `with_` (#4808)
* refactor: rename builder like predicate methods to be `with_`

* fix: merge conflict

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-09 11:26:03 +00:00
Andrew Lamb afc1c12062
refactor: consolidate `PredicateBuilder` into `Predicate` (#4799)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-08 12:21:24 +00:00
Andrew Lamb dde3c3922c
refactor: use consistent spelling of serialize (#4717) 2022-05-27 14:42:59 +00:00
Marco Neumann 2029bd16ba
feat: enable debugging of failed querier->ingester requests (#4659)
* feat: enable debugging of failed querier->ingester requests

- extend `query-ingester` CLI to allow usage of predicates
- on failed requests: log all information that required for the CLI
- test the "ingester fails" scenario

* test: explain

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: move b64 pred. serde into a single crate

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-23 15:37:31 +00:00
Carol (Nichols || Goulding) 2ee4a6669a
refactor: Move the code merging write infos to generated_types to share 2022-05-11 14:07:42 -04:00
Carol (Nichols || Goulding) 26170b7a07
refactor: Move gRPC conversion code to generated_types to share 2022-05-11 14:07:12 -04:00
Carol (Nichols || Goulding) 068096e7e1
fix: Rename data_types2 to data_types 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) fb8f8d22c0
fix: Remove now-unused ServerId. Fixes #4451 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) 485d6edb8f
refactor: Move IngesterQueryRequest to generated_types 2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding) e9a42c418a
fix: Only use data_types2 in generated_types 2022-05-06 14:45:36 -04:00
Carol (Nichols || Goulding) 91961273c2
fix: Remove unused Rust code in generated_types 2022-05-06 11:50:03 -04:00
Carol (Nichols || Goulding) e6e0655b31
fix: Remove OG proto definitions
Fixes #4475.
2022-05-06 11:50:03 -04:00
Andrew Lamb 6381ea60bb
chore: port remaining read_filter influxrpc tests to NG (#4383)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-29 14:06:50 +00:00
Andrew Lamb e13d3433ae
feat: Use datafusion serialization code rather than our own copy of it (#4421)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-28 13:03:34 +00:00
Andrew Lamb 115f007317
refactor: Use DataFusion `Expr` instead of our own custom wrapper for `ValueExpr` (#4440)
* refactor: Use DataFusion `Expr` instead of custom wrapper for BinaryExprs

* fix: apply code review suggestions

* fix: more code review suggestions
2022-04-27 19:20:15 +00:00
二手掉包工程师 4b47d723b1
refactor: Rename time to iox_time (#4416)
Signed-off-by: hi-rustin <rustin.liu@gmail.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-26 00:19:59 +00:00
Andrew Lamb e3d83fe757
chore: update datafusion (#4342)
* chore: update datafusion

* fix: Update imports for change in datafusion organization
2022-04-19 13:38:12 +00:00
Paul Dix 5bf4550259
feat: add object store service to router (#4338)
Add method to catalog to get parquet file by object store id.
Add gRPC service for object store to get a file from by its uuid.
Add the object store service to router2 with object store config.
2022-04-16 17:58:31 +00:00
Paul Dix 99cbb28a89
feat: add initial catalog service to router (#4316)
Create new crate for iox_catalog_service.
Add rpc to return parquet_file records by partition id.
Add CatalogService to router2.

The catalog service will be added to over time to provide access to the catalog over gRPC.
2022-04-14 17:39:18 +00:00
Marco Neumann 83f77712b1
refactor: querier<>ingester flight protocol adjustments (#4286)
* refactor: querier<>ingester flight protocol adjustments

This makes a few adjustments to the querier<>ingester flight protocol.

Query Scope
===========
The querier will request data for ALL sequencer IDs for now. There is
no reason to have a request per sequencer ID. We can add a range/set
filter later if we want, but this is not required for now.

Partition-level
===============
The only time when the querier cares about sequencer IDs (i.e. sharding)
at all is when it selects which ingesters to ask for unpersisted data
(this is currently not implemented, it just asks all ingesters).
Afterwards the querier only cares about partitions (which are bound to
specific sequencers anyways) because this is the level where parquet
file persistence and compaction as well as deduplication happen. So we
make partitions a first-class citizen in the ingester response.

Metadata VS RecordBatches
=========================
The global app-metadata will list all partitions and their max
persisted parquet files and tombstones (theoretically tombstones are at
table-level, but the ingester could in the future break them down to the
partition-level). Then it receives a stream of record batches. Each
record batch is tagged (via key-value metadata in its schema) so it can
be assigned to a partition. At the moment the ingester returns 0 or 1
batches per unpersisted partition (0 in case we've filtered out all the
data via the predicate), but in the future it is free to return multiple
batches. This setup gives the ingester more freedom over memory
management and (potentially parallel) query processing, while at the
same time keeps the set of duplicated information minimal and allows
easy extensions (since the global metadata is a full-blown protobuf
message).

Querier
=======
At the moment the querier ignores all the metdata. Follow-up PRs will
change that.

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: make code clearer

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-04-12 16:48:40 +00:00
Marco Neumann 380cd9bbff
refactor: use a single flight client implementation (#4273)
"end-user -> querier" and "querier -> ingester" should use a single
Flight client implementation. The difference is just the request and
response metadata.

This changes our default Flight client to use protobuf instead of JSON
for the ticket format.
2022-04-12 09:08:25 +00:00
Andrew Lamb 833c10c083
feat: return write_token from HTTP writes to router2 (#4202)
* feat: return write_token from HTTP writes to router2

* fix: Update router2/src/dml_handlers/instrumentation.rs

Co-authored-by: Dom <dom@itsallbroken.com>

* refactor: Use WriteSummary::default more vigorously

* fix: fix typo and add links to follow on issues

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-02 10:34:51 +00:00
Andrew Lamb a1df864283
feat: Support 'SHOW NAMESPACES' in sql repl (#4164)
* feat: Support `SHOW NAMESPACES` in sql repl

* feat: add basic support to clients

* fix: add get_namespaces service test

* fix: proper error handling

* test: end to end test for namespace client

* refactor: Use QuerierDatabase rather than Catalog

* refactor: remove unused function
2022-03-31 12:57:33 +00:00
Andrew Lamb 58c630d709
chore: Update datafusion (#4133)
* chore: Update datafusion

* fix: typo

* fix: Update explain plan output

* fix: update Cargo.locl

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-25 15:08:39 +00:00
Luke Bond b098828c97
feat: schema grpc server & proto in router2 (#4081)
* feat: schema grpc server & proto in router2

* chore: comments in schema proto

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-22 11:27:20 +00:00
Andrew Lamb 8f1938a482
chore: Update datafusion (#4022)
* chore: Update datafusion

* chore: update for change in Expr

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-14 17:24:00 +00:00
Carol (Nichols || Goulding) 2a90841715
refactor: Move IngesterQueryRequest to data_types2
So that querier doesn't need to depend on ingester.
2022-03-02 13:52:13 -05:00
Raphael Taylor-Davies 26fd5273f0
feat: static database configuration (#2436) (#3732)
* feat: static database configuration (#2436)

* chore: fmt

* feat: don't base64 encode UUIDs in ServerConfigFile

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-14 19:42:49 +00:00
Raphael Taylor-Davies 866777ecd2
feat: static router configuration (#2436) (#3725)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-11 14:09:37 +00:00
Raphael Taylor-Davies b1190262b7
feat: restartable `Database` (#3368) (#3711)
* feat: restartable `Database` (#3368)

* chore: fmt

* fix: wipe_preserved_catalog

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-10 18:32:05 +00:00
Carol (Nichols || Goulding) 73828323ac
feat: Ingester Flight gRPC API (#3623)
* feat: Add a way to run ingester with an in-memory catalog from the CLI

If you set the --catalog-dsn string to "mem", rather than using that as
a Postgres connection URL, create an in-memory catalog.

Planning on using this in tests, so not documenting.

* fix: Set default topic to the same value as SHARED_KAFKA_TOPIC

Namely, both should use an underscore. I don't think there's a way to
directly share these values between a constant and an annotation.

* feat: Add a flight API (handshake only) to ingester

* fix: Create partitions if using file-based write buffer

* fix: Change the server fixture to handle ingester server type

For now, the ingester doesn't implement the deployment API. Not sure if
it should or not.

* feat: Start implementing ingester do_get, namely decoding the query

Skip serialization of the predicate for the moment.

* refactor: Rename ingest protos to ingester to match crate name

* refactor: Rename QueryResults to QueryData

* feat: Move ingester flight client to new querier crate

* fix: Off by one error, different starting indexes in sequencers

* fix: Create new CLI argument to pick the catalog type

* fix: Create a CLI option to set the number of topics to auto-create in the write buffer

* fix: Check the arrow flight service's health to tell that the ingester gRPC is up

* fix: Set postgres as the default catalog type

* fix: Return an error rather than panicking if CLI args aren't right
2022-02-09 19:07:44 +00:00
Carol (Nichols || Goulding) dd9620da0c
feat: Create a new proto definition for the new design's IoxMetadata 2022-01-31 10:36:32 -05:00
Andrew Lamb 9c19cd6cc4
fix: clamp start/end of TimestampRange to min/max valid timestamp values (#3487)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-20 16:08:00 +00:00
Andrew Lamb 758b65dd29
feat: Add database initialization state and errors to CLI and remove list_databases_detailed gRPC (#3377)
* feat: Add database initialization state and errors to CLI:

* fix: do not use optional in protobuf

* fix: clippy

* fix: correct check I broke appeasing clippy
2021-12-15 12:18:41 +00:00
Nga Tran efbfbb1a0b feat: compact all object store chunks of a given partition 2021-12-08 16:06:03 -05:00
Andrew Lamb c6a3765d76
feat: Add force flag to RebuildCatalog (#3292)
* feat: Add force flag to RebuildCatalog

* fix: small cleanups

* docs: Update comments and add WARNING
2021-12-08 15:36:07 +00:00
Raphael Taylor-Davies b0e01edb86
fix: include pbjson code for storage package (#3334) 2021-12-08 11:58:09 +00:00