Commit Graph

883 Commits (889804e46a26c52bf4f2cffe01c8b2c2c505e338)

Author SHA1 Message Date
Luke Bond 54890525da
feat: gitops adapter ci build (#3692)
* chore: remove references to perf_image in CI

* chore: adding gitops adapter image build in CI

* chore: gitops adapter bin now same as dir & package so docker build works

* fix: circle config package change after renaming gitops adapter package
2022-02-10 11:23:06 +00:00
dependabot[bot] ad60dc6949
chore(deps): bump httparse from 1.5.1 to 1.6.0 (#3708)
Bumps [httparse](https://github.com/seanmonstar/httparse) from 1.5.1 to 1.6.0.
- [Release notes](https://github.com/seanmonstar/httparse/releases)
- [Commits](https://github.com/seanmonstar/httparse/compare/v1.5.1...v1.6.0)

---
updated-dependencies:
- dependency-name: httparse
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-10 09:54:42 +00:00
dependabot[bot] f23574bc5f
chore(deps): bump futures from 0.3.19 to 0.3.21 (#3706)
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.19 to 0.3.21.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.19...0.3.21)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-10 09:19:19 +00:00
dependabot[bot] 6a002f09e1
chore(deps): bump clap from 3.0.13 to 3.0.14 (#3698)
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.13 to 3.0.14.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.13...v3.0.14)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-09 21:21:24 +00:00
dependabot[bot] 85d44552fa
chore(deps): bump cloud-storage from 0.10.3 to 0.11.0 (#3697)
Bumps [cloud-storage](https://github.com/ThouCheese/cloud-storage-rs) from 0.10.3 to 0.11.0.
- [Release notes](https://github.com/ThouCheese/cloud-storage-rs/releases)
- [Changelog](https://github.com/ThouCheese/cloud-storage-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/ThouCheese/cloud-storage-rs/commits)

---
updated-dependencies:
- dependency-name: cloud-storage
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-02-09 20:03:44 +00:00
Andrew Lamb d9f331ba2a
chore: update datafusion, stop repartitioning so aggressively (#3633)
* chore: update datafusion

* fix: Update to use new datafusion api

* chore: update expected plans

* fix: support zero output partitions

* fix: update test

* fix: Update for new DataFusion API

* fix: newly added system table

* fix: update cargo lock
2022-02-09 19:53:41 +00:00
dependabot[bot] 803296e8cf
chore(deps): bump sqlparser from 0.13.0 to 0.14.0 (#3696)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.13.0 to 0.14.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.13.0...v0.14.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-02-09 19:40:03 +00:00
Carol (Nichols || Goulding) 73828323ac
feat: Ingester Flight gRPC API (#3623)
* feat: Add a way to run ingester with an in-memory catalog from the CLI

If you set the --catalog-dsn string to "mem", rather than using that as
a Postgres connection URL, create an in-memory catalog.

Planning on using this in tests, so not documenting.

* fix: Set default topic to the same value as SHARED_KAFKA_TOPIC

Namely, both should use an underscore. I don't think there's a way to
directly share these values between a constant and an annotation.

* feat: Add a flight API (handshake only) to ingester

* fix: Create partitions if using file-based write buffer

* fix: Change the server fixture to handle ingester server type

For now, the ingester doesn't implement the deployment API. Not sure if
it should or not.

* feat: Start implementing ingester do_get, namely decoding the query

Skip serialization of the predicate for the moment.

* refactor: Rename ingest protos to ingester to match crate name

* refactor: Rename QueryResults to QueryData

* feat: Move ingester flight client to new querier crate

* fix: Off by one error, different starting indexes in sequencers

* fix: Create new CLI argument to pick the catalog type

* fix: Create a CLI option to set the number of topics to auto-create in the write buffer

* fix: Check the arrow flight service's health to tell that the ingester gRPC is up

* fix: Set postgres as the default catalog type

* fix: Return an error rather than panicking if CLI args aren't right
2022-02-09 19:07:44 +00:00
dependabot[bot] 7d11485c77
chore(deps): bump tracing from 0.1.29 to 0.1.30 (#3677)
Bumps [tracing](https://github.com/tokio-rs/tracing) from 0.1.29 to 0.1.30.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-0.1.29...tracing-0.1.30)

---
updated-dependencies:
- dependency-name: tracing
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 16:44:16 +00:00
Paul Dix 59b2141c0b
feat: Add lifecycle manager to ingester (#3645)
This adds the lifecycle manager to the ingester. It will trigger based on a threshold for max partition size or age or based on keeping total memory under a certain threshold.

It defines a new interface for a persister, which is stubbed out for IngesterData. I'm not sure yet how persistence errors should be handled. The assumption here is that the persister continues to retry persistence forever until it succeeds.

There is one scenario I can think of that may cause this lifecycle manager problems. If a single partition is very high throughput, it could cause things to back up as persistence is not parallelized within a single partition. Any given partition can currently only run one persistence operation at a time. We can address this later.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 15:23:40 +00:00
dependabot[bot] 9369152096
chore(deps): bump tracing-core from 0.1.21 to 0.1.22 (#3675)
Bumps [tracing-core](https://github.com/tokio-rs/tracing) from 0.1.21 to 0.1.22.
- [Release notes](https://github.com/tokio-rs/tracing/releases)
- [Commits](https://github.com/tokio-rs/tracing/compare/tracing-core-0.1.21...tracing-core-0.1.22)

---
updated-dependencies:
- dependency-name: tracing-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 15:12:39 +00:00
dependabot[bot] ba87ae2918
chore(deps): bump crc32fast from 1.3.1 to 1.3.2 (#3668)
Bumps [crc32fast](https://github.com/srijs/rust-crc32fast) from 1.3.1 to 1.3.2.
- [Release notes](https://github.com/srijs/rust-crc32fast/releases)
- [Commits](https://github.com/srijs/rust-crc32fast/compare/v1.3.1...v1.3.2)

---
updated-dependencies:
- dependency-name: crc32fast
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-08 14:48:31 +00:00
dependabot[bot] 3c9a27b31b
chore(deps): bump futures-test from 0.3.19 to 0.3.21 (#3667)
Bumps [futures-test](https://github.com/rust-lang/futures-rs) from 0.3.19 to 0.3.21.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.19...0.3.21)

---
updated-dependencies:
- dependency-name: futures-test
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-08 14:35:34 +00:00
dependabot[bot] 9a874186d5
chore(deps): bump libc from 0.2.116 to 0.2.117 (#3666)
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.116 to 0.2.117.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.116...0.2.117)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 13:59:02 +00:00
Marco Neumann 5de4d6203f
refactor: catalog transaction (#3660)
* refactor: catalog Unit of Work (= transaction)

Setup an inteface to handle Units of Work within our catalog. Previously
both the Postgres and the in-mem backend used "mini-transactions on
demand". Now the caller has a clear way to establish boundaries and
gets read and write isolation. A single `Arc<dyn Catalog>` can create as
many `Box<dyn UnitOfWork>` as you like, but note that depending on the
backend you may not scale infinitely (postgres will likely impose
certain limits and the in-mem backend limits concurrency to 1 to keep
things simple).

* docs: improve wording

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: rename Unit of Work to Transaction

* test: improve `test_txn_isolation`

* feat: clearify transaction drop semantics

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 13:38:33 +00:00
Luke Bond de2a013786
feat: gitops adapter (#3656)
* feat: scaffolding of gitops adapter bin crate

* chore: refactor gitops adapter; calls CLI now; status update fixes

* feat: gitops adapter now calls out to CLI once per topic; improved tests

* chore: add mock failure script for gitops adapter

* chore: update workspace-hack

* chore: refactor away unecessary to_string in gitops syncer

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 13:27:36 +00:00
Markus Westerlind 0bd7941a18
fix(REPL): Don't buffer lines until a trailing semicolon is found and add history hinting (#3630)
* fix(REPL): Don't buffer lines until a trailing semicolon is found

The repl would silently buffer all lines until a trailing semicolon were found which
resulted in some very confusing error messages as I would input invalid commands followed
by a command I thought were valid, except I'd still get an error due to the previous command being buffered.

This uses rustyline's helper feature to detect incomplete input (no trailing semicolon) and makes
it accept multiline input until the input is completed.

I also included some of rustyline's default hint and highlighting while I was at it.

* chore: cargo clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 17:11:01 +00:00
Marco Neumann 50cff27b01
chore: remove rdkafka dependency (#3625)
All features are now covered by rskafka. This also removes the need to
specify a server ID for write buffer consumers. This was only used for
rdkafka since there we needed to specify a consumer group, even though
we did not use any transactions.
2022-02-03 13:33:56 +00:00
Marco Neumann bc4b7f8a5b
test: ensure that rskafka and rdkakfa work together (#3624)
* chore: upgrade rskafka + enable snappy support

* test: ensure that rskafka and rdkakfa work together

Before removing rdkafka ensure that:

- rskafka can consume existing messages produced by rdkafka so we do not
  need to drain existing topics
- rdkafka can consume new messages produced by rskafka so we can roll
  back

I ran the whole `write_buffer` test suite (including the newly added
tests) using Apache Kafka as well as Redpanda.

* test: ensure we handle consumer offset in error case correctly

* docs: explain test setup

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 12:52:42 +00:00
Paul Dix ce46bbaada
feat: wire up the write buffer to the ingester process (#3533)
This adds the scaffolding for the ingester server to consume data from Kafka. This ingests data in an in memory structure while creating records in the catalog for any partitions that don't yet exist.

I've removed catalog_update.rs in ingester for now. That was mostly a placeholder and will be going in a combination of handler.rs and data.rs on my next PR which will have some primitive lifecycle wired up.

There's one ugly bit here where the DML write is cloned because it's getting borrowed to output spans and metrics. I'll need to follow up with a refactor to make it so that the DML write's tables can be consumed without it gumming up the metrics stuff.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 11:47:28 +00:00
kodiakhq[bot] 9079baf19a
Merge branch 'main' into dom/schema-validation 2022-02-02 16:27:35 +00:00
Andrew Lamb 030a2cb4c1
chore: Update datafusion (#3613)
* chore: Update datafusion

* fix: update for latest DF API

* fix: another API change

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-02 16:27:11 +00:00
Dom Dwyer 4744c5804e refactor: remove Dashmap
Swap Dashmap for a regular RwLock<HashMap<..,>> due to soundness issues:

    https://rustsec.org/advisories/RUSTSEC-2022-0002
2022-02-02 14:04:53 +00:00
Dom Dwyer 39d489d9e7 refactor: enable schema validation
Adds the SchemaValidator to the DML handler stack - this adds it into
the request path in router2.
2022-02-02 14:04:14 +00:00
Dom Dwyer 6598023726 feat: cache NamespaceSchema in validator
Adds an in-memory cache of table schemas to the SchemaValidator DML
handler.

The cache pulls from the global catalog when observing a column for the
first time, and pushes the column type to set it for subsequent requests
if it does not exist (this pull & push is done by atomically by the
catalog in an "upsert" call).

The in-memory cache is sharded by namespace, with each shard guarded by
an individual lock to minimise contention between readers (the expected
average case) and writers (only when adding new columns/tables).

Relies on the catalog to serialise new column creation and validate
parallel creation requests.
2022-02-02 13:04:53 +00:00
Dom Dwyer c81f207298 feat: schema validation
Implements a write schema validation DML handler, denying requests that
conflict with the schema within the global catalog. Additive schema
changes are accepted, incrementally updating the global catalog schema.

Deletes are passed through unchanged and unvalidated.
2022-02-02 13:04:53 +00:00
Edd Robinson 5441682207 feat: add support for parsing predicate 2022-02-02 11:02:33 +00:00
Edd Robinson c1f5994660
refactor: move sql parsing -> RPC predicate into own crate (#3604)
* chore: create crate

* refactor: move module to new crate
2022-02-02 10:41:57 +00:00
Edd Robinson fa546047fb
refactor: update dep with API change (#3596) 2022-02-01 14:53:10 +00:00
Edd Robinson 443fd00c1b
Merge branch 'main' into er/feat/parse_sql_rpc 2022-02-01 13:55:08 +00:00
Marco Neumann 0a2cb36ddf
fix: workspace-hack (#3593) 2022-02-01 12:32:58 +00:00
Marco Neumann 22778a3a80
chore: upgrade rskafka and parking_lot (#3592) 2022-02-01 11:50:42 +00:00
Edd Robinson 175732c3ca feat: basic sqlparser -> RPCNode 2022-02-01 10:26:03 +00:00
Marco Neumann b326b62b44
feat: buffer writes when writing to RSKafka (#3520) 2022-02-01 10:07:52 +00:00
kodiakhq[bot] 8bef2c105c
Merge branch 'main' into cn/persist 2022-01-31 18:50:45 +00:00
Andrew Lamb 7b96a37165
chore: Update datafusion (#3586)
* chore: update DataFusion to f849968057ddddccc9aa19915ef3ea56bf14d80d

* fix: reduce overhead of creating physical expressions

* chore: use MemTrackingMetrics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-31 18:15:28 +00:00
Carol (Nichols || Goulding) bf89162fa5
refactor: Move IoxMetadata to parquet_file 2022-01-31 10:36:33 -05:00
Carol (Nichols || Goulding) dd9620da0c
feat: Create a new proto definition for the new design's IoxMetadata 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) 5e0e0d8aa7
feat: Write parquet to object storage in a similar way as parquet_file::Storage 2022-01-31 10:36:32 -05:00
Carol (Nichols || Goulding) c633c9bc5c
feat: Wire object store into ingester persistence 2022-01-31 10:36:30 -05:00
Raphael Taylor-Davies 4101d16f71
chore: feature flag consistency (#3574)
* chore: feature flag consistency

* chore: add aarch64-apple-darwin to hakari

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-28 16:38:59 +00:00
Nga Tran 8735ede74f
feat: IoxMetadata for parquet file (#3547)
* feat: IoxMetadata for parquet file

* fix: typos

* refactor: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-28 14:41:59 +00:00
Dom 1669acc0c2
build: update tokio (#3566)
Release notes:
    https://github.com/tokio-rs/tokio/releases/tag/tokio-1.16.0
2022-01-28 10:36:12 +00:00
Andrew Lamb b486258dfb
chore: run cargo update (#3559)
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 21:05:31 +00:00
Dom 9201023ea4
feat: schema validation for MutableBuffer instances (#3554)
* refactor: Debug bounds on Catalog trait

* feat: validate MutableBatch schema

Changes the schema validation code to validate MutableBatch instances
(coming from a pre-parsed LP write, and non-LP-based writes) instead of
parsed LP lines.

* refactor: Send bound on boxed errors

* refactor: clippy

Allow assert_eq!(bool, bool) for readability.

* refactor: no PartialEq<MB Column> for ColumnSchema

Remove the PartialEq<mutable_buffer::Column> for ColumnSchema - it's
definitely more readable as a method call.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 20:55:18 +00:00
Luke Bond 4a96e52290
feat: router2 sharder benchmarking (#3558)
* feat: benchmarking the router2 sharder

* chore: added throughput to sharder benchmarks; vary num buckets
2022-01-27 18:09:16 +00:00
Andrew Lamb 2062267d0f
chore: Update hashbrown (#3551)
* chore: Update hashbrown

* fix: hakari

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 15:34:10 +00:00
Dom ce568ab447
build(iox_catalog): remove unused dependencies (#3552)
* build: don't pull in all of tokio

We already specify the tokio features we need so "full" (all features)
is not necessary.

* build: remove chrono dependency

Appears unused.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 15:14:42 +00:00
Raphael Taylor-Davies 21c1824a7a
refactor: remove table_names from Predicate (#3545)
* refactor: remove table_names from Predicate

* chore: fix benchmarks

* chore: review feedback

Co-authored-by: Edd Robinson <me@edd.io>

* chore: review feedback

* chore: replace Default::default with InfluxRpcPredicate::default()

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:44:49 +00:00
Andrew Lamb 5488c257d1
chore: Update datafusion, upgrade to arrow/parqet/arrow-flight 8.0.0 (#3517)
* chore: Update datafusion

* chore: update to arrow 8

* fix: update to use new DataFusion APIs

* fix: update case for sortedness

* fix: cargo hakari
2022-01-27 13:33:27 +00:00