Commit Graph

1278 Commits (7daf680e76ac035342094c4bc00c890ffbcdfa54)

Author SHA1 Message Date
Andrew Lamb 3592aa52d8
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `15.0.0` (#4743)
* chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `15.0.0`

* chore: Update APIs

* chore: Run cargo hakari tasks

* feat: normalize parquet file metadata

* chore: update size tests

* chore: add docs on metadata stripping

* chore: TEMP UPDATE TO DF BRANCH

* chore: Update for new API

* fix: Update to latest DF

* fix: cargo hakari

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
2022-06-03 10:32:26 +00:00
dependabot[bot] 9a21292db8
chore(deps): Bump async-trait from 0.1.53 to 0.1.56 (#4774)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.53 to 0.1.56.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.53...0.1.56)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-03 09:10:40 +00:00
dependabot[bot] 73a7e6f0a5
chore(deps): Bump syn from 1.0.95 to 1.0.96 (#4773)
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.95 to 1.0.96.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.95...1.0.96)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-03 08:02:56 +00:00
Marco Neumann 9e30a3eb29
refactor: rework querier concurrency limiting (#4760)
* refactor: rework querier concurrency limiting

With #4752 we introduced a concurrency limit into the querier. It works
by drawing permits from a central semaphore whenever we create a
`QuerierNamespace`. This however only limits concurrency during query
planning and not query execution, because the objects contained within
the plan (chunks and some metadata) neither reference the permit nor the
`QuerierNamespace`.

Now one approach to fix that would be to wire up the permit all the down
into all the query-related data structures. This however is very fiddly
and potentially will get lost at some point, because as soon as we
transform these data structures -- e.g. into streams -- the permit might
get lost again. This will be potentially query-dependent and very hard
to debug.

So instead we reverse the approach and track the permits at the upper
layer of the stack: the gRPC service entry points. There we also need to
be careful -- e.g. when we return streams to tonic -- but it's way
easier to review that then the deeply nested object hierarchy that is
involved with queries. Also the separation of concerns is a bit clearer,
because why would a "chunk" care about the "query concurrency" as a
whole.

* refactor: improve gRPC permit keeping and prepare tests
2022-06-02 09:49:58 +00:00
dependabot[bot] e638385782
chore(deps): Bump pbjson-types from 0.3.1 to 0.3.2 (#4750)
Bumps [pbjson-types](https://github.com/influxdata/pbjson) from 0.3.1 to 0.3.2.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/compare/0.3.1...0.3.2)

---
updated-dependencies:
- dependency-name: pbjson-types
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-01 09:02:37 +00:00
dependabot[bot] 21ec05a6ee
chore(deps): Bump pbjson from 0.3.1 to 0.3.2 (#4749)
Bumps [pbjson](https://github.com/influxdata/pbjson) from 0.3.1 to 0.3.2.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/compare/0.3.1...0.3.2)

---
updated-dependencies:
- dependency-name: pbjson
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-01 08:12:46 +00:00
dependabot[bot] 043dee43c8
chore(deps): Bump pbjson-build from 0.3.1 to 0.3.2 (#4748)
Bumps [pbjson-build](https://github.com/influxdata/pbjson) from 0.3.1 to 0.3.2.
- [Release notes](https://github.com/influxdata/pbjson/releases)
- [Commits](https://github.com/influxdata/pbjson/compare/0.3.1...0.3.2)

---
updated-dependencies:
- dependency-name: pbjson-build
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-01 07:54:13 +00:00
Marco Neumann c91dbe062e
test: "optimize" ingesterrecord batches in query tests (#4700)
* test: "optimize" ingesterrecord batches in query tests

It seems that I had the right idea in #4656 but wasn't able to trigger
https://github.com/influxdata/conductor/issues/955 because the query
tests do not "optimize" the record batches in the same way the actual
gRPC implementation does. If we apply the same transformation we indeed
end up with the same error.

* fix: all batches within the ingester flight response must have same schema

* refactor: simplify and reuse code

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-01 07:37:11 +00:00
Marco Neumann 988bd38e93
refactor: remove unused code (#4742) 2022-05-31 11:36:02 +00:00
Marco Neumann 5a95da7327
refactor: do NOT use ANY file IO for parquet reading (#4741) 2022-05-31 11:18:24 +00:00
dependabot[bot] 01fc550034
chore(deps): Bump parking_lot from 0.12.0 to 0.12.1 (#4732)
Bumps [parking_lot](https://github.com/Amanieu/parking_lot) from 0.12.0 to 0.12.1.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Amanieu/parking_lot/compare/0.12.0...0.12.1)

---
updated-dependencies:
- dependency-name: parking_lot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-31 08:01:58 +00:00
dependabot[bot] 5b5f0efef5
chore(deps): Bump rayon from 1.5.2 to 1.5.3 (#4731)
Bumps [rayon](https://github.com/rayon-rs/rayon) from 1.5.2 to 1.5.3.
- [Release notes](https://github.com/rayon-rs/rayon/releases)
- [Changelog](https://github.com/rayon-rs/rayon/blob/master/RELEASES.md)
- [Commits](https://github.com/rayon-rs/rayon/compare/v1.5.2...v1.5.3)

---
updated-dependencies:
- dependency-name: rayon
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-31 07:50:42 +00:00
dependabot[bot] 642aef103d
chore(deps): Bump comfy-table from 5.0.1 to 6.0.0 (#4730)
Bumps [comfy-table](https://github.com/nukesor/comfy-table) from 5.0.1 to 6.0.0.
- [Release notes](https://github.com/nukesor/comfy-table/releases)
- [Changelog](https://github.com/Nukesor/comfy-table/blob/main/CHANGELOG.md)
- [Commits](https://github.com/nukesor/comfy-table/compare/v5.0.1...v6.0.0)

---
updated-dependencies:
- dependency-name: comfy-table
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-31 07:10:37 +00:00
Marco Neumann 79c054ffc9
fix: do NOT block in parquet file IO (#4727)
* fix: do NOT block in parquet file IO

I think for historical reason we were using blocking IO to read parquet
files. With the current streaming `SendableRecordStream` approach this
is technically NOT required anymore.

Now one might think that the sync-async dance that we did is kinda
harmless, but looking at our producition querier I think it is really
bad. The querier seems to be stuck but looking at `strace` and other
health signal it seems it is not entirely dead. Looking at GDB
backtraces it seems that nearly all threads are busy in
`download_and_scan_parquet`. Looking at the tokio docs
(<https://docs.rs/tokio/1.18.2/tokio/task/fn.spawn_blocking.html>)
for `spawn_blocking` (which is used to start the sync download) this
makes sense: tokio only starts replacement threads for the current
runtime thread (which calls `spawn_blocking`) if this does NOT exceed the
runtime thread limit. However we set the runtime thread limit to the
number of CPU cores available to IOx, so this is a limiting factor. This
means that there are only a few threads left to do actual work (I've
seen postgres data flowing back and forth for example) but tokio is not
able to use its full potential anymore. This is esp. bad because the
sync code in `download_and_scan_parquet` then uses `futures` `block_on`
functionality to call back into async code, so it waits for tokio
itself.

The change is rather simple: just use async task spawns.

* fix: use async IO to write stream to temp file

* fix: do not block tokio thread during parquet file reading

* refactor: ensure parquet IO tasks are cancelled if they are not needed anymore

There is no REAL way to cancel sync tasks, but at least we can try our
best.
2022-05-30 13:32:20 +00:00
dependabot[bot] 73168f7989
chore(deps): Bump flate2 from 1.0.23 to 1.0.24 (#4726)
Bumps [flate2](https://github.com/rust-lang/flate2-rs) from 1.0.23 to 1.0.24.
- [Release notes](https://github.com/rust-lang/flate2-rs/releases)
- [Commits](https://github.com/rust-lang/flate2-rs/commits)

---
updated-dependencies:
- dependency-name: flate2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-30 08:26:12 +00:00
dependabot[bot] 29069be7d4
chore(deps): Bump hyper from 0.14.18 to 0.14.19 (#4725)
Bumps [hyper](https://github.com/hyperium/hyper) from 0.14.18 to 0.14.19.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.18...v0.14.19)

---
updated-dependencies:
- dependency-name: hyper
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-30 07:58:12 +00:00
dependabot[bot] 7d4670e171
chore(deps): Bump indexmap from 1.8.1 to 1.8.2 (#4724)
Bumps [indexmap](https://github.com/bluss/indexmap) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Changelog](https://github.com/bluss/indexmap/blob/1.8.2/RELEASES.rst)
- [Commits](https://github.com/bluss/indexmap/compare/1.8.1...1.8.2)

---
updated-dependencies:
- dependency-name: indexmap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-30 07:47:05 +00:00
Andrew Lamb cddd6d9b6d
chore: Update datafusion (#4723)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-28 19:00:54 +00:00
Carol (Nichols || Goulding) 25b8260b72
feat: Implement ReadBufferCache::new 2022-05-25 17:19:10 -04:00
Marco Neumann 31d1b37d73
refactor: de-duplicate low-level arrow code (#4697)
It seems that during prototyping NG we've copied low level code (w/o
tests!) and never cleaned up. Let's not have this functionality twice.
2022-05-25 16:24:28 +00:00
Marko Mikulicic cdbe546e50
fix: return gRPC error on panic (#4686) 2022-05-25 07:06:25 +00:00
dependabot[bot] 24ee251080
chore(deps): Bump prost from 0.10.3 to 0.10.4 (#4688)
Bumps [prost](https://github.com/tokio-rs/prost) from 0.10.3 to 0.10.4.
- [Release notes](https://github.com/tokio-rs/prost/releases)
- [Commits](https://github.com/tokio-rs/prost/compare/v0.10.3...v0.10.4)

---
updated-dependencies:
- dependency-name: prost
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-25 06:07:05 +00:00
dependabot[bot] a8d3fe619c
chore(deps): Bump prost-build from 0.10.3 to 0.10.4 (#4687)
Bumps [prost-build](https://github.com/tokio-rs/prost) from 0.10.3 to 0.10.4.
- [Release notes](https://github.com/tokio-rs/prost/releases)
- [Commits](https://github.com/tokio-rs/prost/compare/v0.10.3...v0.10.4)

---
updated-dependencies:
- dependency-name: prost-build
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-25 06:00:27 +00:00
Andrew Lamb 95e6a8ed46
chore: Update datafusion (again) (#4679)
* chore: Update datafusion deps

* fix: fix for changes in ScalarValue

* fix: fix for using TableSource rather than TableProvider

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-24 15:54:39 +00:00
Luke Bond b76a0080d5
chore: remove unused iox_gitops_adapter (#4675)
* chore: remove unused iox_gitops_adapter

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-24 10:28:43 +00:00
dependabot[bot] ca49820a0f
chore(deps): Bump console-subscriber from 0.1.5 to 0.1.6 (#4670)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.5 to 0.1.6.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/console-subscriber-v0.1.5...console-subscriber-v0.1.6)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:24:12 +00:00
dependabot[bot] 76f7043417
chore(deps): Bump once_cell from 1.11.0 to 1.12.0 (#4666)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.11.0...v1.12.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:14:03 +00:00
Marco Neumann 2029bd16ba
feat: enable debugging of failed querier->ingester requests (#4659)
* feat: enable debugging of failed querier->ingester requests

- extend `query-ingester` CLI to allow usage of predicates
- on failed requests: log all information that required for the CLI
- test the "ingester fails" scenario

* test: explain

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: move b64 pred. serde into a single crate

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-23 15:37:31 +00:00
Dom f0d0f1ba0c
Merge branch 'main' into dom/codec-object-store 2022-05-23 15:39:54 +01:00
dependabot[bot] 5c033b462e
chore(deps): Bump regex from 1.5.5 to 1.5.6 (#4655)
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.5 to 1.5.6.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.5...1.5.6)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-23 08:39:01 +00:00
dependabot[bot] 292f71759e
chore(deps): Bump http-body from 0.4.4 to 0.4.5 (#4654)
Bumps [http-body](https://github.com/hyperium/http-body) from 0.4.4 to 0.4.5.
- [Release notes](https://github.com/hyperium/http-body/releases)
- [Changelog](https://github.com/hyperium/http-body/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http-body/compare/v0.4.4...v0.4.5)

---
updated-dependencies:
- dependency-name: http-body
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-23 08:30:49 +00:00
dependabot[bot] 1bc02b1487
chore(deps): Bump regex-syntax from 0.6.25 to 0.6.26 (#4653)
Bumps [regex-syntax](https://github.com/rust-lang/regex) from 0.6.25 to 0.6.26.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/commits)

---
updated-dependencies:
- dependency-name: regex-syntax
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-23 08:21:39 +00:00
Dom Dwyer b9a745d42d feat: RecordBatch stream to Parquet file upload
Implements an upload() method on the ParquetStorage type, consuming a
stream of RecordBatch, serialising the Parquet file, and uploading the
result to object storage. Returns the IOx-specific file metadata.

Currently while the upload() method accepts a stream of RecordBatch, the
actual resulting Parquet file is buffered in memory before uploading to
object store, due to lack of streaming upload functionality in the
ObjectStore abstraction - this isn't the end of the world, as the files
tend to be relatively small with our current usage.

This impl should be easily modified to be fully streaming once streaming
object store puts are implemented:

    https://github.com/influxdata/object_store_rs/issues/9
2022-05-20 15:17:40 +01:00
Dom Dwyer 70856a645f feat: streaming RecordBatch -> parquet encoding
Implements a streaming RecordBatch to Parquet file serialiser.

This impl automatically discovers the schema of the RecordBatch stream,
and accepts &mut destination types (internalising the handle
cloning/etc) to simplify caller usage.

This encoder returns the resulting FileMetaData to allow callers to
inspect the resulting metadata without reading back the file.

Currently unused / not yet plumbed in.
2022-05-20 15:09:26 +01:00
dependabot[bot] 7010af30b7
chore(deps): Bump prometheus from 0.13.0 to 0.13.1 (#4648)
Bumps [prometheus](https://github.com/tikv/rust-prometheus) from 0.13.0 to 0.13.1.
- [Release notes](https://github.com/tikv/rust-prometheus/releases)
- [Changelog](https://github.com/tikv/rust-prometheus/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tikv/rust-prometheus/compare/v0.13.0...v0.13.1)

---
updated-dependencies:
- dependency-name: prometheus
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-20 08:12:45 +00:00
dependabot[bot] 40a69c6e29
chore(deps): Bump pprof from 0.9.0 to 0.9.1 (#4647)
Bumps [pprof](https://github.com/tikv/pprof-rs) from 0.9.0 to 0.9.1.
- [Release notes](https://github.com/tikv/pprof-rs/releases)
- [Changelog](https://github.com/tikv/pprof-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tikv/pprof-rs/commits)

---
updated-dependencies:
- dependency-name: pprof
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-20 08:04:46 +00:00
dependabot[bot] 6bc0c74c7d
chore(deps): Bump once_cell from 1.10.0 to 1.11.0 (#4646)
* chore(deps): Bump once_cell from 1.10.0 to 1.11.0

Bumps [once_cell](https://github.com/matklad/once_cell) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-20 07:40:38 +00:00
Marco Neumann addc45327e
fix: ensure that query tokio background tasks are canceled (#4643)
* fix: ensure that query tokio background tasks are canceled

While I am not entirely sure if this explains some of the memory leaks I
am seeing in prod, not canceling the tasks correctly certainly makes
debugging way harder and also renders certain form of throttling (e.g.
max. concurrent queries) somewhat ineffective.

Note that parquet file downloads are currently NOT canceled because
tokios `spawn_blocking` cannot be canceled.

* refactor: `Vec` -> `Option`

* refactor: `spawn_blocking` creates a join handle, even though it is useless

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-20 07:18:52 +00:00
Dom Dwyer baa86d846f refactor: use ParquetStore instead of ObjectStore
Changes the code paths that interact with Parquet files in the object
store to reference the ParquetStorage directly (DRY refactor).

This change takes us from a dependency graph of:

                    ┌─────────────────┐
                    │                 │
                    ▼                 │
            Parquet Consumer          │
                    │         ┌──────────────┐
                    ├────────▶│ParquetStorage│
                    ▼         └──────────────┘
            ┌──────────────┐
            │ ObjectStore  │
            └──────────────┘
                    │
               ┌────┴────┐
               ▼         ▼
             File       s3
            System    (etc)

to:

                Parquet Consumer
                        │
                        ▼
                ┌──────────────┐
                │ParquetStorage│
                └──────────────┘
                        │
                        ▼
                ┌──────────────┐
                │ ObjectStore  │
                └──────────────┘
                        │
                   ┌────┴────┐
                   ▼         ▼
                 File       s3
                System    (etc)

With the ParquetStorage being solely responsible for managing
interactions with the object store when dealing with Parquet files.
2022-05-19 13:52:51 +01:00
dependabot[bot] 409ae0ee0d
chore(deps): Bump handlebars from 4.2.2 to 4.3.0 (#4639)
Bumps [handlebars](https://github.com/sunng87/handlebars-rust) from 4.2.2 to 4.3.0.
- [Release notes](https://github.com/sunng87/handlebars-rust/releases)
- [Changelog](https://github.com/sunng87/handlebars-rust/blob/master/CHANGELOG.md)
- [Commits](https://github.com/sunng87/handlebars-rust/compare/v4.2.2...v4.3.0)

---
updated-dependencies:
- dependency-name: handlebars
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-19 08:15:51 +00:00
Marco Neumann 770293a973
feat: add LRU cache metrics (#4632)
* refactor: require `Resource`s to be convertible to `u64`

* refactor: require `Resource`s to have a unit name

* refactor: make LRU cache IDs static

* feat: add LRU cache metrics

* docs: improve type names in LRU doctest

* docs: epxlain `MeasuredT`

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: explain `test_metrics`

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-19 08:05:17 +00:00
Dom Dwyer 43300878bc fix(pb): encoding entirely NULL columns (#4272)
This commit changes the protobuf record batch encoding to skip entirely
NULL columns when serialising. This prevents the deserialisation from
erroring due to a column type inference failure.

Prior to this commit, when the system was presented a record batch such
as this:

            | time       | A    | B    |
            | ---------- | ---- | ---- |
            | 1970-01-01 | 1    | NULL |
            | 1970-07-05 | NULL | 1    |

Which would be partitioned by YMD into two separate partitions:

            | time       | A    | B    |
            | ---------- | ---- | ---- |
            | 1970-01-01 | 1    | NULL |

and:

            | time       | A    | B    |
            | ---------- | ---- | ---- |
            | 1970-07-05 | NULL | 1    |

Both partitions would contain an entirely NULL column.

Both of these partitioned record batches would be successfully encoded,
but decoding the partition fails due to the inability to infer a column
type from the serialised format which contains no values, which on the
wire, looks like:

            Column {
                column_name: "B",
                semantic_type: Field,
                values: Some(
                    Values {
                        i64_values: [],
                        f64_values: [],
                        u64_values: [],
                        string_values: [],
                        bool_values: [],
                        bytes_values: [],
                        packed_string_values: None,
                        interned_string_values: None,
                    },
                ),
                null_mask: [
                    1,
                ],
            },

In a column that is not entirely NULL, one of the "Values" fields would
be non-empty, and the decoder would use this to infer the type of the
column.

Because we have chosen to not differentiate between "NULL" and "empty"
in our proto encoding, the decoder cannot infer which field within the
"Values" struct the column belongs to - all are valid, but empty.

This commit prevents this type inference failure by skipping any columns
that are entirely NULL during serialisation, preventing the deserialiser
from having to process columns with ambiguous types.
2022-05-18 13:33:26 +01:00
dependabot[bot] c99ad9f4a4
chore(deps): Bump schemars from 0.8.9 to 0.8.10 (#4627)
Bumps [schemars](https://github.com/GREsau/schemars) from 0.8.9 to 0.8.10.
- [Release notes](https://github.com/GREsau/schemars/releases)
- [Changelog](https://github.com/GREsau/schemars/blob/master/CHANGELOG.md)
- [Commits](https://github.com/GREsau/schemars/compare/v0.8.9...v0.8.10)

---
updated-dependencies:
- dependency-name: schemars
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-18 10:23:34 +00:00
dependabot[bot] b542e453e8
chore(deps): Bump rustls from 0.20.4 to 0.20.6
Bumps [rustls](https://github.com/rustls/rustls) from 0.20.4 to 0.20.6.
- [Release notes](https://github.com/rustls/rustls/releases)
- [Changelog](https://github.com/rustls/rustls/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/rustls/rustls/compare/v/0.20.4...v/0.20.6)

---
updated-dependencies:
- dependency-name: rustls
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-05-18 09:54:48 +00:00
dependabot[bot] db384ba387
chore(deps): Bump libc from 0.2.125 to 0.2.126 (#4626)
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.125 to 0.2.126.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.125...0.2.126)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-18 09:52:35 +00:00
Marco Neumann 52346642a0
ci: fix cargo deny (#4629)
* ci: fix cargo deny

* chore: downgrade `socket2`, version 0.4.5 was yanked

* chore: rename `query` to `iox_query`

`query` is already taken on crates.io and yanked and I am getting tired
of working around that.
2022-05-18 09:38:35 +00:00
Marco Neumann 12937ee724
feat: add SOCKS5 support to Kafka write buffer (#4623) 2022-05-17 15:21:35 +00:00
Andrew Lamb 3a33e806c7
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `14.0.0` (#4619)
* chore: Update datafusion deps

* chore: update arrow/parquet/arrow flight deps

* chore: Run cargo hakari tasks

* chore: Update location of utils

* chore: Update some more APIs

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-17 14:13:03 +00:00
dependabot[bot] 87d8a8b684
chore(deps): Bump schemars from 0.8.8 to 0.8.9 (#4614)
Bumps [schemars](https://github.com/GREsau/schemars) from 0.8.8 to 0.8.9.
- [Release notes](https://github.com/GREsau/schemars/releases)
- [Changelog](https://github.com/GREsau/schemars/blob/master/CHANGELOG.md)
- [Commits](https://github.com/GREsau/schemars/compare/v0.8.8...v0.8.9)

---
updated-dependencies:
- dependency-name: schemars
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-17 09:49:27 +00:00
dependabot[bot] 5a0006e40b
chore(deps): Bump syn from 1.0.94 to 1.0.95 (#4610)
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.94 to 1.0.95.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.94...1.0.95)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-17 09:40:36 +00:00