8225 Commits (127467b5c4154181a508b6a50400b98a255c0409)

Author SHA1 Message Date
Marco Neumann 79c054ffc9
fix: do NOT block in parquet file IO (#4727)
* fix: do NOT block in parquet file IO

I think for historical reasons we were using blocking IO to read parquet
files. With the current streaming `SendableRecordStream` approach this
is technically NOT required anymore.

Now one might think that the sync-async dance that we did is kinda
harmless, but looking at our production querier I think it is really
bad. The querier seems to be stuck but looking at `strace` and other
health signals it seems it is not entirely dead. Looking at GDB
backtraces it seems that nearly all threads are busy in
`download_and_scan_parquet`. Looking at the tokio docs
(<https://docs.rs/tokio/1.18.2/tokio/task/fn.spawn_blocking.html>)
for `spawn_blocking` (which is used to start the sync download) this
makes sense: tokio only starts replacement threads for the current
runtime thread (which calls `spawn_blocking`) if this does NOT exceed the
runtime thread limit. However we set the runtime thread limit to the
number of CPU cores available to IOx, so this is a limiting factor. This
means that there are only a few threads left to do actual work (I've
seen postgres data flowing back and forth for example) but tokio is not
able to use its full potential anymore. This is esp. bad because the
sync code in `download_and_scan_parquet` then uses `futures` `block_on`
functionality to call back into async code, so it waits for tokio
itself.

The change is rather simple: just use async task spawns.

* fix: use async IO to write stream to temp file

* fix: do not block tokio thread during parquet file reading

* refactor: ensure parquet IO tasks are cancelled if they are not needed anymore

There is no REAL way to cancel sync tasks, but at least we can try our
best.
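
A best-effort approach to this can be sketched with a shared flag that the blocking code checks between units of work. This is a std-only illustration under stated assumptions, not the code from this commit: `CancelOnDrop` and `spawn_cancellable` are hypothetical names, and the `sleep` stands in for one blocking read.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical guard: dropping it signals the worker to stop.
struct CancelOnDrop(Arc<AtomicBool>);

impl Drop for CancelOnDrop {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Starts a blocking worker that processes up to `chunks` units of work and
/// checks the cancellation flag before each unit. Returns the guard and a
/// handle yielding the number of units actually processed.
fn spawn_cancellable(chunks: usize) -> (CancelOnDrop, thread::JoinHandle<usize>) {
    let cancelled = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&cancelled);
    let handle = thread::spawn(move || {
        let mut done = 0;
        while done < chunks && !flag.load(Ordering::SeqCst) {
            // Stand-in for one blocking read; it cannot be interrupted midway.
            thread::sleep(Duration::from_millis(5));
            done += 1;
        }
        done
    });
    (CancelOnDrop(cancelled), handle)
}
```

Dropping the guard does not interrupt a unit of work that is already running; the worker only notices at its next check, which is why this counts as best effort rather than real cancellation.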
2022-05-30 13:32:20 +00:00
Andrew Lamb d0903b11bb
refactor: reduce test duplication in `querier/src/table/mod.rs` (#4698)
* refactor: reduce test duplication in `querier/src/table/mod.rs`

* fix: Apply suggestions from code review

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>

* fix: Update querier/src/table/test_util.rs

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>

* fix: use now_nanos()

* refactor: Add TestQuerierTable

* refactor: rename functions for explicitness

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2022-05-30 12:56:09 +00:00
Paul Dix 6af32b7750
feat: add concurrency limit for ingester queries (#4703)
I've defaulted it to 20, we can adjust as needed.

Closes #4657
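
For illustration, a concurrency limit of this kind can be sketched as a counting semaphore. The version below uses only the standard library and threads; it is an assumption-laden stand-in rather than the IOx implementation, and `Limiter` and `peak_concurrency` are hypothetical names.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical limiter: at most `limit` closures run at the same time.
struct Limiter {
    permits: Mutex<usize>,
    freed: Condvar,
}

impl Limiter {
    fn new(limit: usize) -> Self {
        Limiter { permits: Mutex::new(limit), freed: Condvar::new() }
    }

    /// Waits for a permit, runs `f`, then returns the permit.
    /// Not panic-safe: a panic inside `f` would leak the permit.
    fn run<T>(&self, f: impl FnOnce() -> T) -> T {
        let mut free = self.permits.lock().unwrap();
        while *free == 0 {
            free = self.freed.wait(free).unwrap();
        }
        *free -= 1;
        drop(free); // do not hold the lock while the work runs
        let out = f();
        *self.permits.lock().unwrap() += 1;
        self.freed.notify_one();
        out
    }
}

/// Runs `tasks` threads through a limiter and returns the highest number
/// of tasks observed inside the limited section at once.
fn peak_concurrency(limit: usize, tasks: usize) -> usize {
    let limiter = Arc::new(Limiter::new(limit));
    let active = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let limiter = Arc::clone(&limiter);
            let active = Arc::clone(&active);
            let peak = Arc::clone(&peak);
            thread::spawn(move || {
                limiter.run(|| {
                    let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                    peak.fetch_max(now, Ordering::SeqCst);
                    // Stand-in for one ingester query.
                    thread::sleep(Duration::from_millis(10));
                    active.fetch_sub(1, Ordering::SeqCst);
                })
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

With a limit of 20, as in the default mentioned above, `peak_concurrency(20, 100)` would never report more than 20 tasks in flight. An async service would more likely use an async-aware semaphore so that waiting tasks do not occupy threads.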

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-30 10:22:17 +00:00
dependabot[bot] 73168f7989
chore(deps): Bump flate2 from 1.0.23 to 1.0.24 (#4726)
Bumps [flate2](https://github.com/rust-lang/flate2-rs) from 1.0.23 to 1.0.24.
- [Release notes](https://github.com/rust-lang/flate2-rs/releases)
- [Commits](https://github.com/rust-lang/flate2-rs/commits)

---
updated-dependencies:
- dependency-name: flate2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-30 08:26:12 +00:00
dependabot[bot] 29069be7d4
chore(deps): Bump hyper from 0.14.18 to 0.14.19 (#4725)
Bumps [hyper](https://github.com/hyperium/hyper) from 0.14.18 to 0.14.19.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.18...v0.14.19)

---
updated-dependencies:
- dependency-name: hyper
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-30 07:58:12 +00:00
dependabot[bot] 7d4670e171
chore(deps): Bump indexmap from 1.8.1 to 1.8.2 (#4724)
Bumps [indexmap](https://github.com/bluss/indexmap) from 1.8.1 to 1.8.2.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Changelog](https://github.com/bluss/indexmap/blob/1.8.2/RELEASES.rst)
- [Commits](https://github.com/bluss/indexmap/compare/1.8.1...1.8.2)

---
updated-dependencies:
- dependency-name: indexmap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-30 07:47:05 +00:00
Andrew Lamb cddd6d9b6d
chore: Update datafusion (#4723)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-28 19:00:54 +00:00
Carol (Nichols || Goulding) b52a3586a7
fix: Turn cargo doc warnings into errors (#4710)
* fix: Correct intra-doc links

* fix: Turn cargo doc warnings into errors

Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-28 11:24:22 +00:00
Andrew Lamb 9f21512296
chore: reduce `debug!` log spew in `parquet_file` (#4718)
* chore: reduce log spew

* chore: trace another overly verbose message

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-27 20:57:10 +00:00
kodiakhq[bot] 0a84727c72
Merge pull request #4709 from influxdata/cn/fetch-from-parquet-file
feat: Make a QuerierRBChunk wrapper that implements QueryChunk and QueryChunkMeta
2022-05-27 17:14:05 +00:00
kodiakhq[bot] 842ef8e308
Merge branch 'main' into cn/fetch-from-parquet-file 2022-05-27 17:08:28 +00:00
Carol (Nichols || Goulding) 55cd8d15be
fix: Update method name to specify the kind of chunk it makes 2022-05-27 13:04:24 -04:00
Carol (Nichols || Goulding) f0b4d71f47
docs: Update comment to reflect new implementation 2022-05-27 13:04:24 -04:00
Carol (Nichols || Goulding) 5232594aab
docs: Fix grammar in a comment
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-27 13:04:13 -04:00
Nga Tran 16e7a6d596
test: test that hits panic because of no column meta data (#4719)
* test: test that hits panic because of no column meta data

* chore: Apply suggestions from code review

* chore: run format after applying changes

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* chore: run clippy

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-27 15:27:03 +00:00
Andrew Lamb dde3c3922c
refactor: use consistent spelling of serialize (#4717) 2022-05-27 14:42:59 +00:00
Nga Tran ea81152fac
refactor: add partition ID into debug info and panic earlier to identify the bug easier (#4716)
* chore: point tests to the new ticket

* chore: cleanup

* refactor: add partition ID into debug info and panic earlier to identify the bug easier
2022-05-27 12:20:36 +00:00
Nga Tran 09b55a209d
chore: point tests to the new ticket (#4715)
* chore: point tests to the new ticket

* chore: cleanup
2022-05-27 11:12:55 +00:00
Nga Tran 372b262f37
test: parquet meta decoded tests and more debug info (#4713)
* test: reproducer for 4695

* chore: some debug info

* test: test with many columns and rows

* chore: cleanup and add debug info

* chore: cleanup

* chore: cleanup

* chore: more debug info
2022-05-27 09:53:07 +00:00
Andrew Lamb 700a1de8f3
fix: fix at least one intermittent failure (#4711) 2022-05-26 21:24:37 +00:00
Carol (Nichols || Goulding) 2cb351cd0d
feat: Make a QuerierRBChunk wrapper to handle traits and extra data
This brings back a bunch of code from OG from read buffer backed
DbChunks.
2022-05-26 16:52:14 -04:00
Carol (Nichols || Goulding) b2905650aa
refactor: Extract extract_range to be a method on TableSummary
So that other kinds of chunks can use this code too.
2022-05-26 16:52:14 -04:00
Carol (Nichols || Goulding) 5fd3ffc17f
refactor: Rename ParquetChunkAdapter to only ChunkAdapter
It might be creating chunks of different kinds other than ParquetChunks.
2022-05-26 16:52:14 -04:00
Andrew Lamb 633117e595
feat: avoid catalog access on each query (#4650)
* feat: cache catalog access on query

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-05-26 20:44:22 +00:00
Nga Tran 05151d5c69
test: reproducer for 4695 (#4706)
* test: reproducer for 4695

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-26 15:32:30 +00:00
kodiakhq[bot] f645ec8a42
Merge pull request #4704 from influxdata/cn/welcome-back-read-buffer
feat: Start of a read buffer chunk cache
2022-05-26 13:53:29 +00:00
kodiakhq[bot] 1043c98e17
Merge branch 'main' into cn/welcome-back-read-buffer 2022-05-26 13:47:27 +00:00
Andrew Lamb 2d5a327bf4
fix: expire empty parquet_files cache and empty tombstones cache (#4701)
* fix: expire empty parquet_files cache

* fix: expire empty tombstones cache

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-26 11:08:15 +00:00
Carol (Nichols || Goulding) cddcca1e05
feat: Implement a method to get a read buffer chunk from a stream of record batches 2022-05-25 17:24:35 -04:00
Carol (Nichols || Goulding) f7bc551d9a
feat: Sketch out skeleton methods for RBChunk cache 2022-05-25 17:19:10 -04:00
Carol (Nichols || Goulding) 04531e77dd
feat: Implement get on ReadBufferCache 2022-05-25 17:19:10 -04:00
Carol (Nichols || Goulding) 25b8260b72
feat: Implement ReadBufferCache::new 2022-05-25 17:19:10 -04:00
Carol (Nichols || Goulding) ab9010d9a6
refactor: Rename QuerierParquetChunk::new_parquet to new 2022-05-25 17:19:10 -04:00
Carol (Nichols || Goulding) df10452e2e
refactor: Rename methods from new_querier_chunk to new_querier_parquet_chunk 2022-05-25 17:19:10 -04:00
Carol (Nichols || Goulding) 4a90d0af32
refactor: Remove ChunkStorage enum; inline into QuerierParquetChunk instead 2022-05-25 17:19:10 -04:00
Carol (Nichols || Goulding) b2c62c6808
refactor: Rename QuerierChunk to QuerierParquetChunk 2022-05-25 17:19:10 -04:00
Carol (Nichols || Goulding) 66823522f3
docs: Fix comment wrapping while reading through 2022-05-25 17:19:10 -04:00
Nga Tran 6cc767efcc
feat: teach compactor to compact smaller number of files (#4671)
* refactor: split compact_partition into two functions to handle concurrency better

* feat: limit number of files to compact

* test: add test for limit num files

* chore: fix clippy

* feat: split group if over max size

* fix: split the overlapped group to limit size or file num

* chore: reduce config values

* test: add tests and clearer comments for the split_overlapped_groups and test_limit_size_and_num_files

* chore: more comments

* chore: cleanup
2022-05-25 19:54:34 +00:00
Marco Neumann 31d1b37d73
refactor: de-duplicate low-level arrow code (#4697)
It seems that during prototyping NG we've copied low level code (w/o
tests!) and never cleaned up. Let's not have this functionality twice.
2022-05-25 16:24:28 +00:00
Marko Mikulicic 9ddb0a816e
fix: Return panic message in internal error (#4693)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-25 15:11:17 +00:00
kodiakhq[bot] eb815f2b30
Merge pull request #4664 from influxdata/cn/fix-metric-names
fix: Make Kafka Partition related metric names potentially less confusing
2022-05-25 14:33:58 +00:00
Carol (Nichols || Goulding) 788e6eaf69
docs: Fix a comment that was very confused about what kafka partition means 2022-05-25 10:04:40 -04:00
Carol (Nichols || Goulding) 6ce6a38094
fix: Make metric names potentially less confusing 2022-05-25 10:04:39 -04:00
Marco Neumann a08a91c5ba
fix: ensure querier cache is refreshed for partition sort key (#4660)
* test: call `maybe_start_logging` in auto-generated cases

* fix: ensure querier cache is refreshed for partition sort key

Fixes #4631.

* docs: explain querier sort key handling and test

* test: test another version of issue 4631

* fix: correctly invalidate partition sort keys

* fix: fix `table_not_found_on_ingester`
2022-05-25 10:44:42 +00:00
Marko Mikulicic cdbe546e50
fix: return gRPC error on panic (#4686) 2022-05-25 07:06:25 +00:00
dependabot[bot] 24ee251080
chore(deps): Bump prost from 0.10.3 to 0.10.4 (#4688)
Bumps [prost](https://github.com/tokio-rs/prost) from 0.10.3 to 0.10.4.
- [Release notes](https://github.com/tokio-rs/prost/releases)
- [Commits](https://github.com/tokio-rs/prost/compare/v0.10.3...v0.10.4)

---
updated-dependencies:
- dependency-name: prost
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-25 06:07:05 +00:00
dependabot[bot] a8d3fe619c
chore(deps): Bump prost-build from 0.10.3 to 0.10.4 (#4687)
Bumps [prost-build](https://github.com/tokio-rs/prost) from 0.10.3 to 0.10.4.
- [Release notes](https://github.com/tokio-rs/prost/releases)
- [Commits](https://github.com/tokio-rs/prost/compare/v0.10.3...v0.10.4)

---
updated-dependencies:
- dependency-name: prost-build
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-25 06:00:27 +00:00
Andrew Lamb 935743b525
refactor: Implement `new_querier_chunk` and `new_querier_chunk_from_file_with_metadata` (#4685) 2022-05-24 21:58:27 +00:00
Andrew Lamb a8d5f7f5f7
test: add debug output to test (#4684) 2022-05-24 19:57:11 +00:00
Andrew Lamb 95e6a8ed46
chore: Update datafusion (again) (#4679)
* chore: Update datafusion deps

* fix: fix for changes in ScalarValue

* fix: fix for using TableSource rather than TableProvider

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-24 15:54:39 +00:00