360 Commits (76f7043417ba2f14cfdbe85d601d37faa320df83)

Author SHA1 Message Date
Andrew Lamb e877a64462
feat: Add `ParquetFiles` cache and memory size estimation for ParquetMetadata (#4661)
* feat: Add `ParquetFiles` cache

* fix: Apply suggestions from code review

Co-authored-by: Marko Mikulicic <mkm@influxdata.com>

* fix: remove commented out debugging println

* refactor: Improve size calculation

* fix: mark `ParquetFileCache::clear` test only

* fix: assert on metric count

Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
2022-05-23 17:11:38 +00:00
Marco Neumann 779f0e9cdf
feat: querier RAM pool (#4593)
* feat: `SortKey::size`

* feat: `FunctionEstimator`

* feat: querier RAM pool

Let's put all the caches into a single RAM pool, so we can at least
somewhat control RAM usage. Note that this does NOT limit the peak
memory during query execution though, but should at least stop unlimited
cache growth. A follow-up PR will add metrics.

* refactor: improve some size calculations

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-17 13:11:20 +00:00
kodiakhq[bot] 0f8f294319
Merge branch 'main' into cn/remove-chunk-addr 2022-05-13 13:54:44 +00:00
Carol (Nichols || Goulding) 55313d290a
fix: Update or remove comments that mention NG or OG
Connects to #4450.
2022-05-12 16:09:08 -04:00
Carol (Nichols || Goulding) b581a42fde
fix: Rename new_id_for_ng to new_id
Connects to #4450.
2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding) faba90d992
fix: Remove ChunkAddr 2022-05-12 15:50:41 -04:00
Carol (Nichols || Goulding) 8545bb60c6
refactor: Move KafkaPartitionWriteStatus to data_types to share more 2022-05-11 14:07:06 -04:00
Carol (Nichols || Goulding) 068096e7e1
fix: Rename data_types2 to data_types 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) e1bef1c218
fix: Remove OG data_types crate 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) 44209faa8e
fix: Move write buffer data types to write_buffer crate 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) d7304c1114
fix: Move TimestampSummary to the only place it's used 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) 4c56ba1e25
fix: Move ErrorLogger trait to the only place it's used 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) fb8f8d22c0
fix: Remove now-unused ServerId. Fixes #4451 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) b76c1e1ad6
fix: Remove now-unused DML sharding and related types 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) 94be7407ba
refactor: Move BooleanFlag to the only place it's used 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) 236edb9181
fix: Move Sequence type to data_types2 2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding) afdff2b1db
fix: Move DatabaseName to data_types2 2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding) 1ea4a40b1f
fix: Move NonEmptyString to data_types2 2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding) 6b0e7ae46a
fix: Move name parsing code to data_types2 2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding) 3ab0788a94
fix: Move DeletePredicate types to data_types2 2022-05-06 14:45:37 -04:00
dependabot[bot] 912d73a6f3
chore(deps): Bump ordered-float from 2.10.0 to 3.0.0 (#4502)
Bumps [ordered-float](https://github.com/reem/rust-ordered-float) from 2.10.0 to 3.0.0.
- [Release notes](https://github.com/reem/rust-ordered-float/releases)
- [Commits](https://github.com/reem/rust-ordered-float/compare/v2.10.0...v3.0.0)

---
updated-dependencies:
- dependency-name: ordered-float
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-02 14:27:25 +00:00
二手掉包工程师 4b47d723b1
refactor: Rename time to iox_time (#4416)
Signed-off-by: hi-rustin <rustin.liu@gmail.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-26 00:19:59 +00:00
Marco Neumann f444e63960
test: include materialized delete predicates in NG query tests (#4371)
* refactor: move `batch_filter` to `datafusion_util`

* fix: outdated docstring

* feat: allow passing record batches to `iox_tests` parquet files

* test: include materialized delete predicates in NG query tests

* docs: improve wording

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-21 13:00:13 +00:00
Andrew Lamb c244b03281
feat: Add `SequencerProgress` reporting to ingester (#4238)
* feat: Add `SequencerProgress` reporting to ingester

* refactor: Use KafkaPartition in write_summary

* fix: Update docstrings

* refactor: Change ingester to use KafkaPartition everywhere

* refactor: add SequencerProgress::combine

* refactor: return new SequencerProgress rather than updating

* fix: distinguish between yes/no/unknown in WriteSummary

* docs: Update data_types2/src/lib.rs

Co-authored-by: Paul Dix <paul@pauldix.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-06 15:13:21 +00:00
Andrew Lamb a384448b92
refactor: rename Sequence::id and Sequence::number field names (#4190)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-31 15:17:58 +00:00
Marco Neumann 20bbb88dc5
refactor: remove table name from `TableSummary` (#4170)
This allows us to remove the table name from the low-level chunk
representations (like `ParquetFile`, RUB, ...) since table names are
already tracked by the higher-level data structures (e.g. catalog,
catalog chunk) that manage the low-level chunk representations.

This is similar to #4167.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 13:24:00 +00:00
Nga Tran bfd5568acf
fix: make sure the QueryableParquetChunks are always sorted correctly (#4163)
* fix: make sure the chunks are always sorted correctly

* fix: output

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: make new function for new chunk id

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-29 21:36:45 +00:00
Marco Neumann 2b76c31157
refactor: make statistics null counts optional (#4160)
Min/max values and distinct counts are already optional, so let's make
the null counts optional as well. This will be helpful for NG to deal w/
partial statistics (e.g. we only populate stats for the time column).

Note that the total count is still mandatory, but we normally have the
chunk/file-level row count at hand.
2022-03-29 17:47:57 +00:00
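
To illustrate the change described in 2b76c31157, here is a minimal sketch of column statistics in which every field except the total count is optional. The `ColumnStats` type, its field names, and the `time_only_stats` helper are hypothetical and simplified for illustration; they are not the actual IOx definitions.

```rust
/// Hypothetical, simplified column statistics (not the real IOx type).
/// Everything except `total_count` may be unknown.
#[derive(Debug, Clone, PartialEq)]
pub struct ColumnStats {
    pub min: Option<i64>,
    pub max: Option<i64>,
    pub distinct_count: Option<u64>,
    /// Optional, so that partial statistics can be represented.
    pub null_count: Option<u64>,
    /// Still mandatory: the chunk/file-level row count is normally available.
    pub total_count: u64,
}

impl ColumnStats {
    /// Number of non-null values, or `None` when the null count is unknown.
    pub fn non_null_count(&self) -> Option<u64> {
        self.null_count
            .map(|nulls| self.total_count.saturating_sub(nulls))
    }
}

/// Build statistics for a column where only the row count is known,
/// e.g. a non-time column when stats are populated for the time column only.
pub fn time_only_stats(total_count: u64) -> ColumnStats {
    ColumnStats {
        min: None,
        max: None,
        distinct_count: None,
        null_count: None,
        total_count,
    }
}
```

With this shape, a caller has to handle the "unknown" case explicitly instead of treating a missing null count as zero.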
Andrew Lamb 204dd7c8e9
refactor: Fix some random clippy lints from the future (#4118)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-24 09:21:29 +00:00
Raphael Taylor-Davies 80fb75d90b
feat: add a flag to enable per-partition tracing (#3928)
* feat: add a flag to enable per-partition tracing

* chore: rename constant

* feat: use BooleanFlag and cache result
2022-03-07 13:49:23 +00:00
Marco Neumann 8d00aaba90
feat: sync chunks in querier (#3911)
* feat: `ParquetFileRepo::list_by_namespace_not_to_delete`

* feat: `ChunkAddr: Clone`

* test: ensure that querier keeps same partition objects

* test: improve `create_parquet_file` flexibility

* feat: sync chunks in querier

* test: improve `test_parquet_file`
2022-03-04 08:53:39 +00:00
Marco Neumann 33851be3a5
chore: upgrade Rust to 1.59 (#3875)
Mostly a few new clippy lints around `flat_map`, `and_then`, and
"underscore locks" (!!!):
https://rust-lang.github.io/rust-clippy/master/index.html#let_underscore_lock

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 15:14:19 +00:00
Raphael Taylor-Davies 442d63e65b
feat: catalog timestamp pruning (#3571)
* feat: catalog timestamp pruning

* chore: test
2022-01-28 13:45:13 +00:00
Dom 32d7c4cbfe
refactor: remove InfluxColumnType::IOx (#3565)
* refactor: remove InfluxColumnType::IOx

Remove unused column variant - see #3554 for context.

* refactor: reserve SEMANTIC_TYPE_IOX name in proto

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 21:15:36 +00:00
Raphael Taylor-Davies 21c1824a7a
refactor: remove table_names from Predicate (#3545)
* refactor: remove table_names from Predicate

* chore: fix benchmarks

* chore: review feedback

Co-authored-by: Edd Robinson <me@edd.io>

* chore: review feedback

* chore: replace Default::default with InfluxRpcPredicate::default()

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:44:49 +00:00
Andrew Lamb 9c19cd6cc4
fix: clamp start/end of TimestampRange to min/max valid timestamp values (#3487)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-20 16:08:00 +00:00
Andrew Lamb f0d50f447a
fix: Special case tag_keys with max timestamp range (#3485)
* fix: Special case tag_keys with max timestamp range

* docs: comment

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-20 14:14:34 +00:00
Marco Neumann 168afb63ad feat: add `size` methods to DML-related types
This will be helpful when we want to batch DML operations in memory
(e.g. when using RSKafka).

This also ensures that `MBChunk` accounts for the column names that
are stored within `MutableBatch`.
2022-01-18 13:52:31 +01:00
Dom 5cce69b481 fix: reject empty org & bucket
Previously a request that specified an empty org & bucket value would be
mapped to a database named "_".

This commit changes the org/bucket mapping fn to return an error if
either org or bucket is empty.
2022-01-14 12:17:26 +00:00
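
The behaviour described in 5cce69b481 can be sketched as follows. This is an assumption-laden illustration, not the actual IOx code: the function name `org_and_bucket_to_database`, the `OrgBucketError` type, and the `org_bucket` naming scheme are hypothetical. The scheme is inferred only from the commit's remark that empty values used to map to "_".

```rust
use std::fmt;

/// Hypothetical error type for this sketch (not the real IOx error).
#[derive(Debug, PartialEq, Eq)]
pub enum OrgBucketError {
    /// The org, the bucket, or both were empty.
    NotSpecified,
}

impl fmt::Display for OrgBucketError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NotSpecified => write!(f, "org and bucket must both be non-empty"),
        }
    }
}

impl std::error::Error for OrgBucketError {}

/// Map an org and bucket to a database name of the assumed form `org_bucket`.
///
/// Without the emptiness check, two empty inputs would yield the name "_".
pub fn org_and_bucket_to_database(org: &str, bucket: &str) -> Result<String, OrgBucketError> {
    if org.is_empty() || bucket.is_empty() {
        return Err(OrgBucketError::NotSpecified);
    }
    Ok(format!("{}_{}", org, bucket))
}
```

Returning a `Result` pushes the decision about how to report the bad request to the caller, rather than silently creating or addressing a database named "_".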
Edd Robinson d1816b662f
chore: update rust to 1.58 (#3461)
* chore: update rust to 1.58

* fix: clippy

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-01-13 21:12:46 +00:00
Andrew Lamb cdf5c21cd4
fix: Fix max timestamp value comparison in chunk metadata (#3453)
* fix: Fix max timestamp value comparison in chunk metadata

* refactor: rename contains to overlaps

Co-authored-by: Edd Robinson <me@edd.io>
2022-01-13 16:58:30 +00:00
Marco Neumann f3f6f335a9
chore: upgrade to snafu 0.7 (#3440) 2022-01-11 19:22:36 +00:00
Andrew Lamb a93ae739a9
feat: Add table_name to Partition API (#3421) 2022-01-06 16:38:39 +00:00
Andrew Lamb 758b65dd29
feat: Add database initialization state and errors to CLI and remove list_databases_detailed gRPC (#3377)
* feat: Add database initialization state and errors to CLI

* fix: do not use optional in protobuf

* fix: clippy

* fix: correct check I broke appeasing clippy
2021-12-15 12:18:41 +00:00
Carol (Nichols || Goulding) 30c4da7ca7
fix: Be consistent with regex version range specification 2021-12-06 09:37:15 -05:00
kodiakhq[bot] 0dffcad109
Merge branch 'main' into ntran/chunkid_catalogchunk 2021-12-03 14:21:04 +00:00
Nga Tran 10b1598e68 refactor: remove comments 2021-12-03 09:20:12 -05:00
kodiakhq[bot] 2857b6a990
Merge branch 'main' into er/feat/load_chunk_cli 2021-12-02 20:20:56 +00:00
Edd Robinson b4ea9887ba refactor: error name 2021-12-02 20:14:02 +00:00
Nga Tran ffc970a60f feat: add to-be-create-chunk-id for compacting OS chunks in CatalogChunk 2021-12-02 14:49:06 -05:00