influxdb

Commit Graph

Author	SHA1	Message	Date
Marco Neumann	0534b80886	fix: `ParquetFile::size` must include column set (#4925 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-22 13:06:02 +00:00
Marco Neumann	c3912e34e9	refactor: store per-file column set in catalog (#4908 ) * refactor: store per-file column set in catalog Together with the table-wide schema and the partition-wide sort key, this should be everything we need to read a parquet file directly into memory without peeking any file-level metadata. The querier will use this to directly load parquet files into the read buffer. WARNING: This requires a catalog wipe! Ref #4124. * refactor: use proper `ColumnSet` type	2022-06-21 10:26:12 +00:00
Nga Tran	72c8cfa6ed	fix: make ChunkOrder i64 data type to accept min sequence number 0 and match with data type of sequence number (#4888 ) * fix: make ChunkOrder u64 data type to accept min sequence number 0 * fix: make ChunkOrder i64 to match with sequence number type Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-17 13:45:17 +00:00
Marco Neumann	0fbff981ec	chore(deps): Bump sqlx to 0.6.0 and uuid to 1 (#4894 ) Closes #4889. Closes #4890. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 10:28:28 +00:00
Dom Dwyer	0da8ec87d5	refactor: always generate a partition key Changes the partitioner to always generate a partition key, even if the column being used to partition doesn't exist. This doesn't functionally change the batch partitioning output, but ensures we always have a non-empty string for the partition key.	2022-06-15 15:38:02 +01:00
Andrew Lamb	005610b172	refactor: remove some `&` use in iox_catalog (#4862 ) * refactor: remove some `&` use in iox_catalog * fix: Update data_types/src/lib.rs	2022-06-15 11:31:49 +00:00
Dom Dwyer	b41ea1d718	refactor: PartitionKey type This commit changes the code base to use a new reference-counted PartitionKey type wrapper, instead of passing a bare String around. This allows the compiler to type check & verify usage of the partition key, instead of passing a bare string around. By reference counting the underlying string, we reduce memory usage for some use cases.	2022-06-14 14:47:56 +01:00
Nga Tran	13c57d524a	feat: Change data type of catalog partition's sort_key from a string to an array of string (#4801 ) * feat: Change data type of catalog Postgres partition's sort_key from a string to an array of string * test: add column with comma * fix: use new protobuf field to avoid incompatibility * fix: ensure sort_key is an empty array rather than NULL * refactor: address review comments * refactor: address more comments * chore: clearer comments * chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql * chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql * fix: Rename migration so it will be applied after Co-authored-by: Marko Mikulicic <mkm@influxdata.com>	2022-06-10 13:31:31 +00:00
Andrew Lamb	50697906b1	refactor: Make `DMLWrite::sequence_number` a `SequenceNumber` (#4817 )	2022-06-09 19:36:37 +00:00
Dom Dwyer	d1436c9f06	refactor(data_types): no Timestamp wraparound This commit changes addition/subtraction of Timestamp values to panic if they would trigger under/overflow rather than silently wrapping around.	2022-06-09 13:23:03 +01:00
Carol (Nichols || Goulding)	b2905650aa	refactor: Extract extract_range to be a method on TableSummary So that other kinds of chunks can use this code too.	2022-05-26 16:52:14 -04:00
Carol (Nichols || Goulding)	788e6eaf69	docs: Fix a comment that was very confused about what means kafka partition	2022-05-25 10:04:40 -04:00
Andrew Lamb	4d8ece5524	feat: Add `Tombstone` to querier cache (#4663 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-05-24 13:21:23 +00:00
Andrew Lamb	e877a64462	feat: Add `ParquetFiles` cache and memory size estimation for ParquetMetadata (#4661 ) * feat: Add `ParquetFiles` cache * fix: Apply suggestions from code review Co-authored-by: Marko Mikulicic <mkm@influxdata.com> * fix: remove commented out debugging println * refactor: Improve size calculation * fix: mark `ParquetFileCache::clear` test only * fix: assert on metric count Co-authored-by: Marko Mikulicic <mkm@influxdata.com>	2022-05-23 17:11:38 +00:00
Marco Neumann	779f0e9cdf	feat: querier RAM pool (#4593 ) * feat: `SortKey::size` * feat: `FunctionEstimator` * feat: querier RAM pool Let's put all the caches into a single RAM pool, so we can at least somewhat control RAM usage. Note that this does NOT limit the peak memory during query execution though, but should at least stop unlimited cache growth. A follow-up PR will add metrics. * refactor: improve some size calculations Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-05-17 13:11:20 +00:00
kodiakhq[bot]	0f8f294319	Merge branch 'main' into cn/remove-chunk-addr	2022-05-13 13:54:44 +00:00
Carol (Nichols || Goulding)	55313d290a	fix: Update or remove comments that mention NG or OG Connects to #4450.	2022-05-12 16:09:08 -04:00
Carol (Nichols || Goulding)	b581a42fde	fix: Rename new_id_for_ng to new_id Connects to #4450.	2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding)	faba90d992	fix: Remove ChunkAddr	2022-05-12 15:50:41 -04:00
Carol (Nichols || Goulding)	8545bb60c6	refactor: Move KafkaPartitionWriteStatus to data_types to share more	2022-05-11 14:07:06 -04:00
Carol (Nichols || Goulding)	068096e7e1	fix: Rename data_types2 to data_types	2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding)	e1bef1c218	fix: Remove OG data_types crate	2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding)	44209faa8e	fix: Move write buffer data types to write_buffer crate	2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)	d7304c1114	fix: Move TimestampSummary to the only place it's used	2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)	4c56ba1e25	fix: Move ErrorLogger trait to the only place it's used	2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)	fb8f8d22c0	fix: Remove now-unused ServerId. Fixes #4451	2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)	b76c1e1ad6	fix: Remove now-unused DML sharding and related types	2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)	94be7407ba	refactor: Move BooleanFlag to the only place it's used	2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)	236edb9181	fix: Move Sequence type to data_types2	2022-05-06 14:45:38 -04:00
Carol (Nichols || Goulding)	afdff2b1db	fix: Move DatabaseName to data_types2	2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding)	1ea4a40b1f	fix: Move NonEmptyString to data_types2	2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding)	6b0e7ae46a	fix: Move name parsing code to data_types2	2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding)	3ab0788a94	fix: Move DeletePredicate types to data_types2	2022-05-06 14:45:37 -04:00
dependabot[bot]	912d73a6f3	chore(deps): Bump ordered-float from 2.10.0 to 3.0.0 (#4502 ) Bumps [ordered-float](https://github.com/reem/rust-ordered-float) from 2.10.0 to 3.0.0. - [Release notes](https://github.com/reem/rust-ordered-float/releases) - [Commits](https://github.com/reem/rust-ordered-float/compare/v2.10.0...v3.0.0) --- updated-dependencies: - dependency-name: ordered-float dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-05-02 14:27:25 +00:00
二手掉包工程师	4b47d723b1	refactor: Rename time to iox_time (#4416 ) Signed-off-by: hi-rustin <rustin.liu@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-04-26 00:19:59 +00:00
Marco Neumann	f444e63960	test: include materialized delete predicates in NG query tests (#4371 ) * refactor: move `batch_filter` to `datafusion_util` * fix: outdated docstring * feat: allow passing record batches to `iox_tests` parquet files * test: include materialized delete predicates in NG query tests * docs: improve wording Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-04-21 13:00:13 +00:00
Andrew Lamb	c244b03281	feat: Add `SequencerProgress` reporting to ingester (#4238 ) * feat: Add `SequencerProgress` reporting to ingester * refactor: Use KafkaPartition in write_summary * fix: Update docstrings * refactor: Change ingester to use KafkaPartition everywhere * refactor: add SequencerProgress::combine * refactor: return new SequencerProgress rather than updating * fix: distinguish between yes/no/unknown in WriteSummary * docs: Update data_types2/src/lib.rs Co-authored-by: Paul Dix <paul@pauldix.net> Co-authored-by: Paul Dix <paul@pauldix.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-04-06 15:13:21 +00:00
Andrew Lamb	a384448b92	refactor: rename Sequence::id and Sequence::number field names (#4190 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-03-31 15:17:58 +00:00
Marco Neumann	20bbb88dc5	refactor: remove table name from `TableSummary` (#4170 ) This allows us to remove the table name from the low-level chunk representations (like `ParquetFile`, RUB, ...) since table names are already tracked by the higher-level data structures (e.g. catalog, catalog chunk) that manage the low-level chunk representations. This is similar to #4167. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-03-30 13:24:00 +00:00
Nga Tran	bfd5568acf	fix: make sure the QueryableParquetChunks are always sorted correctly (#4163 ) * fix: make sure the chunks are always sorted correctly * fix: output * chore: Apply suggestions from code review Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * refactor: make new function for new chunk id Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-03-29 21:36:45 +00:00
Marco Neumann	2b76c31157	refactor: make statistics null counts optional (#4160 ) Min/max values and distinct counts are already optional, so let's make the null counts optional as well. This will be helpful for NG to deal w/ partial statistics (e.g. we only populate stats for the time column). Note that the total count is still mandatory, but we normally have the chunk/file-level row count at hand.	2022-03-29 17:47:57 +00:00
Andrew Lamb	204dd7c8e9	refactor: Fix some random clippy lints from the future (#4118 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-03-24 09:21:29 +00:00
Raphael Taylor-Davies	80fb75d90b	feat: add a flag to enable per-partition tracing (#3928 ) * feat: add a flag to enable per-partition tracing * chore: rename constant * feat: use BooleanFlag and cache result	2022-03-07 13:49:23 +00:00
Marco Neumann	8d00aaba90	feat: sync chunks in querier (#3911 ) * feat: `ParquetFileRepo::list_by_namespace_not_to_delete` * feat: `ChunkAddr: Clone` * test: ensure that querier keeps same partition objects * test: improve `create_parquet_file` flexibility * feat: sync chunks in querier * test: improve `test_parquet_file`	2022-03-04 08:53:39 +00:00
Marco Neumann	33851be3a5	chore: upgrade Rust to 1.59 (#3875 ) Mostly a few new clippy crates around `flat_map`, `and_then`, and "underscore locks" (!!!): https://rust-lang.github.io/rust-clippy/master/index.html#let_underscore_lock Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-02-28 15:14:19 +00:00
Raphael Taylor-Davies	442d63e65b	feat: catalog timestamp pruning (#3571 ) * feat: catalog timestamp pruning * chore: test	2022-01-28 13:45:13 +00:00
Dom	32d7c4cbfe	refactor: remove InfluxColumnType::IOx (#3565 ) * refactor: remove InfluxColumnType::IOx Remove unused column variant - see #3554 for context. * refactor: reserve SEMANTIC_TYPE_IOX name in proto Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-01-27 21:15:36 +00:00
Raphael Taylor-Davies	21c1824a7a	refactor: remove table_names from Predicate (#3545 ) * refactor: remove table_names from Predicate * chore: fix benchmarks * chore: review feedback Co-authored-by: Edd Robinson <me@edd.io> * chore: review feedback * chore: replace Default::default with InfluxRpcPredicate::default() Co-authored-by: Edd Robinson <me@edd.io> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-01-27 14:44:49 +00:00
Andrew Lamb	9c19cd6cc4	fix: clamp start/end of TimestampRange to min/max valid timestamp values (#3487 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-01-20 16:08:00 +00:00
Andrew Lamb	f0d50f447a	fix: Special case tag_keys with max timestamp range (#3485 ) * fix: Special case tag_keys with max timestamp range * docs: comment Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-01-20 14:14:34 +00:00

373 Commits (6f9d8b54cf5cacf6e5750f1448d54f139f29b486)