kodiakhq[bot]
3f9a58362b
Merge pull request #4111 from influxdata/crepererum/issue3934h
...
refactor: make query tests less OG-specific
2022-03-23 19:18:08 +00:00
kodiakhq[bot]
93485a11ec
Merge branch 'main' into crepererum/issue3934h
2022-03-23 19:10:02 +00:00
Carol (Nichols || Goulding)
67e13a7c34
fix: Change to_delete column on parquet_files to be a time ( #4117 )
...
Set to_delete to the time the file was marked as deleted rather than
true.
Fixes #4059 .
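A minimal sketch of the idea, assuming a simplified in-memory row rather than the real catalog type: the names `ParquetFileRow`, `flag_for_delete` and `is_deletable` are hypothetical, timestamps are plain nanosecond integers, and keeping the first mark time on repeated flagging is an assumption of this sketch.

```rust
/// Hypothetical, simplified stand-in for a `parquet_files` catalog row.
#[derive(Debug, Clone, PartialEq)]
pub struct ParquetFileRow {
    pub id: i64,
    /// `None` while the file is live; `Some(t)` once the file was marked
    /// as deleted at time `t` (nanoseconds since the epoch in this sketch).
    pub to_delete: Option<i64>,
}

impl ParquetFileRow {
    pub fn new(id: i64) -> Self {
        Self { id, to_delete: None }
    }

    /// Record when the file was marked as deleted instead of storing `true`.
    /// Assumption: a second call keeps the original mark time.
    pub fn flag_for_delete(&mut self, now_ns: i64) {
        if self.to_delete.is_none() {
            self.to_delete = Some(now_ns);
        }
    }

    /// With a time instead of a boolean, a cleanup job could select only
    /// files that were marked strictly before some cutoff.
    pub fn is_deletable(&self, cutoff_ns: i64) -> bool {
        matches!(self.to_delete, Some(t) if t < cutoff_ns)
    }
}
```

Storing a time rather than a flag keeps the "is it deleted" check (`to_delete.is_some()`) while also allowing age-based selection.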
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-23 18:47:27 +00:00
Marco Neumann
51da6dd7fa
feat: store sort key in NG metadata ( #4110 )
...
The sort key is optional and currently only produced by `iox_tests`.
Writing it within the ingester/compactor is tracked by #3968 . The sort
key is read by the querier (and this will be verified by the query tests
and is required to merge #4103 ).
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-23 18:24:46 +00:00
Andrew Lamb
7f2c2fde2c
fix: fix all in one mode argument handling so it can start ( #4115 )
...
* fix: fix all in one mode argument handling
* fix: clippy
* fix: fmt
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-23 18:16:22 +00:00
Marco Neumann
c33ef79375
test: improve query test runner output ( #4112 )
...
- prints more previous/expected values when failing (instead of just
emitting an `Err` which will be debug-printed)
- fixed newline handling (i.e. do not add additional newlines in
`PrintlnWriter::write`)
Before:
```text
Running scenario 'Two chunks: NG Chunk Parquet; NG Chunk Parquet'
SQL: '"SELECT * from information_schema.tables;"'
thread 'cases::test_cases_sql_information_schema_sql' panicked at 'test failed: ScenarioMismatch { scenario_name: "Two chunks: NG Chunk Parquet; NG Chunk Parquet", previous_results: ["+---------------+--------------------+---------------------+------------+", "| table_catalog | table_schema | table_name | table_type |", "+---------------+--------------------+---------------------+------------+", "| public | information_schema | columns | VIEW |", "| public | information_schema | tables | VIEW |", "| public | iox | h2o | BASE TABLE |", "| public | iox | o2 | BASE TABLE |", "| public | system | chunk_columns | BASE TABLE |", "| public | system | chunks | BASE TABLE |", "| public | system | columns | BASE TABLE |", "| public | system | operations | BASE TABLE |", "| public | system | persistence_windows | BASE TABLE |", "| public | system | queries | BASE TABLE |", "+---------------+--------------------+---------------------+------------+"], current_results: ["+---------------+--------------------+------------+------------+", "| table_catalog | table_schema | table_name | table_type |", "+---------------+--------------------+------------+------------+", "| public | information_schema | columns | VIEW |", "| public | information_schema | tables | VIEW |", "| public | iox | h2o | BASE TABLE |", "| public | iox | o2 | BASE TABLE |", "+---------------+--------------------+------------+------------+"] }', query_tests/src/cases.rs:169:10
stack backtrace:
0: rust_begin_unwind
at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:498:5
```
After:
```text
Running scenario 'Two chunks: NG Chunk Parquet; NG Chunk Parquet'
SQL: '"SELECT * from information_schema.tables;"'
Answers produced by scenario Two chunks: NG Chunk Parquet; NG Chunk Parquet differ from previous answer
previous:
+---------------+--------------------+---------------------+------------+
| table_catalog | table_schema | table_name | table_type |
+---------------+--------------------+---------------------+------------+
| public | information_schema | columns | VIEW |
| public | information_schema | tables | VIEW |
| public | iox | h2o | BASE TABLE |
| public | iox | o2 | BASE TABLE |
| public | system | chunk_columns | BASE TABLE |
| public | system | chunks | BASE TABLE |
| public | system | columns | BASE TABLE |
| public | system | operations | BASE TABLE |
| public | system | persistence_windows | BASE TABLE |
| public | system | queries | BASE TABLE |
+---------------+--------------------+---------------------+------------+
current:
+---------------+--------------------+------------+------------+
| table_catalog | table_schema | table_name | table_type |
+---------------+--------------------+------------+------------+
| public | information_schema | columns | VIEW |
| public | information_schema | tables | VIEW |
| public | iox | h2o | BASE TABLE |
| public | iox | o2 | BASE TABLE |
+---------------+--------------------+------------+------------+
thread 'cases::test_cases_sql_information_schema_sql' panicked at 'test failed: ScenarioMismatch { scenario_name: "Two chunks: NG Chunk Parquet; NG Chunk Parquet", previous_results: ["+---------------+--------------------+---------------------+------------+", "| table_catalog | table_schema | table_name | table_type |", "+---------------+--------------------+---------------------+------------+", "| public | information_schema | columns | VIEW |", "| public | information_schema | tables | VIEW |", "| public | iox | h2o | BASE TABLE |", "| public | iox | o2 | BASE TABLE |", "| public | system | chunk_columns | BASE TABLE |", "| public | system | chunks | BASE TABLE |", "| public | system | columns | BASE TABLE |", "| public | system | operations | BASE TABLE |", "| public | system | persistence_windows | BASE TABLE |", "| public | system | queries | BASE TABLE |", "+---------------+--------------------+---------------------+------------+"], current_results: ["+---------------+--------------------+------------+------------+", "| table_catalog | table_schema | table_name | table_type |", "+---------------+--------------------+------------+------------+", "| public | information_schema | columns | VIEW |", "| public | information_schema | tables | VIEW |", "| public | iox | h2o | BASE TABLE |", "| public | iox | o2 | BASE TABLE |", "+---------------+--------------------+------------+------------+"] }', query_tests/src/cases.rs:169:10
stack backtrace:
0: rust_begin_unwind
at /rustc/9d1b2106e23b1abd32fce1f17267604a5102f57a/library/std/src/panicking.rs:498:5
```
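The newline handling mentioned above can be illustrated with a small line-buffering writer. This is a sketch under assumptions, not the actual `PrintlnWriter`: the type `LineSink` is hypothetical, and it collects lines into a `Vec<String>` instead of printing them so the behaviour is easy to inspect.

```rust
use std::io::{self, Write};

/// Hypothetical writer that emits one entry per complete input line and
/// never appends a newline of its own.
#[derive(Debug, Default)]
pub struct LineSink {
    /// Bytes received so far that are not yet terminated by `\n`.
    partial: Vec<u8>,
    /// Completed lines, without their trailing newline.
    pub lines: Vec<String>,
}

impl Write for LineSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.partial.extend_from_slice(buf);
        // Emit every complete line; keep the unterminated rest buffered.
        while let Some(pos) = self.partial.iter().position(|b| *b == b'\n') {
            let line: Vec<u8> = self.partial.drain(..=pos).collect();
            let text = String::from_utf8_lossy(&line[..line.len() - 1]).into_owned();
            self.lines.push(text);
        }
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // Emit whatever is left, even without a trailing newline.
        if !self.partial.is_empty() {
            let text = String::from_utf8_lossy(&self.partial).into_owned();
            self.lines.push(text);
            self.partial.clear();
        }
        Ok(())
    }
}
```

A writer that instead printed each `write` call with `println!` would split or pad lines whenever the caller wrote partial lines, which is the kind of extra newline the fix avoids.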
2022-03-23 18:06:09 +00:00
kodiakhq[bot]
32adb37591
Merge pull request #4049 from influxdata/cn/get-tombstones
...
feat: Add tombstones to parquet files for compaction
2022-03-23 14:28:20 +00:00
kodiakhq[bot]
58bfab5a8c
Merge branch 'main' into cn/get-tombstones
2022-03-23 14:18:41 +00:00
Paul Dix
4f5321d19b
feat: add compactor configuration for kafka topic and sequencers ( #4107 )
2022-03-23 14:11:47 +00:00
Carol (Nichols || Goulding)
c3a8834970
test: Add a test for add_tombstones_to_groups
2022-03-23 09:56:27 -04:00
Carol (Nichols || Goulding)
080156aa27
fix: Only do one catalog query for tombstones per group of parquet files
...
The query will get all tombstones that could be relevant to the group;
then associate subsets of the results with each parquet file.
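The one-query-per-group approach described above might look roughly like this. It is a sketch only: the types and the function `attach_tombstones` are hypothetical, the catalog query is modelled as a closure, and "relevant" is assumed to mean overlapping time ranges (the real relevance rules may involve more than that).

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Tombstone {
    pub id: i64,
    pub min_time: i64,
    pub max_time: i64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ParquetFile {
    pub id: i64,
    pub min_time: i64,
    pub max_time: i64,
}

/// A parquet file wrapped with the tombstones that apply to it.
#[derive(Debug, Clone, PartialEq)]
pub struct FileWithTombstones {
    pub file: ParquetFile,
    pub tombstones: Vec<Tombstone>,
}

/// Runs `query` once for the time range covering the whole group, then
/// hands each file the subset of results that overlaps its own range.
/// `FnOnce` makes the "one query per group" property part of the signature.
pub fn attach_tombstones<F>(group: Vec<ParquetFile>, query: F) -> Vec<FileWithTombstones>
where
    F: FnOnce(i64, i64) -> Vec<Tombstone>,
{
    let min = match group.iter().map(|f| f.min_time).min() {
        Some(min) => min,
        None => return Vec::new(), // empty group: no query needed
    };
    let max = group.iter().map(|f| f.max_time).max().unwrap_or(min);
    let candidates = query(min, max);

    group
        .into_iter()
        .map(|file| {
            let tombstones: Vec<Tombstone> = candidates
                .iter()
                .filter(|t| t.min_time <= file.max_time && file.min_time <= t.max_time)
                .cloned()
                .collect();
            FileWithTombstones { file, tombstones }
        })
        .collect()
}
```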
2022-03-23 09:56:26 -04:00
Carol (Nichols || Goulding)
2749c37d02
fix: Query for tombstones in a time range, not for a particular parquet file
...
The compactor at this point is still querying for each file; this is an
intermediate step.
2022-03-23 09:52:00 -04:00
Carol (Nichols || Goulding)
4d2e71c03e
feat: Wrap parquet files with their relevant tombstones
2022-03-23 09:52:00 -04:00
Carol (Nichols || Goulding)
87dc2981f6
feat: Query for tombstones relevant to a parquet file
...
Connects to #3948 .
2022-03-23 09:52:00 -04:00
Luke Bond
e109fa4987
feat: schema client and CLI ( #4105 )
...
* feat: schema client and CLI
* chore: clarification in comment in schema command
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-23 13:49:24 +00:00
dependabot[bot]
8ee9b793f5
chore(deps): Bump hyper from 0.14.17 to 0.14.18 ( #4109 )
...
Bumps [hyper](https://github.com/hyperium/hyper ) from 0.14.17 to 0.14.18.
- [Release notes](https://github.com/hyperium/hyper/releases )
- [Changelog](https://github.com/hyperium/hyper/blob/master/CHANGELOG.md )
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.17...v0.14.18 )
---
updated-dependencies:
- dependency-name: hyper
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-23 13:12:43 +00:00
dependabot[bot]
36071e2d12
chore(deps): Bump log from 0.4.14 to 0.4.16 ( #4108 )
...
Bumps [log](https://github.com/rust-lang/log ) from 0.4.14 to 0.4.16.
- [Release notes](https://github.com/rust-lang/log/releases )
- [Changelog](https://github.com/rust-lang/log/blob/master/CHANGELOG.md )
- [Commits](https://github.com/rust-lang/log/commits )
---
updated-dependencies:
- dependency-name: log
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-23 13:01:30 +00:00
Marco Neumann
5ae1e2fecf
refactor: make query tests less OG-specific
2022-03-23 12:04:32 +01:00
Marco Neumann
89206e013c
test: run SOME query tests for querier ( #4098 )
...
This includes some type changes to dispatch between OG and NG and allows
some tests to be run against the NG querier. This only covers parquet
files though, so the scope is somewhat limited.
For #3934 .
2022-03-22 17:39:19 +00:00
Nga Tran
c3ef56588f
feat: use creation time to check level upgradable ( #4094 )
...
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-22 13:51:18 +00:00
Paul Dix
b18b18afd9
fix: have ingester use single mutable batch for buffer ( #4095 )
...
Removed some unnecessary tests as they no longer apply with the new buffer structure. This will hopefully reduce the memory footprint of the ingesters significantly.
Closes #4072
2022-03-22 13:42:52 +00:00
Nga Tran
886f9dc8c1
feat: split compacted data into 2 compacted sets ( #4088 )
...
* feat: split compacted data into 2 compacted sets
* chore: clean up
* refactor: address review comments
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-22 13:28:32 +00:00
Andrew Lamb
b83b000590
chore: Update datafusion ( #4071 )
...
* chore: update to datafusion 5936edc2a94d5fb20702a41eab2b80695961b9dc
* chore: Update apis to match datafusion changes
2022-03-22 13:17:41 +00:00
Luke Bond
b098828c97
feat: schema grpc server & proto in router2 ( #4081 )
...
* feat: schema grpc server & proto in router2
* chore: comments in schema proto
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-22 11:27:20 +00:00
Marco Neumann
c9908b260c
refactor: dyn-dispatch database in query subsystem ( #4083 )
...
* refactor: dyn-dispatch database in query subsystem
This is similar to #4080 but concerns the database itself.
For #3934 .
* docs: improve wording
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-22 09:15:52 +00:00
Luke Bond
9ec45f5aec
Revert "fix: propagate shutdown into QuerierHandlerImpl" ( #4090 )
...
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-03-21 18:53:06 +00:00
Luke Bond
24e03deb5b
Revert "fix: propagate shutdown into CompactorHandler" ( #4091 )
2022-03-21 18:43:24 +00:00
Luke Bond
3c2775f8f2
fix: teach semantic commits script about GH revert PRs ( #4092 )
2022-03-21 17:37:28 +00:00
Marco Neumann
55643945a1
refactor: `querier` w/o `db` ( #4063 )
...
* feat: `TombstoneRepo::list_by_table`
* feat: `ParquetFileRepo::list_by_table_not_to_delete`
* refactor: `querier` w/o `db`
Get the `querier` to work w/o relying on `db`. A few notes:
- Testing is kinda shallow, we really need to get `query_tests` working
w/ `querier` (see #3934 ).
- We still run a sync loop for namespaces, tables and schemas. This will
be replaced by "update namespace incl. tables and schemas on demand".
Note however that we cannot fetch single tables and schemas on demand
at the moment, because DataFusion doesn't implement async schema
inspection (only `scan` / "give me all the chunks" is async). I think
that's OK for now and we can address this later.
- There is NO cache for parquet files and tombstones at the moment. For
correctness, they need to be fetched in a single transaction (or we
need a kinda tricky sequence number / logical clock tracking) and I am
not sure yet how this makes sense when we have the ingester data wired
up and predicates pushed down to the catalog (see next point). So
let's measure first and then decide on a caching strategy for this.
- Predicates are currently NOT pushed down to the catalog. I'll need to
figure out how to extract time range from generic DataFusion
expressions to make that work (it's easier for InfluxRPC queries, but
they are not tested at the moment, see first point).
Sorry that this commit is kinda huge. I initially planned to only
migrate the chunks away from `db` and leave the tables and schemas for a
follow-up PR, but the DataFusion trait structure (chunks are bound to
their tables) makes this kinda pointless.
Closes #3974 .
* docs: explain what we're doing
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: mention tracking issues
* docs: explain what we're doing
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-03-21 16:58:00 +00:00
dependabot[bot]
a23efce408
chore(deps): Bump kube-derive from 0.69.1 to 0.70.0 ( #4073 )
...
* chore(deps): Bump kube-derive from 0.69.1 to 0.70.0
Bumps [kube-derive](https://github.com/kube-rs/kube-rs ) from 0.69.1 to 0.70.0.
- [Release notes](https://github.com/kube-rs/kube-rs/releases )
- [Changelog](https://github.com/kube-rs/kube-rs/blob/master/CHANGELOG.md )
- [Commits](https://github.com/kube-rs/kube-rs/compare/0.69.1...0.70.0 )
---
updated-dependencies:
- dependency-name: kube-derive
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore(deps): Bump kube-runtime from 0.69.1 to 0.70.0
Bumps [kube-runtime](https://github.com/kube-rs/kube-rs ) from 0.69.1 to 0.70.0.
- [Release notes](https://github.com/kube-rs/kube-rs/releases )
- [Changelog](https://github.com/kube-rs/kube-rs/blob/master/CHANGELOG.md )
- [Commits](https://github.com/kube-rs/kube-rs/compare/0.69.1...0.70.0 )
---
updated-dependencies:
- dependency-name: kube-runtime
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
* chore: upgrade kube to version 0.70
* chore: hakari
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
2022-03-21 15:32:45 +00:00
kodiakhq[bot]
d3d628fcf3
Merge pull request #4070 from influxdata/cn/update-catalog
...
feat: Update the catalog for a completed compaction set
2022-03-21 14:28:49 +00:00
Carol (Nichols || Goulding)
201ced1d66
test: Mark a parquet file deleted in the update catalog operation
2022-03-21 10:16:58 -04:00
Carol (Nichols || Goulding)
dbca54d917
refactor: Move add parquet file and tombstones within update catalog
...
This should never be done on its own so doesn't really need to be its
own method. We also don't do anything with the returned data, so no need
to allocate those vectors.
2022-03-21 10:16:58 -04:00
Carol (Nichols || Goulding)
2fea10dfd7
feat: Mark old compacted parquet files to be deleted in transaction
...
Connects to #3952
2022-03-21 10:16:58 -04:00
Carol (Nichols || Goulding)
5b294968a5
feat: Add processed tombstone records with compacted parquet file
...
In a transaction when the parquet file is added to the catalog.
Connects to #3952 .
2022-03-21 10:16:57 -04:00
Carol (Nichols || Goulding)
b983b24fcf
fix: Adding processed tombstones to catalog only needs tombstone ID
2022-03-21 10:16:57 -04:00
Carol (Nichols || Goulding)
8fd3d85634
refactor: Move add_parquet_file_with_tombstones from ingester to compactor
2022-03-21 10:16:57 -04:00
Carol (Nichols || Goulding)
933dc69ecf
feat: For each compacted data set, persist new parquet file to object store ( #4058 )
...
* feat: Rearrange skeleton functions for split/persist/catalog update
* feat: Persist compacted files to object storage
Fixes #3951 .
* docs: Add comment about batches' schemas
2022-03-21 14:16:03 +00:00
Marco Neumann
0779f81b6b
refactor: rework `TableCache` ( #4054 )
...
* feat: `TableRepo::get_by_namespace_and_name`
* refactor: rework `TableCache`
- dual cache that can also map table names to IDs
- deal w/ missing tables w/o panics
- set proper timeouts to missing data
For #3974 .
* test: extend table cache tests
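A rough sketch of what a dual cache could look like, assuming plain `HashMap`s: the type `DualTableCache` and its methods are hypothetical, IDs are plain integers, and the timeout handling for missing data mentioned above is left out.

```rust
use std::collections::HashMap;

/// Hypothetical cache that can be queried in both directions.
#[derive(Debug, Default)]
pub struct DualTableCache {
    /// table ID -> table name
    name_by_id: HashMap<i64, String>,
    /// (namespace ID, table name) -> table ID
    id_by_name: HashMap<(i64, String), i64>,
}

impl DualTableCache {
    /// Record a table under both keys so the two maps stay in sync.
    pub fn insert(&mut self, namespace_id: i64, table_id: i64, name: &str) {
        self.name_by_id.insert(table_id, name.to_string());
        self.id_by_name
            .insert((namespace_id, name.to_string()), table_id);
    }

    /// Missing tables yield `None` rather than a panic.
    pub fn name(&self, table_id: i64) -> Option<&str> {
        self.name_by_id.get(&table_id).map(String::as_str)
    }

    /// Look up a table ID by namespace and name; `None` if unknown.
    pub fn id(&self, namespace_id: i64, name: &str) -> Option<i64> {
        self.id_by_name
            .get(&(namespace_id, name.to_string()))
            .copied()
    }
}
```

Returning `Option` from both lookups pushes the "table does not exist" case to the caller instead of panicking inside the cache.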
2022-03-21 13:40:06 +00:00
kodiakhq[bot]
26a7a61d0a
Merge pull request #4080 from influxdata/crepererum/issue3934d
...
refactor: dyn-dispatch chunks in query subsystem
2022-03-21 12:47:28 +00:00
Marco Neumann
d1df95df87
refactor: dyn-dispatch chunks in query subsystem
...
- this is what DataFusion is doing as well; it's also fast enough
because the number of chunks in a query is not THAT massive (it's not
like we are doing row-level dyn dispatching)
- it simplifies abstracting over different databases
- it allows us to drop our enum-based dispatching that we have for
`DbChunk` and that we would also need for the querier (e.g. depending
on if a chunk is backed by a parquet file or ingester data)
- it likely speeds up compile times because `query` no longer
contains massive amounts of generic code
For #3934 .
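To illustrate the trade-off listed above, here is a small sketch of trait-object dispatch over chunks. The trait and struct names are simplified stand-ins and not the real `query` crate API; only the pattern (one `Arc<dyn Trait>` type instead of an enum or generics) is the point.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Simplified, hypothetical chunk interface.
pub trait QueryChunk: Debug + Send + Sync {
    fn id(&self) -> u32;
    fn kind(&self) -> &'static str;
}

/// Chunk backed by a parquet file (illustrative only).
#[derive(Debug)]
pub struct ParquetChunk {
    pub id: u32,
}

/// Chunk backed by ingester data (illustrative only).
#[derive(Debug)]
pub struct IngesterChunk {
    pub id: u32,
}

impl QueryChunk for ParquetChunk {
    fn id(&self) -> u32 {
        self.id
    }
    fn kind(&self) -> &'static str {
        "parquet"
    }
}

impl QueryChunk for IngesterChunk {
    fn id(&self) -> u32 {
        self.id
    }
    fn kind(&self) -> &'static str {
        "ingester"
    }
}

/// Non-generic consumer: it is compiled once and works for any chunk type,
/// with one virtual call per chunk rather than per row.
pub fn describe(chunks: &[Arc<dyn QueryChunk>]) -> Vec<String> {
    chunks
        .iter()
        .map(|c| format!("{}:{}", c.id(), c.kind()))
        .collect()
}
```

Adding another chunk source then only needs a new `impl QueryChunk`, with no enum variant to add and no generic parameter to thread through callers.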
2022-03-21 12:47:54 +01:00
dependabot[bot]
cd36229e27
chore(deps): Bump synchronized-writer from 1.1.10 to 1.1.11 ( #4075 )
...
Bumps [synchronized-writer](https://github.com/magiclen/synchronized-writer ) from 1.1.10 to 1.1.11.
- [Release notes](https://github.com/magiclen/synchronized-writer/releases )
- [Commits](https://github.com/magiclen/synchronized-writer/compare/v1.1.10...v1.1.11 )
---
updated-dependencies:
- dependency-name: synchronized-writer
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-21 10:53:11 +00:00
dependabot[bot]
66cf34c2e2
chore(deps): Bump tokio-rustls from 0.23.2 to 0.23.3 ( #4074 )
...
Bumps [tokio-rustls](https://github.com/tokio-rs/tls ) from 0.23.2 to 0.23.3.
- [Release notes](https://github.com/tokio-rs/tls/releases )
- [Commits](https://github.com/tokio-rs/tls/commits )
---
updated-dependencies:
- dependency-name: tokio-rustls
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-21 10:43:07 +00:00
kodiakhq[bot]
c543b61f6c
Merge pull request #4079 from influxdata/crepererum/issue3934c
...
refactor: steps towards dynamic database type dispatch
2022-03-21 10:17:37 +00:00
Marco Neumann
ca152e7934
refactor: avoid generics in `QueryDatabase`
...
A step to make this trait object-safe.
Ref #3934 .
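As a sketch of why removing generics helps here: a trait whose methods take generic type parameters cannot be used as `dyn Trait`, while the same method taking a trait object can. The trait below is a simplified stand-in and not the real `QueryDatabase` definition; `Predicate`, `TableEquals` and `InMemoryDb` are hypothetical.

```rust
/// Hypothetical predicate interface used in place of a generic parameter.
pub trait Predicate {
    fn matches(&self, table: &str) -> bool;
}

pub struct TableEquals(pub String);

impl Predicate for TableEquals {
    fn matches(&self, table: &str) -> bool {
        self.0 == table
    }
}

/// Simplified stand-in for a database trait. A method such as
/// `fn table_names<P: Predicate>(&self, p: &P)` would make the trait
/// unusable as `dyn QueryDatabase`; taking `&dyn Predicate` keeps it
/// object-safe.
pub trait QueryDatabase {
    fn table_names(&self, predicate: &dyn Predicate) -> Vec<String>;
}

pub struct InMemoryDb {
    pub tables: Vec<String>,
}

impl QueryDatabase for InMemoryDb {
    fn table_names(&self, predicate: &dyn Predicate) -> Vec<String> {
        self.tables
            .iter()
            .filter(|t| predicate.matches(t.as_str()))
            .cloned()
            .collect()
    }
}

/// Only compiles because `QueryDatabase` is object-safe.
pub fn into_dyn(db: InMemoryDb) -> Box<dyn QueryDatabase> {
    Box::new(db)
}
```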
2022-03-21 10:45:05 +01:00
Marco Neumann
0071b85c22
refactor: make `ExecutionContextProvider` object-safe
...
Ref #3934 .
2022-03-21 10:40:53 +01:00
dependabot[bot]
836aecc7ad
chore(deps): Bump libc from 0.2.120 to 0.2.121 ( #4076 )
...
Bumps [libc](https://github.com/rust-lang/libc ) from 0.2.120 to 0.2.121.
- [Release notes](https://github.com/rust-lang/libc/releases )
- [Commits](https://github.com/rust-lang/libc/compare/0.2.120...0.2.121 )
---
updated-dependencies:
- dependency-name: libc
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-21 09:28:39 +00:00
kodiakhq[bot]
0e5dc716e3
Merge pull request #4061 from influxdata/crepererum/issue3934b
...
refactor: make `QueryChunk` object-safe
2022-03-19 07:05:51 +00:00
kodiakhq[bot]
67939fb37d
Merge branch 'main' into crepererum/issue3934b
2022-03-19 06:56:30 +00:00
kodiakhq[bot]
c75be65a46
Merge pull request #4067 from influxdata/dom/router-precision
...
feat(router2): write timestamp precision
2022-03-18 17:45:02 +00:00