Commit Graph

124 Commits (6f7cb5ccf05544fb3efe40b0f7148046d12f34d3)

Author SHA1 Message Date
dependabot[bot] 1d38d400f0
chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339)
* chore(deps): Bump object_store from 0.5.1 to 0.5.2

Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-06 07:53:54 +00:00
Marco Neumann cd6a8a1a82
refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics (#6313)
* refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics

Closes #6310.

* refactor: rename and tune default exec mem limits

* fix: ingester2 bits after rebase
2022-12-05 12:38:28 +00:00
Nga Tran dd1755b23a
feat: querier filters data outsude retnetion period (#6209) 2022-11-22 15:41:00 +00:00
dependabot[bot] 04c00bbb62
chore(deps): Bump bytes from 1.2.1 to 1.3.0 (#6199)
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/commits)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 08:23:24 +00:00
Nga Tran 49a9565240
feat: gRPC that creates namespace (#6103)
* feat: create namespace API call in router

Co-authored-by: Nga Tran <nga-tran@live.com>

* chore: treat retention as ns except in CLI

* fix: overflow in nanosecond calc

* fix: retention test after changing it from hours to ns

* chore: comment clarification in cli; better response type for error in ns API

* fix: correct some rebase mistakes

* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const

* fix: ns autocreation test was wrong after rebase

* fix: mem catalog has default 1hr retention, accidently removed in rebase

* chore: remove mem catalogs default 1hr retention; make it settable in sets & router

Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 13:02:12 +00:00
Nga Tran 6f7b1e2e26
feat: reject writes that are outside the retention period (#6148)
* feat: reject writes that are outside the retention period

* feat: add retention validator into handler stack

* chore: Apply suggestions from code review

Co-authored-by: Dom <dom@itsallbroken.com>

* refactor: address review comments

* test: unit tests fot retention validation

* chore: address review comments

* test: more unit tests and integration tests

* refactor: make time inside retention period for emphemeral_mode test

* fix: 2 hours

Co-authored-by: Dom <dom@itsallbroken.com>
2022-11-17 20:55:58 +00:00
Andrew Lamb 448911794c
test: test coverage for sorting and merging in compactor (#6136)
* test: test coverage for sorting and merging in compactor

* fix: Apply suggestions from code review (comments)

Co-authored-by: Marco Neumann <marco@crepererum.net>

* feat: use itertools to cover all permutations

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-15 20:39:45 +00:00
dependabot[bot] a969754819
chore(deps): Bump chrono from 0.4.22 to 0.4.23 (#6129)
* chore(deps): Bump chrono from 0.4.22 to 0.4.23

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.22 to 0.4.23.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.22...v0.4.23)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* refactor: chrono future compat

Integer->timstamp conversions should not silently panic.

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-14 13:34:09 +00:00
Nga Tran 9c4266c503
refactor: first step to remove unused retention_duration (#6113)
* refactor: first step to remove unused retention_duration

* refactor: remove retenion_duration from update catalog

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-11 15:21:06 +00:00
Andrew Lamb 4fb2843d05
refactor: Rename `schema::selection::Selection` to `schema::projection::Projection` (#6037)
* chore: Rename `schema::selection::Selection` to `schema::projection::Projection`

* fix: docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 18:15:04 +00:00
Marco Neumann 45b3984aa3
refactor: simplify `QueryChunk` data access (#6015)
* refactor: simplify `QueryChunk` data access

We have only two types for chunks (now that the RUB is gone):

1. In-memory RecordBatches
2. Parquet files

Loads of logic is duplicated in the different `read_filter`
implementations. Also `read_filter` hides a solid amount of logic from
DataFusion, which will prevent certain (future) optimizations. To enable #5897
and to simplify the interface, let the chunks return the data (batches
or metadata for parquet files) directly and let `iox_query` perform the
actual heavy-lifting.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 08:18:33 +00:00
dependabot[bot] b1572c50a6
chore(deps): Bump once_cell from 1.15.0 to 1.16.0 (#6009)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.15.0 to 1.16.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.15.0...v1.16.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:23:40 +00:00
Carol (Nichols || Goulding) 3145e2c05b
feat: Use workspace dep inheritance for the arrow crate 2022-10-26 10:34:29 -04:00
Carol (Nichols || Goulding) 44936f661a
feat: Use workspace dep inheritance for datafusion instead of shim crate 2022-10-26 10:33:56 -04:00
Carol (Nichols || Goulding) 2e83e04eab
feat: Use workspace package metadata to reduce differences and repetition 2022-10-24 13:04:09 -04:00
Marco Neumann e0062f2d40
refactor: do NOT use fake DF context for parquet reading (#5942)
Use the proper top-level DataFusion context and register the object
store there.

Note that we still hide the `ParquetExec` behind an opaque record batch
stream. Fixing that is next on my list.

Helps with #5897.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-24 08:20:26 +00:00
dependabot[bot] 9fb1433317
chore(deps): Bump futures from 0.3.24 to 0.3.25 (#5938)
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.24 to 0.3.25.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.24...0.3.25)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-21 11:03:20 +00:00
Marco Neumann 42b89ade03
refactor: use `SendableRecordBatchStream` to write parquets (#5911)
Use a proper typed stream instead of peeking the first element. This is
more in line with our remaining stack and shall also improve error
handling.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-19 12:59:53 +00:00
Marco Neumann eb5a661ab3
refactor: prep work for #5897 (#5907)
* refactor: add ID to `ParquetStorage`

* refactor: remove duplicate code

* refactor: use dedicated `StorageId`
2022-10-19 11:54:42 +00:00
Andrew Lamb d706f8221d
chore: Update datafusion and arrow / parquet / arrow-flight 25.0.0 (#5900)
* chore: Update datafusion and  `arrow` / `parquet` / `arrow-flight` 25.0.0

* chore: Update for structure changes

* chore: Update for new projection pushdown

* chore: Run cargo hakari tasks

* fix: fmt

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-18 20:58:47 +00:00
Carol (Nichols || Goulding) efb964c390
feat: Enforce table column limits from the schema cache (#5819)
* fix: Avoid some allocations by collecting instead of inserting into a vec

* refactor: Encode that adding columns is for one table at a time

* test: Add another test of column limits

* test: Add below/above limit tests for create_or_get_many

* fix: Explicitly DO NOT check column limits when inserting many columns

* feat: Cache the max_columns_per_table on the NamespaceSchema

* feat: Add a function to validate column limits in-memory

* fix: Provide more useful information when over column limits

* fix: Swap types to remove intermediate allocation

* docs: Explain the interactions of the cache and the column limits

* test: Actually set up test that showcases column limit race condition

* fix: Allow writing to existing columns even if table is over column limit

Co-authored-by: Dom <dom@itsallbroken.com>
2022-10-14 11:34:17 +00:00
Andrew Lamb d57c99638c
chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 (#5792)
* chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0

* fix: Update for coercion, fix explain plans for change in column name display

* chore: Update datafusion lock

* fix: Update for other API changes

* chore: Update to latest datafusion pin

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-12 16:19:14 +00:00
dependabot[bot] 933493fab3
chore(deps): Bump object_store from 0.5.0 to 0.5.1
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.0 to 0.5.1.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.0...object_store_0.5.1)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-10-11 01:19:10 +00:00
Jake Goulding 8187b003c3
refactor: improve panic message 2022-09-29 12:33:26 -04:00
Dom Dwyer cd4087e00d style: add no todo!() or dbg!() lints
Some crates had theme, some not - lets be consistent and have the
compiler spot dbg!() and todo!() macro calls - they should never be in
prod code!
2022-09-29 13:10:07 +02:00
Andrew Lamb 66dbb9541f
chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to 23.0.0, `thrift` to 0.16.0 (#5694)
* chore: Update datafusion and `arrow`/`parquet`/`arrow-flight`  to 23.0.0

* chore: Update thrift / remove parquet_format

* fix: Update APIs

* chore: Update lock + Run cargo hakari tasks

* fix: use patched version of arrow-rs to work around https://github.com/apache/arrow-rs/issues/2779

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-27 12:50:54 +00:00
dependabot[bot] 0d18943ad2
chore(deps): Bump once_cell from 1.14.0 to 1.15.0 (#5701)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.14.0 to 1.15.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.14.0...v1.15.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-21 05:20:55 +00:00
Carol (Nichols || Goulding) c0c0349bc5
fix: Use typed Time values rather than ns 2022-09-19 12:59:20 -04:00
Carol (Nichols || Goulding) a284cebb51
refactor: Store estimated bytes on the CompactorParquetFile 2022-09-15 11:10:14 -04:00
Andrew Lamb f86d3e31da
chore: Update datafusion + object_store (#5619)
* chore: Update datafusion pin

* chore: update object_store to 0.5.0

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-13 12:34:54 +00:00
Andrew Lamb 1fd31ee3bf
chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0 (#5591)
* chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0

* fix: enable dynamic comparison flag

* chore: derive Eq for clippy

* chore: update explain plans

* chore: Update sizes for ReadBuffer encoding

* chore: update more tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-12 17:45:03 +00:00
Carol (Nichols || Goulding) 85fb0acea6
refactor: Extract read_parquet_file test helper function to iox_tests::utils 2022-09-07 13:21:28 -04:00
YIXIAO SHI 52ae60bf2e
chore: fix comment typo (#5551)
Co-authored-by: Dom <dom@itsallbroken.com>
2022-09-07 08:49:29 +00:00
dependabot[bot] 366c4d9965
chore(deps): Bump once_cell from 1.13.1 to 1.14.0 (#5555)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.13.1 to 1.14.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.13.1...v1.14.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-06 09:02:28 +00:00
Marco Neumann f45cbfb88d
refactor: fine-grained file size mocking (#5541)
* refactor: do not override parquet file size in querier

This is going to be an issue when we actually rely on the size for
reading, see #5531.

* refactor: use selected file size mocking in compactor

Do not blindly override parquet file sizes for all subsystems.

This is going to be an issue when we actually rely on the size for
reading, see #5531.

* refactor: remove ability to override file sizes in catalog

Blindly overriding data for all subsystems is dangerous, because some
parts of our stack actually rely on the actual file size. See #5531.

* docs: explain `size_overrides`
2022-09-05 08:50:04 +00:00
Carol (Nichols || Goulding) 62b8819d49
fix: Carry object store ID through test builder, but pick new every time 2022-08-31 14:58:46 -04:00
Carol (Nichols || Goulding) a9d664d0bf
feat: Add a way to set the row count on Parquet file catalog entries
And only allow setting this when no record batch or line protocol is
specified so that there isn't a way to create a parquet file with data
that has a mismatched row count.
2022-08-31 14:36:42 -04:00
Carol (Nichols || Goulding) c21ac9050b
refactor: Extract a test util fn that will only create parquet file catalog records 2022-08-31 14:00:00 -04:00
Andrew Lamb 6669d85fb4
chore: Update datafusion + arrow/parquet to `21.0.0` (#5519)
* chore: Update arrow/arrow-flight/parquet to 21.0.0

* chore: Update datafusion pin

* chore: Fix arrow update script

* chore: Update Cargo.lock

* chore: Update for new API
2022-08-31 13:30:47 +00:00
dependabot[bot] 0137db9adc
chore(deps): Bump futures from 0.3.23 to 0.3.24
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.23 to 0.3.24.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.23...0.3.24)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-08-30 01:24:21 +00:00
Carol (Nichols || Goulding) 58f0b63cdc
refactor: Rename KafkaTopic to Topic or TopicMetadata or topic name as appropriate 2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding) 74c9529062
fix: Rename KafkaPartition to ShardIndex 2022-08-29 14:07:18 -04:00
Carol (Nichols || Goulding) fe9c474620
fix: rustfmt 2022-08-29 14:06:45 -04:00
Carol (Nichols || Goulding) 698f1a47ff
refactor: Rename test structures from sequencer to shard where appropriate 2022-08-29 14:06:44 -04:00
Jake Goulding 4abf21c724
refactor: Rename Sequencer (and its entourage) to Shard 2022-08-29 14:06:43 -04:00
Andrew Lamb 7f0ae53d6f
chore: Update to (almost) released object_store 0.4.0 (#5419)
* chore: update object_store

* chore: update hakari config

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-17 13:44:48 +00:00
dependabot[bot] 78665d3092
chore(deps): Bump once_cell from 1.13.0 to 1.13.1 (#5413)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.13.0 to 1.13.1.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.13.0...v1.13.1)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-17 08:31:46 +00:00
dependabot[bot] 0a44a42999
chore(deps): Bump futures from 0.3.21 to 0.3.23 (#5394)
Bumps [futures](https://github.com/rust-lang/futures-rs) from 0.3.21 to 0.3.23.
- [Release notes](https://github.com/rust-lang/futures-rs/releases)
- [Changelog](https://github.com/rust-lang/futures-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/futures-rs/compare/0.3.21...0.3.23)

---
updated-dependencies:
- dependency-name: futures
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-15 12:19:17 +00:00
Andrew Lamb 16ddc5efc6
chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360)
* chore: Update datafusion and arrow

* chore: Update Cargo.lock

* chore: update to Decimal128

* chore: Update tonic/prost/pbjson/etc

* chore: Run cargo hakari tasks

* fix: doctest in generated types

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 17:30:44 +00:00
Andrew Lamb 9215a534d0
chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` (#5229)
* chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0`

* chore: Run cargo hakari tasks

* fix: Update for API changes

* fix: clippy

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-28 08:10:47 +00:00