Commit Graph

8955 Commits (439bcf08d98066c167b3b144d4caaecd2853babe)

Author SHA1 Message Date
dependabot[bot] 55e1e2ec2b
chore(deps): Bump thiserror from 1.0.31 to 1.0.32 (#5294)
* chore(deps): Bump thiserror from 1.0.31 to 1.0.32

Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.31 to 1.0.32.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.31...1.0.32)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 16:20:36 +00:00
Marco Neumann 039950b4fd feat: ensure clean compactor executor shutdown 2022-08-03 18:07:00 +02:00
Marco Neumann fd74f2639b fix: do not attempt to poll future lists in compactor
It seems that the buffering / parallelization code cannot deal with
empty lists and just freezes forever (which blocks shutdown but will
also freeze the compactor forever).
2022-08-03 18:04:05 +02:00
Marco Neumann 4bd8977d55 refactor: add some main function debug logs 2022-08-03 18:00:28 +02:00
Andrew Lamb 6011c4cd1f
chore: update datafusion pin (#5290)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 15:43:18 +00:00
Marco Neumann 840e4801b8
feat: make querier RAM pool split a proper feature (#5283)
* feat: make querier RAM pool split a proper feature

- use propre pool names
- expose sizing via CLI/env

Closes https://github.com/influxdata/conductor/issues/1102.

* refactor: improve naming and docs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 15:27:23 +00:00
Andrew Lamb d0f88c664c
docs: improve some docstrings in `iox_query` (#5291)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 15:20:12 +00:00
Nga Tran ee151c8b41
fix: make batch size back to 1024 to see if the OOM in the compactor go away (#5289)
* fix: make batch size back to 1024 to see if the OOM in the compactor go away

* fix: address review comments

* chore: Apply suggestions from code review

Co-authored-by: Marco Neumann <marco@crepererum.net>

* fix: import needed constant

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 15:13:04 +00:00
Marco Neumann 663a20d743
refactor: remove `--ingster-address` (#5255)
Closes #5002.
2022-08-03 15:05:01 +00:00
Nga Tran 471b8be92f
chore: Revert "refactor: bump batch size (#5251)" (#5288)
This reverts commit bb172f8fa8.
2022-08-03 14:23:45 +00:00
Marco Neumann 9fbc95c3ad
feat: add sequencer reset count metric and log to ingester (#5286)
Split out from #5253.

Helps with #5128.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 13:00:36 +00:00
dependabot[bot] 5fd59fcc43
chore(deps): Bump syn from 1.0.98 to 1.0.99 (#5284)
Bumps [syn](https://github.com/dtolnay/syn) from 1.0.98 to 1.0.99.
- [Release notes](https://github.com/dtolnay/syn/releases)
- [Commits](https://github.com/dtolnay/syn/compare/1.0.98...1.0.99)

---
updated-dependencies:
- dependency-name: syn
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-03 12:50:56 +00:00
Marco Neumann 273b3cc165
chore: replace `dotenv` with `dotenvy` (#5285)
The latter one is a maintained fork. This avoids having both crates
after #5282.
2022-08-03 12:41:38 +00:00
dependabot[bot] 7c67b93015
chore(deps): Bump sqlx from 0.6.0 to 0.6.1 (#5282)
Bumps [sqlx](https://github.com/launchbadge/sqlx) from 0.6.0 to 0.6.1.
- [Release notes](https://github.com/launchbadge/sqlx/releases)
- [Changelog](https://github.com/launchbadge/sqlx/blob/main/CHANGELOG.md)
- [Commits](https://github.com/launchbadge/sqlx/commits)

---
updated-dependencies:
- dependency-name: sqlx
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 12:19:16 +00:00
Marco Neumann 772dd858a8
feat: support for regex comparisons in storage CLI (#5281)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 10:08:44 +00:00
dependabot[bot] 94fe5b4c10
chore(deps): Bump paste from 1.0.7 to 1.0.8 (#5280)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.7 to 1.0.8.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.7...1.0.8)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 09:03:25 +00:00
Stuart Carnie 964062b40c
feat: Add information about profiling using Instruments on macOS (#5275) 2022-08-03 08:45:33 +00:00
Marko Mikulicic a4e2f880be
feat: Expose a C API for the IOx LP parser (#5267)
Can be useful to call the IOx LP parser from other processes, for example from Go.
I used it to run an online comparison of IOx and influxdb Go LP parser in order to identify compatibility
issues.
2022-08-02 15:44:41 +00:00
Marco Neumann 8e2443d879
feat: use two RAM pools in querier (#5271)
Quick&Dirty implementation of a RAM-pool split to see if this has any
effect. I expect the querier performance to improve due to this because
large read buffers can no longer evict precious metadata.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-02 15:14:26 +00:00
Nga Tran 4812db9887
feat: fewer buckets but larger ranges for compaction duration histogram (#5259)
* chore: reduce log info

* feat: fewer buckets but larger ranges for compaction duration histogram

* chore: Apply suggestions from code review

Co-authored-by: Marko Mikulicic <mkm@influxdata.com>

* chore: run fmt after appying reviewer's suggestions

Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-02 14:19:30 +00:00
Andrew Lamb 9c9658ca38
test(influxdb_line_protocol): add value verification test (#5270) 2022-08-02 11:18:09 +00:00
Marco Neumann ee491cbbfc
fix: re-enable querier read buffer cache (#5268)
This reverts commit 82913743f1 / #5252.

I misjudged the cache hit ratio for the RB, see
https://github.com/influxdata/k8s-infra/pull/4548

So let's bring back the RB cache until we have some form of parquet
cache in place.
2022-08-02 08:37:30 +00:00
dependabot[bot] e57ae07db7
chore(deps): Bump serde from 1.0.140 to 1.0.141 (#5260)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.140 to 1.0.141.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.140...v1.0.141)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-02 07:52:06 +00:00
Marco Neumann a8f6d579c8
feat: add metric for predicate-based cache entry removal (#5257) 2022-08-02 07:44:53 +00:00
Marco Neumann fec6b18d80
feat: add metric for TTL cache expiration (#5256)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-02 07:00:30 +00:00
Marko Mikulicic 7d15bd6029
Merge pull request #5265 from influxdata/lp_fp
fix: Fix bug and incompatibility in floating point parsing of scientific notation
2022-08-02 06:23:16 +02:00
Marko Mikulicic 84a856069b fix: Scientific notation without + or -
Closes #5264
2022-08-02 05:46:28 +02:00
Marko Mikulicic a926996485 fix: Negative scientific notation without decimal parts
Closes #5263
2022-08-02 05:40:55 +02:00
Nga Tran 8f1b6f2465
chore: reduce log info (#5254)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-01 16:00:34 +00:00
Marco Neumann 82913743f1
refactor: disable querier read buffer cache (#5252)
Let's try and see how this performs in prod.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-01 15:43:22 +00:00
Marco Neumann bb172f8fa8
refactor: bump batch size (#5251)
This is what DataFusion uses by default and I don't see a reason why we
should use such small batch sizes.

The affect is probably only visible in certain filter-aggregate queries
that don't focus on a single series (because there we likely end up with
1 or 2 batches only, esp. after #5250) for coarse-grained filters, esp.
  when the filter key is not the first sort key.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-01 13:49:58 +00:00
Marco Neumann b12ebe1109
fix: do not panic on invalid timestamp ranges (#5249)
Timestamp ranges come from "untrusted" inputs (via gRPC) and must not
lead to panics. The only case where this could happen is at `start >
end`. Let's just set `start = end` in this case. Reaonsing:

- Semantically this is a sound range, since this is only a somewhat
  degenerated case of "empty".
- We already allow `start = end` to represent "empty" ranges.
- We already clamp (and therefore modify) `start` to the valid range.

Fixes https://github.com/influxdata/conductor/issues/1080.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-01 13:35:34 +00:00
Marco Neumann 3ae89de324
refactor: increase batch size when reading parquet (#5250)
* refactor: increase batch size when reading parquet

This reduced our overhead when reading parquet files quite a lot.

In some internal benchmark, this reduces the size to perform a single
series aggregation of a rather large series with cold caches from 58s to
48s for cold caches. No real difference could be measured for warm
caches (~21ms for both).

This should also help the compactor since the record batches should be
larger.

* refactor: ensure that parquet row group size is in-sync

Ensure that we use the same row group size for reading and writing
parquet files. This is the same value as upstream currently uses as a
default, but let's make sure we don't diverge from that:

3032a521c9/parquet/src/file/properties.rs (L65)

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-01 10:31:26 +00:00
dependabot[bot] de340cfa53
chore(deps): Bump sqlparser from 0.18.0 to 0.19.0 (#5237)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.18.0 to 0.19.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.18.0...v0.19.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-01 08:58:32 +00:00
dependabot[bot] fbd39844d8
chore(deps): Bump async-trait from 0.1.56 to 0.1.57 (#5247)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.56 to 0.1.57.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.56...0.1.57)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-01 08:30:33 +00:00
dependabot[bot] d0db73d8de
chore(deps): Bump clap from 3.2.15 to 3.2.16 (#5239)
Bumps [clap](https://github.com/clap-rs/clap) from 3.2.15 to 3.2.16.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.2.15...v3.2.16)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-01 08:00:47 +00:00
dependabot[bot] a687844ddf
chore(deps): Bump bytes from 1.2.0 to 1.2.1 (#5244)
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.2.0 to 1.2.1.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/commits)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-01 07:50:06 +00:00
Andrew Lamb 7cc8486e5a
fix: remove left over `deb!` macro (#5224)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-30 15:33:02 +00:00
Marco Neumann 0e9695f202
feat: add a few helpful compactor debug logs (#5235)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-28 17:38:33 +00:00
Marco Neumann 87bdabb38a
feat: log external span for query gRPC requests (#5187)
* feat: log external span for query gRPC requests

This should simplify the correlation with our binlog data.

* refactor: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-28 12:53:12 +00:00
Sam Arnold 3fbe860bb9
fix: interpret [MIN_NANO_TIME, MAX_NANO_TIME) range as all time for optimization (#5231)
InfluxQL queries can send (technically incorrect) ranges like this, meaning all time
but excluding the max nanosecond time.

Since this is an important case, we should handle it specially and use the optimized
'all time' handling for meta queries even though this is technically wrong in that
it does not filter out column names / measurement names at MAX_NANO_TIME exactly.

Closes: https://github.com/influxdata/conductor/issues/1072

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-28 12:24:26 +00:00
Andrew Lamb 9215a534d0
chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` (#5229)
* chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0`

* chore: Run cargo hakari tasks

* fix: Update for API changes

* fix: clippy

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-28 08:10:47 +00:00
Nga Tran fcce00bf09
feat: run many compact partitions in parallel (#5230)
* feat: run many compact partitions in parallel

* refactor: Use rust futures fu to run compactor jobs in parallel

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-27 20:55:45 +00:00
Andrew Lamb 7eebe061a6
fix: reduce log verbosity for `found compaction candidates` message (#5225)
* fix: reduce log verbosity

* refactor: sleep for a sec if no work, print debug

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-27 19:35:31 +00:00
Marko Mikulicic 9da8062a16
fix: Fix typo in log message (#5222)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-27 15:34:37 +00:00
Marco Neumann 9a9a1a4777
feat: limit per-table chunk data for every query (#5223)
* feat: `QueryChunk::as_any`

* feat: allo `ChunkPruner::prune_chunks` to fail

* feat: limit per-table chunk data for every query

Closes #5211.

* fix: address review comments

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-27 13:20:05 +00:00
Marko Mikulicic 6d01a9ad68
fix: Clarify error msg: line number is 1-based (#5219)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-27 08:37:28 +00:00
Marco Neumann d7ab7362fd
refactor: avoid schema copies in `select_schema` (#5214)
This massively helps with #5202.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-27 08:30:26 +00:00
Marco Neumann 85c186f5b8
feat: cache projected chunk schemas in querier (#5213)
* feat: cache projected chunk schemas in querier

Ref #5202.

* refactor: simplify size calculations

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-27 08:23:20 +00:00
Marko Mikulicic 2bbc419f95
fix: Tell which column failed typecheck (#5220) 2022-07-27 08:16:15 +00:00