Commit Graph

8955 Commits (439bcf08d98066c167b3b144d4caaecd2853babe)

Author SHA1 Message Date
Dom Dwyer 130785977f ci: ignore RUSTSEC-2022-0048
XML parsing lib for the Azure SDK is unmaintained and reportedly
contains integer overflow / panic issues in the parsing functionality.

Low risk ignore as it is used when talking to Azure only. The Azure SDK
is in the process of being removed as a dependency.
2022-08-29 13:47:04 +02:00
Marco Neumann 1b230d9291
refactor: clarify async/runtime requirement in cache system (#5483)
Instead of using a "fake async" function to ensure that we have a
running tokio runtime, use an explicit handle.
2022-08-29 09:51:25 +00:00
dependabot[bot] 05e599228b
chore(deps): Bump sqlparser from 0.21.0 to 0.22.0 (#5482)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.21.0 to 0.22.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.21.0...v0.22.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-29 09:07:19 +00:00
dependabot[bot] 9e2fc11d04
chore(deps): Bump time from 0.3.13 to 0.3.14 (#5472)
Bumps [time](https://github.com/time-rs/time) from 0.3.13 to 0.3.14.
- [Release notes](https://github.com/time-rs/time/releases)
- [Changelog](https://github.com/time-rs/time/blob/main/CHANGELOG.md)
- [Commits](https://github.com/time-rs/time/compare/v0.3.13...v0.3.14)

---
updated-dependencies:
- dependency-name: time
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-29 08:42:28 +00:00
dependabot[bot] 14a3215b7b
chore(deps): Bump lock_api from 0.4.7 to 0.4.8 (#5481)
Bumps [lock_api](https://github.com/Amanieu/parking_lot) from 0.4.7 to 0.4.8.
- [Release notes](https://github.com/Amanieu/parking_lot/releases)
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Amanieu/parking_lot/compare/lock_api-0.4.7...lock_api-0.4.8)

---
updated-dependencies:
- dependency-name: lock_api
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-29 08:08:27 +00:00
dependabot[bot] 6b81d4c0be
chore(deps): Bump comfy-table from 6.0.0 to 6.1.0 (#5480)
Bumps [comfy-table](https://github.com/nukesor/comfy-table) from 6.0.0 to 6.1.0.
- [Release notes](https://github.com/nukesor/comfy-table/releases)
- [Changelog](https://github.com/Nukesor/comfy-table/blob/main/CHANGELOG.md)
- [Commits](https://github.com/nukesor/comfy-table/compare/v6.0.0...v6.1.0)

---
updated-dependencies:
- dependency-name: comfy-table
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-29 07:38:37 +00:00
Nga Tran 283e908132
test: workaround for time > a number (#5477)
* test: workaround for time > a number

* chore: cargo update

* chore: Revert "chore: cargo update"

This reverts commit 0798e4e14674267ddd2308b12a25031fc35de8b6.
2022-08-26 20:52:12 +00:00
Carol (Nichols || Goulding) 5c20d6248f
chore: Upgrade lz4-sys due to RUSTSEC-2022-0051 (#5479)
https://rustsec.org/advisories/RUSTSEC-2022-0051
2022-08-26 20:30:19 +00:00
Stuart Carnie b08655a952
feat: Parse InfluxQL identifiers (#5425)
* feat: Parse InfluxQL identifiers

Closes #5424

* chore: Add common derive implementations

* feat: Implement fmt::Display trait for Identifier

* feat: Display implementation, nouns for combinator functions

* chore: Update docs

* chore: Double quoted are identifiers, single quoted are literals

Single quoted strings will be parsed in a separate module.
2022-08-25 23:03:12 +00:00
Marco Neumann e3e72c69f2
test: extend cache-system policy integration tests (#5448)
This is partly for #5318 but also a follow-up to #5320.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-25 15:44:38 +00:00
Luke Bond 3950ca3a17
feat: upsert partition & update sort key for each day in bulk ingest (#5447)
* feat: upsert partition & update sort key for each day in bulk ingest

feat: import schema now supports earliest/latest time merging
chore: tests & tidying up for bulk ingest catalog update

* fix: always sort time last in PK in import schema update catalog

* chore: additional test for computing sort key in bulk ingest

* chore: bulk import catalog update gets sequencer from sharder service

chore: import update schema tests refactor using sharder svc mock

* chore: dead code fix

* chore: import schema sequencer lookup test

* chore: clarifying comment in import schema catalog update
2022-08-25 10:47:12 +00:00
Dom 0c2ed1f19b
Merge pull request #5465 from influxdata/dom/impl-router-sharding-api
feat(router): sharding lookup gRPC service
2022-08-25 11:32:11 +01:00
Dom Dwyer 7afa3bfaec feat: expose ShardService over gRPC
Plumbs in the ShardService impl, and exposes it over the router's gRPC
interface.
2022-08-24 15:47:26 +02:00
Dom Dwyer 3594e5d095 refactor: impl Sharder for Arc-wrapped Sharder
Allows any Sharder wrapped in an Arc to be used as a Sharder impl,
allowing the same Sharder instance to be shared across independent code
modules.
2022-08-24 15:46:33 +02:00
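
The Arc-wrapped Sharder refactor (3594e5d095) can be pictured with a blanket impl. This is a minimal sketch, not the repository's code: the trait shape, the `FixedSharder` type and the `route` function are all hypothetical stand-ins chosen for illustration.

```rust
use std::sync::Arc;

// Hypothetical, simplified trait. The real Sharder trait may use different
// type parameters and method signatures.
pub trait Sharder<P> {
    type Item;
    fn shard(&self, table: &str, namespace: &str, payload: &P) -> Self::Item;
}

// Blanket impl: an Arc around any Sharder is itself a Sharder, so one
// instance can be shared across independent modules.
impl<T, P> Sharder<P> for Arc<T>
where
    T: Sharder<P>,
{
    type Item = T::Item;

    fn shard(&self, table: &str, namespace: &str, payload: &P) -> Self::Item {
        // Deref through the Arc and delegate to the wrapped sharder.
        (**self).shard(table, namespace, payload)
    }
}

// Toy sharder (assumption): always returns the same shard index.
pub struct FixedSharder(pub u32);

impl Sharder<()> for FixedSharder {
    type Item = u32;

    fn shard(&self, _table: &str, _namespace: &str, _payload: &()) -> u32 {
        self.0
    }
}

// A consumer that takes ownership of "some Sharder"; thanks to the blanket
// impl it accepts an Arc<FixedSharder> as well as a bare FixedSharder.
pub fn route<S: Sharder<(), Item = u32>>(sharder: S, table: &str) -> u32 {
    sharder.shard(table, "ns", &())
}
```

With this in place, `route(Arc::clone(&shared), "cpu")` and `route(shared, "mem")` both use the same underlying `FixedSharder`.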
Dom Dwyer da31f31406 feat: ShardService gRPC endpoint handler
Implements the ShardService to expose the shard mapping produced by
routers to external clients.

This impl uses an internal cache to eliminate unnecessary Catalog
queries, as the Kafka partition/Sequencer/Shard mapping is static once a
router has initialised.
2022-08-24 15:46:33 +02:00
Dom Dwyer baacc015d3 feat: impl Sharder<()> for JumpHash
Implement a payload-less Sharder for the JumpHash concrete type. This
models the input the gRPC ShardService will provide.
2022-08-24 15:45:34 +02:00
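
Going by its name, JumpHash presumably builds on the jump consistent hash algorithm. The sketch below shows that algorithm together with a payload-less lookup keyed only on (table, namespace). The function names, the use of `DefaultHasher` and the `u32` shard index are assumptions for illustration and are not taken from the repository.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Jump consistent hash: maps `key` to a bucket in `0..num_buckets`.
/// Panics if `num_buckets` is zero.
pub fn jump_hash(mut key: u64, num_buckets: u32) -> u32 {
    assert!(num_buckets > 0, "at least one bucket is required");
    let mut b: i64 = -1;
    let mut j: i64 = 0;
    while j < i64::from(num_buckets) {
        b = j;
        // Linear congruential step producing the next pseudo-random value.
        key = key.wrapping_mul(2862933555777941757).wrapping_add(1);
        // Jump forward to the next candidate bucket.
        j = ((b + 1) as f64 * ((1u64 << 31) as f64 / ((key >> 33) + 1) as f64)) as i64;
    }
    b as u32
}

/// Hypothetical payload-less lookup: only the table and namespace names
/// decide the shard index.
pub fn shard_for(table: &str, namespace: &str, num_shards: u32) -> u32 {
    let mut hasher = DefaultHasher::new();
    table.hash(&mut hasher);
    namespace.hash(&mut hasher);
    jump_hash(hasher.finish(), num_shards)
}
```

A useful property of this algorithm is that growing from `n` to `n + 1` buckets either leaves a key where it was or moves it to the new bucket `n`.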
Dom ae12efbf0a
Merge pull request #5463 from influxdata/dom/infallible-sharder-cleanup
refactor: infallible JumpHash initialisation
2022-08-24 14:43:41 +01:00
Dom Dwyer abf26767c1 refactor: infallible JumpHash initialisation
This doesn't really need to be fallible but forces propagation of a ton
of error handling - no shards is always a sign of something being very
wrong, and can be caught in the caller if it's for some reason an
acceptable state / can be recovered from.
2022-08-24 13:18:57 +02:00
kodiakhq[bot] af9a68a853
Merge pull request #5462 from influxdata/dom/update-shardservice-grpc
refactor(proto): update ShardService gRPC definition
2022-08-24 09:51:05 +00:00
Dom bd15a7366f
Merge branch 'main' into dom/update-shardservice-grpc
2022-08-24 10:44:12 +01:00
Dom Dwyer c6d4109e07 build: generate gRPC bindings for ShardService
Builds the ShardService proto file in the generated_types package.
2022-08-24 11:39:59 +02:00
Dom Dwyer f11af90c46 refactor(proto): simplify RPC messages & types
Removes the input oneof - a shard caller MUST always provide a
table/namespace, and MAY provide an optional payload (which in the
future will enable sharding using column values, etc.). As there is
currently no payload-based sharding, this simplifies the RPC message.

Changes the returned types to better reflect the types we use internally
- this should avoid type juggling for both server & client.
2022-08-24 11:39:59 +02:00
Marko Mikulicic 99daa13897
test: Test dotenvy regression (#5461)
2022-08-24 09:39:55 +00:00
Marko Mikulicic 4beb721a9a
fix: Revert Bump dotenvy from 0.15.1 to 0.15.2 (#5450) (#5455)
This reverts commit 84acbd2fad.

Closes #5454

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-24 09:10:09 +00:00
Nga Tran 3220c6f88b
feat: add file_count_threshold for compacting cold partitions (#5456)
* feat: add file_count_threshold for compacting cold partitions, consistent with the hot case, to make it easier to configure the compactor to avoid OOM

* chore: remove unnecessary comments
2022-08-23 20:12:21 +00:00
Dom e1cbc23a0f
Merge pull request #5452 from influxdata/dom/router-sequencer-types
refactor(router): use KafkaPartition in sequencer
2022-08-23 15:17:14 +01:00
Dom Dwyer 9b920f1cbb refactor(router): use KafkaPartition in sequencer
The Sequencer (which will be renamed shortly) is a type that represents
a single sequencer/shard/kafka partition in the router.

In order to minimise confusion with all the various IDs floating around,
we have a KafkaPartition - this commit changes the Sequencer to return
the Kafka partition index as a typed value, rather than a usize to help
eliminate any inconsistencies.

As a side effect of these conversion changes, I've tightened up the
casting to ensure we assert on any overflows - we juggle a lot of
numeric types!
2022-08-23 16:02:11 +02:00
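
The typed partition index and the overflow-asserting casts described in 9b920f1cbb could look roughly like the sketch below. The inner `i32`, the `get` accessor and `partition_from_index` are assumptions made for illustration; the repository's `KafkaPartition` may be defined differently.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Hypothetical newtype for a Kafka partition index. Wrapping the integer
/// stops it from being mixed up with other numeric IDs.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct KafkaPartition(i32);

impl KafkaPartition {
    /// Returns the raw partition index.
    pub fn get(self) -> i32 {
        self.0
    }
}

impl TryFrom<usize> for KafkaPartition {
    type Error = TryFromIntError;

    /// Checked conversion: fails instead of silently truncating.
    fn try_from(index: usize) -> Result<Self, Self::Error> {
        i32::try_from(index).map(Self)
    }
}

/// Converts a `usize` index, panicking if it does not fit. This mirrors the
/// "assert on any overflows" behaviour the commit message describes.
pub fn partition_from_index(index: usize) -> KafkaPartition {
    KafkaPartition::try_from(index).expect("partition index overflows i32")
}
```

Compared with a bare `as i32` cast, the checked conversion turns an out-of-range index into an explicit error or panic rather than a wrapped value.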
kodiakhq[bot] 9bd2b9aa12
Merge pull request #5451 from influxdata/dom/router-sharding-api
feat: sharder API definition
2022-08-23 12:19:34 +00:00
kodiakhq[bot] 8edd886bb9
Merge branch 'main' into dom/router-sharding-api
2022-08-23 12:12:39 +00:00
dependabot[bot] 84acbd2fad
chore(deps): Bump dotenvy from 0.15.1 to 0.15.2 (#5450)
Bumps [dotenvy](https://github.com/allan2/dotenvy) from 0.15.1 to 0.15.2.
- [Release notes](https://github.com/allan2/dotenvy/releases)
- [Changelog](https://github.com/allan2/dotenvy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/allan2/dotenvy/commits/v0.15.2)

---
updated-dependencies:
- dependency-name: dotenvy
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-23 11:24:42 +00:00
Dom Dwyer 57bbe6b216 feat: sharder API definition
This commit adds a gRPC endpoint for callers to map (table, namespace)
tuples to Sequencer IDs, using the logic internal to the router.

Reference:
    https://github.com/influxdata/influxdb_iox/pull/5447#pullrequestreview-1080574538
2022-08-23 13:21:59 +02:00
Dom 8cc6724714
Merge pull request #5427 from influxdata/dom/kafka-remove-dml-merging
feat: simple RecordAggregator for write buffer
2022-08-22 15:24:36 +01:00
Dom 12abba5d37
Merge branch 'main' into dom/kafka-remove-dml-merging
2022-08-22 12:04:44 +01:00
Dom Dwyer 8b054c14a8 test: update batching tests for new aggregator
Previously aggregated writes were merged into a single Kafka Record -
this meant that all merged ops would be placed into the same Record, and
therefore receive the same sequence number once published to Kafka.

The new aggregator batches at the Record level, therefore aggregated
writes now get their own distinct sequence number. This commit updates
the batching tests to reflect this new sequence number assignment
behaviour.
2022-08-22 12:59:43 +02:00
Dom Dwyer 6d6fc9a08b test: reduce timestamp precision for comparisons
Reduce the precision of timestamps in tests before comparing the DML
metadata objects.

This allows tests to accept different timestamp precisions, such as when
ops pass "through" Kafka vs. files, etc.
2022-08-22 12:58:03 +02:00
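
Reducing timestamp precision before a comparison (6d6fc9a08b) can be sketched as follows. The commit message does not say which precision the tests use, so millisecond truncation and the helper name `truncate_to_millis` are assumptions.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Drops everything below millisecond precision so that timestamps which
/// travelled through stores with different resolutions compare equal.
pub fn truncate_to_millis(t: SystemTime) -> SystemTime {
    let since_epoch = t
        .duration_since(UNIX_EPOCH)
        .expect("timestamp is before the Unix epoch");
    // as_millis() discards the sub-millisecond remainder.
    UNIX_EPOCH + Duration::from_millis(since_epoch.as_millis() as u64)
}
```

A test would apply the helper to both sides, for example `assert_eq!(truncate_to_millis(a), truncate_to_millis(b))`, instead of comparing the raw timestamps.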
Dom Dwyer 312def5acd refactor: assert writes partitioned
The previous aggregator impl would assert that writes had been
partitioned before aggregating them (or rather, that the DML write had a
partition key assigned).

This should be true for all writes passing through the write buffer,
irrespective of which aggregator is used, therefore this assert is moved
"up" into the write buffer itself.
2022-08-22 12:52:37 +02:00
Dom Dwyer a66d16576d refactor: use dyn TimeProvider in RecordAggregator
For ease of integration with the existing tests, use dyn TimeProvider in
the RecordAggregator.
2022-08-22 12:50:50 +02:00
Dom Dwyer 37727105b5 refactor: remove redundant timestamp conversions
Removes the existing, copy-pasted timestamp conversion code to remove
redundant conversions.
2022-08-22 11:06:36 +02:00
Marco Neumann 064606380b
feat: refresh policy for caches (#5431)
* feat: refresh policy for caches

For #5318 we want to have a policy that refreshes keys before they are
too old. I initially tried to fold both TTL and the refresh system into
a single policy but then decided that this would basically be two
policies in one with a harder-to-test interface. Semantically TTL and
refresh are also a bit different (but will usually be used together):

- **TTL:** Prevents a user from getting data that is too old. It is some kind
  of "soft correctness". In some sense this is related to the "remove
  if" policy where some part of the system knows for sure (or with
  reasonable likelihood) that a cache entry is outdated. Note that TTL's
  primary job is NOT to clean up memory from old keys (even though it
  indirectly does that). There is no reason cached entries should be
  removed except for correctness (TTL and remove-if) or resource
  pressure -- and the latter is handled by the LRU policy.
- **Refresh:** While TTL is some kind of deadline, we often have good
  reason to refresh the key before we pull the plug, namely when an
  entry is used and a bit old (but not too old). The concrete mechanism
  to achieve this is flexible. At the moment the policy is rather
  simple -- just start a refresh task if a key is old and we receive a
  GET request -- but can be extended in the future.

This also adds some integration tests for TTL+refresh. There will be
follow-up changes to test the interaction with LRU as well, although I am
pretty certain that there won't be any surprises due to the extensive
testing we have in place for the policy backend itself as well as all
the policies.

This change also does NOT integrate the refresh with the querier for the
sake of keeping the changeset "small" (i.e. it is already rather large).

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-08-22 08:45:22 +00:00
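
The difference between TTL and refresh described in 064606380b can be summarised as a decision made on each GET. The sketch below is illustrative only: the commit keeps TTL and refresh as two separate policies, whereas this folds both checks into one hypothetical function (`on_get`, with an assumed `Action` enum) purely to show the semantics.

```rust
use std::time::Duration;

/// What a cache could do with an entry when a GET arrives (hypothetical).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    /// Entry is fresh: serve it as is.
    Serve,
    /// Entry is a bit old: serve it, and start a background refresh.
    ServeAndRefresh,
    /// Entry has passed its TTL deadline: it must not be served.
    Expired,
}

/// Decides what to do with an entry of the given age.
/// Assumes `refresh_after` is shorter than `ttl`.
pub fn on_get(age: Duration, refresh_after: Duration, ttl: Duration) -> Action {
    if age >= ttl {
        Action::Expired
    } else if age >= refresh_after {
        Action::ServeAndRefresh
    } else {
        Action::Serve
    }
}
```

With `refresh_after` set to 30 s and `ttl` to 60 s, an entry read at 45 s is still served while a refresh is started, so an entry that is read regularly should rarely reach the TTL deadline.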
Stuart Carnie 2d795bd604
chore: cargo update (#5439)
* pest 2.2.1 → 2.3.0
* serde 1.0.143 → 1.0.144
* serde-json 1.0.83 → 1.0.85

pest_meta and pest_generator 2.2.1 were yanked
2022-08-22 05:15:48 +00:00
Andrew Lamb 35f99fe940
fix: fix intermittent failures in `data::tests::persist` (#5437)
* fix: fix intermittent failures in data::tests::persist

* fix: tweak comments and message

* fix: space
2022-08-19 21:16:00 +00:00
dependabot[bot] ed38b01e91
chore(deps): Bump sqlparser from 0.20.0 to 0.21.0 (#5429)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.20.0 to 0.21.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.20.0...v0.21.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-19 10:41:38 +00:00
Marco Neumann 5ce0618b8f
chore: `cargo update` (#5432)
```
cpufeatures v0.2.2 -> v0.2.3
plotters v0.3.2 -> v0.3.3
plotters-svg v0.3.2 -> v0.3.3
```

`plotters v0.3.2` was yanked.
2022-08-19 09:56:23 +00:00
Stuart Carnie b4e5895d7a
feat: Add influxdb_influxql_parser crate (#5415)
* feat: Add crate; parse quoted identifiers

* chore: Run cargo hakari tasks

* chore: satisfy linter

* chore: Use `test_helpers::Result`

* feat: Add all InfluxQL keywords

* chore: Update influxdb_influxql_parser/src/lib.rs

Co-authored-by: Marco Neumann <marco@crepererum.net>

* chore: PR feedback

* chore: PR Feedback, remove Result<()>

* chore: Update Cargo.lock

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
2022-08-18 23:09:45 +00:00
Dom Dwyer 59c2d84d1e refactor: use RecordAggregator
Replaces the DmlAggregator with the simpler RecordAggregator.

Metrics gathered as part of #5323 show there is practically no benefit
to the additional complexity of the DmlAggregator over the simpler
RecordAggregator impl.
2022-08-18 17:12:23 +02:00
Dom Dwyer 30e23f6e82 feat: simple RecordAggregator for write buffer
This commit adds a new write buffer aggregator used by rskafka to
increase the size of Kafka messages on the wire. The Kafka write buffer
impl is the only impl to perform aggregation.

This Aggregator impl maps IOx-specific DML operations to rskafka Records
with no additional processing - it can be thought of as an IOx-specific
adaptor over rskafka's RecordAggregator.

By delegating batching of Record instances to rskafka's simple
RecordAggregator, we minimise code complexity / bug surface area / LoC.
2022-08-18 11:42:58 +01:00
Marco Neumann d75df2b610
chore: `cargo update` (#5426)
```
bumpalo v3.10.0 -> v3.11.0
either v1.7.0 -> v1.8.0
iana-time-zone v0.1.45 -> v0.1.46
rustix v0.35.8 -> v0.35.9
```

`rustix` is important because `0.35.8` was yanked.
2022-08-18 08:53:00 +00:00
kodiakhq[bot] 8eb3a79d7f
Merge pull request #5348 from influxdata/cn/upgrade-l0-metrics
feat: Add metrics on the size of files created by ingestion and used for compaction
2022-08-17 16:08:59 +00:00
kodiakhq[bot] 2b3ca54168
Merge branch 'main' into cn/upgrade-l0-metrics
2022-08-17 16:01:42 +00:00
Luke Bond f4443f0b3a
feat: import schema override (#5420)
* chore: struct for overrides of import schema conflicts

* chore: import schema override shouldn't support tags

* feat: import schema merge can take an override schema

* fix: schema override in test had superfluous tag

* chore: test for batch schema merge with override in import schema

* feat: import schema merge now takes override schema
2022-08-17 14:59:50 +00:00