Commit Graph

10647 Commits (f6e7724d19444cedc550f44771113a8948279e8b)

Author SHA1 Message Date
Marko Mikulicic f6e7724d19
fix(compactor2): Update other locations of the TRANSITION_SHARD_INDEX (#6736) 2023-01-27 16:59:24 +00:00
kodiakhq[bot] ab643ffa8b
Merge pull request #6709 from influxdata/cn/read-filter
test: Port query_tests read_filter tests to end-to-end
2023-01-27 15:47:07 +00:00
Carol (Nichols || Goulding) 4f8dd072b3
fix: Translate a test with a predicate of a literal = literal 2023-01-27 10:28:43 -05:00
Carol (Nichols || Goulding) 94f7f015f4
fix: Port a test with a predicate that tag=tag, which is always true 2023-01-27 10:28:43 -05:00
Carol (Nichols || Goulding) a2b67abe54
fix: Remove test cases that aren't valid to port to end-to-end tests 2023-01-27 10:28:43 -05:00
Carol (Nichols || Goulding) 67c430da63
test: Port read_filter query_tests to end-to-end tests 2023-01-27 10:28:43 -05:00
Carol (Nichols || Goulding) 9d490ceb1a
feat: Add a method to create tag expressions ORed together 2023-01-27 10:28:42 -05:00
Carol (Nichols || Goulding) 7b94e545f1
feat: Change combine_predicate to take a logical operator
To enable building "OR" queries
2023-01-27 10:28:42 -05:00
Carol (Nichols || Goulding) b633b8b7a0
feat: Allow building a predicate that ANDs multiple nodes 2023-01-27 10:28:42 -05:00
Carol (Nichols || Goulding) d6bd6d5178
fix: Make regex_predicate function private; it's only used in this impl 2023-01-27 10:28:42 -05:00
Carol (Nichols || Goulding) 31e7925f47
refactor: Extract a function for making a comparison expression node 2023-01-27 10:28:42 -05:00
Carol (Nichols || Goulding) a400e212ec
refactor: Extract a function for making a string value node 2023-01-27 10:28:42 -05:00
Carol (Nichols || Goulding) 39cd34912b
refactor: Extract a function for making a tag ref node 2023-01-27 10:28:42 -05:00
Carol (Nichols || Goulding) c2c8524dd8
refactor: Extract a shared function for tag predicates 2023-01-27 10:28:42 -05:00
Marko Mikulicic aa9789049a
fix(iox): Use a transition shard id that doesn't overlap with legacy (#6733) 2023-01-27 14:23:40 +00:00
Andrew Lamb ead6812210
fix: reduce logging verbosity (#6704)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-27 13:53:42 +00:00
Dom b0c17ba036
Merge pull request #6732 from influxdata/dom/configurable-partition-key
feat(router): configurable partition key
2023-01-27 13:37:09 +00:00
Dom Dwyer 6797eab5fc
feat(router): configurable partition key
Allows the partition key to be set at runtime, though it's probably best
no one does so for now.
2023-01-27 14:26:18 +01:00
Dom 1de66ea56a
Merge pull request #6714 from influxdata/dom/service-limit-metric-labels
feat(metrics): separate service limit counters
2023-01-27 12:45:36 +00:00
Dom d8f80270bb
Merge branch 'main' into dom/service-limit-metric-labels 2023-01-27 12:31:01 +00:00
dependabot[bot] 13fc93a1b2
chore(deps): Bump either from 1.8.0 to 1.8.1 (#6724)
Bumps [either](https://github.com/bluss/either) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/bluss/either/releases)
- [Commits](https://github.com/bluss/either/compare/1.8.0...1.8.1)

---
updated-dependencies:
- dependency-name: either
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-27 10:12:49 +00:00
Dom 5757674d5e
Merge branch 'main' into dom/service-limit-metric-labels 2023-01-27 10:04:56 +00:00
Christopher M. Wolff c78088b043
fix: update clap parser for --ingester-addresses (#6723)
* fix: update clap parser for --ingester-addresses

* fix: make querier2 specify ingester addrs same as router2

* fix: update clap parser args but do not prepend http://

* chore: cargo fmt
2023-01-27 02:54:57 +00:00
Andrew Lamb 5ef9018f7e
refactor: Move sql script files from query_tests and into end to end query tests (#6708)
* refactor: Move sql script files from query_tests and into end to end query tests

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2023-01-26 19:49:21 +00:00
Andrew Lamb 589fbbf11c
chore: remove unnecessary checks for persisted in end to end tests (#6713)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-26 18:24:39 +00:00
Dom ccc5cdcd80
Merge pull request #6720 from influxdata/dom/fix-namespace-rpc
fix(router): reject new namespaces / deny namespace auto-creation
2023-01-26 17:03:39 +00:00
Dom Dwyer 8140313775
test: drive catalog in ns rejection test
Use the actual catalog resolver, not the mock to assert the correct
behaviour with a populated catalog.
2023-01-26 17:55:39 +01:00
Dom Dwyer 0aa5469ac6
test(e2e): explicit namespace creation
Adds an end-to-end test of the router's gRPC NamespaceService covering
creation and reading of new namespaces.
2023-01-26 17:32:12 +01:00
Dom Dwyer 7eaa8f59b0
fix: explicit namespace creation w/ existing ns
Prior to this commit, namespaces that had been created on one router
could not be used on another router until the latter was restarted.
Effectively, newly created namespaces couldn't be used.

After this commit, the catalog is also checked when a cache miss occurs,
ensuring the router discovers new, not-yet-cached namespaces.
2023-01-26 17:32:12 +01:00
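The cache-miss behaviour described in 7eaa8f59b0 can be sketched roughly as follows. This is a minimal illustration only: `Catalog`, `NamespaceResolver`, and `resolve` are hypothetical names, not the router's actual types, and the real catalog is a remote service rather than an in-memory map.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the shared catalog (assumption, not the IOx API).
struct Catalog {
    namespaces: HashMap<String, i64>,
}

/// Hypothetical per-router resolver holding a local namespace cache.
struct NamespaceResolver {
    cache: HashMap<String, i64>,
}

impl NamespaceResolver {
    /// Resolve a namespace name to its ID, falling back to the catalog
    /// when the local cache has no entry.
    fn resolve(&mut self, name: &str, catalog: &Catalog) -> Option<i64> {
        if let Some(id) = self.cache.get(name) {
            return Some(*id);
        }
        // Cache miss: the namespace may have been created through another
        // router, so consult the catalog before rejecting the request.
        let id = *catalog.namespaces.get(name)?;
        self.cache.insert(name.to_string(), id);
        Some(id)
    }
}
```

In this sketch, a namespace added to the catalog after the resolver was constructed is found on the next lookup and then cached, with no restart needed.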
Dom Dwyer 105e354299
refactor: clean up namespace errors
The namespace error was poorly refactored and duplicated the prefix
string. The "rejected" case is now also tested.
2023-01-26 17:32:11 +01:00
Dom Dwyer 3a9b5a4d29
fix: bind NamespaceService to gRPC server
I forgot to bind the service!
2023-01-26 17:32:11 +01:00
Dom Dwyer 1a7679bcee
refactor: expose underlying gRPC implementations
Changes the gRPC delegate to return the underlying service (type erased)
implementations instead of the RPC service wrappers.
2023-01-26 17:32:11 +01:00
Dom Dwyer ac8fa293cb
refactor(test): TestContext::write_lp() helper
Adds a helper method to construct the HTTP write request.
2023-01-26 17:32:10 +01:00
Dom Dwyer 6f1869f9dc
test(router): initialise gRPC delegate in e2e
Initialise the "rpc mode" gRPC handlers in the router e2e TestContext.
2023-01-26 17:32:10 +01:00
Dom Dwyer 3efc42baac
refactor(test): dedicated e2e TestContext module
Moves the router's TestContext to its own file/module.
2023-01-26 17:32:10 +01:00
Marco Neumann 4391e30d2d
feat: improve compactor2 debugging (#6718)
* feat: add planning logging wrapper

* refactor: split partitionS source and partition source into two components
2023-01-26 16:10:20 +00:00
Marco Neumann 68380a32e5
fix: "timeout" as a reason to skip a partition (#6716)
I meant to skip partitions w/ timeouts when I designed the
functionality but forgot to adjust the error filter accordingly. To not
run into this problem again (i.e. forget to adjust the filter), make the
code a bit more explicit.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-26 15:00:13 +00:00
Dom 8686b559ae
Merge pull request #6715 from influxdata/dom/rpc-namespace
fix(router): restore NamespaceService
2023-01-26 14:38:47 +00:00
Dom a489d03e1b
Merge branch 'main' into dom/rpc-namespace 2023-01-26 14:23:36 +00:00
Marco Neumann 30d411dc95
feat: shadow mode (#6712)
* refactor: remove untyped durations from `compactor2`

* feat: shadow mode

Closes #6645.

* refactor: split input and output store
2023-01-26 14:20:55 +00:00
Dom Dwyer c66f4a3d92
fix(router): restore NamespaceService
This was removed in the RPC variant of the router - no idea why, we
definitely should have it!
2023-01-26 15:10:22 +01:00
Dom 054101dabb
Merge branch 'main' into dom/service-limit-metric-labels 2023-01-26 13:52:10 +00:00
Dom Dwyer b6018e1c39
feat(metrics): separate service limit counters
Service limits are enforced on two values:

    * Number of tables in a namespace
    * Number of columns in a table

This commit labels the existing service limit hit metric with the type
of limit reached, and adds this information to the log lines emitted.
2023-01-26 14:48:33 +01:00
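The labelling described in b6018e1c39 can be pictured as one counter per limit type. The sketch below is an assumption-laden illustration: `LimitKind`, `ServiceLimitCounters`, and the label strings are hypothetical and do not come from the IOx metrics code.

```rust
use std::collections::HashMap;

/// The two service limits named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum LimitKind {
    /// Number of tables in a namespace.
    Table,
    /// Number of columns in a table.
    Column,
}

impl LimitKind {
    /// Hypothetical label value attached to the metric and the log line.
    fn label(self) -> &'static str {
        match self {
            LimitKind::Table => "table",
            LimitKind::Column => "column",
        }
    }
}

/// Hypothetical "limit hit" metric, split by the kind of limit reached.
#[derive(Default)]
struct ServiceLimitCounters {
    hits: HashMap<LimitKind, u64>,
}

impl ServiceLimitCounters {
    /// Record one rejected request for the given limit kind.
    fn record(&mut self, kind: LimitKind) {
        *self.hits.entry(kind).or_insert(0) += 1;
    }

    /// Current count for the given limit kind (zero if never hit).
    fn get(&self, kind: LimitKind) -> u64 {
        self.hits.get(&kind).copied().unwrap_or(0)
    }
}
```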
Dom 9b538f2c20
Merge pull request #6711 from influxdata/alamb/fix_encoding_again
fix: Do not send dictionary encoded data to clients
2023-01-26 12:29:36 +00:00
Andrew Lamb 6a0429584a
fix: update doc example 2023-01-26 06:59:34 -05:00
Andrew Lamb c100737a81
chore: Do not send dictionary encoded data to clients 2023-01-26 06:35:15 -05:00
Nga Tran b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692)
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier

* chore: cleanup

* feat: new column max_l0_created_at to order files for deduplication

* chore: cleanup

* chore: debug info for changing cpu.parquet

* fix: update test parquet file

Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
Marco Neumann ed694d3be4
feat: introduce scratchpad store for compactor (#6706)
* feat: introduce scratchpad store for compactor

Use an intermediate in-memory store (can be a disk later if we want) to
stage all inputs and outputs of the compaction. The reasons are:

- **fewer IO ops:** DataFusion's streaming IO requires slightly more
  IO requests (at least 2 per file) due to the way it is optimized to
  read as little as possible. It first reads the metadata and then
  decides which content to fetch. In the compaction case this is (esp.
  w/o delete predicates) EVERYTHING. So in contrast to the querier,
  there is no advantage to this approach. On the contrary, it easily adds
  100ms latency to every single input file.
- **less traffic:** For divide&conquer partitions (i.e. when we need to
  run multiple compaction steps to deal with them) it is kinda pointless
  to upload an intermediate result just to download it again. The
  scratchpad avoids that.
- **higher throughput:** We want to limit the number of concurrent
  DataFusion jobs because we don't wanna blow up the whole process by
  having too much in-flight arrow data at the same time. However while
  we perform the actual computation, we were waiting for object store
  IO. This was limiting our throughput substantially.
- **shadow mode:** De-coupling the stores in this way makes it easier to
  implement #6645.

Note that we assume here that the input parquet files are WAY SMALLER
than the uncompressed Arrow data during compaction itself.

Closes #6650.

* fix: panic on shutdown

* refactor: remove shadow scratchpad (for now)

* refactor: make scratchpad safe to use
2023-01-26 10:03:08 +00:00
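The "less traffic" argument in ed694d3be4 can be illustrated with a toy model. Everything here is a hypothetical sketch: `Store`, `Scratchpad`, and their methods are invented for illustration, and byte concatenation stands in for the real compaction step.

```rust
use std::collections::HashMap;

/// Hypothetical remote object store that counts its IO requests.
#[derive(Default)]
struct Store {
    objects: HashMap<String, Vec<u8>>,
    reads: usize,
    writes: usize,
}

impl Store {
    fn get(&mut self, path: &str) -> Option<Vec<u8>> {
        self.reads += 1;
        self.objects.get(path).cloned()
    }

    fn put(&mut self, path: &str, data: Vec<u8>) {
        self.writes += 1;
        self.objects.insert(path.to_string(), data);
    }
}

/// Hypothetical in-memory staging area for compaction inputs and outputs.
#[derive(Default)]
struct Scratchpad {
    staged: HashMap<String, Vec<u8>>,
}

impl Scratchpad {
    /// Fetch each input from the remote store once, in full.
    fn load(&mut self, remote: &mut Store, paths: &[&str]) {
        for &path in paths {
            if let Some(data) = remote.get(path) {
                self.staged.insert(path.to_string(), data);
            }
        }
    }

    /// Stand-in for compaction: concatenate staged inputs into a staged
    /// output. Intermediate results never leave the scratchpad.
    fn compact(&mut self, inputs: &[&str], output: &str) {
        let mut merged = Vec::new();
        for path in inputs {
            if let Some(data) = self.staged.get(*path) {
                merged.extend_from_slice(data);
            }
        }
        self.staged.insert(output.to_string(), merged);
    }

    /// Upload a staged file; returns false if it was never staged.
    fn upload(&self, remote: &mut Store, path: &str) -> bool {
        match self.staged.get(path) {
            Some(data) => {
                remote.put(path, data.clone());
                true
            }
            None => false,
        }
    }
}
```

In a two-step run under this model (compact `a` and `b` into `tmp`, then `tmp` and `c` into `out`), only the three inputs are read from the remote store and only `out` is written back; `tmp` is never uploaded or re-downloaded.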
Andrew Lamb 7853a19953
feat: JDBC integration tests with FlightSQL (#6693)
* feat: basic JDBC integration test

* fix: do not run test without env set

* docs: add maven link

* refactor: clean up java with switch statement
2023-01-25 22:21:18 +00:00
Andrew Lamb 2db8443a64
refactor: split flightsql crate into smaller modules (#6703)
* refactor: split flightsql crate into smaller modules

* refactor: automatically derive from Impl
2023-01-25 21:12:48 +00:00