influxdb

Commit Graph

Author	SHA1	Message	Date
Andrew Lamb	c100737a81	chore: Do not send dictionary encoded data to clients	2023-01-26 06:35:15 -05:00
Nga Tran	b8a80869d4	feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692 ) * feat: introduce a new way of max_sequence_number for ingester, compactor and querier * chore: cleanup * feat: new column max_l0_created_at to order files for deduplication * chore: cleanup * chore: debug info for chnaging cpu.parquet * fix: update test parquet file Co-authored-by: Marco Neumann <marco@crepererum.net>	2023-01-26 10:52:47 +00:00
Marco Neumann	ed694d3be4	feat: introduce scratchpad store for compactor (#6706 ) * feat: introduce scratchpad store for compactor Use an intermediate in-memory store (can be a disk later if we want) to stage all inputs and outputs of the compaction. The reasons are: - fewer IO ops: DataFusion's streaming IO requires slightly more IO requests (at least 2 per file) due to the way it is optimized to read as little as possible. It first reads the metadata and then decides which content to fetch. In the compaction case this is (esp. w/o delete predicates) EVERYTHING. So in contrast to the querier, there is no advantage of this approach. In contrary this easily adds 100ms latency to every single input file. - less traffic: For divide&conquer partitions (i.e. when we need to run multiple compaction steps to deal with them) it is kinda pointless to upload an intermediate result just to download it again. The scratchpad avoids that. - higher throughput: We want to limit the number of concurrent DataFusion jobs because we don't wanna blow up the whole process by having too much in-flight arrow data at the same time. However while we perform the actual computation, we were waiting for object store IO. This was limiting our throughput substantially. - shadow mode: De-coupling the stores in this way makes it easier to implement #6645. Note that we assume here that the input parquet files are WAY SMALLER than the uncompressed Arrow data during compaction itself. Closes #6650. * fix: panic on shutdown * refactor: remove shadow scratchpad (for now) * refactor: make scratchpad safe to use	2023-01-26 10:03:08 +00:00
Andrew Lamb	7853a19953	feat: JDBC integration tests with FlightSQL (#6693 ) * feat: basic JDBC integration test * fix: do not run test without env set * docs: add maven link * refactor: clean up java with switch statement	2023-01-25 22:21:18 +00:00
Andrew Lamb	2db8443a64	refactor: split flightsql crate into smaller modules (#6703 ) * refactor: split flightsql crate into smaller modules * refactor: automatically derive from Impl	2023-01-25 21:12:48 +00:00
Carol (Nichols \|\| Goulding)	57b5b639d6	test: Port all field columns query_tests to end-to-end tests (#6707 ) * test: Port a test that's not actually supported through the full gRPC API * test: Port remaining field column/measurement fields tests * test: Remove unsupported measurement predicate and clarify purposes of tests Andrew confirmed that the only way to invoke a Measurement Fields request is with a measurement/table name specified: <`0249b5018e/generated_types/protos/influxdata/platform/storage/service.proto (L43)`> so testing with a `_measurement` predicate is not valid. I thought this test would become redundant with some other tests, but they're actually still different enough; I took this opportunity to better highlight the differences in the test names. * refactor: Move all measurement fields tests to their own file * test: Remove field columns tests that are now covered in end-to-end measurement fields tests	2023-01-25 19:49:29 +00:00
kodiakhq[bot]	0249b5018e	Merge pull request #6655 from influxdata/cn/one-test test: Start of porting InfluxRpc query_tests	2023-01-25 15:56:44 +00:00
kodiakhq[bot]	98c60f9dc5	Merge branch 'main' into cn/one-test	2023-01-25 15:49:51 +00:00
Dom	7c7d737d0e	Merge pull request #6702 from influxdata/dom/persist-enqueue-durations refactor: appropriate queue wait histogram buckets	2023-01-25 15:49:14 +00:00
Carol (Nichols \|\| Goulding)	f803c31e84	fix: Limit tests in CI to 8 threads to not use up Postgres connections This is only needed until we switch over to ingester2 completely. Old ingester tests need to be run on non-shared servers because I'm unable to implement persistence per-namespace. Rather than spending time figuring that out, limit the parallelization to limit the Postgres connections that CI uses at one time.	2023-01-25 10:37:05 -05:00
Carol (Nichols \|\| Goulding)	4658510102	fix: For Ingester2, persist a particular namespace on demand and share MiniClusters This should hopefully help CI from running out of Postgres connections 😬 The old architecture will still need to be non-shared and persist everything.	2023-01-25 10:36:56 -05:00
Dom Dwyer	df87ca3f17	refactor: appropriate queue wait histogram buckets Changes the bucket values for the queue wait duration metric to be more appropriately scaled.	2023-01-25 16:31:49 +01:00
Carol (Nichols \|\| Goulding)	f310e01b1a	test: Start of porting InfluxRpc query_tests Make a new trait, `InfluxRpcTest`, that types can implement to define how to run a test on a specific Storage gRPC API. `InfluxRpcTest` takes care of iterating through the two architectures, running the setups, and creating the custom test step. Implementers of the trait can define aspects of the tests that differ per run, to make the parameters of the test clearer and highlight what different tests are testing.	2023-01-25 10:27:42 -05:00
Dom	8ee6c1ec68	Merge pull request #6701 from influxdata/dom/persist-config feat: export persist config metrics	2023-01-25 15:25:33 +00:00
Dom	dd445de275	Merge branch 'main' into dom/persist-config	2023-01-25 14:56:48 +00:00
Marco Neumann	7306ea9424	feat: divide&conquer framework (#6697 ) Allows compactor2 to run a fixed-point loop (until all work is done) and in every loop in can run mulitiple jobs. The jobs are currently organized by "branches". This is because our upcoming OOM handling may split a branch further if it doesn't complete. Also note that the current config resembles the state prior to this PR. So the FP-loop will only iterate ONCE and then runs out of L0 files. A more advanced setup can be built using the framework though.	2023-01-25 14:45:20 +00:00
Dom Dwyer	7b69c84ceb	feat: export persist config metrics Export the configured maximum persist parallelism, and the maximum queue depth, so they can be used to compute % saturation in alerts / dashboards.	2023-01-25 14:57:09 +01:00
Dom	c928eddaab	Merge pull request #6698 from influxdata/dom/circuit-fuzz test: CircuitBreaker recovery property fuzz test	2023-01-25 12:49:38 +00:00
Dom	f0d7ee59c3	Merge branch 'main' into dom/circuit-fuzz	2023-01-25 12:42:43 +00:00
Dom	e6876db431	Merge pull request #6700 from influxdata/dom/probe-at-most-one perf(router): faster balancer node recovery	2023-01-25 12:42:34 +00:00
Dom	b34bb46833	Merge branch 'main' into dom/circuit-fuzz	2023-01-25 12:29:46 +00:00
Dom	eb67a1fa3f	Merge branch 'main' into dom/probe-at-most-one	2023-01-25 12:23:26 +00:00
Dom Dwyer	6eb1773ec0	perf(router): faster balancer node recovery Ensure a "probe" node is always returned as the first candidate, driving it to recovery faster. This also includes a fix for the balancer metrics that would report probe candidate nodes as healthy nodes.	2023-01-25 13:18:24 +01:00
Andrew Lamb	0c55a0f257	feat: Implement basic prepared statement support in IOx (#6667 ) * feat: allow override of flightsql namespace * feat: Implement DoAction endpoint * refactor: Remove try_unpack * fix: remove unused code / more clone	2023-01-25 12:00:43 +00:00
Andrew Lamb	6caf31acf3	chore: Move garbage collection configuration into clap_blocks (#6678 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-25 11:31:48 +00:00
dependabot[bot]	f72a999fb3	chore(deps): Bump clap from 4.1.3 to 4.1.4 (#6694 ) Bumps [clap](https://github.com/clap-rs/clap) from 4.1.3 to 4.1.4. - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v4.1.3...v4.1.4) --- updated-dependencies: - dependency-name: clap dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-25 11:03:41 +00:00
Dom	40c7c8b2e2	Merge branch 'main' into dom/circuit-fuzz	2023-01-25 10:57:19 +00:00
Andrew Lamb	509c80bc55	docs: document how the garbage collector works (#6682 ) * docs: document how the garbage collector works * fix: Updates * docs: Update docs/garbage_collector.md Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-25 10:54:43 +00:00
Dom Dwyer	f5d4171be0	test: CircuitBreaker recovery property fuzz test Adds a multi-threaded fuzz test that ensures a circuit breaker can always transition to the healthy state, regardless of the sequence of events prior.	2023-01-25 11:53:57 +01:00
Marco Neumann	40e6a1a437	feat: job semaphore (#6696 ) * refactor: avoid too-many-arguments * refactor: extract `fetch_partition_info` * feat: job semaphore	2023-01-25 10:35:07 +00:00
Dom	75fc4ba17f	Merge pull request #6695 from influxdata/dependabot/cargo/ahash-0.8.3 chore(deps): Bump ahash from 0.8.2 to 0.8.3	2023-01-25 09:28:04 +00:00
dependabot[bot]	cae3071776	chore(deps): Bump ahash from 0.8.2 to 0.8.3 Bumps [ahash](https://github.com/tkaitchuck/ahash) from 0.8.2 to 0.8.3. - [Release notes](https://github.com/tkaitchuck/ahash/releases) - [Commits](https://github.com/tkaitchuck/ahash/compare/v0.8.2...v0.8.3) --- updated-dependencies: - dependency-name: ahash dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-01-25 01:08:55 +00:00
kodiakhq[bot]	33e29e890a	Merge pull request #6688 from influxdata/dom/rpc-endpoint-metrics feat(metrics): router upstream RPC endpoint metrics	2023-01-24 23:51:38 +00:00
Luke Bond	caea42665b	Merge branch 'main' into dom/rpc-endpoint-metrics	2023-01-25 10:44:18 +11:00
Christopher M. Wolff	9a942ceff5	refactor: propagate gapfill stride to exec (#6690 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-24 20:49:29 +00:00
Dom	39dd455297	Merge pull request #6689 from influxdata/dom/ingester-rediscovery fix(router): force rediscovery of nodes	2023-01-24 19:21:17 +00:00
Dom	442e8a8b79	Merge branch 'main' into dom/ingester-rediscovery	2023-01-24 19:13:02 +00:00
Dom Dwyer	411f4bd08b	fix(router): force rediscovery of nodes Similar to https://github.com/influxdata/influxdb_iox/pull/6509, this forces a constant re-querying of the DNS address of an ingester to drive rediscovery. Unlike the above PR, this only reconnects when there are errors observed. This still isn't ideal - something is wrong with the discovery itself - this just papers over it.	2023-01-24 20:11:53 +01:00
kodiakhq[bot]	20ac3608ab	Merge pull request #6687 from influxdata/dom/timeouts refactor(router): set sensible RPC timeouts	2023-01-24 18:32:09 +00:00
Dom Dwyer	9132343dac	feat(metrics): export RPC upstream health state Adds a metric with a per-ingester label recording the current health state of the upstream ingester from the perspective of the router instance. Also logs periodically when one or more ingesters are offline.	2023-01-24 19:27:15 +01:00
kodiakhq[bot]	77b6b234d5	Merge branch 'main' into dom/timeouts	2023-01-24 18:24:42 +00:00
Andrew Lamb	c3bc61f10e	refactor: Move `flightsql` code into its own module, add docs and tests (#6640 ) * refactor: Move `flightsql` code into its own module * fix: get schema from LogicalPlan * refactor: use arrow_flight::sql::Any instead of prost_types::any * fix: cleanup docs and avoid as_ref * fix: Use Bytes * fix: use Any::pack * fix: doclink	2023-01-24 18:24:32 +00:00
Dom Dwyer	f26b54beec	refactor(router): set sensible RPC timeouts Copies these over from the client_util package.	2023-01-24 19:22:27 +01:00
Dom Dwyer	87b553fe9d	feat: WARN logs w/ endpoint for unhealthy upstream Changes the DEBUG log event to a WARN now that it includes the endpoint to which the event applies.	2023-01-24 19:19:31 +01:00
Marco Neumann	4521516147	feat: add per-partition timeout (#6686 ) It seems that prod was hanging last night. This is pretty hard to debug and in general we should protect the compactor against hanging / malformed partitions that take forever. This is similar to the fact that the querier also has a timeout for every query. Let's see if this shows anything in prod (and if not it's still a desired safety net). Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-24 16:53:47 +00:00
kodiakhq[bot]	6a2e0ae5cc	Merge pull request #6685 from influxdata/dom/rpc-balancer-2 perf(router): rpc balancer & circuit breaking take 2	2023-01-24 16:13:07 +00:00
Dom	b0e5e860cb	Merge branch 'main' into dom/rpc-balancer-2	2023-01-24 16:04:32 +00:00
Dom Dwyer	085de40127	feat: lazy-connect to ingester gRPC endpoints Lazily establish connections in the background, instead of using tonic's connect_lazy(). connect_lazy() causes error handling to take a different path in tonic compared to "normal" connections, and this stops reconnections from being performed when the endpoint goes away (likely a bug). It also means the first few write requests won't have to wait while the connection is dialed, which brings down the P99 as a nice side-effect.	2023-01-24 16:44:55 +01:00
Marco Neumann	1c87d9667f	refactor: record partition completion (both Ok and Err) (#6680 ) With the upcoming divide-and-conquer approach, we have have multiple commits per partition since we can divide it into multiple compaction jobs. For metrics (and logs) however it is important to track the overall process, so we shall also monitor the number of completed partitions.	2023-01-24 15:06:15 +00:00
kodiakhq[bot]	dcc1eb9a21	Merge pull request #6679 from influxdata/dom/ingester-metrics feat(ingester2): metrics	2023-01-24 14:57:12 +00:00

1 2 3 4 5 ...

10652 Commits (11233e3b3b0e9c96ec88920844a4a80cbccfe2b4) All Branches Search

10652 Commits (11233e3b3b0e9c96ec88920844a4a80cbccfe2b4)

All Branches