influxdb

Commit Graph

Author	SHA1	Message	Date
Andrew Lamb	c100737a81	chore: Do not send dictionary encoded data to clients	2023-01-26 06:35:15 -05:00
Nga Tran	b8a80869d4	feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692) * feat: introduce a new way of max_sequence_number for ingester, compactor and querier * chore: cleanup * feat: new column max_l0_created_at to order files for deduplication * chore: cleanup * chore: debug info for changing cpu.parquet * fix: update test parquet file Co-authored-by: Marco Neumann <marco@crepererum.net>	2023-01-26 10:52:47 +00:00
Marco Neumann	ed694d3be4	feat: introduce scratchpad store for compactor (#6706 ) * feat: introduce scratchpad store for compactor Use an intermediate in-memory store (can be a disk later if we want) to stage all inputs and outputs of the compaction. The reasons are: - fewer IO ops: DataFusion's streaming IO requires slightly more IO requests (at least 2 per file) due to the way it is optimized to read as little as possible. It first reads the metadata and then decides which content to fetch. In the compaction case this is (esp. w/o delete predicates) EVERYTHING. So in contrast to the querier, there is no advantage of this approach. In contrary this easily adds 100ms latency to every single input file. - less traffic: For divide&conquer partitions (i.e. when we need to run multiple compaction steps to deal with them) it is kinda pointless to upload an intermediate result just to download it again. The scratchpad avoids that. - higher throughput: We want to limit the number of concurrent DataFusion jobs because we don't wanna blow up the whole process by having too much in-flight arrow data at the same time. However while we perform the actual computation, we were waiting for object store IO. This was limiting our throughput substantially. - shadow mode: De-coupling the stores in this way makes it easier to implement #6645. Note that we assume here that the input parquet files are WAY SMALLER than the uncompressed Arrow data during compaction itself. Closes #6650. * fix: panic on shutdown * refactor: remove shadow scratchpad (for now) * refactor: make scratchpad safe to use	2023-01-26 10:03:08 +00:00
Andrew Lamb	7853a19953	feat: JDBC integration tests with FlightSQL (#6693 ) * feat: basic JDBC integration test * fix: do not run test without env set * docs: add maven link * refactor: clean up java with switch statement	2023-01-25 22:21:18 +00:00
Carol (Nichols || Goulding)	57b5b639d6	test: Port all field columns query_tests to end-to-end tests (#6707) * test: Port a test that's not actually supported through the full gRPC API * test: Port remaining field column/measurement fields tests * test: Remove unsupported measurement predicate and clarify purposes of tests Andrew confirmed that the only way to invoke a Measurement Fields request is with a measurement/table name specified: <`0249b5018e/generated_types/protos/influxdata/platform/storage/service.proto (L43)`> so testing with a `_measurement` predicate is not valid. I thought this test would become redundant with some other tests, but they're actually still different enough; I took this opportunity to better highlight the differences in the test names. * refactor: Move all measurement fields tests to their own file * test: Remove field columns tests that are now covered in end-to-end measurement fields tests	2023-01-25 19:49:29 +00:00
Carol (Nichols || Goulding)	4658510102	fix: For Ingester2, persist a particular namespace on demand and share MiniClusters This should hopefully help CI from running out of Postgres connections 😬 The old architecture will still need to be non-shared and persist everything.	2023-01-25 10:36:56 -05:00
Carol (Nichols || Goulding)	f310e01b1a	test: Start of porting InfluxRpc query_tests Make a new trait, `InfluxRpcTest`, that types can implement to define how to run a test on a specific Storage gRPC API. `InfluxRpcTest` takes care of iterating through the two architectures, running the setups, and creating the custom test step. Implementers of the trait can define aspects of the tests that differ per run, to make the parameters of the test clearer and highlight what different tests are testing.	2023-01-25 10:27:42 -05:00
Andrew Lamb	0c55a0f257	feat: Implement basic prepared statement support in IOx (#6667 ) * feat: allow override of flightsql namespace * feat: Implement DoAction endpoint * refactor: Remove try_unpack * fix: remove unused code / more clone	2023-01-25 12:00:43 +00:00
Andrew Lamb	6caf31acf3	chore: Move garbage collection configuration into clap_blocks (#6678 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-25 11:31:48 +00:00
Luke Bond	e3fc873b2e	feat: enable object store metrics on ingester2 (#6672 ) Signed-off-by: Luke Bond <luke.n.bond@gmail.com> Signed-off-by: Luke Bond <luke.n.bond@gmail.com>	2023-01-24 01:59:58 +00:00
Andrew Lamb	1b882e0062	fix: `error arrow/ipc: could not read message schema: EOF` (#6668 ) * chore: Test for schema from query * fix: Send schema even for no RecordBatches * fix: docs	2023-01-23 22:23:34 +00:00
Nga Tran	411b3db928	fix: Get shard id from a constant (topic, shard_index) to avoid error of shard_id FK violation (#6658) * fix: make shard_id FK always 1 * fix: use const shard_index to read its ID * refactor: read shard_id during compactor initiation	2023-01-22 16:49:06 +00:00
Carol (Nichols || Goulding)	6afd782b3f	fix: Move query_tests2 into influxdb_iox/tests so that the code rebuilds	2023-01-19 16:44:31 -05:00
Andrew Lamb	65c020c9f2	refactor: remove iox_arrow_flight use in `influxdb_iox_client` and `querier` (#6624) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-19 18:48:23 +00:00
Marco Neumann	5e297b4667	refactor: lift up compactor2 CLI args, set mem limit to 8GB (#6631 ) - use a single data structure for CLI args (not two) - set mem limit default to 8GB (same as querier). We can always tune this later, but we should not run with "unlimited" to begin with.	2023-01-19 12:21:51 +00:00
kodiakhq[bot]	33168b97f0	Merge branch 'main' into cn/query-tests-grpc	2023-01-18 19:03:51 +00:00
Marco Neumann	e72173d58d	feat: very basic compactor2 skeleton (#6614 ) Sets up crate and wires up the main binary. No tests yet, no algorithm framework, just the bare minimum. Also I decided to not offer a gRPC server in `compactor2` at the moment and hence did not implement any handle/delegate infrastructure. We add this later if we need it. This also means compactor2 does NOT provide a catalog service for now. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-18 16:36:40 +00:00
Carol (Nichols || Goulding)	f3b5dcaab7	feat: Reimagining query_tests	2023-01-18 10:24:17 -05:00
dependabot[bot]	0a70e9f43f	chore(deps): Bump rustyline from 10.0.0 to 10.1.0 Bumps [rustyline](https://github.com/kkawakam/rustyline) from 10.0.0 to 10.1.0. - [Release notes](https://github.com/kkawakam/rustyline/releases) - [Changelog](https://github.com/kkawakam/rustyline/blob/master/History.md) - [Commits](https://github.com/kkawakam/rustyline/compare/v10.0.0...v10.1.0) --- updated-dependencies: - dependency-name: rustyline dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2023-01-16 02:06:24 +00:00
Dom	f7ff877582	Merge branch 'main' into cn/ingester-persist-tick	2023-01-13 12:31:45 +00:00
Carol (Nichols || Goulding)	f56123bf30	test: Allow integration tests that should_panic to pass if TEST_INTEGRATION isn't set	2023-01-12 15:31:34 -05:00
Carol (Nichols || Goulding)	1c7ffb95df	test: Write a should_panic test that shows ingester is persisting when I thought it wouldn't	2023-01-12 14:55:28 -05:00
Carol (Nichols || Goulding)	b989e0893f	test: Make persist-on-demand test a step test and check the number of parquet files	2023-01-12 11:40:46 -05:00
Carol (Nichols || Goulding)	3a2544a7eb	feat: Define a new gRPC service for ingester persist	2023-01-12 11:03:12 -05:00
Carol (Nichols || Goulding)	e9b3efb33d	refactor: Extract a method for making requests to the ingester onto MiniCluster	2023-01-12 11:03:10 -05:00
Marco Neumann	e2da573dcf	refactor: improve thread naming (#6579) - name exec driver thread (instead of using the default that `thread::spawn` gives us) - provide number to every worker thread (both for the dedicated executor and for the main runtime) - shorten thread names (current naming too long for most debug tools)	2023-01-12 14:22:49 +00:00
Dom	1e5b594863	Merge branch 'main' into dom/shutdown-persist	2023-01-12 10:15:34 +00:00
Nga Tran	fa0893819c	fix: have warm compaction work with compactor2 (#6571 ) * refactor: same function to select partition candidates * fix: have warm compaction work with compactor2 * fix: format * chore: cleanup	2023-01-12 02:32:39 +00:00
Carol (Nichols || Goulding)	be7c312033	fix: Wait for a particular number of Parquet files, not just any change	2023-01-11 12:11:56 -05:00
Carol (Nichols || Goulding)	7e921e6a23	fix: Make recording num parquet files an explicit test step To support a case where someone calls WriteLineProtocol twice in a row to simulate two write requests. The test should be able to record this state before the two write requests and not twice.	2023-01-11 11:51:56 -05:00
Carol (Nichols || Goulding)	6677ae5c61	test: Record number of Parquet files before a write Fixes #6506. Also has the pleasant side effect of making this code simpler and less hacky-- it now checks the number of Parquet files for the whole namespace, which is useful in cases where the line protocol writes to several tables.	2023-01-11 11:51:55 -05:00
Dom Dwyer	303c9e4398	test: fix e2e test This test was relying on a graceful shutdown of the ingester to drive a WAL replay, restoring the buffered state at startup. Now the shutdown causes the data to be persisted and not replayed, this didn't work.	2023-01-11 17:15:04 +01:00
Stuart Carnie	66047f4372	feat: InfluxQL learns how to plan some InfluxQL queries (#6520 ) * feat: InfluxQL learns how to plan some queries Also added a means to test the planner and execution * chore: Update module docs * chore: Document the planner functions * chore: Update end_to_end_cases crate * chore: Clarify why `SLIMIT` and `SOFFSET` return `NotImplemented` * chore: Address lint issues * chore: Fix rustdoc link issue * chore: Remove InfluxQL tests from query_tests crate Will follow conventions established by @carols10cents when new query_tests crate is merged. * chore: `now` field `now` is a DataFusion built-in scalar function * chore: remove unused code * chore: Add additional arithmetic expression tests * chore: Establish pattern for identifying and tracking InfluxQL issues * chore: Add tests for case sensitivity issues * chore: group tests into modules and functions This avoids mass rewriting of insta snapshots as new tests are added to each function. When tests are added in the middle, existing snapshots are renamed (-N+1, -N+2, etc) resulting in having to review numerous additional snapshots.	2023-01-11 02:50:49 +00:00
Nga Tran	62c0f3dbdd	feat: have cold compaction work with Compactor2 (#6542 ) * feat: cold * chore: debug info * feat: only compact qualified cold partition candidates * fix: catalog test * chore: cleanup * chore: add new config flag for cold partition candidates * chore: implement display for CompactionType and add tests for max num partitions Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 16:42:57 +00:00
dependabot[bot]	b49cc2e35e	chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 09:48:44 +00:00
dependabot[bot]	09746c3aea	chore(deps): Bump assert_cmd from 2.0.7 to 2.0.8 Bumps [assert_cmd](https://github.com/assert-rs/assert_cmd) from 2.0.7 to 2.0.8. - [Release notes](https://github.com/assert-rs/assert_cmd/releases) - [Changelog](https://github.com/assert-rs/assert_cmd/blob/master/CHANGELOG.md) - [Commits](https://github.com/assert-rs/assert_cmd/compare/v2.0.7...v2.0.8) --- updated-dependencies: - dependency-name: assert_cmd dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2023-01-10 01:06:02 +00:00
Paul Dix	828992c9c5	feat: Ingest replica skeleton (#6529 ) * feat: Update replication.proto * Remove the PartitionId in the replicate request as a single replicate request can have the data for many partitions. * Add namespace_id and table_id to persist complete request to make data easier to lookup in buffer. * feat: Initial ingest_replica skeleton A bunch of copy pasta here from ingester2, but this takes out a ton of stuff that isn't used in replicas. Also lays the groundwork for the simpler buffer structure to keep the data and a basic cache for catalog information that will be required. * feat: update replication.proto GetPartitionBufferResponse * chore: PR cleanup * chore: PR cleanup	2023-01-09 16:53:49 +00:00
Carol (Nichols || Goulding)	afaaedc758	test: Reproduce the test for querier caching	2023-01-04 10:15:35 -05:00
Carol (Nichols || Goulding)	9f7dede433	refactor: Reorganize tests to make deleting half the tests easier When we switch to RPC write mode, the with_kafka module can be deleted and the kafkaless_rpc_write module can be unwrapped.	2023-01-04 10:15:35 -05:00
Carol (Nichols || Goulding)	c464487bb2	test: Ability to set up a querier2 that doesn't connect to any ingesters	2023-01-04 10:06:57 -05:00
Carol (Nichols || Goulding)	08ceb4ee48	test: Check catalog for new Parquet files to know when data is persisted	2023-01-04 10:06:57 -05:00
Carol (Nichols || Goulding)	e49bee0c26	test: Make test ingester2 instances either persist very quickly or not at all	2023-01-04 10:06:57 -05:00
Carol (Nichols || Goulding)	96029654ab	test: Add a shared MiniCluster for version 2 services	2023-01-04 10:06:56 -05:00
Andrew Lamb	dbe52f1ca1	chore: Upgrade datafusion (#6467 ) * chore: Update datafusion * fix: Update for new apis * chore: Update expected plan * fix: Update for new config construction * chore: update clippy * fix: Fix error codes * fix: update another test * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-03 15:29:11 +00:00
dependabot[bot]	0aacef3c59	chore(deps): Bump once_cell from 1.16.0 to 1.17.0 (#6473) * chore(deps): Bump once_cell from 1.16.0 to 1.17.0 Bumps [once_cell](https://github.com/matklad/once_cell) from 1.16.0 to 1.17.0. - [Release notes](https://github.com/matklad/once_cell/releases) - [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md) - [Commits](https://github.com/matklad/once_cell/compare/v1.16.0...v1.17.0) --- updated-dependencies: - dependency-name: once_cell dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Change once_cell version specifier to major.minor for less churn Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>	2023-01-02 17:07:15 +00:00
Nga Tran	d27e137c39	chore: add debug info for the investigation (#6472 )	2022-12-29 23:49:29 +00:00
Carol (Nichols || Goulding)	7c6ccdb6d7	fix: Use keys and values functions. Thanks clippy!	2022-12-21 14:32:35 -05:00
dependabot[bot]	ed17069087	chore(deps): Bump num_cpus from 1.14.0 to 1.15.0 (#6451 ) Bumps [num_cpus](https://github.com/seanmonstar/num_cpus) from 1.14.0 to 1.15.0. - [Release notes](https://github.com/seanmonstar/num_cpus/releases) - [Changelog](https://github.com/seanmonstar/num_cpus/blob/master/CHANGELOG.md) - [Commits](https://github.com/seanmonstar/num_cpus/compare/v1.14.0...v1.15.0) --- updated-dependencies: - dependency-name: num_cpus dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-21 15:17:55 +00:00
Andrew Lamb	e1059a9009	feat: FlightSQL Milestone 2 basic FlightSQL client and FlightSQL server implementation and plumbing (#6398 ) * feat: Add basic Flight and FlightSQL client into IOx codebase Basic flight end to end test * fix: Apply suggestions from code review Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-20 17:34:00 +00:00
Andrew Lamb	9b22ede3f0	refactor: Make arrow flight client return `futures::Streams` (#6438 ) * refactor: Make arrow flight client use futures::Streams * refactor: concision	2022-12-19 17:09:26 +00:00


921 Commits (105e3542991aef5f1654fe260f9fa53a32f622f2)