* refactor: store per-file column set in catalog
Together with the table-wide schema and the partition-wide sort key, this should
be everything we need to read a parquet file directly into memory
without peeking at any file-level metadata.
The querier will use this to directly load parquet files into the read
buffer.
**WARNING: This requires a catalog wipe!**
Ref #4124.
* refactor: use proper `ColumnSet` type
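For illustration, a minimal sketch of what a per-file `ColumnSet` wrapper could look like; the field layout and methods here are assumptions, not the actual influxdb_iox definition:

```rust
/// Hypothetical sketch of a per-file column set: the IDs of the columns
/// actually present in one parquet file, a subset of the table-wide schema.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ColumnSet(Vec<i64>);

impl ColumnSet {
    /// Create a new column set, normalizing order and dropping duplicates
    /// so equality comparisons are stable.
    pub fn new(mut column_ids: Vec<i64>) -> Self {
        column_ids.sort_unstable();
        column_ids.dedup();
        Self(column_ids)
    }

    /// IDs of the columns stored in this file.
    pub fn ids(&self) -> &[i64] {
        &self.0
    }
}
```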
* feat: Log time spent requesting ingester partitions
Fixes #4558.
* feat: Record a metric for the duration queriers wait on ingesters
* fix: Use DurationHistogram instead of U64 Histogram
* test: Add a test for the ingester ms metric
* feat: Add back the logging to provide both logging and metrics for ingester duration
* refactor: Use sample_count method on metrics
* feat: Record ingester duration separately for success or failure
* fix: Create a separate test for the ingester metrics
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
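A rough sketch of the pattern those commits describe: time the ingester request and record the elapsed duration under separate success/error labels. The histogram below is a self-contained stand-in, not the actual `DurationHistogram` from the iox `metric` crate:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Stand-in for a duration histogram keyed by a result label.
#[derive(Default)]
struct IngesterDurationMetric {
    samples: HashMap<&'static str, Vec<Duration>>,
}

impl IngesterDurationMetric {
    fn record(&mut self, result: &'static str, d: Duration) {
        self.samples.entry(result).or_default().push(d);
    }

    /// Number of recorded samples for a label, mirroring the
    /// `sample_count` helper mentioned above.
    fn sample_count(&self, result: &'static str) -> usize {
        self.samples.get(result).map_or(0, Vec::len)
    }
}

fn request_ingester_partitions(metric: &mut IngesterDurationMetric) -> Result<(), ()> {
    let start = Instant::now();
    let res: Result<(), ()> = Ok(()); // placeholder for the actual ingester RPC
    // Record success and failure separately, per the commits above.
    let label = if res.is_ok() { "success" } else { "error" };
    metric.record(label, start.elapsed());
    res
}
```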
This commit changes the code base to use a new reference-counted
`PartitionKey` type wrapper instead of passing a bare `String` around.
This allows the compiler to type check & verify usage of the partition
key, and by reference counting the underlying string, we reduce memory
usage for some use cases.
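A minimal sketch of what such a reference-counted newtype could look like (the exact field layout and trait impls are assumptions, not the influxdb_iox definition):

```rust
use std::sync::Arc;

/// A typed, cheaply cloneable partition key. Cloning bumps a reference
/// count instead of copying the underlying string.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PartitionKey(Arc<str>);

impl From<&str> for PartitionKey {
    fn from(s: &str) -> Self {
        Self(Arc::from(s))
    }
}

impl std::fmt::Display for PartitionKey {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(&self.0)
    }
}
```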
* feat: Change data type of catalog Postgres partition's sort_key from a string to an array of strings
* test: add column with comma
* fix: use a new protobuf field to avoid incompatibility
* fix: ensure sort_key is an empty array rather than NULL
* refactor: address review comments
* refactor: address more comments
* chore: clearer comments
* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql
* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql
* fix: Rename migration so it will be applied after the existing one
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
This PR is the first step: we add a new column `sort_key_arr` whose content we'll manually migrate from `sort_key`.
When we're done with this, we'll merge https://github.com/influxdata/influxdb_iox/pull/4801/ (whose migration script must be adapted slightly to rename the `sort_key_arr` column back to `sort_key`).
All this must be done while we shut down the ingesters and the compactors.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
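For illustration, a sketch of reading the migrated column with sqlx once `sort_key` is a Postgres `text[]`; the table and column names follow the description above, but the query itself is an assumption. `COALESCE` mirrors the "empty array rather than NULL" fix:

```rust
use sqlx::{PgPool, Row};

/// Read a partition's sort key as a Vec<String> once the column is text[].
/// COALESCE guards against rows that predate the NULL-to-empty-array fix.
async fn partition_sort_key(pool: &PgPool, partition_id: i64) -> sqlx::Result<Vec<String>> {
    let row = sqlx::query(
        "SELECT COALESCE(sort_key, '{}'::text[]) AS sort_key FROM partition WHERE id = $1",
    )
    .bind(partition_id)
    .fetch_one(pool)
    .await?;
    row.try_get("sort_key")
}
```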
This is useful for local instances that run against a prod system,
because port forwarding can lead to long connection delays.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Add lookup of partitions by table id to the catalog.
Add an API to the catalog to return partitions by table id.
Add a client method to return partitions by table id.
Add CLI to pull remote schema, partition, and parquet files into a local catalog and object store.
Add method to catalog to get parquet file by object store id.
Add a gRPC service for the object store to get a file by its UUID.
Add the object store service to router2 with object store config.
Fix the ingester to track the max persisted sequence number per partition.
Ensure replay takes in data from unpersisted partitions.
Simplify the table persist info so it no longer returns a max persisted sequence number for the table, as that information isn't needed.
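An illustrative sketch of the catalog API surface those changes add; the trait and type names are stand-ins rather than the actual iox definitions:

```rust
use async_trait::async_trait;
use uuid::Uuid;

/// Illustrative stand-ins for the catalog types; not the iox definitions.
#[derive(Debug, Clone, Copy)]
pub struct TableId(pub i64);
#[derive(Debug)]
pub struct Partition;
#[derive(Debug)]
pub struct ParquetFile;

#[async_trait]
pub trait Catalog: Send + Sync {
    type Error;

    /// Return all partitions belonging to the given table.
    async fn partitions_by_table_id(
        &self,
        table_id: TableId,
    ) -> Result<Vec<Partition>, Self::Error>;

    /// Look up a parquet file record by its object store UUID.
    async fn parquet_file_by_object_store_id(
        &self,
        id: Uuid,
    ) -> Result<Option<ParquetFile>, Self::Error>;
}
```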
After checking the Postgres workload for the catalog in prod, we found
that this missing index was the cause of unexpectedly expensive plans
for simple queries.
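The commit doesn't name the index, so the statement below is purely illustrative of the kind of fix: an index covering the hot lookup path so simple queries stop falling back to sequential scans.

```rust
use sqlx::PgPool;

/// Add an index on a foreign key used by a hot query path; the table
/// and column here are placeholders, not the actual missing index.
async fn add_missing_index(pool: &PgPool) -> sqlx::Result<()> {
    sqlx::query("CREATE INDEX IF NOT EXISTS parquet_file_table_idx ON parquet_file (table_id)")
        .execute(pool)
        .await?;
    Ok(())
}
```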
* fix: create_or_get_multi for column in catalog now enforces limits
chore: reorder catalog column create fns to be next to each other
test: add failing test for multi col insert w/ limits
test: bend catalog mem impl to match postgres for tests
fix: postgres column insert-many column type error checks
chore: clippy
* test: assert column counts in partial column insert test
* chore: add some SQL comments to the monster multicolumn insert query; s/RIGHT/INNER/ join
* chore: adding comments to clarify partial failure behaviour of multi col insert
* test: add tests for create_or_get_many columns in catalog
* test: forgot how macros work for a moment
* test: service limit test handles partial update of cols
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
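As a sketch of the limit-enforcement logic those commits describe (the names and error shape are illustrative):

```rust
/// Decide whether a batch of new columns would push a table past its
/// per-table column limit; illustrative names and error shape.
fn check_column_limit(
    existing_columns: usize,
    new_columns: usize,
    max_columns_per_table: usize,
) -> Result<(), ColumnLimitError> {
    if existing_columns + new_columns > max_columns_per_table {
        return Err(ColumnLimitError {
            existing_columns,
            new_columns,
            max_columns_per_table,
        });
    }
    Ok(())
}

#[derive(Debug)]
struct ColumnLimitError {
    existing_columns: usize,
    new_columns: usize,
    max_columns_per_table: usize,
}
```

Note that, as the commits above call out, the real multi-row insert can still partially succeed, which is why the tests assert the resulting column counts.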
Add indexes so compactor can find candidate partitions and specific partition files quickly.
Limit number of level 0 files returned for determining candidates. This should ensure that if compaction is very backed up, it will be able to work through the backlog without evaluating the entire world.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
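A sketch of the bounded-backlog idea: fetch at most a fixed number of level 0 files per pass so a backed-up compactor makes incremental progress. The constant and schema are illustrative, not the actual iox catalog:

```rust
use sqlx::PgPool;

/// Cap how many level 0 files a single compaction pass considers; the
/// constant and schema are illustrative, not the actual iox catalog.
const MAX_LEVEL_0_FILES_PER_PASS: i64 = 1_000;

async fn level_0_candidate_rows(pool: &PgPool) -> sqlx::Result<usize> {
    let rows = sqlx::query(
        "SELECT id FROM parquet_file \
         WHERE compaction_level = 0 AND to_delete IS NULL \
         ORDER BY created_at \
         LIMIT $1",
    )
    .bind(MAX_LEVEL_0_FILES_PER_PASS)
    .fetch_all(pool)
    .await?;
    Ok(rows.len())
}
```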
Add configuration options for compactor for the max size of level 0 files and split percentage.
Add metrics for compaction to track the number of candidates, compactions, and durations.
Add functions to separate identifying partitions to compact from running compaction.
Make compaction run in smaller chunks, specifically per partition.
Update compaction to automatically promote level 0 files that are non-overlapping without waiting some period of time.
Closes #4120.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
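As a sketch of what such configuration knobs might look like with a clap 3-style derive (flag names and defaults are illustrative, not the shipped CLI):

```rust
use clap::Parser;

/// Hypothetical sketch of the new compactor knobs; flag names and
/// defaults are illustrative, not the shipped CLI.
#[derive(Debug, Parser)]
struct CompactorConfig {
    /// Max desired size, in bytes, of a compacted level 0 output file.
    #[clap(long = "compaction-max-size-bytes", default_value = "104857600")]
    max_size_bytes: u64,

    /// Percentage of the max size at which an output file is split.
    #[clap(long = "compaction-split-percentage", default_value = "90")]
    split_percentage: u16,
}

fn main() {
    let config = CompactorConfig::parse();
    println!("{config:?}");
}
```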