Commit Graph

485 Commits (3852b2ee643ce0880f723e5730ecd3d6d7fc3453)

Author SHA1 Message Date
Carol (Nichols || Goulding) ef9ef75e56
fix: Remove unsupported TemplatePart variants (#7746)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-04 16:20:18 +00:00
Carol (Nichols || Goulding) b0959667d5
fix: Move topic and query pool within iox catalog (#7734)
Still insert them into the database and associate them with namespaces,
but don't ever query them back out.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-04 13:45:56 +00:00
Carol (Nichols || Goulding) 621caab2e9
fix: Remove unused parquet_max_sequence_number metadata 2023-05-03 10:57:27 -04:00
Carol (Nichols || Goulding) 038f8e9ce0
fix: Move shard concepts into only the catalog
This still inserts the shard id into the database, always set to the
TRANSITION_SHARD_ID, but never reads it back out again.
2023-04-26 11:42:32 -04:00
wiedld 3990dcd519
Merge branch 'main' into idpe-17434/namespace-validation-alignment 2023-04-25 09:41:45 -07:00
wiedld b3cdea25ba chore: update term conventions to allowlist 2023-04-24 17:07:25 -07:00
Carol (Nichols || Goulding) 7b7101b53f
fix: In DmlMeta, only use SequenceNumber rather than Sequence (shard index plus seq num) 2023-04-24 12:23:16 -04:00
wiedld daabe9663c chore(idpe-17434): make restrictive whitelist of chars accepted, for any NamespaceName 2023-04-21 16:36:00 -07:00
wiedld b870242ec7 chore(idpe-17434): remove utf8-percent encoding on v2 write path, such that it matches v1 writes and onCreate 2023-04-21 16:31:55 -07:00
Carol (Nichols || Goulding) 0ac2494c5d
fix: Remove unused partitions_with_recent_created_files method
And PartitionParam type.
2023-04-20 14:54:03 -04:00
Dom Dwyer 92b1055e41
refactor: error message should not end in .
Change the "bad character" namespace name error to not include a
trailing "."
2023-04-18 14:46:22 +02:00
Carol (Nichols || Goulding) f1850c9234
fix: Remove unused level_1 function and TablePartition type 2023-04-17 19:28:50 -04:00
Carol (Nichols || Goulding) 6c06b9fd7a
fix: Remove unused list_type_count_by_table_id and ColumnTypeCount 2023-04-17 19:28:49 -04:00
Carol (Nichols || Goulding) 5e6dbec909
fix: Remove tombstones as they aren't functional currently 2023-04-14 13:36:08 -04:00
Carol (Nichols || Goulding) acf857816e
fix: Remove old querier 2023-04-12 13:18:23 -04:00
Carol (Nichols || Goulding) 6387a9576a
fix: Remove the write_summary crate and write info service 2023-04-12 11:31:23 -04:00
Dom Dwyer 7fed2ba456 feat(router): single tenancy operational mode
Adds a single-tenant mode (CST) to the IOx routers.

Single-tenancy mode differs in two main ways:

    * V1 write endpoint is partially supported
    * V2 write endpoint ignores "org" parameter

The "normal" mode is "multi tenant" which is the default operational
mode, and all existing behaviour remains unchanged. Single tenant mode
can be enabled by specifying INFLUXDB_IOX_SINGLE_TENANCY=true.

Request parsing is delegated to two implementations of the
WriteParamExtractor trait, one each for CST and MT - the logic of each
"mode" is defined within these files and all other functionality is
common between the two.

This commit also renames some of the error types for clarity
(NoSpecified -> NoOrgBucketSpecified, other NotSpecified ->
NoQueryParams, etc).

Note: single tenant code requires testing
2023-04-10 12:59:20 -07:00
Dom Dwyer d322791d12
refactor: tidy NamespaceName construction errors
There was a mix of different ways of returning errors - this commit
unifies them, adds some documentation to the returned errors, and
removes the capitalisation.

Errors should be lower-case so they compose nicely like this:
    "something failed: super important error: inner error"
rather than:
    "something failed: Super important error: Inner error"
2023-03-31 16:27:26 +02:00
Dom Dwyer 65034cfaa6
refactor: org & bucket parser on NamespaceName
Moves the function org_and_bucket_to_namespace() to be an associated
method (constructor) on the NamespaceName itself.
2023-03-31 16:12:49 +02:00
Dom Dwyer 259dc04937
refactor: move NamespaceName & related fn into mod
Moves the NamespaceName, associated errors and helper functions into
it's own module.
2023-03-31 15:58:09 +02:00
Dom Dwyer 0a2cb33f7d
test: property test SequenceNumberSet operations
Adds proptests to assert correctness of set operations against a
SequenceNumberSet, by ensuring they match those of a known-good
implementation from the stdlib (HashSet).

Ensure (de-)serialisation correctness by asserting a round trip results
in equal sets.
2023-03-20 14:23:43 +01:00
Dom Dwyer 5680710e75
fix: full u64 range in SequenceNumberSet
Prior to this commit, the SequenceNumberSet accepted values up to
u32::MAX (approx 4.2 billion) which works out to be ~50 days of
continious ID allocation at 1k writes/s.

This commit changes the SequenceNumberSet to accept the full range of a
u64, with the caveat that outside of the u32 range, values may be
ordered incorrectly. See:

    https://github.com/influxdata/influxdb_iox/issues/7260

Fortunately values from SequenceNumberSet are used as an identity, and
never for ordering. We should remove the PartialOrd bounds on
SequenceNumber once Kafka has been cleaned up to encode this in the type
system.
2023-03-20 14:15:30 +01:00
Andrew Lamb 28cb56767c
fix: Rename crate `influxdb_line_protocol` to `influxdb-line-protocol` (#7244)
* fix: Rename crate `influxdb_line_protocol` to `influxdb-line-protocol`

* fix: hakari config

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-17 13:19:50 +00:00
Marco Neumann 20ec47b00b
feat: virtual chunk order col (#7240)
* feat: introduce `CHUNK_ORDER_COLUMN_NAME`

* feat: impl `ChunkOrder` everywhere

* feat: `ChunkOrder::get`

* feat: emit chunk order column for `RecordBatchesExec`

* feat: `chunk_order_field`

* feat: chunk order col for parquet chunks

* feat: optional chunk order col handling for dedup

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-17 09:39:21 +00:00
Nga Tran 9e9e689a30
feat: handle large-size overlapped files (#7079)
* feat: split start-level files that overlap wiht many files

* test: split files and theit split times

* test: split test for L1 and L2 files

* feat: full implementation that support large-size overlapped files

* chore: modify comments to reflect the changes

* fix: typo

* chore: update test output

* docs: clearer comments

* chore: remove empty test files. Will add in in a separate PR

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: address review comments

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* refactor: add a knob to turn large-size overlaps on and off

* fix: typo

* chore: update test output after merging main

* fix: split_times should not include the max_time of the file

* fix: fix an overlap bug while limitting number of files to compact

* test: unit tests for different overlap cases of limit files to compact

* chore: increase time range of the tests to let the split files work correctly

* fix: skip compacting files of tiny ranges

* test: add tests for time range 1

* chore: address review comments

* chore: remove  enable_large_size_overlap_files knob

* fix: fix a bug that sort L1 files in thier min_time instead of l0_max_created_at

* refactor: use the same order_files function afer merging main into branch

* chore: typos and clearer comments

* chore: remove obsolete comments

* chore: add asserts per review suggestion

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-03-07 18:51:59 +00:00
Dom Dwyer ac1b37c0f0
feat(data_types): SequenceNumberSet intersection
Support computing the intersection of two SequenceNumberSet.
2023-03-03 17:21:39 +01:00
Dom Dwyer 146494f619
perf: pre-allocation of SequenceNumberSet
Support pre-allocation of SequenceNumberSet for known-length sets.
2023-03-03 17:21:38 +01:00
Dom Dwyer 7801a63334
feat: apply RLE optimisation for SequenceNumberSet
Allow a SequenceNumberSet to be space-optimised by transforming it to
use run length encoding.
2023-03-03 17:21:38 +01:00
Dom Dwyer 043f3421ba
feat(data_types): PartialEq for SequenceNumberSet
Derive + test partial equality matching for SequenceNumberSet.
2023-03-03 17:21:38 +01:00
Dom Dwyer 6aa33ef380
feat: impl FromIterator for SequenceNumberSet
Allow a SequenceNumberSet to be instantiated from an iterator of
SequenceNumber.
2023-03-02 10:58:02 +01:00
Dom Dwyer 6532fb752b
feat: impl Extend for SequenceNumberSet
Allow a SequenceNumberSet to be efficiently extended from any iterator
of SequenceNumber instances.
2023-03-02 10:58:01 +01:00
Carol (Nichols || Goulding) faae5eb438 chore: Rerun cargo hakari manage-deps 2023-02-27 11:56:15 +01:00
dependabot[bot] c9417bb7f1
chore(deps): Bump croaring from 0.8.0 to 0.8.1 (#7034)
Bumps [croaring](https://github.com/saulius/croaring-rs) from 0.8.0 to 0.8.1.
- [Release notes](https://github.com/saulius/croaring-rs/releases)
- [Commits](https://github.com/saulius/croaring-rs/compare/0.8.0...0.8.1)

---
updated-dependencies:
- dependency-name: croaring
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-21 16:39:05 +00:00
Dom Dwyer 1805f7fbe3
refactor: SequenceNumberSet::as_bytes -> to_bytes
The conversion is not (always) a cheap cast/conversion, and therefore
should be prefixed with "to_" and not "as_".
2023-02-15 11:03:15 +01:00
Marco Neumann f499022511
feat: add compaction level to commit metrics (#6985)
* feat: add compaction level to commit metrics

* test: more realism
2023-02-15 09:28:19 +00:00
Nga Tran 5c506058da
feat: skip partitions of wide tables (#6978)
* feat: skip partitions of wide tables

* test: one more test

* refactor: address review comments
2023-02-14 16:42:13 +00:00
dependabot[bot] 06d50e8611
chore(deps): Bump croaring from 0.7.0 to 0.8.0 (#6965)
Bumps [croaring](https://github.com/saulius/croaring-rs) from 0.7.0 to 0.8.0.
- [Release notes](https://github.com/saulius/croaring-rs/releases)
- [Commits](https://github.com/saulius/croaring-rs/compare/0.7.0...0.8.0)

---
updated-dependencies:
- dependency-name: croaring
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-13 09:09:32 +00:00
Dom Dwyer a85dcd745b
refactor(catalog): expose deleted_at on Namespace
Add the new catalog column to the Namespace representation/model.
2023-02-10 14:15:01 +01:00
Dom 9e8785f9c1
Merge branch 'main' into dom/cache-table-limit 2023-02-07 10:04:50 +00:00
Stuart Carnie eb245d6774
feat: Initial SQLite catalog schema (#6851)
* feat: Initial SQLite catalog schema

* chore: Run cargo hakari tasks

* feat: impls, many TODOs

* feat: completed `todo!()`'s

* chore: add remaining tests from postgres module

* feat: add SQLite to get_catalog API

* chore: Add docs

* chore: Placate clippy

* chore: Placate clippy

* chore: PR feedback from @domodwyer

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-02-06 22:55:14 +00:00
Dom Dwyer a633964f2b
feat(catalog): return max table limit in schema
The maximum number of tables is part of the Namespace, which is already
loaded in its entirety. This commit copies the value into the
NamespaceSchema, making it available for the router to utilise.
2023-02-06 17:33:55 +01:00
Nga Tran f46881496d
feat: split files to upgrade and compact (#6852)
* feat: first step to split upgrade

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: address review comments

* chore: rename per review request

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-04 13:17:29 +00:00
Nga Tran 670b811a2e
feat: split non-overlapping files and not compact them (#6849)
* chore:  start the overlap filter module

* feat: split  non-overlapping files and not compact them

* chore: remove unused file

* refactor: address review comment

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-03 23:25:34 +00:00
Carol (Nichols || Goulding) 30fea67701
fix: Move variables within format strings. Thanks clippy!
Changes made automatically using `cargo clippy --fix`.
2023-02-03 13:06:17 -05:00
Marko Mikulicic 167cde1838
fix(iox): Add transition shard to catalog setup (#6799) 2023-02-01 17:04:03 +00:00
Marko Mikulicic 0bc7d90ee3 chore: Avoid defining transition shard numbers in multiple crates 2023-01-27 18:30:34 +01:00
Nga Tran b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692)
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier

* chore: cleanup

* feat: new column max_l0_created_at to order files for deduplication

* chore: cleanup

* chore: debug info for chnaging cpu.parquet

* fix: update test parquet file

Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
Marco Neumann 5c462937ca
refactor: converters for `ParquetFile`<>`ParquetFileParams` (#6637)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-19 16:47:54 +00:00
Dom c313db8dfe
Merge branch 'main' into dom/seqnum-set 2023-01-09 10:33:53 +00:00
Nga Tran b856edf826
feat: function to get parttion candidates from partition table (#6519)
* feat: function to get parttion candidates from partition table

* chore: cleanup

* fix: make new_file_at the same value as created_at

* chore: cleanup

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-06 16:20:45 +00:00