Commit Graph

5032 Commits (c107434d2008694461499b38257f79be34b7d054)

Author SHA1 Message Date
Edd Robinson dbbfd2a9f8 feat: add delete support to row_group 2021-08-27 12:30:20 +01:00
Edd Robinson 95548dcec9 feat: add relative complement to RowIDs(bitmap) 2021-08-27 12:30:20 +01:00
kodiakhq[bot] 1e8aa6111a
Merge pull request #2426 from influxdata/re-enable-heappy
feat(iox): Enable heappy again
2021-08-27 09:23:29 +00:00
Marko Mikulicic 6e2aa2eef3
feat(iox): Enable heappy again 2021-08-27 11:13:30 +02:00
Nga Tran bcd39e225c feat: Management API for delete 2021-08-26 17:31:21 -04:00
Andrew Lamb f42f0349ed
feat: Implement basic metrics for `DeduplicateExec`, `IOxReadFilterNode`, `SchemaPivotExec` and `StreamSplitExec` (#2387)
* feat: Add baseline metrics to DeduplicateExec

* feat: Add metrics to `IOxReadFilterNode`

* feat: Add metrics for SchemaPivotExec

* feat: Add metrics to StreamSplitExec

* fix: Update for new API, cleanups

* test: Add tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 20:31:28 +00:00
kodiakhq[bot] ee0e8be70c
Merge pull request #2412 from influxdata/pd/duration-sampling-interval
feat: Support sampling interval strings in data generator
2021-08-26 18:02:09 +00:00
kodiakhq[bot] 8cf84a24eb
Merge branch 'main' into pd/duration-sampling-interval 2021-08-26 17:52:34 +00:00
Raphael Taylor-Davies e3e801d29a
feat: propagate span context into storage RPC queries (#2407)
* feat: propagate span context into storage RPC queries

* refactor: create ExecutionContextProvider trait

* chore: cleanup imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 17:11:49 +00:00
kodiakhq[bot] 3603f219fb
Merge pull request #2355 from influxdata/cn/create-database
feat: Add a generation directory inside the database directory on object storage
2021-08-26 13:45:16 +00:00
Carol (Nichols || Goulding) 7ca177978e fix: Add missing await from a logical merge conflict 2021-08-26 09:27:16 -04:00
Carol (Nichols || Goulding) 5566e1926c refactor: Use max_by_key instead of map max 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 82b566a3bb fix: Warn when a non-generation item is found in a database directory 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 7cf7fb02ed refactor: Rename database ObjectStore state types to DatabaseObjectStore 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) c16b4b1bff refactor: Have GenerationPath save a Generation field 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 63efda8213 fix: Avoid leaking generation id in an error type 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 9edffea878 docs: Add and clarify some caveats and intentions 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 7b6093092a refactor: make the `existing` method private and share more code 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 2c42a195ea fix: Remove now-unused generation_id method on iox object store 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 6d0959fbc3 fix: Move IOx object store creation logic into Database state machine 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 199d212b18 refactor: Move find-or-create IoxObjectStore logic into tests
This is the only place this logic is used; it's not appropriate for
production usage, where we only ever want to either find or error, or
create or error.
2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) c7eceac8a3 refactor: Have server determine database generation from object store 2021-08-26 09:14:23 -04:00
Carol (Nichols || Goulding) 1f0e37c9d1 refactor: Use the parsed_path macro to make path creation shorter 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) 6a79f69bfc refactor: Make methods on paths to produce other paths 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) 5e1b57de9a refactor: Borrow arcs instead of as_ref 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) cee2f21d47 feat: Add a find_or_create object store function for tests 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) d42cbaeef8 feat: Add a method to find one active database in object storage 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) 18ba3b5c59 feat: Create database directories with a generation ID 2021-08-26 09:14:22 -04:00
Carol (Nichols || Goulding) a8ada048dd test: Move success case first so it's easier to see when assumptions are broken 2021-08-26 09:14:22 -04:00
Andrew Lamb f975baba6b
chore: Update datafusion + other deps again (get baseline metrics) (#2422)
* chore: Update datafusion reference

* chore: cargo update

* fix: update explain tests to show Union

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 13:13:00 +00:00
kodiakhq[bot] 3dbed49659
Merge pull request #2420 from influxdata/crepererum/account_for_parquet_md_size
fix: correctly account for parquet metadata size
2021-08-26 12:26:26 +00:00
Marco Neumann 026202a05c fix: correctly account for parquet metadata size
We need to hold the parquet metadata in memory so that we're able to
create catalog checkpoints. We used to do that by holding the decoded
structure (provided by the upstream `parquet` crate) in memory and
serializing that data on demand to Apache Thrift.

There are two drawbacks:

1. We did not account for the memory usage of the decoded structures (or
   at least not fully).
2. We actually don't need the decoded data in memory, since for the
   checkpoint creation we only need to write the serialized data.

So this PR changes our wrapper so it holds the serialized data which is
then only decoded when it's really necessary. Since the serialized data
is a simple byte vector, we can also easily account for the size.

Note that this makes the accounted size of parquet chunks larger.
However, this data was always there; we just ignored it up until now. If
the size of the parquet metadata really becomes an issue, we could trade
some CPU time for memory by compressing it.
2021-08-26 13:24:32 +02:00
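The idea described in commit 026202a05c above (keep the serialized bytes, decode only on demand, and account for the byte vector's size) can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: `SerializedMetadata`, `DecodedMetadata`, and the toy 8-byte encoding are all hypothetical stand-ins, whereas the real wrapper holds Thrift-encoded parquet metadata.

```rust
use std::mem::size_of;

/// Stand-in for the decoded metadata structure (hypothetical; the real one
/// is provided by the upstream `parquet` crate).
#[derive(Debug, PartialEq)]
struct DecodedMetadata {
    row_count: u64,
}

/// Hypothetical wrapper that keeps only the serialized bytes in memory.
struct SerializedMetadata {
    data: Vec<u8>,
}

impl SerializedMetadata {
    fn new(data: Vec<u8>) -> Self {
        Self { data }
    }

    /// Memory footprint: the struct itself plus the heap buffer it owns.
    /// Because the payload is a plain byte vector, this is easy to compute.
    fn size(&self) -> usize {
        size_of::<Self>() + self.data.capacity()
    }

    /// Checkpoint creation only needs the raw bytes, so no decoding happens here.
    fn as_bytes(&self) -> &[u8] {
        &self.data
    }

    /// Decode only when the structured form is really needed.
    /// Toy encoding (an assumption): an 8-byte little-endian row count.
    fn decode(&self) -> Option<DecodedMetadata> {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(self.data.get(..8)?);
        Some(DecodedMetadata {
            row_count: u64::from_le_bytes(buf),
        })
    }
}
```

In this sketch the accounted size grows with the serialized payload, which mirrors the note in the commit message that the reported size of parquet chunks becomes larger.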
Raphael Taylor-Davies 1773bf5d37
feat: add storage client to influxdb_iox_client (#2404)
* feat: add storage client to influxdb_iox_client

* chore: fix type_url

* refactor: split storage into separate crate

* chore: fix doctest

* chore: review feedback

* chore: add generated_types cleanup ticket

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-26 10:30:59 +00:00
kodiakhq[bot] c81a650409
Merge pull request #2415 from influxdata/crepererum/job_start_time_in_system_table
feat: add start time to `operations` system table
2021-08-26 08:11:49 +00:00
kodiakhq[bot] b1ecf1bfed
Merge branch 'main' into crepererum/job_start_time_in_system_table 2021-08-26 08:04:10 +00:00
Andrew Lamb ddf6c6362e
chore: update DataFusion again (#2411)
* chore: update datafusion ref

* chore: run cargo update

* refactor: Rename concurrency to target_partitions, avoid deprecation warning
2021-08-26 08:03:13 +00:00
Marco Neumann 558aa54aa3 feat: add start time to `operations` system table 2021-08-26 10:00:29 +02:00
Paul Dix 64fca1ee34 feat: Support sampling interval strings in data generator
This changes the sampling_interval in the data generator to be a string, supporting units such as ns, us, ms, s, m and h, among others.
2021-08-25 17:35:01 -04:00
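As a rough illustration of what interval strings like those in commit 64fca1ee34 above involve, the sketch below parses a value-plus-unit string into a `Duration` using only the standard library. The function name `parse_interval` and the exact accepted grammar are assumptions made for this example; the data generator may well rely on an existing parsing crate and accept more units.

```rust
use std::time::Duration;

/// Hypothetical helper: parse strings such as "10s" or "500ms" into a Duration.
/// Only the units ns, us, ms, s, m and h are handled in this sketch.
fn parse_interval(s: &str) -> Option<Duration> {
    let s = s.trim();
    // Split at the first non-digit character: digits form the value,
    // the remainder is the unit. A string without a unit yields None.
    let split = s.find(|c: char| !c.is_ascii_digit())?;
    let (value, unit) = s.split_at(split);
    // An empty value (e.g. the input "s") fails to parse and yields None.
    let value: u64 = value.parse().ok()?;
    match unit {
        "ns" => Some(Duration::from_nanos(value)),
        "us" => Some(Duration::from_micros(value)),
        "ms" => Some(Duration::from_millis(value)),
        "s" => Some(Duration::from_secs(value)),
        // checked_mul guards against overflow for very large values.
        "m" => value.checked_mul(60).map(Duration::from_secs),
        "h" => value.checked_mul(3600).map(Duration::from_secs),
        _ => None,
    }
}
```

With this sketch, `parse_interval("500ms")` gives a 500 millisecond duration, while an input without a unit (such as `"10"`) or with an unknown unit is rejected.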
kodiakhq[bot] 4c3056dd91
Merge pull request #2406 from influxdata/jemallocon
feat: Re-enable jemalloc
2021-08-25 12:54:58 +00:00
Marko Mikulicic 31521e076e
feat: Re-enable jemalloc
But without heappy
2021-08-25 14:46:28 +02:00
kodiakhq[bot] e441c72bbb
Merge pull request #2405 from influxdata/er/refactor/read_buffer/rle_entries
refactor: use sorted vector for RLE dictionary entry collection
2021-08-25 11:44:33 +00:00
Edd Robinson 69329b0b38
Merge branch 'main' into er/refactor/read_buffer/rle_entries 2021-08-25 12:08:44 +01:00
Edd Robinson 11e88877f4 fix: correct size estimation of RLE encoding 2021-08-25 12:03:04 +01:00
Edd Robinson d18e835b4f refactor: remove next_id generation 2021-08-25 11:31:51 +01:00
Edd Robinson 833a410e4a refactor: replace BTreeSet with Vec
Benchmarks are roughly the same, with some variation depending on the workload

 critcmp master_string pr_string
group                                                                master_string                             pr_string
-----                                                                -------------                             ---------
_select/enc_"plain encoder"/rows_100000/loc_End/card_100             1.12     43.9±0.41µs  2.1 GElem/sec       1.00     39.4±0.40µs  2.4 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_End/card_1000            1.00     32.9±0.43µs  2.8 GElem/sec       1.00     33.0±0.48µs  2.8 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_End/card_10000           1.00     32.1±0.37µs  2.9 GElem/sec       1.00     32.2±0.43µs  2.9 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_Middle/card_100          1.02     40.2±0.79µs  2.3 GElem/sec       1.00     39.5±0.56µs  2.4 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_Middle/card_1000         1.00     33.0±0.42µs  2.8 GElem/sec       1.00     33.0±0.38µs  2.8 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_Middle/card_10000        1.00     32.3±0.41µs  2.9 GElem/sec       1.00     32.4±0.53µs  2.9 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_Start/card_100           1.04     41.2±1.45µs  2.3 GElem/sec       1.00     39.5±0.54µs  2.4 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_Start/card_1000          1.01     33.4±0.87µs  2.8 GElem/sec       1.00     32.9±0.43µs  2.8 GElem/sec
_select/enc_"plain encoder"/rows_100000/loc_Start/card_10000         1.01     32.5±0.44µs  2.9 GElem/sec       1.00     32.3±0.51µs  2.9 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_End/card_1000           1.00    382.0±3.43µs  2.4 GElem/sec       1.00    382.0±4.04µs  2.4 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_End/card_10000          1.00    376.7±4.67µs  2.5 GElem/sec       1.00   377.2±12.83µs  2.5 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_End/card_100000         1.00    374.4±3.08µs  2.5 GElem/sec       1.00    375.0±4.09µs  2.5 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_Middle/card_1000        1.00    382.4±4.68µs  2.4 GElem/sec       1.00    382.8±4.61µs  2.4 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_Middle/card_10000       1.00    375.8±3.55µs  2.5 GElem/sec       1.00    376.0±4.17µs  2.5 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_Middle/card_100000      1.00    374.7±3.76µs  2.5 GElem/sec       1.00    375.1±4.44µs  2.5 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_Start/card_1000         1.00    382.1±3.80µs  2.4 GElem/sec       1.00    382.2±3.44µs  2.4 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_Start/card_10000        1.00    376.5±4.85µs  2.5 GElem/sec       1.00    376.5±4.76µs  2.5 GElem/sec
_select/enc_"plain encoder"/rows_1000000/loc_Start/card_100000       1.00    375.0±3.41µs  2.5 GElem/sec       1.00    375.3±4.28µs  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_End/card_10000         1.00      3.7±0.02ms  2.5 GElem/sec       1.01      3.8±0.06ms  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_End/card_100000        1.00      3.7±0.01ms  2.5 GElem/sec       1.01      3.8±0.06ms  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_End/card_1000000       1.00      3.7±0.01ms  2.5 GElem/sec       1.01      3.8±0.10ms  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_Middle/card_10000      1.00      3.8±0.03ms  2.5 GElem/sec       1.00      3.8±0.04ms  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_Middle/card_100000     1.00      3.8±0.03ms  2.5 GElem/sec       1.07      4.0±0.73ms  2.3 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_Middle/card_1000000    1.02      3.8±0.06ms  2.4 GElem/sec       1.00      3.8±0.03ms  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_Start/card_10000       1.00      3.8±0.03ms  2.5 GElem/sec       1.00      3.8±0.03ms  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_Start/card_100000      1.00      3.8±0.04ms  2.5 GElem/sec       1.00      3.8±0.04ms  2.5 GElem/sec
_select/enc_"plain encoder"/rows_10000000/loc_Start/card_1000000     1.00      3.8±0.05ms  2.5 GElem/sec       1.00      3.8±0.03ms  2.5 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_End/card_100                1.00      2.9±0.03µs 32.0 GElem/sec       1.01      2.9±0.09µs 31.6 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_End/card_1000               1.06  1002.0±13.75ns 93.0 GElem/sec       1.00    948.3±9.63ns 98.2 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_End/card_10000              1.02      4.6±0.05µs 20.3 GElem/sec       1.00      4.5±0.17µs 20.7 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_Middle/card_100             1.00      3.0±0.03µs 31.5 GElem/sec       1.00      2.9±0.04µs 31.6 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_Middle/card_1000            1.04   788.9±12.39ns 118.1 GElem/sec      1.00   755.7±20.50ns 123.2 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_Middle/card_10000           1.00      2.8±0.43µs 33.5 GElem/sec       1.02      2.8±0.03µs 32.8 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_Start/card_100              1.00      2.9±0.04µs 32.3 GElem/sec       1.02      2.9±0.10µs 31.7 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_Start/card_1000             1.03   597.4±14.85ns 155.9 GElem/sec      1.00   581.1±13.60ns 160.3 GElem/sec
select/enc_"RLE encoder"/rows_100000/loc_Start/card_10000            1.42   606.6±13.37ns 153.5 GElem/sec      1.00    426.0±6.32ns 218.6 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_End/card_1000              1.00      3.3±0.03µs 280.9 GElem/sec      1.03      3.4±0.47µs 273.5 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_End/card_10000             1.00      4.6±0.09µs 200.6 GElem/sec      1.03      4.8±0.06µs 194.8 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_End/card_100000            1.01     41.5±0.44µs 22.4 GElem/sec       1.00     41.1±0.57µs 22.6 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_Middle/card_1000           1.02      3.1±0.04µs 296.8 GElem/sec      1.00      3.1±0.05µs 301.8 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_Middle/card_10000          1.00      2.8±0.05µs 332.6 GElem/sec      1.12      3.1±0.46µs 297.2 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_Middle/card_100000         1.10     23.7±0.30µs 39.2 GElem/sec       1.00     21.5±0.25µs 43.3 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_Start/card_1000            1.00      2.9±0.03µs 321.1 GElem/sec      1.00      2.9±0.04µs 320.5 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_Start/card_10000           1.00    623.6±7.76ns 1493.6 GElem/sec     1.06   661.5±44.34ns 1408.0 GElem/sec
select/enc_"RLE encoder"/rows_1000000/loc_Start/card_100000          1.00   954.4±18.68ns 975.9 GElem/sec      2.94      2.8±0.89µs 331.9 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_End/card_10000            1.01      7.0±0.09µs 1335.5 GElem/sec     1.00      6.9±0.10µs 1353.8 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_End/card_100000           1.06     42.8±0.78µs 217.6 GElem/sec      1.00     40.4±0.49µs 230.7 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_End/card_1000000          1.00    397.9±6.26µs 23.4 GElem/sec       1.09    433.3±5.78µs 21.5 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_Middle/card_10000         1.03      5.2±0.05µs 1779.4 GElem/sec     1.00      5.1±0.17µs 1840.2 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_Middle/card_100000        1.00     20.3±0.21µs 458.9 GElem/sec      1.15     23.4±0.30µs 397.9 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_Middle/card_1000000       1.18    211.4±3.28µs 44.1 GElem/sec       1.00    178.5±2.56µs 52.2 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_Start/card_10000          1.00      3.0±0.04µs 3091.2 GElem/sec     1.00      3.0±0.08µs 3079.4 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_Start/card_100000         1.00   785.1±10.39ns 11862.8 GElem/sec    2.48  1948.8±44.72ns 4778.9 GElem/sec
select/enc_"RLE encoder"/rows_10000000/loc_Start/card_1000000        1.00      6.5±0.07µs 1433.0 GElem/sec     2.07     13.5±0.16µs 692.3 GElem/sec
2021-08-25 11:19:58 +01:00
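The general technique behind commit 833a410e4a above, keeping entries in a sorted `Vec` and using binary search in place of a `BTreeSet`, might look like the sketch below. `SortedEntries` and its methods are hypothetical names for illustration; the actual RLE dictionary in the read buffer stores more than plain strings and is not reproduced here.

```rust
/// Hypothetical sorted-vector set, standing in for a BTreeSet of
/// dictionary entries.
#[derive(Debug, Default)]
struct SortedEntries {
    entries: Vec<String>,
}

impl SortedEntries {
    /// Insert while keeping the vector sorted.
    /// Returns false if the value was already present.
    fn insert(&mut self, value: &str) -> bool {
        match self.entries.binary_search_by(|e| e.as_str().cmp(value)) {
            Ok(_) => false,
            Err(pos) => {
                // `pos` is the index that keeps the vector ordered.
                self.entries.insert(pos, value.to_string());
                true
            }
        }
    }

    /// Membership test via binary search: O(log n) comparisons.
    fn contains(&self, value: &str) -> bool {
        self.entries
            .binary_search_by(|e| e.as_str().cmp(value))
            .is_ok()
    }
}
```

The trade-off in a structure like this is that lookups scan contiguous memory, while an insert in the middle has to shift later elements, so which variant is faster depends on the workload.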
kodiakhq[bot] 6508f46667
Merge pull request #2394 from influxdata/er/refactor/read_buffer/table_arg
refactor: remove redundant argument
2021-08-25 10:07:38 +00:00
Edd Robinson f3c57c47fa
Merge branch 'main' into er/refactor/read_buffer/table_arg 2021-08-25 10:30:12 +01:00
kodiakhq[bot] 16f55fff4d
Merge pull request #2400 from influxdata/crepererum/rub_shrink_rle
feat: make `RLE` a bit smaller by capacity-based allocation
2021-08-25 09:06:04 +00:00
kodiakhq[bot] c98723e3b3
Merge branch 'main' into crepererum/rub_shrink_rle 2021-08-25 08:58:22 +00:00
kodiakhq[bot] 2a02d5eb72
Merge pull request #2401 from influxdata/nojemalloc
feat: Disable jemalloc/heappy and use system allocator
2021-08-25 08:57:47 +00:00