Commit Graph

550 Commits (b8f7319704ea7f802843a5e383769d917c9c29ac)

Author SHA1 Message Date
Raphael Taylor-Davies dd422492e2
feat: sort order in schema (#1357) (#1667)
* feat: sort order in schema (#1357)

* chore: review feedback

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-14 18:10:41 +00:00
kodiakhq[bot] fc1b5ea165
Merge branch 'main' into crepererum/parquet_metadata_wrapper 2021-06-14 11:20:39 +00:00
Marco Neumann 518f7c6f15 refactor: wrap upstream parquet MD into struct + clean up interface
This prevents users from `parquet_file::metadata` to also depend on
`parquet` directly. Furthermore they don't need to important dozend of
functions and can instead just use `IoxParquetMetaData` directly.
2021-06-14 13:17:01 +02:00
Andrew Lamb 0d8d32fd8f
chore: Update deps to get latest arrow (#1708)
* chore: Update deps to get latest arrow

* fix: Update to rust 1.52

* fix: clippy
2021-06-14 11:08:09 +00:00
Marco Neumann 898c638630 feat: wire up catalog checkpointing
Closes #1381.
2021-06-14 10:08:32 +02:00
Marko Mikulicic 5a68abaa53
feat: Implement LayeredTracing
__Rationale__

We currently use the `tracing` framework to output to both log outputs (e.g. stdout for k8s) and distributed tracing collectors (e.g. opentelemetry jaeger).

However, due to a limitation in the `tracing` SDK, we can only have one "filter" level that applies
to both logs and tracing outputs. This is unpractical because tracing collectors are designed
to receive high verbosity data (which will be then sampled within the opentelemetry library),
while logs generally are limited to the DEBUG level on production.

This PR adds a `FilteredLayer` tracing subscriber layer, that wraps a subscriber layer with a independent
filter, which can filter events goint to the wrapper subscriber layer more agressively than the global layer.

This will allow us to emit logs at INFO or DEBUG level while passing all events to opentelemetry at TRACE
level (and opentelemetry SDK will then sample the events so that only a small part will be sent to the
ot collector)

__Note__

This PR just implements the `FilteredLayer` and a test. Another PR will integrate this with
our log/tracing setup code.
2021-06-11 04:32:47 +02:00
Raphael Taylor-Davies 11b25b3aaf
refactor: swap order of partition and table in in-memory catalog (#1678)
* refactor: swap order of partition and table in in-memory catalog

* chore: review feedback

* chore: validate panic message

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-10 16:40:30 +00:00
Marco Neumann 876f642860 test: add benchmark for catalog loading 2021-06-10 15:42:21 +02:00
Andrew Lamb 29dadba4f3
chore: update dependencies (#1673)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-10 13:01:31 +00:00
Andrew Lamb a614fef5bc
chore: remove more unused dependencies (#1658)
* chore: remove more unused deps

* refactor: move benchmarks into server_benchmarks crate
2021-06-09 10:17:20 +00:00
Raphael Taylor-Davies 07c4277ca7
refactor: schema merge to give more control over field merging (#1653)
* refactor: schema merge to give more control over field merging

* chore: review feedback
2021-06-09 06:30:45 +00:00
Andrew Lamb e9834a907c
feat: Prune on boolean column predicates too (#1629)
* chore: update deps to get latest DataFusion

* fix: enable boolean pruning tests

* fix: update explain plan tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 16:51:30 +00:00
kodiakhq[bot] 87297f7db4
Merge branch 'main' into cn/delete 2021-06-07 13:32:42 +00:00
Raphael Taylor-Davies 5749a2c119
chore: cleanup legacy TSM -> parquet code (#1639)
* chore: cleanup legacy parquet code

* chore: remove tests of removed functionality

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-07 12:59:33 +00:00
Carol (Nichols || Goulding) f4a9a5ae56 fix: Remove write buffer 2021-06-04 14:40:17 -04:00
Andrew Lamb 42f26b609b
refactor: Move `query_tests` and `server_benchmarks` into their own crate --> smaller `server` (#1628)
* refactor: Separate query_tests into its own crate

* fix: references

* refactor: break out server benchmarks

* fix: Update query_tests/src/lib.rs

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2021-06-04 17:31:19 +00:00
Marco Neumann bbd73e59be feat: jitter background clean-up job + wait on first job 2021-06-03 11:23:29 +02:00
Marco Neumann 0a625b50e6 feat: store transaction timestamp in preserved catalog 2021-06-02 09:41:19 +02:00
Andrew Lamb 83b2eacea6
chore: update deps for datafusion (#1601) 2021-06-01 20:00:39 +00:00
Andrew Lamb c0e4e6951a
chore: update datafusion (#1594) 2021-06-01 15:55:07 +00:00
Andrew Lamb d3711a5591
refactor: Use ParquetExec from DataFusion to read parquet files (#1580)
* refactor: use ParquetExec to read parquet files

* fix: test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-06-01 14:44:07 +00:00
Andrew Lamb 3338ddcca9
chore: update dependencies (#1593) 2021-06-01 14:05:56 +00:00
Andrew Lamb 73cedd2f88
chore: remove unused dependency (#1587)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-31 14:22:11 +00:00
Andrew Lamb d50c7c8919
chore: remove unused dependency (#1581) 2021-05-31 09:58:10 +00:00
Andrew Lamb 00e735ef0d
chore: remove unused dependencies (#1583) 2021-05-29 10:31:57 +00:00
Raphael Taylor-Davies d8f19348bf
feat: per-column dictionaries in MUB (#1570)
* feat: per-column dictionaries in MUB

* chore: fmt

* refactor: remove chunk-level dictionary

* chore: remove redundant sort

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-28 13:51:56 +00:00
kodiakhq[bot] 166851d952
Merge branch 'main' into crepererum/in_file_metadata 2021-05-26 07:39:53 +00:00
Andrew Lamb 638d754e0f
chore: Update datafusion + other deps (#1557) 2021-05-25 22:32:14 +00:00
Raphael Taylor-Davies c2fd85209c
feat: wait for task shutdown on DedicatedExecutor (#1537) 2021-05-25 11:33:55 +00:00
Marco Neumann 19a2733d30 feat: preserve transaction metadata in parquets 2021-05-25 09:56:12 +02:00
Andrew Lamb 14ba25f86d
chore: Update datafusion and use released version of arrow crates (#1546)
* chore: Update datafusion and use released version of arrow crate

* fix: Update for change in API
2021-05-24 15:37:22 +00:00
Edd Robinson 634ceb886b feat: add Dictionary Values type 2021-05-20 10:49:49 +01:00
Marko Mikulicic ce2f8351be
fix: Cache outbound gRPC connections 2021-05-19 18:28:45 +02:00
Marko Mikulicic 40b21dbca1
feat: Add /debug/pprof/profile 2021-05-19 13:08:42 +02:00
Andrew Lamb 133ce12827
chore: update deps (#1499)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-14 20:22:16 +00:00
Raphael Taylor-Davies f9178dbb5f
feat: push metrics into catalog (#1488)
* feat: push metrics into catalog

* chore: minor cleanup

* fix: include db labels in chunk metric domains

* chore: fmt

* fix: don't allow dropping moving chunks

* chore: further tweaks

* chore: review feedback

* feat: use new_unregistered() for metric instruments instead of default

* chore: use &[KeyValue] instead of &Vec<KeyValue>

* refactor: make GauageValue non default constructible
2021-05-14 17:37:39 +00:00
Andrew Lamb ebea554e5a
feat: Report num_cpus seen by IOx on startup (#1484) 2021-05-12 13:42:16 +00:00
Marco Neumann 0b1ef52481
chore: upgrade arrow and datafusion (#1483) 2021-05-12 13:08:37 +00:00
Raphael Taylor-Davies b02105e47b
feat: construct StringDictionary from PackedStringArray (#1475)
* feat: construct StringDictionary from PackedStringArray

* chore: fix formatting

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-11 20:16:25 +00:00
Raphael Taylor-Davies 4409d2c8af
feat: instrument catalog locks (#1464)
* feat: instrument catalog locks (#1355)

* chore: add metrics test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-11 18:59:11 +00:00
Raphael Taylor-Davies c85d1574eb
feat: move dictionary and bitset into arrow_utils (#1459)
* feat: move dictionary and bitset into arrow_utils

* chore: review feedback

* chore: remove redundant dictionary methods

* chore: consistent type parameter name in PackedStringArray

* chore: review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-11 16:43:38 +00:00
Edd Robinson 3622a92c8b feat: wire in rb column metrics 2021-05-11 13:00:52 +01:00
Andrew Lamb b6290f1ff3
chore: update deps (#1466) 2021-05-10 22:14:08 +00:00
Raphael Taylor-Davies e2e3c9f77c
feat: workaround bug/limitation in OT handling of observers (#1457)
* feat: workardound bug/limitation in OT handling of observers

* chore: fix lints

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-07 18:53:28 +00:00
Edd Robinson 8ccc359cab refactor: address PR feedback 2021-05-07 13:48:44 +01:00
Edd Robinson eae3fec571 feat: wire up regex UDF as predicate filter expr 2021-05-07 13:44:51 +01:00
Edd Robinson 3fc2c9fc04 feat: add DataFusion regex match operator
This commit adds a new custom UDF to IOx that provide a regex operator to Datafusion plans.
Effectively it allows predicates to contain regex operators that are applied as filters, only allowing rows that satisfy the regex to be returned.

I did not use the Arrow regex kernel for this work because that does not return a boolean array indicating which rows matched a regex, but instead returns a new string array of results. This doesn't work well with DF's approach to filtering.
2021-05-07 13:44:51 +01:00
Raphael Taylor-Davies 216903a949
refactor: move protobuf conversion logic to generated_types (#1437)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-06 15:49:27 +00:00
Raphael Taylor-Davies 10f89a3e8d
refactor: split entry out into separate crate (#1428)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-06 11:36:23 +00:00
Raphael Taylor-Davies b153fc9ef3
chore: remove unused dependency from influxdb_line_protocol (#1435) 2021-05-06 10:13:00 +00:00
Raphael Taylor-Davies 01ba48c1ba
feat: instrumentable RwLocks (#1355) (#1421) 2021-05-06 08:36:30 +00:00
Raphael Taylor-Davies 7e2e2c4caf
refactor: initial split of observability_deps (#1429) 2021-05-06 08:25:09 +00:00
Andrew Lamb 86771ea629
chore: update arrow/datafusion deps (#1433)
* chore: update datafusion deps

* chore: update arrow deps

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-05 22:37:31 +00:00
Raphael Taylor-Davies ca1c698fd0
chore: update hashbrown (#1430)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-05-05 22:32:46 +00:00
kodiakhq[bot] 7e01650c53
Merge branch 'main' into reflection 2021-05-05 20:29:13 +00:00
Marko Mikulicic 0d6d94dc00
feat: Enable gRPC reflection 2021-05-05 22:04:47 +02:00
Raphael Taylor-Davies 411cf134e9
refactor: explode arrow_deps (#1425)
* refactor: explode arrow_deps

* chore: workaround doctest bug
2021-05-05 16:59:12 +00:00
Marco Neumann 1f42eb89cd feat: implement parquet metadata handling
Closes #1379 and contributes to #1380.
2021-05-05 13:29:16 +02:00
Marco Neumann 80d78c2a67 feat: re-export matching `parquet_format` in `arrow_deps` 2021-05-05 13:29:16 +02:00
Marco Neumann 34754ebcdb refactor: move MemoryStream to arrow_deps 2021-05-05 13:29:16 +02:00
kodiakhq[bot] ce568b4b2f
Merge branch 'main' into crepererum/issue1253 2021-05-03 11:11:42 +00:00
Marko Mikulicic b579ef8646
feat: Add jemalloc stats 2021-05-03 12:10:48 +02:00
Marco Neumann 136c35cb88 feat: implement transaction handling for catalog
Closes #1253.
2021-05-03 10:04:35 +02:00
Marko Mikulicic 234c899631
feat: Add gauge metric 2021-04-30 14:10:42 +02:00
Edd Robinson 13fbf2e68d refactor: plumb registry to gRPC server 2021-04-29 14:00:05 +01:00
Andrew Lamb 74d35ce9a4
chore: update deps (#1365) 2021-04-29 10:52:43 +00:00
Andrew Lamb a64e622f6e
feat: Add interactive SQL repl (#1332)
* feat: Add interactive SQL repl

* fix: try and fix test

* fix: remove test for prompt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-28 20:41:02 +00:00
Raphael Taylor-Davies 7ca1da3fcd
feat: pushdown table and partition key predicates to catalog (#736) (#1327)
* feat: catalog predicate pushdown (#736)

* chore: fix lints

* chore: review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-27 15:31:47 +00:00
Raphael Taylor-Davies 0a835436ac
feat: use bitmasks within MUB (#1274) (#1289)
* feat: use bitmasks within MUB (#1274)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-26 18:00:16 +00:00
Raphael Taylor-Davies e393d16e36
chore: update arrow versions (#1300) 2021-04-26 10:11:25 +00:00
Edd Robinson 87de656d23 feat: metrics test helpers 2021-04-23 15:58:48 +00:00
Edd Robinson 97b2369140 refactor: swap existing metrics for THE NEW WAY 2021-04-23 15:58:48 +00:00
Edd Robinson ea909f45ad feat: RED metrics 2021-04-23 15:58:48 +00:00
Raphael Taylor-Davies 74c25f541d
feat: fast MUB dictionary arrow conversion (#1273)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-22 20:31:44 +00:00
Raphael Taylor-Davies fe4fa29930
feat: MUB benchmarks (#635) (#1271) 2021-04-22 11:15:32 +00:00
Marko Mikulicic 83d6550316 feat: Implement write_entry_downstream 2021-04-21 20:50:46 +00:00
Carol (Nichols || Goulding) f0b93c5c8c refactor: Rename the wal crate to write_buffer 2021-04-21 17:43:03 +00:00
Raphael Taylor-Davies fdb22c9d9b
chore: update tonic to 0.4.2 (#1268) 2021-04-21 11:54:03 +00:00
kodiakhq[bot] dc6637b448
Merge branch 'main' into jgm-tracing-logging 2021-04-20 20:16:40 +00:00
Edd Robinson 895827aa98 chore: disable SIMD in Arrow 2021-04-20 17:30:50 +00:00
Jacob Marble 5539887cd6 chore: put env_logger back 2021-04-19 15:48:30 -07:00
Jacob Marble 87396edc56 chore: incremental dependency update
This fixes a metrics CI error, and introduces two new CI errors.
2021-04-19 15:48:30 -07:00
Jacob Marble df9dd15ab8 feat(tracing): improve logs and tracing
Logs and traces are emitted via one pipeline. For now, it is not
possible to emit both, but it should be possible in a few weeks, as
tokio/tracing/tracing-subscriber is going through some refactoring recently.

All affected flags are well-documented, and I have tested all but the
OTLP output flags.

chore: clippy happy

chore: revert instrumentation changes

feat: add log format logfmt, log destinations stderr, stdout

chore: clippy happy
2021-04-19 15:48:30 -07:00
Marco Neumann fd0da7e74a chore: upgrade arrow and Rust
See https://github.com/apache/arrow/pull/10082 for upstream PR.
2021-04-19 14:00:04 +02:00
Edd Robinson 4b706141de refactor: log new row groups added to RB 2021-04-19 10:25:57 +00:00
Nga Tran 4c23ca8888 feat: full implementation of parquet's read_filter for review 2021-04-16 16:03:24 -04:00
Marko Mikulicic 878b1b318e feat: Initial scaffolding for routing layer
Part of #916

Adding first-class concept of ShardId in shard config, fixes #1156

NEXT:

- [ ] implement sharder
- [ ] implement `write_entry_downstream`
- [ ] add tests
2021-04-15 09:02:47 +00:00
Edd Robinson a3fc5e2474 refactor: change sync::RwLock to parking_lot 2021-04-14 19:18:03 +00:00
Andrew Lamb f5f768d750
feat: Add a dedicated threadpool for running queries (#1191)
* feat: use a dedicated tokio threadpool for running queries

* feat: plumb number of executor threads through to command line

thread through command line

* fix: Logical merge conflict

* fix: another logical conflict

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-14 10:48:09 +00:00
Edd Robinson 9834c845db test: add influxrpc tag_values benches
The initial benchmarks look like this on my i9 MBP:

```
Data in one open chunk and one closed chunk of mutable buffer/tag0/no_pred           1.00     91.0±2.55ms        ? ?/sec
Data in one open chunk and one closed chunk of mutable buffer/tag0/with_pred         1.00     11.5±0.72ms        ? ?/sec
Data in one open chunk and one closed chunk of mutable buffer/tag1/no_pred           1.00    120.3±5.10ms        ? ?/sec
Data in one open chunk and one closed chunk of mutable buffer/tag1/with_pred         1.00     11.2±0.22ms        ? ?/sec
Data in one open chunk and one closed chunk of mutable buffer/tag2/no_pred           1.00    203.2±8.45ms        ? ?/sec
Data in one open chunk and one closed chunk of mutable buffer/tag2/with_pred         1.00     11.2±0.21ms        ? ?/sec
Data in open chunk of mutable buffer, and one chunk of read buffer/tag0/no_pred      1.00    100.3±3.73ms        ? ?/sec
Data in open chunk of mutable buffer, and one chunk of read buffer/tag0/with_pred    1.00     31.2±1.80ms        ? ?/sec
Data in open chunk of mutable buffer, and one chunk of read buffer/tag1/no_pred      1.00    126.7±2.29ms        ? ?/sec
Data in open chunk of mutable buffer, and one chunk of read buffer/tag1/with_pred    1.00     33.0±1.70ms        ? ?/sec
Data in open chunk of mutable buffer, and one chunk of read buffer/tag2/no_pred      1.00    212.0±6.86ms        ? ?/sec
Data in open chunk of mutable buffer, and one chunk of read buffer/tag2/with_pred    1.00     18.1±0.99ms        ? ?/sec
Data in single open chunk of mutable buffer/tag0/no_pred                             1.00     98.7±6.08ms        ? ?/sec
Data in single open chunk of mutable buffer/tag0/with_pred                           1.00     11.2±0.37ms        ? ?/sec
Data in single open chunk of mutable buffer/tag1/no_pred                             1.00    118.9±3.97ms        ? ?/sec
Data in single open chunk of mutable buffer/tag1/with_pred                           1.00     11.7±0.64ms        ? ?/sec
Data in single open chunk of mutable buffer/tag2/no_pred                             1.00    202.1±8.49ms        ? ?/sec
Data in single open chunk of mutable buffer/tag2/with_pred                           1.00     11.1±0.27ms        ? ?/sec
Data in two read buffer chunks/tag0/no_pred                                          1.00    109.2±5.20ms        ? ?/sec
Data in two read buffer chunks/tag0/with_pred                                        1.00     44.2±1.83ms        ? ?/sec
Data in two read buffer chunks/tag1/no_pred                                          1.00    132.9±3.79ms        ? ?/sec
Data in two read buffer chunks/tag1/with_pred                                        1.00     41.7±2.43ms        ? ?/sec
Data in two read buffer chunks/tag2/no_pred                                          1.00    222.4±7.00ms        ? ?/sec
Data in two read buffer chunks/tag2/with_pred                                        1.00     27.9±0.92ms        ? ?/sec
```
2021-04-14 09:36:39 +00:00
kodiakhq[bot] 8e0ee48018
Merge branch 'main' into ntran/query_local_parquet 2021-04-13 22:38:56 +00:00
Andrew Lamb 518df742df
chore: update arrow deps (#1195) 2021-04-13 18:05:03 +00:00
Nga Tran 4a6d6bd7ad feat: initial work for querying data from parquet file in object store 2021-04-13 13:57:46 -04:00
Raphael Taylor-Davies 55a77914b1
feat: basic snapshot caching (#1184) 2021-04-13 17:10:28 +00:00
Nga Tran 7f77a01e61 chore: merged main to branch and resolved conflicts 2021-04-12 12:09:04 -04:00
Nga Tran 453aeaf1a0 feat: Add tests for writing RB chunks to Object Store 2021-04-09 17:39:23 -04:00
Jake Goulding 16cc37e7f3
chore: update cloud-storage to 0.9 (#1165)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-09 14:37:51 +00:00
Marko Mikulicic e76980928b feat: Implement Update API 2021-04-08 22:25:36 +00:00
Carol (Nichols || Goulding) 1552c0113a Merge remote-tracking branch 'origin/main' into feature-query 2021-04-08 14:40:12 -04:00
Edd Robinson a34de76c49 refactor: wire read buffer tracker in 2021-04-08 18:20:37 +00:00
Marko Mikulicic df5349406a feat: Switch to jemalloc 2021-04-08 14:19:35 +00:00
Andrew Lamb a1ee443f2a
chore: update arrow deps (#1148) 2021-04-07 22:08:58 +00:00
Carol (Nichols || Goulding) ebb6bbd13c test: start of integration tests of influxdb2 client against influxdb 2.0 OSS 2021-04-07 14:09:39 -04:00
Raphael Taylor-Davies c2355aca6d
feat: add basic memory tracking (#1125)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-07 15:38:24 +00:00
Andrew Lamb 7cc9f06e74
chore: Update arrow / datafusion deps again (#1126)
* chore: Update arrow dependencies

* test: add test for SHOW COLUMNS

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-04-06 12:41:41 +00:00
Raphael Taylor-Davies 9a2e636d8c
refactor: move tracking utilities into separate crate (#1124) 2021-04-06 11:43:11 +00:00
Nga Tran 6e01fbc382 feat: ause TableSummary as metadata for parquet chunk's tables and read buffer's read_filter ot get data 2021-04-05 15:37:34 -04:00
Nga Tran 59bd66a33a refactor: Merge remote-tracking branch 'origin' into ntran/write_parquet_3 2021-04-05 09:40:08 -04:00
Jacob Marble 80d55d0829 chore: rename tracing_deps to observability_deps
OpenTelemetry makes this necessary.
2021-04-02 13:14:30 -07:00
Nga Tran 4bdf8963e6 feat: continue buidling foundation for writing RB chunks to parquet files 2021-04-02 16:06:25 -04:00
Jacob Marble e885cf072e feat: add Prometheus metrics endpoint
This change adds HTTP endpoint /metrics. Now, I need to add a metric.
2021-04-02 09:35:16 -07:00
Carol (Nichols || Goulding) 0b880d3534 chore: Group all tracing-related crates under one crate for easier upgrade management 2021-04-02 09:54:39 -04:00
Aakash Hemadri 9952de57ff
fix: Update Cargo.lock for influxdb_client
Add `url` to Cargo.toml

Signed-off-by: Aakash Hemadri <aakashhemadri123@gmail.com>
2021-04-02 10:35:43 +05:30
Andrew Lamb 18ca5c5292 refactor: Implement `as_str()` directly rather than using a crate 2021-04-01 21:22:04 +00:00
Raphael Taylor-Davies c0abeba792 feat: system tables 2021-04-01 21:22:04 +00:00
Nga Tran 49267114d3 chore: merge main into branch and resolve conflicts 2021-04-01 13:22:49 -04:00
Andrew Lamb a59a6edbb8
feat: add SHOW TABLES and `select * from information_schema.columns` (#1102)
* chore: update dependencies

* test: add tests for new information schema
2021-04-01 12:50:37 +00:00
Marko Mikulicic 830676d843 chore: Add the serde feature to bytes dep 2021-04-01 09:00:29 +00:00
Nga Tran 19a453a483 feat: finally have some framework with clear todos for writing a chunk into parquet files 2021-03-31 16:21:53 -04:00
Andrew Lamb b61875e0b2
feat: Add TableSummary calculation to ReadBuffer (#1092) 2021-03-31 16:26:37 +00:00
Nga Tran cd409b471f feat: continue the implementation 2021-03-30 21:31:51 -04:00
Andrew Lamb 7154dfd5f6
feat: Add timestamps to ChunkSummary (#1079)
* refactor: Move timestamps from mutable_buffer::Chunk to catalog::Chunk

* feat: Add timestamps to ChunkSummary

* feat: Add timestamp conversion logic to protobuf types

* test: Add tests

* fix: Update data_types test

* fix: handle negative nanos during conversion

* fix: clippy

* fix: more clippy

* fix: even more clippy

* fix: even more clippy
2021-03-30 19:03:23 +00:00
Nga Tran a630c119ab feat: make it easy to get OperationMetadata from Operation 2021-03-30 12:57:11 -04:00
Marko Mikulicic 569099fc6e feat: Derive serde for protos
Rationale
---------

Our CLI needs to be able to accept configuration as JSON and render configuration as JSON.

Protobufs technically have an official JSON encoding rule called 'jsonpb` but prost doesn't
offer native supprot for it.

`prost` allows us to specify arbitrary derive metadata to be added to generated
code. We emit the `serde` derive directives in the two packages that generate prost code
(`generated_types` and `google_types`).

We use the `serde(rename_all = "camelCase")` to approximate `jsonpb`.

We instruct `prost` to use `bytes::Bytes` for some types, hence we must turn on the `serde` feature
on the `bytes` dependency.

We also use json to serialize the output of the `database get` command, to showcase the feature
and get rid of a TODO. In a subsequent PR I'll teach `database create` (and the yet to be done `database update`) to accept an option JSON configuration body so we can configure partitioning, lifecycle, sharding etc rules etc.

Caveats
-------

This is not technically `jsonpb`. Main issues:
1. default values not omitted
2. no special rendering of special types like `google.protobuf.Any`

Future work
-----------

Figure out if we can get fully compliant `jsonpb`, or at least a decent approximation.

Effect
------

```console
$ cargo run -- database get foobar_weather
{
  "name": "foobar_weather",
  "partitionTemplate": {
    "parts": [
      {
        "part": {
          "time": "%Y-%m-%d %H:00:00"
        }
      }
    ]
  },
  "lifecycleRules": {
    "mutableLingerSeconds": 0,
    "mutableMinimumAgeSeconds": 0,
    "mutableSizeThreshold": 0,
    "bufferSizeSoft": 0,
    "bufferSizeHard": 0,
    "sortOrder": {
      "order": 2,
      "sort": {
        "createdAtTime": {}
      }
    },
    "dropNonPersisted": false,
    "immutable": false
  },
  "walBufferConfig": null,
  "shardConfig": {
    "specificTargets": null,
    "hashRing": null,
    "ignoreErrors": false
  }
}
```
2021-03-30 15:16:31 +00:00
Andrew Lamb 865910ae5b chore: update dependencies 2021-03-30 09:01:35 -04:00
Wakahisa b33d97c946
refactor: write nanosecond timestamps to parquet (#1030)
* feat: write nanosecond timestamps to parquet

* chore: update arrow deps

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-27 19:00:57 +00:00
Andrew Lamb 790e5d3348
refactor: Add fine grained (object level) catalog locking (#1053)
* refactor: inline catalog crate to server

* refactor: Add fine grained (object level) catalog locking

* fix: Move mod definition and use to top of file

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-26 13:48:24 +00:00
Andrew Lamb 895e808754
chore: Upgrade arrow deps (#1046)
* chore: Upgrade dependencies

* chore: upgrade query for new interfaces

* chore: update read_buffer
2021-03-25 13:35:08 +00:00
Andrew Lamb 44d67db733
feat: Rework Db to use Catalog for chunk state (#1043)
* feat: Rework Db to use Catalog for chunk state

* docs: Update server/src/db.rs

* fix: fmt

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-24 17:57:11 +00:00
Andrew Lamb 4911a01489 feat: Initial catalog objects 2021-03-24 12:30:23 +00:00
Carol (Nichols || Goulding) 52e415ae1b fix: Verify flatbuffers when restoring from a file, and not after that 2021-03-22 09:38:59 -04:00
Carol (Nichols || Goulding) 24af745b23 chore: Upgrade to flatbuffers 0.8 2021-03-22 09:38:58 -04:00
Andrew Lamb 6e1795fda0
refactor: Move some types (not yet exposed to clients) into internal_types (#1015)
* refactor: Move some types (not yet exposed to clients) into internal_types

* docs: Add README.md explaining the rationale

* refactor: remove some stragglers

* fix: fix benches

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: add clippy lints

* fix: fmt

* docs: Apply suggestions from code review

fix typos

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-19 16:27:57 +00:00
Raphael Taylor-Davies 7e6c6d67b4
feat: graceful shutdown (#827) (#1018)
* feat: graceful shutdown (#827)

* chore: additional docs

* chore: further docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-19 15:36:49 +00:00
Raphael Taylor-Davies dd94a33bc7
feat: retain limited tracker history (#1005) 2021-03-17 16:32:34 +00:00
Andrew Lamb 72eff5eed5 chore: update deps (including arrow) 2021-03-16 18:15:44 -04:00
Raphael Taylor-Davies 3fe1b8c5b7
feat: add longrunning operations client (#981)
refactor: add separate format feature influxdb_iox_client

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-16 13:19:44 +00:00
Raphael Taylor-Davies 65f7a1ac5b
fix: use consistent crate versions (#989)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-15 15:42:19 +00:00
kodiakhq[bot] fcd4419702
Merge branch 'main' into pd-routing-rules 2021-03-12 20:02:53 +00:00
Raphael Taylor-Davies 7e25c4e896
feat: add fanout task tracking (#956)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-12 15:01:27 +00:00
Marko Mikulicic 9df4131e60 feat: Add server remote [set|remove|list] commands 2021-03-12 10:41:18 +00:00
Paul Dix 0606203b40 feat: add configuration for routing rules
This is a strawman for what routing rules might look like in DatabaseRules. Once there's a chance for discussion, I'd move next to looking at how the Server would split up an incoming write into separate FB blobs to be sent to remote IOx servers. That might change what the API/configuration looks like as that's how it would be used (at least for writes).

After that it would make sense to move to adding the proto definitions with conversions and gRPC and CLI CRUD to configure routing rules.
2021-03-11 15:25:57 -05:00
Raphael Taylor-Davies d2859a99d0
feat: add google longrunning operations stubs (#959)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-10 17:34:07 +00:00
Andrew Lamb f568c083a4
fix: Do not leave child processes around after the end-to-end test (#955) 2021-03-10 14:25:27 +00:00
Andrew Lamb 1af5cf8b7c
refactor: Move end-to-end test server fixture into its own module (#945)
* refactor: Move test server fixture into its own module

* fix: Update tests/end-to-end.rs

* fix: better error handling and display

* fix: tweak startup message
2021-03-09 19:08:55 +00:00
Andrew Lamb 746373a687
refactor: Remove mutable_buffer crate dependency on query crate (#927) 2021-03-05 11:34:27 +00:00
Andrew Lamb 3abfb5f089
chore: Update arrow deps, turn off optional datafusion features (#930)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-04 22:41:54 +00:00
Raphael Taylor-Davies 5c9dd68bf8
feat: MVP Management CLI (#907)
* feat: MVP CLI implementation

* feat: multiline database description
2021-03-03 17:37:55 +00:00
Nga Tran 957e05ef25 chore: use newly added Arrow's Expr::is_not_null function 2021-03-03 11:46:49 -05:00
Raphael Taylor-Davies 51981c92f5
feat: implement gRPC API and migrate influxdb_iox_client to use it (#853)
* feat: implement gRPC management API

* feat: migrate influxdb_iox_client to use gRPC API

* fix: review comments

* refactor: separate influxdb_iox_client error types

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-03-02 17:51:46 +00:00
Andrew Lamb 86c82be7f1
chore: Update arrow deps, remove custom JsonArrayWriter (#888)
* chore: update dependencies

* refactor: Use Arrow json::ArrayWriter
2021-02-27 10:18:43 +00:00
kodiakhq[bot] 76921bdef9
Merge branch 'main' into alamb/update_deps 2021-02-25 13:24:32 +00:00
Raphael Taylor-Davies ffc20fa821
feat: add basic gRPC health service (#862)
* feat: add basic gRPC health service

* feat: update README.md add /health to HTTP API

* feat: add health client to influxdb_iox_client

feat: end-to-end test health check service

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-25 13:24:12 +00:00
Andrew Lamb d29d7efa8c chore: Update arrow/datafusion deps again 2021-02-25 07:39:01 -05:00
kodiakhq[bot] 3502f96ab9
Merge branch 'main' into cn/google-list-with-delimiter 2021-02-22 19:42:49 +00:00
Jake Goulding 6e6cc616a0 refactor: Switch to parking_lot::Mutex 2021-02-22 13:51:31 -05:00
Jake Goulding 2bd09612bd refactor: Enable parking lot for dependencies 2021-02-22 13:37:35 -05:00
Jake Goulding 6603ecd758 refactor: Unify on once_cell
This is closest in API to what is likely to be added to the standard
library at some point, so let's use it consistently.
2021-02-22 13:37:35 -05:00
Raphael Taylor-Davies dd8b41cdb0
feat: encoding of standard gRPC error details payloads (#846)
* feat: encoding of standard gRPC error details payloads

* feat: return unknown error on failure to encode error details

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-22 18:03:17 +00:00
Carol (Nichols || Goulding) cff12da3a1 fix: Upgrade to released version of cloud_storage
Fixes #801.
2021-02-22 13:01:06 -05:00
Carol (Nichols || Goulding) a42103f436 Merge remote-tracking branch 'origin/main' into cn/google-list-with-delimiter 2021-02-22 12:53:46 -05:00
Carol (Nichols || Goulding) 57942b51b7 feat: Update to latest Azure sdk to get delimiter support
Needed these PRs:
  - https://github.com/Azure/azure-sdk-for-rust/pull/176
  - https://github.com/Azure/azure-sdk-for-rust/pull/179

Also needed to enable the queue feature to get the azure_storage crate
compiling; at the moment, the code is still being reorganized and the
features aren't independent yet:
https://github.com/Azure/azure-sdk-for-rust/issues/177
2021-02-18 14:59:06 -05:00
Marko Mikulicic 536c1724bd feat: Allow to put streams of unknown length to objectstore
Addresses the API aspect of #818

Adds a utility module that helps computing the length of a stream while buffering it
for later replay (in-memory or spilling it in a temporary file).
2021-02-18 16:49:18 +00:00
NGA TRAN 213094f8f7 chore: update Arrow dependencies 2021-02-18 10:02:57 -05:00
Carol (Nichols || Goulding) ef54131afb feat: Gets google cloud list_with_delimiter tests passing 2021-02-17 14:23:33 -05:00
Andrew Lamb 071b13b939
chore: Update dependencies (#821)
* chore: Update dependencies

* fix: update udf implementation for DataFusion update

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-16 23:27:36 +00:00
Andrew Lamb 150baec84c chore: Update arrow dependencies 2021-02-14 05:47:23 -05:00
Raphael Taylor-Davies 6d3c21b952 feat: Use Bytes instead of Vec<u8> in prost generated code
Also add google.rpc.* Protobuf definitions
2021-02-12 17:10:58 +00:00
Andrew Lamb b8f85967dd
feat: Enable/Disable logging in tests via RUST_LOG environment variable (#793)
* feat: Enable/Disable logging in tests via RUST_LOG environment variable

* docs: Add section to contributing

* docs: tweak readme

* fix: Use same logging system in tests as in influxdb_ioxd
2021-02-12 13:43:12 +00:00
Andrew Lamb a03598dfe2
feat: Implement Cross Chunk Schema / RecordBatch merging at query time (#783)
* feat: feat: Implement Cross Chunk Schema / RecordBatch merging at query time

* docs: update comments about NullArray::new_with-type

* docs: Update comments based on code review

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-02-11 18:26:38 +00:00
kodiakhq[bot] a242fc39aa
Merge branch 'main' into jg/flight-client 2021-02-11 14:29:54 +00:00
Raphael Taylor-Davies 7debe94ee6 feat: add background task tracking (#655) 2021-02-11 10:30:19 +00:00
Jake Goulding 699f4a577f feat: Add an optional Flight client to the IOx client library 2021-02-10 10:30:05 -05:00
Raphael Taylor-Davies 143488fae9 feat: add WAL metadata endpoint (#724) 2021-02-08 16:21:34 +00:00
Raphael Taylor-Davies 29314a6118 feat: consistent global error handling and logging 2021-02-04 13:15:17 +00:00
Andrew Lamb d5ebf9c3da
chore: Update deps again (#738) 2021-02-04 06:02:05 -05:00
Jake Goulding a5e09366b0 feat: Export arrow-flight from arrow-deps 2021-02-03 09:56:56 -05:00
Carol (Nichols || Goulding) 0f8ef9c7d5
Merge branch 'main' into cn+jg/osp-types 2021-02-03 09:09:04 -05:00
Andrew Lamb abc26a33c1
chore: Update dependencies (again) (#718)
* chore: Update dependencies (again)

* refactor: update for changes in DataFusion API

* fix: fmt

* fix: clippy
2021-02-02 18:33:01 -05:00
Andrew Lamb 485a59b2f8
feat: Implement logfmt (Heroku) formatted log output (#716)
* feat: add option to output logs formatted via logfmt

* refactor: Apply suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>

* fix: add tests for span inclusion

* feat: Also log spans

* fix: bug in normalizer

Co-authored-by: Edd Robinson <me@edd.io>
2021-02-01 16:43:01 -05:00
Carol (Nichols || Goulding) ff6955a433 refactor: Extract a trait for ObjectStoreApi with associated path
This is the promised cleanup. This structure gets rid of a lot of
intermediate structures and encodes through associated types how the
object stores and path types are related.

The enums are still necessary to avoid having generics leak all over
the place, but the object store variants and path variants should always
match because they'll always come from the object store trait
implementations that use the associated types.
2021-02-01 14:56:47 -05:00
Andrew Lamb f3bd8bd0e3
chore: update deps (tokio 1.0 and ecosystem) (#707)
* chore: Update arrow + tokio deps

* chore: Use bleeding edge azure

* chore: Update aws + other deps

* fix: fmt

* fix: Switch to in-house version of routerify

* fix: Upgrade to hyper 0.14

The hyper::error module is now private; hyper::Error is the public
re-export

* fix: Upgrade cloud storage to get tokio upgrade

* fix: Upgrade open_telemetry

* fix: Do not call `panic::set_hook` during another panic

Doing so leads to a double panic which aborts the process.

* fix: new h2 error who dis

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2021-01-29 16:11:55 -05:00
Andrew Lamb 8308fad188
chore: update arrow deps again (#699) 2021-01-26 07:55:30 -05:00
Andrew Lamb 9b6fbae7f5 chore: Bump arrow deps 2021-01-23 08:09:46 -05:00
Andrew Lamb a967e2f1dd
fix: disallow control characters in Database names (#684) 2021-01-21 17:55:55 -05:00
Andrew Lamb c50f9b1baf
Merge branch 'main' into alamb/underscore_in_bucket_names_2 2021-01-21 15:53:27 -05:00
Andrew Lamb 747b96d801
chore: Upgrade arrow dependencies, reduce duplication with upstream (#676) 2021-01-21 08:58:11 -05:00
Andrew Lamb a6b9ff9c91 fix: allow arbitrary characters in org/bucket names 2021-01-20 17:58:15 -05:00
Andrew Lamb 7969808f09
feat: Chunk Migration APIs and query data in the read buffer via SQL (#668)
* feat: Chunk Migration APIs and query data in the read buffer via SQL

* fix: Make code more consistent

* fix: fmt / clippy

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: Remove unecessary Result and make chunks() infallable

* chore: Apply more suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Edd Robinson <me@edd.io>
2021-01-19 13:28:26 -05:00
Andrew Lamb 71627120b9
refactor: consolidate line protocol schema creation into data_types and port code to use it (#663)
* refactor: consolidate line protocol schema creation into data_types, and port code to use it

refactor: Port mutable buffer to use SchemaBuilder

* fix: doctest

* refactor: remove unecessary clippyisms

* docs: Improve comments via suggestions from code review

Co-authored-by: Edd Robinson <me@edd.io>

* refactor: use more idomatic try_ naming and TryInto trait

* docs: Change from line protocol data model to InfluxDB data model

* refactor: rename LP --> Influx in code

* feat: add support for UInteger type

Co-authored-by: Edd Robinson <me@edd.io>
2021-01-15 17:29:30 -05:00
Carol (Nichols || Goulding) 813092649d fix: Make file behave the same as other object stores with paths 2021-01-15 10:25:05 -05:00
Dom a2c0434554
Merge branch 'main' into dom/iox-api-client 2021-01-14 14:35:17 +00:00
Paul Dix 9377dc1943 chore: address pr feedback 2021-01-13 18:15:35 -05:00
Paul Dix 3b1f045f44 feat: add serialization to wal buffer segments
Adds serialization with compression and checksum for WAL buffer segments.

This required a weird structure where the flatbuffer bytes of ReplicatedWrite were kept as a raw payload. I did this because otherwise each of the replicated writes would have been rebuilt in the segment.

The other thing that isn't ideal is that deserializing a segment actually marshals it into a Rust struct as opposed to keeping the entire thing as raw flatbuffers. We could update this later to have a concept of an open segment (regular rust stuct) and closed segments that are just the flatbuffers.
2021-01-13 18:15:34 -05:00
Dom 5647bfeb6f refactor: error handling & typed errors
Refactors the API method errors.

The user of the API client needs to be able to distinguish between various error
states when an API request fails. The most ergonomic way of exposing this
information is by returning an error enum that is specific to each API method
(or at least the important ones with well defined failure modes) - currently
only the `create_database()` method has significant error states, so this is
the only one with a specific error type in this impl.

This change defines a bunch of API error codes in the API client, adds them to
the IOx API error response body, and maps them in the API client. Due to error
wrapping the error code mapping in the IOx server is less exhaustive than I had
hoped however.
2021-01-13 17:32:12 +00:00
Andrew Lamb 38abe9735f
Merge branch 'main' into dom/iox-api-client 2021-01-12 13:19:49 -05:00
Andrew Lamb 2938c8f8fc
feat: implement chunk listing and snapshotting in mutable buffer (#641)
* feat: implement chunk listing and snapshotting in mutable buffer

* fix: update to use latest version of string interner and remove custom clone

* docs: fix comment
2021-01-12 12:46:18 -05:00
Dom e06b989fa6
Merge branch 'main' into dom/iox-api-client 2021-01-12 17:10:51 +00:00
Andrew Lamb c1a7778d85
refactor: move id and deps out of query crate (#646) 2021-01-12 11:47:43 -05:00
Dom 62349edb94 feat: create IOx API client
Initialises a new library crate and implements a basic IOx API client.

The API client supports:
	- ping
	- create database

Care has been taken to abstract away the underlying HTTP client used
(reqwest) and avoid leaking it into the public API (error types is a
common leak!) This makes updating the HTTP client and/or swapping it for
something else a backwards compatible change for end users of the crate.

Outstanding items:
	- move shared API types into a sensible location
	- discriminate between various IOx error responses

The former doesn't need doing until we publish the crate and will likely
be rather invasive / conlict prone so aiming to merge this PR and then
move things around in a follow-up.

The latter would allow us to expose error conditions to the user such
that they can take actions to remidy the situation / know if the request
can or should be retried / etc. Currently we expose a string error
message when requests fail, requiring string matching and/or passing the
string higher in the stack (and thus punting the problem to the caller).
It would be very nice to have typed errors, but a detail I have left for
later.
2021-01-12 16:38:33 +00:00