Commit Graph

67 Commits (6136fd35d128f727f6f601438175a2141959c292)

Author SHA1 Message Date
alamb 5ac0069020 Merge remote-tracking branch 'origin/master' into alamb/take-2-228-index-parse-error 2020-07-15 11:20:40 -04:00
Andrew Lamb 201ad1ae87
fix: Apply suggestions from code review
Update comment to match new name

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-15 11:19:07 -04:00
Carol (Nichols || Goulding) 582e18a241 refactor: Shorten matches 2020-07-15 09:04:59 -04:00
Edd Robinson c0a823f0d2
refactor: PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-13 15:55:38 +01:00
Edd Robinson ceec4c9627
refactor: PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-13 15:55:27 +01:00
Edd Robinson 3096d76d77
refactor: PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-13 15:54:08 +01:00
Edd Robinson 8dcbfcdfb9
refactor: PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-13 15:53:16 +01:00
Edd Robinson bfe83868fc
refactor: PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-13 15:52:53 +01:00
Edd Robinson 999ba44fad
refactor: PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-13 15:52:41 +01:00
Edd Robinson 51fcf59da9
refactor: PR feedback
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-07-13 15:52:31 +01:00
Edd Robinson ec9ed12fcb refactor: move into function on BlockData 2020-07-13 10:48:58 +01:00
Edd Robinson b62810676d feat: add support for merging blocks 2020-07-13 10:39:36 +01:00
Edd Robinson cf31aebe40 feat: determine if a block overlaps another 2020-07-13 10:39:36 +01:00
alamb 64616f21bf fix: allow arbitrary characters after the delimiter in a field key, including unescaped spaces 2020-07-11 06:17:43 -04:00
Edd Robinson bd5d39f60c refactor: address PR feedback 2020-07-08 22:57:15 +01:00
Edd Robinson 54a61b33fc refactor: remove redundant block type 2020-07-08 22:57:15 +01:00
Edd Robinson eed1e030df feat: add support for multi-block readers
This commit embeds an index within each block materialised from a TSM
index, which can be used later on to identify which TSM block reader
should be used to decode the block.

This essentially lets one coalesce blocks for the same measurement from
different files lazily - that is, you don't need to materialise them
until you're ready, and when you do want to materialise them you know
which file to read from.
2020-07-08 22:57:15 +01:00
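A minimal sketch of the idea described in the commit above, not the code from the commit itself. The `Block` fields, the `reader_idx` name, and the `coalesce` helper are all hypothetical; the sketch only assumes that each file's index yields (key, min time, max time, offset) entries.

```rust
use std::collections::BTreeMap;

/// Hypothetical block descriptor: metadata only, no decoded values.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Block {
    min_time: i64,
    max_time: i64,
    offset: u64,
    /// Position of the file (reader) this block came from (assumed name).
    reader_idx: usize,
}

/// Tags every block with the position of its source file and groups the
/// blocks by key. Nothing is decoded here; a caller can later pick the
/// right reader with `readers[block.reader_idx]`.
fn coalesce(files: Vec<Vec<(String, i64, i64, u64)>>) -> BTreeMap<String, Vec<Block>> {
    let mut out: BTreeMap<String, Vec<Block>> = BTreeMap::new();
    for (reader_idx, entries) in files.into_iter().enumerate() {
        for (key, min_time, max_time, offset) in entries {
            out.entry(key).or_default().push(Block {
                min_time,
                max_time,
                offset,
                reader_idx,
            });
        }
    }
    // Order each key's blocks by start time so they can be read in sequence.
    for blocks in out.values_mut() {
        blocks.sort_by_key(|b| b.min_time);
    }
    out
}
```

Because each block remembers where it came from, blocks from several files can sit in one list and only be decoded when they are needed.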
Edd Robinson 70cdeb2d08 feat: add ability to merge measurement tables 2020-07-08 22:57:15 +01:00
Edd Robinson fff5577efb refactor: encapsulate mapping logic
This commit moves some of the TSM mapper logic that had leaked into the
TSM->Parquet converter back into the mapper. The refactor allows us to
make some previously public APIs private, whilst still providing a
reasonably flexible API.
2020-07-08 22:57:15 +01:00
alamb 0865196f20 fix: implement PR suggestions 2020-07-08 17:04:08 -04:00
alamb d072cafc23 feat: Better error messages for TSM key parse errors 2020-07-08 09:37:16 -04:00
Andrew Lamb 10079de9b7 fix: Apply more suggestions from code review
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
2020-07-04 09:00:56 -04:00
alamb 2123797edb fix: better string api usage 2020-07-04 08:46:45 -04:00
alamb 18a5f5312c fix: improve comments 2020-07-04 08:46:45 -04:00
alamb 75fe1217e0 fix: do not copy the key to decode 2020-07-04 08:46:45 -04:00
alamb 06f8dee3dc fix: cleanup error handling 2020-07-04 08:46:45 -04:00
Edd Robinson 831f647b9d feat: implement escaped tsm key parsing 2020-07-04 08:46:45 -04:00
Edd Robinson 95cc07409d perf: avoid bounds check 2020-07-03 15:04:57 +01:00
Edd Robinson 2be6385ade perf: drain block data more efficiently
This commit reduces copying of block data by replacing an inefficient
`remove` call on vectors with an index-tracking approach, leaving the
original vectors in place.

It further refactors some of the mapping code DRYing things up.

It improves performance of the `map_field_columns` function by 48%.

```
time:   [137.11 us 137.50 us 137.92 us]
change: [-49.095% -48.558% -48.033%] (p = 0.00 < 0.05)
Performance has improved.
```
2020-07-03 10:56:31 +01:00
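The index-tracking approach mentioned in the commit above might look roughly like the following. This is an illustrative sketch only: `Cursor`, `pop_front`, and `remaining` are hypothetical names, not taken from the commit. The point is that `Vec::remove(0)` shifts every remaining element on each call, whereas advancing an index leaves the vector untouched.

```rust
/// Hypothetical buffer whose values are consumed front to back.
struct Cursor<T> {
    values: Vec<T>,
    /// Index of the next unread value (assumed field name).
    next: usize,
}

impl<T: Copy> Cursor<T> {
    fn new(values: Vec<T>) -> Self {
        Self { values, next: 0 }
    }

    /// Returns the next value by advancing an index instead of calling
    /// `self.values.remove(0)`, so no elements are moved.
    fn pop_front(&mut self) -> Option<T> {
        let value = self.values.get(self.next).copied()?;
        self.next += 1;
        Some(value)
    }

    /// Number of values not yet consumed.
    fn remaining(&self) -> usize {
        self.values.len() - self.next
    }
}
```

The trade-off is that consumed values stay allocated until the whole buffer is dropped.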
Edd Robinson 08058c8b63 refactor: move mock decoder 2020-07-03 10:56:31 +01:00
Edd Robinson b2addf614b refactor: PR feedback
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2020-07-01 15:52:21 +01:00
Edd Robinson 55bf2a44be test: exercise TSM converter 2020-07-01 15:52:21 +01:00
Edd Robinson 414029b96d refactor: use BlockDecoder trait 2020-07-01 15:52:21 +01:00
Edd Robinson a51cfca49a refactor: PR feedback 2020-07-01 13:47:29 +01:00
Edd Robinson 06e9fae845 fix: ignore conflicting field types
Fixes #205.
2020-06-30 18:08:05 +01:00
Carol (Nichols || Goulding) a2888dbbb2 refactor: Use cmp::min instead of an if 2020-06-26 10:36:27 -04:00
Carol (Nichols || Goulding) 201f7bef41 fix: Add an assertion for the boolean encoder format bit 2020-06-26 10:36:27 -04:00
Carol (Nichols || Goulding) a56957944c refactor: Use a constant value instead of calculating 1 2020-06-26 10:36:27 -04:00
Carol (Nichols || Goulding) 9ef07cac14 feat: Implement boolean encoding
Fixes #149.
2020-06-26 10:36:27 -04:00
Edd Robinson 9d889828c3 fix: ensure all rows are emitted for each column 2020-06-26 11:50:37 +01:00
Carol (Nichols || Goulding) 4df99f1a7c style: Enable the clippy warning to use `Self` when recommended
Fixes #158.
2020-06-25 07:38:58 -04:00
Carol (Nichols || Goulding) afcd1efd1e style: Unify lints everywhere
Then fix the failures, mostly by adding derives and then removing some
unneeded (cheap) clones.

Document places where we purposefully don't use the same lints.

Not unifying missing_docs.

👀 https://github.com/rust-lang/cargo/issues/5034
2020-06-25 07:28:42 -04:00
Edd Robinson ec448f361a refactor: enable unsigned block reading 2020-06-23 10:50:32 +01:00
Carol (Nichols || Goulding) 294163bed0 feat: Implement unsigned encoding 2020-06-22 16:52:24 -04:00
Carol (Nichols || Goulding) 89b9dbe9e8 refactor: Slice twice instead of adding 2020-06-22 15:41:49 -04:00
Carol (Nichols || Goulding) 85e442373f test: Verify encoding and decoding invalid UTF-8 2020-06-22 15:41:27 -04:00
Carol (Nichols || Goulding) 264dd96035 test: Add a test for unicode data 2020-06-22 15:33:47 -04:00
Carol (Nichols || Goulding) 683205ad03 refactor: Use `Vec::clear` instead of `Vec::truncate(0)` 2020-06-22 15:32:15 -04:00
Carol (Nichols || Goulding) 1e341a7321 fix: Encode and decode string data as bytes
String data isn't guaranteed to be UTF-8
2020-06-22 15:32:14 -04:00
Carol (Nichols || Goulding) 672d3fe668 fix: Assert that encoded strings' lengths fit in an i32 2020-06-22 15:19:19 -04:00