Carol (Nichols || Goulding)
423ee71f5e
refactor: Remove duplicated lint rules
...
These get inherited from crate root files, so the lint rules in
src/main.rs apply in this file already.
2020-06-24 16:56:16 -04:00
Andrew Lamb
3bb3f2ddbd
Merge pull request #185 from influxdata/alamb/fix-parquet-nulls
...
fix: Correctly encode nulls in parquet files
2020-06-24 11:51:31 -04:00
alamb
0fdc6aa745
test: add test for packing null values
2020-06-24 11:34:40 -04:00
alamb
431787fb31
Merge remote-tracking branch 'origin/master' into alamb/fix-parquet-nulls
2020-06-24 11:29:07 -04:00
Andrew Lamb
de600b7712
fix: Apply suggestions from code review
...
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-06-24 09:44:08 -04:00
Andrew Lamb
ab22384009
Merge pull request #186 from influxdata/alamb/refactor-parquet-deps
...
refactor: clean up parquet library deps and remove use of InputReaderAdapter (related to parquet dependencies)
2020-06-24 09:42:44 -04:00
Carol (Nichols || Goulding)
6fb107af68
Merge pull request #178 from influxdata/cn-u64-enc
2020-06-24 08:48:57 -04:00
alamb
2c4a9dba53
fix: cleanup comment + code order
2020-06-23 17:21:20 -04:00
alamb
b22423621b
refactor: remove InputReaderAdapter
2020-06-23 17:15:02 -04:00
alamb
68ce351a3a
refactor: remove direct parquet dependency from delorean_ingest
2020-06-23 16:58:31 -04:00
Andrew Lamb
16bf5887df
fix: Setup parquet column encoding correctly (#182)
...
FYI @e-dard
2020-06-23 16:42:44 -04:00
alamb
c9b24f3762
fix: Correctly encode nulls in parquet files
2020-06-23 12:23:47 -04:00
alamb
eee1e9fe77
fix: Setup parquet column encoding correctly
2020-06-23 09:54:16 -04:00
Edd Robinson
ec448f361a
refactor: enable unsigned block reading
2020-06-23 10:50:32 +01:00
Andrew Lamb
943a6cd299
feat: benchmark for lp->parquet performance (#176)
2020-06-23 05:44:52 -04:00
Andrew Lamb
322a491b9d
perf: Improve line protocol --> parquet conversion performance by ~20% (#177)
...
* feat: benchmark for lp->parquet performance
* feat: improve parser performance by storing contiguous EscapedStr
* fix: remove all string copies during LP-Parquet conversion
* refactor: Implement from_str as From<&str> only
* refactor: implement Deref instead of as_str
* refactor: Remove ends_with because Deref now makes it work
* refactor: Eq can be derived
* refactor: Remove unused From implementation
* refactor: Replace single-character strings with chars as requested by clippy
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
2020-06-23 05:42:19 -04:00
Andrew Lamb
86a425e5ef
feat: Add support for parsing bool values in line protocol parser (#156)
...
* feat: Implement boolean support for the line protocol parser
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fmt+clippy
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-06-22 16:58:38 -04:00
Carol (Nichols || Goulding)
294163bed0
feat: Implement unsigned encoding
2020-06-22 16:52:24 -04:00
Andrew Lamb
2a42df278a
docs: Initial style guide with idiomatic error handling (#174)
...
* docs: Initial style guide with idiomatic error handling
* fix: Apply suggestions from code review
Co-authored-by: Paul Dix <paul@influxdata.com>
* fix: Apply suggestions from code review
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: clean up example
To not use a different field name
Co-authored-by: Paul Dix <paul@influxdata.com>
Co-authored-by: Jake Goulding <jake.goulding@integer32.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2020-06-22 16:41:36 -04:00
Carol (Nichols || Goulding)
1f6effba91
Merge pull request #163 from influxdata/cn-string-enc
2020-06-22 16:10:58 -04:00
Carol (Nichols || Goulding)
89b9dbe9e8
refactor: Slice twice instead of adding
2020-06-22 15:41:49 -04:00
Carol (Nichols || Goulding)
85e442373f
test: Verify encoding and decoding invalid UTF-8
2020-06-22 15:41:27 -04:00
Carol (Nichols || Goulding)
264dd96035
test: Add a test for unicode data
2020-06-22 15:33:47 -04:00
Carol (Nichols || Goulding)
683205ad03
refactor: Use `Vec::clear` instead of `Vec::truncate(0)`
2020-06-22 15:32:15 -04:00
Carol (Nichols || Goulding)
1e341a7321
fix: Encode and decode string data as bytes
...
String data isn't guaranteed to be UTF-8
2020-06-22 15:32:14 -04:00
Carol (Nichols || Goulding)
672d3fe668
fix: Assert that encoded strings' lengths fit in an i32
2020-06-22 15:19:19 -04:00
Carol (Nichols || Goulding)
df75db6870
refactor: Remove some unneeded type annotations
2020-06-22 15:17:03 -04:00
Carol (Nichols || Goulding)
8bc25e92bf
refactor: Shorten unused cases
2020-06-22 15:15:37 -04:00
Carol (Nichols || Goulding)
d7dbf061cb
feat: Implement String encoding/decoding
...
Fixes #148.
2020-06-22 15:15:34 -04:00
Carol (Nichols || Goulding)
bf884ff3d3
refactor: Extract a constant for max varint size for 64-bit integers
2020-06-22 14:53:53 -04:00
Carol (Nichols || Goulding)
4a91a8b45f
refactor: Remove unneeded lifetime annotations
2020-06-22 14:53:53 -04:00
Carol (Nichols || Goulding)
f2fc4a6d43
chore: Remove or change scope for outdated dead_code allows
2020-06-22 14:53:53 -04:00
Edd Robinson
2768b15bf4
Merge pull request #168 from influxdata/er/tsm-parquet
...
feat: Add support for converting TSM files into Parquet
2020-06-22 19:10:17 +01:00
Edd Robinson
b3e78d712d
refactor: address PR feedback
2020-06-22 18:56:17 +01:00
Edd Robinson
844625d811
fix: down-sample timestamps to μs
2020-06-22 18:56:17 +01:00
Edd Robinson
e507183fbd
refactor: cleanup + clippy
2020-06-22 18:56:17 +01:00
Edd Robinson
4bbeac7a1c
refactor: extend packers
2020-06-22 18:56:17 +01:00
Edd Robinson
106bd69b5a
feat: support converting from TSM->Parquet
2020-06-22 18:56:17 +01:00
Edd Robinson
9006af8961
feat: support converting from BlockType
2020-06-22 18:56:17 +01:00
Edd Robinson
3c24b6e10e
refactor: small API change
2020-06-22 18:56:17 +01:00
Edd Robinson
5f40974752
refactor: don't error on string blocks
2020-06-22 18:56:17 +01:00
Edd Robinson
353c7a618b
fix: ensure short blocks decode correctly
2020-06-22 18:56:17 +01:00
Edd Robinson
68a1d5355d
refactor: simplify block types
2020-06-22 18:56:17 +01:00
Edd Robinson
621f2f91f0
refactor: hoist tsm mapper to delorean_tsm
2020-06-22 18:56:17 +01:00
Edd Robinson
f046dbeea0
refactor: organise code in delorean_tsm crate
2020-06-22 18:56:17 +01:00
Edd Robinson
0ca6fdfa5f
refactor: StorageError -> TSMError
2020-06-22 18:56:17 +01:00
Edd Robinson
85e0b4ec16
refactor: hoist tsm reader into own crate
2020-06-22 18:56:17 +01:00
Edd Robinson
fd9f2ea5b8
refactor: split out index reading and block decoding
...
This commit splits out the functionality required to read a TSM file's
index, and decode the blocks within the file.
2020-06-22 18:56:17 +01:00
Edd Robinson
6339083b87
feat: implement mapping between blocks and table
...
This commit implements the ability to map from multiple columns into a
single tabular view, where columns are aligned by their timestamp
components.
2020-06-22 18:56:17 +01:00
Edd Robinson
5418b34fcc
feat(tsm): map TSM data model to table model
...
This commit adds a new type `TSMMeasurementMapper` that will iterate
through a `TSMReader`'s index and collect together all series and blocks
by measurement. These units are called `MeasurementTable`s.
2020-06-22 18:56:17 +01:00