* refactor: consolidate line protocol schema creation into data_types, and port code to use it
refactor: Port mutable buffer to use SchemaBuilder
* fix: doctest
* refactor: remove unecessary clippyisms
* docs: Improve comments via suggestions from code review
Co-authored-by: Edd Robinson <me@edd.io>
* refactor: use more idomatic try_ naming and TryInto trait
* docs: Change from line protocol data model to InfluxDB data model
* refactor: rename LP --> Influx in code
* feat: add support for UInteger type
Co-authored-by: Edd Robinson <me@edd.io>
* feat: Script to scrape benchmark results into line protocol
* feat: script to run benchmark on a checkout
* fix: rename
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* docs: update comments
* fix: use a single jq command invocation rather than 4
* fix: use one jq command rather than 2
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: check for directory existence too
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
This commit moves some of the TSM mapper logic that had leaked into the
TSM->Parquer converter back into the mapper. The refactor allows us to
make some previously public APIs private, whilst still providing a
reasonably flexible API.
This commit reduces copying of block data by replacing an inefficient
`remove` call on vectors by with an index tracking approach, leving the
original vectors in place.
It further refactors some of the mapping code DRYing things up.
It improves performance of the `map_field_columns` function by 48%.
```
time: [137.11 us 137.50 us 137.92 us]
change: [-49.095% -48.558% -48.033%] (p = 0.00 < 0.05)
Performance has improved.
```
* feat: benchmark for lp->parquet performance
* feat: improve parser performance by storing contiguous EscapedStr
* fix: remove all string copies during LP-Parquet conversion
* refactor: Implement from_str as From<&str> only
* refactor: implement Deref instead of as_str
* refactor: Remove ends_with because Deref now makes it work
* refactor: Eq can be derived
* refactor: Remove unused From implementation
* refactor: Replace single-character strings with chars as requested by clippy
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
The first thing the `encode` function does is truncate the `dst` buffer,
so this should never be necessary inside the code being benchmarked for
testing encoders.
Benchmarking random values was more general than sequential since it
takes an arbitrary function to create the decoded values; express
sequential in terms of random and change the name of random to be
general benchmarking of encoding.