367 Commits (0f6aad62b4583c1948605375499f6fa0266e9398)

Author SHA1 Message Date
kodiakhq[bot] fc6a7ea532
Merge branch 'main' into er/refactor/read_buffer/bitmap_size 2021-08-19 14:20:38 +00:00
Raphael Taylor-Davies 98627944e7
refactor: make packers a dev-dependency of read buffer (#2345)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-19 11:09:34 +00:00
Edd Robinson b9f09fce49 feat: improve bitset size estimation 2021-08-17 22:54:22 +01:00
Edd Robinson 1daa30cc7d fix: include enum in sizing 2021-08-17 22:54:22 +01:00
Edd Robinson c795fc7f9d feat: add metric to track total row groups 2021-08-17 12:55:11 +01:00
Edd Robinson eee4e10fd1 refactor: rename statistic to required_bytes 2021-08-13 11:57:46 +01:00
Edd Robinson efde3a8f5a feat: expose required bytes metric 2021-08-13 11:57:46 +01:00
Edd Robinson de702ec820 refactor: make allocated bytes explicit Read Buffer metric 2021-08-13 11:57:46 +01:00
Edd Robinson 311d36d776 refactor: include capacity in Read Buffer chunk size 2021-08-13 11:57:46 +01:00
Edd Robinson 03592aaf94 refactor: ignore bitmap size from required bytes
Bitmaps are a performance optimisation; they're not required for the RLE compression and so it seems reasonable to ignore them when assessing the compression performance of RLE.
2021-08-13 11:57:46 +01:00
Edd Robinson fa8da19c45 refactor: expose enc size API into column 2021-08-13 11:57:46 +01:00
Edd Robinson e0bce4c2f2 refactor: always use same Arrow sizing call 2021-08-13 11:57:46 +01:00
Edd Robinson e78aebdf19
refactor: update read_buffer/src/column/encoding/scalar/fixed.rs
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2021-08-12 15:57:01 +01:00
Edd Robinson 0e8b0edfc9 feat: add buffer-based sizing for numerical encodings 2021-08-12 15:05:47 +01:00
Edd Robinson 11349fa30d feat: add allocated size to bool 2021-08-12 15:05:47 +01:00
Edd Robinson b4f8e854f6 feat: size rle string encoding by allocated buffers 2021-08-12 15:05:47 +01:00
Edd Robinson 78d3749af5 feat: size dictionary encoding by allocated space 2021-08-12 15:05:47 +01:00
Dom 3de6b44e23
build: use new rustdoc lint name (#2261)
* fix: nocache feature code rot

The MBChunk::snapshot code when using the "nocache" option no longer
compiles - this commit updates it to match the not(nocache) code.

* build: use updated broken_intra_doc_links name

The broken_intra_doc_links lint was renamed to
rustdoc::broken_intra_doc_links

https://doc.rust-lang.org/rustdoc/lints.html
2021-08-11 19:48:51 +00:00
kodiakhq[bot] 304901bf40
Merge branch 'main' into er/refactor/logs 2021-08-10 21:31:49 +00:00
Andrew Lamb 8626e9980b
docs: Add/update doccomments in the read_buffer (#2245)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 21:26:02 +00:00
Edd Robinson 5d5ed7d0db refactor: remove logging 2021-08-10 22:16:01 +01:00
Edd Robinson f8870968b9 refactor: reduce logging when creating RUB chunk 2021-08-10 22:11:10 +01:00
Andrew Lamb 126598a2e8
fix(read_buffer): Improve statistics update to handle nulls and prevent `panic`s (#2246)
* fix(read_buffer): Improve statistics update to handle nulls

* fix: clippy

* refactor: only compile test helpers with cfg(test)

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-08-10 16:58:20 +00:00
kodiakhq[bot] 0297aae17e
Merge branch 'main' into cn/1.54 2021-07-30 17:01:37 +00:00
Andrew Lamb 248ae08343
fix(read_buffer): Avoid panic when creating stats for entirely null columns (#2159) 2021-07-30 14:59:18 +00:00
Carol (Nichols || Goulding) 9d15798288 fix: Address or allow Clippy warnings new with Rust 1.54 2021-07-30 09:59:59 -04:00
Carol (Nichols || Goulding) 11b7755325 refactor: Remove first/last write times from RUB chunks 2021-07-28 11:22:22 -04:00
Andrew Lamb 5fb3e00f2a
fix: Properly record total_count and null_count in statistics (#2103)
* fix: Properly record total_count and null_count in statistics

* fix: fix statistics calculation in mutable_buffer

* refactor: expose null counts in read_buffer

* refactor: expose null_count in parquet_file

* fix: update server crate tests

* fix: update query_tests tests

* docs: tweak comments

* refactor: Use storage_stats rather than adding `null_count`

* refactor: rename test data field for clarity

* fix: fixup merge conflicts

* refactor: rename initial_non_null_count to initial_total_count

* refactor: calculate null_count as row_count - to_add
2021-07-26 18:13:36 +00:00
Carol (Nichols || Goulding) 05782eb980 refactor: Move first/last write times up to read buffer Chunk rather than MetaData 2021-07-22 12:27:46 -04:00
Carol (Nichols || Goulding) 37f24ebfc7 feat: Record first/last write times for creation of read_buffer::Chunk 2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding) 4e6b79534b feat: Require passing first/last write times for creation of Table 2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding) b7bedeaaf3 feat: Require passing first/last write times for creation of Table MetaData 2021-07-22 11:35:23 -04:00
Carol (Nichols || Goulding) 8d1d877196 feat: Record first/last write times for RUB chunks 2021-07-22 11:35:22 -04:00
Carol (Nichols || Goulding) 16b07e5b31 refactor: Always use Table::with_row_group to ensure Tables are never empty
Remove Table::new that created an empty table.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding) 6feea3b2d5 feat: Require at least one RecordBatch to create a read_buffer::Chunk::new
In the signature only for the moment.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding) bbb4462264 refactor: Extract a function for the RecordBatch to RowGroup transformation with logging
So that we can call it from RBChunk::new too.
2021-07-22 11:15:18 -04:00
Carol (Nichols || Goulding) 0a724878e6 refactor: Organize uses 2021-07-22 11:15:18 -04:00
Andrew Lamb 4da8a16c18
chore: update to arrow 5.0 and master datafusion (#2049)
* chore: update to arrow 5.0 and master datafusion

* fix: Update test for change in object size
2021-07-19 12:49:51 +00:00
Edd Robinson 54ad69ed86 fix: ensure correct table meta size used 2021-07-16 10:48:45 -04:00
Carol (Nichols || Goulding) f3175ed291 test: use of different size values 2021-07-16 09:47:56 -04:00
Carol (Nichols || Goulding) abe2fe7262 test: MetaData new with a row group vs default then update_with should have the same size 2021-07-16 09:47:56 -04:00
Andrew Lamb 3fd6430fb6
fix: rename `estimated_bytes` to `memory_bytes` and expose `object_store_bytes` in ChunkSummary and system.chunks (#2017)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-15 16:00:24 +00:00
kodiakhq[bot] 833debd5b5
Merge branch 'main' into cn/exploration 2021-07-14 17:30:55 +00:00
Raphael Taylor-Davies 1d00fa2fd8
refactor: track memory metrics in catalog (#1995)
* refactor: track memory metrics in catalog

* chore: update comment
2021-07-14 16:23:00 +00:00
Carol (Nichols || Goulding) 8070065e2f fix: Change RUB chunk table_summaries to table_summary
Because chunks now have only one table.

Connects to #1718, #1613, #1295
2021-07-14 11:18:02 -04:00
Andrew Lamb 97c727a2c2 fix: update read_buffer tests 2021-07-13 15:44:57 -04:00
Marco Neumann 2e391deb34 chore: update croaring to 0.5.0
Upstream changelog:

- CRoaring updated to 0.3.1
- `-march=native` is not a default for croaring-sys anymore
- Impl Default for `Bitmap` and `Treemap`
2021-07-13 15:15:41 +02:00
Andrew Lamb d35b74c226
fix: Fix doc build warnings (#1945)
* fix: Fix doc build warnings

* refactor: add deny bare_urls to crates

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-13 08:03:42 +00:00
Edd Robinson f811bf1e5e refactor: log compaction activity 2021-07-08 12:48:41 +01:00
Andrew Lamb 7602bde850
chore: Update datafusion deps (#1799)
* chore: Update datafusion deps + rework code

* refactor: remove workaround as it has been contributed upstream

* fix: Update query/src/exec/split.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-07-08 10:58:32 +00:00