influxdb/read_buffer/benches/read_group.rs

use criterion::{BenchmarkId, Criterion, Throughput};
// test: add benchmarks for specific read_group path
//
// This commit adds benchmarks to track the performance of `read_group` when
// aggregating across columns that support pre-computed bit-sets of row_ids for
// each distinct column value. Currently this is limited to the RLE columns, and
// only makes sense when grouping by low-cardinality columns.
//
// The benchmarks are in three groups:
//
//  * one group fixes the number of rows in the segment but varies the
//    cardinality (that is, how many groups the query produces).
//  * another group fixes the cardinality and the number of rows but varies the
//    number of columns needed to be grouped to produce the fixed cardinality.
//  * a final group fixes the number of columns being grouped, the cardinality,
//    and instead varies the number of rows in the segment.
//
// Some initial results from my development box are as follows:
//
// ```
//   time:   [51.099 ms 51.119 ms 51.140 ms]
//   thrpt:  [39.108 Kelem/s 39.125 Kelem/s 39.140 Kelem/s]
//   Found 5 outliers among 100 measurements (5.00%)
//     3 (3.00%) high mild
//     2 (2.00%) high severe
//
// segment_read_group_pre_computed_groups_no_predicates_group_cols/1
//   time:   [93.162 us 93.219 us 93.280 us]
//   thrpt:  [10.720 Kelem/s 10.727 Kelem/s 10.734 Kelem/s]
//   Found 4 outliers among 100 measurements (4.00%)
//     2 (2.00%) high mild
//     2 (2.00%) high severe
//
// segment_read_group_pre_computed_groups_no_predicates_group_cols/2
//   time:   [571.72 us 572.31 us 572.98 us]
//   thrpt:  [3.4905 Kelem/s 3.4946 Kelem/s 3.4982 Kelem/s]
//   Found 12 outliers among 100 measurements (12.00%)
//     5 (5.00%) high mild
//     7 (7.00%) high severe
//
// Benchmarking segment_read_group_pre_computed_groups_no_predicates_group_cols/3: Warming up for 3.0000 s
// Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
// target time to 8.9s, enable flat sampling, or reduce sample count to 50.
// segment_read_group_pre_computed_groups_no_predicates_group_cols/3
//   time:   [1.7292 ms 1.7313 ms 1.7340 ms]
//   thrpt:  [1.7301 Kelem/s 1.7328 Kelem/s 1.7349 Kelem/s]
//   Found 8 outliers among 100 measurements (8.00%)
//     1 (1.00%) low mild
//     6 (6.00%) high mild
//     1 (1.00%) high severe
//
// segment_read_group_pre_computed_groups_no_predicates_rows/250000
//   time:   [562.29 us 565.19 us 568.80 us]
//   thrpt:  [439.52 Melem/s 442.33 Melem/s 444.61 Melem/s]
//   Found 18 outliers among 100 measurements (18.00%)
//     6 (6.00%) high mild
//     12 (12.00%) high severe
//
// segment_read_group_pre_computed_groups_no_predicates_rows/500000
//   time:   [561.32 us 561.85 us 562.47 us]
//   thrpt:  [888.93 Melem/s 889.92 Melem/s 890.76 Melem/s]
//   Found 11 outliers among 100 measurements (11.00%)
//     5 (5.00%) high mild
//     6 (6.00%) high severe
//
// segment_read_group_pre_computed_groups_no_predicates_rows/750000
//   time:   [573.75 us 574.27 us 574.85 us]
//   thrpt:  [1.3047 Gelem/s 1.3060 Gelem/s 1.3072 Gelem/s]
//   Found 13 outliers among 100 measurements (13.00%)
//     5 (5.00%) high mild
//     8 (8.00%) high severe
//
// segment_read_group_pre_computed_groups_no_predicates_rows/1000000
//   time:   [586.36 us 586.74 us 587.19 us]
//   thrpt:  [1.7030 Gelem/s 1.7043 Gelem/s 1.7054 Gelem/s]
//   Found 9 outliers among 100 measurements (9.00%)
//     4 (4.00%) high mild
//     5 (5.00%) high severe
// ```
use rand::distributions::Alphanumeric;
use rand::prelude::*;
use rand::Rng;
use rand_distr::{Distribution, Normal};
use packers::{sorter, Packers};
use read_buffer::benchmarks::{Column, ColumnType, RowGroup};
use read_buffer::{AggregateType, Predicate};
const ONE_MS: i64 = 1_000_000;
pub fn read_group(c: &mut Criterion) {
    let mut rng = rand::thread_rng();

    let row_group = generate_row_group(500_000, &mut rng);
    read_group_predicate_all_time(c, &row_group, &mut rng);
    read_group_pre_computed_groups(c, &row_group, &mut rng);
}
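
// The two benchmark families invoked above exercise two grouping strategies:
// a general one that builds a mapping of group keys while scanning rows, and
// one that relies on pre-computed sets of row ids per distinct column value.
// The following is a minimal, std-only sketch of that difference. It is not
// the `read_buffer` implementation: `group_sum_hashed`, `build_row_id_sets`
// and `group_sum_precomputed` are hypothetical names, and a `Vec<usize>`
// stands in for the bit-sets of row ids.

```rust
use std::collections::{BTreeMap, HashMap};

/// General strategy: visit every row once and accumulate a sum per group key
/// in a hash map.
fn group_sum_hashed(keys: &[&str], values: &[i64]) -> HashMap<String, i64> {
    let mut groups: HashMap<String, i64> = HashMap::new();
    for (key, value) in keys.iter().zip(values) {
        *groups.entry(key.to_string()).or_insert(0) += *value;
    }
    groups
}

/// Pre-computation step: record, for each distinct value, the row ids that
/// hold it. (Assumption: a plain Vec replaces a real bit-set here.)
fn build_row_id_sets(keys: &[&str]) -> BTreeMap<String, Vec<usize>> {
    let mut sets: BTreeMap<String, Vec<usize>> = BTreeMap::new();
    for (row_id, key) in keys.iter().enumerate() {
        sets.entry(key.to_string()).or_default().push(row_id);
    }
    sets
}

/// Pre-computed strategy: the groups are already known, so aggregation only
/// looks up the value at each recorded row id.
fn group_sum_precomputed(
    sets: &BTreeMap<String, Vec<usize>>,
    values: &[i64],
) -> HashMap<String, i64> {
    sets.iter()
        .map(|(key, row_ids)| {
            let sum: i64 = row_ids.iter().map(|&id| values[id]).sum();
            (key.clone(), sum)
        })
        .collect()
}

fn main() {
    let env = ["prod", "dev", "prod", "dev", "prod"];
    let counter = [10_i64, 1, 20, 2, 30];

    let hashed = group_sum_hashed(&env, &counter);
    let sets = build_row_id_sets(&env);
    let precomputed = group_sum_precomputed(&sets, &counter);

    // Both strategies must agree on the aggregate for every group.
    assert_eq!(hashed, precomputed);
}
```

// In this sketch the pre-computed path does no hashing per row at query time,
// which is one plausible reason such a path suits low-cardinality columns.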
// These benchmarks track the performance of read_group using the general
// approach of building up a mapping of group keys. To avoid hitting the
// optimised no predicate implementation we apply a time predicate that covers
// the `RowGroup`.
fn read_group_predicate_all_time(c: &mut Criterion, row_group: &RowGroup, rng: &mut ThreadRng) {
    // This benchmark fixes the number of rows in the `RowGroup` (500K), and
    // varies the cardinality of the group keys.
    let time_pred = Predicate::with_time_range(&[], i64::MIN, i64::MAX);
    // test: benchmarks for general read_group case
    //
    // This commit adds some initial benchmarks for the general read_group
    // approach using a hashing strategy.
    //
    // Benchmarks are as follows:
    //
    // segment_read_group_all_time_vary_cardinality/cardinality_20_columns_2_rows_500000
    //   time:   [23.335 ms 23.363 ms 23.397 ms]
    //   thrpt:  [854.82 elem/s 856.07 elem/s 857.07 elem/s]
    //   Found 8 outliers among 100 measurements (8.00%)
    //     4 (4.00%) high mild
    //     4 (4.00%) high severe
    // segment_read_group_all_time_vary_cardinality/cardinality_200_columns_2_rows_500000
    //   time:   [34.266 ms 34.301 ms 34.346 ms]
    //   thrpt:  [5.8231 Kelem/s 5.8307 Kelem/s 5.8367 Kelem/s]
    //   Found 13 outliers among 100 measurements (13.00%)
    //     5 (5.00%) high mild
    //     8 (8.00%) high severe
    // segment_read_group_all_time_vary_cardinality/cardinality_2000_columns_2_rows_500000
    //   time:   [48.788 ms 48.996 ms 49.238 ms]
    //   thrpt:  [40.619 Kelem/s 40.820 Kelem/s 40.993 Kelem/s]
    //   Found 11 outliers among 100 measurements (11.00%)
    //     3 (3.00%) high mild
    //     8 (8.00%) high severe
    // Benchmarking segment_read_group_all_time_vary_cardinality/cardinality_20000_columns_3_rows_500000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 8.2s, or reduce sample count to 60.
    // segment_read_group_all_time_vary_cardinality/cardinality_20000_columns_3_rows_500000
    //   time:   [80.133 ms 80.201 ms 80.287 ms]
    //   thrpt:  [249.11 Kelem/s 249.37 Kelem/s 249.58 Kelem/s]
    //   Found 3 outliers among 100 measurements (3.00%)
    //     1 (1.00%) high mild
    //     2 (2.00%) high severe
    //
    // Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_2_rows_500000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 7.4s, or reduce sample count to 60.
    // segment_read_group_all_time_vary_columns/cardinality_20000_columns_2_rows_500000
    //   time:   [73.692 ms 73.951 ms 74.245 ms]
    //   thrpt:  [269.38 Kelem/s 270.45 Kelem/s 271.40 Kelem/s]
    //   Found 13 outliers among 100 measurements (13.00%)
    //     13 (13.00%) high severe
    // Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_3_rows_500000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 8.1s, or reduce sample count to 60.
    // segment_read_group_all_time_vary_columns/cardinality_20000_columns_3_rows_500000
    //   time:   [79.837 ms 79.934 ms 80.079 ms]
    //   thrpt:  [249.75 Kelem/s 250.21 Kelem/s 250.51 Kelem/s]
    //   Found 7 outliers among 100 measurements (7.00%)
    //     5 (5.00%) high mild
    //     2 (2.00%) high severe
    // Benchmarking segment_read_group_all_time_vary_columns/cardinality_20000_columns_4_rows_500000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 9.7s, or reduce sample count to 50.
    // segment_read_group_all_time_vary_columns/cardinality_20000_columns_4_rows_500000
    //   time:   [95.415 ms 95.549 ms 95.707 ms]
    //   thrpt:  [208.97 Kelem/s 209.32 Kelem/s 209.61 Kelem/s]
    //   Found 15 outliers among 100 measurements (15.00%)
    //     7 (7.00%) high mild
    //     8 (8.00%) high severe
    //
    // segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_250000
    //   time:   [38.897 ms 39.045 ms 39.227 ms]
    //   thrpt:  [509.86 Kelem/s 512.22 Kelem/s 514.18 Kelem/s]
    //   Found 13 outliers among 100 measurements (13.00%)
    //     4 (4.00%) high mild
    //     9 (9.00%) high severe
    // Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_500000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 7.2s, or reduce sample count to 60.
    // segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_500000
    //   time:   [71.965 ms 72.190 ms 72.445 ms]
    //   thrpt:  [276.07 Kelem/s 277.04 Kelem/s 277.91 Kelem/s]
    //   Found 21 outliers among 100 measurements (21.00%)
    //     4 (4.00%) low mild
    //     3 (3.00%) high mild
    //     14 (14.00%) high severe
    // Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_750000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 10.7s, or reduce sample count to 40.
    // segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_750000
    //   time:   [106.48 ms 106.58 ms 106.70 ms]
    //   thrpt:  [187.43 Kelem/s 187.65 Kelem/s 187.82 Kelem/s]
    //   Found 4 outliers among 100 measurements (4.00%)
    //     2 (2.00%) high mild
    //     2 (2.00%) high severe
    // Benchmarking segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_1000000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 14.0s, or reduce sample count to 30.
    // segment_read_group_all_time_vary_rows/cardinality_20000_columns_2_rows_1000000
    //   time:   [140.02 ms 140.14 ms 140.29 ms]
    //   thrpt:  [142.57 Kelem/s 142.71 Kelem/s 142.84 Kelem/s]
    //   Found 4 outliers among 100 measurements (4.00%)
    //     4 (4.00%) high severe
    //
    // segment_read_group_pre_computed_groups_vary_cardinality/cardinality_2_columns_1_rows_500000
    //   time:   [51.734 us 52.123 us 52.560 us]
    //   thrpt:  [38.051 Kelem/s 38.371 Kelem/s 38.659 Kelem/s]
    //   Found 18 outliers among 100 measurements (18.00%)
    //     3 (3.00%) high mild
    //     15 (15.00%) high severe
    // segment_read_group_pre_computed_groups_vary_cardinality/cardinality_20_columns_2_rows_500000
    //   time:   [50.546 us 50.642 us 50.785 us]
    //   thrpt:  [393.82 Kelem/s 394.93 Kelem/s 395.68 Kelem/s]
    //   Found 8 outliers among 100 measurements (8.00%)
    //     3 (3.00%) low mild
    //     2 (2.00%) high mild
    //     3 (3.00%) high severe
    // segment_read_group_pre_computed_groups_vary_cardinality/cardinality_200_columns_2_rows_500000
    //   time:   [267.47 us 270.23 us 273.10 us]
    //   thrpt:  [732.33 Kelem/s 740.12 Kelem/s 747.75 Kelem/s]
    // segment_read_group_pre_computed_groups_vary_cardinality/cardinality_2000_columns_2_rows_500000
    //   time:   [14.961 ms 15.033 ms 15.113 ms]
    //   thrpt:  [132.33 Kelem/s 133.04 Kelem/s 133.68 Kelem/s]
    //   Found 11 outliers among 100 measurements (11.00%)
    //     3 (3.00%) high mild
    //     8 (8.00%) high severe
    //
    // segment_read_group_pre_computed_groups_vary_columns/cardinality_200_columns_1_rows_500000
    //   time:   [84.825 us 84.938 us 85.083 us]
    //   thrpt:  [2.3506 Melem/s 2.3546 Melem/s 2.3578 Melem/s]
    //   Found 14 outliers among 100 measurements (14.00%)
    //     7 (7.00%) high mild
    //     7 (7.00%) high severe
    // segment_read_group_pre_computed_groups_vary_columns/cardinality_200_columns_2_rows_500000
    //   time:   [258.81 us 259.33 us 260.05 us]
    //   thrpt:  [769.08 Kelem/s 771.22 Kelem/s 772.77 Kelem/s]
    //   Found 14 outliers among 100 measurements (14.00%)
    //     2 (2.00%) high mild
    //     12 (12.00%) high severe
    // Benchmarking segment_read_group_pre_computed_groups_vary_columns/cardinality_200_columns_3_rows_500000: Warming up for 3.0000 s
    // Warning: Unable to complete 100 samples in 5.0s. You may wish to increase
    // target time to 6.1s, enable flat sampling, or reduce sample count to 60.
    // segment_read_group_pre_computed_groups_vary_columns/cardinality_200_columns_3_rows_500000
    //   time:   [1.1971 ms 1.2020 ms 1.2079 ms]
    //   thrpt:  [165.58 Kelem/s 166.39 Kelem/s 167.07 Kelem/s]
    //   Found 13 outliers among 100 measurements (13.00%)
    //     3 (3.00%) high mild
    //     10 (10.00%) high severe
    //
    // segment_read_group_pre_computed_groups_vary_rows/cardinality_200_columns_2_rows_250000
    //   time:   [252.42 us 252.58 us 252.75 us]
    //   thrpt:  [791.31 Kelem/s 791.84 Kelem/s 792.32 Kelem/s]
    //   Found 10 outliers among 100 measurements (10.00%)
    //     2 (2.00%) high mild
    //     8 (8.00%) high severe
    // segment_read_group_pre_computed_groups_vary_rows/cardinality_200_columns_2_rows_500000
    //   time:   [271.68 us 272.46 us 273.59 us]
    //   thrpt:  [731.01 Kelem/s 734.04 Kelem/s 736.15 Kelem/s]
    //   Found 8 outliers among 100 measurements (8.00%)
    //     8 (8.00%) high severe
    // segment_read_group_pre_computed_groups_vary_rows/cardinality_200_columns_2_rows_750000
    //   time:   [293.17 us 293.42 us 293.65 us]
    //   thrpt:  [681.09 Kelem/s 681.63 Kelem/s 682.20 Kelem/s]
    //   Found 9 outliers among 100 measurements (9.00%)
    //     1 (1.00%) low mild
    //     4 (4.00%) high mild
    //     4 (4.00%) high severe
    // segment_read_group_pre_computed_groups_vary_rows/cardinality_200_columns_2_rows_1000000
    //   time:   [306.48 us 307.11 us 307.95 us]
    //   thrpt:  [649.45 Kelem/s 651.22 Kelem/s 652.57 Kelem/s]
    //   Found 5 outliers among 100 measurements (5.00%)
    //     3 (3.00%) high mild
    //     2 (2.00%) high severe
    benchmark_read_group_vary_cardinality(
        c,
        "row_group_read_group_all_time_vary_cardinality",
        row_group,
        &time_pred,
        // grouping columns and expected cardinality
        vec![
            (vec!["env"], 2),
(vec!["env", "data_centre"], 20),
(vec!["data_centre", "cluster"], 200),
(vec!["cluster", "node_id"], 2000),
(vec!["cluster", "node_id", "pod_id"], 20000),
]
.as_slice(),
);
// This benchmark fixes the cardinality of the group keys and varies the
// number of columns grouped to produce that group key cardinality.
benchmark_read_group_vary_group_cols(
c,
"row_group_read_group_all_time_vary_columns",
row_group,
&time_pred,
// number of cols to group on and expected cardinality
vec![
(vec!["pod_id"], 20000),
(vec!["node_id", "pod_id"], 20000),
(vec!["cluster", "node_id", "pod_id"], 20000),
(vec!["data_centre", "cluster", "node_id", "pod_id"], 20000),
]
.as_slice(),
);
// This benchmark fixes the cardinality of the group keys and the number of
// columns grouped on. It then varies the number of rows in the `RowGroup`
// to be processed.
benchmark_read_group_vary_rows(
c,
"row_group_read_group_all_time_vary_rows",
&[250_000, 500_000, 750_000, 1_000_000], // `RowGroup` row sizes to vary
&time_pred,
(vec!["node_id", "pod_id"], 20000),
rng,
);
}
// These benchmarks track the performance of read_group when it is able to use
// the per-group bitsets provided by RLE-encoded columns. These code paths are
// hit due to the encoding of the grouping columns and the lack of predicates on
// the query.
fn read_group_pre_computed_groups(c: &mut Criterion, row_group: &RowGroup, rng: &mut ThreadRng) {
// This benchmark fixes the number of rows in the `RowGroup` (500K), and
// varies the cardinality of the group keys.
benchmark_read_group_vary_cardinality(
c,
"row_group_read_group_pre_computed_groups_vary_cardinality",
row_group,
&Predicate::default(),
// grouping columns and expected cardinality
vec![
(vec!["env"], 2),
(vec!["env", "data_centre"], 20),
(vec!["data_centre", "cluster"], 200),
(vec!["cluster", "node_id"], 2000),
]
.as_slice(),
);
// This benchmark fixes the cardinality of the group keys and varies the
// number of columns grouped to produce that group key cardinality.
benchmark_read_group_vary_group_cols(
c,
"row_group_read_group_pre_computed_groups_vary_columns",
row_group,
&Predicate::default(),
// number of cols to group on and expected cardinality
vec![
(vec!["cluster"], 200),
(vec!["data_centre", "cluster"], 200),
(vec!["env", "data_centre", "cluster"], 200),
]
.as_slice(),
);
// This benchmark fixes the cardinality of the group keys and the number of
// columns grouped on. It then varies the number of rows in the `RowGroup`
// to be processed.
benchmark_read_group_vary_rows(
c,
"row_group_read_group_pre_computed_groups_vary_rows",
&[250_000, 500_000, 750_000, 1_000_000], // `RowGroup` row sizes to vary
&Predicate::default(),
(vec!["data_centre", "cluster"], 200),
rng,
);
}
// This benchmarks the impact that the cardinality of group keys has on the
// performance of read_group.
fn benchmark_read_group_vary_cardinality(
c: &mut Criterion,
benchmark_group_name: &str,
row_group: &RowGroup,
predicate: &Predicate,
cardinalities: &[(Vec<&str>, usize)],
) {
let mut group = c.benchmark_group(benchmark_group_name);
for (group_cols, expected_cardinality) in cardinalities {
// benchmark measures the throughput of group creation.
group.throughput(Throughput::Elements(*expected_cardinality as u64));
group.bench_with_input(
BenchmarkId::from_parameter(format!(
"cardinality_{:?}_columns_{:?}_rows_{:?}",
expected_cardinality,
&group_cols.len(),
500_000
)),
&expected_cardinality,
|b, expected_cardinality| {
b.iter(|| {
let result = row_group.read_aggregate(
predicate,
group_cols.as_slice(),
&[("duration", AggregateType::Count)],
);
// data_centre cardinality is split across env
assert_eq!(result.cardinality(), **expected_cardinality, "{}", &result);
});
},
);
}
group.finish();
}
// This benchmarks the impact that the number of rows in a `RowGroup` has on the
// performance of read_group.
fn benchmark_read_group_vary_rows(
c: &mut Criterion,
benchmark_group_name: &str,
row_sizes: &[usize],
predicate: &Predicate,
group_columns: (Vec<&str>, usize),
rng: &mut ThreadRng,
) {
let mut group = c.benchmark_group(benchmark_group_name);
for num_rows in row_sizes {
let row_group = generate_row_group(*num_rows, rng);
// benchmark measures the throughput of group creation.
group.throughput(Throughput::Elements(group_columns.1 as u64));
group.bench_function(
BenchmarkId::from_parameter(format!(
"cardinality_{:?}_columns_{:?}_rows_{:?}",
group_columns.1,
group_columns.0.len(),
num_rows
)),
|b| {
b.iter(|| {
let result = row_group.read_aggregate(
predicate,
group_columns.0.as_slice(),
&[("duration", AggregateType::Count)],
);
// data_centre cardinality is split across env
assert_eq!(result.cardinality(), group_columns.1, "{}", &result);
});
},
);
}
group.finish();
}
// This benchmarks the impact that the number of group columns has on the
// performance of read_group.
fn benchmark_read_group_vary_group_cols(
c: &mut Criterion,
benchmark_group_name: &str,
row_group: &RowGroup,
predicates: &Predicate,
group_columns: &[(Vec<&str>, usize)],
) {
let mut group = c.benchmark_group(benchmark_group_name);
for (group_cols, expected_cardinality) in group_columns {
let num_cols = group_cols.len();
// benchmark measures the throughput of group creation.
group.throughput(Throughput::Elements(*expected_cardinality as u64));
group.bench_with_input(
BenchmarkId::from_parameter(format!(
"cardinality_{:?}_columns_{:?}_rows_{:?}",
*expected_cardinality, num_cols, 500_000
)),
&group_cols,
|b, group_cols| {
b.iter(|| {
let result = row_group.read_aggregate(
predicates,
group_cols.as_slice(),
&[("duration", AggregateType::Count)],
);
assert_eq!(result.cardinality(), *expected_cardinality, "{}", &result);
});
},
);
}
group.finish();
}
//
// This generates a `RowGroup` with a known schema, ~known column cardinalities
// and variable number of rows.
//
// The schema and cardinalities are in line with a tracing data use-case.
fn generate_row_group(rows: usize, rng: &mut ThreadRng) -> RowGroup {
let mut timestamp = 1351700038292387000_i64;
let spans_per_trace = 10;
let mut column_packers: Vec<Packers> = vec![
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // env (card 2)
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // data_centre (card 20)
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // cluster (card 200)
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // user_id (card 200,000)
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // request_id (card 2,000,000)
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // node_id (card 2,000)
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // pod_id (card 20,000)
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // trace_id (card "rows / 10")
Packers::from(Vec::<Option<String>>::with_capacity(rows)), // span_id (card "rows")
Packers::from(Vec::<Option<i64>>::with_capacity(rows)), // duration
Packers::from(Vec::<Option<i64>>::with_capacity(rows)), // time
];
let n = rows / spans_per_trace;
for _ in 0..n {
column_packers =
generate_trace_for_row_group(spans_per_trace, timestamp, column_packers, rng);
// next trace is ~10 seconds in the future
timestamp += 10_000 * ONE_MS;
}
// sort the packers from lowest to highest cardinality, excluding
// columns that are likely to be unique.
//
// - env, data_centre, cluster, node_id, pod_id, user_id, request_id, time
sorter::sort(&mut column_packers, &[0, 1, 2, 5, 6, 3, 4, 10]).unwrap();
// create columns
let columns = vec![
(
"env".to_string(),
ColumnType::Tag(Column::from(column_packers[0].str_packer().values())),
),
(
"data_centre".to_string(),
ColumnType::Tag(Column::from(column_packers[1].str_packer().values())),
),
(
"cluster".to_string(),
ColumnType::Tag(Column::from(column_packers[2].str_packer().values())),
),
(
"user_id".to_string(),
ColumnType::Tag(Column::from(column_packers[3].str_packer().values())),
),
(
"request_id".to_string(),
ColumnType::Tag(Column::from(column_packers[4].str_packer().values())),
),
(
"node_id".to_string(),
ColumnType::Tag(Column::from(column_packers[5].str_packer().values())),
),
(
"pod_id".to_string(),
ColumnType::Tag(Column::from(column_packers[6].str_packer().values())),
),
(
"trace_id".to_string(),
ColumnType::Tag(Column::from(column_packers[7].str_packer().values())),
),
(
"span_id".to_string(),
ColumnType::Tag(Column::from(column_packers[8].str_packer().values())),
),
(
"duration".to_string(),
ColumnType::Field(Column::from(
column_packers[9].i64_packer().some_values().as_slice(),
)),
),
(
"time".to_string(),
ColumnType::Time(Column::from(
column_packers[10].i64_packer().some_values().as_slice(),
)),
),
];
RowGroup::new(rows as u32, columns)
}
fn generate_trace_for_row_group(
    spans_per_trace: usize,
    timestamp: i64,
    mut column_packers: Vec<Packers>,
    rng: &mut ThreadRng,
) -> Vec<Packers> {
    let env_idx = 0;
    let data_centre_idx = 1;
    let cluster_idx = 2;
    let user_id_idx = 3;
    let request_id_idx = 4;
    let node_id_idx = 5;
    let pod_id_idx = 6;
    let trace_id_idx = 7;
    let span_id_idx = 8;
    let duration_idx = 9;
    let time_idx = 10;

    let env_value = rng.gen_range(0_u8..2);
    let env = format!("env-{:?}", env_value); // cardinality of 2.

    let data_centre_value = rng.gen_range(0_u8..10);
    let data_centre = format!("data_centre-{:?}-{:?}", env_value, data_centre_value); // cardinality of 2 * 10 = 20

    let cluster_value = rng.gen_range(0_u8..10);
    let cluster = format!(
        "cluster-{:?}-{:?}-{:?}",
        env_value,
        data_centre_value,
        cluster_value // cardinality of 2 * 10 * 10 = 200
    );

    // user id is dependent on the cluster
    let user_id_value = rng.gen_range(0_u32..1000);
    let user_id = format!(
        "uid-{:?}-{:?}-{:?}-{:?}",
        env_value,
        data_centre_value,
        cluster_value,
        user_id_value // cardinality of 2 * 10 * 10 * 1000 = 200,000
    );

    let request_id_value = rng.gen_range(0_u32..10);
    let request_id = format!(
        "rid-{:?}-{:?}-{:?}-{:?}-{:?}",
        env_value,
        data_centre_value,
        cluster_value,
        user_id_value,
        request_id_value // cardinality of 2 * 10 * 10 * 1000 * 10 = 2,000,000
    );

    let trace_id = rng
        .sample_iter(&Alphanumeric)
        .map(char::from)
        .take(8)
        .collect::<String>();
    // the trace should move across hosts, which in this setup would be nodes
    // and pods.
    let normal = Normal::new(10.0, 5.0).unwrap();
    let node_id_prefix = format!("{}-{}-{}", env_value, data_centre_value, cluster_value,);
    for _ in 0..spans_per_trace {
        // these values are not the same for each span so need to be generated
        // separately.
        let node_id = rng.gen_range(0..10); // cardinality is 2 * 10 * 10 * 10 = 2,000
        column_packers[pod_id_idx].str_packer_mut().push(format!(
            "pod_id-{}-{}-{}",
            node_id_prefix,
            node_id,
            rng.gen_range(0..10) // cardinality is 2 * 10 * 10 * 10 * 10 = 20,000
        ));
        column_packers[node_id_idx]
            .str_packer_mut()
            .push(format!("node_id-{}-{}", node_id_prefix, node_id));
        // randomly generate a span_id
        column_packers[span_id_idx].str_packer_mut().push(
            rng.sample_iter(&Alphanumeric)
                .map(char::from)
                .take(8)
                .collect::<String>(),
        );
        // randomly generate some duration times in milliseconds.
        column_packers[duration_idx].i64_packer_mut().push(
            (normal.sample(rng) * ONE_MS as f64)
                .max(ONE_MS as f64) // minimum duration is 1ms
                .round() as i64,
        );
    }
    column_packers[env_idx]
        .str_packer_mut()
        .fill_with(env, spans_per_trace);
    column_packers[data_centre_idx]
        .str_packer_mut()
        .fill_with(data_centre, spans_per_trace);
    column_packers[cluster_idx]
        .str_packer_mut()
        .fill_with(cluster, spans_per_trace);
    column_packers[user_id_idx]
        .str_packer_mut()
        .fill_with(user_id, spans_per_trace);
    column_packers[request_id_idx]
        .str_packer_mut()
        .fill_with(request_id, spans_per_trace);
    column_packers[trace_id_idx]
        .str_packer_mut()
        .fill_with(trace_id, spans_per_trace);
    column_packers[time_idx]
        .i64_packer_mut()
        .fill_with(timestamp, spans_per_trace);
    column_packers
}