
50 Commits (d24fb0eae7159a986a92c4e8be04e99321f4fc3f)

Author SHA1 Message Date
Carol (Nichols || Goulding) b982bdaf2f
fix: Derive Eq when we derive PartialEq and members can derive Eq
Allow this in generated code that we don't control, though.

Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
2022-08-11 15:04:06 -04:00
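A minimal sketch of what the `derive_partial_eq_without_eq` lint in commit b982bdaf2f asks for. The type names `AgentName` and `Sample` and the helper `assert_is_eq` are hypothetical and are not taken from the repository. A type whose members are all `Eq` can derive `Eq` next to `PartialEq`; a type holding an `f64` cannot, because `f64` implements only `PartialEq`.

```rust
// Hypothetical type: every member (String) implements Eq, so Eq can be derived too.
#[derive(Debug, Clone, PartialEq, Eq)]
struct AgentName {
    name: String,
}

// Hypothetical type: f64 is only PartialEq (NaN != NaN), so Eq cannot be derived here.
#[derive(Debug, Clone, PartialEq)]
struct Sample {
    value: f64,
}

// Compiles only for types that implement Eq; useful as a compile-time check.
fn assert_is_eq<T: Eq>(_value: &T) {}
```

Calling `assert_is_eq` on an `AgentName` compiles, while calling it on a `Sample` is rejected by the compiler.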
Marco Neumann 0fbff981ec
chore(deps): Bump sqlx to 0.6.0 and uuid to 1 (#4894)
Closes #4889.
Closes #4890.

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-17 10:28:28 +00:00
dependabot[bot] 23c9e38ea7
chore(deps): Bump clap from 3.1.18 to 3.2.1 (#4848)
* chore(deps): Bump clap from 3.1.18 to 3.2.1

Bumps [clap](https://github.com/clap-rs/clap) from 3.1.18 to 3.2.1.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.18...clap_complete-v3.2.1)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: fix clap deprecations

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-14 15:42:18 +00:00
Carol (Nichols || Goulding) 0d723fb21d
fix: Remove allow dead_code and remove dead code 2022-05-06 16:58:03 -04:00
Carol (Nichols || Goulding) a4443e4c31
fix: Remove OG gRPC client code and APIs 2022-04-29 16:29:49 -04:00
dependabot[bot] 65ab5213e5
chore(deps): Bump clap from 3.0.14 to 3.1.1 (#3809)
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.14 to 3.1.1.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.14...v3.1.1)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-22 14:51:53 +00:00
Marco Neumann c399e676ca chore: upgrade clap to v3 2022-01-17 12:12:46 +01:00
Marco Neumann f3f6f335a9
chore: upgrade to snafu 0.7 (#3440) 2022-01-11 19:22:36 +00:00
Paul Dix 8efd02930e feat: add total throughput to data generator output
On log level info, data generator will now output what the total throughput is in rows per second after every agent's successful write.
2021-12-11 17:12:28 -05:00
Paul Dix 8c88e1e52c refactor: change orgbucket to database in data generator 2021-12-09 13:39:33 -05:00
Paul Dix 01e86a031e feat: add regex to bucket writers assignment in data generator
This adds the ability to specify a regex to match against database names when specifying what agents should write to which buckets in the data generator.

A default has also been added for ratio so that it doesn't need to be specified if only a single database writer is defined.
2021-12-09 13:39:33 -05:00
Paul Dix 2c8d17bea8 refactor: change percent to ratio in data generator bucket writers 2021-12-08 12:09:04 -05:00
Paul Dix 31aa41e240 feat: add ability for data generator to write to many buckets
This adds the ability for the data generator to write to many databases. A new command line argument, `bucket_list`, is added which should be a file name. The file should contain a list of databases, one per line, with the structure of <org>_<bucket>. This is a little odd given the data generator expects org and bucket separately, but I expect the file that we'll be using will be database names, which have this format.

The configuration can specify what percentage of the list should get written to by which agents at what sampling interval. This should allow configurations where databases get different levels of ingest and different types (as specified via different agent specs). The structure is a little wonky, but I think it'll get the job done. The next step is to run some perf tests to see how the data generator performs if writing to 10k databases.
2021-12-08 12:09:04 -05:00
Paul Dix 3279725d10
refactor: Add agent name to data generator (#3297)
This is work leading up to giving the data generator the ability to write to many databases. The plan is to specify which agents will be used to write data to which databases.
2021-12-05 11:21:04 -05:00
Paul Dix 3c848049ba refactor: remove create database from data generator
This removes the create database command line flag and associated code from the data generator runner. Creation of databases should live outside the generator in other tools.
2021-12-03 16:22:45 -05:00
Carol (Nichols || Goulding) 948a45a4ea
fix: Use split_once rather than reimplementing manually
Identified by clippy.

https://rust-lang.github.io/rust-clippy/master/index.html#manual_split_once
2021-12-02 11:52:02 -05:00
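A small sketch of the `manual_split_once` lint addressed in commit 948a45a4ea. The function names and the `<org>_<bucket>` input are illustrative assumptions; the log does not say which call site was changed. Both functions split on the first underscore only, and the first is the form clippy suggests.

```rust
// Preferred form: str::split_once splits on the first occurrence of the delimiter.
fn parse_database_name(name: &str) -> Option<(&str, &str)> {
    name.split_once('_')
}

// Hand-rolled equivalent that the lint flags: splitn(2, ..) plus two next() calls.
fn parse_database_name_manual(name: &str) -> Option<(&str, &str)> {
    let mut parts = name.splitn(2, '_');
    let org = parts.next()?;
    let bucket = parts.next()?;
    Some((org, bucket))
}
```

Both return `None` when the input contains no underscore, and both keep any later underscores in the second half.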
Carol (Nichols || Goulding) 5d0fd1c603
fix: Allow dead code on fields that are now detected as never read 2021-12-02 11:52:01 -05:00
Marco Neumann dbf2642582 fix: `jaeger_debug` -> `jaeger_debug_header` 2021-12-01 18:02:39 +01:00
Marco Neumann 4bbe756b52 feat: make jaeger-debug-id configurable 2021-12-01 15:02:15 +01:00
Marco Neumann c961454dcd feat: `jaeger-debug-id` from data generator 2021-12-01 14:33:09 +01:00
Marco Neumann 4e043ecb55 refactor: remove old routing / sharding config
This is superseded by the new router subsystem.
2021-11-29 12:33:48 +01:00
Marco Neumann 7f2e4f4342 refactor: remove write buffer direction
The direction was required when a database could read or write from/to a
write buffer. Now it is clear from the usage context of a write buffer
which of the two applications is meant (databases read, routers write),
so the direction flag is no longer required.
2021-11-26 12:38:40 +01:00
Raphael Taylor-Davies 7aa386b07f fix: flaky incrementing_i64_that_resets (#3197) 2021-11-23 16:37:12 +00:00
Raphael Taylor-Davies 88868e7496
feat: remove legacy write service from influxdb_iox_client (#3043)
* feat: remove legacy write service from influxdb_iox_client

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-11-05 17:26:18 +00:00
Paul Dix 7d90314f2b fix: data generator example 2021-11-04 09:56:58 -04:00
Paul Dix 041957be48 fix: datagenerator PR feedback, implement field count 2021-11-04 09:56:58 -04:00
Paul Dix 7044b89453 feat: Refactor Data Generator
This is a huge commit that refactors the data generator. It removes many of the previous features that didn't quite make sense. The goal of this refactor was to make the data generator capable of representing complex tagsets that have values dependent on each other. It also significantly optimizes things to use far less memory and generate data much faster. Follow on work will update the generation of line protocol to support spaces in tags and their keys, double quotes in strings, and add more examples and documentation.
2021-11-04 09:56:58 -04:00
Paul Dix 348b91edc4 feat: Add noop option to data generator
I needed this feature to be able to see how much memory and resources a given data spec toml would take to run.
2021-11-04 09:56:58 -04:00
Paul Dix 32bf4be64c chore: add benchmark for data generator tag set 2021-11-04 09:56:58 -04:00
Paul Dix db2f8a58fc feat: Add tag_set and tag_pairs to measurements in Data Generator
This adds the ability to specify a tag_set and a collection of tag_pairs to measurements in the data generator. Tag pairs are evaluated once when the generator is created. This avoids re-running handlebars evaluations while generating data for tags that don't change value.

This commit also fixes an issue when printing the generation output to stdout while generating from more than one agent. Previously it would be garbled together.

Follow on PRs will update the tag generation code in measurement specs to be more consistent and optimized for performance. I'll be removing the restriction of using different options while using tag_set and tag_pairs. I wanted to get this in first to show the structure of what is output.
2021-11-04 09:56:58 -04:00
Marco Neumann 0d0c0cb42b refactor: move write buffer configs to new home
Write buffer configs will partially be shared by database and router
nodes, so let's move them into a shared home.
2021-11-02 10:17:01 +01:00
Raphael Taylor-Davies 3ffb16daa6
feat: remove parse_duration (#2574)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2021-09-19 11:27:24 +00:00
Jake Goulding 69939a5ae2 perf: Don't open the output file each time we write.
This improves performance of the file output mode, which should
make it easier to improve the performance of the core generation
logic.

Benchmarked via:

```
time \
./target/release/iox_data_generator \
--spec iox_data_generator/schemas/fully-supported.toml \
--output /tmp/out \
--start '1 month ago'
```

Before:

```
Submitted 271608 total points

real	10.912	10911567us
user	3.129	3129032us
sys	6.257	6257340us
cpu	86%
mem	7152 KiB
```

After:

```
Submitted 271588 total points

real	2.291	2291364us
user	1.969	1969357us
sys	0.058	58030us
cpu	88%
mem	7104 KiB
```

That's 21.0% of the previous time.
2021-09-16 10:57:59 -04:00
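The idea behind commit 69939a5ae2 can be sketched as follows: open the output once, keep the handle, and reuse it for every write. This is an illustration under assumptions, not the code from the commit; the `LineWriter` type and its methods are hypothetical, and wrapping the file in a `BufWriter` is a choice made here that the commit message does not mention.

```rust
use std::fs::File;
use std::io::{self, BufWriter, Write};
use std::path::Path;

/// Hypothetical writer that holds one open handle for its whole lifetime.
struct LineWriter<W: Write> {
    inner: W,
}

impl LineWriter<BufWriter<File>> {
    /// Open the output file once, up front, instead of on every write.
    fn create(path: &Path) -> io::Result<Self> {
        let file = File::create(path)?;
        Ok(Self { inner: BufWriter::new(file) })
    }
}

impl<W: Write> LineWriter<W> {
    /// Write one line of line protocol to the already-open handle.
    fn write_line(&mut self, line: &str) -> io::Result<()> {
        self.inner.write_all(line.as_bytes())?;
        self.inner.write_all(b"\n")
    }

    /// Flush buffered data and hand back the underlying writer.
    fn finish(mut self) -> io::Result<W> {
        self.inner.flush()?;
        Ok(self.inner)
    }
}
```

Keeping the type generic over `Write` means the same code can target an in-memory `Vec<u8>` in tests and a file in real runs.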
Paul Dix 6d3ac4db46 chore: pr cleanup on data generator 2021-09-13 17:45:53 -04:00
Paul Dix 32f2410597 feat: Add print to stdout to data generator (#2512)
This adds a flag to the data generator to print samples to standard out. It disables logging output so that only the line protocol is output.
2021-09-13 17:45:53 -04:00
Paul Dix 914c6e712b chore: remove rogue println in data generator 2021-09-13 17:45:53 -04:00
Paul Dix 5f0b3b336e refactor: optimizations to make tag set generation significantly faster 2021-09-13 17:45:53 -04:00
Paul Dix 7f915ba9d4 feat: Add pre-generated values and tag sets to data generator
This adds the ability to pre-generate values and tag sets in the data generator. This makes it easy to have tags that depend on other tag values (like buckets in an org) and have tag values that have one associated tag (like if something is in production or staging environment). Follow on work will add the actual generation to the agent spec. An added bonus of these pre-generated values is that generating samples won't require any sort of template evaluation for all of the tags in the tag sets. Only unique values (like trace_id or span_id) would need to be generated during sampling generation.
2021-09-13 17:45:53 -04:00
Marco Neumann bb4fba0c4c chore: `iox_data_generator` QA
- Move main binary to `src/bin`. It's easier to reason about when all
  binaries are in a single directory and the other files in `src` just
  belong to the lib. Note that `cargo run [-p iox_data_generator]` still
  works.
- Enable lints that we use elsewhere. Fix the few issues that were found
  by this (e.g. broken intradoc links).
2021-09-07 11:05:09 +02:00
Marco Neumann ecf1f99ddb refactor: more flexible write buffer config
This allows:

- different types (instead of guessing through the connection URL)
- sequencer counts (not used yet but will be by #2455)
- extensible configs (e.g. to configure Kafka in a more granular way,
  not wired up yet)
- future extensions (since we use a message now instead of a single
  string)

**BREAKING: This requires changes for deployed systems / existing DBs!**
2021-09-02 16:41:35 +02:00
Paul Dix 64fca1ee34 feat: Support sampling interval strings in data generator
This changes the sampling_interval in the data generator to be a string, supporting things like ns, us, ms, s, m, h and others.
2021-08-25 17:35:01 -04:00
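A rough sketch of how an interval string such as `10s` or `5ms` could be turned into a `Duration`, as described in commit 64fca1ee34. The function `parse_interval` is hypothetical and hand-rolled for illustration; the log does not show how the data generator actually parses these strings, and only the units ns, us, ms, s, m and h from the message are handled here.

```rust
use std::time::Duration;

/// Hypothetical parser: an unsigned integer followed by a unit suffix.
fn parse_interval(s: &str) -> Option<Duration> {
    let s = s.trim();
    // Index of the first non-digit character; None means there is no unit at all.
    let split = s.find(|c: char| !c.is_ascii_digit())?;
    let (digits, unit) = s.split_at(split);
    // An empty digit prefix (e.g. "ms") fails to parse and yields None.
    let n: u64 = digits.parse().ok()?;
    match unit {
        "ns" => Some(Duration::from_nanos(n)),
        "us" => Some(Duration::from_micros(n)),
        "ms" => Some(Duration::from_millis(n)),
        "s" => Some(Duration::from_secs(n)),
        "m" => n.checked_mul(60).map(Duration::from_secs),
        "h" => n.checked_mul(3600).map(Duration::from_secs),
        _ => None,
    }
}
```

Unknown units and inputs without a unit return `None` rather than guessing a default.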
Jake Goulding 405e6d4bf5 refactor: avoid manual iteration 2021-08-20 16:14:10 -04:00
Paul Dix e854527182 chore: fixup data generator based on feedback 2021-08-20 11:08:45 -04:00
Paul Dix 42fbb90d8c feat: Add batching to the data generator
Adds batch_size to the data generator to optionally gather multiple calls to generate for each agent. For example, if you have a sampling interval of 10 seconds and start at some point back in time with a batch size of 3, it gathers 3 samplings before writing to the points writer. For runs against a server API, this will batch them together in a single API call.
2021-08-20 11:08:45 -04:00
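The batching behaviour described in commit 42fbb90d8c can be sketched roughly as below. The function `batch_samplings` is hypothetical, and representing each sampling as a `String` of line protocol is an assumption made for illustration; the commit's own types are not shown in this log.

```rust
/// Hypothetical helper: group consecutive samplings into batches of `batch_size`,
/// joining each batch into one payload so it can be sent in a single write.
fn batch_samplings(samplings: Vec<String>, batch_size: usize) -> Vec<String> {
    // Treat a batch size of 0 as 1 so that chunks() never panics.
    let batch_size = batch_size.max(1);
    samplings
        .chunks(batch_size)
        .map(|chunk| chunk.join("\n"))
        .collect()
}
```

With a batch size of 3, four samplings produce two payloads: one holding the first three samplings and one holding the remaining sampling.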
Carol (Nichols || Goulding) 033035d10a docs: Update a few more instances of un-prefixed data_generator 2021-08-19 15:23:31 -04:00
Paul Dix 7c401fbf28 fix: fmt in data generator field 2021-08-19 15:17:22 -04:00
Carol (Nichols || Goulding) 31ead36fc0 test: Don't compare floats for strict equality 2021-08-19 15:02:34 -04:00
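A brief illustration of why tests avoid strict float equality, as in commit 31ead36fc0. The helper `approx_eq` and the tolerance value are assumptions for this sketch; the log does not show which comparison the commit actually uses.

```rust
/// Hypothetical helper: true when `a` and `b` differ by at most `epsilon`.
fn approx_eq(a: f64, b: f64, epsilon: f64) -> bool {
    (a - b).abs() <= epsilon
}
```

For example, `0.1 + 0.2 == 0.3` is false in `f64` arithmetic, while `approx_eq(0.1 + 0.2, 0.3, 1e-9)` is true.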
Carol (Nichols || Goulding) 266dffea86 fix: Don't pass a unit value to a function arg 2021-08-19 15:00:19 -04:00
Carol (Nichols || Goulding) a72fb5b468 fix: Don't assert_eq on a bool 2021-08-19 14:56:52 -04:00
Paul Dix d5f01a2a68 refactor: move data generator to IOx repo and fix build 2021-08-19 14:26:15 -04:00