A server is now started with `influxdb_iox run MODE [args]`, e.g.
`influxdb_iox run query --server-id 1`. Another mode, `router`, will
follow soon.
The old syntax `influxdb_iox run [args]` (without the mode part) is
still supported, but a deprecation notice will be printed.
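The fallback described above can be sketched as follows. This is only an illustration, not the actual CLI code: the names `RunMode`, `ParsedRun` and `parse_run_args` are hypothetical, and mapping the old syntax to the `query` mode is an assumption.

```rust
/// Server modes. `Router` stands for the mode that is said to follow soon.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RunMode {
    Query,
    Router,
}

/// Outcome of parsing everything that follows `run` (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct ParsedRun {
    pub mode: RunMode,
    pub args: Vec<String>,
    /// True when the mode was omitted and a deprecation notice is due.
    pub deprecated_syntax: bool,
}

fn owned(args: &[&str]) -> Vec<String> {
    args.iter().map(|s| s.to_string()).collect()
}

/// Parse the arguments after `run`. If the first argument is not a known
/// mode, treat the whole list as old-style arguments.
pub fn parse_run_args(args: &[&str]) -> ParsedRun {
    match args.first().copied() {
        Some("query") => ParsedRun {
            mode: RunMode::Query,
            args: owned(&args[1..]),
            deprecated_syntax: false,
        },
        Some("router") => ParsedRun {
            mode: RunMode::Router,
            args: owned(&args[1..]),
            deprecated_syntax: false,
        },
        // Assumption: the old syntax behaves like the `query` mode.
        _ => ParsedRun {
            mode: RunMode::Query,
            args: owned(args),
            deprecated_syntax: true,
        },
    }
}
```

A caller would print the deprecation notice whenever `deprecated_syntax` is set and then continue as usual.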
This improves the performance of the file output mode, which should
make it easier to improve the performance of the core generation
logic.
Benchmarked via:
```
time \
./target/release/iox_data_generator \
--spec iox_data_generator/schemas/fully-supported.toml \
--output /tmp/out \
--start '1 month ago'
```
Before:
```
Submitted 271608 total points
real 10.912 10911567us
user 3.129 3129032us
sys 6.257 6257340us
cpu 86%
mem 7152 KiB
```
After:
```
Submitted 271588 total points
real 2.291 2291364us
user 1.969 1969357us
sys 0.058 58030us
cpu 88%
mem 7104 KiB
```
The new wall-clock time is 21.0% of the previous one (2.291 s vs. 10.912 s).
This adds the ability to pre-generate values and tag sets in the data generator. This makes it easy to have tags that depend on other tag values (like buckets in an org) and tag values that have exactly one associated tag (such as whether something is in a production or staging environment). Follow-on work will add the actual generation to the agent spec. An added bonus of these pre-generated values is that generating samples won't require any template evaluation for the tags in the tag sets. Only unique values (like trace_id or span_id) need to be generated during sample generation.
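A minimal sketch of the idea, not the generator's actual implementation: tag sets are built once up front, with `bucket` depending on `org` and each org carrying exactly one `env`. Only the per-sample `trace_id` is produced at sampling time. The function names, tag keys and the alternating production/staging assignment are all hypothetical.

```rust
/// One pre-generated tag set: ordered (key, value) pairs.
pub type TagSet = Vec<(String, String)>;

/// Build every tag set once, before any sample is generated.
pub fn pregenerate_tag_sets(org_count: usize, buckets_per_org: usize) -> Vec<TagSet> {
    let mut tag_sets = Vec::with_capacity(org_count * buckets_per_org);
    for org in 0..org_count {
        // Associated tag: each org belongs to exactly one environment
        // (alternating assignment is an arbitrary choice for this sketch).
        let env = if org % 2 == 0 { "production" } else { "staging" };
        for bucket in 0..buckets_per_org {
            // Dependent tag: the bucket value is derived from its org.
            tag_sets.push(vec![
                ("org".to_string(), format!("org-{}", org)),
                ("env".to_string(), env.to_string()),
                ("bucket".to_string(), format!("org-{}-bucket-{}", org, bucket)),
            ]);
        }
    }
    tag_sets
}

/// Render one line-protocol sample. The pre-generated tags are copied as-is;
/// only `trace_id` is unique per sample.
pub fn sample_line(
    measurement: &str,
    tags: &[(String, String)],
    trace_id: u64,
    value: f64,
    timestamp: i64,
) -> String {
    let mut line = String::from(measurement);
    for (key, val) in tags {
        line.push(',');
        line.push_str(key);
        line.push('=');
        line.push_str(val);
    }
    line.push_str(&format!(",trace_id={} value={} {}", trace_id, value, timestamp));
    line
}
```

With this shape, the sampling loop only picks a tag set and appends the unique values, without evaluating any templates for the shared tags.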
- Move main binary to `src/bin`. It's easier to reason about when all
binaries are in a single directory and the other files in `src` just
belong to the lib. Note that `cargo run [-p iox_data_generator]` still
works.
- Enable lints that we use elsewhere. Fix the few issues that were found
by this (e.g. broken intra-doc links).
This allows:
- different types (instead of guessing through the connection URL)
- sequencer counts (not used yet but will be by #2455)
- extensible configs (e.g. to configure Kafka in a more granular way,
not wired up yet)
- future extensions (since we use a message now instead of a single
string)
**BREAKING: This requires changes for deployed systems / existing DBs!**
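To illustrate the difference between a single connection string and a structured message, here is a rough sketch. All type and field names (`BackendKind`, `ConnectionConfig`, `sequencer_count`, `options`) are hypothetical and do not reflect the actual message definition. The `Mock` variant and the `client.id` key are likewise invented for the example.

```rust
use std::collections::BTreeMap;

/// Explicit backend type, so it no longer has to be guessed from the URL.
/// Variant names are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendKind {
    Kafka,
    Mock,
}

/// Structured replacement for a plain connection string (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConnectionConfig {
    pub kind: BackendKind,
    pub connection: String,
    /// Optional sequencer count; `None` leaves the choice to the backend.
    pub sequencer_count: Option<u32>,
    /// Free-form key/value options, e.g. for fine-grained Kafka settings.
    pub options: BTreeMap<String, String>,
}

impl ConnectionConfig {
    /// Start from the two mandatory fields; everything else is optional.
    pub fn new(kind: BackendKind, connection: &str) -> Self {
        Self {
            kind,
            connection: connection.to_string(),
            sequencer_count: None,
            options: BTreeMap::new(),
        }
    }

    pub fn with_sequencer_count(mut self, count: u32) -> Self {
        self.sequencer_count = Some(count);
        self
    }

    pub fn with_option(mut self, key: &str, value: &str) -> Self {
        self.options.insert(key.to_string(), value.to_string());
        self
    }
}
```

Because new fields can be added to a message without changing how existing ones are read, later extensions do not require another format change.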
Adds batch_size to the data generator to optionally gather multiple calls to generate for each agent. For example, if you have a sampling interval of 10 seconds and start at some point back in time with a batch size of 3, it gathers 3 samplings before writing to the points writer. For runs against a server API, this will batch them together in a single API call.
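The batching behaviour can be sketched like this. It is a simplified stand-in rather than the generator's code: `run_batched` is a hypothetical name, a sampling is reduced to its timestamp, and the `write` callback stands in for the points writer. Flushing a final partial batch and treating a batch size of 0 as 1 are assumptions of this sketch.

```rust
/// Generate `samplings` timestamps starting at `start`, spaced by
/// `interval_secs`, and hand them to `write` in groups of `batch_size`.
/// Returns the number of write calls that were made.
pub fn run_batched<F: FnMut(&[i64])>(
    start: i64,
    interval_secs: i64,
    samplings: usize,
    batch_size: usize,
    mut write: F,
) -> usize {
    // Assumption: a batch size of 0 is treated like 1 (no batching).
    let batch_size = batch_size.max(1);
    let mut batch: Vec<i64> = Vec::with_capacity(batch_size);
    let mut write_calls = 0;

    for i in 0..samplings {
        batch.push(start + (i as i64) * interval_secs);
        if batch.len() == batch_size {
            // One call to the writer (or one API request) per full batch.
            write(&batch);
            write_calls += 1;
            batch.clear();
        }
    }

    // Assumption: a trailing partial batch is still written at the end.
    if !batch.is_empty() {
        write(&batch);
        write_calls += 1;
    }
    write_calls
}
```

With a 10 second interval, 7 samplings and a batch size of 3, this makes three write calls: two full batches and one final batch holding the remaining sampling.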