bb4fba0c4c
- Move main binary to `src/bin`. It's easier to reason about when all binaries are in a single directory and the other files in `src` just belong to the lib. Note that `cargo run [-p iox_data_generator]` still works. - Enable lints that we use elsewhere. Fix the few issues that were found by this (e.g. broken intradoc links). |
||
---|---|---|
.. | ||
benches | ||
schemas | ||
src | ||
Cargo.toml | ||
README.md |
README.md
iox_data_generator
The iox_data_generator
tool creates random data points according to a specification and loads them
into an iox
instance to simulate real data.
To build and run, first install Rust. Then from root of the influxdb_iox
repo run:
cargo build --release
And the built binary has command line help:
./target/release/iox_data_generator --help
For examples of specifications see the schemas folder
Use with two IOx servers and Kafka
The data generator tool can be used to simulate data being written to IOx in various shapes. This is how to set up a local experiment for profiling or debugging purposes using a database in two IOx instances: one writing to Kafka and one reading from Kafka.
If you're profiling IOx, be sure you've compiled and are running a release build using either:
cargo build --release
./target/release/influxdb_iox run --server-id 1
or:
cargo run --release -- run --server-id 1
Server ID is the only required attribute for running IOx; see influxdb_iox run --help
for all the
other configuration options for the server you may want to set for your experiment. Note that the
default HTTP API address is 127.0.0.1:8080
unless you set something different with --api-bind
and the default gRPC address is 127.0.0.1:8082
unless you set something different using
--grpc-bind
.
For the Kafka setup, you'll need to start two IOx servers, so you'll need to set the bind addresses for at least one of them. Here's an example of the two commands to run:
cargo run --release -- run --server-id 1
cargo run --release -- run --server-id 2 --api-bind 127.0.0.1:8084 --grpc-bind 127.0.0.1:8086
You'll also need to run a Kafka instance. There's a Docker compose script in the influxdb_iox repo you can run with:
docker-compose -f docker/ci-kafka-docker-compose.yml up kafka
The Kafka instance will be accessible from 127.0.0.1:9093
if you run it with this script.
Once you have the two IOx servers and one Kafka instance running, create a database with a name in
the format [orgname]_[bucketname]
. For example, create a database in IOx named mlb_pirates
, and
the org you'll use in the data generator will be mlb
and the bucket will be pirates
. The
DatabaseRules
defined in src/bin/create_database.rs
will set up a database in the "writer" IOx
instance to write to Kafka and the database in the "reader" IOx instance to read from Kafka if
you run it with:
cargo run --release -p iox_data_generator --bin create_database -- --writer 127.0.0.1:8082 --reader 127.0.0.1:8086 mlb_pirates
This script adds 3 rows to a writer_test
table because this issue with the Kafka Consumer
needing data before it can find partitions.
Once the database is created, decide what kind of data you would like to send it. You can use an
existing data generation schema in the schemas
directory or create a new one, perhaps starting
from an existing schema as a guide. In this example, we're going to use
iox_data_generator/schemas/cap-write.toml
.
Next, run the data generation tool as follows:
cargo run --release -p iox_data_generator -- --spec iox_data_generator/schemas/cap-write.toml --continue --host 127.0.0.1:8080 --token arbitrary --org mlb --bucket pirates
--spec iox_data_generator/schemas/cap-write.toml
sets the schema you want to use to generate the data--continue
means the data generation tool should generate data everysampling_interval
(which is set in the schema) until we stop it--host 127.0.0.1:8080
means to write to the writer IOx server running at the default HTTP API address of127.0.0.1:8080
(note this is NOT the gRPC address used by thecreate_database
command)--token arbitrary
- the data generator requires a token value but IOx doesn't use it, so this can be any value.--org mlb
is the part of the database name you created before the_
--bucket pirates
is the part of the database name you created after the_
You should be able to use influxdb_iox sql -h http://127.0.0.1:8086
to connect to the gRPC of the reader
then use database mlb_pirates;
and query the tables to see that the data is being inserted. That
is,
# in your influxdb_iox checkout
cargo run --release -- sql -h http://127.0.0.1:8086
Connecting to the writer instance won't show any data.