This is a huge commit that refactors the data generator. It removes many of the previous features that didn't quite make sense. The goal of this refactor was to make the data generator capable of representing complex tagsets that have values dependent on each other. It also significantly optimizes things to use far less memory and generate data much faster. Follow on work will update the generation of line protocol to support spaces in tags and their keys, double quotes in strings, and add more examples and documentation.
This adds the ability to specify a tag_set and a collection of tag_pairs to measurements in the data generator. Tag pairs are evaluated once when the generator is created. This avoids re-running handlebars evaluations while generating data for tags that don't change value.
This commit also fixes an issue when printing the generation output to stdout while generating from more than one agent. Previously it would be garbled together.
Follow on PRs will update the tag generation code in measurement specs to be more consistent and optimzised for performance. I'll be removing the restriction of using different options while using tag_set and tag_pairs. I wanted to get this in first to show the structure of what is output.
This adds the ability to pre-generate values and tag sets in the data generator. This makes it easy to have tags that depend on other tag values (like buckets in an org) and have tag values that have one associated tag (like if something is in production or staging environment). Follow on work will add the actual generation to the agent spec. An added bonus of these pre-generated values is that generating samples won't require any sort of template evaluation for all of the tags in the tag sets. Only unique values (like trace_id or span_id) would need to be generated during sampling generation.