Record the number of tables in each write - this lets us track the
total number of tables a router instance has observed, which, combined
with the existing metrics, helps us understand the shape (distribution
of tables/lines/fields) of the workload hitting the routers.
Uses the new ColumnRepo::create_or_get_many() catalog method to perform
a bulk upsert of (potentially) new columns to the catalog during schema
validation.
A quick change to perform the ColumnRepo::create_or_get() calls in
parallel (up to a maximum of 3 in-flight at any one time) in order to
mitigate the latency of the call and reduce the overall schema
validation call duration.
The in-flight limit is enforced to avoid starving the DB connection pool
of connections.
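The bounded fan-out looks roughly like the sketch below; `ColumnRepoLike`, the
argument order of `create_or_get()`, and the surrounding types are assumptions
for illustration, not the real catalog API:

```rust
use futures::stream::{self, StreamExt, TryStreamExt};

// Sketch only: `ColumnRepoLike` stands in for the catalog's column repository.
async fn upsert_columns(
    repo: &(impl ColumnRepoLike + Sync),
    wanted: Vec<(TableId, String, ColumnType)>, // columns referenced by the write
) -> Result<Vec<Column>, CatalogError> {
    // The in-flight cap avoids starving the shared DB connection pool.
    const MAX_IN_FLIGHT: usize = 3;

    stream::iter(wanted)
        .map(|(table_id, name, column_type)| async move {
            // One create_or_get() catalog call per (potentially new) column.
            repo.create_or_get(&name, table_id, column_type).await
        })
        // At most MAX_IN_FLIGHT calls run concurrently, hiding per-call
        // latency without holding too many connections at once.
        .buffer_unordered(MAX_IN_FLIGHT)
        .try_collect()
        .await
}
```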
Adds two new metrics:
* namespace_cache_table_count: total number of tables in cache
* namespace_cache_column_count: total number of columns in cache
The metric decorator keeps a running total of the table and column
counts as namespaces are inserted into the cache, and adjusts the
values accordingly when an existing namespace is overwritten.
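A minimal sketch of that accounting, assuming the cached `NamespaceSchema`
exposes its tables and their columns as maps (the real decorator reports these
totals through the metric registry rather than bare atomics):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Default)]
struct CacheStats {
    table_count: AtomicU64,  // exported as namespace_cache_table_count
    column_count: AtomicU64, // exported as namespace_cache_column_count
}

impl CacheStats {
    /// Called when `new` is inserted into the cache, replacing `old` if the
    /// namespace was already present.
    fn observe_put(&self, new: &NamespaceSchema, old: Option<&NamespaceSchema>) {
        let (tables, columns) = counts(new);
        self.table_count.fetch_add(tables, Ordering::Relaxed);
        self.column_count.fetch_add(columns, Ordering::Relaxed);

        // If this insert overwrote an existing namespace, remove its previous
        // contribution so the totals always reflect the cache contents.
        if let Some(old) = old {
            let (tables, columns) = counts(old);
            self.table_count.fetch_sub(tables, Ordering::Relaxed);
            self.column_count.fetch_sub(columns, Ordering::Relaxed);
        }
    }
}

fn counts(ns: &NamespaceSchema) -> (u64, u64) {
    let tables = ns.tables.len() as u64;
    let columns = ns.tables.values().map(|t| t.columns.len() as u64).sum();
    (tables, columns)
}
```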
Changes the configuration of the router request pipeline to move schema
validation before partitioning.
This reduces the concurrency of calls into the schema validator when a
single write is split into one or more partitions, reducing contention
and cache thrashing. It also ensures we don't bother partitioning the
writes if the request will fail validation.
Configures the instrumentation decorator to emit a trace span covering
the duration of the decorated handler's execution, recording the
success/error result and the error message, if any.
Adds an integration test covering the router's HTTP handler stack.
Given a well-formed HTTP write, the test asserts:
* Write passes through the stack without error
* Response code sent to client
* Write buffer message is enqueued
* Catalog namespace record is created
* Metric handler is invoked and the hit is recorded
The router is composed of several DML handlers called in sequence in
order to construct the full request handling pipeline. Prior to this
commit, each handler nested the next handler it calls internally,
producing a nested call chain that resulted in metrics (added in #3764)
recording cumulative latency like this:
            ┌ ─     ┌───────────────┐
            │       │  NS Creation  │
            │       └───────────────┘
            │   ┌ ─ ┌───────────────┐
            │   │   │  Partitioner  │
            │   │   └───────────────┘
Cumulative  │   │   ┌───────────────┐
 Timings    │   │   │    etc...     │
 1.5s  1s   │   │   └───────────────┘
            │   │   ┌───────────────┐
            │   │   │  Partitioner  │
            │   └ ─ └───────────────┘
            │       ┌───────────────┐
            │       │  NS Creation  │
            └ ─     └───────────────┘
This meant it was hard to determine the latency of a single handler
without knowing (and subtracting the latency of) all the child handlers
it calls.
This commit replaces the intrusive nested handler call chain with an
external Chain combinator type to compose together individual handlers,
resulting in correct per-handler timings and simpler code/tests:
┌───────────────┐
│  NS Creation  │
└───────────────┘
        │
   .5s  │   ┌───────────────┐
        └──▶│  Partitioner  │
            └───────────────┘
                    │
               1s   │   ┌───────────────┐
                    └──▶│    etc...     │
                        └───────────────┘
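A minimal sketch of the combinator idea; the real `DmlHandler` trait carries
more methods (deletes, spans) than shown, and the real combinator presumably
handles differing error types, which this sketch sidesteps by requiring both
links to share one:

```rust
use async_trait::async_trait;

// Simplified stand-in for the DmlHandler trait.
#[async_trait]
trait Handler: Send + Sync {
    type WriteInput: Send;
    type WriteOutput: Send;
    type WriteError: std::error::Error + Send;

    async fn write(
        &self,
        namespace: &str,
        input: Self::WriteInput,
    ) -> Result<Self::WriteOutput, Self::WriteError>;
}

/// Composes two handlers, feeding the output of `first` into `second`.
struct Chain<T, U> {
    first: T,
    second: U,
}

#[async_trait]
impl<T, U> Handler for Chain<T, U>
where
    T: Handler,
    U: Handler<WriteInput = T::WriteOutput, WriteError = T::WriteError>,
{
    type WriteInput = T::WriteInput;
    type WriteOutput = U::WriteOutput;
    type WriteError = T::WriteError;

    async fn write(
        &self,
        namespace: &str,
        input: Self::WriteInput,
    ) -> Result<Self::WriteOutput, Self::WriteError> {
        // Each handler executes at the top level of the chain, wrapped by its
        // own instrumentation decorator, so the recorded latency is
        // per-handler rather than cumulative across nested calls.
        let output = self.first.write(namespace, input).await?;
        self.second.write(namespace, output).await
    }
}
```

Because a `Chain` is itself a handler, the full pipeline is built by nesting
the combinator rather than nesting the handlers themselves.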
Changes the NamespaceAutocreation handler to be generic over any
WriteInput.
This allows the NamespaceAutocreation layer to be placed anywhere in the
handler stack, without needing a prior transformation or specific write
type.
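Schematically (reusing the simplified `Handler` trait from the Chain sketch
above; the catalog handle and `CatalogError` are stand-ins), the handler can be
generic over the payload because it only inspects the namespace:

```rust
use async_trait::async_trait;
use std::marker::PhantomData;

// Sketch only: auto-creation never looks at the write payload `T`, so it can
// sit anywhere in the pipeline and pass the payload through untouched.
struct NamespaceAutocreationSketch<T> {
    // catalog handle elided
    _payload: PhantomData<T>,
}

#[async_trait]
impl<T: Send + Sync + 'static> Handler for NamespaceAutocreationSketch<T> {
    type WriteInput = T;
    type WriteOutput = T; // unchanged
    type WriteError = CatalogError; // stand-in

    async fn write(&self, namespace: &str, input: T) -> Result<T, Self::WriteError> {
        // Ensure `namespace` exists in the catalog (actual call elided)...
        let _ = namespace;
        // ...then hand the untouched payload to the next handler in the chain.
        Ok(input)
    }
}
```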
Implements a write partitioning DML handler that splits per-table
MutableBatch instances into per-partition, per-table MutableBatch
instances and concurrently calls the inner DML handler with each.
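The fan-out shape, roughly, again in terms of the simplified `Handler` trait
from above; `PartitionKey`, `MutableBatch`, and `split_by_partition_key()`
stand in for the real types and partitioning logic:

```rust
use futures::future::try_join_all;
use std::collections::HashMap;

// Sketch only: regroup per-table batches by partition key, then write each
// partition through the inner handler concurrently.
async fn write_partitioned<H>(
    inner: &H,
    namespace: &str,
    tables: HashMap<String, MutableBatch>,
) -> Result<(), H::WriteError>
where
    H: Handler<WriteInput = HashMap<String, MutableBatch>>,
{
    // Per-partition, per-table batches.
    let mut partitions: HashMap<PartitionKey, HashMap<String, MutableBatch>> =
        HashMap::new();
    for (table, batch) in tables {
        for (key, rows) in split_by_partition_key(batch) {
            partitions.entry(key).or_default().insert(table.clone(), rows);
        }
    }

    // One concurrent call into the inner handler per partition.
    try_join_all(
        partitions
            .into_values()
            .map(|per_table| inner.write(namespace, per_table)),
    )
    .await?;

    Ok(())
}
```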
Allow the MockDmlHandler to capture any input type given to the write()
method. This lets us reuse the mock across all handler implementations,
regardless of their expected write input type.
Allow a DML handler to specify the write input type on which it
operates.
This allows us to construct a write handler pipeline that transforms the
request as it passes through the various handlers. We'll use this to
implement a handler that annotates a normal set of table writes with the
partition key, modifying downstream handlers to expect this annotated
input.
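In terms of the simplified `Handler` trait sketched earlier, a transforming
stage declares different input and output types (the `Partitioned` wrapper,
`PartitionError`, and `partition_key_for()` are stand-ins):

```rust
use async_trait::async_trait;
use std::collections::HashMap;

// The partition-key annotation the downstream handlers will consume.
struct Partitioned<T> {
    key: PartitionKey,
    payload: T,
}

struct PartitionAnnotator;

#[async_trait]
impl Handler for PartitionAnnotator {
    // Accepts plain per-table batches...
    type WriteInput = HashMap<String, MutableBatch>;
    // ...and emits partition-annotated batches for the next handler.
    type WriteOutput = Vec<Partitioned<HashMap<String, MutableBatch>>>;
    type WriteError = PartitionError; // stand-in

    async fn write(
        &self,
        _namespace: &str,
        tables: Self::WriteInput,
    ) -> Result<Self::WriteOutput, Self::WriteError> {
        // The real handler splits rows across partitions; a single-partition
        // pass-through keeps the sketch short.
        Ok(vec![Partitioned {
            key: partition_key_for(&tables),
            payload: tables,
        }])
    }
}
```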
* feat: allow catalog access w/o a transaction
Now the caller has full control over whether or not to use a
transaction.
* fix: remove non-transaction-safe `create_many`
* fix: remove unnecessary transactions
* refactor: catalog Unit of Work (= transaction)
Sets up an interface to handle Units of Work within our catalog. Previously
both the Postgres and the in-mem backend used "mini-transactions on
demand". Now the caller has a clear way to establish boundaries and
gets read and write isolation (a rough usage sketch follows this list).
A single `Arc<dyn Catalog>` can create as many `Box<dyn UnitOfWork>` as
you like, but note that depending on the backend you may not scale
infinitely (Postgres will likely impose certain limits and the in-mem
backend limits concurrency to 1 to keep things simple).
* docs: improve wording
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: rename Unit of Work to Transaction
* test: improve `test_txn_isolation`
* feat: clarify transaction drop semantics
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
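A rough usage sketch of the interface described above; the repository
accessors and argument lists approximate the bullets rather than quoting the
real API:

```rust
// Sketch only - method names and arguments are approximations.
let mut txn = catalog.start_transaction().await?;

// Reads and writes made through the transaction share one consistent view
// and are isolated from other users of the catalog...
let table = txn.tables().create_or_get("cpu", namespace_id).await?;
let column = txn
    .columns()
    .create_or_get("host", table.id, ColumnType::Tag)
    .await?;

// ...and only become visible to others on the explicit commit. (What happens
// when a transaction is dropped without committing is covered by the
// drop-semantics commit above.)
txn.commit().await?;
```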
Adds a simple wrapper type that maps the namespace keyspace over a set
of N namespace schema caches, thereby reducing cache lock contention by
a factor of N (in a perfect world).
This will help smooth out latency of workloads that include new
namespace requests or incremental schema additions. It should also
significantly help latency during initial cache warming of a freshly
booted router.
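A minimal sketch of the keyspace mapping, with the inner cache type left
generic:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Maps namespace names onto one of N inner caches so that requests for
/// different namespaces rarely contend on the same lock.
struct ShardedCache<T> {
    shards: Vec<T>, // N independently-locked caches
}

impl<T> ShardedCache<T> {
    /// A given namespace always maps to the same shard.
    fn shard(&self, namespace: &str) -> &T {
        let mut h = DefaultHasher::new();
        namespace.hash(&mut h);
        let idx = (h.finish() as usize) % self.shards.len();
        &self.shards[idx]
    }
}
```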
This adds the scaffolding for the ingester server to consume data from Kafka. This ingests data into an in-memory structure while creating records in the catalog for any partitions that don't yet exist.
I've removed catalog_update.rs in ingester for now. That was mostly a placeholder and will be going in a combination of handler.rs and data.rs on my next PR which will have some primitive lifecycle wired up.
There's one ugly bit here where the DML write is cloned because it's getting borrowed to output spans and metrics. I'll need to follow up with a refactor to make it so that the DML write's tables can be consumed without it gumming up the metrics stuff.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Adds an in-memory cache of table schemas to the SchemaValidator DML
handler.
The cache pulls from the global catalog when observing a column for the
first time, and pushes the column type to set it for subsequent requests
if it does not already exist (this pull & push is done atomically by the
catalog in an "upsert" call).
The in-memory cache is sharded by namespace, with each shard guarded by
an individual lock to minimise contention between readers (the expected
average case) and writers (only when adding new columns/tables).
Relies on the catalog to serialise new column creation and validate
parallel creation requests.
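A sketch of the pull-and-push flow for a single column, assuming
`parking_lot`-style locks; the `ColumnRepoLike` bound, the error variant, and
the single-table cache are simplifications of the real per-namespace sharded
structure:

```rust
use parking_lot::RwLock;
use std::collections::HashMap;

// Sketch only.
async fn observe_column(
    cached: &RwLock<HashMap<String, ColumnType>>, // cached columns of one table
    catalog: &impl ColumnRepoLike,                // stand-in for the catalog repo
    table_id: TableId,
    name: &str,
    wanted: ColumnType,
) -> Result<(), SchemaError> {
    // Fast path: the column is already known. Readers only take the shared
    // lock, so the expected average case doesn't contend with writers.
    if let Some(known) = cached.read().get(name) {
        return if *known == wanted {
            Ok(())
        } else {
            Err(SchemaError::ColumnTypeConflict)
        };
    }

    // Slow path: ask the catalog to atomically create the column or return
    // the existing definition (the "upsert"); the catalog serialises
    // concurrent creation attempts.
    let col = catalog.create_or_get(name, table_id, wanted).await?;

    // Push the now-authoritative type into the cache for subsequent requests.
    cached.write().insert(name.to_string(), col.column_type);
    Ok(())
}
```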
Implements a write schema validation DML handler, denying requests that
conflict with the schema within the global catalog. Additive schema
changes are accepted, incrementally updating the global catalog schema.
Deletes are passed through unchanged and unvalidated.
Allows the DmlHandler to return different types for each method.
This enables a DmlHandler implementation decorating an inner handler to
return the inner handler's error directly, avoiding any "wrapper"
errors.
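A minimal sketch of the idea with simplified signatures: each method gets its
own associated error type, so a decorator can re-expose the inner handler's
errors instead of wrapping them:

```rust
use async_trait::async_trait;

// Simplified stand-in for the DmlHandler trait, with per-method error types.
#[async_trait]
trait DmlHandlerSketch: Send + Sync {
    type WriteInput: Send;
    type WriteError: std::error::Error + Send;
    type DeleteError: std::error::Error + Send;

    async fn write(&self, input: Self::WriteInput) -> Result<(), Self::WriteError>;
    async fn delete(&self, namespace: &str, table: &str) -> Result<(), Self::DeleteError>;
}

/// A decorator that adds behaviour without adding a wrapper error type: it
/// simply re-exposes the inner handler's associated types.
struct Instrumented<T>(T);

#[async_trait]
impl<T: DmlHandlerSketch> DmlHandlerSketch for Instrumented<T> {
    type WriteInput = T::WriteInput;
    type WriteError = T::WriteError; // callers see the inner error directly
    type DeleteError = T::DeleteError;

    async fn write(&self, input: Self::WriteInput) -> Result<(), Self::WriteError> {
        // record metrics / spans here, then delegate
        self.0.write(input).await
    }

    async fn delete(&self, namespace: &str, table: &str) -> Result<(), Self::DeleteError> {
        self.0.delete(namespace, table).await
    }
}
```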