influxdb

Commit Graph

Author	SHA1	Message	Date
Eng Zer Jun	903d30d658	test: use `T.TempDir` to create temporary test directory (#23258) * test: use `T.TempDir` to create temporary test directory This commit replaces `os.MkdirTemp` with `t.TempDir` in tests. The directory created by `t.TempDir` is automatically removed when the test and all its subtests complete. Prior to this commit, temporary directory created using `os.MkdirTemp` needs to be removed manually by calling `os.RemoveAll`, which is omitted in some tests. The error handling boilerplate e.g. defer func() { if err := os.RemoveAll(dir); err != nil { t.Fatal(err) } } is also tedious, but `t.TempDir` handles this for us nicely. Reference: https://pkg.go.dev/testing#T.TempDir Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: fix failing TestSendWrite on Windows === FAIL: replications/internal TestSendWrite (0.29s) logger.go:130: 2022-06-23T13:00:54.290Z DEBUG Created new durable queue for replication stream {"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestSendWrite1627281409\\001\\replicationq\\0000000000000001"} logger.go:130: 2022-06-23T13:00:54.457Z ERROR Error in replication stream {"replication_id": "0000000000000001", "error": "remote timeout", "retries": 1} testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestSendWrite1627281409\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: fix failing TestStore_BadShard on Windows === FAIL: tsdb TestStore_BadShard (0.09s) logger.go:130: 2022-06-23T12:18:21.827Z INFO Using data dir {"service": "store", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestStore_BadShard1363295568\\001"} logger.go:130: 2022-06-23T12:18:21.827Z INFO Compaction settings {"service": "store", "max_concurrent_compactions": 2, "throughput_bytes_per_second": 50331648, "throughput_bytes_per_second_burst": 50331648} logger.go:130: 2022-06-23T12:18:21.828Z INFO Open store (start) {"service": "store", "op_name": "tsdb_open", "op_event": "start"} logger.go:130: 2022-06-23T12:18:21.828Z INFO Open store (end) {"service": "store", "op_name": "tsdb_open", "op_event": "end", "op_elapsed": "77.3µs"} testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestStore_BadShard1363295568\002\data\db0\rp0\1\index\0\L0-00000001.tsl: The process cannot access the file because it is being used by another process. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: fix failing TestPartition_PrependLogFile_Write_Fail and TestPartition_Compact_Write_Fail on Windows === FAIL: tsdb/index/tsi1 TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_PrependLogFile_Write_Failwrite_MANIFEST656030081\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process. --- FAIL: TestPartition_PrependLogFile_Write_Fail/write_MANIFEST (0.06s) === FAIL: tsdb/index/tsi1 TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s) testing.go:1090: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestPartition_Compact_Write_Failwrite_MANIFEST3398667527\002\0\L0-00000003.tsl: The process cannot access the file because it is being used by another process. --- FAIL: TestPartition_Compact_Write_Fail/write_MANIFEST (0.08s) We must close the open file descriptor otherwise the temporary file cannot be cleaned up on Windows. Fixes: `619eb1cae6` ("fix: restore in-memory Manifest on write error") Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: fix failing TestReplicationStartMissingQueue on Windows === FAIL: TestReplicationStartMissingQueue (1.60s) logger.go:130: 2023-03-17T10:42:07.269Z DEBUG Created new durable queue for replication stream {"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"} logger.go:130: 2023-03-17T10:42:07.305Z INFO Opened replication stream {"id": "0000000000000001", "path": "C:\\Users\\circleci\\AppData\\Local\\Temp\\TestReplicationStartMissingQueue76668607\\001\\replicationq\\0000000000000001"} testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestReplicationStartMissingQueue76668607\001\replicationq\0000000000000001\1: The process cannot access the file because it is being used by another process. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: update TestWAL_DiskSize Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * test: fix failing TestWAL_DiskSize on Windows === FAIL: tsdb/engine/tsm1 TestWAL_DiskSize (2.65s) testing.go:1206: TempDir RemoveAll cleanup: remove C:\Users\circleci\AppData\Local\Temp\TestWAL_DiskSize2736073801\001\_00006.wal: The process cannot access the file because it is being used by another process. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> --------- Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2023-03-21 16:22:11 -04:00
Jeffrey Smith II	b819edf095	fix: rename replication fields for better clarity (#24126) * fix: rename replication fields for better clarity * fix: dont rename, only add new field	2023-03-09 13:11:43 -05:00
Jeffrey Smith II	77fd64a975	fix: handle replication missing queue (#24123) * fix: replications should startup after backup/restore * chore: refactor * test: improve logging and handle test better	2023-03-09 13:10:53 -05:00
suitableZebraCaller	ec7fdd3a58	fix: Show Replication Queue size and Replication TCP Errors (#23960) * feat: Show remaining replication queue size * fix: Show non-http related error messages * fix: Show non-http related error messages with backoff * fix: Updates for replication tests * chore: formatting * chore: formatting * chore: formatting * chore: formatting * chore: lowercase json field --------- Co-authored-by: Geoffrey <suitableZebraCaller@users.noreply.github.com> Co-authored-by: Jeffrey Smith II <jeffreyssmith2nd@gmail.com>	2023-02-02 09:47:45 -05:00
Jeffrey Smith II	f026d7bdaf	fix: Fixes migrating when a remote already exists (#23912) * fix: handle migrating with already defined remotes * test: add test to verify migrating already defined remotes * fix: properly handle Up	2022-11-17 14:23:10 -05:00
Ole Kristian (Zee)	666cabb1f4	fix: fix wrong max age transformation from seconds (#23684) * fix: fix wrong max age transformation from seconds * refactor: clarify max age intent * refactor: remove unnecessary duration	2022-11-16 16:18:43 -05:00
Dane Strandboge	6fc66acb0a	fix: do not require remoteOrgID in remote config/creation request (#23838)	2022-11-01 09:47:45 -05:00
Dane Strandboge	55b7d29e4f	fix: sql scan error on remote bucket id when replication to 1.x (#23826)	2022-10-19 14:51:48 -05:00
Jeffrey Smith II	6f50e70960	feat: replicate based on bucket name rather than id (#23638) * feat: add the ability to replicate based on bucket name rather than bucket id. - This adds compatibility with 1.x replication targets * fix: improve error checking and add tests * fix: add additional constraint to replications table * fix: use OR not AND for constraint * feat: delete invalid replications on downgrade * fix: should be less than 2.4 * test: add test around down migration and cleanup migration code * fix: use nil instead of platform.ID(1) for better consistency * fix: fix tests * fix: fix tests	2022-08-18 14:21:59 -04:00
Jeffrey Smith II	090f681737	feat: Add remotes and replications to telemetry (#23456) * feat: start work on remotes/replications phone home data * feat: add remotes/replications phone home data (no tests * refactor: use erroring binary conversions * style: gofmt * refactor: improve some error handling * style: cleanup * feat: add tests * refactor: just list remotes/replications rather than decrement * chore: linting fix Co-authored-by: DStrand1 <dstrandboge@influxdata.com>	2022-06-16 14:48:06 -04:00
Dane Strandboge	9e556864a3	fix: replications remote write failure can deadlock remote writer (#23458)	2022-06-16 11:57:24 -05:00
Jeffrey Smith II	692b0d5153	feat: add instance-id flag for identifying edge nodes (#23447) * feat: add instance-id flag for identifying edge nodes * refactor: rename tag to _instance_id	2022-06-16 12:18:11 -04:00
Dane Strandboge	9e20f9f3dc	feat: add signifier to replication user agent (#23370)	2022-05-31 11:50:53 -05:00
Dane Strandboge	82d1123e78	build: upgrade to Go 1.18.1 (#23252)	2022-04-13 15:24:27 -05:00
Dane Strandboge	359fcc46b5	feat: add maximum age to replication queues (#23206) Co-authored-by: Sam Arnold <sarnold@influxdata.com>	2022-03-25 13:06:05 -05:00
Sam Arnold	7c0ec4dd2c	fix: replications replicates flux to() writes (#23188) Fixes a few issues: * flux needs to write to the replication service, instead of the engine directly. * the replication service incorrectly had value receiver methods, I think this was just an accident. Pointer receivers make things easier to reason about. Also with value receivers flux was not picking up the replication config properly. * The flux to() function previously did not receive the org properly for internal writes. Previously this was not necessary as the write path only needs the bucket ID at this level (after authentication). But now we need the org id to look up replications properly. Closes #23183	2022-03-14 12:17:58 -04:00
Sam Arnold	e20b5e99a6	fix: remove nats for scraper processing (#23107) * fix: remove nats for scraper processing Scrapers now use go channels instead of NATS and interprocess communication. This should fix #23085. Additionally, found and fixed #23106. * chore: fix formatting * chore: fix static check and go.mod * test: fix some flaky tests * fix: mark NATS arguments as deprecated	2022-02-10 11:23:18 -05:00
William Baker	c1d384de19	test: fix flaky enqueue test (#23035)	2022-01-10 08:04:59 -08:00
mcfarlm3	60234964d0	refactor: replications local write optimization (#22993) * refactor: eliminate sqlite query in case of no configured replications * refactor: updated write-related tests to reflect tracking of orgID and localBucket by the queue manager * refactor: removed redundant trackedReplications field * refactor: corrected slice init in GetReplications and added TestGetReplications * refactor: eliminated tracked package and moved TrackedReplication struct to influxdb package via replication.go * chore: ran make fmt * fix: added closeRq function back in to address flaky tests * refactor: small changes to queue manager test based on code review	2021-12-15 12:32:46 -08:00
William Baker	5a919b69d7	feat: enable remotes and replication streams feature (#22990)	2021-12-13 16:01:50 -06:00
William Baker	0e5b14fa5e	chore: increase replications batch size limits (#22983)	2021-12-13 11:02:38 -06:00
William Baker	a7a5233432	feat: advance queue scanner periodically instead of every remote write (#22981)	2021-12-13 10:09:36 -06:00
William Baker	e3ff434f81	test: fix flaky replications tests (#22973) * fix: fix test and run 20 times * fix: unfix and run test 20 times * test: wait for rq run fn to return in tests	2021-12-08 14:48:25 -06:00
William Baker	e5cbd279ee	fix: advance replications queue after successful remote writes (#22967) * fix: advance replications queue after successful remote writes to prevent data duplication on errors * fix: loop on sendwrite * chore: remove flaky test * chore: add TODO about future optimization	2021-12-08 12:52:46 -06:00
William Baker	6096ee2ad4	feat: replications metrics include failure to enqueue (#22962) * feat: replications metrics include failure to enqueue	2021-12-02 14:42:55 -06:00
mcfarlm3	28bcd416b2	feat: batch replications remote writes to avoid payload limit errors (#22914) * feat: batch replications remote writes appropriately to avoid payload limit errors * chore: ran make fmt * chore: fixed staticcheck failure * refactor: removed batching code from queue manager * refactor: batch writes before gzip compression * fix: add in missing bracket after merge * fix: removed duplicate lines of code from WritePoints function * feat: add batching functionality for remote writes * refactor: removed batch index variable	2021-12-02 12:04:10 -08:00
William Baker	e4e16335f5	fix: replications remote writes do not block server shutdown (#22958) * fix: replications remote writes do not block server shutdown * fix: don't leak goroutine	2021-12-02 12:04:52 -06:00
William Baker	3460f1cc52	feat: replication remote writes do not block local writes (#22956) * feat: replication remote writes do not block local writes	2021-12-01 15:37:10 -06:00
William Baker	f05d0136f1	feat: metrics collection for replications remote writes (#22952) * feat: metrics collection for replications remote writes * fix: don't update metrics with 204 error code on successful writes	2021-12-01 12:41:24 -06:00
William Baker	9873ccd657	feat: remote write function for replications (#22942) * feat: remote write function for replications * chore: implement UpdateResponseInfo store method * chore: only set gzip heading for non-empty requests * fix: address review feedback	2021-11-30 15:33:42 -06:00
William Baker	f47d514225	refactor: move replications store functionality to separate package (#22923) * refactor: move replications store functionality to separate package * fix: make opening all repls on startup work right	2021-11-24 11:45:19 -06:00
William Baker	3a81166812	feat: added metrics collection for replications (#22906) * feat: added metrics collection for replications * fix: fixed panic when restarting * fix: fix panic pt2 * chore: self-review fixes * chore: simplify test	2021-11-22 11:40:03 -06:00
Dane Strandboge	6ee472725f	refactor: use remote write func in NewDurableQueueManager (#22888)	2021-11-19 11:31:10 -06:00
William Baker	ad52815e19	feat: add field for dropping data resulting in non-retryable errors to individual replications (#22885) * feat: add field for dropping data resulting in non-retryable errors to individual replications	2021-11-16 13:41:54 -07:00
Dane Strandboge	40d9587ece	feat: add replications queue scanner (#22873) Co-authored-by: “mcfarlm3” <“58636946+mcfarlm3@users.noreply.github.com”>	2021-11-16 10:30:52 -06:00
Daniel Moran	6b56af3c3f	feat: mirror writes to registered replications (#22833)	2021-11-10 08:25:47 -05:00
mcfarlm3	cd0243d2b4	feat: added replications queue management to launcher tasks (#22820) * feat: added replications queue management to launcher tasks * refactor: separated sql logic into replications service rather than durable queue manager * refactor: extended replications feature flag to launcher code and minor change to startup function param * chore: added unit test coverage for replications server startup queue management * refactor: made error messages reusable and factored out unecessary string from queue management tests * refactor: changed queue management error names to pass linter check	2021-11-09 11:32:07 -08:00
Daniel Moran	1aac92c5ee	refactor: remove replications.current_queue_size_bytes from sqlite (#22832) Maintaining the current queue size in a SQL column would require updating the DB on every queue operation. Avoid that contention by instead looking up the current size on the in-memory durable queue struct, which is already tracked & updated as data enters & leaves the queue.	2021-11-05 14:35:12 -04:00
William Baker	f7573f43a7	feat: sql migrator can do down migrations (#22806) * feat: sql down migrations * refactor: different name for up migrations * chore: update migrations ref in svc tests * build: add lint step to verify sql migration names match	2021-11-01 14:30:18 -06:00
mcfarlm3	8825cd5d50	feat: replication apis durable queue management (#22719) * feat: added durable queue management to replications service * refactor: improved mapping of replication streams to durable queues * refactor: modified replication stream durable queues to use user-specified engine path * chore: generated test mocks for replications DurableQueueManager * chore: add test coverage for replications durable queue manager * refactor: made changes based on code review, added mutex to durableQueueManager, improved error logging * chore: ran make fmt * refactor: further improvements to error logging	2021-10-26 12:14:29 -07:00
Daniel Moran	58139c47b2	feat: add auth to remotes & replications APIs (#22744)	2021-10-26 11:32:35 -04:00
Daniel Moran	7c19225bed	feat: implement replication validation (#22581)	2021-10-05 14:34:38 -04:00
Daniel Moran	153a89dba0	feat: deleting a bucket also deletes all associated replications (#22424)	2021-09-09 15:22:36 -04:00
Daniel Moran	1fa0ccf24a	refactor: move interfaces for remotes & replication services out of root package (#22417)	2021-09-07 16:21:29 -04:00
Daniel Moran	12c8fd28d2	feat: implement metadata management for replications (#22302)	2021-09-01 12:01:41 -04:00
Daniel Moran	b37ad79e20	feat: add logging and metrics middlewares to replications API (#22291)	2021-08-24 14:56:56 -04:00
Daniel Moran	641c02f9a8	feat: add APIs for management of replication streams (#22287)	2021-08-24 14:19:03 -04:00

47 Commits (398660438f36d715b9370ed8da37428121b013c8)