congqixia
d635495885
fix: [2.3] Make coordinator `Register` not blocked on ProcessActiveStandby( #32069 ) ( #32133 )
...
Cherry-pick from master
pr: #32069
See also #32066
This PR makes coordinator registration succeed without waiting and lets
`ProcessActiveStandBy` run asynchronously. As a result, roles can receive
the stop signal and notify servers.
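The idea can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `coordinator` type, its fields, and the two channels are hypothetical stand-ins, with `becomeActive` playing the role of the blocking `ProcessActiveStandBy` wait.

```go
package main

import "sync"

// coordinator is a hypothetical stand-in for a coordinator server.
// It is not the Milvus type.
type coordinator struct {
	mu         sync.Mutex
	registered bool
	active     bool
	done       chan struct{} // closed when the standby goroutine exits
}

// Register marks the coordinator as registered and returns right away.
// Waiting to become active (the part that used to block) happens in a
// goroutine that also watches the stop channel, so a stop signal is
// never stuck behind the active-standby wait.
func (c *coordinator) Register(stop, becomeActive <-chan struct{}) {
	c.mu.Lock()
	c.registered = true
	c.mu.Unlock()

	c.done = make(chan struct{})
	go func() {
		defer close(c.done)
		select {
		case <-becomeActive:
			c.mu.Lock()
			c.active = true
			c.mu.Unlock()
		case <-stop:
			// Stopped while still standby: exit without activating.
		}
	}()
}

// state reports whether the coordinator is registered and active.
func (c *coordinator) state() (registered, active bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.registered, c.active
}
```

With this shape, a standby coordinator that never wins the election can still be stopped by closing `stop`.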
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-11 17:33:21 +08:00
wei liu
0bf595a513
enhance: Speed up target recovery after query coord restart ( #31240 ) ( #31449 )
...
issue: #28491
pr: #31240
After querycoord restarts, it pulls a new target, which includes the
channel and segment lists. Once the segments loaded on the querynodes reach
the target, the collection can serve search/query. But if the segment
list changes over time, then after querycoord pulls a new target it takes a
few minutes to catch up with the target's segment distribution, and until
then query/search fails due to missing segments.
This PR saves the currently loaded target to the meta store during
querycoord's stop process and recovers it when querycoord starts, to
shorten the target recovery time.
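A rough sketch of the save-on-stop / recover-on-start flow is shown below. Everything in it is an assumption made for illustration: the `target` struct, the in-memory `metaStore` map, the JSON encoding, and the `targetKey` name are hypothetical and do not reflect the actual querycoord types or storage layout.

```go
package main

import "encoding/json"

// target is a hypothetical, simplified view of a loaded target.
type target struct {
	CollectionID int64    `json:"collection_id"`
	Channels     []string `json:"channels"`
	Segments     []int64  `json:"segments"`
}

// metaStore is an in-memory stand-in for the real meta store.
type metaStore map[string][]byte

// targetKey is a made-up key used only in this sketch.
const targetKey = "querycoord/current-target"

// saveTarget persists the currently loaded target; call it while stopping.
func saveTarget(store metaStore, t target) error {
	data, err := json.Marshal(t)
	if err != nil {
		return err
	}
	store[targetKey] = data
	return nil
}

// recoverTarget loads the saved target on start. The boolean is false
// when nothing was saved, in which case the caller has to pull a fresh
// target as before.
func recoverTarget(store metaStore) (target, bool, error) {
	data, ok := store[targetKey]
	if !ok {
		return target{}, false, nil
	}
	var t target
	if err := json.Unmarshal(data, &t); err != nil {
		return target{}, false, err
	}
	return t, true, nil
}
```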
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-22 10:27:17 +08:00
jaime
5ddb0b435f
fix: revoke session may be ignored due to server context cancellation in advance ( #31213 )
...
issue: #31219
pr: #31220
Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-14 19:05:04 +08:00
SimFG
ef84d40e54
enhance: [2.3] make the watch dm channel request better compatibility ( #30954 )
...
pr: #30952
issue: https://github.com/milvus-io/milvus/issues/30938
Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-01 16:09:01 +08:00
chyezh
77e123762f
enhance: add graceful stop timeout to avoid node stop hang under extreme cases ( #30320 )
...
1. Set the graceful stop timeout of coordinators and proxy to 5s.
3. Set the graceful stop timeout of other worker nodes to 900s; we should
potentially change this to 600s once graceful stop is smooth.
4. Change the order in which the datacoord components stop.
5. `LivenessCheck` no longer performs a graceful shutdown.
issue: https://github.com/milvus-io/milvus/issues/30310
pr: #30317
also see: https://github.com/milvus-io/milvus/pull/30306
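A graceful stop bounded by a timeout can be sketched as below. This is only an illustration of the pattern, not the code from the PR: `stopWithTimeout` and the two constant names are hypothetical, and only the 5s and 900s values come from the commit message above.

```go
package main

import "time"

// Timeout values taken from the commit message; the names are made up.
const (
	coordinatorStopTimeout = 5 * time.Second
	workerStopTimeout      = 900 * time.Second
)

// stopWithTimeout runs gracefulStop in the background and waits at most
// timeout for it to finish. It returns true if the graceful stop
// completed in time and false if the caller should fall back to a
// forced stop.
func stopWithTimeout(gracefulStop func(), timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		gracefulStop()
	}()

	timer := time.NewTimer(timeout)
	defer timer.Stop()

	select {
	case <-done:
		return true
	case <-timer.C:
		return false
	}
}
```

Note that on timeout the goroutine running `gracefulStop` is left behind; in this sketch that is acceptable because the process is expected to exit right after a forced stop.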
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-01-27 08:45:02 +08:00
congqixia
9e8eb2aa51
fix: Revert leader checker related check ( #30262 )
...
See also #30150
PR reverted: #29984 #30152
Currently this scenario could not be covered by ut/it/e2e test cases
Revert it for now
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-25 12:39:02 +08:00
wei liu
7d73032582
enhance: refactor leader_observer to leader_checker ( #29454 ) ( #29984 )
...
issue: #29453
pr: #29452
Syncing distribution by RPC also calls loadSegment/releaseSegment,
which may cause all kinds of concurrency issues on the same segment, such as
a concurrent load and release of one segment.
This PR adds leader_checker, which generates load/release tasks to correct
the leader view instead of calling sync distribution by RPC.
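One way to picture a checker that emits tasks instead of issuing RPCs directly is a set difference between what the leader view should contain and what it currently contains. The sketch below assumes exactly that; `checkLeaderView`, `task`, and `taskKind` are hypothetical names, and the real leader_checker may compare different state.

```go
package main

import "sort"

// taskKind and task are hypothetical types for this sketch.
type taskKind string

const (
	loadTask    taskKind = "load"
	releaseTask taskKind = "release"
)

type task struct {
	Kind      taskKind
	SegmentID int64
}

// checkLeaderView compares the segments the leader view should contain
// (target) with the segments it currently has (view). It returns load
// tasks for missing segments and release tasks for redundant ones,
// sorted by segment ID. Handing these tasks to a scheduler, rather than
// calling the node directly, lets operations on one segment be
// serialized in a single place.
func checkLeaderView(target, view map[int64]struct{}) []task {
	var tasks []task
	for id := range target {
		if _, ok := view[id]; !ok {
			tasks = append(tasks, task{Kind: loadTask, SegmentID: id})
		}
	}
	for id := range view {
		if _, ok := target[id]; !ok {
			tasks = append(tasks, task{Kind: releaseTask, SegmentID: id})
		}
	}
	sort.Slice(tasks, func(i, j int) bool {
		return tasks[i].SegmentID < tasks[j].SegmentID
	})
	return tasks
}
```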
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-18 14:08:54 +08:00
wei liu
26b1853c54
fix: Auto balance param can't be updated by dynamic( #29501 ) ( #29502 )
...
pr: #29501
This PR fixes the issue that the auto balance param can't be updated dynamically
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-27 14:30:53 +08:00
SimFG
74e72ce27e
enhance: [2.3] Support to get the param value in the runtime ( #29298 )
...
pr: #29297
/kind improvement
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-12-21 20:36:43 +08:00
MrPresent-Han
5f4ac437b2
enhance: [Cherry-pick] Moving etcd client into session ( #27069 ) ( #28996 )
...
relate: #26694
pr: https://github.com/milvus-io/milvus/pull/27069
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
Co-authored-by: Filip Haltmayer <81822489+filip-halt@users.noreply.github.com>
2023-12-07 16:22:34 +08:00
jaime
9378f78218
enhance: Add logs for each step during service initialization ( #28687 )
...
/kind improvement
pr: #28624
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-11-27 17:54:26 +08:00
congqixia
6512b12fba
enhance: [cherry-pick] Make etcd kv request timeout configurable ( #28661 ) ( #28701 )
...
Cherry-pick from master
pr: #28661
See also #28660
This PR adds a config item for the etcd kv request timeout
and syncs the default timeout to the same value for the etcdKV & tikv configs
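A configurable per-request timeout is typically applied by deriving a context with a deadline for each call. The sketch below assumes that approach; `kvConfig`, `kvClient`, and `blockingGet` are hypothetical names and not the Milvus etcd kv API, and no default value is shown because the commit message does not state one.

```go
package main

import (
	"context"
	"time"
)

// kvConfig holds the configurable request timeout (hypothetical type).
type kvConfig struct {
	RequestTimeout time.Duration
}

// kvClient wraps a backend call; get stands in for the real kv request.
type kvClient struct {
	cfg kvConfig
	get func(ctx context.Context, key string) (string, error)
}

// Get issues one request bounded by the configured timeout.
func (c *kvClient) Get(key string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), c.cfg.RequestTimeout)
	defer cancel()
	return c.get(ctx, key)
}

// blockingGet is a fake backend that never answers. It returns only
// when the context expires, which makes the timeout visible.
func blockingGet(ctx context.Context, key string) (string, error) {
	<-ctx.Done()
	return "", ctx.Err()
}
```

With `blockingGet` as the backend, `Get` returns an error once `RequestTimeout` has elapsed instead of hanging forever.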
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-24 21:16:26 +08:00
wei liu
d3f149c403
fix unstable auto balance config ut ( #28289 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-09 10:02:19 +08:00
yah01
385507ce47
Fix the target updated before version updated to cause data missing ( #28257 )
...
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-08 18:54:18 +08:00
wei liu
918333817e
Disable auto balance when old node exists ( #28191 ) ( #28224 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-08 07:10:17 +08:00
wei liu
87e8d04ed7
fix sync distribution with wrong version ( #28130 ) ( #28170 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-06 11:38:18 +08:00
wei liu
178db7b0f0
check stopping node during start qc ( #27859 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-10-24 12:20:11 +08:00
jaime
ec1fe3549e
Add a stop hook to clean session ( #27564 )
...
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-16 10:24:10 +08:00
yah01
be980fbc38
Refine state check ( #27541 )
...
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-11 21:01:35 +08:00
yah01
a8ce1b6686
Refine QueryCoord stopping ( #27371 )
...
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-27 16:27:27 +08:00
jaime
7f7c71ea7d
Decoupling client and server API in types interface ( #27186 )
...
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-09-26 09:57:25 +08:00
SimFG
26f06dd732
Format the code ( #27275 )
...
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
yiwangdr
337edc321b
tikv integration ( #26246 )
...
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2023-09-07 07:25:14 +08:00
SimFG
28681276e2
Improve the retry of the rpc client ( #26795 )
...
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-06 17:43:14 +08:00
yah01
3349db4aa7
Refine errors to remove changes breaking design ( #26521 )
...
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-04 09:57:09 +08:00
wei liu
949c320185
remove pull target from qc recover ( #26775 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-09-01 11:17:01 +08:00
yihao.dai
63b86b32a6
Add server id validation interceptor ( #26395 )
...
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-08-17 20:20:20 +08:00
Enwei Jiao
7d61355ab0
Refactor log for Query ( #26310 )
...
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-08-14 18:57:32 +08:00
Bingyi Sun
a3e22786ed
Move meta store to kv catalog ( #25915 )
...
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-07-31 13:57:04 +08:00
congqixia
1045c88102
Support replace indexed field in QueryCoord ( #25747 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-07-19 21:22:58 +08:00
wei liu
68ae199a9f
load segment with target version, avoid read redundant segment ( #24929 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-06-27 11:48:45 +08:00
xige-16
33c2012675
Add more metrics ( #25081 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-06-26 17:52:44 +08:00
Bingyi Sun
b88e74a109
Fix querycoord close error ( #25034 )
...
Signed-off-by: sunby <bingyi.sun@zilliz.com>
Co-authored-by: sunby <bingyi.sun@zilliz.com>
2023-06-21 11:02:42 +08:00
congqixia
41af0a98fa
Use go-api/v2 for milvus-proto ( #24770 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-06-09 01:28:37 +08:00
Jiquan Long
30415e1b83
Fix metric QueryCoordNumCollections ( #24053 ) ( #24107 )
...
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-05-15 16:33:22 +08:00
congqixia
ed81eaa963
Make CollectionObserver trigger checker more frequently during load procedure ( #23928 )
...
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-05-08 14:06:41 +08:00
Xiaofan
87d790f052
Fix upgrade cause panic ( #23833 )
...
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-05-02 14:06:37 +08:00
wei liu
1deac692a0
fix nodeup block ( #23634 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-25 19:20:37 +08:00
wei liu
4336ed8609
fix node up ( #23415 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-20 09:52:31 +08:00
cai.zhang
43a9e175a3
Exit component process when session key is deleted ( #21658 ) ( #22164 )
...
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-04-12 20:12:28 +08:00
Xiaofan
680ad482b7
Check balance checker chore to 10s ( #23304 )
...
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-04-09 16:14:32 +08:00
jaime
c9d0c157ec
Move some modules from internal to public package ( #22572 )
...
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-04-06 19:14:32 +08:00
MrPresent-Han
afd874b736
enhance segment balance by considering global rowCount( #22914 ) ( #23056 )
...
Signed-off-by: MrPresent-Han <jamesharden11122@gmail.com>
Co-authored-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-04-03 14:16:25 +08:00
yah01
75737c65ac
Refine error handle of QueryCoord ( #23068 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-31 10:54:29 +08:00
zhenshan.cao
1287ca699a
Refine usage of TimeRecorder.Record ( #23142 )
...
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2023-03-30 18:56:22 +08:00
yah01
081572d31c
Refactor QueryNode ( #21625 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>
2023-03-27 00:42:00 +08:00
yihao.dai
1f718118e9
Dynamic load/release partitions ( #22655 )
...
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-03-20 14:55:57 +08:00
SimFG
b57e476089
Fix the nil point about the session ( #22748 )
...
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-03-14 20:07:54 +08:00
wei liu
c162c6ecc0
fix assign node err ( #22479 )
...
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-03-01 11:11:47 +08:00
Enwei Jiao
697dedac7e
Use cockroachdb/errors to replace other error pkg ( #22390 )
...
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-02-26 11:31:49 +08:00