issue: #33285
pr: #37815
- remove the RPC layer between coordinators when running in standalone
or mixcoord mode
- move the health check into init
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: https://github.com/milvus-io/milvus/issues/37764
- add a local client that calls the local server directly for
querycoord/rootcoord/datacoord.
- enable the local client if milvus is running in mixcoord or standalone
mode.
Signed-off-by: chyezh <chyezh@outlook.com>
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Co-authored-by: Zhen Ye <chyezh@outlook.com>
issue: #33285
pr: #37722
- move most cgo operations related to search/query into the segcore
package so that streamingnode can reuse them.
- add Go unit tests for segcore operations.
Signed-off-by: chyezh <chyezh@outlook.com>
Cherry-pick from master
pr: #35928
Related to #35927
This PR addresses several issues:
- Use the `ResetTraceConfig` method instead of the init one in the update event handler
- Implement a dynamic stats.Handler to receive tracing config update events
- Update the `enable_trace` flag when `ResetTraceConfig` is invoked
- Change `enable_trace` to `std::atomic<bool>` to avoid a data race
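
The flag change above is on the C++ side; the same pattern can be sketched in Go as an analogue. `traceEnabled`, `resetTraceFlag`, and `isTraceEnabled` are hypothetical names, not part of the PR: a config-update handler stores the flag atomically while request paths load it concurrently.

```go
package main

import "sync/atomic"

// traceEnabled is a hypothetical process-wide flag. atomic.Bool
// (Go 1.19+) plays the role that std::atomic<bool> plays in C++.
var traceEnabled atomic.Bool

// resetTraceFlag is a stand-in for the part of a config-reset routine
// that updates the flag; safe to call while readers are running.
func resetTraceFlag(enabled bool) {
	traceEnabled.Store(enabled)
}

// isTraceEnabled is what a hot path would call before creating spans.
func isTraceEnabled() bool {
	return traceEnabled.Load()
}
```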
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #35719
pr: #35720
In standalone mode, block the start process until the new coordinator is
active to avoid the coexistence of the old coordinator and the new
node/proxy
1. In the start/restart process, the new coordinator will become active
immediately and will not be blocked
2. In the rolling upgrade process, the new coordinator will not be
active until the old coordinator is down, and it will be blocked
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
pr: #34953
After a component is manually stopped through the management RESTful
API, `healthz` may return an unhealthy state. k8s may then restart the
pod to recover from the unhealthy state, which defeats the manual stop
operation.
To solve this, make the `healthz` API skip manually stopped components.
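
The skip logic can be sketched as follows. `ComponentState` and `OverallHealthy` are hypothetical names used only for illustration, not the real handler.

```go
package main

// ComponentState is a hypothetical per-component record.
type ComponentState struct {
	Role            string
	Healthy         bool
	ManuallyStopped bool
}

// OverallHealthy reports whether every component that was not stopped
// on purpose is healthy. Manually stopped components are ignored so
// that a deliberate stop does not trigger a pod restart.
func OverallHealthy(states []ComponentState) bool {
	for _, s := range states {
		if s.ManuallyStopped {
			continue
		}
		if !s.Healthy {
			return false
		}
	}
	return true
}
```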
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #32698
pr: #32076
This PR adds two REST APIs for stopping a component and checking its status:
1. `/management/stop?role=querynode` can stop the specified component
2. `/management/check/ready?role=rootcoord` can check whether the target
component is serviceable
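
A small helper for building the two request URLs above. The base address passed in (for example `http://localhost:9091`) is an assumption, as is the helper name `ManagementURL`; the HTTP method is not stated here, so the sketch only builds the URL and does not send a request.

```go
package main

import "net/url"

// ManagementURL joins a base address, one of the management paths
// (/management/stop or /management/check/ready) and the role parameter.
func ManagementURL(base, path, role string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = path
	q := url.Values{}
	q.Set("role", role)
	u.RawQuery = q.Encode()
	return u.String(), nil
}
```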
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Fix the etcd config source not respecting the auth-enabled setting.
Also remove the pulsar recoverable error when pulsar returns ConsumerBusy.
It can happen that pulsar has not yet detected that the original consumer
is dead, and recovery takes some time.
fix #31631
Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
1. set the coordinator graceful stop timeout to 5s
2. change the stop order of the datacoord component
3. change the querynode graceful stop timeout to 900s; this could
potentially be lowered to 600s once graceful stop is smooth
issue: #30310
also see pr: #30306
---------
Signed-off-by: chyezh <chyezh@outlook.com>
See also #30211
After the initialization fix, distributed components do not have
their role set. This causes the logger and tracing to miss component
service info when recording information.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #30176
Move paramtable.Init after env setup in roles.Run. Also introduce a
flag for mixture runs so that the role is set correctly in mixture mode.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also: #25323, #29969
Many users reported that the log file name is incorrect when starting
in mixture mode.
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>