108 Commits (eb046863485fdf3e130fc60484485c901b81276b)

Author SHA1 Message Date
congqixia cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Zhen Ye c84a0748c4
enhance: add rw/ro streaming query node replica management (#38677)
issue: #38399

- Embed the query node into streaming node to make delegator available
at streaming node.
- The embedded query node has a special server label
`QUERYNODE_STREAMING-EMBEDDED`.
- Change the balance strategy so that channels are assigned to streaming
nodes as much as possible.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-24 16:55:07 +08:00
cai.zhang 6d45dd5666
fix: Add scalar index engine version for compatibility (#39204)
issue: #39203

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-15 12:25:00 +08:00
tinswzy 27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines exists log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
congqixia b0bd290a6e
enhance: Use internal json(sonic) to replace std json lib (#37708)
Related to #35020

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 10:46:31 +08:00
wei liu a03157838b
enhance: Enable node assign policy on resource group (#36968)
issue: #36977
With node_label_filter on a resource group, users can label a querynode
through the env `MILVUS_COMPONENT_LABEL`; the resource group then prefers
to accept nodes that match its node_label_filter.

Querynodes can thus be grouped by label, with querynodes that share a
label placed in the same resource group.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-11-08 11:18:27 +08:00
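
The label preference described in #36968 can be sketched roughly as follows. This is a minimal illustration, not the Milvus implementation: the `node` and `labelFilter` types and the `preferred` helper are hypothetical names, and the sketch assumes labels are plain key/value pairs, however `MILVUS_COMPONENT_LABEL` actually encodes them.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a query node and its labels.
type node struct {
	ID     int64
	Labels map[string]string
}

// labelFilter is a hypothetical node_label_filter: every key/value pair
// must be present on a node for that node to match.
type labelFilter map[string]string

// matches reports whether n carries every label required by f.
func (f labelFilter) matches(n node) bool {
	for k, v := range f {
		got, ok := n.Labels[k]
		if !ok || got != v {
			return false
		}
	}
	return true
}

// preferred returns matching nodes first, followed by the rest, so a
// resource group can prefer (but not require) nodes that match its filter.
func preferred(f labelFilter, nodes []node) []node {
	var match, rest []node
	for _, n := range nodes {
		if f.matches(n) {
			match = append(match, n)
		} else {
			rest = append(rest, n)
		}
	}
	return append(match, rest...)
}

func main() {
	nodes := []node{
		{ID: 1, Labels: map[string]string{"tier": "cold"}},
		{ID: 2, Labels: map[string]string{"tier": "hot"}},
		{ID: 3},
	}
	// Node 2 matches the filter, so it is listed before nodes 1 and 3.
	for _, n := range preferred(labelFilter{"tier": "hot"}, nodes) {
		fmt.Println(n.ID)
	}
}
```

Ordering candidates rather than rejecting non-matching ones mirrors the word "prefer" in the commit message; a strict filter would simply drop the `rest` slice.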
chyezh cc8f7aa110
fix: streaming service related fix patch (#34696)
issue: #33285

- add idAlloc interface
- fix binary unsafe bug for message
- fix service discovery loss when an address is repeated with a different
server id

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-16 15:49:38 +08:00
congqixia 25a1c9ecf0
fix: Make coordinator `Register` not blocked on ProcessActiveStandby (#32069)
See also #32066

This PR makes coordinator registration succeed and lets
`ProcessActiveStandBy` run asynchronously. Roles can then receive the stop
signal and notify servers.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-10 18:49:18 +08:00
congqixia 357fe814ce
fix: Remove unnecessary deleteSession operation (#31647)
See also #31628

The `Revoke` operation deletes all keys attached to the lease. This
`deleteSession` operation may also remove the session key of the next
epoch by mistake and leave the session status inconsistent.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-29 13:57:11 +08:00
jaime db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
yiwangdr 32cff25f97
enhance: decrease coordinator init time (#29822)
This PR mainly improves two items:
1. Target observer should refresh loading status during init time. An
uninitialized loading status blocks search/query. Currently, the target
observer refreshes every 10 seconds, i.e. we'd need to wait for 10s for
no reason. That's also the reason why we constantly see false log
"collection unloaded" upon mixcoord restarts.
2. Delete the session when the service is stopped, so that the new service
doesn't need to wait for the previous session to expire (~10s).

Item 1 is the major improvement of this PR, which should speed up init
time by 10s.
Item 2 is not a big concern in most cases as coordinators usually shut
down after stop(). In those cases, coordinator restart triggers serverID
change which further triggers an existing logic that deletes expired
session. This PR only fixes rare cases where serverID doesn't change.

integration test:
`go test -tags dynamic -v -coverprofile=profile.out -covermode=atomic
tests/integration/coordrecovery/coord_recovery_test.go -timeout=20m`
Performance after the change:
Average init time of coordinators: 10s
Hardware: M2 Pro
Test setup: 1000 collections with 1000 rows (dim=128) per collection.


issue: #29409

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-05 14:00:12 +08:00
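
Item 1 of #29822 above amounts to refreshing once at startup instead of waiting for the first periodic tick. A minimal sketch of that pattern, under stated assumptions: `runObserver` and its parameters are hypothetical, and the tick source is a plain channel so the example stays deterministic (a real observer would more likely use a `time.Ticker`).

```go
package main

import "fmt"

// runObserver calls refresh once right away and then once per tick, until
// done is closed or ticks is closed. The up-front call is the point of the
// sketch: nothing has to wait a full interval for the first refresh.
func runObserver(done <-chan struct{}, ticks <-chan struct{}, refresh func()) {
	refresh() // initial refresh, before any tick arrives
	for {
		select {
		case <-done:
			return
		case _, ok := <-ticks:
			if !ok {
				return
			}
			refresh()
		}
	}
}

func main() {
	// Two queued ticks, then the tick source is closed.
	ticks := make(chan struct{}, 2)
	ticks <- struct{}{}
	ticks <- struct{}{}
	close(ticks)

	calls := 0
	runObserver(nil, ticks, func() { calls++ })
	fmt.Println("refresh calls:", calls) // one initial call plus two ticks
}
```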
smellthemoon 1c1f2a1371
enhance: change some logs (#29579)
related #29588

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-05 16:12:48 +08:00
wei liu 5b45a138b1
disable auto balance when old node exists (#28191)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-07 14:02:20 +08:00
Xiaofan da19e49daf
Support purge old session for standalone (#28184)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-11-06 21:21:42 +08:00
yah01 9658367a3c
Refine chunk manager errors (#27590)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-31 12:18:15 +08:00
Filip Haltmayer 6b1a106a31
Moving etcd client into session (#27069)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-10-27 07:36:12 +08:00
jaime ac2d1bb5c2
Support receive signals from parent process (#27756)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-18 20:20:11 +08:00
jaime ec1fe3549e
Add a stop hook to clean session (#27564)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-16 10:24:10 +08:00
Jiquan Long e4f73cc805
Add host & enable_disk to session (#27507)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-10-08 20:05:31 +08:00
Jiquan Long 5c1abfa2cc
Print the server id when active-standby switch (#27119)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-10-07 10:01:31 +08:00
Jiquan Long 0f14d18201
Optimize the codec code of session (#27360)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-10-01 10:33:30 +08:00
foxspy 5db4a0489e
dynamic index version control (#27335)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-25 21:39:27 +08:00
wei liu 9433a24f5d
fix component not exit when liveness check failed (#27236)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-09-22 19:13:25 +08:00
SimFG 26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
congqixia 16b35e07b3
Fix `TestSessionSuite/TestKeepAliveRetryActiveCancel` unit test logic (#27231)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-20 18:59:23 +08:00
congqixia f0d0651989
Do not reset connection immediately if grpc code is `Canceled` or `DeadlineExceeded` (#27014)
We found many connection resets and cancellations caused by the recent retry change.
The current implementation resets the connection no matter what the error code is.
To match the previous retry behavior, skip the reset for cancel errors unless they happen too often.

Also adds a config item, minResetInterval, for grpc connection reset

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-13 15:01:18 +08:00
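
The reset rule described in #27014 above can be illustrated with a small decision function. This is only a sketch under assumptions, not the Milvus client code: the `code` type stands in for gRPC status codes, and `resetPolicy`, `maxCancelErrors`, `event`, and `replay` are hypothetical names; only `minResetInterval` is taken from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// code is a hypothetical stand-in for a gRPC status code.
type code int

const (
	codeUnavailable code = iota
	codeCanceled
	codeDeadlineExceeded
)

// resetPolicy decides when a failed call should trigger a connection reset.
type resetPolicy struct {
	minResetInterval time.Duration // minimum gap between two resets
	maxCancelErrors  int           // hypothetical threshold for "too often"
	lastReset        time.Time
	cancelErrors     int
}

// shouldReset reports whether an error with code c seen at time now should
// reset the connection.
func (p *resetPolicy) shouldReset(c code, now time.Time) bool {
	if c == codeCanceled || c == codeDeadlineExceeded {
		p.cancelErrors++
		if p.cancelErrors < p.maxCancelErrors {
			return false // tolerate occasional cancellations and timeouts
		}
	}
	if !p.lastReset.IsZero() && now.Sub(p.lastReset) < p.minResetInterval {
		return false // a reset happened too recently
	}
	p.lastReset = now
	p.cancelErrors = 0
	return true
}

// event is an error observed at an offset from an arbitrary start time.
type event struct {
	c  code
	at time.Duration
}

// replay feeds events through p and records each decision.
func replay(p *resetPolicy, events []event) []bool {
	start := time.Unix(0, 0)
	out := make([]bool, 0, len(events))
	for _, e := range events {
		out = append(out, p.shouldReset(e.c, start.Add(e.at)))
	}
	return out
}

func main() {
	p := &resetPolicy{minResetInterval: time.Second, maxCancelErrors: 3}
	fmt.Println(replay(p, []event{
		{codeCanceled, 0},
		{codeDeadlineExceeded, 0},
		{codeCanceled, 0},                        // third cancel-like error
		{codeUnavailable, 500 * time.Millisecond}, // inside minResetInterval
		{codeUnavailable, 2 * time.Second},
	}))
}
```

Passing the clock in as an argument keeps the decision logic free of real time, which makes the policy easy to exercise without sleeping.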
wei liu 0e2085b77f
fix dc standby to active (#26810)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-09-06 10:41:49 +08:00
congqixia 2b367b6bb0
Fix sessionutil Liveness check block in watch forever (#26248)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-10 14:07:16 +08:00
congqixia 7dfc8fbf0a
Fix data race on keepAliveCancel (#26087)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-02 18:55:07 +08:00
congqixia 8b11636e72
Cancel previous ctx for session retry keepalive (#26050)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-02 12:09:05 +08:00
wayblink 587237a3c9
Fix dead loop in session (#25451)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-07-13 18:02:29 +08:00
yah01 cd29b863d0
Fix data race in session (#25354)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-06 14:52:25 +08:00
wayblink b7ecb7f56b
Disable retryKeepAlive when LivenessCheck's Context close (#25161)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-06-27 17:08:45 +08:00
wayblink b752a29995
Add timeout for keepalive in session (#25077)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-06-26 12:30:44 +08:00
SimFG 0c3f92d7d7
Improve the panic code about the rootcoord/session/rocksmq (#24859) (#25024)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-06-21 11:24:42 +08:00
congqixia d0c2fa5d19
Fix retryKeepAlive assertion panic (#24667)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-06-07 10:08:36 +08:00
wayblink 5fb5b072ae
Retry keepalive when keepalive channel close (#24581)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-06-01 16:14:35 +08:00
congqixia 74bba2320a
Fix session stop/goingStop stuck after connection lost (#24131)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-05-16 14:51:22 +08:00
cai.zhang 43a9e175a3
Exit component process when session key is deleted (#21658) (#22164)
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-04-12 20:12:28 +08:00
jaime c9d0c157ec
Move some modules from internal to public package (#22572)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-04-06 19:14:32 +08:00
yah01 7da870f512
Remove useCustomConfig and simplify the session type (#23166)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-04-03 20:10:24 +08:00
congqixia 732986aa04
Remove fmt.Print from internal package (#22722)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-03-14 17:36:05 +08:00
Enwei Jiao 697dedac7e
Use cockroachdb/errors to replace other error pkg (#22390)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-02-26 11:31:49 +08:00
congqixia f2575e5fa8
Add unconvert & durationcheck linters and fix issues (#22161)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-02-15 17:22:34 +08:00
yah01 b1f31da77a
Fix activate standby server ignores all errors (#22073)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-02-09 15:24:31 +08:00
wayblink d41cc0b21b
Revoke session to only delete session key created by this node (#21935)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-02-02 16:37:52 +08:00
wayblink de584b508e
Fix active-standby switch fail bug (#21755)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-01-17 11:43:43 +08:00
wayblink 6a722396bd
Integration test framework (#21283)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-01-12 19:49:40 +08:00
Jiquan Long 6d09bbed68
[skip e2e] Fix load meta migration (#21584)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-01-11 19:31:39 +08:00
Enwei Jiao 89b810a4db
Refactor all params into ParamItem (#20987)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2022-12-07 18:01:19 +08:00