Commit Graph

618 Commits (f49d618382984af9a1e3c6752d83836658983cec)

Author SHA1 Message Date
congqixia 37ca32dbba
enhance: Make SegmentDistManager filter use node index (#32533)
See also #32165

Change `SegmentDistFilter` to an interface in order to provide a node index
when filtering segment dist.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-24 16:53:24 +08:00
smellthemoon 96d95e7743
enhance: fix pass error msg as channel name (#32511)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-04-23 16:45:22 +08:00
congqixia bfebdecf3e
enhance: Make LeaderView Manager filter use map index (#32505)
See also #32165

Change `LeaderViewFilter` to an interface that provides a map key, to avoid
iterating over all key-values in LeaderViewManager.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-23 11:07:24 +08:00
chyezh 21a9de5c8e
fix: resource group ut fixup (#32509)
issue: #30647

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-23 10:01:23 +08:00
congqixia d7ff1bbe5c
enhance: Make querycoordv2 collection observer task driven (#32441)
See also #32440

- Add loadTask in collection observer
- For load collection/partitions, load task shall timeout as a whole
- Change related constructor to load jobs

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-22 10:39:22 +08:00
congqixia 01c16fe6e3
enhance: Manual release pool after save targets (#32358)
See also #31632

Release conc.Pool after usage to clean worker and stop background purge
and ticktock.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-19 13:51:21 +08:00
chyezh a8c8a6bb0f
fix: parameter check of TransferReplica and TransferNode (#32297)
issue: #30647 

- Same dst and src resource group should not be allowed in
`TransferReplica` and `TransferNode`.

-  Remove redundant parameter check.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-17 15:27:19 +08:00
yiwangdr 7deda4d5e9
enhance: speed up GetByCollectionAndNode (#32232)
Related to https://github.com/milvus-io/milvus/issues/32165

Avoid iterating through all replicas/collections if possible. Iteration
is expensive when there are large number of replicas/collections.

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-04-17 10:23:25 +08:00
congqixia 72c172a7d7
enhance: Remove duplicated collectionID label for task latency (#32308)
`CollectionID` already exists in channel name, so remove it to save
metrics traffic.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-16 18:55:19 +08:00
chyezh 70e3d5b495
fix: wrong node id in TestCheckNodesInReplica (#32268)
issue: #31930

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-15 17:38:17 +08:00
wei liu 4822b109bd
fix: Skip to load l0 segment on old version query node (#32124)
issue: #32107

during rolling upgrade progress, skip to load l0 segment on old version
query node

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-15 11:23:23 +08:00
congqixia dc11cbd123
enhance: Maintain collection-partitions mapping in qc meta (#32227)
Related to #32165

Add a collection-to-partitionIDs mapping to avoid iterating over all
loaded partitions when getting all partitions by collection ID.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-15 10:05:19 +08:00
chyezh 48fe977a9d
enhance: declarative resource group api (#31930)
issue: #30647

- Add declarative resource group api

- Add config for resource group management

- Resource group recovery enhancement

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-15 08:13:19 +08:00
wei liu 68dec7dcd4
fix: Use correct ts to avoid exclude segment list leak (#31991)
issue: #31990

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-12 10:39:19 +08:00
congqixia b9a487608a
fix: Make `ResourceGroup.nodes` concurrent safe (#32159)
See also #32158

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-11 17:53:18 +08:00
congqixia 25a1c9ecf0
fix: Make coordinator `Register` not blocked on ProcessActiveStandby (#32069)
See also #32066

This PR makes coordinator register succeed and lets
`ProcessActiveStandBy` run asynchronously. Roles may then receive the stop
signal and notify servers.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-10 18:49:18 +08:00
chyezh a3d6110957
fix: ut failure (#32120)
issue: #30647

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-10 17:30:48 +08:00
chyezh 0be67e7f99
fix: ut failure (#32119)
issue: #30647

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-10 17:23:27 +08:00
wei liu c4806b69c4
enhance: Refactor leader view manager interface (#31133)
issue: #31091
This PR add GetByFilter interface in leader view manager, instead of all
kind of get func

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-10 15:13:36 +08:00
wei liu 177ddda47f
fix: Check stale should check leader task's leader id (#31962)
issue: #30816

check stale rules for leader task:
1. for reduce leader task, it should keep executing until leader's node
become offline.
2. for grow leader task,it should keep executing until leader's node
become stopping.

This PR check leader node's stopping state for grow leader task

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-09 15:33:25 +08:00
zhenshan.cao 089c805e0a
enhance: Refactor hybrid search (#32020)
issue: https://github.com/milvus-io/milvus/issues/25639
https://github.com/milvus-io/milvus/issues/31368

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-04-09 14:21:18 +08:00
yiwangdr 1cd15d9322
test: support segment release in integration test (#31190)
issue: #29507

Notice that api_testonly.go files should be guarded by compiler tag
`test`, so that production build rules don't compile them and these APIs
don't get misused.

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-04-09 11:39:17 +08:00
chyezh a2502bde75
enhance: replica manager enhancement (#31496)
issue: #30647 

- ReplicaManager manage read only node now, and always do persistent of
node distribution of replica.

- All segment/channel checker using ReplicaManager to get read-only node
or read-write node, but not ResourceManager.

- ReplicaManager promise that only apply unique querynode to one replica
in same collection now (replicas in same collection never hold same
querynode at same time).

- ReplicaManager promise that fairly node count assignment policy if
multi replicas of collection is assigned to one resource group.

- Move some parameters check into ReplicaManager to avoid data race.

- Allow transfer replica to resource group that already load replica of
same collection

- Allow transfer node between resource groups that load replica of same
collection

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-05 04:57:16 +08:00
congqixia c2aad513c0
fix: Check collection nil before check load status (#31850)
See also #31849

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-03 10:07:13 +08:00
congqixia 56e371c478
fix: Check replica exists before get latest leader (#31848)
See also #31847

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-03 10:05:22 +08:00
wei liu 7471a8005f
fix: querycoord panic after node down (#31831)
issue: #30519

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-03 10:03:22 +08:00
congqixia 0feee53631
enhance: Add back unit test for compactor and fix some TODOs (#31829)
This PR adds back the compactor "Unhandled" data type unit test and fixes
some TODO behavior.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-02 20:35:14 +08:00
Bingyi Sun 91cb529ba6
fix: get latest collection info when checking index (#31744)
issue: https://github.com/milvus-io/milvus/issues/31727

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-04-02 14:43:13 +08:00
wei liu 0944a1f790
enhance: Refactor channel dist manager interface (#31119)
issue: #31091
This PR add GetByFilter interface in channel dist manager, instead of
all kind of get func

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-02 10:23:14 +08:00
congqixia 16d869c57e
enhance: Add EmbedEtcd testutil and remove etcd dep of task pkg (#31802)
See also #20478

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-02 09:59:14 +08:00
wei liu bb500d66c7
fix: Remove segment from leader view can't be executed (#31663)
issue: #31664

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-01 10:39:12 +08:00
wei liu c311932d5f
fix: Update segment's version in leader task (#31643)
issue: #31468

1. when segment's version in leader view doesn't match segment's version
in dist, should update leader view
2. after call loadDeltalog, should update segment's load version with
latest ts
3. change leader task's priority from high to low, to avoid leader task
replace segment task and balance task

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-01 10:37:21 +08:00
wei liu 92971707de
enhance: Add restful api for devops to execute rolling upgrade (#29998)
issue: #29261
This PR Add restful api for devops to execute rolling upgrade, including
suspend/resume balance and manual transfer segments/channels.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-27 16:15:19 +08:00
wei liu 5d752498e7
fix: Skip release duplicate l0 segment (#31540)
issue: #31480 #31481

release duplicate l0 segment task, which execute on old delegator may
cause segment lack, and execute on new delegator may break new
delegator's leader view.

This PR skip release duplicate l0 segment by segment_checker, cause l0
segment will be released with unsub channel

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-27 12:53:10 +08:00
congqixia 8e5865f630
enhance: Save collection targets by batches (#31616)
See also #28491 #31240

When the number of collections is large, querycoord saves collection targets
one by one, which is slow and may block querycoord from exiting.

In local run, 500 collections scenario may lead to about 40 seconds
saving collection targets.

This PR changes the `SaveCollectionTarget` interface into batch one and
organizes the collection in 16 per bundle batches to accelerate this
procedure.
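
The batching idea can be sketched as follows. This is a minimal illustration, not the code from the PR: `splitIntoBatches` and `saveBatch` are hypothetical names, and only the batch size of 16 comes from the message above.

```go
package main

import "fmt"

// batchSize mirrors the "16 per bundle" figure from the commit message.
const batchSize = 16

// splitIntoBatches chunks ids into consecutive slices of at most size elements.
// (Hypothetical helper, not the querycoord implementation.)
func splitIntoBatches(ids []int64, size int) [][]int64 {
	if size <= 0 {
		return nil
	}
	batches := make([][]int64, 0, (len(ids)+size-1)/size)
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		batches = append(batches, ids[start:end])
	}
	return batches
}

// saveBatch is a hypothetical stand-in for one batched meta-store write.
func saveBatch(batch []int64) error {
	fmt.Printf("saving %d collection targets in one call\n", len(batch))
	return nil
}

func main() {
	// 500 collections, as in the local-run scenario described above.
	ids := make([]int64, 500)
	for i := range ids {
		ids[i] = int64(i + 1)
	}
	for _, batch := range splitIntoBatches(ids, batchSize) {
		if err := saveBatch(batch); err != nil {
			fmt.Println("save failed:", err)
			return
		}
	}
}
```

With 500 collections this issues 32 writes instead of 500, which is where the speed-up would come from under these assumptions.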

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-27 00:09:08 +08:00
congqixia 73858b23bc
fix: Make target observer auto/manual task mutual exclusive (#31584)
See also #30867

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-26 09:57:08 +08:00
wei liu 6438d65459
fix: Grow task stuck at stopping node (#31487)
issue: #30816
this PR fix that grow task stuck at stopping node

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-25 18:57:07 +08:00
congqixia 4d2142d041
fix: Check latest leader exists before using it (#31500)
See also #31495

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-22 18:25:07 +08:00
wei liu 03eaa5d478
fix: Load segment task promote failed (#31430)
issue: #30816

pr #31319 introduce the logic that segment checker need to load level
zero segment which only exist in current target.

This PR fix load segment task promote failed when segment only belongs
to current target

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-21 18:09:07 +08:00
chyezh 9f9ef8ac32
enhance: transfer resource group and dbname to querynode when load (#30936)
issue: #30931

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-21 11:59:12 +08:00
wei liu 7c7375031d
enhance: Add metrics for task latency in querycoord scheduler (#31405)
This PR add metrics for task latency in querycoord scheduler, so if any
kind of task stuck, it's easy to figure out by metrics

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-20 19:29:06 +08:00
congqixia a647b84f3e
enhance: Add AllPartitionsID const to replace InvalidPartitionID (#31438)
"-1" as `InvalidPartitionID` previously used as All partition place
holder in delete cases. It's confusing and hard to maintain when a const
var has more than one meaning.

This PR add `AllPartitionsID` to replace these usages in delete
scenarios.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-20 19:01:05 +08:00
congqixia c3d53eb1bf
enhance: Remove metrics when target removed (#31399)
See also #31390

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-20 10:09:08 +08:00
congqixia 194a611814
enhance: Add metrics for querycoord current target cp lag (#31391)
See also #31390

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-19 14:07:05 +08:00
wei liu 3e7e9f15cd
fix: Wrong behavior of CurrentTargetFirst/NextTargetFirst in target manager (#31379)
issue: #31162

when give scope CurrentTargetFirst/NextTargetFirst, it's expected to
scan both current and next target.

This PR fixed wrong behavior of CurrentTargetFirst/NextTargetFirst in
target manager, which may cause unexpected task generated, and load
collection may stuck forever due to dirty leader view.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-19 11:49:05 +08:00
wei liu c26c1b33c2
fix: Transfer l0 segment to new delegator after balance (#31319)
issue: #30186

during channel balance, after new delegator loaded, instead of syncing
l0 segment's location to new delegator, we should load l0 segment on new
delegator, and release the old l0 segment, then start to release old
delegator.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-19 09:59:05 +08:00
wei liu 4dfdb1a443
fix: save current target after target observer stop (#31315)
issue: #28491

The target should be saved to the meta store after the target observer
stops, in case the target has changed.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-18 12:27:04 +08:00
wei liu d79aa58b37
enhance: Speed up target recovery after query coord restart (#31240)
issue: #28491

After querycoord restarts, it pulls a new target, which includes the
channel and segment list. Once the segments loaded on querynode have reached
the target, the collection can serve search/query. But if the segment
list changes over time, then after querycoord pulls a new target it takes a
few minutes to catch up with the target's segment distribution, and before
that, query/search will fail due to missing segments.

This PR saves the current loaded target to the meta store while querycoord
is stopping, and recovers it when querycoord starts, to speed up target
recovery.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-15 14:19:03 +08:00
chyezh ff4237bb90
enhance: add hostname into node info (#30673)
issue: https://github.com/milvus-io/milvus/issues/30647

- Address may be reused in k8s environment. Using hostname can be
better.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-15 10:45:06 +08:00
jaime db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
congqixia 773c64ecbb
fix: Set nodeID when remove distribution (#31259)
See also #30930

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-14 15:09:03 +08:00
wei liu 06b191b164
fix: Balance channel stuck forever due to logic dead lock (#31202)
issue: #30816

Balance channel stays stuck until the leader view catches up with the current
target, and only then starts to unsub the old delegator. This makes sure that
the new delegator can provide search before the old delegator is released. But
another rule in segment_checker skips loading segments during balance
channel. So if a query node crashes during balance channel, the new delegator
can never catch up with the target and is stuck forever.

This PR removes the rule that skips loading segments during balance channel
to avoid the logic deadlock here.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-13 15:05:04 +08:00
congqixia 5b51c20293
fix: Use `Remove` sync type for distribution removal (#31215)
See also #31214

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-13 06:11:04 +08:00
wei liu 06df9b8462
fix: Balance segment/channel won't be trigger on multi replicas (#31107)
issue: #30983 #30982

cause balancer call wrong interface to get segment/channel list in
replica, then got a wrong average segment/channel number, which make
each node have less segment/channel than average, and the balance won't
be trigger in multi replica case.

This PR fix that balance segment/channel won't be trigger on multi
replicas

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-11 20:35:04 +08:00
wei liu ddd918ba04
enhance: change frequency log to rated level (#31084)
This PR change frequency log of check shard leader to rated level

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-08 16:39:02 +08:00
wei liu efe8cecc88
enhance: refactor segment dist manager interface (#31073)
issue: #31091
This PR add `GetByFilter` interface in segment dist manager, instead of
all kind of get func

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-08 16:29:01 +08:00
wei liu 22df5061c1
fix: Leader checker can't update segment's load version (#31040)
issue: #30890

When the leader checker finds that the leader view has an older load version
of a segment, it tries to correct the leader view. But the sync action doesn't
specify the latest load version, so the update operation fails.

This PR fixes the leader checker being unable to update a segment's load
version and repeatedly generating the same task for the scheduler.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-08 11:57:01 +08:00
congqixia c886aa29ff
enhance: Use `ListIndexes` instead of `DescribeIndex` for qc broker (#31122)
See also #31103

Since querycoord need index meta information from datacoord only, broker
shall use `ListIndexes` to skip segment index building check logic in
datacoord

This PR is also related to #30538, in which DescribeIndex caused lots of
memory usage and lead to OOM eventually

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-07 21:43:03 +08:00
wei liu 2a047103d6
fix: Dirty sealed segment won't release after channel balance (#31095)
issue: #31074
This PR fix dirty sealed segment doesn't release after channel balance,
dirty sealed segment means segment doesn't exist in targets.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-03-07 16:23:01 +08:00
Bingyi Sun e3cce11dd9
fix: data race in querynode task test (#31019)
issue: https://github.com/milvus-io/milvus/issues/31022

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-05 16:26:59 +08:00
Bingyi Sun 7783098ddd
feat: support lazy load on querycoord (#30372)
https://github.com/milvus-io/milvus/issues/30361

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-01 18:15:29 +08:00
SimFG ee8d6f236c
enhance: make the watch dm channel request better compatibility (#30952)
issue: #30938

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-01 16:07:37 +08:00
chyezh 0c7474d7e8
enhance: add graceful stop timeout to avoid node stop hang under extreme cases (#30317)
1. add coordinator graceful stop timeout to 5s
2. change the order of datacoord component while stop
3. change querynode grace stop timeout to 900s, and we should
potentially change this to 600s when graceful stop is smooth

issue: #30310
also see pr: #30306

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-29 17:01:50 +08:00
wei liu 545e8de401
fix: promote leader task failed when segment only exist on current target (#30794)
issue: #30150

`checkLeaderTaskStale` will check segment whether exist on next current
for leaderTask's growing action, which will cause promote leader task
failed when segment only exist on current target

This PR will check segment for both current or next target.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-02-28 13:14:59 +08:00
Bingyi Sun ece9d273a7
enhance: some patches for #30636 (#30664)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-02-26 11:42:55 +08:00
wei liu befe0e21fd
fix: Set indexInfo when try to set segment to leader view (#30758)
issue: #30150
see also: #30258

`SyncDataDistribution` will try to load delta for the segment. If indexInfo
is missing from the request, the sync action fails due to lack of index info.

This PR sets indexInfo when trying to set a segment to the leader view.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-02-26 11:02:55 +08:00
wei liu 6dd7297178
fix: Skip generate balance task when target not ready (#30724)
issue: #30723

This PR skip generate balance task when collection's target isn't ready.
also refine the check stale logic in query coord's scheduler, if channel
exist in current or next target, task won't be canceled.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-02-23 10:32:53 +08:00
congqixia 7b91fa3db8
fix: Make leader checker generate leader task instead of segment task (#30258)
See also #30150

For leader view distribution with offline nodes, a release task can
never be sent to querynode due to targetNode online check logic. Even
the request is dispatched, normal release task does not have "force"
flag when calling `delegator.ReleaseSegment`.

This PR adds a new type of querycoord task: LeaderTask, the
responsibility of which is to rectify leader view distribution.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-21 11:08:51 +08:00
Bingyi Sun 564b12c661
enhance: make balance cost threshold configurable (#30636)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-02-19 15:24:50 +08:00
wei liu 99297ab81b
fix: Add retry on unimplemented error for datacoord (#30554)
issue: #30553

when datacoord with version 2.2 and querycoord with version 2.3 coexist
during rolling upgrade, `DescribeIndex/GetIndexInfo` will return
`unimplemented` error
This PR add retry on `DescribeIndex/GetIndexInfo`, to prevent load
collection failed during rolling upgrade from milvus 2.2 to 2.3.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-02-18 17:26:52 +08:00
congqixia a6d9eb7f20
fix: Remove balance plan of which From, To nodes are same when merging (#30634)
See also #30627

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-18 17:24:50 +08:00
zhenshan.cao bb93b22c84
fix: should return collectionName in response of ListAliases (#30532)
issue : https://github.com/milvus-io/milvus/issues/30369

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-02-12 08:30:55 +08:00
Bingyi Sun 715f042965
feat: add a balancer based on both of row count and segment count (#30188)
issue: https://github.com/milvus-io/milvus/issues/30039

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-02-06 17:15:50 +08:00
yiwangdr 32cff25f97
enhance: decrease coordinator init time (#29822)
This PR mainly improve two items:
1. Target observer should refresh loading status during init time. An
uninitialized loading status blocks search/query. Currently, the target
observer refreshes every 10 seconds, i.e. we'd need to wait for 10s for
no reason. That's also the reason why we constantly see false log
"collection unloaded" upon mixcoord restarts.
2. Delete session when service is stopped. So that the new service
doesn't need to wait for the previous session to expire (~10s).

Item 1 is the major improvement of this PR, which should speed up init
time by 10s.
Item 2 is not a big concern in most cases as coordinators usually shut
down after stop(). In those cases, coordinator restart triggers serverID
change which further triggers an existing logic that deletes expired
session. This PR only fixes rare cases where serverID doesn't change.

integration test:
`go test -tags dynamic -v -coverprofile=profile.out -covermode=atomic
tests/integration/coordrecovery/coord_recovery_test.go -timeout=20m`
Performance after the change:
Average init time of coordinators: 10s
Hardware: M2 Pro
Test setup: 1000 collections with 1000 rows (dim=128) per collection.


issue: #29409

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-05 14:00:12 +08:00
xige-16 060c8603a3
fix: Support mvcc with hybrid search (#30114)
issue: https://github.com/milvus-io/milvus/issues/29656
/kind bug

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-02-01 16:03:03 +08:00
Bingyi Sun 406bf14e84
enhance: Add growing row count weight (#30271)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-29 14:05:02 +08:00
aoiasd f84d9a589a
fix: channel checker reduce balancing channels. (#30087)
Ignore leader unavailable when channel checker judge repeat channel to
avoid channel checker remove channels balancing.
relate: https://github.com/milvus-io/milvus/issues/29841
https://github.com/milvus-io/milvus/issues/29838

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-01-26 10:59:00 +08:00
wei liu f69f65ff68
fix: Leader checker can't remove segment from leader view (#30151)
issue: #30150

This PR fix three problems:
1. leader checker use wrong node id when generate release task, which
cause the release task finished immediately
2. the release request generated by leader_checker doesn't set the
`force` flag, the operation to clean leader view on delegator will fail.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-20 18:58:58 +08:00
SimFG ddccccbcab
enhance: add the bytes data type for merge data and format some code (#30105)
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-18 22:18:55 +08:00
smellthemoon e52ce370b6
enhance:don't store logPath in meta to reduce memory (#28873)
don't store logPath in meta to reduce memory, when service get
segmentinfo, generate logpath from logid.
#28885

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-18 22:06:31 +08:00
wei liu f8695aef9d
fix: Trigger leader checker too frequency (#29991)
issue: #29841
This PR fix leader checker use wrong check interval, which causes leader
checker trigger too frequency

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-17 19:40:53 +08:00
congqixia 4c93912135
enhance: Shuffle candidates before channel assignment (#30066)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-17 19:34:53 +08:00
wei liu 57bd3e2181
fix: Leader checker cannot submit load task (#30067)
issue: #29841
If a segment is already loaded, submitting a load segment task for it isn't
permitted, to avoid loading the segment twice. But this logic blocks the
leader checker from correcting the leader view by `LoadSegment`.

This PR removes the segment loaded check, so that the leader checker
can submit load tasks.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-17 19:12:54 +08:00
wei liu 9abc868d15
fix: Remove heartbeat lag logic during get shard leaders (#29999)
issue: #29677 #29838
During get shard leaders, if a querynode doesn't ack the heartbeat for more
than 10s, querycoord treats it as unavailable and won't return shard
leaders on it. But when a querynode has full cpu usage, it can easily be
stuck for more than 10s without acking the heartbeat, which leaves no shard
leader to search/query.

This PR remove heartbeat lag logic during get shard leaders

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-17 11:22:52 +08:00
congqixia 7cb6bebd96
enhance: replace magic number with ParamItem for dist handler (#30020)
See also #28817

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-16 17:33:03 +08:00
yah01 c68c128e47
fix: level 0 segments not loaded (#29908)
the recent changes move the level 0 segments list to a new proto field,
which leads to the QueryCoord can't see the level 0 segments, handle the
new changes
fix #29907

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-16 14:40:53 +08:00
smellthemoon 595ec2559c
enhance: change some frequent log level (#29953)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-14 10:19:16 +08:00
congqixia 082ee1a709
enhance: Use newer checkpoint when packing LoadSegmentRequest (#29922)
See also: #29650

Either the segment dml position or the channel checkpoint could be newer in
some cases. This PR makes PackLoadSegments use the newer one, improving load
performance in cases where there are lots of upserts.
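
As a rough sketch of "use the newer one", assuming positions are compared by timestamp: the `position` type and `newerPosition` function below are hypothetical and only illustrate the selection rule, not the actual `PackLoadSegments` code.

```go
package main

import "fmt"

// position is a hypothetical, simplified stand-in for a message position:
// a channel name plus the timestamp it has been consumed up to.
type position struct {
	channel   string
	timestamp uint64
}

// newerPosition returns whichever position has the larger timestamp.
// A nil argument is treated as "unknown", so the other one wins.
func newerPosition(segmentDml, channelCp *position) *position {
	switch {
	case segmentDml == nil:
		return channelCp
	case channelCp == nil:
		return segmentDml
	case channelCp.timestamp > segmentDml.timestamp:
		return channelCp
	default:
		return segmentDml
	}
}

func main() {
	dml := &position{channel: "by-dev-dml_0", timestamp: 100}
	cp := &position{channel: "by-dev-dml_0", timestamp: 250}
	start := newerPosition(dml, cp)
	// Starting from the newer position means less data to replay.
	fmt.Println(start.timestamp)
}
```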

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-13 10:46:53 +08:00
wei liu 565fc3a019
enhance: Skip generate load segment task (#29724)
issue: #29814
If the channel is not subscribed yet, the generated load segment task will
be removed from the task scheduler, because a load segment task needs to be
transferred to the worker node by the shard leader.

This PR skip generate load segment task when channel is not subscribed
yet.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-12 18:56:58 +08:00
Bingyi Sun e1258b8cad
feat: integrate storagev2 into loading segment (#29336)
issue: #29335

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-12 18:10:51 +08:00
wei liu 797847904c
enhance: Change some frequency log to rated level (#29720)
This PR change some frequency log to rated level

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-11 16:30:50 +08:00
congqixia c4ddfff2a7
enhance: make Load process traceable in querycoord (#29806)
See also #29803

This PR:
- Add trace span for collection/partition load
- Use TraceSpan to generate Segment/ChannelTasks when loading
- Refine BaseTask trace tag usage

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-10 09:58:49 +08:00
xige-16 9702cef2b5
feat: Support multiple vector search (#29433)
issue #25639 

Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-08 15:34:48 +08:00
congqixia b5f039a221
fix: Assertion all async invocations in test case (#29737)
Resolves: #29736

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-07 15:54:47 +08:00
wei liu e98c62abbb
enhance: refactor leader_observer to leader_checker (#29454)
issue: #29453 

sync distribution by rpc will also call loadSegment/releaseSegment,
which may cause all kinds of concurrent case on same segment, such as
concurrent load and release on one segment.
This PR add leader_checker which generate load/release task to correct
the leader view, instead of calling sync distribution by rpc

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-05 15:54:55 +08:00
congqixia 3626f49025
fix: make sure balance candidate is always pushed back (#29702)
See also #29699

Querycoord panicked when tried to pop from an empty heap. We assume the
heap shall not be empty, but in some branch, the candidate is never
pushed back.

This PR puts pop & push in a closure and adds a defer call to push the item
back.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-05 10:08:47 +08:00
congqixia da7c3cbd88
enhance: make delegator delete buffer holding all delete from cp (#29626)
See also #29625

This PR:
- Add a new implementation of `DeleteBuffer`: listDeleteBuffer
  - holds cacheBlock slice
  - `Put` method appends new delete data into the last block
  - when a block is full, append a new block into the list
- Add `TryDiscard` method for `DeleteBuffer` interface
  - For doubleCacheBuffer, do nothing
  - For listDeleteBuffer, try to evict "old" blocks, which are blocks
    before the first block whose start ts is behind provided ts
- Add checkpoint field for `UpdateVersion` sync action, which shall be
used to discard old cache delete block

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-04 17:02:46 +08:00
congqixia aa967de0a8
enhance: Explicitly pass LevelZero segment ids in vchan info (#29612)
See also #27675

For `GetRecoveryInfo` & `GetRecoveryInfoV2`, Level zero segment ids
shall be specified in vchan info so that querycoord could re-fetch
current segment info during watch procedure without having all segment
info

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-04 16:46:45 +08:00
wei liu 336fce0582
enhance: Rewrite gen segment plan based on assign segment (#29574)
issue: #29582
This PR rewrite gen segment plan logic based on assign segment in
`score_based_balancer`

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-04 11:10:44 +08:00
congqixia a3cb8e2625
fix: Add atomic method to get collection target (#29577)
Related to #29575

Add `getCollectionTarget` method which is atomic when scope is
`CurrentTargetFirst` or `NextTargetFirst`
Also return error when executor finds no channel in target manager

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-29 09:04:46 +08:00
wei liu 514e279f3a
enhance: Remove useless log in collection observer (#29554)
This PR removed the useless log in collection observer

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-28 17:16:47 +08:00
wei liu 5474bce9d2
fix: Choose wrong shard leader during balance channel (#29529)
issue: #29523

The readable shard leader should still be the old one during channel
balance if the new shard leader is not ready.
This PR fixes query coord choosing the wrong shard leader during channel
balance

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-28 15:22:51 +08:00
congqixia aa279db44c
enhance: remove flushed segmentInfo in WatchChannelRequest (#29526)
`WatchDmChannel` only needs growing segment info, so this PR removes fetching
segmentInfos when filling the watch dml channel request.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-28 00:40:47 +08:00
wei liu 839a72129e
fix: Auto balance param can't be updated by dynamic (#29501)
This PR fixes the auto balance param not being dynamically updatable

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-27 14:30:53 +08:00
wei liu 6cbf9c489d
enhance: Rewrite gen stopping segment plan based on assign segment (#29473)
The `AssignSegment` method defines how to assign segments to nodes, but
score_based_balance implements another assign logic in
`genStoppingSegmentPlan`
This PR rewrites gen stopping segment plan based on assign segment.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-26 14:26:56 +08:00
wei liu 2ffde52f8a
fix: Upgrade from 2.2 should update CollectionLoadInfo (#29443)
milvus branch 2.3 adds `loadType` to CollectionLoadInfo, so for
collection meta upgraded from 2.2, we should add `loadType` to
CollectionLoadInfo. This PR updates CollectionLoadInfo with `loadType`
when it meets an old version CollectionLoadInfo

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-26 14:18:47 +08:00
yah01 1d6bcd1ded
enhance: speed up loading with many deletions (#29455)
the executor always fetches the latest segment info, so we can consume
from the latest checkpoint, which can save much time when many entities
have been deleted

Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-25 20:58:45 +08:00
SimFG dd9c61831d
enhance: Support to get the param value in the runtime (#29297)
/kind improvement
issue: #29299

Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-12-22 18:36:44 +08:00
yah01 a0e1a1eb31
feat: support enable/disable mmap for index (#29005)
support enable/disable mmap for index, the user could alter the index's
mode by `AlterIndex` method
related: https://github.com/milvus-io/milvus/issues/21866

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-21 18:07:24 +08:00
wei liu 820ee692fc
enhance: Add config for querycoord auto balance channel (#29231)
issue: #23726
This PR adds a control config for querycoord's background auto balance
channel operation

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-18 10:00:40 +08:00
yah01 13beb5ccc0
fix: load gets stuck probably (#29191)
we found that load may get stuck, and reviewed the logs.

the target observer seems not to be working. The reason is that the
taskDispatcher removes the task in a goroutine, and modifies the task status
after committing the task into the goroutine pool, but this may happen after
the task is removed, so the task will never be removed

related #29086

Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-14 18:28:38 +08:00
wei liu 008bae675d
enhance: Skip balance segment when channel need be balanced (#29116)
issue: #28622
Since we support balancing segments by growing segment count (#28623), if
we balance segments and channels at the same time, some segments need to be
rebalanced after the channel balance finishes.

This PR skips balancing segments when a channel needs to be balanced.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-14 16:44:43 +08:00
yah01 58dbb7872a
fix: forbid balancing level zero segments (#29168)
related #29128
Signed-off-by: yah01 <yah2er0ne@outlook.com>

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-14 14:30:39 +08:00
yah01 2f0c7a6544
fix: forbid balancing level zero segments (#29130)
we can't balance the L0 segments
related #29128

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-12 20:38:38 +08:00
yah01 9819090247
enhance: add more logs for target updating (#29090)
- add more logs about the condition satisfying

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-12-12 14:06:43 +08:00
yah01 0a87724f18
enhance: remove merger for load segments (#29062)
remove merger as now QueryNode could load segments concurrently
fix #29063

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-12-12 10:48:37 +08:00
congqixia a67fc08865
fix: balance_unstable_view unit test (#29127)
fix: #29126
Allow unstable output channel balance plan

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-12 10:02:39 +08:00
wei liu 42e538b683
enhance: enable balance channel in querycoord (#28469)
issue: #23726

/kind improvement

1. enable auto balance channel between nodes in querycoord
2. make `genSegmentPlan` reuse the `AssignSegment` logic
3. make `genChannelPlan` reuse the `AssignChannel` logic

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-11 14:18:37 +08:00
yah01 fab52d167b
fix: may miss stream delta while loading (#28871)
we consume the delta data from the latest channel checkpoint while
loading a segment,

this works well without level 0 segments, but now it may lead to missing
some delta data,

so we have to consume from the current target's channel checkpoint

related: #27349

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-12-05 17:34:45 +08:00
Bingyi Sun 10bb2431d8
test: add checker unittests (#28954)
issue: https://github.com/milvus-io/milvus/issues/28610

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-05 10:56:33 +08:00
aoiasd b4af6f8c40
fix: sync action load segment with lack collection index info list (#28788)
relate: https://github.com/milvus-io/milvus/issues/28779
https://github.com/milvus-io/milvus/issues/28637

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-12-04 18:14:34 +08:00
Bingyi Sun 45e6801ce4
feat: Add checker activation service interfaces (#28850)
issue: #28610

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-04 17:38:37 +08:00
wei liu d081fd5481
enhance: Change some frequency log to rated level (#28897)
This PR changes the level of some frequent logs to rated.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-04 10:38:35 +08:00
wei liu 043ac87be0
fix: Balance channel may cause channel not availble error (#28829)
issue: #28831
releasing the old delegator before the new delegator updates its distribution
may cause a `channel not availble` error

This PR blocks releasing the old delegator until the new delegator finishes
`syncDistribution`

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-01 10:08:34 +08:00
congqixia f9bb8e9648
enhance: Change const magic number in querycoord to param (#28819)
See also #28817

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-30 09:06:28 +08:00
yah01 c0f6eccb7a
fix: No LevelZero segment in target (#28803)
the incorrect filter causes all LevelZero segments to be filtered out, so the
deleted entities may still be visible
related: #27349

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-29 11:48:27 +08:00
jaime b1e0a27f31
enhance: Add logs for each step during service initialization (#28624)
/kind improvement

Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-11-27 16:30:26 +08:00
wei liu 911a915798
feat: enable balance based on growing segment row count (#28623)
issue: #28622 

a query node with a delegator will have more rows than other query nodes
because the delegator loads all growing rows.
This PR enables segment balancing based on the number of growing
rows in the leader view.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-27 14:58:26 +08:00
Bingyi Sun 8514a39d1a
feat: Add checker activation (#28611)
issue: https://github.com/milvus-io/milvus/issues/28610

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-11-24 18:08:24 +08:00
congqixia a2fe9dad49
enhance: Make etcd kv request timeout configurable (#28661)
See also #28660
This PR adds a config item for the etcd kv request timeout
and syncs the default timeout to the same value for the etcdKV & tikv config

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-23 19:34:23 +08:00
smellthemoon 73f2bab454
enhance:add some log when create client and get component states (#28160)
/kind improvement

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-11-22 09:12:22 +08:00
yah01 bfccfcd0ca
enhance: refine error messages (#28424)
- Split the simple reason and full detail
- Refine existing error messages
related: #28422

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-21 17:02:24 +08:00
aoiasd 13a5b9f64a
fix: query l0 segment bugs (#28558)
relate: https://github.com/milvus-io/milvus/issues/27675

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-11-20 17:26:23 +08:00
wei liu 7895ac96b5
enhance: Remove rpc during querycoord start (#28396)
issue: #28332

during querycoord's recovery, it tries to call `DescribeCollection` and
`ShowPartitions` on root coord, to check whether a collection or
partition has been released in rootcoord. But if rootcoord is not
ready yet, the rpc will fail and querycoord panics.

to fix this, we remove the rpc calls during querycoord's start

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-17 11:48:19 +08:00
congqixia 81caf02554
fix: make qcv2 observer dispatcher execute exactly once (#28472)
See also #28466

In `taskDispatcher.schedule`, the same task may be resubmitted if the
previous round did not finish.
In this case, TaskObserver.check may set the current target by mistake,
which may cause random search/query failures
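
One way to keep a task from being resubmitted while its previous round is still running is to track in-flight keys under a mutex. The sketch below only illustrates that idea and is not the dispatcher from the PR: the `dispatcher` type, its `Submit` and `Wait` methods, and the string key are assumed names for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// dispatcher runs at most one in-flight task per key (hypothetical type).
type dispatcher struct {
	mu      sync.Mutex
	running map[string]struct{}
	wg      sync.WaitGroup
}

func newDispatcher() *dispatcher {
	return &dispatcher{running: make(map[string]struct{})}
}

// Submit starts fn for key unless a previous round for key is still running.
// It reports whether fn was actually started.
func (d *dispatcher) Submit(key string, fn func()) bool {
	d.mu.Lock()
	if _, ok := d.running[key]; ok {
		d.mu.Unlock()
		return false // previous round has not finished, skip this one
	}
	d.running[key] = struct{}{}
	d.mu.Unlock()

	d.wg.Add(1)
	go func() {
		defer d.wg.Done()
		// Runs before wg.Done, so the key is free once Wait returns.
		defer func() {
			d.mu.Lock()
			delete(d.running, key)
			d.mu.Unlock()
		}()
		fn()
	}()
	return true
}

// Wait blocks until all started tasks have finished.
func (d *dispatcher) Wait() { d.wg.Wait() }

func main() {
	d := newDispatcher()
	release := make(chan struct{})

	first := d.Submit("collection-1", func() { <-release })
	second := d.Submit("collection-1", func() {}) // rejected, first still running
	close(release)
	d.Wait()
	third := d.Submit("collection-1", func() {}) // accepted, key is free again
	d.Wait()

	fmt.Println(first, second, third)
}
```

The key is marked as running synchronously inside `Submit`, before the goroutine starts, which is what makes the second submission observe the first.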

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-16 10:24:19 +08:00
yah01 70995383bf
enhance: modify log to avoid ambiguity and improve readability (#28331)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-10 14:32:20 +08:00
wei liu b9bf910039
fix unstable auto balance config ut (#28288)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-09 10:00:22 +08:00
wei liu 7f78e1dd46
fix datacoord unstable ut (#28281)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-08 18:43:31 +08:00
yah01 ecb3f585c3
Fix passing the wrong dropped list from current target (#28265)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-08 17:02:18 +08:00
yah01 1b90630633
Fix the target updated before version updated to cause data missing (#28250)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-08 11:36:22 +08:00
wei liu 5b45a138b1
disable auto balance when old node exists (#28191)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-07 14:02:20 +08:00
yah01 ece592a42f
Deliver L0 segments delete records (#27722)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-07 01:44:18 +08:00
yah01 90e2c63d9e
Fix getting incorrect CPU num (#28146)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-11-06 06:02:16 +08:00
wei liu 7485eeb689
fix sync distribution with wrong version (#28130)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-03 19:02:18 +08:00
wei liu 86ec6f4832
fix load index for stopping node (#28047)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-03 07:58:18 +08:00
yah01 dc89730a50
Support collection-level mmap control (#26901)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-02 23:52:16 +08:00
congqixia 5d2eba2c2f
Set qcv2 index task priority to Low (#28117)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-11-02 23:22:17 +08:00
Filip Haltmayer 6b1a106a31
Moving etcd client into session (#27069)
Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-10-27 07:36:12 +08:00
congqixia 852be152de
Change task sourceID to stringer interface (#27965)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-27 01:08:12 +08:00
wei liu e0222b2ce3
refine target manager code style (#27883)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-10-25 00:44:12 +08:00
congqixia 323fc107a7
Fix taskDispatcher add multiple tasks will ignore following ones (#27885)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-24 17:18:13 +08:00
wei liu 178db7b0f0
check stopping node during start qc (#27859)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-10-24 12:20:11 +08:00
congqixia 93a877f55e
Make qcv2 target&leader observer execute in parallel (#27844)
- Add `taskDispatcher` to submit and run task async safely
- Change `LeaderObserver` and `TargetObserver` schedule and manual check action to submitting task into dispatcher
- Fix logic problem in collection observer when manual check returns false

See also #27494

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-24 10:14:11 +08:00
Xiaofan 2ea7579dbb
Reduce rpc size for GetRecoveryInfoV2 (#27483)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-10-23 21:44:09 +08:00
wei liu 55e5f80e24
update collection target after observer start (#27774)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-10-19 21:52:10 +08:00
zhenshan.cao 020ad9a6bc
Rectify wrong exception messages associated with Array datatype (#27769)
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2023-10-19 17:24:07 +08:00
yah01 635efdf170
Schedule loading L0 segments first (#27593)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-19 11:14:06 +08:00
yihao.dai 49b3a12804
Return newly defined merr instead of grpc unimplemented err (#27751)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-10-18 15:32:11 +08:00
wayblink e3f2122618
Expose metrics of stanby coordinators (#27698)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2023-10-16 15:04:09 +08:00
jaime ec1fe3549e
Add a stop hook to clean session (#27564)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-10-16 10:24:10 +08:00
yah01 be980fbc38
Refine state check (#27541)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-11 21:01:35 +08:00
smellthemoon a0ca982a52
Fix typo in priority name (#27558)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2023-10-11 14:19:35 +08:00
wei liu 42c475a0e0
remove useless log in querycoord (#27362)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-10-11 10:13:34 +08:00
congqixia b91a5ef42c
Refine log and err handling in querycoord broker (#27546)
- Add log.Ctx(ctx) for all log occurrences
- Use `merr.CheckRPCErr` for all grpc response error handling

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-10 11:49:32 +08:00
MrPresent-Han cb71a3e235
rm dependency to rc when getting recovery info(#25363) (#27405)
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-10-09 18:51:32 +08:00
congqixia eca79d149c
Add ctx control for observer manual check methods (#27531)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-09 11:07:33 +08:00
yah01 a715165306
Set timeout for leader observer syncing (#27504)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-08 16:55:31 +08:00
Xiaofan 41124f281a
Remove parser dependency (#27514)
Signed-off-by: xiaofan-luan <xiaofan.luan@zilliz.com>
2023-10-08 15:05:31 +08:00
congqixia 5d558623fe
Add revive sub-lints and fix existing problems (#27495)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-07 20:53:38 +08:00
yah01 8394b3a1ec
Block creating new error from status reason (#27426)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-10-07 11:29:32 +08:00
congqixia cd5f03f80c
Add var-name sub linter in revive (#27424)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-07 10:09:31 +08:00
yah01 63ac43a3b8
Refine errors for import (#27379)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-30 10:31:28 +08:00
Jiquan Long 370fdaf50d
Record engine version for segment index (#27384)
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-09-28 18:03:28 +08:00
yah01 a8ce1b6686
Refine QueryCoord stopping (#27371)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-27 16:27:27 +08:00
wei liu 4071132f6a
reload loading collection when qc recover (#27300)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-09-27 11:43:28 +08:00
yah01 6539a5ae2c
Refine DataCoord status (#27262)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-26 17:15:27 +08:00
jaime 7f7c71ea7d
Decoupling client and server API in types interface (#27186)
Co-authored-by: aoiasd <zhicheng.yue@zilliz.com>

Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-09-26 09:57:25 +08:00
yah01 24354b166c
Fix unit test failed when run single test (#27348)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-26 09:23:25 +08:00
foxspy 5db4a0489e
dynamic index version control (#27335)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-25 21:39:27 +08:00
MrPresent-Han 4b12cb8847
fix unstable ut due to unstable sort of unique set (#27302)
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-09-22 19:07:26 +08:00
foxspy 370b6fde58
milvus support multi index engine (#27178)
Co-authored-by: longjiquan <jiquan.long@zilliz.com>
2023-09-22 09:59:26 +08:00
SimFG 26f06dd732
Format the code (#27275)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-21 09:45:27 +08:00
yah01 b4f86ea55e
Construct all success status with merr (#27226)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-20 10:57:23 +08:00
congqixia cc9974979f
Add staticcheck linter and fix existing problems (#27174)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-19 10:05:22 +08:00
yah01 168e82ee10
Fix panic while handling with the nil status (#27040)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-15 10:09:21 +08:00
congqixia edde3cf1c7
Add tracer for querycoord tasks (#27058)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-14 09:59:19 +08:00
PowderLi c033580af4
show index info while GetSegmentInfo (#26981)
according to QueryNode::GetSegmentInfo

Signed-off-by: PowderLi <min.li@zilliz.com>
2023-09-13 11:37:18 +08:00
aoiasd e107d0794c
support complex delete expression (#25752)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-09-12 10:19:17 +08:00
congqixia c45c32fad4
Set task reason for collection released (#26962)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-10 15:15:17 +08:00
MrPresent-Han 2101f2d289
fix unstable checker id due to go map iteration(#26943) (#26944)
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-09-10 10:11:16 +08:00
congqixia 758aad705d
Fix checker using default interval after manual check (#26953)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-09 08:29:16 +08:00
SimFG 0901b76732
Avoid the panic when the status of rpc response is nil (#26910)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-07 19:23:15 +08:00
yiwangdr 337edc321b
tikv integration (#26246)
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2023-09-07 07:25:14 +08:00
SimFG 28681276e2
Improve the retry of the rpc client (#26795)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-09-06 17:43:14 +08:00
Enwei Jiao fb0705df1b
Decouple basetable and componentparam (#26725)
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-05 10:31:48 +08:00
congqixia 1a8cf5c415
Organize all mockery generation commands in Makefile (#26826)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-04 21:19:48 +08:00
yah01 3349db4aa7
Refine errors to remove changes breaking design (#26521)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-04 09:57:09 +08:00
yah01 941a383019
Fix failed to load collection with more than 128 partitions (#26763)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-09-02 00:09:01 +08:00
congqixia e8f1b1736e
Remove log.Error(err.error())-style log (#26783)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-01 13:09:01 +08:00
wei liu 5602b22531
refine checker code style (#26759)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-09-01 11:57:01 +08:00