issue: #34095
When a new query node comes online, the segment_checker,
channel_checker, and balance_checker simultaneously attempt to allocate
segments to it. If this occurs during the execution of a load task and
the distribution of the new query node hasn't been updated, the query
coordinator may mistakenly view the new query node as empty. As a
result, it assigns segments or channels to it, potentially overloading
the new query node with more segments or channels than expected.
This PR measures the workload of the executing tasks on the target query
node to prevent assigning an excessive number of segments to it.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
when querycoord process segment task, it will try to iterate whole
segment list to checke whether segment is loaded, which cost too much
cpu if there has thousands of segments.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #31091
This PR add GetByFilter interface in leader view manager, instead of all
kind of get func
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30816
check stale rules for leader task:
1. for reduce leader task, it should keep executing until leader's node
become offline.
2. for grow leader task,it should keep executing until leader's node
become stopping.
This PR check leader node's stopping state for grow leader task
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30816
pr #31319 introduce the logic that segment checker need to load level
zero segment which only exist in current target.
This PR fix load segment task promote failed when segment only belongs
to current target
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
This PR add metrics for task latency in querycoord scheduler, so if any
kind of task stuck, it's easy to figure out by metrics
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30186
during channel balance, after new delegator loaded, instead of syncing
l0 segment's location to new delegator, we should load l0 segment on new
delegator, and release the old l0 segment, then start to release old
delegator.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30150
`checkLeaderTaskStale` will check segment whether exist on next current
for leaderTask's growing action, which will cause promote leader task
failed when segment only exist on current target
This PR will check segment for both current or next target.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30723
This PR skip generate balance task when collection's target isn't ready.
also refine the check stale logic in query coord's scheduler, if channel
exist in current or next target, task won't be canceled.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #30150
For leader view distribution with offline nodes, a release task can
never be sent to querynode due to targetNode online check logic. Even
the request is dispatched, normal release task does not have "force"
flag when calling `delegator.ReleaseSegment`.
This PR adds a new type of querycoord task: LeaderTask, the
responsibility of which is to rectify leader view distribtion.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #29841
if segment loaded, submit load segment task for it isn't permitted, to
avoid load segment twice. but this logic blocks the leader checker to
correct leader view by `LoadSegment`
This PR remove the segment loaded check, to fix that leader checker
cann't submit load task
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #28831
release old delegator before new delegator update it's distribution may
cause `channel not availble` error
This PR will block release old delgator before new delegator finish
`syncDistribution`
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
1. balance granuity to replica to avoid influence unrelated replicas
2. avoid balance back and forth
Signed-off-by: MrPresent-Han <jamesharden11122@gmail.com>