issue: #34553
when rootcoord trigger graceful stop progress, it will block until all
rpc finished. for create collection request, rootcoord need to block
until datacoord finish to watch all channels, but datacoord need to call
`rootcoord.Alloc` during watch channel, and rootcoord doesn't respond to
new request anymore. which cause create collection stucks, and graceful
stop progress stucks.
This PR remove the func call `rootcoord.Alloc` to solve the logic dead
lock during graceful stop progress.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #33285
- using streaming service in insert/upsert/flush/delete/querynode
- fixup flusher bugs and refactor the flush operation
- enable streaming service for dml and ddl
- pass the e2e when enabling streaming service
- pass the integration tst when enabling streaming service
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Related to #28861
Move allocator interface and implementation into separate package. Also
update some unittest logic.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Log will be confusing when `Reassign` channel operation failed for both
success & failure log will be printed in row. This PR continue the loop
to avoid this output.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #33062
This PR:
- Add `lock.RWMutex` & `lock.Mutex` alias to switch implementation based
on build flags
- When build flags has `test` in it, use `go-deadlock` to detect
possible deadlocks
- Replace all `sync.RWMutex` & `sync.Mutex` in datacoord pkg
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to https://github.com/milvus-io/milvus/issues/32165
1. nodeid based channel store access should use map access instead of
iteration.
2. The join-ish functions calls are slow when # collections/segments
increases (e.g. 10k).
e.g.
getNumRowsOfCollectionUnsafe is O(num_segments); GetAllCollectionNumRows
is of O(num_collections*num_segments).
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
See also #32165
`ChannelManager.Match` is a frequent operation for datacoord. When the
collection number is large, iteration over all channels will cost lots
of CPU time and time consuming.
This PR change the data structure storing datanode-channel info to map
avoiding this iteration when checking channel existence.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #32029
lack of logic to clean collection info in datacoord's meta, This PR
clean collection info after drop channel, to avoid collection info leak
in datacoord
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
THe results don't meet our requirements, and the code hasn't been
maintained for a long time.
See also: #29447
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
The original GetVChanPositions are specialized for QueryCoord,
the recovery info needed by DataNode shouldn't take `indexed` state
into account.
This PR splits GetVChanPositions into GetDataVChanPositions and
GetQueryVChanPositions.
See also: #19653
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Signed-off-by: yangxuan <xuan.yang@zilliz.com>