issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
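For illustration only, sorting a segment's rows by primary key can be sketched as below. `Row` and `SortRowsByPK` are hypothetical names, not the types or functions used by the stats task in indexnode, and the sketch assumes an int64 primary key held fully in memory.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a hypothetical stand-in for one record of a segment.
type Row struct {
	PK      int64
	Payload string
}

// SortRowsByPK returns a copy of rows ordered by ascending primary key.
// A stable sort keeps the original order of rows that share a key.
func SortRowsByPK(rows []Row) []Row {
	sorted := make([]Row, len(rows))
	copy(sorted, rows)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].PK < sorted[j].PK
	})
	return sorted
}

func main() {
	rows := []Row{{PK: 3, Payload: "c"}, {PK: 1, Payload: "a"}, {PK: 2, Payload: "b"}}
	for _, r := range SortRowsByPK(rows) {
		fmt.Println(r.PK, r.Payload)
	}
}
```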
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #34304
Cosine is more widely used for float vectors, and both cosine and Hamming
distance are 'metrics' with good geometric properties.
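As a reminder of what the two measures compute, here is a minimal sketch. `CosineSimilarity` and `HammingDistance` are illustrative helpers written for this note, not Milvus or knowhere APIs; the Hamming helper assumes both binary vectors have the same length.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// CosineSimilarity returns dot(a, b) / (|a| * |b|) for two float vectors.
// It returns 0 if the lengths differ or either vector has zero norm.
func CosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// HammingDistance counts the differing bits of two equal-length binary vectors.
func HammingDistance(a, b []byte) int {
	n := 0
	for i := range a {
		n += bits.OnesCount8(a[i] ^ b[i])
	}
	return n
}

func main() {
	fmt.Println(CosineSimilarity([]float32{1, 0}, []float32{0, 1}))
	fmt.Println(HammingDistance([]byte{0b1010}, []byte{0b0110}))
}
```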
Signed-off-by: chasingegg <chao.gao@zilliz.com>
Related to https://github.com/milvus-io/milvus/issues/32165
1. Channel store access keyed by node ID should use a map lookup instead of
iteration.
2. The join-like function calls become slow as the number of
collections/segments increases (e.g. 10k).
For example, getNumRowsOfCollectionUnsafe is O(num_segments), and
GetAllCollectionNumRows is O(num_collections*num_segments).
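One way to avoid the repeated scans, shown here only as a sketch, is to keep a per-collection running total that is updated when a segment is added or changed. `Segment` and `RowCounter` are hypothetical types for this example and do not mirror the datacoord meta structures.

```go
package main

import "fmt"

// Segment is a hypothetical, simplified segment record.
type Segment struct {
	ID           int64
	CollectionID int64
	NumRows      int64
}

// RowCounter keeps a running row total per collection so that a lookup
// is a single map access instead of a scan over all segments.
type RowCounter struct {
	segments         map[int64]Segment
	rowsByCollection map[int64]int64
}

func NewRowCounter() *RowCounter {
	return &RowCounter{
		segments:         make(map[int64]Segment),
		rowsByCollection: make(map[int64]int64),
	}
}

// SetSegment adds or replaces a segment and adjusts the cached total.
func (c *RowCounter) SetSegment(s Segment) {
	if old, ok := c.segments[s.ID]; ok {
		c.rowsByCollection[old.CollectionID] -= old.NumRows
	}
	c.segments[s.ID] = s
	c.rowsByCollection[s.CollectionID] += s.NumRows
}

// NumRows is O(1) per collection.
func (c *RowCounter) NumRows(collectionID int64) int64 {
	return c.rowsByCollection[collectionID]
}

// AllNumRows is O(num_collections): it copies the cached totals.
func (c *RowCounter) AllNumRows() map[int64]int64 {
	out := make(map[int64]int64, len(c.rowsByCollection))
	for id, n := range c.rowsByCollection {
		out[id] = n
	}
	return out
}

func main() {
	c := NewRowCounter()
	c.SetSegment(Segment{ID: 1, CollectionID: 100, NumRows: 10})
	c.SetSegment(Segment{ID: 2, CollectionID: 100, NumRows: 5})
	c.SetSegment(Segment{ID: 3, CollectionID: 200, NumRows: 7})
	fmt.Println(c.NumRows(100), c.NumRows(200))
}
```

The trade-off is that every write path must keep the cached totals in sync, in exchange for cheap reads.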
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built
Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
3. Enhanced input file length check for binlog import.
4. Removal of duplicated manager in datanode.
5. Renaming of executor to scheduler in datanode.
6. Utilization of a thread pool in the scheduler in datanode.
issue: https://github.com/milvus-io/milvus/issues/28521
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #31662, #31409
During FilterIndexedSegment in GetRecoveryInfo, the index meta's read lock
is acquired once for every segment. When a collection has thousands of
segments, this may block for more than 10 seconds or even longer,
because `AddSegmentIndex` may also be triggered frequently and tries to
acquire the write lock.
This PR avoids acquiring the index meta's lock for each segment.
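The general idea can be sketched as follows: take the read lock once for the whole batch rather than once per segment. This is an assumption about the approach, not the actual change; `indexMeta` and `FilterIndexed` below are hypothetical and much simpler than the real index meta.

```go
package main

import (
	"fmt"
	"sync"
)

// indexMeta is a hypothetical, simplified index meta guarded by an RWMutex.
type indexMeta struct {
	mu      sync.RWMutex
	indexed map[int64]bool // segment ID -> index built
}

// MarkIndexed takes the write lock, similar in spirit to a writer such as
// AddSegmentIndex competing with readers.
func (m *indexMeta) MarkIndexed(segmentID int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.indexed[segmentID] = true
}

// FilterIndexed acquires the read lock once for the whole batch, so a
// collection with thousands of segments costs one lock acquisition
// instead of one per segment.
func (m *indexMeta) FilterIndexed(segmentIDs []int64) []int64 {
	m.mu.RLock()
	defer m.mu.RUnlock()
	result := make([]int64, 0, len(segmentIDs))
	for _, id := range segmentIDs {
		if m.indexed[id] {
			result = append(result, id)
		}
	}
	return result
}

func main() {
	m := &indexMeta{indexed: make(map[int64]bool)}
	m.MarkIndexed(1)
	m.MarkIndexed(3)
	fmt.Println(m.FilterIndexed([]int64{1, 2, 3, 4}))
}
```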
Signed-off-by: Wei Liu <wei.liu@zilliz.com>