This PR changes the following to speed up L0 compaction and
prevent OOM:
1. Lower deltabuf limit to 16MB by default, so that each L0 segment
would be 4X smaller than before.
2. Add BatchProcess, use it if memory is sufficient
3. Iterator will Deserialize when called HasNext to avoid massive memory
peek
4. Add tracing in spiltDelta
See also: #30191
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks, a import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;
issue: https://github.com/milvus-io/milvus/issues/28521
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
See also #27675
This PR adds back MemoryHighSyncPolicy implementation. Also change
MinSegmentSize & CheckInterval to configurable param item.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #25639
/kind improvement
When the number of vector columns increases, the number of rows per
segment will decrease. In order to reduce the impact on vector indexing
performance, it is necessary to increase the segment max limit.
If a collection has multiple vector fields with memory and disk indices
on different vector fields, the size limit after segment compaction is
the minimum of segment.maxSize and segment.diskSegmentMaxSize.
Signed-off-by: xige-16 <xi.ge@zilliz.com>
---------
Signed-off-by: xige-16 <xi.ge@zilliz.com>
See also #27675
After supporting control sync parallel in datanode globally, the shall
change default value to a more suitable value for most use cases.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.
issue: https://github.com/milvus-io/milvus/issues/30181
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #29709#291712
to avoid concurrent recursive RLock and Lock cause deadlock, This PR
remove the unnecessary lock in config manager
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
due to `clientMaxSendSize` and `serverMaxRecvSize` will limit the rpc
request size limit, they should use same config value, and
`serverMaxSendSize` and `clientMaxRecvSize` will limit the rpc response
size limit, they should use same config value too.
This PR fix unexpected rpc msg limit which caused by the wrong usage of
misunderstanding rpc config items
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #29625
This PR:
- Add a new implemention of `DeleteBuffer`: listDeleteBuffer
- holds cacheBlock slice
- `Put` method append new delete data into last block
- when a block is full, append a new block into the list
- Add `TryDiscard` method for `DeleteBuffer` interface
- For doubleCacheBuffer, do nothing
- For listDeleteBuffer, try to evict "old" blocks, which are blocks
before the first block whose start ts is behind provided ts
- Add checkpoint field for `UpdateVersion` sync action, which shall be
used to discard old cache delete block
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue:https://github.com/milvus-io/milvus/issues/29230
this pr do two things about cagra index:
a.milvus yaml config support gpu memory settings
b.add cagra-params check
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
Co-authored-by: yusheng.ma <yusheng.ma@zilliz.com>
issue: #28960 [milvus-proto
#212](https://github.com/milvus-io/milvus-proto/issues/212)
add new configuration: builtinRoles
user can define roles in config file: `milvus.yaml`
there is an example:
1. db_ro, only have read privileges, include load
2. db_rw, read and write privileges, include create/drop/rename
collection
3. db_admin, not only read and write privileges, but also user
administration
Signed-off-by: PowderLi <min.li@zilliz.com>
See also #29223
This PR make `conc.Pool` resizable by adding `Resize` method for it.
Also make newly added datanode `MaxParallelSyncMgrTasks` config
refreshable
---------
Signed-off-by: Congqi.Xia <congqi.xia@zilliz.com>
Since the sync manager is global in datanode now, the old
`maxParallelSyncTaskNum` does not fit into current implementation
anymore.
This PR add a new param item for sync mgr parallel control and enlarge
default value
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #29177
Add a config item for partition name as regexp feature and disable it by
default
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>