issue: #38715
pr: #38770
- Currently, Milvus uses the serialized (compressed) index size to
estimate the resources needed for loading.
- Add a new index field, MemSize (the size before compression), and use
it to estimate resources.
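A minimal sketch of the estimation change, assuming a hypothetical
`IndexInfo` struct. Only the `MemSize` name comes from this change; the
`SerializedSize` field and the fallback for indexes that do not record
`MemSize` are assumptions for illustration:

```go
package main

import "fmt"

// IndexInfo is a hypothetical stand-in for the index metadata.
type IndexInfo struct {
	SerializedSize uint64 // size of the compressed, serialized index
	MemSize        uint64 // size before compression; 0 if not recorded
}

// EstimateLoadSize returns the number of bytes to reserve when loading.
// It prefers MemSize and falls back to the serialized size when MemSize
// was not recorded (assumed behavior, not confirmed by the source).
func EstimateLoadSize(info IndexInfo) uint64 {
	if info.MemSize > 0 {
		return info.MemSize
	}
	return info.SerializedSize
}

func main() {
	withMemSize := IndexInfo{SerializedSize: 100 << 20, MemSize: 400 << 20}
	legacy := IndexInfo{SerializedSize: 100 << 20}
	fmt.Println(EstimateLoadSize(withMemSize), EstimateLoadSize(legacy))
}
```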
---------
Signed-off-by: chyezh <chyezh@outlook.com>
issue: #37579
If the schema includes large varchar fields, a few thousand rows can
reach hundreds of MB in size. Therefore, if the batch size of the
segment writer is large, it will produce relatively large `binlogs`,
which can cause datanode to run out of memory (OOM) during compaction.
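To illustrate why a row-count batch is risky with large rows, the sketch
below also bounds each batch by accumulated bytes. `sizeBoundedWriter`
and its limits are hypothetical and are not the segment writer's actual
API:

```go
package main

import "fmt"

// sizeBoundedWriter is a hypothetical writer that buffers rows and
// flushes once either the row limit or the byte limit is reached.
type sizeBoundedWriter struct {
	maxRows  int
	maxBytes int
	rows     [][]byte
	bytes    int
	flushed  []int // byte size of every flushed batch, kept for inspection
}

// Write buffers one row and flushes when a limit is hit.
func (w *sizeBoundedWriter) Write(row []byte) {
	w.rows = append(w.rows, row)
	w.bytes += len(row)
	if len(w.rows) >= w.maxRows || w.bytes >= w.maxBytes {
		w.Flush()
	}
}

// Flush records the size of the current batch and resets the buffer.
func (w *sizeBoundedWriter) Flush() {
	if len(w.rows) == 0 {
		return
	}
	w.flushed = append(w.flushed, w.bytes)
	w.rows, w.bytes = nil, 0
}

func main() {
	w := &sizeBoundedWriter{maxRows: 10000, maxBytes: 1 << 20} // 1 MiB cap
	row := make([]byte, 64<<10)                                // one 64 KiB row
	for i := 0; i < 100; i++ {
		w.Write(row)
	}
	w.Flush() // flush the remainder
	fmt.Println(len(w.flushed), "batches, first batch is", w.flushed[0], "bytes")
}
```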
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Add native support for Google Cloud Storage using the Google Cloud
Storage libraries. Authentication is performed using a GCS service
account credentials JSON file.
Currently, Milvus supports Google Cloud Storage using S3-compatible APIs
via the AWS SDK. This approach has the following limitations:
1. Overhead: Translating requests between S3-compatible APIs and GCS can
introduce additional overhead.
2. Compatibility Limitations: Some features of the original S3 API may
not fully translate or work as expected with GCS.
This enhancement is needed to address these limitations.
Related Issue: #36212
issue: #33744
This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.
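The core of the stats task, sorting a segment's rows by primary key, can
be sketched as follows. The `row` type and the `sortByPK` helper are
hypothetical and only illustrate the ordering, not the indexnode
implementation:

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical stand-in for one entity in a segment.
type row struct {
	PK      int64
	Payload string
}

// sortByPK returns a copy of rows sorted ascending by primary key.
// The sort is stable, so rows with equal keys keep their input order.
func sortByPK(rows []row) []row {
	sorted := make([]row, len(rows))
	copy(sorted, rows)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].PK < sorted[j].PK
	})
	return sorted
}

func main() {
	segment := []row{{42, "c"}, {7, "a"}, {19, "b"}}
	for _, r := range sortByPK(segment) {
		fmt.Println(r.PK, r.Payload)
	}
}
```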
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
See also #33561
This PR:
- Uses zero copy when buffering insert messages
- Makes `storage.InsertCodec` support serializing multiple insert data
chunks into the same batch of binlog files
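A rough sketch of the idea, assuming hypothetical `chunk`, `buffer`, and
`serialize` names that do not reflect the real `storage.InsertCodec`
API: the buffer keeps references to incoming chunks instead of copying
them into one merged slice, and the serializer walks all chunks to
produce a single output.

```go
package main

import (
	"bytes"
	"fmt"
)

// chunk is a hypothetical stand-in for the data of one insert message.
type chunk struct{ values []int64 }

// buffer holds pointers to chunks, so buffering copies no row data.
type buffer struct{ chunks []*chunk }

// Buffer appends a reference to c; the values are not copied.
func (b *buffer) Buffer(c *chunk) { b.chunks = append(b.chunks, c) }

// serialize writes every buffered chunk into one output, standing in
// for a single batch of binlog files.
func serialize(chunks []*chunk) []byte {
	var out bytes.Buffer
	for _, c := range chunks {
		for _, v := range c.values {
			fmt.Fprintf(&out, "%d\n", v)
		}
	}
	return out.Bytes()
}

func main() {
	b := &buffer{}
	b.Buffer(&chunk{values: []int64{1, 2}})
	b.Buffer(&chunk{values: []int64{3}})
	fmt.Print(string(serialize(b.chunks)))
}
```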
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Adding the collection ID to the index node log lets you associate an
index building task with a specific collection.
If host CPU usage is too high because of an index build, you can use the
collection ID to quickly locate the collection responsible, which speeds
up troubleshooting.
Signed-off-by: dengxiaohai <rolkdengxiaohai@didiglobal.com>
Co-authored-by: dengxiaohai <rolkdengxiaohai@didiglobal.com>
issue: #19095, #29655, #31718
- Replace `ListWithPrefix` with `WalkWithPrefix` for OOS so that listing
runs in pipeline mode.
- Perform file garbage collection in a separate goroutine.
- Segment index recycling now cleans up index files too.
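The pipeline idea can be sketched as below: a walk function hands keys
to a callback one at a time, and a separate goroutine consumes them, so
the full listing is never held in memory. `walkWithPrefix` here is a
hypothetical in-memory stand-in, not the real chunk manager signature,
and `collectGarbage` is an illustrative helper.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// walkWithPrefix calls fn for each key that has the prefix and stops
// early when fn returns false. The keys slice stands in for object storage.
func walkWithPrefix(keys []string, prefix string, fn func(key string) bool) {
	for _, k := range keys {
		if !strings.HasPrefix(k, prefix) {
			continue
		}
		if !fn(k) {
			return
		}
	}
}

// collectGarbage streams garbage keys to a remover goroutine while the
// walk is still in progress, and returns the keys that were removed.
func collectGarbage(keys []string, prefix string, isGarbage func(string) bool) []string {
	ch := make(chan string)
	var removed []string
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for k := range ch {
			removed = append(removed, k) // a real remover would delete the object here
		}
	}()
	walkWithPrefix(keys, prefix, func(k string) bool {
		if isGarbage(k) {
			ch <- k
		}
		return true // keep walking
	})
	close(ch)
	wg.Wait() // removed is safe to read once the goroutine has exited
	return removed
}

func main() {
	keys := []string{"files/index/1/a", "files/insert_log/1/b", "files/index/2/c"}
	live := map[string]bool{"files/index/2/c": true}
	removed := collectGarbage(keys, "files/index/", func(k string) bool { return !live[k] })
	fmt.Println(removed)
}
```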
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Add sparse float vector support to the relevant Milvus components,
including the proxy and data node to receive sparse float vectors and
write them to binlog, the query node to handle search requests, and the
index node to build indexes for sparse float columns.
https://github.com/milvus-io/milvus/issues/29419
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
1. Add a coordinator graceful stop timeout of 5s.
2. Change the order in which datacoord components are stopped.
3. Change the querynode graceful stop timeout to 900s; we should
consider lowering this to 600s once graceful stop is smooth.
issue: #30310
also see pr: #30306
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Don't store logPath in meta, to reduce memory usage. When a service gets
the segment info, generate the logPath from the logID.
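A minimal sketch of rebuilding a path from IDs on demand. The
`buildInsertLogPath` helper and the directory layout it uses are
assumptions for illustration and may not match the real layout:

```go
package main

import (
	"fmt"
	"path"
)

// buildInsertLogPath reconstructs a binlog path from IDs, so only the
// numeric logID needs to be kept in meta. The layout is assumed.
func buildInsertLogPath(root string, collectionID, partitionID, segmentID, fieldID, logID int64) string {
	return path.Join(
		root, "insert_log",
		fmt.Sprint(collectionID),
		fmt.Sprint(partitionID),
		fmt.Sprint(segmentID),
		fmt.Sprint(fieldID),
		fmt.Sprint(logID),
	)
}

func main() {
	fmt.Println(buildInsertLogPath("files", 1, 2, 3, 100, 9))
}
```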
#28885
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>