Commit Graph

29 Commits (2.5)

Author SHA1 Message Date
wei liu b08d9efe69
fix: Prevent delegator unserviceable due to shard leader change (#42689) (#43309)
issue: #42098 #42404
pr: #42689
Fix critical issue where concurrent balance segment and balance channel
operations cause delegator view inconsistency. When shard leader
switches between load and release phases of segment balance, it results
in loading segments on old delegator but releasing on new delegator,
making the new delegator unserviceable.

The root cause is that balance segment modifies delegator views, and if
these modifications happen on different delegators due to leader change,
it corrupts the delegator state and affects query availability.

Changes include:
- Add shardLeaderID field to SegmentTask to track delegator for load
- Record shard leader ID during segment loading in move operations
- Skip release if shard leader changed from the one used for loading
- Add comprehensive unit tests for leader change scenarios

This ensures balance segment operations are atomic on single delegator,
preventing view corruption and maintaining delegator serviceability.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-15 17:46:51 +08:00
groot f2774e3c5b
enhance: [2.5] bulkinsert handles nullable/default (#42072)
issue: https://github.com/milvus-io/milvus/issues/42096,
https://github.com/milvus-io/milvus/issues/42130
pr: https://github.com/milvus-io/milvus/pull/42127

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-06-10 11:50:35 +08:00
aoiasd bb562c6a7e
fix:[2.5] analyzer memory leak because function runner not close (#41840)
relate: https://github.com/milvus-io/milvus/issues/41213
pr:https://github.com/milvus-io/milvus/pull/41839

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-15 15:48:23 +08:00
aoiasd 8af350d9db
fix: [2.5] bulk insert should use function runner's input field list instead schema's (#41561)
relate: https://github.com/milvus-io/milvus/issues/41213
pr: https://github.com/milvus-io/milvus/pull/41560

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-27 22:16:40 +08:00
congqixia 709594f158
enhance: [2.5] Use v2 package name for pkg module (#40117)
Cherry-pick from master
pr: #39990
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-23 00:46:01 +08:00
zhenshan.cao 9918e1008d
fix: Fix import failed due to 0 row num (#39887) (#39904)
issue: https://github.com/milvus-io/milvus/issues/39885

pr: https://github.com/milvus-io/milvus/pull/39886

---------

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
Co-authored-by: yihao.dai <yihao.dai@zilliz.com>
2025-02-17 01:36:15 +08:00
Zhen Ye 95809ca767
enhance: make new go package to manage proto (#39128)
issue: #39095
pr: #39114

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:53:01 +08:00
aoiasd 6fa096eb39
fix:[Cherry-pick] bm25 import segment loss stats (#38881)
relate: https://github.com/milvus-io/milvus/issues/38854
pr: https://github.com/milvus-io/milvus/pull/38855

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-12-31 19:24:54 +08:00
jaime 9d16b972ea
feat: add tasks page into management WebUI (#37002)
issue: #36621

1. Add API to access task runtime metrics, including:
  - build index task
  - compaction task
  - import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
  - sync task
2. Add a debug model to the webpage by using debug=true or debug=false
in the URL query parameters to enable or disable debug mode.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-28 10:13:29 +08:00
Buqian Zheng 82c5cf2fa2
feat: add bulk insert support for Functions (#36715)
issue: https://github.com/milvus-io/milvus/issues/35853 and
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-12 17:19:20 +08:00
yihao.dai 80f25d497f
enhance: Add metrics to monitor import throughput and imported rows (#36519)
issue: https://github.com/milvus-io/milvus/issues/36518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-28 17:31:15 +08:00
aoiasd 139787371e
feat: support embedding bm25 sparse vector and flush bm25 stats log (#36036)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-09-19 10:57:12 +08:00
yihao.dai 9868fe4e6c
fix: Fix panic due to empty candidate import segments (#35673)
issue: https://github.com/milvus-io/milvus/issues/35662

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-27 17:08:59 +08:00
congqixia ab532ae199
enhance: Add back BF lazy load logic for datanode watch channel (#35646)
Add back lazy loading statslog when watch dml channel on datanode.

Related to #22994 #27675

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-22 19:42:57 +08:00
zhenshan.cao aa247f192d
enhance: remove unused code for StorageV2 (#35132)
issue: https://github.com/milvus-io/milvus/issues/34168

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
yihao.dai 8aab6cbfac
enhance: Organize the common modules of streamingNode and dataNode (#34773)
1. Move the common modules of streamingNode and dataNode to flushcommon
2. Add new GetVChannels interface for rootcoord

issue: https://github.com/milvus-io/milvus/issues/33285

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-22 11:33:51 +08:00
yihao.dai 4e5f1d5f75
enhance: Pre-allocate ids for import (#33958)
The import is dependent on syncTask, which in turn relies on the
allocator. This PR pre-allocate the necessary IDs for import syncTask.

issue: https://github.com/milvus-io/milvus/issues/33957

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-07 21:26:14 +08:00
congqixia 512ea6be5f
enhance: Avoid merging insert data when buffering insert msgs (#33562)
See also #33561

This PR:
- Use zero copy when buffering insert messages
- Make `storage.InsertCodec` support serialize multiple insert data
chunk into same batch binlog files

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-13 11:15:56 +08:00
yihao.dai 3540eee977
enhance: Support L0 import (#33514)
issue: https://github.com/milvus-io/milvus/issues/33157

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-07 14:17:20 +08:00
yihao.dai bbdf99a45e
fix: Fix import segment size is uneven (#33605)
The data coordinator computed the appropriate number of import segments,
thus when importing in the data node, one can randomly select a segment.

issue: https://github.com/milvus-io/milvus/issues/33604

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 15:41:51 +08:00
yihao.dai 558feed5ed
fix: Use pk from binlog during import (#32118)
During binlog import, even if the primary key's autoID is set to true,
the primary key from the binlog should be used instead of being
reassigned.

issue: https://github.com/milvus-io/milvus/discussions/31943,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-16 14:51:20 +08:00
yihao.dai 31cf849f68
enhance: Support retriving file size from importutilv2.Reader (#31533)
To reduce the overhead caused by listing the S3 objects, add an
interface to importutil.Reader to retrieve file sizes.

issue: https://github.com/milvus-io/milvus/issues/31532,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-25 20:29:07 +08:00
yihao.dai f65a796d18
enhance: Add max file num limit and max file size limit for import (#31497)
The max number of import files per request should not exceed 1024 by
default (configurable).
The import file size allowed for importing should not exceed 16GB by
default (configurable).

issue: https://github.com/milvus-io/milvus/issues/28521

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-22 18:13:06 +08:00
yihao.dai 776709e5ff
fix: Fix binlog import (#31310)
Fix binlog import functionality by removing the existing check and
refining the size retrieval process.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-17 20:59:04 +08:00
yihao.dai b5c67948b7
enhance: Enhance and modify the return content of ImportV2 (#31192)
1. The Import APIs now provide detailed progress information for each
imported file, including details such as file name, file size, progress,
and more.
2. The APIs now return the collection name and the completion time.
3. Other modifications include changing jobID to jobId and other similar
adjustments.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-13 19:51:03 +08:00
yihao.dai a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces novel managerial roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To ascertain the completion of all tasks and instigate
relevant operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
yihao.dai 18b979d9b4
enhance: Extend support for varchar autoID to BulkInsertV2 (#30477)
issue: https://github.com/milvus-io/milvus/issues/30476

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-04 16:57:05 +08:00
yihao.dai 7ce876a072
fix: Decoupling importing segment from flush process (#30402)
This pr decoups importing segment from flush process by:
1. Exclude the importing segment from the flush policy, this approch
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord call Flush, DataCoord directly set the importing
segment state to `Flushed`.

issue: https://github.com/milvus-io/milvus/issues/30359

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-03 13:01:12 +08:00
yihao.dai c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks, a import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00