Commit Graph

57 Commits (2.5)

Author SHA1 Message Date
wei liu b08d9efe69
fix: Prevent delegator unserviceable due to shard leader change (#42689) (#43309)
issue: #42098 #42404
pr: #42689
Fix a critical issue where concurrent balance segment and balance channel
operations cause delegator view inconsistency. When the shard leader
switches between the load and release phases of a segment balance, segments
are loaded on the old delegator but released on the new delegator,
making the new delegator unserviceable.

The root cause is that balance segment modifies delegator views, and if
these modifications happen on different delegators due to leader change,
it corrupts the delegator state and affects query availability.

Changes include:
- Add shardLeaderID field to SegmentTask to track delegator for load
- Record shard leader ID during segment loading in move operations
- Skip release if shard leader changed from the one used for loading
- Add comprehensive unit tests for leader change scenarios

This ensures balance segment operations are atomic on single delegator,
preventing view corruption and maintaining delegator serviceability.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-15 17:46:51 +08:00
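The leader check described in the commit above (#43309) can be illustrated with a minimal Go sketch. It assumes a trimmed-down `SegmentTask`; only the `shardLeaderID` field name comes from the commit message, while `RecordLoadLeader` and `ShouldRelease` are hypothetical helpers and not the actual querycoord API.

```go
package main

import "fmt"

// SegmentTask is a hypothetical, trimmed-down stand-in for the real task type.
type SegmentTask struct {
	SegmentID     int64
	shardLeaderID int64 // delegator node that served the load step
}

// RecordLoadLeader remembers which delegator the load step ran against.
func (t *SegmentTask) RecordLoadLeader(leaderID int64) {
	t.shardLeaderID = leaderID
}

// ShouldRelease reports whether the release step may run: only when the
// current shard leader is the same delegator that handled the load.
func (t *SegmentTask) ShouldRelease(currentLeaderID int64) bool {
	return t.shardLeaderID == currentLeaderID
}

func main() {
	task := &SegmentTask{SegmentID: 1001}
	task.RecordLoadLeader(1)

	fmt.Println(task.ShouldRelease(1)) // leader unchanged: release proceeds
	fmt.Println(task.ShouldRelease(2)) // leader changed: release is skipped
}
```

Keeping the leader ID on the task itself lets the release step decide locally, without consulting any state recorded outside the task.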
cai.zhang 0a62d6d509
enhance: Add Size interface to FileReader to eliminate the StatObject call during Read (#42911)
issue: #42907 

master pr: #42908

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-25 15:26:42 +08:00
yihao.dai f978641d6a
enhance: [2.5] Enhance import integration tests and logs (#42696)
1. Optimize the import process: skip subsequent steps and mark the task
as complete if the number of imported rows is 0.
2. Improve import integration tests:
a. Add a test to verify that autoIDs are not duplicated
b. Add a test for the corner case where all data is deleted
c. Shorten test execution time
3. Enhance import logging:
a. Print imported segment information upon completion
b. Include file name in failure logs

issue: https://github.com/milvus-io/milvus/issues/42488,
https://github.com/milvus-io/milvus/issues/42518

pr: https://github.com/milvus-io/milvus/pull/42612

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-16 20:06:38 +08:00
groot f2774e3c5b
enhance: [2.5] bulkinsert handles nullable/default (#42072)
issue: https://github.com/milvus-io/milvus/issues/42096,
https://github.com/milvus-io/milvus/issues/42130
pr: https://github.com/milvus-io/milvus/pull/42127

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-06-10 11:50:35 +08:00
yihao.dai 28aa364bf7
enhance: [2.5] Adjust default import buffer size (#42542)
Increase insert buffer size from 16MB to 64MB, while keeping delete
buffer size at 16MB.

issue: https://github.com/milvus-io/milvus/issues/42518

pr: https://github.com/milvus-io/milvus/pull/42541

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-05 18:46:33 +08:00
yihao.dai fdfb78b9e5
fix: [2.5] Fix duplicate autoID between import and insert (#42520)
Remove the unlimited logID mechanism and switch to redundantly
allocating a large number of IDs.

issue: https://github.com/milvus-io/milvus/issues/42518

pr: https://github.com/milvus-io/milvus/pull/42519

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-05 00:54:33 +08:00
yihao.dai 83ca664150
fix: [2.5] Fix import slot assignment (#41982)
Assign the import task to the worker with the most available slots, even
if availableSlots < requiredSlots. This ensures tasks won’t be blocked
indefinitely.

issue: https://github.com/milvus-io/milvus/issues/41981

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-23 01:36:30 +08:00
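The slot assignment rule from the commit above (#41982) might look roughly like the following Go sketch. The `worker` type and `pickWorker` function are hypothetical names for illustration; the real scheduler tracks more state than this.

```go
package main

import "fmt"

// worker is a hypothetical view of a DataNode's import capacity.
type worker struct {
	NodeID         int64
	AvailableSlots int
}

// pickWorker returns the worker with the most available slots. It does not
// require AvailableSlots >= the slots a task needs, so a task is never left
// waiting just because no worker has enough free slots. It returns false
// only when there are no workers at all.
func pickWorker(workers []worker) (worker, bool) {
	if len(workers) == 0 {
		return worker{}, false
	}
	best := workers[0]
	for _, w := range workers[1:] {
		if w.AvailableSlots > best.AvailableSlots {
			best = w
		}
	}
	return best, true
}

func main() {
	requiredSlots := 8
	workers := []worker{
		{NodeID: 1, AvailableSlots: 2},
		{NodeID: 2, AvailableSlots: 5},
		{NodeID: 3, AvailableSlots: 3},
	}
	w, ok := pickWorker(workers)
	fmt.Println(ok, w.NodeID, w.AvailableSlots < requiredSlots)
}
```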
yihao.dai 9b17108b50
fix: [2.5] Fix import reader goroutine leak (#41870)
Close the chunk manager's reader after the import completes to prevent
goroutine leaks.

issues: https://github.com/milvus-io/milvus/issues/41868

pr: https://github.com/milvus-io/milvus/pull/41869

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-15 22:20:23 +08:00
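As a rough illustration of the pattern behind the commit above (#41870), the Go sketch below closes a reader once the import finishes. `trackingReader` and `importFrom` are hypothetical stand-ins; the sketch does not reproduce the chunk manager or the leaked goroutine, only the close-on-completion habit.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// trackingReader is a hypothetical stand-in for a chunk manager reader.
// It records whether Close was called.
type trackingReader struct {
	io.Reader
	closed bool
}

func newTrackingReader(content string) *trackingReader {
	return &trackingReader{Reader: strings.NewReader(content)}
}

func (r *trackingReader) Close() error {
	r.closed = true
	return nil
}

// importFrom consumes the reader and always closes it before returning,
// on both the success and the error path.
func importFrom(r io.ReadCloser) (n int, err error) {
	defer func() {
		// Surface a Close error only if reading itself succeeded.
		if cerr := r.Close(); err == nil {
			err = cerr
		}
	}()
	data, err := io.ReadAll(r)
	if err != nil {
		return 0, err
	}
	return len(data), nil
}

func main() {
	r := newTrackingReader("row1\nrow2\n")
	n, err := importFrom(r)
	fmt.Println(n, err, r.closed)
}
```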
aoiasd bb562c6a7e
fix:[2.5] analyzer memory leak because function runner not close (#41840)
relate: https://github.com/milvus-io/milvus/issues/41213
pr:https://github.com/milvus-io/milvus/pull/41839

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-15 15:48:23 +08:00
yihao.dai a7c818cadb
fix: [2.5] Fix no candidate segments error for small import (#41772)
When autoID is enabled, the preimport task estimates row distribution by
evenly dividing the total row count (numRows) across all vchannels:
`estimatedCount = numRows / vchannelNum`.
However, the actual import task hashes the real auto-generated IDs to
determine the target vchannel. This mismatch can lead to inaccurate row
distribution estimates in corner cases such as:
- Importing 1 row into 2 vchannels:
  - Preimport: 1 / 2 = 0 → both v0 and v1 are estimated to have 0 rows
  - Import: the real autoID (e.g., 457975852966809057) hashes to v1
    → actual result: v0 = 0, v1 = 1

To resolve such corner cases, we now allocate at least one segment for
each vchannel when autoID is enabled, ensuring all vchannels are prepared
to receive data even if no rows are estimated for them.

issue: https://github.com/milvus-io/milvus/issues/41759

pr: https://github.com/milvus-io/milvus/pull/41771

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-14 10:36:22 +08:00
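The arithmetic in the commit above (#41772) can be sketched in Go. Only the formula `estimatedCount = numRows / vchannelNum` comes from the commit message; `segmentsFor` and its `rowsPerSegment` parameter are assumptions made for illustration.

```go
package main

import "fmt"

// estimateRows mirrors estimatedCount = numRows / vchannelNum.
// Integer division truncates, so 1 row over 2 vchannels estimates 0 each.
func estimateRows(numRows, vchannelNum int) int {
	return numRows / vchannelNum
}

// segmentsFor is a hypothetical helper: it returns how many segments to
// prepare for one vchannel. With autoID enabled it never returns less
// than 1, so every vchannel can accept rows that hash to it.
func segmentsFor(estimatedRows, rowsPerSegment int, autoID bool) int {
	n := (estimatedRows + rowsPerSegment - 1) / rowsPerSegment // ceiling division
	if autoID && n < 1 {
		n = 1
	}
	return n
}

func main() {
	est := estimateRows(1, 2)
	fmt.Println(est)                           // 0
	fmt.Println(segmentsFor(est, 1000, false)) // 0: no segment prepared
	fmt.Println(segmentsFor(est, 1000, true))  // 1: at least one segment
}
```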
aoiasd 8af350d9db
fix: [2.5] bulk insert should use function runner's input field list instead schema's (#41561)
relate: https://github.com/milvus-io/milvus/issues/41213
pr: https://github.com/milvus-io/milvus/pull/41560

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-04-27 22:16:40 +08:00
congqixia 709594f158
enhance: [2.5] Use v2 package name for pkg module (#40117)
Cherry-pick from master
pr: #39990
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-23 00:46:01 +08:00
zhenshan.cao 9918e1008d
fix: Fix import failed due to 0 row num (#39887) (#39904)
issue: https://github.com/milvus-io/milvus/issues/39885

pr: https://github.com/milvus-io/milvus/pull/39886

---------

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
Co-authored-by: yihao.dai <yihao.dai@zilliz.com>
2025-02-17 01:36:15 +08:00
Zhen Ye 95809ca767
enhance: make new go package to manage proto (#39128)
issue: #39095
pr: #39114

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:53:01 +08:00
aoiasd 6fa096eb39
fix:[Cherry-pick] bm25 import segment loss stats (#38881)
relate: https://github.com/milvus-io/milvus/issues/38854
pr: https://github.com/milvus-io/milvus/pull/38855

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-12-31 19:24:54 +08:00
jaime 29e620fa6d
fix: sync task still running after DataNode has stopped (#38377)
issue: #38319

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-17 18:06:44 +08:00
yihao.dai 43e0e2b7ed
fix: Fix empty import task result (#38316)
Ensure the idempotency of import tasks to prevent duplicate tasks in
DataNode.

issue: https://github.com/milvus-io/milvus/issues/38313

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-11 15:42:49 +08:00
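One common way to get the idempotency mentioned in the commit above (#38316) is to make task registration a no-op for task IDs that are already known. The Go sketch below assumes a hypothetical `taskManager`; it is not the DataNode's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// taskManager is a hypothetical registry of import tasks keyed by task ID.
type taskManager struct {
	mu    sync.Mutex
	tasks map[int64]struct{}
}

func newTaskManager() *taskManager {
	return &taskManager{tasks: make(map[int64]struct{})}
}

// Add registers a task and reports whether it was newly added.
// Repeating the call with the same ID changes nothing.
func (m *taskManager) Add(taskID int64) bool {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, exists := m.tasks[taskID]; exists {
		return false
	}
	m.tasks[taskID] = struct{}{}
	return true
}

// Len returns the number of registered tasks.
func (m *taskManager) Len() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.tasks)
}

func main() {
	m := newTaskManager()
	fmt.Println(m.Add(42)) // true: first registration
	fmt.Println(m.Add(42)) // false: a retried request is ignored
	fmt.Println(m.Len())   // 1
}
```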
congqixia b0bd290a6e
enhance: Use internal json(sonic) to replace std json lib (#37708)
Related to #35020

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-18 10:46:31 +08:00
congqixia 5e90f348fc
enhance: Handle legacy proxy load fields request (#37565)
Related to #35415

During a rolling upgrade, a legacy proxy may dispatch a load request with an
empty load field list. The upgraded querycoord may then mistakenly report an
error that the load field list has changed.

This PR:

- Auto-fills an empty load field list with all user field IDs
- Refines the error message when the load field list is updated
- Refines the load job unit tests with service cases

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-11 10:14:26 +08:00
sthuang 70605cf5b3
enhance: Support custom privilege group for RBAC (#37087)
issue: #37031

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2024-11-09 08:44:28 +08:00
congqixia 3106384fc4
enhance: Return deltadata for `DeleteCodec.Deserialize` (#37214)
Related to #35303 #30404

This PR changes the return type of `DeleteCodec.Deserialize` from
`storage.DeleteData` to `DeltaData`, which
reduces the memory usage of interface headers.

It also refines the `storage.DeltaData` methods to make them easier to use.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 12:04:24 +08:00
jaime 9d16b972ea
feat: add tasks page into management WebUI (#37002)
issue: #36621

1. Add API to access task runtime metrics, including:
  - build index task
  - compaction task
  - import task
  - balance (including load/release of segments/channels and some leader
    tasks on querycoord)
  - sync task
2. Add a debug mode to the web page: set debug=true or debug=false
in the URL query parameters to enable or disable it.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-28 10:13:29 +08:00
yihao.dai d7b2906318
enhance: Make dataNode.import.maxConcurrentTaskNum dynamic (#37102)
Resize the import execution pool when the config
`dataNode.import.maxConcurrentTaskNum` is updated.

issue: https://github.com/milvus-io/milvus/issues/37095

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 16:51:29 +08:00
Buqian Zheng 82c5cf2fa2
feat: add bulk insert support for Functions (#36715)
issue: https://github.com/milvus-io/milvus/issues/35853 and
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-10-12 17:19:20 +08:00
yihao.dai 0fc2a4aa53
enhance: Optimize import scheduling and add time cost metric (#36601)
1. Optimize import scheduling strategy:
a. Revise slot weights, calculating them based on the number of files
and segments for both import and pre-import tasks.
b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.

issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-09 14:41:20 +08:00
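The ordering rule in item 1b of the commit above (#36601) amounts to sorting pending tasks by ID before execution. A minimal Go sketch, assuming a hypothetical `task` type with only an ID:

```go
package main

import (
	"fmt"
	"sort"
)

// task is a hypothetical pending import or pre-import task.
type task struct {
	ID int64
}

// orderByID returns a copy of tasks sorted in ascending order of task ID,
// leaving the caller's slice untouched.
func orderByID(tasks []task) []task {
	out := append([]task(nil), tasks...)
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

func main() {
	pending := []task{{ID: 30}, {ID: 10}, {ID: 20}}
	for _, t := range orderByID(pending) {
		fmt.Println(t.ID)
	}
}
```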
yihao.dai 80f25d497f
enhance: Add metrics to monitor import throughput and imported rows (#36519)
issue: https://github.com/milvus-io/milvus/issues/36518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-28 17:31:15 +08:00
aoiasd 139787371e
feat: support embedding bm25 sparse vector and flush bm25 stats log (#36036)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-09-19 10:57:12 +08:00
CharlesFeng 6eb8b3f745
fix: err degenerated to a new variable (#35891)
https://github.com/milvus-io/milvus/issues/35890

Signed-off-by: fengjun2016 <jornfeng@gmail.com>
2024-09-04 14:57:04 +08:00
yihao.dai 9868fe4e6c
fix: Fix panic due to empty candidate import segments (#35673)
issue: https://github.com/milvus-io/milvus/issues/35662

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-27 17:08:59 +08:00
congqixia ab532ae199
enhance: Add back BF lazy load logic for datanode watch channel (#35646)
Add back lazy loading statslog when watch dml channel on datanode.

Related to #22994 #27675

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-22 19:42:57 +08:00
Ted Xu 41646c8439
feat: integrate new deltalog format (#35522)
See #34123

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-08-20 19:06:56 +08:00
zhenshan.cao aa247f192d
enhance: remove unused code for StorageV2 (#35132)
issue: https://github.com/milvus-io/milvus/issues/34168

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
yihao.dai 8aab6cbfac
enhance: Organize the common modules of streamingNode and dataNode (#34773)
1. Move the common modules of streamingNode and dataNode to flushcommon
2. Add new GetVChannels interface for rootcoord

issue: https://github.com/milvus-io/milvus/issues/33285

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-22 11:33:51 +08:00
yihao.dai ca758c36cc
enhance: Pre-allocate ids for compaction (#34187)
This PR removes the dependency of compaction on the ID allocator by
pre-allocating the logID and segmentID.

issue: https://github.com/milvus-io/milvus/issues/33957

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-17 13:23:42 +08:00
yihao.dai 4e5f1d5f75
enhance: Pre-allocate ids for import (#33958)
The import depends on syncTask, which in turn relies on the
allocator. This PR pre-allocates the necessary IDs for the import syncTask.

issue: https://github.com/milvus-io/milvus/issues/33957

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-07 21:26:14 +08:00
congqixia 962a5446f8
enhance: Add ctx in `SyncTask.Run` to be cancellable (#34042)
Related to #33716

This PR adds a context param to the SyncTask.Run execution functions to make
them cancellable from the caller.

This makes it possible to cancel a task when the datanode/data sync service
is being shut down.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-25 14:22:04 +08:00
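The cancellation pattern from the commit above (#34042) can be sketched with Go's standard `context` package. The `syncTask` type, its `steps` field, and `runWithCancel` are hypothetical; only the idea of passing a context into `Run` comes from the commit message.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// syncTask is a hypothetical task that performs a number of work steps.
type syncTask struct {
	steps int
}

// Run checks the context before every step and stops early with
// ctx.Err() once the caller has cancelled it.
func (t *syncTask) Run(ctx context.Context) error {
	for i := 0; i < t.steps; i++ {
		select {
		case <-ctx.Done():
			return ctx.Err()
		default:
		}
		time.Sleep(time.Millisecond) // stand-in for real work
	}
	return nil
}

// runWithCancel runs a small task, optionally cancelling before it starts,
// which mimics a service that is already shutting down.
func runWithCancel(cancelFirst bool) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	if cancelFirst {
		cancel()
	}
	return (&syncTask{steps: 3}).Run(ctx)
}

func main() {
	fmt.Println(runWithCancel(false))                              // <nil>
	fmt.Println(errors.Is(runWithCancel(true), context.Canceled)) // true
}
```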
congqixia 506a915272
fix: Deep copy ImportTask.segmentsInfo to prevent data race (#34090)
See also #34089

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-25 10:06:02 +08:00
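The deep-copy fix in the commit above (#34090) follows a familiar Go pattern: return copies of internal values so callers cannot mutate shared state. The sketch below borrows the `segmentsInfo` name from the commit title; the `segmentInfo` struct, its fields, and the accessor names are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// segmentInfo is a hypothetical per-segment import record.
type segmentInfo struct {
	SegmentID    int64
	ImportedRows int64
}

// importTask guards its internal map with a mutex.
type importTask struct {
	mu           sync.RWMutex
	segmentsInfo map[int64]*segmentInfo
}

func newImportTask(infos ...segmentInfo) *importTask {
	t := &importTask{segmentsInfo: make(map[int64]*segmentInfo)}
	for _, info := range infos {
		cp := info // give each map entry its own copy
		t.segmentsInfo[cp.SegmentID] = &cp
	}
	return t
}

// GetSegmentsInfo returns deep copies, so callers never share pointers
// with the task's internal map.
func (t *importTask) GetSegmentsInfo() []*segmentInfo {
	t.mu.RLock()
	defer t.mu.RUnlock()
	out := make([]*segmentInfo, 0, len(t.segmentsInfo))
	for _, info := range t.segmentsInfo {
		cp := *info
		out = append(out, &cp)
	}
	return out
}

// importedRows reads the internal value for one segment.
func (t *importTask) importedRows(segmentID int64) int64 {
	t.mu.RLock()
	defer t.mu.RUnlock()
	if info, ok := t.segmentsInfo[segmentID]; ok {
		return info.ImportedRows
	}
	return 0
}

func main() {
	task := newImportTask(segmentInfo{SegmentID: 1, ImportedRows: 10})
	snapshot := task.GetSegmentsInfo()
	snapshot[0].ImportedRows = 999 // mutates only the copy
	fmt.Println(task.importedRows(1)) // 10
}
```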
congqixia 512ea6be5f
enhance: Avoid merging insert data when buffering insert msgs (#33562)
See also #33561

This PR:
- Use zero copy when buffering insert messages
- Make `storage.InsertCodec` support serializing multiple insert data
chunks into the same batch of binlog files

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-13 11:15:56 +08:00
yihao.dai eb5d4de390
fix: Check if the import job exists (#33672)
issue: https://github.com/milvus-io/milvus/issues/33671

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-10 21:51:55 +08:00
yihao.dai 3540eee977
enhance: Support L0 import (#33514)
issue: https://github.com/milvus-io/milvus/issues/33157

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-07 14:17:20 +08:00
yihao.dai bbdf99a45e
fix: Fix import segment size is uneven (#33605)
The data coordinator already computes the appropriate number of import
segments, so the data node can pick a segment at random when importing.

issue: https://github.com/milvus-io/milvus/issues/33604

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 15:41:51 +08:00
aoiasd 387b7cd7f4
enhance:avoid maintain checkpoint info in sync manager (#33413)
relate: https://github.com/milvus-io/milvus/issues/32915

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-06-05 10:05:50 +08:00
yihao.dai 895799ec61
enhance: Abstract Execute interface for import/preimport task (#33234)
Abstract Execute interface for import/preimport task, simplify import
scheduler.

issue: https://github.com/milvus-io/milvus/issues/33157

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-23 11:29:41 +08:00
congqixia 2c1e8f4774
enhance: Use `struct{}` for sync task future result (#32673)
Related to #27675

Use `struct{}` instead of `error` for the sync task future result type to
reduce result size and prevent logic errors.

Also change some unused parameters to `_` to suppress lint warnings

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-29 10:59:26 +08:00
yihao.dai 558feed5ed
fix: Use pk from binlog during import (#32118)
During binlog import, even if the primary key's autoID is set to true,
the primary key from the binlog should be used instead of being
reassigned.

issue: https://github.com/milvus-io/milvus/discussions/31943,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-16 14:51:20 +08:00
yihao.dai aa96843d31
fix: Fix import hanging and improve logging output (#32166)
Fix import hanging when the previous import task failed, and improve
parquet import logging output.

issue: https://github.com/milvus-io/milvus/issues/31834

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-13 22:03:23 +08:00
yihao.dai 49d109de18
enhance: Use an individual buffer size parameter for imports (#31833)
Use an individual buffer size parameter for imports and set buffer size
to 64MB.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-08 21:07:18 +08:00
yihao.dai 4e264003bf
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
2. Enhanced input file length check for binlog import.
3. Removal of duplicated manager in datanode.
4. Renaming of executor to scheduler in datanode.
5. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:09:13 +08:00
yihao.dai 31cf849f68
enhance: Support retriving file size from importutilv2.Reader (#31533)
To reduce the overhead caused by listing the S3 objects, add an
interface to importutil.Reader to retrieve file sizes.

issue: https://github.com/milvus-io/milvus/issues/31532,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-25 20:29:07 +08:00
yihao.dai f65a796d18
enhance: Add max file num limit and max file size limit for import (#31497)
The number of import files per request should not exceed 1024 by
default (configurable).
The size of an import file should not exceed 16GB by
default (configurable).

issue: https://github.com/milvus-io/milvus/issues/28521

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-22 18:13:06 +08:00
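The two limits from the commit above (#31497) could be enforced with a check along these lines. This Go sketch is only illustrative: `checkImportFiles` and the constant names are hypothetical, and it assumes 16GB means 16 GiB and that the size limit applies per file.

```go
package main

import "fmt"

const (
	defaultMaxFileNum  = 1024            // default from the commit message
	defaultMaxFileSize = int64(16) << 30 // 16GB, assumed here to mean 16 GiB
)

// checkImportFiles rejects a request with too many files or with any
// single file larger than maxFileSize. Both limits are parameters because
// the commit describes them as configurable.
func checkImportFiles(fileSizes []int64, maxFileNum int, maxFileSize int64) error {
	if len(fileSizes) > maxFileNum {
		return fmt.Errorf("too many import files: %d > %d", len(fileSizes), maxFileNum)
	}
	for i, size := range fileSizes {
		if size > maxFileSize {
			return fmt.Errorf("import file %d is too large: %d > %d bytes", i, size, maxFileSize)
		}
	}
	return nil
}

func main() {
	fmt.Println(checkImportFiles([]int64{1 << 20, 2 << 20}, defaultMaxFileNum, defaultMaxFileSize))
	fmt.Println(checkImportFiles([]int64{defaultMaxFileSize + 1}, defaultMaxFileNum, defaultMaxFileSize))
}
```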