Commit Graph

12 Commits (master)

Author SHA1 Message Date
yihao.dai a29b3272b0
fix: Improve import memory management to prevent OOM (#43568)
1. Use blocking memory allocation to wait until memory becomes available
2. Perform memory allocation at the file level instead of per task
3. Limit Parquet file reader batch size to prevent excessive memory
consumption
4. Limit import buffer size from 20% to 10% of total memory

issue: https://github.com/milvus-io/milvus/issues/43387,
https://github.com/milvus-io/milvus/issues/43131

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-28 21:25:35 +08:00
yihao.dai 5124ed9758
fix: Fix import fileStats incorrectly set to nil (#43463)
1. Ensure that tasks in the InProgress state return valid fileStats.
2. Enhance import logs.

issue: https://github.com/milvus-io/milvus/issues/43387

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-22 12:37:01 +08:00
congqixia 5a9efb3f81
enhance: [StorageV2] Refine storage rw option usage & validation (#43175)
Related to #39173

This PR:
- Make all datanode task passes storage config via storage config option
- Remove legacy comments, rootPath & bucketName parameters
- Fix clustering compaction option behavior
- Add validation logic for `rwOptions`
- Use correct storageType from storageConfig
- Add storage config in sync task

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-11 01:14:48 +08:00
yihao.dai 9cbd194c6b
fix: Prevent import from generating small binlogs (#43132)
- Introduce dynamic buffer sizing to avoid generating small binlogs
during import
- Refactor import slot calculation based on CPU and memory constraints
- Implement dynamic pool sizing for sync manager and import tasks
according to CPU core count

issue: https://github.com/milvus-io/milvus/issues/43131

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-07 21:32:47 +08:00
yihao.dai 86876682da
enhance: Enhance import integration tests and logs (#42612)
1. Optimize the import process: skip subsequent steps and mark the task
as complete if the number of imported rows is 0.
2. Improve import integration tests:
 a. Add a test to verify that autoIDs are not duplicated
 b. Add a test for the corner case where all data is deleted
 c. Shorten test execution time
3. Enhance import logging:
 a. Print imported segment information upon completion
 b. Include file name in failure logs

issue: https://github.com/milvus-io/milvus/issues/42488,
https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-12 20:02:35 +08:00
yihao.dai 837349dead
enhance: Adjust default import buffer size (#42541)
Increase insert buffer size from 16MB to 64MB, while keeping delete
buffer size at 16MB.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-09 13:02:33 +08:00
congqixia cb7f2fa6fd
enhance: Use v2 package name for pkg module (#39990)
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-22 23:15:58 +08:00
Zhen Ye bb8d1ab3bf
enhance: make new go package to manage proto (#39114)
issue: #39095

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:49:01 +08:00
yihao.dai 0fc2a4aa53
enhance: Optimize import scheduling and add time cost metric (#36601)
1. Optimize import scheduling strategic:
a. Revise slot weights, calculating them based on the number of files
and segments for both import and pre-import tasks.
b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.

issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-09 14:41:20 +08:00
yihao.dai 4e5f1d5f75
enhance: Pre-allocate ids for import (#33958)
The import is dependent on syncTask, which in turn relies on the
allocator. This PR pre-allocate the necessary IDs for import syncTask.

issue: https://github.com/milvus-io/milvus/issues/33957

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-07-07 21:26:14 +08:00
congqixia 506a915272
fix: Deep copy ImportTask.segmentsInfo to prevent data race (#34090)
See also #34089

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-25 10:06:02 +08:00
yihao.dai 895799ec61
enhance: Abstract Execute interface for import/preimport task (#33234)
Abstract Execute interface for import/preimport task, simplify import
scheduler.

issue: https://github.com/milvus-io/milvus/issues/33157

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-23 11:29:41 +08:00