Commit Graph

295 Commits (2.5)

Author SHA1 Message Date
yihao.dai 6de19f8598
enhance: [2.5] Enhance import context (#42051)
Rename `imeta` to `importMeta` to improve readability, and enhance
import related context usage.

issue: https://github.com/milvus-io/milvus/issues/41123

pr: https://github.com/milvus-io/milvus/pull/42021

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-24 18:40:26 +08:00
yihao.dai 7c8370ccd2
fix: [2.5] Fix ants.Pool goroutine leak (#41893)
1. Release the pool after it is no longer in use.
2. Upgrade ants.Pool to fix the goroutine leak issue (see
https://github.com/panjf2000/ants/pull/287).

issue: https://github.com/milvus-io/milvus/issues/41838

pr: https://github.com/milvus-io/milvus/pull/41892

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-16 19:12:22 +08:00
yihao.dai a7c818cadb
fix: [2.5] Fix no candidate segments error for small import (#41772)
When autoID is enabled, the preimport task estimates row distribution by
evenly dividing the total row count (numRows) across all vchannels:
`estimatedCount = numRows / vchannelNum`.
However, the actual import task hashes real auto-generated IDs to
determine
the target vchannel. This mismatch can lead to inaccurate row
distribution estimation
in such corner cases:
- Importing 1 row into 2 vchannels:
				• Preimport: 1 / 2 = 0 → both v0 and v1 are estimated to have 0 rows
				• Import: real autoID (e.g., 457975852966809057) hashes to v1
				  → actual result: v0 = 0, v1 = 1

To resolve such corner case, we now allocate at least one segment for
each vchannel
when autoID is enabled, ensuring all vchannels are prepared to receive
data even
if no rows are estimated for them.

issue: https://github.com/milvus-io/milvus/issues/41759

pr: https://github.com/milvus-io/milvus/pull/41771

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-14 10:36:22 +08:00
Xianhui Lin 21ca05e445
feat: refine drop parition through the new interface notifydroppartition in datacoord (#41543)
refine drop parition through the new interface notifydroppartition in
datacoord
 issue: https://github.com/milvus-io/milvus/issues/41542

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-04-27 17:42:40 +08:00
cai.zhang 7db59fb1fb
fix: [2.5] Get all children deltalogs for segment to load (#40957)
issue: #40207
master pr: #40956

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-03-28 16:04:19 +08:00
congqixia 2411c184a8
enhance: [2.5] Support detailed manual compaction criterion (#40892) (#40924)
Cherry-pick from master
pr: #40892
Related to #40866

This PR:
- update go-api/v2 and support partition id/channel/segment level manual
compaction
- refines the compaction trigger implementation
- unify the compaction signal usage

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-03-27 10:36:22 +08:00
yihao.dai e02a3daff5
enhance: [2.5] Optimize datacoord meta mutex (#40753)
Use a separate collection mutex.

issue: https://github.com/milvus-io/milvus/issues/40551

pr: https://github.com/milvus-io/milvus/pull/40552

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-25 13:50:21 +08:00
XuanYang-cn 1a6761ac69
enhance: [cp25]Replace currRows with NumOfRows (#40074) (#40681)
See also: #40068
pr: #40074

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-03-20 11:24:13 +08:00
yihao.dai b6b03ff74c
enhance: [2.5] Accelerate listing objects during binlog import (#40048)
issue: https://github.com/milvus-io/milvus/issues/40030

pr: https://github.com/milvus-io/milvus/pull/40047

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-02-24 15:59:56 +08:00
congqixia 709594f158
enhance: [2.5] Use v2 package name for pkg module (#40117)
Cherry-pick from master
pr: #39990
Related to #39095

https://go.dev/doc/modules/version-numbers

Update pkg version according to golang dep version convention

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-23 00:46:01 +08:00
zhenshan.cao 964000f645
fix: deleted the sealed segment data accidentally (#39422)
issue:https://github.com/milvus-io/milvus/issues/39333
pr: https://github.com/milvus-io/milvus/pull/39421

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-01-20 17:49:03 +08:00
XuanYang-cn c9b0859b16
fix: [cp25]Record active collections for l0Policy (#39217) (#39383)
By recording the active collection lists, The l0 compaction trigger of
view change and idle won't influence each other.

Also this pr replaces the L0View cache with real L0 segments' change.
Save some memory and make L0 compaction triggers more accurate.

See also: #39187
pr: #39217

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-01-20 10:47:03 +08:00
yihao.dai b69994272f
enhance: [2.5] Limit the maximum number of segments restored and fail the job if saving the binlog fails (#39359)
1. Limit the maximum number of restored segments to 1024.
2. Fail the import job if saving binlog fails.
3. Fail the import job if saving the import task fails to prevent
repeatedly generating dirty importing segments.

issue: https://github.com/milvus-io/milvus/issues/39331

pr: https://github.com/milvus-io/milvus/pull/39344

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-17 10:27:04 +08:00
wei liu 76ed552b00
enhance: Add logs for check health failed (#39208) (#39302)
pr: #39208

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-01-16 10:31:04 +08:00
Zhen Ye 95809ca767
enhance: make new go package to manage proto (#39128)
issue: #39095
pr: #39114

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:53:01 +08:00
jaime 11bedf5e76
fix: Revert "Expose metrics of stanby coordinators (#27698)" (#38621)
issue: #38608
pr: #38620

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-20 18:04:47 +08:00
XuanYang-cn ca7ec23198
enhance: Use partitionID when delete by partitionKey (#38231)
When delete by partition_key, Milvus will generates L0 segments
globally. During L0 Compaction, those L0 segments will touch all
partitions collection wise. Due to the false-positive rate of segment
bloomfilters, L0 compactions will append false deltalogs to completed
irrelevant partitions, which causes *partition deletion amplification.

This PR uses partition_key to set targeted partitionID when producing
deleteMsgs into MsgStreams. This'll narrow down L0 segments scope to
partition level, and remove the false-positive influence
collection-wise.

However, due to DeleteMsg structure, we can only label one partition to
one deleteMsg, so this enhancement fails if user wants to delete over 2
partition_keys in one deletion.

See also: #34665

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-20 11:18:46 +08:00
jaime 78438ef41e
fix: revert optimize CPU usage for CheckHealth requests (#35589) (#38555)
issue: #35563

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-19 00:38:45 +08:00
cai.zhang a348122758
fix: Support get segments from current segments view (#38512)
issue: #38511

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-18 18:00:54 +08:00
yihao.dai d4dab3c62f
enhance: Reduce segmentManager lock granularity (#37836)
Use a channel level key lock for segments in segmentManager.

issue: https://github.com/milvus-io/milvus/issues/37633,
https://github.com/milvus-io/milvus/issues/37630

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-17 14:12:52 +08:00
jaime 28fdbc4e30
enhance: optimize CPU usage for CheckHealth requests (#35589)
issue: #35563
1. Use an internal health checker to monitor the cluster's health state,
storing the latest state on the coordinator node. The CheckHealth
request retrieves the cluster's health from this latest state on the
proxy sides, which enhances cluster stability.
2. Each health check will assess all collections and channels, with
detailed failure messages temporarily saved in the latest state.
3. Use CheckHealth request instead of the heavy GetMetrics request on
the querynode and datanode

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-17 11:02:45 +08:00
SimFG 2afe2eaf3e
feat: support to replicate collection when the services contains the system tt msg (#37559)
- issue: #37105

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-12-17 09:08:46 +08:00
tinswzy 27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines exists log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
tinswzy 1dbb6cd7cb
enhance: refine the datacoord meta related interfaces (#37957)
issue: #35917 
This PR refines the meta-related APIs in datacoord to allow the ctx to
be passed down to the catalog operation interfaces

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-11-26 19:46:34 +08:00
XuanYang-cn 5a23c80f20
fix: Change memoryCheck write lock to read lock (#37525)
See also: milvus-io#37493

Signed-off-by: yangxuan <xuan.yang@zilliz.com>

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-11-15 10:44:31 +08:00
Zhen Ye 1b6edd0b4b
enhance: refactor the consumer grpc proto for reusing grpc stream for multi-consumer (#37564)
issue: #33285

- Modify the proto of consumer of streaming service.
- Make VChannel as a required option for streaming

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-11 17:24:29 +08:00
Zhen Ye 49657c4690
enhance: add create segment message, enable empty segment flush (#37407)
issue: #37172

- add redo interceptor to implement append context refresh. (make new
timetick)
- add create segment handler for flusher.
- make empty segment flushable and directly change it into dropped.
- add create segment message into wal when creating new growing segment.
- make the insert operation into following seq: createSegment -> insert
-> insert -> flushSegment.
- make manual flush into following seq: flushTs -> flushsegment ->
flushsegment -> manualflush.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-08 10:16:34 +08:00
yihao.dai 994f52fab8
fix: Revert "enhance: Support db for bulkinsert (#37012)" (#37420)
This reverts commit 6e90f9e8d9.

issue: https://github.com/milvus-io/milvus/issues/31273

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-11-07 17:02:25 +08:00
Zhen Ye cae9e1c732
fix: drop collection failed if enable streaming service (#37444)
issue: #36858

- Start channel manager on datacoord, but with empty assign policy in
streaming service.
- Make collection at dropping state can be recovered by flusher to make
sure that
 milvus consume the dropCollection message.
- Add backoff for flusher lifetime.
- remove the proxy watcher from timetick at rootcoord in streaming
service.

Also see the better fixup: #37176

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-07 10:26:26 +08:00
jaime 9d16b972ea
feat: add tasks page into management WebUI (#37002)
issue: #36621

1. Add API to access task runtime metrics, including:
  - build index task
  - compaction task
  - import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
  - sync task
2. Add a debug model to the webpage by using debug=true or debug=false
in the URL query parameters to enable or disable debug mode.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-28 10:13:29 +08:00
yihao.dai b45cf2d49f
enhance: Add max length check for csv import (#37077)
1. Add max length check for csv import.
2. Tidy import options.
3. Tidy common import util functions.

issue: https://github.com/milvus-io/milvus/issues/34150

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 14:37:29 +08:00
yihao.dai 6e90f9e8d9
enhance: Support db for bulkinsert (#37012)
issue: https://github.com/milvus-io/milvus/issues/31273

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-25 14:31:39 +08:00
yihao.dai f0b3942a08
enhance: Limit import job number (#36891)
issue: https://github.com/milvus-io/milvus/issues/36890

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-23 16:01:28 +08:00
jaime 4746f47282
feat: management WebUI homepage (#36822)
issue: #36784
1. Implement an embedded web server for WebUI access.  
2. Complete the homepage development.

Home page demo:
<img width="2177" alt="iShot_2024-10-10_17 57 34"
src="https://github.com/user-attachments/assets/38539917-ce09-4e54-a5b5-7f4f7eaac353">

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-10-23 11:29:28 +08:00
yihao.dai 0fc2a4aa53
enhance: Optimize import scheduling and add time cost metric (#36601)
1. Optimize import scheduling strategic:
a. Revise slot weights, calculating them based on the number of files
and segments for both import and pre-import tasks.
b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.

issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-10-09 14:41:20 +08:00
cai.zhang ecb2b242e2
enhance: Add sorted for segment info (#36469)
issue: #33744

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-30 10:01:16 +08:00
congqixia d2c774fb6d
fix: Return all compactTo segments after support split (#36361)
Related to #36360

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-09-20 14:11:11 +08:00
aoiasd 139787371e
feat: support embedding bm25 sparse vector and flush bm25 stats log (#36036)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-09-19 10:57:12 +08:00
yihao.dai a61668c77e
feat: Introduce stats task for import (#35868)
This PR introduce stats task for import:
1. Define new `Stats` and `IndexBuilding` states for importJob
2. Add new stats step to the import process: trigger the stats task and
wait for its completion
3. Abort stats task if import job failed

issue: https://github.com/milvus-io/milvus/issues/33744

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-15 15:17:08 +08:00
XuanYang-cn e8840a1b41
enhance: Add metrics for Delete entries num of L0seg (#36175)
- Add metrics *DataCoordL0DeleteEntriesNum*
- Remove metrics *DataCoordRateStoredL0Segment*

See also: #36147

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-09-12 18:07:08 +08:00
yihao.dai 6b4ae0c65e
enhance: Log warn on delayed compaction tasks (#36049)
/kind enhancement

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-09-08 17:23:05 +08:00
cai.zhang 2c9bb4dfa3
feat: Support stats task to sort segment by PK (#35054)
issue: #33744 

This PR includes the following changes:
1. Added a new task type to the task scheduler in datacoord: stats task,
which sorts segments by primary key.
2. Implemented segment sorting in indexnode.
3. Added a new field `FieldStatsLog` to SegmentInfo to store token index
information.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-09-02 14:19:03 +08:00
Zhen Ye 99dff06391
enhance: using streaming service in insert/upsert/flush/delete/querynode (#35406)
issue: #33285

- using streaming service in insert/upsert/flush/delete/querynode
- fixup flusher bugs and refactor the flush operation
- enable streaming service for dml and ddl
- pass the e2e when enabling streaming service
- pass the integration tst when enabling streaming service

---------

Signed-off-by: chyezh <chyezh@outlook.com>
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-29 10:03:08 +08:00
congqixia c992a61a23
enhance: Separate allocator pkg in datacoord (#35622)
Related to #28861

Move allocator interface and implementation into separate package. Also
update some unittest logic.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-22 10:06:56 +08:00
XuanYang-cn c42976ee6f
enhance: Init ChannelCP when creating a channel (#35387)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-08-14 10:16:58 +08:00
yihao.dai a4439cc911
enhance: Implement flusher in streamingNode (#34942)
- Implement flusher to:
  - Manage the pipelines (creation, deletion, etc.)
  - Manage the segment write buffer
  - Manage sync operation (including receive flushMsg and execute flush)
- Add a new `GetChannelRecoveryInfo` RPC in DataCoord.
- Reorganize packages: `flushcommon` and `datanode`.

issue: https://github.com/milvus-io/milvus/issues/33285

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-08-02 18:30:23 +08:00
zhenshan.cao aa247f192d
enhance: remove unused code for StorageV2 (#35132)
issue: https://github.com/milvus-io/milvus/issues/34168

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-08-01 12:08:13 +08:00
chyezh 1cff55381d
enhance: add manual alloc segment rpc for datacoord (#35002)
issue: #33285

- segment allocation will move to streamingnode, so a manual alloc
segment rpc is required

Signed-off-by: chyezh <chyezh@outlook.com>
2024-07-26 10:15:46 +08:00
XuanYang-cn e0b39d8bf4
fix: Milvus panic when compaction disabled and dropping a collection (#34103)
See also: #31059

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-07-11 14:44:52 +08:00
jaime 21fc5f5d46
enhance: Remove datanode reporting TT based on MQ implementation (#34421)
issue: #34420

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-05 15:48:09 +08:00