1320 Commits (04175d85497f5f13bd253bbd940b4f780ef958dc)

Author SHA1 Message Date
cai.zhang 52434ccc78
enhance: [2.5] Limit the speed of the generating stats task (#39645)
master pr: #39644

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-17 16:06:17 +08:00
XuanYang-cn ee25af4c9b
enhance: Add configs for compaction schedule (#39010) (#39511)
pr: #39010

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-02-17 11:44:15 +08:00
Xianhui Lin f0964f769d
enhance: [2.5]Add json key inverted index in stats for optimization (#39876)
Add json key inverted index in stats for optimization
issue: https://github.com/milvus-io/milvus/issues/36995
pr: https://github.com/milvus-io/milvus/pull/38039

---------

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-02-16 20:12:15 +08:00
congqixia 5da9262f58
fix: [2.5] Add and use lifetime context for compaction trigger (#39857) (#39880)
Cherry-pick from master
pr: #39857 
Related to #39856

This PR adds a lifetime-bound context for the compaction trigger and uses
it instead of context.Background, so that gRPC calls do not retry forever
when rootcoord is down.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-02-14 14:18:14 +08:00
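
The lifetime-context fix above (#39880) can be illustrated with a minimal sketch. It assumes a hypothetical `trigger` type and a hypothetical `retryUntilStopped` helper; neither is taken from the Milvus code base. The idea shown is only that a retry loop bound to the component's own context ends when the component stops, whereas one bound to `context.Background` would not.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// trigger is a hypothetical stand-in for a compaction trigger. It owns a
// context that is cancelled when the trigger is stopped.
type trigger struct {
	ctx    context.Context
	cancel context.CancelFunc
}

func newTrigger() *trigger {
	ctx, cancel := context.WithCancel(context.Background())
	return &trigger{ctx: ctx, cancel: cancel}
}

// Stop cancels the lifetime context, which unblocks in-flight retries.
func (t *trigger) Stop() { t.cancel() }

// retryUntilStopped calls fn until it succeeds or the trigger is stopped.
func (t *trigger) retryUntilStopped(fn func(context.Context) error, interval time.Duration) error {
	for {
		if err := fn(t.ctx); err == nil {
			return nil
		}
		select {
		case <-t.ctx.Done():
			return t.ctx.Err()
		case <-time.After(interval):
		}
	}
}

// demo simulates a dependency that never recovers and returns the error
// that ends the retry loop once the trigger is stopped.
func demo() string {
	t := newTrigger()
	down := func(context.Context) error { return errors.New("rootcoord unavailable") }

	done := make(chan error, 1)
	go func() { done <- t.retryUntilStopped(down, 5*time.Millisecond) }()

	time.Sleep(20 * time.Millisecond)
	t.Stop()
	return (<-done).Error()
}

func main() {
	fmt.Println(demo())
}
```

With `context.Background` in place of `t.ctx`, the goroutine in `demo` would keep retrying after `Stop`.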
cai.zhang 418f971d2d
fix: [2.5] ReEnqueue L0 compaction task when preCheck failed (#39871)
issue: #39868 

master pr: #39870

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-14 13:38:15 +08:00
SimFG cb1bf6d122
fix:[2.5] remove the mmap.enable param in the type param when creating index (#39806)
GetIndexParams concatenates index params and type params, so the
mmap.enable parameter in type params is also referenced when loading an
index.

- issue: #39801 
- pr: #39803

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-02-13 10:08:53 +08:00
jaime ddc5b299ad
enhance: expose more metrics data (#39466)
issue: #36621 #39417
pr: #39456
1. Adjust the server-side cache size.
2. Add source information for configurations.
3. Add node ID for compaction and indexing tasks.
4. Resolve localhost access issues to fix health check failures for
etcd.

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-02-07 11:48:45 +08:00
cai.zhang 22a69b5399
enhance: [2.5]Only check L0 compaction with same channel when stating (#39543)
issue: #39333 

master pr: #39459

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-02-05 17:17:11 +08:00
congqixia 8934672687
enhance: [2.5] Skip update index metrics if index dropped (#39458) (#39572)
Cherry-pick from master
pr: #39458 
Related to #39457

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-24 18:21:06 +08:00
congqixia a48749cc11
enhance: [2.5] Use mockery pkg config for datacoord&datanode (#39567) (#39577)
Cherry-pick from master
pr: #39567
Related to #38339

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-24 17:21:13 +08:00
congqixia 6f7b2b4e75
enhance: [2.5] Refine error msg for schema & index checking (#39533) (#39565)
Cherry-pick from master
pr: #39533

The error message was malformed or missing some meta info, such as the
field name. This PR rectifies some message formats and adds the field name
to the error message when the type param check fails.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-24 13:43:06 +08:00
SimFG 30411d6d3a
fix: [2.5] deny to set the mmap param for the alter index api (#39520)
- issue: #39517
- pr: #39518

Signed-off-by: SimFG <bang.fu@zilliz.com>
2025-01-22 23:55:06 +08:00
cai.zhang 4602e97888
fix: [2.5] Set the stating state correctly (#39514)
issue: #39333 

master pr: #39503

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-22 18:38:29 +08:00
cai.zhang cbf1161177
fix: [2.5] Set deltalogs for stats task after set segment stating (#39502)
issue: #39333

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-22 16:29:06 +08:00
cai.zhang e46c8ba7fb
fix: [2.5]Set isStating to ensure mutual exclusion between L0 compacting and stats (#39490)
issue: #39333 

master pr: #39489

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2025-01-22 10:27:05 +08:00
cai.zhang 817b616eb4
fix: [2.5]Restore the compacting state for stats task during recovery (#39460)
issue: #39333 

master pr: #39459

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-21 01:03:05 +08:00
zhenshan.cao 964000f645
fix: deleted the sealed segment data accidentally (#39422)
issue:https://github.com/milvus-io/milvus/issues/39333
pr: https://github.com/milvus-io/milvus/pull/39421

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2025-01-20 17:49:03 +08:00
XuanYang-cn c9b0859b16
fix: [cp25]Record active collections for l0Policy (#39217) (#39383)
By recording the list of active collections, the L0 compaction triggers for
view change and idle no longer influence each other.

This PR also replaces the L0View cache with the real L0 segments' changes,
which saves some memory and makes L0 compaction triggers more accurate.

See also: #39187
pr: #39217

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-01-20 10:47:03 +08:00
yihao.dai b69994272f
enhance: [2.5] Limit the maximum number of segments restored and fail the job if saving the binlog fails (#39359)
1. Limit the maximum number of restored segments to 1024.
2. Fail the import job if saving binlog fails.
3. Fail the import job if saving the import task fails to prevent
repeatedly generating dirty importing segments.

issue: https://github.com/milvus-io/milvus/issues/39331

pr: https://github.com/milvus-io/milvus/pull/39344

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-17 10:27:04 +08:00
yihao.dai 6773fb10a8
enhance: [2.5] Read metadata concurrently to accelerate recovery (#38900)
Read metadata such as segments, binlogs, and partitions concurrently at
the collection level.

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/38403

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-16 17:53:01 +08:00
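
Loading metadata concurrently at the collection level, as described in #38900 above, might look roughly like the following. This is a sketch under assumptions: `collectionMeta`, `loadCollection`, and `loadAll` are hypothetical names, and the real recovery path reads from the metadata catalog rather than building values in memory.

```go
package main

import (
	"fmt"
	"sync"
)

// collectionMeta is a hypothetical container for what recovery loads
// per collection.
type collectionMeta struct {
	CollectionID int64
	Segments     []string
}

// loadCollection stands in for the catalog reads of one collection.
func loadCollection(id int64) (collectionMeta, error) {
	return collectionMeta{
		CollectionID: id,
		Segments:     []string{fmt.Sprintf("seg-%d", id)},
	}, nil
}

// loadAll loads each collection in its own goroutine, with at most
// maxParallel loads in flight, and keeps results in input order.
func loadAll(ids []int64, maxParallel int) ([]collectionMeta, error) {
	if maxParallel < 1 {
		maxParallel = 1
	}
	results := make([]collectionMeta, len(ids))
	errs := make([]error, len(ids))
	sem := make(chan struct{}, maxParallel)

	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(i int, id int64) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only its own slice element.
			results[i], errs[i] = loadCollection(id)
		}(i, id)
	}
	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	metas, err := loadAll([]int64{1, 2, 3}, 2)
	fmt.Println(len(metas), err)
}
```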
yihao.dai 29dad64341
fix: [2.5] Fix consume blocked due to too many consumers (#38915)
This PR limits the maximum number of consumers per pchannel to 10 for
each QueryNode and DataNode.

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/38455

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-16 15:19:03 +08:00
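
The per-pchannel consumer cap from #38915 above can be pictured as a small counting limiter. The sketch below is illustrative only: `consumerLimiter` and its methods are hypothetical, and it rejects callers over the limit, whereas the actual change may block or queue them instead. Only the limit of 10 comes from the commit message.

```go
package main

import (
	"fmt"
	"sync"
)

// maxConsumersPerPChannel is the limit stated in the commit message.
const maxConsumersPerPChannel = 10

// consumerLimiter is a hypothetical helper that counts active consumers
// per pchannel.
type consumerLimiter struct {
	mu     sync.Mutex
	active map[string]int
}

func newConsumerLimiter() *consumerLimiter {
	return &consumerLimiter{active: make(map[string]int)}
}

// TryAcquire reserves a consumer slot and reports whether one was free.
func (l *consumerLimiter) TryAcquire(pchannel string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active[pchannel] >= maxConsumersPerPChannel {
		return false
	}
	l.active[pchannel]++
	return true
}

// Release frees a slot previously reserved with TryAcquire.
func (l *consumerLimiter) Release(pchannel string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active[pchannel] > 0 {
		l.active[pchannel]--
	}
}

func main() {
	l := newConsumerLimiter()
	granted := 0
	for i := 0; i < 12; i++ {
		if l.TryAcquire("pchannel-0") {
			granted++
		}
	}
	fmt.Println("granted:", granted)
}
```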
yihao.dai c741b8be2b
fix: [2.5] Remove frequently updating metric to avoid mutex contention (#38778)
issue: https://github.com/milvus-io/milvus/issues/37630

Reduce the frequency of `updateIndexTasksMetrics` to avoid holding the
mutex for long periods.

pr: https://github.com/milvus-io/milvus/pull/38775

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-16 11:51:02 +08:00
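
One way to reduce how often an expensive metrics refresh runs, in the spirit of #38778 above, is to throttle it by time. The sketch below is an assumption about the general technique, not the actual change: `throttledUpdater`, `MaybeUpdate`, and `countUpdates` are hypothetical names, and the real fix may simply move the update onto a periodic loop.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// throttledUpdater runs update at most once per interval, no matter how
// often MaybeUpdate is called.
type throttledUpdater struct {
	mu       sync.Mutex
	interval time.Duration
	last     time.Time
	update   func()
}

// MaybeUpdate runs the update if at least interval has passed since the
// previous run, and reports whether it ran.
func (u *throttledUpdater) MaybeUpdate(now time.Time) bool {
	u.mu.Lock()
	defer u.mu.Unlock()
	if !u.last.IsZero() && now.Sub(u.last) < u.interval {
		return false
	}
	u.last = now
	u.update()
	return true
}

// countUpdates simulates calls spaced step apart and returns how many of
// them actually triggered an update.
func countUpdates(calls int, step, interval time.Duration) int {
	runs := 0
	u := &throttledUpdater{interval: interval, update: func() { runs++ }}
	start := time.Unix(0, 0)
	for i := 0; i < calls; i++ {
		u.MaybeUpdate(start.Add(time.Duration(i) * step))
	}
	return runs
}

func main() {
	// Ten calls one second apart with a five-second interval.
	fmt.Println(countUpdates(10, time.Second, 5*time.Second))
}
```

Passing the current time in as an argument keeps the throttle easy to test without sleeping.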
wei liu 76ed552b00
enhance: Add logs for check health failed (#39208) (#39302)
pr: #39208

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-01-16 10:31:04 +08:00
congqixia 2fe245f918
fix: [2.5] Add index param duplication check (#39289) (#39304)
Cherry-pick from master
pr: #39289
Related to #39288

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-01-15 21:04:06 +08:00
cai.zhang 6816ee4cf5
fix: [2.5] Record a map to avoid repeatedly traversing the CompactionFrom (#38926)
issue: #38811 

master pr: #38925

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-15 10:31:00 +08:00
yihao.dai 2e4a1052aa
enhance: [2.5] Reduce mutex contention in datacoord meta (#38904)
1. Use a secondary index to avoid retrieving all segments in
GetSegmentsChanPart.
2. Perform batch SetAllocations to reduce the number of times the meta
lock is acquired.

issue: https://github.com/milvus-io/milvus/issues/37630

pr: https://github.com/milvus-io/milvus/pull/38219

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-01-15 00:55:00 +08:00
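
The secondary-index idea in #38904 above can be sketched as a second map kept in step with the primary one. Everything here is hypothetical (`segmentMeta`, `IDsByChannel`, the choice of channel as the index key); the commit message does not say which key the real index uses.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

type segment struct {
	ID      int64
	Channel string
}

// segmentMeta keeps a primary map by ID plus a secondary index by channel,
// so a per-channel lookup does not scan every segment.
type segmentMeta struct {
	mu        sync.RWMutex
	segments  map[int64]*segment
	byChannel map[string]map[int64]struct{}
}

func newSegmentMeta() *segmentMeta {
	return &segmentMeta{
		segments:  make(map[int64]*segment),
		byChannel: make(map[string]map[int64]struct{}),
	}
}

// removeLocked drops a segment from both maps; the caller holds mu.
func (m *segmentMeta) removeLocked(id int64) {
	old, ok := m.segments[id]
	if !ok {
		return
	}
	delete(m.byChannel[old.Channel], id)
	if len(m.byChannel[old.Channel]) == 0 {
		delete(m.byChannel, old.Channel)
	}
	delete(m.segments, id)
}

// Add inserts or replaces a segment and updates the index.
func (m *segmentMeta) Add(s *segment) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.removeLocked(s.ID) // avoid a stale index entry on replace
	m.segments[s.ID] = s
	if m.byChannel[s.Channel] == nil {
		m.byChannel[s.Channel] = make(map[int64]struct{})
	}
	m.byChannel[s.Channel][s.ID] = struct{}{}
}

// Remove deletes a segment by ID.
func (m *segmentMeta) Remove(id int64) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.removeLocked(id)
}

// IDsByChannel returns the sorted IDs of segments on one channel.
func (m *segmentMeta) IDsByChannel(channel string) []int64 {
	m.mu.RLock()
	defer m.mu.RUnlock()
	ids := make([]int64, 0, len(m.byChannel[channel]))
	for id := range m.byChannel[channel] {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i] < ids[j] })
	return ids
}

func main() {
	m := newSegmentMeta()
	m.Add(&segment{ID: 1, Channel: "ch-a"})
	m.Add(&segment{ID: 2, Channel: "ch-b"})
	m.Add(&segment{ID: 3, Channel: "ch-a"})
	fmt.Println(m.IDsByChannel("ch-a"))
}
```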
cai.zhang 4270174899
fix: [2.5] Add scalar index engine version for compatibility (#39236)
issue: #39203 

master pr: #39204

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-14 21:01:01 +08:00
Zhen Ye adfc3f945e
enhance: record memory size (uncompressed) item for index (#38844)
issue: #38715 
pr: #38770

- Milvus currently uses the serialized (compressed) index size to estimate
the resources needed for loading.
- Add a new field MemSize (the size before compression) to the index for
estimating resources.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-14 10:33:06 +08:00
Zhen Ye 95809ca767
enhance: make new go package to manage proto (#39128)
issue: #39095
pr: #39114

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-10 10:53:01 +08:00
jaime 0693634f62
enhance: add db name in replica description (#38673)
issue: #36621
pr: #38672

Signed-off-by: jaime <yun.zhang@zilliz.com>
2025-01-09 19:43:04 +08:00
Zhen Ye 6f1febe881
enhance: move streaming coord from datacoord to rootcoord (#39009)
issue: #38399
pr: #39007

We want to support broadcast operations for both streaming and msgstream.
However, msgstream messages can only be sent from rootcoord and proxy,
so this PR moves streamingcoord to rootcoord to simplify the
implementation.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-01-07 17:56:56 +08:00
cai.zhang e6dd3e5a57
fix: [2.5]Remove valid expressions from invalid expressions (#38999)
issue: #39014 

master pr: #38957 
master pr: #39012

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-01-06 18:02:55 +08:00
aoiasd 6fa096eb39
fix:[Cherry-pick] bm25 import segment loss stats (#38881)
relate: https://github.com/milvus-io/milvus/issues/38854
pr: https://github.com/milvus-io/milvus/pull/38855

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-12-31 19:24:54 +08:00
cai.zhang 71dea30d44
fix: [2.5] Release lock when return function (#38863)
issue: #38851 

master pr: #38856

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-30 22:44:39 +08:00
jaime 11bedf5e76
fix: Revert "Expose metrics of stanby coordinators (#27698)" (#38621)
issue: #38608
pr: #38620

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-20 18:04:47 +08:00
XuanYang-cn ca7ec23198
enhance: Use partitionID when delete by partitionKey (#38231)
When deleting by partition_key, Milvus generates L0 segments
globally. During L0 compaction, those L0 segments touch all
partitions collection-wise. Due to the false-positive rate of segment
bloom filters, L0 compactions append false deltalogs to completely
irrelevant partitions, which causes partition deletion amplification.

This PR uses partition_key to set targeted partitionID when producing
deleteMsgs into MsgStreams. This'll narrow down L0 segments scope to
partition level, and remove the false-positive influence
collection-wise.

However, due to the DeleteMsg structure, only one partition can be labeled
per deleteMsg, so this enhancement fails if the user wants to delete over 2
partition_keys in one deletion.

See also: #34665

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-20 11:18:46 +08:00
XuanYang-cn c0b855dc75
fix: ChannelManager concurrent Release and Watch bug (#38590)
See also: #38589

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-12-19 22:50:47 +08:00
congqixia 3d360c0624
fix: SyncSegments rpc always failed (#38578)
The patch was missed due to code branching.
previous pr: #38032

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Co-authored-by: Wei Liu <wei.liu@zilliz.com>
2024-12-19 15:40:45 +08:00
cai.zhang 306e5e6898
enhance: clean compaction task in compactionHandler (#38170)
issue: #35711

---------

Signed-off-by: wayblink <anyang.wang@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: wayblink <anyang.wang@zilliz.com>
2024-12-19 12:36:47 +08:00
jaime 78438ef41e
fix: revert optimize CPU usage for CheckHealth requests (#35589) (#38555)
issue: #35563

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-19 00:38:45 +08:00
cai.zhang a348122758
fix: Support get segments from current segments view (#38512)
issue: #38511

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-18 18:00:54 +08:00
yihao.dai d4dab3c62f
enhance: Reduce segmentManager lock granularity (#37836)
Use a channel level key lock for segments in segmentManager.

issue: https://github.com/milvus-io/milvus/issues/37633,
https://github.com/milvus-io/milvus/issues/37630

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-17 14:12:52 +08:00
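
A channel-level key lock of the kind mentioned in #37836 above can be sketched as a map of mutexes, one per key. The `keyLock` type and the `runCounters` demo below are hypothetical and deliberately simple: entries are never evicted, which a production implementation would likely handle.

```go
package main

import (
	"fmt"
	"sync"
)

// keyLock hands out one mutex per key, so work on different channels does
// not contend on a single global lock.
type keyLock struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newKeyLock() *keyLock {
	return &keyLock{locks: make(map[string]*sync.Mutex)}
}

// get returns the mutex for key, creating it on first use.
func (k *keyLock) get(key string) *sync.Mutex {
	k.mu.Lock()
	defer k.mu.Unlock()
	l, ok := k.locks[key]
	if !ok {
		l = &sync.Mutex{}
		k.locks[key] = l
	}
	return l
}

func (k *keyLock) Lock(key string)   { k.get(key).Lock() }
func (k *keyLock) Unlock(key string) { k.get(key).Unlock() }

// runCounters increments one counter per channel from many goroutines,
// guarding each counter with its channel's lock only.
func runCounters(channels []string, perChannel int) map[string]int {
	kl := newKeyLock()
	counters := make(map[string]*int, len(channels))
	for _, ch := range channels {
		counters[ch] = new(int)
	}

	var wg sync.WaitGroup
	for _, ch := range channels {
		for i := 0; i < perChannel; i++ {
			wg.Add(1)
			go func(ch string) {
				defer wg.Done()
				kl.Lock(ch)
				defer kl.Unlock(ch)
				*counters[ch]++ // the map itself is only read here
			}(ch)
		}
	}
	wg.Wait()

	out := make(map[string]int, len(counters))
	for ch, c := range counters {
		out[ch] = *c
	}
	return out
}

func main() {
	fmt.Println(runCounters([]string{"ch-a", "ch-b"}, 100))
}
```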
jaime 28fdbc4e30
enhance: optimize CPU usage for CheckHealth requests (#35589)
issue: #35563
1. Use an internal health checker to monitor the cluster's health state,
storing the latest state on the coordinator node. The CheckHealth
request retrieves the cluster's health from this latest state on the
proxy side, which enhances cluster stability.
2. Each health check will assess all collections and channels, with
detailed failure messages temporarily saved in the latest state.
3. Use CheckHealth request instead of the heavy GetMetrics request on
the querynode and datanode

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-12-17 11:02:45 +08:00
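
The cached health state described in #35589 above can be sketched as a background checker that stores its latest result, with request handlers reading only that stored value. The names `healthState`, `healthChecker`, `refresh`, and `Latest` are hypothetical, and the initial "not checked yet" state is an assumption of this sketch.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// healthState is the result of one full health check.
type healthState struct {
	Healthy bool
	Reasons []string
}

// healthChecker runs an expensive check periodically and serves the most
// recent result to callers without re-running the check.
type healthChecker struct {
	mu     sync.RWMutex
	latest healthState
	check  func() healthState
}

func newHealthChecker(check func() healthState) *healthChecker {
	return &healthChecker{
		latest: healthState{Reasons: []string{"no health check has run yet"}},
		check:  check,
	}
}

// refresh runs the check outside the lock and then stores the result.
func (h *healthChecker) refresh() {
	s := h.check()
	h.mu.Lock()
	h.latest = s
	h.mu.Unlock()
}

// Run refreshes the state every interval until ctx is cancelled.
// interval must be positive.
func (h *healthChecker) Run(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		h.refresh()
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
	}
}

// Latest returns the cached state; it never triggers a new check.
func (h *healthChecker) Latest() healthState {
	h.mu.RLock()
	defer h.mu.RUnlock()
	return h.latest
}

func main() {
	hc := newHealthChecker(func() healthState {
		return healthState{Healthy: false, Reasons: []string{"channel ch-0 is lagging"}}
	})
	hc.refresh()
	s := hc.Latest()
	fmt.Println(s.Healthy, s.Reasons)
}
```

Because `Latest` only reads the stored value, the cost of a request no longer depends on the number of collections and channels being checked.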
SimFG 2afe2eaf3e
feat: support to replicate collection when the services contains the system tt msg (#37559)
- issue: #37105

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-12-17 09:08:46 +08:00
SimFG fa8ac09550
fix: the issue of replicate message exception when the ttMsgEnable config is changed dynamically (#38178)
- issue: #38177

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-12-14 23:24:51 +08:00
tinswzy 27229f7907
enhance: refine exists log print with ctx (#38080)
issue: #35917 
Refines exists log print with ctx

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2024-12-14 22:36:44 +08:00
Ted Xu dc85d8e968
enhance: improve mix compaction performance by removing max segment limitations (#38344)
See #37234

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-12-11 20:38:42 +08:00
cai.zhang 0d7a89a4f8
fix: Use the correct RootPath when decompressing binlog in stats task (#38341)
issue: #38336

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-11 16:16:42 +08:00
yihao.dai 43e0e2b7ed
fix: Fix empty import task result (#38316)
Ensure the idempotency of import tasks to prevent duplicate tasks in
DataNode.

issue: https://github.com/milvus-io/milvus/issues/38313

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-12-11 15:42:49 +08:00
cai.zhang 9be106dedf
enhance: Refine task scheduler logs (#38334)
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-12-11 15:00:44 +08:00