906 Commits (c5212a42b645c38096fd12be143167ae970c6b73)

Author SHA1 Message Date
Gao 0a122533d0
enhance: change autoindex default metric type (#34328)
issue: #34304 
pr: #34261

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2024-08-02 16:22:20 +08:00
chyezh 923278b75d
enhance: the datacoord gc should be able to quit quickly (#35057)
issue: #35049
pr: #35050

Signed-off-by: chyezh <chyezh@outlook.com>
2024-08-01 14:32:13 +08:00
cai.zhang b5ba5832d3
fix: [cherry-pick] Remove flushed segment in segment manager generated through import (#34650)
issue: #34648 

master pr: #34649

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-07-12 23:15:14 +08:00
xige-16 c566edc053
fix: Fix backup channel meta is empty (#34115)
issue: https://github.com/milvus-io/milvus/issues/34061
pr: https://github.com/milvus-io/milvus/pull/32044
/kind bug

Signed-off-by: xige-16 <xige2016@gmail.com>
2024-06-25 11:48:03 +08:00
wayblink d40929857c
fix: [cherry-pick] Panic if ProcessActiveStandBy returns error (#33371)
pr: #33369
issue: #33368

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-05-27 10:13:41 +08:00
Jiquan Long bfd88670ef
fix: metric milvus_rootcoord_indexed_entity_num (#33002)
issue: #31272 
pr: #32307

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-05-13 17:01:34 +08:00
wayblink 996b79c76c
enhance: Add channelCPs in FlushResponse (#32683)
#32609

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-05-10 09:55:31 +08:00
congqixia c36b54cb57
enhance: [2.3] Use different interval for gc scan (#31363) (#32551)
Cherry-pick from master
pr: #31363
See also #31362

This PR makes the datacoord garbage collection scan operation use a
different interval than the other operations.

This interval is a newly added param item whose default value is 7*24
hours.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-25 16:07:26 +08:00
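The separate scan interval described in #32551 above can be pictured as two independent tickers inside one GC loop. The following is a minimal sketch under stated assumptions: `gcOptions`, `defaultGCOptions`, `runGC` and the one-hour check interval are hypothetical names and values chosen for illustration, not the datacoord implementation. Only the 7*24 hour scan default comes from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// gcOptions is a hypothetical option set; the field names are illustrative
// and are not taken from the Milvus source.
type gcOptions struct {
	checkInterval time.Duration // interval for the regular, cheaper GC work
	scanInterval  time.Duration // separate interval for the expensive scan
}

// defaultGCOptions uses the 7*24 hour scan default named in the commit
// message; the one-hour check interval is an assumed placeholder.
func defaultGCOptions() gcOptions {
	return gcOptions{
		checkInterval: time.Hour,
		scanInterval:  7 * 24 * time.Hour,
	}
}

// runGC drives both operations from independent tickers until stop is closed.
func runGC(opt gcOptions, stop <-chan struct{}, onCheck, onScan func()) {
	checkTicker := time.NewTicker(opt.checkInterval)
	defer checkTicker.Stop()
	scanTicker := time.NewTicker(opt.scanInterval)
	defer scanTicker.Stop()
	for {
		select {
		case <-checkTicker.C:
			onCheck()
		case <-scanTicker.C:
			onScan()
		case <-stop:
			return
		}
	}
}

func main() {
	fmt.Println(defaultGCOptions().scanInterval) // prints 168h0m0s

	// Short intervals so the demo finishes quickly.
	stop := make(chan struct{})
	done := make(chan struct{})
	checks, scans := 0, 0
	go func() {
		defer close(done)
		opt := gcOptions{checkInterval: 10 * time.Millisecond, scanInterval: 40 * time.Millisecond}
		runGC(opt, stop, func() { checks++ }, func() { scans++ })
	}()
	time.Sleep(100 * time.Millisecond)
	close(stop)
	<-done
	fmt.Println("checks:", checks, "scans:", scans)
}
```

Keeping the scan on its own ticker means the expensive listing can be tuned or effectively disabled without slowing down the cheaper periodic work.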
congqixia d9ac8e9e36
fix: [2.3] Mark channel checkpoint dropped prevent cp lag metrics leakage (#32454)
See also #31506 #31508

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-19 14:55:22 +08:00
congqixia 02acb9f7dd
fix: [2.3] Wait StandBy server ready for testcase (#32216) (#32231)
Cherry-pick from master
pr: #32216
See also #32069

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-13 12:03:23 +08:00
congqixia d635495885
fix: [2.3] Make coordinator `Register` not blocked on ProcessActiveStandby(#32069) (#32133)
Cherry-pick from master
pr: #32069
See also #32066

This PR makes coordinator `Register` return successfully and lets
`ProcessActiveStandBy` run asynchronously. Roles may then receive a stop
signal and notify the servers.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-11 17:33:21 +08:00
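The non-blocking `Register` from #32133 above can be illustrated with a goroutine and a result channel. This is a sketch, not the coordinator code: the `coordinator` type, its `activate` field (standing in for `ProcessActiveStandBy`) and `registerThenStop` are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// coordinator is a hypothetical stand-in for a coordinator server.
type coordinator struct {
	activate func(ctx context.Context) error // plays the role of ProcessActiveStandBy
	activeCh chan error                      // receives the activation result once
}

// Register returns immediately; activation continues in the background.
func (c *coordinator) Register(ctx context.Context) {
	c.activeCh = make(chan error, 1)
	go func() {
		c.activeCh <- c.activate(ctx)
	}()
}

// registerThenStop registers a coordinator that stays in standby forever,
// then delivers a stop signal and returns the activation result.
func registerThenStop() error {
	ctx, cancel := context.WithCancel(context.Background())
	c := &coordinator{activate: func(ctx context.Context) error {
		<-ctx.Done() // stays in standby until stopped
		return ctx.Err()
	}}
	c.Register(ctx) // does not block on activation
	cancel()        // the stop signal arrives while still in standby
	return <-c.activeCh
}

func main() {
	err := registerThenStop()
	fmt.Println(errors.Is(err, context.Canceled)) // prints true
}
```

Because `Register` no longer waits for activation, the caller stays free to react to a stop signal, which is the behavior the commit message describes.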
wei liu 85d736ec83
fix: Avoid acquire index meta's lock for each segment (#31723) (#31798)
issue: #31662 #31409
pr: #31723

During FilterIndexedSegment in GetRecoveryInfo, the code tries to acquire
the index meta's read lock for every segment. When a collection has
thousands of segments, this may block for more than 10 seconds or even
longer, because `AddSegmentIndex` may also be triggered frequently and
tries to get the write lock.

This PR avoids acquiring the index meta's lock for each segment.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-09 17:19:17 +08:00
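The locking change in #31798 above comes down to taking the read lock once per batch instead of once per segment. A minimal sketch, assuming a hypothetical `indexMeta` type with a plain map; the real index meta is more involved.

```go
package main

import (
	"fmt"
	"sync"
)

// indexMeta is a hypothetical, simplified index meta guarded by one RWMutex.
type indexMeta struct {
	mu      sync.RWMutex
	indexed map[int64]bool // segment ID -> whether its index is built
}

// isIndexed takes the read lock on every call. Calling it in a loop over
// thousands of segments acquires the lock thousands of times, and each
// acquisition can queue behind a pending writer.
func (m *indexMeta) isIndexed(segID int64) bool {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return m.indexed[segID]
}

// filterIndexed takes the read lock once for the whole batch.
func (m *indexMeta) filterIndexed(segIDs []int64) []int64 {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make([]int64, 0, len(segIDs))
	for _, id := range segIDs {
		if m.indexed[id] {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	m := &indexMeta{indexed: map[int64]bool{1: true, 3: true}}
	fmt.Println(m.isIndexed(2))                        // prints false
	fmt.Println(m.filterIndexed([]int64{1, 2, 3, 4})) // prints [1 3]
}
```

With Go's `sync.RWMutex`, a blocked `Lock` call keeps new readers from acquiring the lock, so a loop of per-segment `RLock` calls can stall repeatedly when writers arrive often.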
congqixia ba36f66a5c
fix: [2.3] Use server ctx instead of loopCtx for datacoord LivenessCheck (#31691) (#31747)
Cherry-pick from master
pr: #31691
See also #31689

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-02 23:30:33 -07:00
cqy123456 47f767cf32
enhance: remove float16 in 2.3 branch (#31720)
issue: https://github.com/milvus-io/milvus/issues/31696

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-03-30 10:49:13 +08:00
XuanYang-cn 69931a6e7f
fix: Skip changing meta if nodeID not match with channel (#31665)
See also: #31648
pr: #31666

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-28 16:05:11 +08:00
congqixia 368180bce4
fix: [2.3] Check nodeID before update channel checkpoint (#31473) (#31508)
Cherry-pick from master
pr: #31473
See also #31470 #31506

This PR adds nodeID assignment verification before updating channel
checkpoints.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-23 07:07:07 +08:00
Jiquan Long ab059bb064
enhance: add more metrics (#31271) (#31511)
/kind improvement
pr: #31271 
fix: https://github.com/milvus-io/milvus/issues/31272

This PR adds more metrics:

Slow query count, where the duration considered slow is configurable;
Number of deleted entities;
Number of entities per collection;
Number of loaded entities per collection;
Number of indexed entities;
Number of indexed entities, per collection, per index and whether it's a
vector index;
Quota states (LongTimeTickDelay, MemoryExhuasted, DiskQuotaExhuasted)
per database;

---------

Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2024-03-22 16:11:07 +08:00
groot 1ca7cba222
enhance: Support MinIO TLS connection (#31292)
issue: https://github.com/milvus-io/milvus/issues/30709
master pr: #31311

Signed-off-by: yhmo <yihua.mo@zilliz.com>
Co-authored-by: Chen Rao <chenrao317328@163.com>
2024-03-21 11:15:20 +08:00
cai.zhang 52a7eb9548
fix: Fix bug for get segment index state (#31429)
issue: #31361 
master pr: #31427 
2.4 pr: #31428

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-20 15:05:06 +08:00
cai.zhang ef530a2324
enhance: When describing an index, fetch the index info in batches (#31239)
issue: #29313 
master pr: #31238

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-15 16:37:09 +08:00
jaime 5ddb0b435f
fix: revoke session may be ignored due to server context cancellation in advance (#31213)
issue: #31219
pr: #31220

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-14 19:05:04 +08:00
congqixia 3c90475d55
enhance: [Cherry-pick] Add `ListIndexes` API from datacoord (#31104) (#31150)
Cherry-pick from master
pr: #31104
See also #31103

This PR adds a `ListIndexes` API to the datacoord server that lists all
indexes for the provided collection.
Compared to the existing `DescribeIndex` API, the new one does NOT
check the segment index building progress, which eases the burden of
invoking it.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-11 10:47:02 +08:00
zhagnlu 095c94305c
fix: add GetSegments optimization to avoid meta mutex competition (#31026)
pr: #31025

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-03-05 14:49:01 +08:00
yihao.dai 91d17870d6
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941) (#31024)
This PR includes the following adjustments:

1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.

issue: https://github.com/milvus-io/milvus/issues/30004

pr: https://github.com/milvus-io/milvus/pull/30941

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-05 14:27:01 +08:00
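The first two adjustments in #31024 above (one retained task per vchannel, then batched RPCs) can be sketched with a map keyed by vchannel and a chunking step. All names here (`checkpoint`, `cpUpdater`, `submit`, `drain`) are hypothetical, and the choice that a newer submission replaces the older one is an assumption. Only the batch size of 128 comes from the commit message.

```go
package main

import (
	"fmt"
	"sort"
)

// checkpoint is a simplified channel checkpoint.
type checkpoint struct {
	vchannel string
	ts       uint64
}

// cpUpdater retains at most one pending checkpoint per vchannel.
type cpUpdater struct {
	pending map[string]checkpoint
}

func newCPUpdater() *cpUpdater {
	return &cpUpdater{pending: make(map[string]checkpoint)}
}

// submit overwrites any pending checkpoint for the same vchannel, so the
// backlog never grows beyond the number of vchannels.
func (u *cpUpdater) submit(cp checkpoint) {
	u.pending[cp.vchannel] = cp
}

// drain clears the pending set and returns it in batches of at most
// batchSize entries (batchSize must be positive); each batch would map to
// one update RPC.
func (u *cpUpdater) drain(batchSize int) [][]checkpoint {
	all := make([]checkpoint, 0, len(u.pending))
	for _, cp := range u.pending {
		all = append(all, cp)
	}
	// Sort only to make the demo output deterministic.
	sort.Slice(all, func(i, j int) bool { return all[i].vchannel < all[j].vchannel })
	u.pending = make(map[string]checkpoint)

	var batches [][]checkpoint
	for start := 0; start < len(all); start += batchSize {
		end := start + batchSize
		if end > len(all) {
			end = len(all)
		}
		batches = append(batches, all[start:end])
	}
	return batches
}

func main() {
	u := newCPUpdater()
	for i := 0; i < 300; i++ {
		u.submit(checkpoint{vchannel: fmt.Sprintf("ch-%03d", i), ts: 1})
	}
	u.submit(checkpoint{vchannel: "ch-000", ts: 2}) // replaces the older entry

	batches := u.drain(128)
	fmt.Println(len(batches), len(batches[0]), len(batches[2])) // prints 3 128 44
	fmt.Println(batches[0][0].ts)                               // prints 2
}
```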
jaime 336e0ae45e
enhance: index meta use independent rather than global meta lock (#30986)
issue: https://github.com/milvus-io/milvus/issues/30837
pr: #30869

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-05 08:48:59 +08:00
chyezh df09222029
fix: lock starvation caused by slow GetCompactionTo method when there are too many segments (#30965)
issue: #30823
pr: #30963

Signed-off-by: chyezh <chyezh@outlook.com>
2024-03-04 20:51:00 +08:00
XuanYang-cn bb2de0d964
fix: [cherry-pick] Clear DN unknown compaction tasks (#30972)
If DC restarted, those unknown compaction tasks will never get a callback
in DN, so the segments in the compaction task stay locked and cannot be
synced or compacted again, which blocks cp advance and compaction
execution.

See also: #30137
pr: #30850

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-04 16:52:59 +08:00
cai.zhang 38e3d6af3e
enhance: Optimize DescribeIndex to reduce lock contention (#30975)
issue: #29313
issue: #30443
master pr: #30939

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-03-04 11:30:59 +08:00
cai.zhang ef086dc0ca
fix: [Pick] Skip filling segmentID in indexBuildCh to prevent flush blocked (#30749)
issue: #30580 
master pr: #30747

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-02-22 20:42:56 +08:00
wayblink b2d3278c56
enhance: Add log when garbage collection resumed (#30536)
/kind enhancement
pr: #30535

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-05 17:09:53 +08:00
cai.zhang 3c5ff624f8
fix: [pick]Only use bound indexnodes in bound mode (#30462)
master pr: #30461 
issue: #30463

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-02-03 21:59:05 +08:00
yihao.dai 20608287b9
fix: Decoupling importing segment from flush process (#30402) (#30439)
This PR decouples importing segments from the flush process by:
1. Excluding the importing segment from the flush policy. This approach
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord calls Flush, DataCoord directly sets the importing
segment state to `Flushed`.

issue: https://github.com/milvus-io/milvus/issues/30359

pr: https://github.com/milvus-io/milvus/pull/30402

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-03 12:59:14 +08:00
cqy123456 3036c19867
fix: cannot get search_cache_budget_gb in create index (#30353)
issue:https://github.com/milvus-io/milvus/issues/30375
pr: https://github.com/milvus-io/milvus/pull/30119

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-01-31 15:49:03 +08:00
chyezh 77e123762f
enhance: add graceful stop timeout to avoid node stop hang under extreme cases (#30320)
1. set the coordinator and proxy graceful stop timeout to 5s.
2. set the graceful stop timeout of other work nodes to 900s; we should
potentially change this to 600s when graceful stop is smooth.
3. change the stop order of the datacoord components.
4. `LivenessCheck` no longer performs a graceful shutdown.

issue: https://github.com/milvus-io/milvus/issues/30310
pr: #30317
also see: https://github.com/milvus-io/milvus/pull/30306

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-01-27 08:45:02 +08:00
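The graceful stop timeout from #30320 above follows a common pattern: run the stop routine in a goroutine and give up waiting after a deadline. A minimal sketch; `stopWithTimeout` is a hypothetical helper, not the function used in Milvus.

```go
package main

import (
	"fmt"
	"time"
)

// stopWithTimeout runs stop in the background and reports whether it
// finished within timeout. On timeout the caller can move on to a forced
// exit instead of hanging.
func stopWithTimeout(stop func(), timeout time.Duration) bool {
	done := make(chan struct{})
	go func() {
		defer close(done)
		stop()
	}()
	select {
	case <-done:
		return true
	case <-time.After(timeout):
		return false
	}
}

func main() {
	// A stop routine that returns promptly finishes inside the deadline.
	fmt.Println(stopWithTimeout(func() {}, time.Second)) // prints true

	// A stop routine that hangs is abandoned once the deadline passes.
	slow := func() { time.Sleep(time.Second) }
	fmt.Println(stopWithTimeout(slow, 50*time.Millisecond)) // prints false
}
```

In this sketch the timed-out stop routine keeps running in its goroutine; that is acceptable only when the process is about to exit anyway.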
yihao.dai 917a4d74f3
fix: Use channel cp as the dml&start position for import segments (#30107) (#30133)
This PR discontinues the subscription to the mq and instead employs
the channel checkpoint as the DML and starting position for the import
segments.

issue: https://github.com/milvus-io/milvus/issues/30106

pr: https://github.com/milvus-io/milvus/pull/30107

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-22 13:32:55 +08:00
foxspy 0700434c58
fix: patching search cache param when index meta does not hold one (#30116)
patch search cache param from index configs when index meta could not
get the search cache size key

issue: #30113 
pr: #30119

Signed-off-by: xianliang <xianliang.li@zilliz.com>
2024-01-19 11:50:56 +08:00
wayblink e1446da83c
feat: [Cherry-pick] Implement DescribeAlias and ListAliases interfaces (#29896)
#22882
pr: #29641

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-01-12 16:30:51 +08:00
chyezh 98aae10273
fix: compact operation on datacoord meta should perform as a transaction (#29776)
issue: #29691
pr: #29775

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-12 14:54:52 +08:00
chyezh 7d3ec9f869
fix: unhealthy datacoord started with unhealthy channel manager (#29849)
issue: #29818
pr: #29848

Signed-off-by: chyezh <ye.zhen@zilliz.com>
2024-01-12 14:24:54 +08:00
wei liu 603cd1fb3f
fix: Drop segment meta info with prefix (#29857)
pr: #29856
If a segment has more than 128 log files, dropping the segment will exceed
the etcd txn ops limit, which makes the drop segment request fail.
This PR drops segment meta info by prefix to avoid failing to drop
segment meta.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-01-11 15:02:50 +08:00
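The reasoning behind #29857 above can be shown with an in-memory stand-in for the meta store: removing keys one by one inside a single transaction hits the per-transaction operation limit, while a prefix removal is a single operation however many keys it covers. `fakeKV`, its methods and the `segment/1/binlog/N` key layout are hypothetical. The limit of 128 matches the number in the commit message and etcd's default `--max-txn-ops`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// fakeKV is an in-memory stand-in for a meta store with a txn ops limit.
type fakeKV struct {
	data      map[string]string
	maxTxnOps int
}

var errTooManyOps = errors.New("too many operations in txn request")

// multiRemove deletes keys in one simulated transaction, one op per key.
func (kv *fakeKV) multiRemove(keys []string) error {
	if len(keys) > kv.maxTxnOps {
		return errTooManyOps
	}
	for _, k := range keys {
		delete(kv.data, k)
	}
	return nil
}

// removeWithPrefix deletes every key under prefix as one range operation
// and returns how many keys were removed.
func (kv *fakeKV) removeWithPrefix(prefix string) int {
	removed := 0
	for k := range kv.data {
		if strings.HasPrefix(k, prefix) {
			delete(kv.data, k)
			removed++
		}
	}
	return removed
}

// newSegmentKV builds a store holding numLogs log entries for segment 1,
// using an assumed key layout.
func newSegmentKV(numLogs, maxTxnOps int) (*fakeKV, []string) {
	kv := &fakeKV{data: make(map[string]string), maxTxnOps: maxTxnOps}
	keys := make([]string, 0, numLogs)
	for i := 0; i < numLogs; i++ {
		k := fmt.Sprintf("segment/1/binlog/%d", i)
		kv.data[k] = "meta"
		keys = append(keys, k)
	}
	return kv, keys
}

func main() {
	kv, keys := newSegmentKV(200, 128)
	fmt.Println(kv.multiRemove(keys))              // prints too many operations in txn request
	fmt.Println(kv.removeWithPrefix("segment/1/")) // prints 200
}
```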
SimFG a2365e4b2a
enhance: [2.3] Add concurrency for datacoord segment GC (#29557)
issue: #29553
pr: https://github.com/milvus-io/milvus/pull/29561
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-03 13:16:57 +08:00
congqixia 5ba0f476d5
fix: [2.3]parse logID from logPath if copyDeltalog find logID not provided (#29276)
Cherry-pick from master
pr: #29273
See also: #29272

This PR adds `getDeltaLogID` to safely return logID when the Binlog struct
has a zero-value logID. It parses logID from logPath if the format is
valid. Otherwise, the function returns an error.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-27 14:42:46 +08:00
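The fallback described in #29276 above can be sketched as follows. This is not the actual `getDeltaLogID`: the helper name `deltaLogID`, its signature and the assumption that the log ID is the last `/`-separated element of the path are all illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// deltaLogID returns logID when it is set. When it is zero, the ID is
// parsed from the last element of logPath (an assumed path layout).
func deltaLogID(logID int64, logPath string) (int64, error) {
	if logID != 0 {
		return logID, nil
	}
	last := logPath[strings.LastIndex(logPath, "/")+1:]
	id, err := strconv.ParseInt(last, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid log path %q: %w", logPath, err)
	}
	return id, nil
}

func main() {
	id, err := deltaLogID(0, "files/delta_log/1/2/3/456")
	fmt.Println(id, err) // prints 456 <nil>

	id, err = deltaLogID(7, "")
	fmt.Println(id, err) // prints 7 <nil>

	_, err = deltaLogID(0, "not-a-log-path")
	fmt.Println(err != nil) // prints true
}
```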
wei liu 26b1853c54
fix: Auto balance param can't be updated by dynamic(#29501) (#29502)
pr: #29501
This PR fixes the issue that the auto balance param can't be updated dynamically

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-12-27 14:30:53 +08:00
aoiasd 908e075fdb
enhance: [Cherry-pick] pack datacoord Cluster and SessionManager with interface and mock them (#29171)
relate: https://github.com/milvus-io/milvus/issues/28861
https://github.com/milvus-io/milvus/issues/28854
pr: https://github.com/milvus-io/milvus/pull/28869

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2023-12-25 14:42:44 +08:00
SimFG b69543c7dc
fix: [2.3] Clean the compaction plan info to avoid the object leak (#29368)
issue: https://github.com/milvus-io/milvus/issues/29296
pr: #29365

Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-12-22 12:02:44 +08:00
SimFG 74e72ce27e
enhance: [2.3] Support to get the param value in the runtime (#29298)
pr: #29297
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-12-21 20:36:43 +08:00
congqixia 9acf32a0b7
enhance: [cherry-pick] change cp metric to absolute unix ts (#29328) (#29337)
Cherry pick from master
pr: #29328 

See also #29327

Change channel checkpoint metrics to unix seconds instead of checkpoint
timestamp lag value

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-20 15:04:42 +08:00
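The metric change in #29337 above swaps a lag value, which depends on when it was computed, for an absolute unix time. The sketch below assumes a hybrid timestamp whose upper bits hold physical milliseconds and whose lower 18 bits hold a logical counter. That layout and every function name here are assumptions made for illustration, not taken from the commit.

```go
package main

import "fmt"

// logicalBits is the assumed width of the logical counter in a hybrid timestamp.
const logicalBits = 18

// composeTS packs physical milliseconds and a logical counter into one value.
func composeTS(physicalMs, logical int64) uint64 {
	return uint64(physicalMs)<<logicalBits | uint64(logical)
}

// checkpointUnixSeconds is the absolute form: it depends only on the checkpoint.
func checkpointUnixSeconds(ts uint64) float64 {
	return float64(ts>>logicalBits) / 1000
}

// checkpointLagSeconds is the lag form: it also depends on the current
// time, so a stored value goes stale as soon as it stops being updated.
func checkpointLagSeconds(ts uint64, nowMs int64) float64 {
	return float64(nowMs-int64(ts>>logicalBits)) / 1000
}

func main() {
	ts := composeTS(1703055882000, 7)
	fmt.Printf("%.0f\n", checkpointUnixSeconds(ts))               // prints 1703055882
	fmt.Printf("%.0f\n", checkpointLagSeconds(ts, 1703055892000)) // prints 10
}
```

With the absolute value exported, a dashboard can derive the lag at query time as the current time minus the metric.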
cai.zhang 3182b9df5b
fix: [Pick]Set the default index name to the name of the existing index (#29281)
issue: #29269 
master pr: #29275

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2023-12-20 14:10:40 +08:00
XuanYang-cn 7facaa0c40
fix: [Cherry-pick] fix unstable ConsistencyHashPolicy ut (#28375) (#29235)
Fixes: #28372, #29234
pr: #28375

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-12-15 18:34:38 +08:00
congqixia d8d699401b
enhance: [cherry-pick] Add http method to control datacoord garbage collection (#29212)
Cherry-pick from master
pr: #29052
See also #29051

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Signed-off-by: Congqi.Xia <congqi.xia@zilliz.com>
2023-12-15 02:16:38 +08:00