milvus

Commit Graph

Author	SHA1	Message	Date
yihao.dai	2a037a97f1	enhance: Add get vector latency metric and refine request limit error message (#40083 ) issue: https://github.com/milvus-io/milvus/issues/40078 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-02-21 19:41:55 +08:00
wei liu	7d2c948c69	fix: task delta cache leak on reduce task (#40055 ) issue: #40052 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-21 16:47:54 +08:00
wei liu	07578041ba	fix: querycoord panic in corner case (#40057 ) issue: #40050 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-21 11:19:58 +08:00
Zhen Ye	64dad60dc2	fix: delegator doesn't follow with wal if streaming enabled (#39890 ) issue: #38399 Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-17 14:10:15 +08:00
Bingyi Sun	b59555057d	feat: support json index (#36750 ) https://github.com/milvus-io/milvus/issues/35528 This PR adds json index support for json and dynamic fields. Now you can only do unary query like 'a["b"] > 1' using this index. We will support more filter type later. basic usage: ``` collection.create_index("json_field", {"index_type": "INVERTED", "params": {"json_cast_type": DataType.STRING, "json_path": 'json_field["a"]["b"]'}}) ``` There are some limits to use this index: 1. If a record does not have the json path you specify, it will be ignored and there will not be an error. 2. If a value of the json path fails to be cast to the type you specify, it will be ignored and there will not be an error. 3. A specific json path can have only one json index. 4. If you try to create more than one json indexes for one json field, sdk(pymilvus<=2.4.7) may return immediately because of internal implementation. This will be fixed in a later version. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-15 14:06:15 +08:00
wei liu	bfc802297e	enhance: Add management api to check querycoord balance status (#37784 ) issue: #37783 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-14 18:00:14 +08:00
wei liu	b9e3ec7175	enhance: Add trigger interval config for auto balance (#39154 ) issue: #39156 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-14 16:12:15 +08:00
congqixia	58045a3396	fix: Check collection released before target checks (#39841 ) Related to #39840 The target could be updated asynchronously in the previous code. This PR makes removing a collection from the target observer block until all related tasks in the dispatchers are removed, preventing the metrics from being updated after the collection is released. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-14 11:38:14 +08:00
Zhen Ye	0988807160	enhance: enable write ahead buffer for streaming service (#39771 ) issue: #38399 - Make a timetick-commit-based write ahead buffer at write side. - Add a switchable scanner at read side to transfer the state between catchup and tailing read Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-12 20:38:46 +08:00
wei liu	c12c4b4fff	fix: [skip e2e] pr conflict cause ut failed (#39811 ) Related to https://github.com/milvus-io/milvus/pull/39701 & https://github.com/milvus-io/milvus/issues/39681 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-12 11:44:51 +08:00
congqixia	7b51e4839f	fix: Resolve conflict on qc task test (#39796 ) Related to #39701 & #39681 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-11 18:40:45 +08:00
wei liu	ff5c680c99	fix: load collection gets stuck if compaction/gc happens (#39701 ) issue: #39680 If compaction/gc happens, load collection may get stuck due to SegmentNotFound; we should trigger UpdateNextTarget to get a new data view to execute the loading operation. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-11 15:48:50 +08:00
wei liu	85c9f92ff4	fix: uneven distribution caused by executing task delta cache leak (#39702 ) issue: #39681 This PR maintains the workload effect in the action instead of computing it from the target, which may cause a leak if the target changes. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-11 14:30:46 +08:00
Zhen Ye	d3e32bb599	enhance: make pchannel level flusher (#39275 ) issue: #38399 - Add a pchannel level checkpoint for flush processing - Refactor the recovery of flushers of wal - make a shared wal scanner first, then make multi datasyncservice on it Signed-off-by: chyezh <chyezh@outlook.com>	2025-02-10 16:32:45 +08:00
jaime	8a4ac8cccd	enhance: expose more metrics data (#39456 ) issue: #36621 #39417 1. Adjust the server-side cache size. 2. Add source information for configurations. 3. Add node ID for compaction and indexing tasks. 4. Resolve localhost access issues to fix health check failures for etcd. Signed-off-by: jaime <yun.zhang@zilliz.com>	2025-02-07 11:50:50 +08:00
wei liu	05ac4041aa	enhance: use rated logger for high frequency log in dist handler (#39452 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-02-05 15:31:10 +08:00
Zhen Ye	c84a0748c4	enhance: add rw/ro streaming query node replica management (#38677 ) issue: #38399 - Embed the query node into streaming node to make delegator available at streaming node. - The embedded query node has a special server label `QUERYNODE_STREAMING-EMBEDDED`. - Change the balance strategy to make the channel assigned to streaming node as much as possible. Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-24 16:55:07 +08:00
yihao.dai	5fb597b37b	fix: Remove frequently updating metric to avoid mutex contention (#38775 ) issue: https://github.com/milvus-io/milvus/issues/37630 Reduce the frequency updating metrics to avoid holding the mutex for long periods. --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-24 10:31:07 +08:00
yihao.dai	e0b26260f2	enhance: enable task delta cache (#39307 ) When there are many segment tasks in the querycoord scheduler, the traversal in `GetSegmentTaskDelta` checks becomes time-consuming. This PR adds caching for segment deltas. issue: https://github.com/milvus-io/milvus/issues/37630 Signed-off-by: Wei Liu <wei.liu@zilliz.com> Co-authored-by: Wei Liu <wei.liu@zilliz.com>	2025-01-23 14:31:16 +08:00
yihao.dai	38f813bed3	enhance: Read metadata concurrently to accelerate recovery (#38403 ) Read metadata such as segments, binlogs, and partitions concurrently at the collection level. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-23 14:27:27 +08:00
yihao.dai	e55d6506e3	enhance: Remove frequent observe log (#39413 ) /kind improvement Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-20 11:01:10 +08:00
yihao.dai	657550cf06	fix: Fix slow dist handle and slow observe (#38566 ) 1. Provide partition&channel level indexing in the collection target. 2. Make `SegmentAction` not wait for distribution. 3. Remove scheduler and target manager mutex. 4. Optimize logging to reduce CPU overhead. issue: https://github.com/milvus-io/milvus/issues/37630 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-01-15 20:17:00 +08:00
wei liu	d2834a1812	enhance: Add logs for check health failed (#39208 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-15 17:31:00 +08:00
jaime	e8f76cd2d9	fix: unstable ut in leader_vew_manager.go file (#39161 ) issue: #38672 Signed-off-by: jaime <yun.zhang@zilliz.com>	2025-01-15 12:26:59 +08:00
Zhen Ye	3e788f0fbd	enhance: record memory size (uncompressed) item for index (#38770 ) issue: #38715 - Current milvus use a serialized index size(compressed) for estimate resource for loading. - Add a new field `MemSize` (before compressing) for index to estimate resource. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-14 10:33:06 +08:00
wei liu	cc5d59392a	fix: channel unbalance during stopping balance progress (#38971 ) issue: #38970 The stopping balance channel still uses the row_count_based policy, which may cause channel unbalance in the multi-collection case. This PR implements a score-based stopping balance channel policy. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-13 11:21:06 +08:00
wei liu	826b726c86	fix: Prevent leader checker from generating excessive duplicate leader tasks (#39000 ) issue: #39001 Background: Segment Load Version: Each segment load request assigns a timestamp as its version. When multiple copies of a segment are loaded on different QueryNodes, the leader checker uses this version to identify the latest copy and updates the routing table in the leader view to point to it. Delegator Router Version: When a delegator builds a route to a QueryNode that has loaded a segment, it also records the segment's version. Router Table Update Logic: If the leader checker detects that the version of a segment in the routing table does not match the version in the worker, it updates the routing table to point to the QueryNode with the latest version. Additionally, it updates the segment's load version in the QueryNode during this process. Issue: When a channel is undergoing load balancing, the leader checker may sync the routing table to a new delegator. This sync operation modifies the segment's load version, which invalidates the routing in the old delegator. Subsequently, the leader checker updates the routing table in the old delegator, breaking the routing in the new delegator. This cycle continues, causing repeated updates and inconsistencies. Fix: This PR introduces two changes to address the issue: 1. Use NodeID to verify whether the delegator's routing table needs an update, avoiding unnecessary modifications. 2. Ensure compatibility by using the latest segment's load version as the version recorded in the routing table. These changes resolve the cyclic updates and prevent the leader checker from generating excessive duplicate tasks, ensuring routing stability across delegators during load balancing. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-10 14:12:57 +08:00
Zhen Ye	bb8d1ab3bf	enhance: make new go package to manage proto (#39114 ) issue: #39095 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-10 10:49:01 +08:00
jaime	f03a85725a	enhance: add db name in replica (#38672 ) issue: #36621 Signed-off-by: jaime <yun.zhang@zilliz.com>	2025-01-09 19:40:59 +08:00
wei liu	47e7ea241e	enhance: Add log for case which target not update as expected (#38944 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2025-01-07 17:45:03 +08:00
Xiaofan	cb6eca8e91	fix: drop partition can not be successful if load failed (#38793 ) fix #38649 when partition load failed, the partition drop will also fail due to the wrong error message Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-12-30 19:42:52 +08:00
wei liu	f49d618382	fix: Querycoord will trigger unexpected balance task after restart (#38630 ) issue: #38606 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-25 19:30:48 +08:00
wei liu	25f0c82ceb	fix: Fix update loading collection's load config doesn't work (#38595 ) issue: #38594 --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-25 18:02:51 +08:00
wei liu	9c3f59dbbe	fix: Prevent balancer from overloading the same QueryNode (#38719 ) issue: #38718 The balancer calculates the workload of executing tasks as an ongoing score for target nodes. However, a logic issue arises when GetSegmentTaskDelta or GetChannelTaskDelta is called with collectionID=-1, which incorrectly returns zero. Due to the incorrect global score, the executing task's workload is not properly reflected for each collection. Consequently, each collection submits its own balance task, leading to the balancer assigning excessive tasks to the same QueryNode. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-25 16:36:49 +08:00
jaime	5afd0c0a2b	fix: Revert "Expose metrics of stanby coordinators (#27698 )" (#38620 ) issue: #38608 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-23 11:46:57 +08:00
jaime	78438ef41e	fix: revert optimize CPU usage for CheckHealth requests (#35589 ) (#38555 ) issue: #35563 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-19 00:38:45 +08:00
yihao.dai	d3c174b0f1	enhance: Accelerate observe collection (#38028 ) 1. A collection should observe the channel only once. 2. A collection should check the CollectionLoadPercent for updates only once. 3. Skip saving coll/partition meta if there are no changes, primarily to accelerate collection observation after recovery. issue: https://github.com/milvus-io/milvus/issues/37630 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-12-17 14:14:45 +08:00
jaime	28fdbc4e30	enhance: optimize CPU usage for CheckHealth requests (#35589 ) issue: #35563 1. Use an internal health checker to monitor the cluster's health state, storing the latest state on the coordinator node. The CheckHealth request retrieves the cluster's health from this latest state on the proxy sides, which enhances cluster stability. 2. Each health check will assess all collections and channels, with detailed failure messages temporarily saved in the latest state. 3. Use CheckHealth request instead of the heavy GetMetrics request on the querynode and datanode Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-17 11:02:45 +08:00
SimFG	2afe2eaf3e	feat: support to replicate collection when the services contains the system tt msg (#37559 ) - issue: #37105 --------- Signed-off-by: SimFG <bang.fu@zilliz.com>	2024-12-17 09:08:46 +08:00
wei liu	659847c11f	enhance: Remove load task limit in one round (#38436 ) The task limit in assignSegment/assignChannel applies to both load tasks and balance tasks. This PR removes the load task limit and only limits the number of balance tasks in one round. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-16 19:30:43 +08:00
wei liu	40f9db491e	fix: Fix SyncDistribution may cost too much time on retry (#38454 ) issue: #38428 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-16 11:38:44 +08:00
tinswzy	27229f7907	enhance: refine exists log print with ctx (#38080 ) issue: #35917 Refines exists log print with ctx Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-12-14 22:36:44 +08:00
Zhen Ye	833c74aa66	enhance: add detail, replica count for resource group (#38314 ) issue: #30647 --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-12-13 14:14:50 +08:00
wei liu	e279ccf109	enhance: Enable score based balance channel policy (#38143 ) issue: #38142 Currently the balance channel policy only considers the current collection's distribution, so if every collection has 1 channel and all channels have been loaded on the same querynode, balance channel won't be triggered after the number of querynodes increases. This PR enables the score-based balance channel policy to: 1. distribute all channels evenly across multiple querynodes 2. distribute each collection's channels evenly across multiple querynodes. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-11 17:20:43 +08:00
Zhen Ye	d3ae8e9232	fix: delay the wait other coord logic in query coord after query coord change into standby state (#38259 ) issue: https://github.com/milvus-io/milvus/issues/37764 - After removing the rpc layer from mixcoord, the querycoord in standby mode will be blocked forever during a rolling deployment --------- Signed-off-by: chyezh <chyezh@outlook.com>	2024-12-11 15:48:42 +08:00
wei liu	950203aba0	enhance: Optimize save collection target latency (#38345 ) issue: #38237 This PR uses a better compression level only for proto msgs larger than 1MB, and a lighter compression level for smaller proto msgs, which gives better latency in most cases. This PR reduces the latency from 22.7s to 4.7s with 10000 collections, each with 1000 segments. before this PR: BenchmarkTargetManager-8 1 22781536357 ns/op 566407275088 B/op 11188282 allocs/op after this PR: BenchmarkTargetManager-8 1 4729566944 ns/op 36713248864 B/op 10963615 allocs/op Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-11 10:12:43 +08:00
congqixia	7ea9c983d2	enhance: Add mockery package config for QC&QN (#38340 ) Related to #38339 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-10 19:18:42 +08:00
wei liu	856e2aad7d	fix: Leader task stuck and retry again and again (#38202 ) issue: #38201 A leader task needs to update the delegator's distribution, and only succeeds after the distribution change has been applied to the delegator. But the delegator will reject the distribution change if its version is older than the current version in the delegator, which causes the leader task to get stuck and retry forever. This PR removes the leader task finish check. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-10 19:16:42 +08:00
wei liu	f04986fceb	enhance: Remove constraint on release segment task (#38297 ) issue: #38305 Now that balance segment and balance channel can no longer happen at the same time, the constraint that release segment must happen on a serviceable shard leader is unnecessary. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-12-10 11:18:49 +08:00
jaime	8ed019735c	enhance: add disk stats within system metrics (#38033 ) issue: #36621 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-12-06 16:32:41 +08:00
congqixia	36946cc9ce	enhance: Set loaded collection/partition number to metrics (#38271 ) Related to #36456 Previous PR: #38471 #38233 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-06 16:18:40 +08:00
congqixia	6ff19481f0	enhance: Resolve compilation error due to PR conflict (#38252 ) Related pr: #38233 #38059 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-05 19:26:40 +08:00
congqixia	051bc280dd	enhance: Make dynamic load/release partition follow targets (#38059 ) Related to #37849 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-05 16:24:40 +08:00
congqixia	32645fc28a	enhance: Unify querycoord meta metrics (#38233 ) Related to #36456 Unify collection/partition number metrics to collection manager in case of unwant missing modification Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-05 15:48:39 +08:00
tinswzy	7944538ade	enhance: Add ctx param to KV operation interfaces (#38154 ) issue: #35917 Refine KV operation interfaces by adding a ctx param Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-12-05 15:16:41 +08:00
tinswzy	e76802f910	enhance: refine querycoord meta/catalog related interfaces to ensure that each method includes a ctx parameter (#37916 ) issue: #35917 This PR refine the querycoord meta related interfaces to ensure that each method includes a ctx parameter. Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-11-25 11:14:34 +08:00
jaime	7bbfe86bcd	enhance: add list index and segment index retrieval API for WebUI (#37861 ) issue: #36621 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-11-22 16:58:34 +08:00
congqixia	b34bfb98a0	enhance: Refine Replica manager colle2Replicas secondary index (#37906 ) Related to #37630 This PR add a new util coll2Replicas secondary index to reduce map access & iteration while get replicas by collection --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-22 11:54:32 +08:00
wei liu	965bda6e60	enhance: Add channel name to shard leader log in meta cache (#37856 ) Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-21 19:24:31 +08:00
wei liu	0a440e0d38	fix: Prevent simultaneous balance of segments and channels (#37850 ) issue: #33550 Balance segment and balance channel may execute at the same time, which causes a bunch of corner cases. This PR disables simultaneous balance of segments and channels. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-21 17:56:55 +08:00
wei liu	b983ef9fca	fix: Channel may be released after balance (#37862 ) issue: #37830 Because the dist handler doesn't set the channel's version, if the channel checker tries to dedup channels, it may release the new delegator after balance finishes. This PR fixes this by setting the proper version for the channel. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-21 10:40:31 +08:00
congqixia	b8d31ebed8	enhance: Remove unnecessary segment clone updating dist (#37797 ) Related to #37630 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-20 11:26:31 +08:00
yihao.dai	b6612e02b4	enhance: Reduce GetIndexInfos calls (#37695 ) Batch `GetIndexInfos` calls for segments to reduce RPC calls. issue: https://github.com/milvus-io/milvus/issues/37634 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-11-19 14:24:31 +08:00
congqixia	6d86b9022e	enhance: Provide secondary index criteria when filter leaderview (#37777 ) Related to #37630 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-19 10:12:30 +08:00
jaime	257ecab84b	enhance: remove collection queryable check from health check (#37712 ) Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-11-18 10:50:38 +08:00
congqixia	b0bd290a6e	enhance: Use internal json(sonic) to replace std json lib (#37708 ) Related to #35020 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-18 10:46:31 +08:00
wei liu	a1b6be1253	fix: Delegator stuck at unserviceable status (#37694 ) issue: #37679 pr #36549 introduced a logic error that updates the current target when only part of the channels are ready. This PR fixes the logic error and lets the dist handler keep pulling distribution from the querynode until all delegators become serviceable. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-15 10:20:31 +08:00
jaime	1d06d4324b	fix: Int64 overflow in JSON encoding (#37657 ) issue: #36621 - For simple types in a struct, add "string" to the JSON tag for automatic string conversion during JSON encoding. - For complex types in a struct, replace "int64" with "string." Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-11-14 22:52:30 +08:00
wei liu	1304b40552	fix: Balance channel may stuck at increasing replica number case (#37641 ) issue: #37640 Fixes an issue introduced by pr #36549: balance channel waits until the new delegator becomes serviceable, but the new delegator needs to sync the target version before it becomes serviceable, and syncing the target version has to wait until all replicas are loaded. So if increasing the replica number and balance channel happen at the same time, a logical deadlock occurs. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-14 10:08:31 +08:00
jaime	1e8ea4a7e7	feat: add segment/channel/task/slow query render (#37561 ) issue: #36621 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-11-12 17:44:29 +08:00
wei liu	266f8ef1f5	fix: Search may return less result after qn recover (#36549 ) issue: #36293 #36242 After qn recovers, the delegator may be loaded on a new node; after all segments have been loaded, the delegator becomes serviceable. But the delegator's target version hasn't been synced, and if a search/query comes in, the delegator will use the wrong target version and filter out an empty segment list, which causes an empty search result. This PR blocks the delegator's serviceable status until the target version is synced --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-12 16:34:28 +08:00
congqixia	f5b06a3c9f	enhance: Invalidate collection cache when release collection (#37577 ) Related to #37395 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-12 10:16:29 +08:00
wei liu	61a5b15ada	fix: Lost loading collection's updateTs after qc restart (#37538 ) issue: #37537 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-11 14:34:28 +08:00
congqixia	5e90f348fc	enhance: Handle legacy proxy load fields request (#37565 ) Related to #35415 In rolling upgrade, a legacy proxy may dispatch a load request with an empty load field list. The upgraded querycoord may mistakenly report an error that the load field list has changed. This PR: - Auto-fills an empty load field list with all user field ids - Refines the error message when the load field list updates - Refines the load job unit test with service cases Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-11 10:14:26 +08:00
sthuang	70605cf5b3	enhance: Support custom privilege group for RBAC (#37087 ) issue: #37031 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-11-09 08:44:28 +08:00
yihao.dai	ff9bdf7029	fix: Fix load slowly (#37454 ) When there're a lot of loaded collections, they would occupy the target observer scheduler’s pool. This prevents loading collections from updating the current target in time, slowing down the load process. This PR adds a separate target dispatcher for loading collections. issue: https://github.com/milvus-io/milvus/issues/37166 --------- Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2024-11-09 07:48:26 +08:00
congqixia	dcc1b506dc	enhance: Add context trace for querycoord queryable check (#37524 ) When the check health logic fails because a collection is not queryable, the reason is hard to find in the log. This PR adds context to the log with a trace id and prints an info log for the unqueryable collection. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-08 14:00:26 +08:00
wei liu	a03157838b	enhance: Enable node assign policy on resource group (#36968 ) issue: #36977 With node_label_filter on a resource group, users can add a label to a querynode with the env `MILVUS_COMPONENT_LABEL`, and the resource group will prefer to accept nodes that match its node_label_filter. Querynodes can then be grouped by label, with querynodes sharing the same label placed in the same resource group. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-08 11:18:27 +08:00
Xiaofan	e073906a19	enhance: optimize describe collection and index (#37490 ) fix #37489 combine multiple describe collection and list index into one call Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2024-11-08 10:18:34 +08:00
jaime	f348bd9441	feat: add segment,pipeline, replica and resourcegroup api for WebUI (#37344 ) issue: #36621 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-11-07 11:52:25 +08:00
wei liu	8714774305	fix: search/query failed due to segment not loaded (#37403 ) issue: #36970 Release segment and balance channel may happen at the same time. Before the new delegator becomes serviceable, if release segment executes on the new delegator while search/query arrives on the old delegator, then release segment and query segment happen in parallel; if release segment executes first in the worker, search/query will get a SegmentNodeLoaded error. This PR adds a serviceable filter on delegators, so that all load/release segment operations happen on a serviceable delegator. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-11-06 15:10:25 +08:00
congqixia	9539739781	enhance: Release compacted growing segment if in dropped list (#37245 ) See also #37205 Previously releasing growing segments could be triggered by two conditions: - Sealed Segment with same id is loaded - Segment start position is before target checkpoint ts Which has a worst case that the corresponding sealed segment is compacted and the checkpoint is pinned by a growing l0 segment. This PR introduces a new rule that: a growing segment could be released if the segment id appeared in current target dropped segment id list. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-10-29 18:04:21 +08:00
jaime	9d16b972ea	feat: add tasks page into management WebUI (#37002 ) issue: #36621 1. Add API to access task runtime metrics, including: - build index task - compaction task - import task - balance (including load/release of segments/channels and some leader tasks on querycoord) - sync task 2. Add a debug model to the webpage by using debug=true or debug=false in the URL query parameters to enable or disable debug mode. Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-10-28 10:13:29 +08:00
wei liu	39a91eb100	fix: Delegator may become unserviceable after querycoord restart (#37055 ) issue: #37054 After querycoord restarts, segment_checker may release segments by mistake because the next target isn't ready yet. This PR requires that release segment happens only after the next target is ready. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-10-24 12:21:28 +08:00
congqixia	f43527ef6f	enhance: Batch forward delete when using DirectForward (#37076 ) Related to #36887 DirectForward streaming delete will cause memory usage to explode if the number of segments is large. This PR adds a batching delete API and uses it for the direct forward implementation. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-10-24 10:39:28 +08:00
wei liu	f029314e20	fix: Dynamic release partition may fail search/query. (#37049 ) issue: #33550 Due to a wrong impl of UpdateCollectionNextTarget, if ReleaseCollection and UpdateCollectionNextTarget happen at the same time, the released partition's segment list may be added to the target again, and the delegator will be marked as unserviceable due to the missing segments. This PR fixes the impl of UpdateCollectionNextTarget Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-10-24 01:03:28 +08:00
jaime	4746f47282	feat: management WebUI homepage (#36822 ) issue: #36784 1. Implement an embedded web server for WebUI access. 2. Complete the homepage development. Home page demo: https://github.com/user-attachments/assets/38539917-ce09-4e54-a5b5-7f4f7eaac353 Signed-off-by: jaime <yun.zhang@zilliz.com>	2024-10-23 11:29:28 +08:00
Bingyi Sun	6851738fd1	fix: fix `make generate-mockery` panic with go1.22 (#36830 ) https://github.com/milvus-io/milvus/issues/36831 Fix `make generate-mockery` panic. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-10-17 12:11:31 +08:00
congqixia	3fe0f82923	enhance: Add balance report log for qc balancer (#36747 ) Related to #36746 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-10-11 10:25:24 +08:00
aoiasd	db34572c56	feat: support load and query with bm25 metric (#36071 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-10-11 10:23:20 +08:00
wei liu	470bb0cc3f	enhance: Enable balance on querynode with different mem capacity (#36466 ) issue: #36464 This PR enables balance on querynodes with different mem capacity: a query node with more mem capacity will be assigned more records, and the query node with the largest difference between assignedScore and currentScore will have a higher priority to carry the new segment. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-30 16:15:17 +08:00
cai.zhang	ecb2b242e2	enhance: Add sorted for segment info (#36469 ) issue: #33744 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-09-30 10:01:16 +08:00
wei liu	55be814a58	enhance: make TransferChannel/TransferSegment idempotent (#36489 ) issue: #36488 When TransferChannel/TransferSegment is called, querycoord generates and submits a balance task to the scheduler; if the segment/channel's task already exists in the scheduler, submitting the task fails. To make TransferChannel/TransferSegment idempotent, we skip the submit if the task already exists in the scheduler. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-26 18:11:23 +08:00
wei liu	5dfa1c3397	fix: Segment unbalance after many times load/release (#36537 ) issue: #36536 query coord uses `segmentTaskDelta/channelTaskDelta` to measure the executing workload for each querynode in the scheduler, and maintains `segmentTaskDelta/channelTaskDelta` via `scheduler.Add(task)` and `scheduler.remove(task)`. But `scheduler.remove(task)` has been called in unexpected ways, which produces a wrong `segmentTaskDelta/channelTaskDelta` value, affects the segment assign logic, and causes segment unbalance. This PR computes `segmentTaskDelta/channelTaskDelta` on access instead, to avoid the effect of the wrong value. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-26 15:13:15 +08:00
sthuang	4493aa2142	fix: querycoord collection num metric (#36471 ) related to: #36456 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2024-09-26 14:23:13 +08:00
wei liu	3cd0b26285	enhance: Enable dynamic update loaded collection's replica (#35822 ) issue: #35821 After a collection is loaded, if we need to increase/decrease the collection's replica number, we need to release and load it again. Milvus offers 4 solutions to update a loaded collection's replicas; this PR aims to dynamically change the replica number without release. After the replica number changes, milvus executes load replica or release replica asynchronously, and the replica loaded status can be checked with the getReplicas API. Note that if more replicas are set than the querynodes can afford, the new replicas won't be loaded successfully until enough querynodes join. --------- Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-25 10:13:18 +08:00
wei liu	3bd7ec8751	fix: Fix corner case that segment can't be moved out from stopping node (#36431 ) issue: #36426 The old constraint required that only segments in the current target can be balanced, which is wrong and meant that a segment can't be moved out from a stopping node if it only exists in the next target. By design, stopping balance needs to move all segments out of the node via balance tasks, so the old constraint should be removed. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-24 17:01:14 +08:00
wei liu	f7d950d465	fix: [skip e2e] Fix unstable ut TestCollectionObserver (#36231 ) issue: #36237 Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-13 19:01:09 +08:00
wei liu	fb2a41a94c	fix: Clean dirty segment/channel on querynode (#36202 ) issue: #36201 After a querynode has been removed from a replica, all dirty segments/channels on it should be released. Signed-off-by: Wei Liu <wei.liu@zilliz.com>	2024-09-13 18:15:08 +08:00
Jiquan Long	89bf226f0b	feat: support keyword text match (#35923 ) fix: #35922 --------- Signed-off-by: longjiquan <jiquan.long@zilliz.com>	2024-09-10 15:11:08 +08:00

699 Commits (master)