Commit Graph

11128 Commits (master)

Author SHA1 Message Date
Buqian Zheng d23205b718
enhance: DataCodec to release ownership of input_data after initialization (#43542)
issue: https://github.com/milvus-io/milvus/issues/43088
issue: https://github.com/milvus-io/milvus/issues/43038

see also https://github.com/milvus-io/milvus/pull/43533.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-25 14:24:54 +08:00
wei liu 369a811ae1
fix: only clear exclude node list after refresh shard leader cache (#43553)
issue: #43511

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-25 14:18:54 +08:00
sthuang 5cebc9f7f6
fix: [StorageV2] handle correct cid with multiple files and add storage v2 prefix logs (#43539)
related: #43372

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-25 11:22:54 +08:00
Shuyoou 87326a5a64
fix: [skip e2e] webui collection filter params error (#42969)
Fix Issue: #40929

Signed-off-by: Shuyoou <shuyoou@outlook.com>
2025-07-25 10:40:53 +08:00
Spade A 10fe53ff59
feat: support json for ngram (#43170)
Ref https://github.com/milvus-io/milvus/issues/42053

This PR enable ngram to support json data type.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-25 10:28:54 +08:00
sthuang a0c9f499ee
fix: [StorageV2] sync panic with nullable add field (#43142)
related: https://github.com/milvus-io/milvus/pull/42932
fix: https://github.com/milvus-io/milvus/issues/43072

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-25 10:08:53 +08:00
zhagnlu c86307aef0
enhance: forbid two column comparison with json type in parser stage (#43382)
#43381

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-24 19:42:54 +08:00
yihao.dai 804a7692a6
fix: Fix delete loss caused by missing mutual exclusion in sort compaction (#43540)
issue: https://github.com/milvus-io/milvus/issues/43513

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-24 14:53:34 +08:00
Buqian Zheng d367770649
enhance: greatly reduce the loading memory overhead - by up to 25% (#43533)
issue: #43088
issue: #43038

The current loading process:

* When loading an index, we first download the index files into a list
of buffers, say A
* then constructing(copying) them into a vector of FieldDatas(each file
is a FieldData), say B
* assembles them together as a huge BinarySet, say C
* lastly, copy into the actual index data structure, say D

The problem:

* We can see that, after each step, we don't need the data in previous
step.
* But currently, we release the memory of A, B, C only after we have
finished constructing D
* This leads to a up to 4x peak memory usage comparing with the raw
index size, during the loading process
* This PR allows timely releasing of B after we assembled C. So after
this PR, the peak memory usage during loading will be up to 3x of the
raw index size.

I will create another PR to release A after we created B, that seems
more complicated and need more work.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-24 11:26:54 +08:00
congqixia 4bdb5ccafa
fix: Close segment writer when reader returns error (#43531)
Realted #43520

Datanode may have memory leakage when reader returns error. In
previously mention issue, datanodes got OOM killed due to continueous
error in read path.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-24 11:18:54 +08:00
Jean-Francois Weber-Marx 1bd66b09e3
enhance: allow '.' and '-' characters in usernames (#42417) (#42588)
related: #42417

- update the isValidUsername function to accept dots and hyphens in
addition to letters, digits, and underscores
- this change improves compatibility with common username formats and
addresses feedback in issue #42417

Signed-off-by: Jean-Francois Weber-Marx <jfwm@hotmail.com>
Signed-off-by: Jean-Francois Weber-Marx <jf.webermarx@criteo.com>
2025-07-24 09:54:54 +08:00
wei liu 990a25e51a
fix: Prevent delete records loss during slow segment loading [QueryNodeV2] (#43527)
issue: #42884
Fixes an issue where delete records for a segment are lost from the
delete buffer if `load segment` execution on the delegator is too slow,
causing `syncTargetVersion` or other cleanup operations to clear them
prematurely.

Changes include:
- Introduced `Pin` and `Unpin` methods in `DeleteBuffer` interface and
its implementations (`doubleCacheBuffer`, `listDeleteBuffer`).
- Added a `pinnedTimestamps` map to track timestamps protected from
cleanup by specific segments.
- Modified `LoadSegments` in `shardDelegator` to `Pin` relevant segment
delete records before loading and `Unpin` them afterwards.
- Added `isPinned` check in `UnRegister` and `TryDiscard` methods of
`listDeleteBuffer` to skip cleanup if corresponding timestamps are
pinned.
- Added comprehensive unit tests for `Pin`, `Unpin`, and `isPinned`
functionality, covering basic, multiple pins, concurrent, and edge
cases.

This ensures the integrity of delete records by preventing their
premature removal from the delete buffer during segment loading.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-24 01:00:54 +08:00
congqixia 1cf8ed505f
fix: Implement `NeededFields` feature in `RecordReader` (#43523)
Related to #43522

Currently, passing partial schema to storage v2 packed reader may
trigger SEGV during clustering compaction unit test.

This patch implement `NeededFields` differently in each `RecordReader`
imlementation. For now, v2 will implemented as no-op. This will be
supported after packed reader support this API.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-24 00:22:54 +08:00
Zhen Ye e9ab73e93d
enhance: add schema version at recovery storage (#43500)
issue: #43072, #43289

- manage the schema version at recovery storage.
- update the schema when creating collection or alter schema.
- get schema at write buffer based on version.
- recover the schema when upgrading from 2.5.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-23 21:38:54 +08:00
yihao.dai 9fbd41a97d
fix: Adjust binlog and parquet reader buffer size for import (#43495)
1. Modify the binlog reader to stop reading a fixed 4096 rows and
instead use the calculated bufferSize to avoid generating small binlogs.
2. Use a fixed bufferSize (32MB) for the Parquet reader to prevent OOM.

issue: https://github.com/milvus-io/milvus/issues/43387

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-23 21:28:54 +08:00
foxspy ed57650b52
fix: remove invalid restrictions on dim for int8 vector (#43469)
issue: #43466

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-07-23 20:22:54 +08:00
cai.zhang 74c08069ef
fix: Set result storage version for sort compaction (#43521)
issue: #43520

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-23 19:04:53 +08:00
zhagnlu d64dceea47
fix:add convert int to float function to array_contains related expr (#43468)
#43281

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-23 15:20:53 +08:00
junjiejiangjjj 4db877f76c
fix: Fix weighted rerank (#43503)
#43478

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-07-23 14:54:53 +08:00
Buqian Zheng 7ced9fc5d9
fix: fix loading resource estimation (#43509)
currently we multiplied the requesting size when adding to loading, but
did not do so when estimating projected usage.

issue: #43088

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-23 10:36:53 +08:00
congqixia cc1034fe96
fix: [AddField] Resolve FieldIndexing dangling reference (#43499)
Related to #43113

This PR:
- Change member of FieldIndex from `FieldMeta &` to needed `DataType`
and dim member resolving dangling reference after schema change
- Add double check after acquiring lock to reduce multiple assignment
- Change `auto schema` to `auto& schema` to reduce schema copy

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-23 00:14:52 +08:00
sthuang 59bbdd93f5
fix: [StorageV2] fill the correct group chunk into cell (#43486)
The root cause of the issue lies in the fact that when a sealed segment
contains multiple row groups, the get_cells function may receive
unordered cids. This can result in row groups being written into
incorrect cells during data retrieval.

Previously, this issue was hard to reproduce because the old Storage V2
writer had a bug that caused it to write row groups larger than 1MB.
These large row groups could lead to uncontrolled memory usage and
eventually an OOM (Out of Memory) error. Additionally, compaction
typically produced a single large row group, which avoided the incorrect
cell-filling issue during query execution.

related: https://github.com/milvus-io/milvus/issues/43388,
https://github.com/milvus-io/milvus/issues/43372,
https://github.com/milvus-io/milvus/issues/43464, #43446, #43453

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-22 22:22:53 +08:00
XuanYang-cn 92f4fc0e8b
fix: Set status when err is not empty (#43403)
See also: #43341

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-07-22 17:48:53 +08:00
cai.zhang f19e0ef6e4
fix: Ensure task execution order by using a priority queue (#43271)
issue: #43260

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-22 17:42:53 +08:00
cai.zhang e26a532504
enhance: Only download necessary fields during clustering analyze phase (#43322)
issue: #43310

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-22 16:40:52 +08:00
Zhen Ye df7e507c49
fix: balance may not trigger at balance checker when upgrading (#43462)
issue: #43416

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-22 16:02:53 +08:00
Buqian Zheng 0599113a4b
enhance: add timeout to resource reservation (#43441)
issue: https://github.com/milvus-io/milvus/issues/41435

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-22 15:24:53 +08:00
yihao.dai a839017e81
fix: Handle retry state in import task (#43474)
issue: https://github.com/milvus-io/milvus/issues/43473

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-22 14:52:53 +08:00
Chun Han 5a1092304c
fix: refine judgement for batch views(#38736) (#43481)
related: #38736

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-22 14:20:53 +08:00
congqixia 5c0f0ee765
enhance: [StorageV2] Return EOF when packedReader closed (#43465)
This patch makes `PackedReader` return EOF when try to calling
`ReadNext` after closing it.

This behavior make importv2.binlog reader could retry after EOF reached
and act normally.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-22 14:04:52 +08:00
yihao.dai 5124ed9758
fix: Fix import fileStats incorrectly set to nil (#43463)
1. Ensure that tasks in the InProgress state return valid fileStats.
2. Enhance import logs.

issue: https://github.com/milvus-io/milvus/issues/43387

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-22 12:37:01 +08:00
congqixia 563e2935c5
enhance: [StorageV2] Fill ts range default values for `PackedBinlogRecordWriter` (#43454)
This PR fill default value for `PackedBinlogRecordWriter` timestamp
range so target segment meta will contains correct timestamp range

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-22 12:04:53 +08:00
sthuang f77571d5c1
fix: [StorageV2] file writer write row group split to default size (#43471)
Bumped milvus storage version.
related: https://github.com/milvus-io/milvus/issues/43310

* https://github.com/milvus-io/milvus-storage/pull/213
* https://github.com/milvus-io/milvus-storage/pull/217
* https://github.com/milvus-io/milvus-storage/pull/220

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-22 09:52:52 +08:00
sthuang 6c5f5f1e32
enhance: [StorageV2] refactor group chunk translator (#43406)
related: #43372

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-21 19:46:53 +08:00
sparknack 81694739ef
fix: revert ska::flat_hash_set to std::unordered_set to address an un… (#43428)
issue: #43388

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-07-21 17:39:40 +08:00
aoiasd e9fc140eaf
fix: jieba tokenizer cause panic when dict word was empty string (#43337)
relate: https://github.com/milvus-io/milvus/issues/42779

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-21 16:34:53 +08:00
aoiasd c7b53ed43b
enhance: run rust format (#43447)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-21 14:12:53 +08:00
junjiejiangjjj 77f3a1f213
enhance: Add search post pipeline (#43065)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>
2025-07-21 11:10:52 +08:00
Bingyi Sun 21e71f6eb2
fix: Check json nested path before validating data type (#43329)
issue: #43279

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-21 10:30:54 +08:00
Zhen Ye 69c8c2660b
fix: create nil start position segment if sync start position before insert (#43435)
issue: #43434

- the segment start position can be carried by other segment sync
operation. so the sync start position operation can happens before
insert.
- TODO: It's a wired design should be removed.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-21 09:50:52 +08:00
Bingyi Sun 09b6407e63
enhance: optimize error msg for json index inconsistent parameters (#43345)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-21 00:32:52 +08:00
Xianhui Lin c13393418c
fix: invalid string error when enabled json stats (#43380)
fix: invalid string error when enabled json stats
issue: https://github.com/milvus-io/milvus/issues/43151

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-07-20 23:38:53 +08:00
aoiasd f7e1f1c382
enhance: support download lindera system dictionary online (#43121)
relate: https://github.com/milvus-io/milvus/issues/43120

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-20 23:24:52 +08:00
Zhen Ye 25b76e1fde
fix: cannot auto balance the channel from old arch to streamingnode (#43424)
issue: #43416, #43413

- also fix the panic on streamingnode when concurrent sync

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-20 23:00:52 +08:00
Buqian Zheng 389104d200
enhance: rename PanicInfo to ThrowInfo (#43384)
issue: #41435

this is to prevent AI from thinking of our exception throwing as a
dangerous PANIC operation that terminates the program.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-19 20:22:52 +08:00
Buqian Zheng f7b262a702
feat: make storagev1 to support eviction (#43219)
issue: https://github.com/milvus-io/milvus/issues/41435

turns out we have per file binlog size in golang code, by passing it
into segcore we can support eviction in storage v1

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-19 02:02:52 +08:00
congqixia 672a83f66b
enhance: Skip remove op if key in save set (#43425)
Related to #43407

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-18 17:37:39 +08:00
cai.zhang 2adc6ce0bc
fix: Call AlterCollection when only rename collection (#43420)
issue: #43407

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-18 15:46:56 +08:00
Spade A 42ad786f75
fix: update tantivy for fixing dir removing race condition (#43399)
fix: https://github.com/milvus-io/milvus/issues/43258

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-18 15:44:56 +08:00
congqixia 8fc7069e1a
fix: Make MultiSaveAndRemove execute removal first (#43408)
Realted to #43407

When `MultiSaveAndRemove` like ops contains same key in saves and
removal keys it may cause data lost if the execution order is save first
than removal.

This PR make all the kv execute removal first then save the new values.
Even when same key appeared in both saves and removals, the new value
shall stay.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-18 15:41:40 +08:00
Zhen Ye b142589942
enhance: support all partitions in shard manager for L0 segment (#43385)
issue: #42416

- change the key from partitionID into PartitionUniqueKey to support
AllPartitionsID

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-18 11:40:51 +08:00
Zhen Ye 5aa7a116d2
fix: change maxTimeTickDelay from 5m into 20m (#43377)
issue: #43266

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-18 11:29:42 +08:00
Buqian Zheng d793def47c
feat: impose a physical memory limit when loading cells (#43222)
issue: #41435 

issue: https://github.com/milvus-io/milvus/issues/43038

This PR also:


1. removed ERROR state from ListNode
2. CacheSlot will do reserveMemory once for all requested cells after
updating the state to LOADING, so now we transit a cell to LOADING
before its resource reservation
3. reject resource reservation directly if size >= max_size

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-07-18 11:18:52 +08:00
Zhen Ye 07fa2cbdd3
enhance: wal balance consider the wal status on streamingnode (#43265)
issue: #42995

- don't balance the wal if the producing-consuming lag is too long.
- don't balance if the rebalance is set as false.
- don't balance if the wal is balanced recently.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-18 11:10:51 +08:00
Zhen Ye 3aacd179f7
fix: balance channel before balance segment when upgrading (#43346)
issue: #43117, #42966, #43373

- also fix channel balance may not work at 2.6.
- fix error lost at delete path
- add mvcc into s/q log
- change the log level for TestCoordDownSearch

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-17 20:16:52 +08:00
Spade A 8612a2c946
enhance: optimize in by batch-in (#43268)
fix: https://github.com/milvus-io/milvus/issues/43267

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-17 19:40:52 +08:00
sparknack 9b4081e110
enhance: cachinglayer: some performance optimization (#42858)
issue: #41435

We compared the performance using the modified test_sealed.cpp, which
randomly accesses all rows in all chunks and counts the number of runs
within 3s.

## performance data comparison (ops/second)

chunk config: 1x1000

| Field Type | w/o cachinglayer (commit 640f526301) | w/ cachinglayer |
w/ cachinglayer + opt |
|---|---|---|---|
| Bool field | 82428 | -63.6% (29983) | +2.7% (84675) |
| Int8 field | 82228 | -63.3% (30166) | +2.4% (84163) |
| Int16 field | 82572 | -63.8% (29867) | +1.8% (84036) |
| Int32 field | 82797 | -63.7% (30031) | +1.5% (84043) |
| Int64 field | 81077 | -62.9% (30107) | +0.6% (81604) |
| Float field | 82678 | -63.4% (30266) | +1.8% (84146) |
| Double field | 81925 | -63.4% (29974) | +0.2% (82097) |
| Varchar field | 19933 | -19.6% (16027) | +18.9% (23690) |
| JSON field | 16519 | -96.8% (533) | +2.5% (16927) |
| Int array field | 7325 | -13.7% (6321) | -1.4% (7220) |
| Long array field | 6347 | -8.9% (5781) | -0.1% (6344) |
| Bool array field | 8275 | -14.0% (7116) | +0.4% (8311) |
| String array field | 2281 | -5.0% (2168) | +0.2% (2287) |
| Double array field | 6427 | -13.3% (5574) | -2.0% (6301) |
| Float array field | 7291 | -13.0% (6346) | -1.5% (7183) |
| Vector field | 27487 | -40.4% (16371) | -4.7% (26192) |
| Float16 vector field | 49773 | -54.6% (22601) | -5.9% (46834) |
| BFloat16 vector field | 49783 | -53.1% (23350) | -5.7% (46934) |
| Int8 vector field | 63871 | -59.0% (26179) | -6.2% (59926) |

---

chunk config: 10x1000

| Field Type | w/o cachinglayer (commit 640f526301) | w/ cachinglayer |
w/ cachinglayer + opt |
|---|---|---|---|
| Bool field | 3659 | -48.6% (1879) | +110.1% (7686) |
| Int8 field | 3410 | -45.3% (1864) | +123.9% (7636) |
| Int16 field | 3647 | -48.6% (1874) | +110.1% (7661) |
| Int32 field | 3647 | -48.8% (1866) | +109.6% (7645) |
| Int64 field | 3645 | -48.9% (1863) | +107.8% (7573) |
| Float field | 3647 | -49.0% (1861) | +109.5% (7639) |
| Double field | 3640 | -45.1% (1998) | +108.4% (7586) |
| Varchar field | 1594 | -23.9% (1213) | +20.6% (1922) |
| JSON field | 1202 | -26.5% (884) | +16.1% (1396) |
| Int array field | 602 | -12.3% (528) | +12.7% (678) |
| Long array field | 529 | -12.2% (465) | +7.5% (569) |
| Double array field | 537 | -13.0% (467) | +6.4% (571) |
| Vector field | 1520 | -37.9% (943) | -5.5% (1437) |
| Float16 vector field | 2607 | -47.0% (1382) | +6.4% (2774) |
| BFloat16 vector field | 2586 | -46.5% (1383) | +8.8% (2813) |
| Int8 vector field | 3101 | -47.3% (1633) | +41.9% (4400) |

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-07-17 11:20:51 +08:00
zhagnlu ee43954534
fix:fix text_match bug because of not adapting to multi-chunk model (#43303)
https://github.com/milvus-io/milvus/issues/43296

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-17 10:32:51 +08:00
yihao.dai df8ceb123b
enhance: Support parallel execution of L0 import tasks (#43213)
issue: https://github.com/milvus-io/milvus/issues/43212

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-17 10:14:50 +08:00
XuanYang-cn 4dcaa97682
fix: Use diskSegmentMaxSize for coll with sparse and dense vectors (#43194)
Previous code uses diskSegmentMaxSize if and only if all of the
collection's vector fields are indexed with DiskANN index.

When introducing sparse vectors, since sparse vector cannot be indexed
with DiskANN index, collections with both dense and sparse vectors will
use maxSize instead.

This PR changes the requirments of using diskSegmentMaxSize to all dense
vectors are indexed with DiskANN indexs, ignoring sparse vector fields.

See also: #43193

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-07-16 18:04:52 +08:00
Spade A d750816ba0
fix: remove std::string support for stlsort index (#43355)
fix: https://github.com/milvus-io/milvus/issues/43354

The current implementation of stdsort index is not supported for
std::string. Remove the code.

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-16 17:46:51 +08:00
congqixia 5d90b65342
enhance: [StorageV2] Add storage version in Data/Query view resp (#43348)
Related to #39173

Add `storage_version` in data/query view segment info response

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-16 15:52:51 +08:00
foxspy 58a9e49066
enhance: update knowhere version (#43331)
issue: #42937 #43294

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-07-16 15:04:50 +08:00
yihao.dai b69e601fe1
fix: [StorageV2] Correct read and write buffer size (#43335)
Correct read and buffer size to 64MB to prevent OOM during clustering
compaction.

issue: https://github.com/milvus-io/milvus/issues/43310

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-16 14:28:52 +08:00
Bingyi Sun 1b8c958cff
enhance: fix tantivy wrapper is freed after json flat executor is destructed (#43233)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-16 10:58:50 +08:00
congqixia fe8de016d5
fix: [StorageV2] Align null bitmap offset when loading multi-chunk (#43321)
Related to #43262

This patch fixes following logic bug:
- When multiple chunks are loaded and size cannot be divided by 8, just
appending uint8_t as bitmap will cause null bitmap dislocation
- `null_bitmap_data()` points to start of whole row group, which may not
stand for current `arrow::Array`

The current solutions is:
- Reorganize the null_bitmap with currect size & offset
- Pass `array->offset()` in tuple to info the current offset

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-15 19:22:50 +08:00
Bingyi Sun ccfaa7bee8
fix: Fix the bug when offsets is nullptr in bulk api (#43127)
issue: https://github.com/milvus-io/milvus/issues/42978

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-15 17:54:50 +08:00
Zhen Ye ffc8c0730c
fix: wrong metric for sn timetick (#43312)
issue: #43266

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-14 20:40:50 +08:00
Spade A db91d85dbc
feat: more types of matches for ngram (#43081)
Ref https://github.com/milvus-io/milvus/issues/42053

This PR enable ngram to support more kinds of matches such as prefix and
postfix match.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-14 20:34:50 +08:00
Spade A e14a52721e
enhance: use stl sort with high cardinality for data_type int (#43305)
fix: https://github.com/milvus-io/milvus/issues/43304

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-14 18:40:50 +08:00
congqixia ae48f0e484
fix: [StorageV2] Handle missing column creating index (#43292)
Related to #43250

Use FieldIDList to check missing field. If column is missing, return
empty resultset

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-14 17:06:50 +08:00
wei liu 039564199c
fix: Prevent duplicate segment results in count queries (#43173)
issue: #41570
Fix issue where growing and sealed segments could be searched
simultaneously, causing inflated count(*) results. This was caused by
logic introduced in PR #42009 that made sealed segments readable before
target version advancement.

Changes include:
- Fix conditional filtering logic in PinReadableSegments to prevent
sealed segments from becoming readable prematurely
- Use target version filter for full results (ratio=1.0) to ensure
sealed segments only become readable after target advancement
- Use query view segment list filter for partial results (ratio<1.0) to
maintain backward compatibility
- Simplify target version setting in AddDistributions to prevent
premature segment readability
- Add logging for redundant growing segments during sync
- Add comprehensive unit tests covering the duplicate segment scenario

This fix ensures count(*) queries return accurate results by preventing
the same segment from being counted in both growing and sealed states.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-14 11:10:49 +08:00
foxspy 8171a2a0b5
enhance: update knowhere version (#43246)
issue: #42937

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-07-14 11:06:49 +08:00
Ted Xu 07894b37b6
enhance: returning collection metadata from cache (#42823)
See #43187

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-07-14 10:54:50 +08:00
Bingyi Sun 21a96bc903
enhance: Save meta with txn limit (#43263)
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-14 10:32:49 +08:00
yihao.dai 1984be646c
fix: Fix storagev2 binlog import (#43221)
issue: https://github.com/milvus-io/milvus/issues/43218

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-13 22:52:49 +08:00
Alexander Guzhva a848c4a8c5
fix: fix incorrect bitset for the division comparison when the right is < 0 (#43179)
issue: https://github.com/milvus-io/milvus/issues/42900
@sunby Unfortunately, it is not that easy to fix as it was thought in
#43177

Upd: also handles `Inf` and `NaN` values, and the division by zero case
for `fp32` and `fp64`

Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
2025-07-11 19:04:49 +08:00
Zhen Ye 15a6631147
enhance: add quota limit based on sn consuming lag (#43105)
issue: #42995

- The consuming lag at streaming node will be reported to coordinator.
- The consuming lag will trigger the write limit and deny by quota
center.
- Set the ttProtection by default.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-11 14:10:49 +08:00
cai.zhang c54a04c71c
fix: L2 segments remain as L2 even after sort compaction (#43237)
issue: #43186

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-11 11:30:48 +08:00
Zhen Ye f598ca2b4e
fix: block at msgpack adaptor and wrong metrics (#43235)
issue: #43018

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-11 10:14:49 +08:00
congqixia 5a9efb3f81
enhance: [StorageV2] Refine storage rw option usage & validation (#43175)
Related to #39173

This PR:
- Make all datanode task passes storage config via storage config option
- Remove legacy comments, rootPath & bucketName parameters
- Fix clustering compaction option behavior
- Add validation logic for `rwOptions`
- Use correct storageType from storageConfig
- Add storage config in sync task

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-11 01:14:48 +08:00
congqixia 6bbed3b019
fix: [AddField] Add shared_lock for insert prevent race (#43229)
Related to #43113

When schema change happens, insert shall not happen, otherwise:
- Data race may happen causing insertion failure
- Inconsistent data schema

This PR add shared_lock prevent this data race.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-10 21:26:48 +08:00
PjJinchen a90694165b
feat: Supports tracing services that require header-based authentication. (#43211)
issue: https://github.com/milvus-io/milvus/issues/43082

support tracing services that require header-based authentication.
for example: aliyun SLS, volcengine LogService etc...

[aliyun
SLS](https://help.aliyun.com/zh/sls/import-trace-data-from-golang-applications-to-log-service-by-using-opentelemetry-sdk-for-golang?spm=a2c4g.11186623.help-menu-search-28958.d_1#section-ktk-xxz-8om)

Add a headers config in trace config

```
trace:
  exporter: otlp
  sampleFraction: 1
  otlp:
    endpoint:  milvus-cn-beijing-pre.cn-beijing.log.aliyuncs.com:10010
    method:  # otlp export method, acceptable values: ["grpc", "http"],  using "grpc" by default
    secure: true
    headers:  # base64
  initTimeoutSeconds: 10
```

it is encoded as base64, raw data is json
```
{
    "x-sls-otel-project": "milvus-cn-beijing-pre",
    "x-sls-otel-instance-id": "milvus-cn-beijing-pre",
    "x-sls-otel-ak-id": "xxx",
    "x-sls-otel-ak-secret": "xxx"
}
```

[volcengine
tls](https://www.volcengine.com/docs/6470/812322#grpc-%E5%8D%8F%E8%AE%AE%E5%88%9D%E5%A7%8B%E5%8C%96%E7%A4%BA%E4%BE%8B)

Add a headers config in trace config

```
trace:
  exporter: otlp
  sampleFraction: 1
  otlp:
    endpoint:  xxx
    method:  # otlp export method, acceptable values: ["grpc", "http"],  using "grpc" by default
    secure: true
    headers:  # base64
  initTimeoutSeconds: 10
```

it is encoded as base64, raw data is json
```
{
    "x-tls-otel-region": "cn-beijing",
    "x-tls-otel-tracetopic": "milvus-cn-beijing-pre",
    "x-tls-otel-ak": "xxx",
    "x-tls-otel-sk": "xxx"
}
```

Signed-off-by: PjJinchen <6268414+pj1987111@users.noreply.github.com>
2025-07-10 17:32:48 +08:00
wei liu b2597c6329
enhance: apply load config changes after QueryCoord restart (#43108)
issue: #43107 
- Add checkLoadConfigChanges() to apply load config during startup
- Call config check in startQueryCoord() after restart
- Skip auto-updates for collections with user-specified replica numbers
- Add is_user_specified_replica_mode field to preserve user settings
- Add comprehensive unit tests with mockey

Ensures existing collections use latest cluster-level config after
restart.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-10 14:28:48 +08:00
cai.zhang 3ffd44f302
fix: Fix remaining issues with Datanode pooling and StorageV2 (#43147)
issue: #43146

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-10 14:26:48 +08:00
yihao.dai ee9a95189a
enhance: Print segments info after import done (#43200)
issue: https://github.com/milvus-io/milvus/issues/42488

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-10 12:38:47 +08:00
Chun Han 07745439b5
fix: empty search groupby result causing crash(#43137) (#43214)
related: #43137

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-07-10 12:04:48 +08:00
cai.zhang 47144429bf
fix: Fix regeneratePartitionStats failed after restore clusteringCompactionTask (#43205)
issue: #43186

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-10 10:40:47 +08:00
Zhen Ye 490c5d5088
fix: lost message version after compatible message modification (#43217)
issue: #43018

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-10 10:36:48 +08:00
Bingyi Sun 13f6e2130b
fix: Fix hybrid search return back empty set if one result is emtpy (#43209)
issue: https://github.com/milvus-io/milvus/issues/43160

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-10 10:34:55 +08:00
congqixia f027eea545
enhance: [AddField] Add log for segcore segment schema change (#43215)
Related to #39178

This PR add logs for segment schema change operations.

Also fixes the nit comments from PR #42490

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-10 10:22:47 +08:00
aoiasd 97b1c3ed96
enhance: add warn log if some segment's bm25 stats lacks (#43111)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-09 23:22:47 +08:00
cai.zhang 95e767611a
fix: Fix merge sort loss data when last row in a record is deleted (#43216)
issue: #43207

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-09 22:18:48 +08:00
zhagnlu 21d1fb2aa3
fix: fix move cursor bug for chunk segment with index (#43095)
#42974

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-07-09 17:38:47 +08:00
cai.zhang 41d1c8d6b3
fix: Handle error for invalid function params and prevent panic (#43189)
issue: #43188

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-09 12:46:46 +08:00
tinswzy c4634d861e
fix: v2.6 WebUI metrics response schema change bug (#42957)
#42919  
fix metrics response schema incompatibility with WebUI v2.6

Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>
2025-07-08 22:56:47 +08:00
cai.zhang 6989e18599
enhance: Move sort stats task to sort compaction (#42562)
issue: #42560

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-08 20:22:47 +08:00
aoiasd 54cc0b60f2
fix: dropped segment in excluded segment use wrong excluded ts (#43115)
cause some excluded growing data insert again
relate: https://github.com/milvus-io/milvus/issues/43114

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-07-08 18:04:46 +08:00
Spade A d41eec6f10
fix: void copy when getting json chunk (#43183)
fix: https://github.com/milvus-io/milvus/issues/43182

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-08 15:28:46 +08:00
cai.zhang 8720feeb79
fix: Fix enqueuing when current batch is fully deleted (#43174)
issue: #43045

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-08 12:20:46 +08:00
Ted Xu 6153272d4b
enhance: disabling max entry limit by default (#43166)
See: #43055

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-07-08 10:10:46 +08:00
yihao.dai 9cbd194c6b
fix: Prevent import from generating small binlogs (#43132)
- Introduce dynamic buffer sizing to avoid generating small binlogs
during import
- Refactor import slot calculation based on CPU and memory constraints
- Implement dynamic pool sizing for sync manager and import tasks
according to CPU core count

issue: https://github.com/milvus-io/milvus/issues/43131

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-07-07 21:32:47 +08:00
sthuang a0ae5bccc9
fix: [StorageV2] load growing segment get dim datatype check (#43168)
related: https://github.com/milvus-io/milvus/issues/43072

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-07 15:46:47 +08:00
congqixia ab818dcbca
fix: [StorageV2] Pass storage config for compaction rw (#43167)
Related to #43148

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-07 15:32:46 +08:00
sthuang 276c52490d
fix: [StorageV2] missing arrow fs when building index (#43162)
fix: https://github.com/milvus-io/milvus/issues/43150,
https://github.com/milvus-io/milvus/issues/43149

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-07 15:26:46 +08:00
sthuang 9f361a228e
enhance: storage v2 chunked column memory size from meta (#43130)
use meta to get chunked column memory size to avoid getting cells
actually from storage.
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-07-07 14:24:46 +08:00
congqixia d09764508a
fix: [Storagev2] Close segment readers in mergeSort (#43116)
Related to #43062

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-04 23:56:44 +08:00
junjiejiangjjj fafd5db43f
fix: rank params bug (#43112)
https://github.com/milvus-io/milvus/issues/42985

Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>
2025-07-04 18:28:44 +08:00
Zhen Ye 46b6f1b9e2
fix: panic when logging a old message should be skipped (#43076)
issue: #43074

- fix: panic when logging a old message should be skipped, #43074 
- fix: make the ack of broadcaster idompotent, #43026
- fix: lost dropping collection when upgrading, #43092
- fix: panic when DropPartition happen after DropCollection, #43027,
#43078

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-04 16:04:44 +08:00
groot 1ee8cea35b
enhance: bulkinsert handle nullable/defaultValue/functionOutput fields (#42956)
issue: https://github.com/milvus-io/milvus/issues/42173

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-07-04 14:20:44 +08:00
congqixia 684f027496
fix: Remove trimming space logic when validating collection name (#43064)
Related to #43031

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-04 11:00:45 +08:00
cai.zhang 4133e3b8fd
fix: Enable merge sort and fix sort bug (#43080)
issue: #42980, #43034

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-04 10:18:44 +08:00
Spade A fce0bbe2ae
fix: remove redundant locks for null_offset (#43103)
Ref: https://github.com/milvus-io/milvus/issues/40308
https://github.com/milvus-io/milvus/pull/40363 add lock for protecting
concurrent read/write for null offset. But we don't need this for sealed
segment.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-07-04 10:10:45 +08:00
Zhen Ye e97e44d56e
enhance: limit the gc concurrency when cpu is high (#43059)
issue: #42833

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-04 09:22:43 +08:00
congqixia 1d9a9a993d
fix: [StorageV2] Use correct template typename for `cache_raw_data_to_disk_common` (#43104)
Related to #43099

Previously `cache_raw_data_to_disk_common` used `milvus::DataType`
template typename, which shall be `knowhere::bf16` or other actual
datatype.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-03 18:50:46 +08:00
Zhen Ye bbbc7d4517
enhance: collect all cgo calling into metric and log slow cgo call (#43035)
issue: #42833

- also fix the error metric for async cgo.
- also make sure the roles can be seen when node startup, #43041.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-03 15:00:44 +08:00
cai.zhang f6b2a71c95
enhance: Remove chunkmanager-related dependencies from datanode (#43021)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-07-03 14:44:45 +08:00
congqixia 1fae5230fe
fix: Check field mmap property before apply collection level one (#43090)
Related to #43089

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-03 14:30:44 +08:00
Bingyi Sun 6e38e9d18f
fix: Add json cast type for flat index (#42970)
issue: #42916

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-07-03 14:14:44 +08:00
sparknack 7e855f1046
enhance: add disk file writer with Direct IO support (#42665)
issue: #43040 

This patch introduces a disk file writer that supports Direct IO.

Currently, it is exclusively utilized during the QueryNode load process.

Below is its parameters:

1. `common.diskWriteMode`
This parameter controls the write mode of the local disk, which is used
to write temporary data downloaded from remote storage.
Currently, only QueryNode uses 'common.diskWrite*' parameters. Support
for other components will be added in the future.
The options include 'direct' and 'buffered'. The default value is
'buffered'.

2. `common.diskWriteBufferSizeKb`
Disk write buffer size in KB, only used when disk write mode is
'direct', default is 64KB.
Current valid range is [4, 65536]. If the value is not aligned to 4KB,
it will be rounded up to the nearest multiple of 4KB.

3. `common.diskWriteNumThreads`
This parameter controls the number of writer threads used for disk write
operations. The valid range is [0, hardware_concurrency].
It is designed to limit the maximum concurrency of disk write operations
to reduce the impact on disk read performance.
For example, if you want to limit the maximum concurrency of disk write
operations to 1, you can set this parameter to 1.
The default value is 0, which means the caller will perform write
operations directly without using an additional writer thread pool.
In this case, the maximum concurrency of disk write operations is
determined by the caller's thread pool size.

Both parameters can be updated during runtime.

---------

Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>
2025-07-02 22:18:44 +08:00
congqixia 7bc7b18ed5
fix: [AddField] Prevent concurrent load during UpdateSchema (#43043)
Related to #43028

This PR:
- Add mutex prevent concurrent load segment & schema change
- Add schema verison field in load meta
- Update schema in PutOrRef if schema verison is larger

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-02 17:38:44 +08:00
congqixia 8962b0058d
fix: [StorageV2] Check writer nil when closing not written one (#43056)
Related to #43047

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-07-02 14:22:43 +08:00
Zhen Ye 09c6df62d8
fix: use impl and remove the close method of broadcast service (#42992)
issue: #38399

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-02 10:30:44 +08:00
wei liu c381bf3e41
enhance: add logs for count(*) (#43001)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-01 19:36:43 +08:00
Zhen Ye 08fff353af
fix: Revert "enhance: Enable mergeSort by default starting from version 2.6.0 (#42981)" (#43046)
issue: #43034

- implementation of mergeSortMultipleSegments is wrong.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-07-01 17:30:29 +08:00
Spade A 26ec841feb
feat: optimize `Like` query with n-gram (#41803)
Ref #42053

This is the first PR for optimizing `LIKE` with ngram inverted index.
Now, only VARCHAR data type is supported and only InnerMatch LIKE
(%xxx%) query is supported.


How to use it:
```
milvus_client = MilvusClient("http://localhost:19530")
schema = milvus_client.create_schema()
...
schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000)
...
index_params = milvus_client.prepare_index_params()
index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3)
milvus_client.create_collection(COLLECTION_NAME, ...)
```

min_gram and max_gram controls how we tokenize the documents. For
example, for min_gram=2 and max_gram=4, we will tokenize each document
with 2-gram, 3-gram and 4-gram.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-07-01 10:08:44 +08:00
wei liu 396120ade5
enhance: Improve delegator serviceable check with coordinator sync state (#42975)
issue: #42404
Add syncedByCoord field to ensure delegator only becomes serviceable
after coordinator sync, preventing unreliable service state when memory
is insufficient.

Issue: When memory is low, delegator may become serviceable before
current target is ready, but segments can be released at any time,
making the serviceable state unreliable.

Changes include:
- Add syncedByCoord field to track coordinator sync status
- Update Serviceable() to require both data readiness and coord sync
- Set syncedByCoord=true in SyncTargetVersion
- Add comprehensive test coverage

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-07-01 10:00:43 +08:00
Zhen Ye ecb24e7232
enhance: use multi-process framework in integration test (#42976)
issue: #41609

- add env `MILVUS_NODE_ID_FOR_TESTING` to set up a node id for milvus
process.
- add env `MILVUS_CONFIG_REFRESH_INTERVAL` to set up the refresh
interval of paramtable.
- Init paramtable when calling `paramtable.Get()`.
- add new multi process framework for integration test.
- change all integration test into multi process.
- merge some test case into one suite to speed up it.
- modify some test, which need to wait for issue #42966, #42685.
- remove the waittssync for delete collection to fix issue: #42989

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-30 14:22:43 +08:00
wei liu c919340763
enhance: Optimize channel node balancing for uneven QN distribution (#42786)
issue: #42860
Fix channel node allocation when QueryNode count is not a multiple of
channel count. The previous algorithm used simple division which caused
uneven distribution with remainders.

Key improvements:
- Implement smart remainder distribution algorithm
- Refactor large function into focused helper functions
- Support two-phase rebalancing (release then allocate)
- Handle edge cases like insufficient nodes gracefully

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-30 12:14:42 +08:00
rhys 48661655d6
fix: streamingcoord and streamingnode client support internal tls (#42685)
https://github.com/milvus-io/milvus/issues/42680

streamingnode/streamingcoord support internal tls

Signed-off-by: rhys <sdbwlr@163.com>
2025-06-27 17:50:42 +08:00
Zhen Ye 8367e4ec6a
fix: set 72h for wal retention (#42910)
issue: #42706

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-27 17:36:43 +08:00
Bingyi Sun 23c784cf69
fix: Fix querynode crash caused by json index (#42982)
issue: https://github.com/milvus-io/milvus/issues/42978

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-27 16:44:41 +08:00
XuanYang-cn 17f1ab71bb
enhance: Remove not inused BuildIndexInfo (#42926)
1. removed not inuse cgo methods in index_c.h/cpp
2. removed indexcogowrapper/build_index_info.go

See also: #39242

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-27 15:00:42 +08:00
congqixia 9b06ecb72f
enhance: [StorageV2] Release record and close reader (#42983)
Related to #39173

This PR
- Close packed reader after sort
- Release arrow.Record preventing memory leakage
- Invoke `pack_reader->Close()` for CloseReader

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-27 14:46:43 +08:00
sthuang 238bd30f42
fix: [StorageV2] end to end minor issues for sync, stats, and load (#42948)
Fix issues in end-to-end tests: 
1. **Split column groups based on schema**, rather than estimating by
average chunk row size. **Ensure column group consistency within a
segment**, to avoid errors caused by loading multiple column group
chunks simultaneously.
2. **Use sorted segmentId** when generating the stats binlog path, to
ensure consistent and correct file path resolution.
3. **Determine field IDs as follows**:
For multi-column column groups, retrieve the field ID list from
metadata.
For single-column column groups, use the column group ID directly as the
field ID.

related: #39173 
fix: #42862

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-27 14:44:42 +08:00
Zhen Ye 2d73e6eaa8
fix: mixcoord will not handle timetick anymore (#42965)
issue: #42954

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-26 19:14:42 +08:00
Zhen Ye 3602817c53
fix: dynamic log level for streaming node (#42964)
issue: #42963

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-26 19:12:50 +08:00
congqixia 5dd1f841d2
enhance: [AddField] Add Restful API for addfield (#42972)
Related to #39718

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-26 18:46:41 +08:00
Bingyi Sun 289b8b85d3
enhance: remove name check for alter index task (#42953)
issue: https://github.com/milvus-io/milvus/issues/42952

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-26 16:32:41 +08:00
foxspy be05b653c1
enhance: update knowhere version (#42938)
issue: #42937

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-06-26 01:22:41 +08:00
yihao.dai d7c9914eff
fix: Consider fields number when preallocating ids for import (#42810)
In corner cases where there are many fields but only a small number of
rows to import, the default preallocated IDs may be insufficient. To
address this, consider the number of fields when preallocating IDs.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-25 23:38:41 +08:00
wei liu be492c2939
fix: Add missing keylocks in ReleasePartition operation (#42940)
issue: #42098
Fix concurrent access issue by adding proper locking around
ReleasePartition operation to prevent race conditions when releasing
partitions on the same collection.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-25 21:48:42 +08:00
congqixia 336e743b55
fix: [AddField] Respect growing mmap setting adding empty field (#42933)
Related to #42856

Data under mmapped growing segment shall be treated respecting
growingMmap setting. Otherwise, varchar datatype could be treated with
logic error.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-25 21:10:42 +08:00
congqixia 942055fa7d
fix: Use task timestamp to calculate TTL timestamp (#42920)
Related to #42918

Previously the `CollectionTtlTimestamp` could be overflowed when the
guarantee_ts==1, which means using `Eventually` consistency level.

This patch use task timestamp, allocated by scheduler, to generate ttl
timestamp ignore the potential very small timestamp being used.

Also add overflow check for ttl timestamp calculated.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-25 20:48:42 +08:00
zhagnlu 69872f45ad
fix: fix is_not_in for trie index (#42716)
#42604

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-25 16:52:42 +08:00
cai.zhang ebe1c95bb1
enhance: Add Size interface to FileReader to eliminate the StatObject call during Read (#42908)
issue: #42907

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-25 14:36:41 +08:00
aoiasd e2566c0e92
enhance: bm25 stats local cache use local storage path (#42923)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-25 13:44:46 +08:00
XuanYang-cn 0dfe5308e1
enhance: Tidy Download and decode in segcore storage (#42902)
1. Unify calling from GetObjectData
2. Move SetData inside Deserialize

See also: #40013

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-25 11:10:43 +08:00
sthuang 0d57acb13a
enhance: [StorageV2] field id as meta path for wide column when load (#42863)
related: #42862 #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-25 11:08:48 +08:00
sthuang d4260b47fa
fix: [StorageV2] sync panic with add field (#42932)
related: #39663

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-25 10:08:40 +08:00
sthuang ad6d620e9f
fix: [StorageV2] Compiling debug mode throw DCHECK s3 initialize error (#42922)
related: https://github.com/milvus-io/milvus/issues/42844

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-24 19:30:41 +08:00
Spade A 50f7579d8f
fix: fix some bugs discovered by chaos tests (#42906)
fix: https://github.com/milvus-io/milvus/issues/42870

This PR fixes:
1. SetBitset fn shuold consider growing segments with concurrent write
2. avoid using from_raw_parts directly

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-24 16:32:42 +08:00
XuanYang-cn 0adf44e6f8
enhance: Check if segment has too many deletions together (#42668)
This PR moves the deltalog file count check inside hasTooManyDeletions
check. Unifies the logic on checking if a segment has too many deletions
including: delta log count, deleted rows ratio and deltalog size.

This change removes several uncessary traverse through segment's binlogs
and deltalogs. And add more clear trigger logs

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-24 16:30:49 +08:00
Bingyi Sun 669ea51ce5
enhance: Make json index compatible with caching layer (#42484)
issue: https://github.com/milvus-io/milvus/issues/42483

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-24 15:16:41 +08:00
congqixia 718cd203c6
fix: OR binary expr is prunable only when both children are prunable (#42912)
Related to #42903

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-24 09:38:24 +08:00
zhagnlu 1024121ad9
fix:fix incorrect use of class member (#42885)
#39173

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-23 20:36:46 +08:00
congqixia 0a0a6b3471
enhance: Fill dbName for `OperatePrivilegeV2Request` in interceptor (#42898)
Related to #40340

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-23 18:04:41 +08:00
cai.zhang 59b003adac
enhance: Skip modify field meta when rename collection or rename dbName (#42875)
issue: #42873

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-23 17:04:41 +08:00
congqixia ee056f0bff
fix: [AddField] Fill default value in serde logic when field missing (#42891)
Related to #42856

Default value will be missing after segment get sorted/compacted. This
PR is a temp workaround since in long term default value shall be filled
with storage engine instead.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-23 14:20:41 +08:00
Bingyi Sun 24e24caf14
fix: Remove cached null expr result (#42818)
issue: #42698
cached result may be changed in caller so there is no need to cache it

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-23 10:44:40 +08:00
Zhen Ye a081906fb4
enhance: smaller backoff configuration for wal balancer to make faster recovery (#42869)
issue: #42835

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-23 10:32:40 +08:00
Xianhui Lin b902960057
fix: revert remote jsonstats path (#42882)
fix: revert remote jsonstats path
relate-pr:https://github.com/milvus-io/milvus/pull/42676
issue:https://github.com/milvus-io/milvus/issues/42872

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-21 13:24:39 +08:00
cai.zhang 8f8ffe9989
fix: Reduce task slot for standalone to 1/4 of normal datanode (#42808)
issue: #42129

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-20 16:38:46 +08:00
Spade A e15926b40c
enhance: optimize tantivy cargo config (#42880)
fix: https://github.com/milvus-io/milvus/issues/42879

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-20 16:17:49 +08:00
aoiasd 43a9f7a79e
enhance: Add and run rust format command in makefile (#42807)
relate: https://github.com/milvus-io/milvus/issues/42806

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-20 10:22:39 +08:00
Zhen Ye 6798fdc3b3
fix: rocksmq cannot graceful stop (#42841)
issue: #40532

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-19 19:38:39 +08:00
congqixia 74ea57bac1
enhance: Remove unused load field check from proxy (#42816)
Related to #42489

Since load list works as hint after cachelayer implemented, the related
check logic could be removed to keep code logic clean.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-19 19:34:47 +08:00
Zhen Ye fadc053d7a
fix: filter new proxy when initializing proxy session at timeticksync (#42831)
issue: #40532

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-19 16:44:40 +08:00
Zhen Ye 2fd8f910b0
fix: data duplicated when msgdispatcher make splitting (#42827)
issue: #41570

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-19 16:32:39 +08:00
junjiejiangjjj 9865d672f7
fix: Model rerank supports Truncate (#42643)
https://github.com/milvus-io/milvus/issues/42632

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-06-19 15:02:41 +08:00
sthuang 4a0a2441f2
enhance: [StorageV2] field id as meta path for wide column (#42787)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-19 15:00:38 +08:00
congqixia 4ba177cd2c
enhance: [StorageV2] Handle narrow column group resource estimation (#42842)
Related to #39173

In storage v2, "narrow" column group could have group id not mapped
schema, which causing loading fails or resource estimation result
inaccurate.

This PR handles this case by mapping binlog from index instead of vice
versa.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-19 14:44:39 +08:00
wei liu bf5fde1431
fix: Prevent delegator unserviceable due to shard leader change (#42689)
issue: #42098 #42404
Fix critical issue where concurrent balance segment and balance channel
operations cause delegator view inconsistency. When shard leader
switches between load and release phases of segment balance, it results
in loading segments on old delegator but releasing on new delegator,
making the new delegator unserviceable.

The root cause is that balance segment modifies delegator views, and if
these modifications happen on different delegators due to leader change,
it corrupts the delegator state and affects query availability.

Changes include:
- Add shardLeaderID field to SegmentTask to track delegator for load
- Record shard leader ID during segment loading in move operations
- Skip release if shard leader changed from the one used for loading
- Add comprehensive unit tests for leader change scenarios

This ensures balance segment operations are atomic on single delegator,
preventing view corruption and maintaining delegator serviceability.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-19 12:10:38 +08:00
Spade A e2c85eec81
fix: load stats index based on mmap config (#42788)
ref https://github.com/milvus-io/milvus/issues/42626

This PR makes text match index and json key stats index be loaded based
on mmap config.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-19 10:10:39 +08:00
aoiasd d49989345b
enhance: forbid regex filter clone regex for each streamer (#42781)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-18 16:10:39 +08:00
cai.zhang d122e6d1e2
enhance: Make Web UI toggleable via config (#42814)
issue: #42813

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-18 13:02:38 +08:00
Bingyi Sun 6bebb68727
fix: Return all targets segments in ListLoadedSegments (#42728)
issue: https://github.com/milvus-io/milvus/issues/42412

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-18 11:20:38 +08:00
Spade A 80f1d707f7
fix: tidy up path for scalar index (#42676)
Ref #42626

This path tidy up path for scalar index including path for loading index
from remote storage and temporary path for buliding index.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-18 00:42:38 +08:00
congqixia f9caad95b9
fix: [AddField] Check field empty instead of existence (#42789)
Related to #42773

Growing segment fills all known meta into `InsertRecord` data, which
cause even the field is missing, the field data will still exists.

This PR update the logic while finish loading growing segment to check
field empty or not instead of existence.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-17 17:22:39 +08:00
cai.zhang a9dcd4a380
enhance: ChunkManager is no longer created during datanode initialization (#42791)
issue: #41611

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-17 17:06:38 +08:00
Chun Han 001619aef9
feat: supporing load priority for loading (#42413)
related: #40781

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-06-17 15:22:38 +08:00
aoiasd 4e68c6d222
enhance: add test for new parameters in access log (#42546)
relate: https://github.com/milvus-io/milvus/issues/41801

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-17 11:14:38 +08:00
wei liu 679930bb93
enhance: refine delegator state checking error msg (#42673)
issue: #42661
Add NotStopped() and IsWorking() methods to shardDelegator for better
state management and error handling.

Changes include:
- Add instance state checking methods with proper error messages
- Replace lifetime package calls with delegator instance methods
- Add comprehensive unit tests for state transitions and error cases
- Improve error reporting with channel name for better debugging

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-17 10:40:38 +08:00
congqixia 880915e08b
enhance: Print out-of-date schema ts when returning ErrSchemaMismatch (#42790)
Related to #41858

This PR add log while debugging schema mismatch between pymilvus cache
and proxy schema.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-17 10:38:37 +08:00
zhagnlu 9c31a47c0f
fix:fix arith mod bug for big int (#42699)
#42624

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-17 10:36:38 +08:00
zhagnlu a887d81716
fix:reject div or mod by zero for binaryarith expr (#42691)
#42538

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-17 10:34:46 +08:00
zhagnlu 2025a2a53c
fix:fix wrong use return error for parse unsupported arith (#42729)
#42061

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-17 10:20:37 +08:00
congqixia f01ff57f3f
fix: [StorageV2] Use correct offset filling null bitmap (#42774)
Related to #39173

`null_bitmap_data()` returns raw pointer of null bitmap of Array. While
after slicing, this bitmap is not rewritten due to zero copy
implementation, so the current start pos maybe non-zero while
FillFieldData generating column `valid_data` array.

This PR add `offset` param for `FillFieldData` method, and force all
invocation pass correct offset of `null_bitmap_data` ptr.

Also update milvus-storage commit fixing reader failed to return data
when buffer size smaller than row group size problem.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-17 10:08:38 +08:00
congqixia 9653ec8d8c
fix: [AddField] Remove load list check on querycoord (#42736)
Related to #42735

Load field list shall work as hint after tiered storage impl, so the
load list compare is meaningless and block load with empty list after
adding a new field.

This PR totally moves the check logic.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-17 09:50:37 +08:00
wei liu 0b4a17c22b
fix: Fix exclude nodes clearing logic position in load balancer retry (#42577)
issue: #42561
Move the exclude nodes clearing logic from ExecuteWithRetry to
selectNode after shard leader cache refresh to ensure proper retry
behavior:
- Remove premature exclude clearing in ExecuteWithRetry that happened
before shard leader cache update
- Add exclude clearing logic in selectNode after refreshing shard leader
cache when all replicas are excluded
- Ensure multiple retries can properly update shard leader cache and
clear exclude list when needed
- Add comprehensive tests for edge cases including empty shard leaders
and mixed serviceable node scenarios

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-17 08:15:24 +08:00
sthuang ed5dbf3eaa
enhance: [StorageV2] sync separate vector datatype into its own column group (#42638)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-16 11:48:37 +08:00
zhagnlu d35c33da9f
fix: fix wrong assgin to chunk object (#42672)
#39173

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-15 21:18:37 +08:00
aoiasd 201e980d3d
fix: flow graph should free function resource after all node close (#42731)
relate: https://github.com/milvus-io/milvus/issues/42730

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-13 22:14:37 +08:00
yihao.dai 9acba25fad
enhance: Replace pointer-based map key with id in garbage collector (#42647)
issue: https://github.com/milvus-io/milvus/issues/42592

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-13 20:50:36 +08:00
congqixia ef8829c5bc
fix: [AddField] Skip missing nullable field in insertCodec (#42724)
Related to #42723

Previous PR #42684 permit insert msg transformation but insertCodec did
not adapt the same skip logic, whic causes panicking.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-13 19:56:36 +08:00
Bingyi Sun 1bf960b1a8
enhance: Check loaded segments before gc (#42639)
issue: https://github.com/milvus-io/milvus/issues/42412

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-13 17:44:38 +08:00
Zhen Ye 1f66b650e9
fix: pulsar cannot work properly if backlog exceed (#42653)
issue: #42649

- the sync operation of different pchannel is concurrent now.
- add a option to notify the backlog clear automatically.
- make pulsar walimpls can be recovered from backlog exceed.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-13 14:28:37 +08:00
wei liu 78c39edbce
fix: Fix potential panic when DeleteCheckpoint is nil (#42664)
issue: #42663
Fix panic issue when processing VchannelInfo messages from older
coordinator versions that don't have DeleteCheckpoint field.

Changes:
- Add null safety check for DeleteCheckpoint before accessing methods
- Maintain backward compatibility with legacy message formats
- Improve seek position selection logic for both old and new versions

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-13 14:26:36 +08:00
congqixia cbed31933a
fix: [AddField] Permit missing new nullable field in InsertMsg (#42684)
Related to #41858 #41951 #42084

When insert msg consumer (pipeline/flowgraph) have newer schema than
insertMsg, it have to adapter the insert msg used old schema(missing
newly added field)

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-13 13:52:35 +08:00
congqixia d59002d45e
fix: Make controller wait checker worker quit and add nil protection (#42704)
Related to #42702

This patch add wait logic for `CheckerController` and nil check for
channel checker in case of panicking during server/testcase stop
procedure

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-13 13:20:35 +08:00
Zhen Ye ca48603f35
fix: msg dispatcher lost data at streaming service (#42670)
issue: #41570

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-13 11:54:36 +08:00
Spade A 9873e0ee78
fix: fix text match index / json key stats index leak when segment released (#42655)
Ref https://github.com/milvus-io/milvus/issues/42626

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
2025-06-13 04:28:37 +08:00
cai.zhang 4ca1a231ad
fix: Add precheck for unsupport datatype cast (#42677)
issue: #42527

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-12 21:14:36 +08:00
congqixia c9bc70f272
fix: [AddField] Use shared_ptr of schema in plan fixing dangling ref (#42693)
Related to #42640

The search/query plan holded a reference to schema, which could be
destructed after schema change. This PR make plan hold a shared ptr to
it fixing dangling reference problem under concurrent read & schema
change.

This PR also remove field binlog check for loading index for old segment
with old schema may have binlog lack.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-12 20:46:36 +08:00
yihao.dai 86876682da
enhance: Enhance import integration tests and logs (#42612)
1. Optimize the import process: skip subsequent steps and mark the task
as complete if the number of imported rows is 0.
2. Improve import integration tests:
 a. Add a test to verify that autoIDs are not duplicated
 b. Add a test for the corner case where all data is deleted
 c. Shorten test execution time
3. Enhance import logging:
 a. Print imported segment information upon completion
 b. Include file name in failure logs

issue: https://github.com/milvus-io/milvus/issues/42488,
https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-12 20:02:35 +08:00
Xianhui Lin 98067f5fc6
fix: datacoord stop get stuck After upgrading from 2.5 to 2.6 (#42674)
datacoord stop get stuck After upgrading from 2.5 to 2.6
issue:https://github.com/milvus-io/milvus/issues/42656

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-12 16:56:36 +08:00
Spade A 911a8df17c
feat: impl StructArray -- data storage support in segcore (#42406)
Ref https://github.com/milvus-io/milvus/issues/42148
This PR mainly enables segcore to support array of vector (read and
write, but not indexing). Now only float vector as the element type is
supported.

---------

Signed-off-by: SpadeA <tangchenjie1210@gmail.com>
Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>
2025-06-12 14:38:35 +08:00
cai.zhang 57c60af00d
fix: Unsorted small segments should not be considered as indexed (#42614)
issue: #42143

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-12 14:30:35 +08:00
Buqian Zheng 8511ede5f8
feat: add back queryNode.cache.warmup for compatibility (#42621)
issue: https://github.com/milvus-io/milvus/issues/41435

also make ChunkTranslator to load in parallel

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-06-12 10:56:40 +08:00
Bingyi Sun 6c16d3dbee
enhance: Add bulk api for json data (#42407)
issue: https://github.com/milvus-io/milvus/issues/42409

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-12 10:40:39 +08:00
foxspy 58f9278db7
fix: fix build interim index failures (#42679)
issue: #42028 

W20250522 09:52:55.785657 12779 ChunkedSegmentSealedImpl.cpp:1752]
[SERVER][generate_interim_index][CGO_LOAD][]fail to generate binlog
index, because bad optional access

After the cachelayer is added, num_rows_ can not be obtained before
interim index generated , and an external parameter pass is required

Signed-off-by: foxspy <xianliang.li@zilliz.com>
2025-06-12 05:12:39 +08:00
yihao.dai a72463c619
enhance: Optimize memory usage during garbage collection (#42593)
Defer clone and decompress operations until just before removing from
meta, instead of eagerly applying them to all segments in advance.

issue: https://github.com/milvus-io/milvus/issues/42592

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-11 20:40:39 +08:00
foxspy 9af6c16ea0
fix: add describeIndex timestamp for restful interface (#42104)
issue: #41431

Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
2025-06-11 15:26:38 +08:00
yihao.dai e6da4a64b5
fix: Pre-check import message to prevent pipeline block indefinitely (#42415)
Pre-check import message to prevent pipeline block indefinitely.

issue: https://github.com/milvus-io/milvus/issues/42414

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: chyezh <chyezh@outlook.com>
2025-06-11 13:40:38 +08:00
wei liu e7c0a6ffbb
enhance: Refine QueryNode task parallelism based on CPU core count (#42166)
issue: #42165
Implement dynamic task execution capacity calculation based on QueryNode
CPU core count instead of static configuration for better resource
utilization.

Changes include:
- Add CpuCoreNum() method and WithCpuCoreNum() option to NodeInfo
- Implement GetTaskExecutionCap() for dynamic capacity calculation
- Add QueryNodeTaskParallelismFactor parameter for tuning
- Update proto definition to include cpu_core_num field
- Add unit tests for new functionality

This allows QueryCoord to automatically adjust task parallelism based on
actual hardware resources.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-11 13:20:35 +08:00
congqixia 499e9a0a73
fix: [AddField] Use corresponding datatype for int8/int16 def val (#42633)
Related to #42629

This PR handles converting default value to int8/int18 scalar with int32
default value definition

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-11 11:54:34 +08:00
Xianhui Lin d5c41acec1
fix: compatibility with old sessions upgrade from 2.5 to 2.6 in standalone mode (#42645)
compatibility with old sessions upgrade from 2.5 to 2.6 in standalone
mode
issue:https://github.com/milvus-io/milvus/issues/42602

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-11 10:58:35 +08:00
Zhen Ye 43f0c56ce7
fix: limit the concurency of zstd compression and decrease the memory usage of binlog generation (#42630)
issue: #42028

- limit the concurrency of zstd compression.
- zstd.go modified from
`github.com/apache/arrow/go/v17/parquet/compress/ztsd.go`
- may be related to #42129

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-11 09:06:34 +08:00
Bingyi Sun fbf5cb4e62
feat: Add json flat index (#39917)
issue: https://github.com/milvus-io/milvus/issues/35528

This PR introduces a JSON flat index that allows indexing JSON fields
and dynamic fields in the same way as other field types.

In a previous PR (#36750), we implemented a JSON index that requires
specifying a JSON path and casting a type. The only distinction lies in
the json_cast_type parameter. When json_cast_type is set to JSON type,
Milvus automatically creates a JSON flat index.

For details on how Tantivy interprets JSON data, refer to the [tantivy
documentation](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#pitfalls-limitation-and-corner-cases).

Limitations
Array handling: Arrays do not function as nested objects. See the
[limitations
section](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#arrays-do-not-work-like-nested-object)
for more details.

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-10 19:14:35 +08:00
XuanYang-cn 83877b9faf
enhance: remove extra get collection (#42042)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2025-06-10 18:34:35 +08:00
junjiejiangjjj f1a4526bac
enhance: refactor rrf and weighted rerank (#42154)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-06-10 18:08:35 +08:00
wei liu f3fe117840
fix: Use delete checkpoint to prevent delete record loss in L0 refactoring (#42628)
issue: #39333 #41570
Fix delete record missing issue introduced in PR #39552 L0 refactoring:
- Use delete checkpoint as consume start position when deleteCP <
channelCP
- Add logging when delete checkpoint is used instead of seek position
- Prevent delete record loss when deleteCP is earlier than default
channelCP

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-10 17:34:35 +08:00
yihao.dai ed55b14484
fix: Release data memory after sync task completes (#42627)
Release data memory after sync task completes to prevent datanode oom
during import.

issue: https://github.com/milvus-io/milvus/issues/42608

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-10 16:28:34 +08:00
cqy123456 c9680a5b56
fix: avoid load index or create interim index in ChunkedSegmentSealedImpl::HasRawData() (#42622)
issue: https://github.com/milvus-io/milvus/issues/42526

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-06-10 14:54:34 +08:00
Zhen Ye af0881ee5d
fix: timetick cannot push forward when upgrading (#42567)
issue #42492

- streamingcoord start before old rootcoord.
- streaming balancer will check the node session synchronously to avoid
redundant operation when cluster startup.
- ddl operation will check if streaming enabled, if the streaming is not
enabled, it will use msgstream.
- msgstream will initialize if streaming is not enabled, and stop when
streaming is enabled.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-10 14:52:42 +08:00
cqy123456 317bbfbf81
enhance: milvus support minhash vector and mhjaccard metric (#42036)
issue:
https://github.com/issues/assigned?issue=milvus-io%7Cmilvus%7C41746

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-06-10 14:38:34 +08:00
Bingyi Sun b3ecf77a66
fix: Fix the bug of valid data write corruption (#42556)
issue: https://github.com/milvus-io/milvus/issues/42554

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-10 14:22:34 +08:00
zhagnlu 2861096734
fix: Add explicit move semantics to get_batch_view interface (#42403)
#42401

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-10 13:06:35 +08:00
sthuang 9439eaef52
fix: [StorageV2] sync with int8 vector data type core dumped (#42616)
related: https://github.com/milvus-io/milvus/issues/42613, #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-10 11:42:35 +08:00
aoiasd 13330bd466
fix: add concurrency and close protect for bm25 function (#42597)
relate: https://github.com/milvus-io/milvus/issues/42576

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-10 11:36:34 +08:00
sthuang 89c3afb12e
fix: [StorageV2] index/stats task level storage v2 fs (#42191)
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-10 11:06:35 +08:00
aoiasd fd6e2b52ff
enhance: use english name as language name for all type language identifier (#42600)
Set whatlang detect return language name as english name.
Make sure same with lingua.

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-10 10:24:35 +08:00
congqixia a9aaa86193
enhance: [StorageV2] Pass bucket name for compaction readers (#42607)
Related to #39173

Like logic in #41919, storage v2 fs shall use complete paths with
bucketName prefix to be compatible with its definition. This PR fills
bucket name from config when creating reader for compaction tasks.

NOTE: the bucket name shall be read from task params config for
compaction task pooling.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-10 10:20:35 +08:00
congqixia 118684afbb
enhance: [storageV2] Pass nullable converting insertMsg fieldData (#42584)
Related to #39173

`nullable` flag is crucial for serde logic of v2 writer, missing this
flag causes logic bug for v2 nullalbe data.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-10 10:06:34 +08:00
Bingyi Sun ffb2877992
enhance: support auto index type for json index (#42071)
issue: https://github.com/milvus-io/milvus/issues/42070

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-09 21:22:34 +08:00
wei liu 317e7999da
fix: ReleasePartition cause delegator unserviceable. (#42423)
issue: #42098 #42404
related to: ##42009 #41937

Implement new method to handle partition removal from next target
without directly modifying current target.

Changes include:
- Add RemovePartitionFromNextTarget method and deprecate RemovePartition
- Update target_observer to use new method for ReleasePartition
operations
- Add unit tests and mock methods for new functionality

This ensures that all changes to next target will propagates to
delegator's query view.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-09 19:02:34 +08:00
Bingyi Sun 6404e02d99
fix: Check cast type is array for json contains expr (#42184)
issue: https://github.com/milvus-io/milvus/issues/42181

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-09 17:04:33 +08:00
congqixia f1188b6781
enhance: [storagev2] Support partition key isolation index (#42574)
Related to #39173

This patch make storage v2 support partition key isolation index feature

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-09 14:02:33 +08:00
yihao.dai 837349dead
enhance: Adjust default import buffer size (#42541)
Increase insert buffer size from 16MB to 64MB, while keeping delete
buffer size at 16MB.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-09 13:02:33 +08:00
sthuang b136f85ca0
fix: storage v2 write mmap file per field per cell (#42180)
Each cell of a field should be written to its own mmap file, rather than
writing all cells of the field into a single mmap file.
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-09 11:48:33 +08:00
aoiasd 6e16653597
fix: update tantivy commit version to fix stemmer panic (#42171)
relate: https://github.com/milvus-io/milvus/issues/42168

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-09 10:34:33 +08:00
Xianhui Lin 7e46fc6618
feat: implement batch commit for JSON Stats (#42494)
implement batch commit for JSON Stats
issue:https://github.com/milvus-io/milvus/issues/41616

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-06-08 19:58:33 +08:00
Buqian Zheng b4d549d96a
fix: pipeline/delegator leak (#42582)
the manager's logging lambda should not capture the pipeline object

this creates a circular reference between the manager and the pipeline
object, making it impossible for both to be GC-ed.

issue: https://github.com/milvus-io/milvus/issues/42581

Signed-off-by: Buqian Zheng <buqianzheng@Buqians-MacBook-Air.local>
Co-authored-by: Buqian Zheng <buqianzheng@Buqians-MacBook-Air.local>
2025-06-06 22:00:32 +08:00
wei liu 8511881d3f
enhance: Increase search/query retry times on proxy before timeout (#40438)
issue: #39379

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-06 18:12:32 +08:00
congqixia b50c4a7973
enhance: Make segcore thread name set correctly (#42497)
Previous PR: #42017 did not work due to following updated points by this
PR:

- Initialize the `name_map`, which not touched at all before
- Trim the thread name under 15 characters to fit syscall limit

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-06 16:26:32 +08:00
Bingyi Sun cc5ac1c220
enhance: Support cast function for json index (#41949)
issue: #41948

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2025-06-05 19:42:32 +08:00
zhagnlu 0c4b12565e
fix: fix is null bug for marisa index (#42420)
#42255

Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2025-06-05 16:40:32 +08:00
cai.zhang e299c533be
fix: Just trigger stats task for Flushed segment (#42424)
issue: #42419

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-05 15:42:32 +08:00
aoiasd b1f86f6556
enhance: run analyzer should get database name from grpc context (#42398)
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-05 14:26:31 +08:00
aoiasd 2eb24fbe7c
fix: analyzer memory leak because function runner not close (#41839)
relate: https://github.com/milvus-io/milvus/issues/41213

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-06-05 14:24:40 +08:00
congqixia 373deba0bd
fix: Pass cluster id tranforming drop task to drop job request (#42531)
Related to #42530

The cluster id is missing when drop worker drop causing redoing task on
report duplicated task error.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-05 13:20:32 +08:00
Zhen Ye 0567f512b3
fix: streamingnode get stucked when stop (#42501)
issue: #42498

- fix: sealed segment cannot be flushed after upgrading
- fix: get mvcc panic when upgrading
- ignore the L0 segment when graceful stop of querynode.

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-05 12:22:31 +08:00
Ted Xu 35c17523de
feat: limit search result entries (#42522)
See: #42521

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2025-06-05 12:08:33 +08:00
cai.zhang 43c99a2c49
fix: Only mark segment compacting for sort stats task (#42516)
issue: #42506

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-04 22:46:32 +08:00
yihao.dai 6fda1f69c8
fix: Fix duplicate autoID between import and insert (#42519)
Remove the unlimited logID mechanism and switch to redundantly
allocating a large number of IDs.

issue: https://github.com/milvus-io/milvus/issues/42518

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-04 19:58:31 +08:00
cai.zhang 5566a85bcc
enhance: Add proxy task queue metrics (#42156)
issue: #42155

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-06-04 11:26:32 +08:00
Chun Han e9b5d9e8bc
enhance: refine compaction trigger to reduce read/write amplifaction(#41336) (#41728)
related: #41336

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
Co-authored-by: MrPresent-Han <chun.han@gmail.com>
2025-06-04 11:24:38 +08:00
Zhen Ye 508264f953
fix: querynode upgrade from 2.5 get stucked (#42502)
issue: #42492

- consider the old RO query node (not streaming node) when balancing
channel.
- querynode graceful stop can be done if there's only L0 segment exists.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-04 11:20:30 +08:00
congqixia b76478378a
feat: [Tiered] Make load list work as warmup hint (#42490)
Related to #42489
See also #41435

This PR's main target is to make partial load field list work as caching
layer warmup policy hint. If user specify load field list, the fields
not included in the list shall use `disabled` warmup policy and be able
to lazily loaded if any read op uses them.

The major changes are listed here:
- Pass load list to segcore and creating collection&schema
- Add util functions to check field shall be proactively loaded
- Adapt storage v2 column group, which may lead to hint fail if columns
share same group

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-06-04 10:28:32 +08:00
Zhen Ye fc010e44a8
fix: release memory after pop from heap (#42482)
issue: #42481

Signed-off-by: chyezh <chyezh@outlook.com>
2025-06-04 10:00:32 +08:00
sthuang 490827974d
enhance: avoid shutdown sdk api in minio cm destructor (#42459)
related: #39173

---------

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-06-04 09:58:39 +08:00
yihao.dai e0113b375e
fix: Fix sort stats generates large binlogs (#42456)
Remove the hardcoded batchSize of 100,000 and instead trigger a write
every 64MB based on actual data size. This prevents sort stats from
generating excessively large binlog files.

issue: https://github.com/milvus-io/milvus/issues/42400

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-06-04 09:56:39 +08:00
wei liu aa66072a1c
enhance: Remove inadvertently introduced goccy/go-json dependency (#42146)
Remove the 'goccy/go-json' library, which was inadvertently introduced,
and revert to using the standard internal JSON handling.

Changes include:
- Removed dependency on 'github.com/goccy/go-json' in go.mod and go.sum.
- Replaced import of 'goccy/go-json' with 'internal/json' in
'internal/querycoordv2/task/scheduler.go'.

This correction ensures the project continues to use the intended JSON
processing libraries and avoids unnecessary external dependencies.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-03 17:38:32 +08:00
cqy123456 727f4ec24b
enhance:mmapchunkmanager allocates MmapChunkDescriptor itself (#42150)
issue: https://github.com/milvus-io/milvus/issues/42157

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-06-03 14:42:31 +08:00
wei liu 5a355d1e57
fix: Fix data race in global scheduler test using atomic counters (#42454)
issue: #42457

Replace unsafe ExpectedCalls modification with atomic.Int32 state
tracking to avoid race conditions in concurrent test execution. Changes
include:
- Use atomic counters instead of direct mock ExpectedCalls manipulation
- Add RunAndReturn with atomic state transitions for thread safety
- Remove github.com/samber/lo dependency

This prevents data race when mock framework and test goroutines access
ExpectedCalls concurrently.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-06-03 14:18:30 +08:00
Zhen Ye e479467582
fix: panic when upgrading from old arch (#42422)
issue: #42405

- add delete rows into header when upsert.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-31 22:56:29 +08:00
congqixia cc42d49769
fix: [StorageV2][AddField] Handle lack binlog rows in storage v2 (#42186)
Related to #39173 #39718

In storage v2, the `lack_bin_rows` cannot be used since field id is not
column group id, which will not be matched forever.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-31 02:44:30 +08:00
yihao.dai 297331b2cc
enhance: Add slot and tasks num metrics (#42141)
issue: https://github.com/milvus-io/milvus/issues/41123

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-30 21:52:30 +08:00
wei liu 2669d14ba0
refactor: Remove balance constraints between channel and segment tasks (#42177)
issue: #42176

Remove the mutual exclusion constraints between channel and segment
balance tasks to allow them to run concurrently.

Changes include:
- Remove permitBalanceChannel() and permitBalanceSegment() methods from
RoundRobinBalancer
- Update ChannelLevelScoreBalancer, MultiTargetBalancer,
RowCountBasedBalancer, and ScoreBasedBalancer to remove constraint
checks
- Allow segment balance tasks to proceed even when channel balance tasks
are running
- Update test cases to reflect new behavior where balance tasks no
longer block each other

This change improves the efficiency of load balancing by removing
unnecessary coordination overhead between different types of balance
operations.

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-30 18:14:25 +08:00
congqixia 6d2ad519b1
enhance:[StorageV2] Adapt local storage & other minor issue (#42167)
Related to #39173

This PR
- Handle storage v2 log path in local storage mode on querynode
- Ignore field info check when append index for loaded sealed segment
when using storage v2

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-30 10:22:29 +08:00
Xiaowei Shi 729d0b666e
enhance: use parsed physical timestamp in metrics (#41784)
issue: https://github.com/milvus-io/milvus/issues/38809
pr: https://github.com/milvus-io/milvus/pull/38810 failed to reopen

Signed-off-by: Xiaowei Shi <shallwe.shih@gmail.com>
2025-05-30 10:20:37 +08:00
Chun Han ed0df38605
enhance: resize high priority wqthreadpool dynamically(#40838) (#41549) (#41929)
related: #40838
pr: https://github.com/milvus-io/milvus/pull/41549

Signed-off-by: MrPresent-Han <chun.han@gmail.com>
2025-05-30 10:18:36 +08:00
Zhen Ye 66cc194ab2
enhance: add partition gc at streaming arch (#42179)
issue: #41976

- make drop partition message as a broadcast message.
- add gc when drop partition message is acked.
- add a call back to handle the broadcast message when ack.
- the ack operation of broadcast message will retry until success.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:20:30 +08:00
Zhen Ye 4bad293655
enhance: make upgrading from 2.5.x less down time (#42082)
issue: #40532

- start timeticksync at rootcoord if the streaming service is not
available
- stop timeticksync if the streaming service is available
- open a read-only wal if some nodes in cluster is not upgrading to 2.6
- allow to open read-write wal after all nodes in cluster is upgrading
to 2.6

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:02:29 +08:00
Zhen Ye b94cee2413
fix: growing segment from old arch is not flushed after upgrading (#42164)
issue: #42162

- enhance: add read ahead buffer size issue #42129
- fix: rocksmq consumer's close operation may get stucked
- fix: growing segment from old arch is not flushed after upgrading

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 23:00:28 +08:00
wei liu eabb62e3ab
fix: Segment may be released prematurely during balance channel (#42090)
issue: #41143

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-29 18:36:35 +08:00
Zhen Ye c7d6e3f19b
fix: data lost when wal balance (#42149)
issue: #42147

- error of sync task should be returned if error is returned to avoid
checkpoint is push forward.
- fix up node id checker of UpdateChannelCheckpoint in streaming.

Signed-off-by: chyezh <chyezh@outlook.com>
2025-05-29 17:32:29 +08:00
aoiasd 3a74044149
fix: hybird search sub requset not set analyzer name (#41896)
relate: https://github.com/milvus-io/milvus/issues/41213

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 14:56:28 +08:00
hckex 020d36624c
enhance: Fix typo 'dimesion' to 'dimension' in PreExecute method (#42160)
This PR fixes a minor typo in a log message in the `PreExecute` method
of `internal/datanode/index/task_index.go`.
Corrected "dimesion" to "dimension".

Signed-off-by: hckex <33862757+hckex@users.noreply.github.com>
2025-05-29 12:24:30 +08:00
Buqian Zheng fdf5e05c80
fix: log is_sorted_by_pk_ when loading sealed segment (#42142)
issue: https://github.com/milvus-io/milvus/issues/41993

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-05-29 11:48:29 +08:00
Xianhui Lin 8bbfbd1d54
fix: handle the error and return in mixcoord (#42152)
fix: handle the error and return in mixcoord
issue:https://github.com/milvus-io/milvus/issues/42151

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-29 11:40:30 +08:00
aoiasd 2ae4d80120
enhance: support run analyzer by loaded collection field (#42113)
relate: https://github.com/milvus-io/milvus/issues/42094

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2025-05-29 10:54:30 +08:00
junjiejiangjjj 4202c775ba
feat: Support vllm and tei rerank (#41947)
https://github.com/milvus-io/milvus/issues/35856

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>
2025-05-28 19:18:28 +08:00
groot 14563ad2b3
enhance: bulkinsert handles nullable/default (#42127)
issue: https://github.com/milvus-io/milvus/issues/42096,
https://github.com/milvus-io/milvus/issues/42130

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-05-28 18:02:28 +08:00
yihao.dai 79b51cbb73
fix: Fix task getting stuck after recovery (#42114)
Submit tasks into the global scheduler after recovery.

issue: https://github.com/milvus-io/milvus/issues/42046

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-28 12:46:28 +08:00
congqixia 08a53c56b1
fix: [AddField] Use metacache schema in embedding node (#42115)
Related to #42084

Embedding node cached schema when created, causing schema mismatch after
schema change. This PR make embeddingNode use schema from metacache,
which will be updated.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-28 11:30:28 +08:00
Xianhui Lin da30e1e4df
fix: pass the ttl duration in the search request for ttl filter (#42122)
fix: pass the TTL duration in the search request for TTL filter
issue:https://github.com/milvus-io/milvus/issues/41959

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-28 11:08:29 +08:00
Buqian Zheng 7243c1d0ce
feat: remove async warmup policy (#42123)
issue: https://github.com/milvus-io/milvus/issues/41993

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2025-05-28 10:30:28 +08:00
cai.zhang 63246c040f
fix: Use locking to ensure the atomicity of dropping segment indexes (#42075)
issue: #41288

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2025-05-28 10:00:28 +08:00
cqy123456 5fe7015f63
enhance: InterimIndex support more index type and data type (#41021)
issue: https://github.com/milvus-io/milvus/issues/27678
cherry pick from : https://github.com/milvus-io/milvus/pull/39180,
https://github.com/milvus-io/milvus/pull/40429

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2025-05-28 08:40:28 +08:00
wei liu 54619eaa2c
feat: Implement partial result support on node down (#42009)
issue: https://github.com/milvus-io/milvus/issues/41690
This commit implements partial search result functionality when query
nodes go down, improving system availability during node failures. The
changes include:

- Enhanced load balancing in proxy (lb_policy.go) to handle node
failures with retry support
- Added partial search result capability in querynode delegator and
distribution logic
- Implemented tests for various partial result scenarios when nodes go
down
- Added metrics to track partial search results in querynode_metrics.go
- Updated parameter configuration to support partial result required
data ratio
- Replaced old partial_search_test.go with more comprehensive
partial_result_on_node_down_test.go
- Updated proto definitions and improved retry logic

These changes improve query resilience by returning partial results to
users when some query nodes are unavailable, ensuring that queries don't
completely fail when a portion of data remains accessible.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2025-05-28 00:12:28 +08:00
yihao.dai 57b58ad778
fix: Fix concurrent l0Compaction and Stats (#42112)
Return `false` in the `Process()` function for `executing` or
`pipelining` state `l0Compaction`. This prevents the `l0Compaction` task
from being removed from the `CompactionInspector`'s executing queue,
thereby avoiding concurrent execution of `l0Compaction` and `Stats`.

issue: https://github.com/milvus-io/milvus/issues/42008

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-27 20:54:28 +08:00
junjiejiangjjj 0b2ecb7632
fix: Solve clang compilation errors (#42041)
https://github.com/milvus-io/milvus/issues/42040

Signed-off-by: junjiejiangjjj <junjie.jiang@zilliz.com>
2025-05-27 20:32:29 +08:00
congqixia 6d0b15308d
enhance: Take nq into slow query consideration (#42109)
Related to #40756

Large nq will naturally increase query time, which causing lots of slow
log when user NQ numbers are very large.

This PR make slow search counts span per nq (using avg val) to decide
whether one request is slow or not.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-27 19:56:28 +08:00
congqixia 4cab236bca
enhance: [AddField][Nullable] Fill absent nullable field server-side (#42095)
Related to #39718

The absent nullable field shall be filled at server-side in nullable
design. While the implementation here was buggy causing the feature was
not able to serve.

This PR make proxy fill the field data in correct format so that field
data with absent column(s) will be accepted.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2025-05-27 18:50:28 +08:00
Xianhui Lin 6a0e182e13
enhance: support TTL expiration with queries returning no results (#42086)
support TTL expiration with queries returning no results
issue:https://github.com/milvus-io/milvus/issues/41959

Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>
2025-05-27 18:28:27 +08:00
sthuang b9b554676c
fix: storage v2 get field data with correct column group files (#42107)
related: #39173

Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>
2025-05-27 15:26:28 +08:00
groot c00005bdaa
feat: support to drop properties of field (#41996)
issue: https://github.com/milvus-io/milvus/issues/41990

Signed-off-by: yhmo <yihua.mo@zilliz.com>
2025-05-27 14:32:34 +08:00
yihao.dai 59a6eef774
fix: Fix compaction getting stuck (#42087)
Reset `isCompacting` flag after JSONStats and BM25 task finished.

issue: https://github.com/milvus-io/milvus/issues/42083

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-05-27 10:26:27 +08:00
hckex a20500b3ed
doc: Fix small typo in internal/datanode/README.md (#42089)
This PR fixes a minor typo in the README file of the datanode module.
Corrected "imformation" to "information".

Signed-off-by: hckex <33862757+hckex@users.noreply.github.com>
2025-05-27 10:16:27 +08:00