milvus

Commit Graph

Author	SHA1	Message	Date
Bingyi Sun	0dee3ccfd7	enhance: Make user specified doc id selectable for tantivy index writer (#41528 ) issue: https://github.com/milvus-io/milvus/issues/41527 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-05-07 10:48:53 +08:00
Bingyi Sun	4c08090687	feat: Add json index support for json contains expr (#41478 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-05-06 11:44:52 +08:00
Buqian Zheng	73bbf4c674	fix: error when lack_binlog_rows = 0 (#41644 ) issue: https://github.com/milvus-io/milvus/issues/41643 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-04 00:24:56 +08:00
sthuang	e9442f575d	feat: storage v2 seal segment load (#41567 ) Storage v2 chunked sealed segment loading is based on the caching layer. A cell unit in storage v2 is a Parquet row group in remote object storage, containing all fields. Therefore, each field needs a proxy to perform its single-field operations. related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-30 14:22:58 +08:00
sthuang	6c377b6e86	feat: Storage v2 index and stats raw data (#41534 ) related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-30 08:48:54 +08:00
zhagnlu	cd60b965c8	enhance: add expr filter ratio monitor params (#41402 ) #41401 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-29 17:02:54 +08:00
foxspy	1d99f8bd67	enhance: add force rebuild index configuration (#41473 ) issue: #41431 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-04-29 16:20:56 +08:00
congqixia	f3f8227cd0	enhance: [AddField] Trigger check schema in retrieve as well (#41598 ) Related to #39718 Fixes milvus-io/pymilvus#2771 This PR: - Make AsyncRetrieve task triggers "schema check" logic as well - Rename `AddField` related methods to align with code standard Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-29 14:10:49 +08:00
Spade A	910f68c986	fix: update tantivy to fix tantivy doc out of order when merge (#41596 ) issue: #41597 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-29 13:46:49 +08:00
Spade A	f35e8f7420	fix: fix arm64 compile issue (#41593 ) issue: https://github.com/milvus-io/milvus/issues/41059, https://github.com/milvus-io/milvus/issues/41510 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-29 13:19:25 +08:00
Buqian Zheng	3de904c7ea	feat: add cachinglayer to sealed segment (#41436 ) issue: https://github.com/milvus-io/milvus/issues/41435 --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-04-28 10:52:40 +08:00
cai.zhang	640f526301	fix: Update current scalar index version to compatible tantivy different versions (#41141 ) issue: #40823 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-04-27 20:44:39 +08:00
Chun Han	12cde913b5	fix: fail to get string views due to chunk bound empty loop(#41300 ) (#41452 ) related: #41300 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-27 10:40:38 +08:00
congqixia	b5443ddbd0	enhance: [AddField] Reopen loaded segments after AddField (#41529 ) Related to #39718 This PR: - Add reopen logic for growing & sealed segments - Lazy reopen when schema version increases - Add FinishLoad api for loading progress --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-26 08:48:39 +08:00
Buqian Zheng	1c8b9c127d	fix: Make sure segment in ut is destroyed before static MmapManager singleton (#41508 ) issue: #41507 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-04-25 18:50:38 +08:00
Xianhui Lin	1a6838b496	fix: json stats add map null check before insert into tantivity (#41505 ) json stats add map null check before insert into tantivity. Json stats index may fail if there is no data issue:https://github.com/milvus-io/milvus/issues/41494 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-24 21:06:37 +08:00
congqixia	dbe54c2df8	enhance: [AddField] Resolve conflicts & make WAL ts collection updatets (#41476 ) Related to #39718 This PR: - Use WAL broadcast timestamp as Collection update timestamp - Remove request_fields size assertion - Remove proxy schema cache loaded field check & skip related cases - other minor issues --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-24 12:06:39 +08:00
Spade A	f3d878ab3f	fix: update tantivy for fixing phrase match (#41450 ) issue: #41454 https://github.com/zilliztech/tantivy/pull/8 fixes the problem, this PR update the tantivy. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-24 10:52:37 +08:00
aoiasd	f52c2909c4	feat: support multi analyzer for bm25 function (#41351 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-23 18:22:38 +08:00
Xianhui Lin	3d4889586d	fix: JsonStats filter by conjunctExpr and improve the task slot calculation logic (#41459 ) Optimized JSON filter execution by introducing ProcessJsonStatsChunkPos() for unified position calculation and GetNextBatchSize() for better batch processing. Improved JSON key generation by replacing manual path joining with milvus::Json::pointer() and adjusted slot size calculation for JSON key index jobs. Updated the task slot calculation logic in calculateStatsTaskSlot() to handle the increased resource needs of JSON key index jobs. issue: https://github.com/milvus-io/milvus/issues/41378 https://github.com/milvus-io/milvus/issues/41218 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2025-04-23 16:30:37 +08:00
aoiasd	a16bd6263b	feat: support more lauguage for build in stop words and add remove punct, regex filter (#41412 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-23 11:44:37 +08:00
aoiasd	11f2fae42e	feat: support extend default dict for jieba tokenizer (#41360 ) relate: https://github.com/milvus-io/milvus/issues/41213 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-22 20:34:37 +08:00
congqixia	b36c88f3c8	enhance: [AddField] Broadcast schema change via WAL (#41373 ) Related to #39718 Add Broadcast logic for collection schema change and notifies: - Streamnode - Delegator - Streamnode - Flush component - QueryNodes via grpc --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-22 16:28:37 +08:00
aoiasd	110c5aaaf4	feat: support icu and language identifier tokenizer (#41214 ) relate: https://github.com/milvus-io/milvus/issues/41213 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-22 15:56:37 +08:00
cqy123456	5219d9a723	fix: Inserting null and non-null array at the same time will cause milvus crash when growing mmap open (#41051 ) issue: https://github.com/milvus-io/milvus/issues/40981 2.5 pr: https://github.com/milvus-io/milvus/pull/41052 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-04-22 12:26:37 +08:00
aoiasd	f166843c5e	enhance: support use lindera tag filter (#40416 ) relate: https://github.com/milvus-io/milvus/issues/39659 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-21 15:56:36 +08:00
sparknack	8ccb875e41	enhance: add simde package (#40943 ) issue: #40942 Add simde package, which can make porting SIMD code to other architectures much easier. Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-04-21 12:18:40 +08:00
Spade A	5b1430f27e	enhance: tantivy collector set bitset directly (#39748 ) fix: #39755 The following shows a simple benchmark that inserts 1M docs where all rows are "hello"; the latency is measured at the segcore level on a 9900K CPU: master: 2.62ms this PR: 2.11ms benchmark code: ``` TEST(TextMatch, TestPerf) { auto schema = GenTestSchema({}, true); auto seg = CreateSealedSegment(schema, empty_index_meta); int64_t N = 1000000; uint64_t seed = 19190504; auto raw_data = DataGen(schema, N, seed); auto str_col = raw_data.raw_->mutable_fields_data() ->at(1) .mutable_scalars() ->mutable_string_data() ->mutable_data(); for (int64_t i = 0; i < N - 1; i++) { str_col->at(i) = "hello"; } SealedLoadFieldData(raw_data, *seg); seg->CreateTextIndex(FieldId(101)); auto now = std::chrono::high_resolution_clock::now(); auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch); auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP); auto end = std::chrono::high_resolution_clock::now(); auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - now); std::cout << "TextMatch query time: " << duration.count() << "us" << std::endl; } ``` --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-20 23:02:41 +08:00
Chun Han	016920b023	fix: solve incompitable problem for none-encoding index(#40838 ) (#41369 ) related: #40838 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-20 22:56:44 +08:00
Ted Xu	d50781c8cc	enhance: support nullable group by keys (#41313 ) See #36264 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-04-18 10:08:34 +08:00
Spade A	62293cb582	fix: revert batch add (#41374 ) issue: #41375 todo: to fix the problems fixed in the issue. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-17 22:32:38 +08:00
Bingyi Sun	4552dd4b23	fix: Fix json index does not work for string filter (#41382 ) issue: #35528 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-17 20:10:39 +08:00
sthuang	1f1c836fb9	feat: Storage v2 growing segment load (#41001 ) support parallel loading sealed and growing segments with storage v2 format by async reading row groups. related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-16 17:14:33 +08:00
Spade A	70d13dcf61	enhance: update tantivy for removing "doc_id" fast field (#41198 ) Issue: #41210 After https://github.com/zilliztech/tantivy/pull/5, we can provide the milvus row id directly to tantivy rather than recording it in the fast field "doc_id". So rather than searching for the tantivy doc id and then getting the milvus row id from "doc_id", the searched tantivy doc id is now the milvus row id, eliminating the expensive row id lookup phase. The following shows a simple benchmark that inserts 1M docs where all rows are "hello"; the latency is measured at the segcore level on a 9900K CPU: ![image](https://github.com/user-attachments/assets/d8e72134-56b5-430b-8628-36c3bed8eaad) The latency is 2.02 and 2.1 times respectively. benchmark code: ``` TEST(TextMatch, TestPerf) { auto schema = GenTestSchema({}, true); auto seg = CreateSealedSegment(schema, empty_index_meta); int64_t N = 1000000; uint64_t seed = 19190504; auto raw_data = DataGen(schema, N, seed); auto str_col = raw_data.raw_->mutable_fields_data() ->at(1) .mutable_scalars() ->mutable_string_data() ->mutable_data(); for (int64_t i = 0; i < N - 1; i++) { str_col->at(i) = "hello"; } SealedLoadFieldData(raw_data, *seg); seg->CreateTextIndex(FieldId(101)); auto now = std::chrono::high_resolution_clock::now(); auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch); auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP); auto end = std::chrono::high_resolution_clock::now(); auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - now); std::cout << "TextMatch query time: " << duration.count() << "us" << std::endl; } ``` --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-15 20:20:32 +08:00
Bingyi Sun	a953eaeaf0	enhance: support binary range expression for json path index (#41025 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-15 19:32:33 +08:00
Chun Han	59b14d38f5	enhance: Optimize index format for improved load performance(#40838 ) (#40839 ) related: https://github.com/milvus-io/milvus/issues/40838 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-15 03:10:30 +08:00
Bingyi Sun	bf617115ca	enhance: Remove single chunk segment related codes (#39249 ) https://github.com/milvus-io/milvus/issues/39112 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-11 18:56:29 +08:00
Spade A	9ce3e3cb44	enhance: add documents in batch for json key stats (#41228 ) issue: https://github.com/milvus-io/milvus/issues/40897 After this, the document add operations scheduling duration is decreased roughly from 6s to 0.9s for the case in the issue. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-11 14:08:26 +08:00
Bingyi Sun	b9b8419cbf	fix: Use int32 when creating array index for element type int8/int16 (#41185 ) issue: #41172 Elements with type int8 or int16 in Array is encoded using int32, so we should parse it as int32 when creating index. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-11 13:18:25 +08:00
foxspy	17e10beba0	fix: avoid segmentation faults caused by retrieving empty vector datasets (#40545 ) issue: #40544 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-04-10 20:16:29 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
Spade A	e9fa30f462	fix: remove single segment logic in V7 (#41159 ) Ref: https://github.com/milvus-io/milvus/issues/40823 It does not make any sense to create single segment tantivy index for old version such as 2.4 by using tantivy V7. So, clean the relevant code. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-09 19:54:27 +08:00
zhagnlu	3ed23a5f48	fix: fix remove index type failed when remote storage is local mode (#41164 ) #41142 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-09 16:42:26 +08:00
zhagnlu	ee1faf80dd	fix:add clear bitmap for batch skip mode (#41166 ) #41086 #41150 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-09 13:08:27 +08:00
sthuang	50e02e3598	enhance: update packed reader api (#41055 ) related: https://github.com/milvus-io/milvus/issues/39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-09 10:18:26 +08:00
congqixia	e2d8adb963	fix: Use element_type for Array is null operator (#41157 ) Related to #41156 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-04-09 10:16:24 +08:00
Spade A	c6a0c2ab64	enhance: process tantivy document add by batch (#40124 ) issue: https://github.com/milvus-io/milvus/issues/40006 This PR makes tantivy add documents in batches. Adding documents by batch can greatly reduce the latency of scheduling the document add operation (calling tantivy `add_document` only schedules the add operation and returns immediately after scheduling), because each call involves a tokio block_on, which is relatively heavy. Reducing the scheduling cost does not necessarily reduce the overall latency if the index writer threads do not process indexing quickly enough. But if scheduling itself is slow, then even when the index writer threads process indexing very fast (by increasing the thread number), the overall performance can still be limited. The following code benchmarks the PR (note: the duration only counts scheduling, without commit) ``` fn test_performance() { let field_name = "text"; let dir = TempDir::new().unwrap(); let mut index_wrapper = IndexWriterWrapper::create_text_writer( field_name, dir.path().to_str().unwrap(), "default", "", 1, 50_000_000, false, TantivyIndexVersion::V7, ) .unwrap(); let mut batch = vec![]; for i in 0..1_000_000 { batch.push(format!("hello{:04}", i)); } let batch_ref = batch.iter().map(|s| s.as_str()).collect::<Vec<_>>(); let now = std::time::Instant::now(); index_wrapper .add_data_by_batch(&batch_ref, Some(0)) .unwrap(); let elapsed = now.elapsed(); println!("add_data_by_batch elapsed: {:?}", elapsed); } ``` Latency is reduced roughly from 1.4s to 558ms. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-08 19:50:24 +08:00
Bingyi Sun	da21640ac3	fix: Fix the bug that null data can not be filtered by null expr (#41124 ) issue: https://github.com/milvus-io/milvus/issues/41063 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-08 19:12:24 +08:00
aoiasd	6f17720e4e	enhance: support use jieba tokenizer with costum dictionary (#39854 ) relate: https://github.com/milvus-io/milvus/issues/40168 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-08 14:52:27 +08:00
Spade A	e4da2765ba	enhance: process batch of strings within one tantivy_index_add_string call (#40007 ) issue: #40006 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-08 01:20:25 +08:00
Bingyi Sun	355f62d6c9	fix: Align brute force search with json index for exists expr (#41116 ) issue: #35528 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-07 15:42:23 +08:00
zhagnlu	ee8783cae9	fix:add operator type for some operator (#40895 ) #40894 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-07 11:58:27 +08:00
zhagnlu	10a63b3f2e	enhance: add formatter for serveral types to remove compile warning (#41094 ) #41091 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-07 11:54:24 +08:00
zhagnlu	0a378dc308	fix:fix format error for json (#41026 ) #40963 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-07 10:22:22 +08:00
Bingyi Sun	fcb03b5bd1	feat: add json null/exists expression (#41004 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-04-03 17:48:21 +08:00
Zhen Ye	9f27d9af61	fix: segv if the LoadArrowReaderFromRemote run at the exception path (#41069 ) issue: #41067 Signed-off-by: chyezh <chyezh@outlook.com>	2025-04-03 02:54:21 +08:00
Spade A	f552ec67dd	fix: support building tantivy index with low version(5) (#40822 ) fix: https://github.com/milvus-io/milvus/issues/40823 To solve the problem in the issue, we have to support building tantivy index with low version for those query nodes with low tantivy version. This PR does two things: 1. refactor codes for IndexWriterWrapper to make it concise 2. enable IndexWriterWrapper to build tantivy index by different tantivy crate --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-02 18:46:20 +08:00
Chun Han	afa519b4c7	fix: array is null failed(#40686 ) (#41027 ) related: #40686 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-04-02 18:20:22 +08:00
smellthemoon	cb1e86e17c	enhance: support add field (#39800 ) After this PR is merged, insert, upsert, index build, query, and search are supported on the added field. These operations are available on the added field only after the add-field request completes, which is a synchronous operation. Compaction will be supported in the next PR. #39718 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-04-02 14:24:31 +08:00
Spade A	216be1494b	fix: add log for object storage operation fail (#40666 ) fix: #40665 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-02 01:26:21 +08:00
cqy123456	6dc0f42830	fix:growing mmap data type crashed by nullable input (#40994 ) issue: https://github.com/milvus-io/milvus/issues/40981 2.5 pr: https://github.com/milvus-io/milvus/pull/40980 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-03-31 20:32:19 +08:00
Bingyi Sun	27ff3a42e7	enhance: Record simdjson error (#41003 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-31 17:56:19 +08:00
Bingyi Sun	15ec7bae4d	fix: Fix using json index when iterative_filter is specified (#40945 ) issue: #40934 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-31 15:26:19 +08:00
Bingyi Sun	9676365af9	fix: Fix json index not equal filter (#40647 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-27 23:06:23 +08:00
aoiasd	384d39ef5a	enhance: not build lindera features by default and support make milvus with tantivy features (#40813 ) relate: https://github.com/milvus-io/milvus/issues/39659 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-03-27 14:08:22 +08:00
zhagnlu	87e7d6d79f	fix:fix exception when do arith expr with using index (#40794 ) #40783 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-27 11:10:21 +08:00
Xiaofan	8788e591cd	enhance: add detailed stack for error message (#40883 ) fix #40882 Adds a stack trace when operator execution fails. Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>	2025-03-26 13:24:20 +08:00
zhagnlu	7fdb2e144f	enhance:change multi or expr to in expr (#40757 ) #40752 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-25 11:06:18 +08:00
cai.zhang	a41cb942f6	fix: Do not delete the centroids file when sampling fails instead wait GC (#40701 ) issue: #40700 --------- Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-03-21 10:32:12 +08:00
sthuang	d7df78a6c9	feat: Storage v2 compaction (#40667 ) - Feat: Support Mix compaction. Covering tests include compatibility and rollback ability. - Read v1 segments and compact with v2 format. - Read both v1 and v2 segments and compact with v2 format. - Read v2 segments and compact with v2 format. - Compact with duplicate primary key test. - Compact with bm25 segments. - Compact with merge sort segments. - Compact with no expiration segments. - Compact with lack binlog segments. - Compact with nullable field segments. - Feat: Support Clustering compaction. Covering tests include compatibility and rollback ability. - Read v1 segments and compact with v2 format. - Read both v1 and v2 segments and compact with v2 format. - Read v2 segments and compact with v2 format. - Compact bm25 segments with v2 format. - Compact with memory limit. - Enhance: Use serdeMap serialize in BuildRecord function to support all Milvus data types. related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-03-21 10:16:12 +08:00
Bingyi Sun	5a6b4e56d5	fix: Fix tasks will panic if one of them throw an exception. (#40691 ) issue: https://github.com/milvus-io/milvus/issues/40690 the variable rcm will be dangling if a future throws an exception and return. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-19 16:52:09 +08:00
aoiasd	92bdf7a0c1	enhance: support run anayser return detaild token (#40458 ) relate: https://github.com/milvus-io/milvus/issues/39705 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-03-19 15:48:15 +08:00
zhagnlu	6c55db44f1	enhance: reorder sub expr for conjunct expr (#39872 ) Two points: (1) reorder the sub-expressions of a conjunct expression to postpone heavy operations; sequence: int(column) -> index(column) -> string(column) -> light conjunct ...... -> json(column) -> heavy conjunct -> two_column_compare (2) support pre-filtering during expression execution, skipping the scan of raw data already filtered out by the results of preceding expressions. #39869 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-19 14:50:14 +08:00
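The ordering listed in commit 6c55db44f1 can be pictured as a stable sort by a cost rank. The following is a minimal sketch of that idea only, not the Milvus implementation: `ExprKind`, `SubExpr`, and `reorder_conjuncts` are hypothetical names, and only the categories named in the commit message are modelled (the elided part of the sequence is left out).

```rust
/// Hypothetical sub-expression categories, declared in the evaluation order
/// given in the commit message (cheapest first). The derived `Ord` follows
/// declaration order, so it doubles as the cost rank.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum ExprKind {
    IntColumn,
    IndexedColumn,
    StringColumn,
    LightConjunct,
    JsonColumn,
    HeavyConjunct,
    TwoColumnCompare,
}

/// A sub-expression of a conjunction (illustrative only).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct SubExpr {
    kind: ExprKind,
    text: String,
}

/// Reorders the sub-expressions of a conjunction so that cheap ones run first.
/// `sort_by_key` is stable, so sub-expressions of the same kind keep their
/// original relative order.
#[allow(dead_code)]
fn reorder_conjuncts(mut exprs: Vec<SubExpr>) -> Vec<SubExpr> {
    exprs.sort_by_key(|e| e.kind);
    exprs
}
```

Because every sub-expression of a conjunction must hold, evaluating the cheap ones first lets the later, heavier ones skip rows that have already been ruled out, which is what point (2) of the commit message relies on.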
Zhen Ye	8db708f67d	enhance: enable memory prof based on jemalloc (#40731 ) issue: #40730 also see: https://github.com/milvus-io/cgosymbolizer/pull/2 After these PR, at linux: - the milvus will always enable jemalloc by default. - jemalloc will always compiled with --enable-prof options. - all image will always enable the jemalloc prof by default. - a pprof http service for jemalloc at `/debug/jemalloc/` will be registered into restful. - `jeprof` can remote profile the memory of milvus. Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-19 14:46:18 +08:00
zhagnlu	7ebe3d7038	enhance: refine chunk access logic and add some comment on data (#40618 ) #40367 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-16 22:20:08 +08:00
Bingyi Sun	6249335859	fix: Catch invalid json pointer error (#40625 ) issue: #35528 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-14 16:56:08 +08:00
Bingyi Sun	d3adab15ac	fix: Build double index for all json numeric field (#40619 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-14 16:52:11 +08:00
Bingyi Sun	8fbacf3583	fix: Null expr does not work for json field (#40456 ) issue: https://github.com/milvus-io/milvus/issues/40455 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-14 16:06:08 +08:00
Spade A	001fc992df	enhance: get doc ids by batch (#40608 ) issue: #40607 tantivy change: https://github.com/zilliztech/tantivy/pull/3 Benchmarks: Test Environment: CPU 9900K The data is inserted by: ``` for i in 0..N { for j in 0..UNIQUE { let key = format!("hello{}", j); index_writer.add_string(&key, i * UNIQUE + j).unwrap(); } } ``` So UNIQUE influences the locality of the matched docs. The latency is the average latency over 1000 repeated queries. The result shows a 22.5%-34.8% latency reduction. ![image](https://github.com/user-attachments/assets/dd8af75a-ddc3-445d-92df-50d354dd5645) --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-03-14 15:48:09 +08:00
Spade A	f36d1562bd	enhance: add metrics for random sample (#40634 ) issue: #39541 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-03-13 21:42:11 +08:00
Spade A	9f3bd55755	fix: avoid panic when field not exists in schema in query node (#40541 ) ref #40473 This PR is a workaround to avoid the panic described in the issue. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-03-12 22:44:08 +08:00
Bingyi Sun	0698d04f7d	enhance: Upgrade simdjson version (#40538 ) issue: https://github.com/milvus-io/milvus/issues/40519 simdjson returns better error code in newer version. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-11 15:04:05 +08:00
cai.zhang	e5f50076ec	enhance: Only check element type with not null array (#40446 ) Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-03-11 14:58:07 +08:00
Bingyi Sun	0a7e692b6f	fix: Fix null offset loading in inverted index (#40523 ) issue: #40516 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-10 22:12:04 +08:00
Cai Yudong	2bd2cca04a	enhance: Truly support multi vector data types in SearchBruteForce (#40499 ) Issue: #38666 Signed-off-by: CaiYudong <yudong.cai@zilliz.com>	2025-03-10 18:36:03 +08:00
sre-ci-robot	a6d4121034	[automated] Update Knowhere Commit (#40486 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-03-10 12:28:04 +08:00
smellthemoon	faae8ee518	fix: store wrong offset when build tantivy in nullable field (#40452 ) #40454 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-03-09 09:34:04 +08:00
Bingyi Sun	37b118d55d	fix: Skip loading primary key if index has raw data (#39921 ) issue: https://github.com/milvus-io/milvus/issues/39907 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-06 17:46:02 +08:00
Spade A	3db56560fb	fix: fix concurrent issues in null offset (#40363 ) issue: #40308 This PR fixes two concurrency issues: 1. Elements in null_offset are used to set the bitset, whose size is initialized from the tantivy document count. However, some documents may be null in null_offset but not yet committed in tantivy, so an out-of-range array access occurs. 2. null_offset can be read and written concurrently, but there is no synchronization protection. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-03-05 17:48:00 +08:00
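The two problems described in commit 3db56560fb can be illustrated with a small sketch. This shows one possible way to guard against both, under stated assumptions. It is not the code from the PR, and `NullOffsets` and its methods are hypothetical names. A `RwLock` stands in for the missing synchronization, and a bounds check skips offsets that lie beyond the bitset.

```rust
use std::sync::RwLock;

/// Hypothetical holder for the offsets of null rows.
#[allow(dead_code)]
struct NullOffsets {
    offsets: RwLock<Vec<usize>>,
}

#[allow(dead_code)]
impl NullOffsets {
    fn new() -> Self {
        NullOffsets { offsets: RwLock::new(Vec::new()) }
    }

    /// Records a null row; takes the write lock so that concurrent readers
    /// never observe a partially updated vector.
    fn push(&self, offset: usize) {
        self.offsets.write().unwrap().push(offset);
    }

    /// Marks null rows in `bitset`. The bitset may be shorter than the largest
    /// recorded offset (for example, when it was sized from a committed
    /// document count), so out-of-range offsets are skipped instead of indexed.
    fn apply_to(&self, bitset: &mut [bool]) {
        let guard = self.offsets.read().unwrap();
        for &off in guard.iter() {
            if off < bitset.len() {
                bitset[off] = true;
            }
        }
    }
}
```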
Bingyi Sun	be4d09561b	fix: Fix missing null or non-exist key in json index (#40336 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-05 11:48:02 +08:00
Bingyi Sun	7040ba1c12	enhance: make json path index support term filter (#40140 ) issue: #35528 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-04 11:56:02 +08:00
Zhen Ye	8eb662b4dc	enhance: add more metrics for async cgo component (#40136 ) issue: #40014 Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-03 09:56:03 +08:00
sre-ci-robot	6a57a1973f	[automated] Update Knowhere Commit (#40283 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-03-03 01:11:58 +08:00
zhagnlu	7a17fb68ec	enhance: add monitor metric for retrieve raw data (#40141 ) #40078 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-02 18:30:01 +08:00
zhagnlu	8c19e5c4a7	enhance: decrease delete record dump snapshot limit (#40101 ) #40100 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-03-02 17:55:59 +08:00
Micka	5cc104b412	fix: Change CMake variable for switch to knowhere-cuvs (#40105 ) issue: https://github.com/milvus-io/milvus/issues/39883 Signed-off-by: Mickael Ide <mide@nvidia.com>	2025-02-27 22:05:58 +08:00
Chun Han	259f9106ad	enhance: refine variable-length-type memory usage(#38736 ) (#39578 ) related: #38736 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-02-27 21:13:58 +08:00
sre-ci-robot	b2769fb357	[automated] Update Knowhere Commit (#40223 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-27 01:35:59 +08:00
Spade A	476cf61d98	fix: random sample consider empty input (#40201 ) issue: #40198 Fix random sample does not consider empty input, that is no data is hit by filter expression. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-26 16:15:58 +08:00
Bingyi Sun	f05e9628f6	fix: Fix search failure of null expression (#40129 ) issue: #40095 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-25 20:43:55 +08:00
Bingyi Sun	db4769281c	fix: Fall back to a brute-force search if json index type unmatched (#40076 ) issue: https://github.com/milvus-io/milvus/issues/35528 If the query data type does not match the index type, fall back to a brute-force search --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-24 16:25:57 +08:00
aoiasd	38f1608910	enhance: pack analyzer code and support lindera tokenizer (#39660 ) relate: https://github.com/milvus-io/milvus/issues/39659 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-02-24 12:13:55 +08:00
sre-ci-robot	dd1347d041	[automated] Update Knowhere Commit (#40103 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-22 01:01:53 +08:00
sthuang	3eb3af5f08	feat: explicitly specify column groups for storage v2 api (#39790 ) * use the new packed reader and writer api to be compatible with current etcd meta * For the new packed writer API: column groups and paths are explicitly defined by users and won't split column groups by memory in storage v2. Packed writer follows the user-defined column groups to split arrow record and write into the corresponding file path. * For the new packed reader API: read paths are explicitly defined by users. related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-02-21 22:03:54 +08:00
yihao.dai	2a037a97f1	enhance: Add get vector latency metric and refine request limit error message (#40083 ) issue: https://github.com/milvus-io/milvus/issues/40078 Signed-off-by: bigsheeper <yihao.dai@zilliz.com>	2025-02-21 19:41:55 +08:00
Spade A	d34d70582d	fix: fix misleading name _add_multi_ (#39997 ) fix: #39995 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-21 16:45:55 +08:00
sre-ci-robot	f0d3d98c3f	[automated] Update Knowhere Commit (#40063 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-21 01:19:54 +08:00
Patrick Weizhi Xu	04fff74a56	feat: introduce Text data type (#39874 ) issue: https://github.com/milvus-io/milvus/issues/39818 This PR mimics Varchar data type, allows insert, search, query, delete, full-text search and others. Functionalities related to filter expressions are disabled temporarily. Storage changes for Text data type will be in the following PRs. Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2025-02-19 11:04:51 +08:00
Spade A	52c7d7dd80	fix: offset combined with term should be based on Token positions in phrase match (#39931 ) fix: #39711 Unlike an English sentence, where each word is parsed exactly once, one after another, with position length 1, one Chinese word may be parsed into multiple tokens with position length larger than 1. For example, "badminton and skiing" will be parsed to Token{ start: 0, length: 1, text: "badminton" }, Token{ start: 1, length: 1, text: "and" }, and Token{ start: 2, length: 1, text: "skiing" }. For Chinese, by contrast, "羽毛球和滑雪" may be parsed to Token{ start: 0, length: 2, text: "羽毛" }, Token{ start: 0, length: 3, text: "羽毛球" }, Token{ start: 3, length: 1, text: "和" }, and Token{ start: 4, length: 2, text: "滑雪" }. This PR fixes the code, which did not recognize this situation. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-18 20:38:51 +08:00
congqixia	59881a7f73	fix: Remove load field & schema column size check (#39833 ) Related to #39788 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-18 16:24:51 +08:00
Spade A	0dc21f0aeb	feat: support random sample (#39532 ) issue: #39541 This PR implements random sample, the syntax is: ``` filter="random_sample(factor)" or filter="boolean_expression && random_sample(factor)" where factor is a float in the open interval (0, 1) and boolean_expression is like "1 <= number < 10", "color in ["red", "blue"]" or others ``` --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com> Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-18 12:40:50 +08:00
zhagnlu	316534e065	enhance: optimize delete init construct code (#39327 ) #39326 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-02-17 21:05:26 +08:00
congqixia	7ccde3300e	fix: Use `text_log` prefix for TextMatchIndex null offset file (#39935 ) Related to #39933 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-02-17 20:17:25 +08:00
zhagnlu	8a9f02ef71	enhance: optimize expr performance for some points (#39695 ) 1. skip getting expr arguments, which deserializes proto for every batch execution. 2. replace unordered_set with a sorted array, which has better performance for small sets. #39688 Co-authored-by: luzhang <luzhang@zilliz.com>	2025-02-16 20:32:14 +08:00
sre-ci-robot	61cc22354e	[automated] Update Knowhere Commit (#39898 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-16 01:32:13 +08:00
Bingyi Sun	b59555057d	feat: support json index (#36750 ) https://github.com/milvus-io/milvus/issues/35528 This PR adds json index support for json and dynamic fields. For now, this index only supports unary queries like 'a["b"] > 1'. More filter types will be supported later. basic usage: ``` collection.create_index("json_field", {"index_type": "INVERTED", "params": {"json_cast_type": DataType.STRING, "json_path": 'json_field["a"]["b"]'}}) ``` There are some limits when using this index: 1. If a record does not have the json path you specify, it will be ignored and there will not be an error. 2. If a value of the json path fails to be cast to the type you specify, it will be ignored and there will not be an error. 3. A specific json path can have only one json index. 4. If you try to create more than one json index for one json field, sdk(pymilvus<=2.4.7) may return immediately because of internal implementation. This will be fixed in a later version. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-15 14:06:15 +08:00
Spade A	f7d9587720	enhance: add tantivy collector for i64 (#39850 ) issue: #39852 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-14 15:50:15 +08:00
sre-ci-robot	ba03a435fb	[automated] Update Knowhere Commit (#39878 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-14 15:18:21 +08:00
aoiasd	24d2bbc441	enhance: unmarshal ts msg in dispatcher instead of in msgstream (#38656 ) relate: https://github.com/milvus-io/milvus/issues/38655 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-02-14 12:04:13 +08:00
cai.zhang	9e6e477c5d	fix: Fix modulo for long type (#39722 ) issue: #39640 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-02-11 20:04:46 +08:00
Bingyi Sun	c13fc8cd19	enhance: update tantivy version (#39253 ) https://github.com/milvus-io/milvus/issues/39254 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-08 14:08:43 +08:00
sre-ci-robot	ba312427f2	[automated] Update Knowhere Commit (#39696 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-08 01:36:43 +08:00
sparknack	2d9bef44d4	fix: sparse: add inverted_index_algo and dim_max_score_ratio config (#39358 ) issue: #39332 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-02-07 16:40:44 +08:00
Gao	c1794cc490	enhance: update knowhere version and IsAdditionalScalarSupported interface (#39573 ) Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-02-05 19:51:10 +08:00
sthuang	c4ae9f4ece	feat: introduce third-party milvus-storage (#39418 ) related: https://github.com/milvus-io/milvus/issues/39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-01-24 17:21:13 +08:00
Cai Yudong	5730b69e56	feat: Enable more VECTOR_INT8 unittest (#39569 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-24 17:03:07 +08:00
zhagnlu	8117d59f85	fix:fix GetValueFromConfig for bool type (#39526 ) #39525 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-01-24 16:17:05 +08:00
congqixia	844df76cc0	enhance: Rectify run_clang_format grep command (#39534 ) Previously the grep with regex does not work and failed to match lots of .cpp files This PR: - use "-E" flag to use regex match - commit the fixed result of current cpp code Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-23 17:07:05 +08:00
Spade A	547c686027	fix: fix assignment operator in AssertInfo to comparison operator (#39347 ) fix: #39346 Remove the problem line as it's redundant. --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-23 14:23:18 +08:00
Cai Yudong	341d6c1eb7	feat: Update segcore for VECTOR_INT8 (#39415 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-21 11:03:03 +08:00
Bingyi Sun	140c5a0a75	enhance: add unit test for string pk (#39329 ) https://github.com/milvus-io/milvus/issues/39107 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-20 19:03:04 +08:00
congqixia	45d49df89b	fix: Skip load extra indexes for sorted segment pk field (#39389 ) Related to #39339 Extra indexes can be ignored for most cases since sorted pk column already provided indexing features --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-20 18:13:15 +08:00
Bingyi Sun	cb959cd1f9	enhance: upgrade rust version to 1.83 (#39295 ) #39254 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-20 11:15:03 +08:00
Gao	1a680c29e2	fix: correct remote centroids path in clustering compaction (#39398 ) issue: https://github.com/milvus-io/milvus/issues/39353 The path was modified unintentionally, change it back. Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-01-20 10:59:10 +08:00
sre-ci-robot	fdb968d0ea	[automated] Update Knowhere Commit (#39420 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-01-20 01:17:02 +08:00
Cai Yudong	5b35fc700d	enhance: [skip-e2e] Use template to remove duplicate unittest (#39396 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-18 10:33:01 +08:00
congqixia	7cac87caca	fix: Skip erase field if index build on PK field (#39370 ) Related to #39339 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-17 20:31:02 +08:00
Cai Yudong	64feeb0e2b	enhance: Rename API GenDataset to GenFieldData in unittest (#39386 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-17 15:55:03 +08:00
Ted Xu	9209a70bb6	fix: clang format broken under osx (#38427 ) See: #38434 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-01-17 10:43:03 +08:00
Spade A	0461ddf776	fix: phrase match does not support offset input (#39338 ) fix: #39337 Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-16 22:05:01 +08:00
Gao	75d7978a18	enhance: pass partition key scalar info if enable for vector mem index (#39123 ) issue: #34332 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-01-16 14:33:03 +08:00
Spade A	8c4ba70a4c	fix: enable to build index with single segment (#39233 ) fix https://github.com/milvus-io/milvus/issues/39232 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-16 11:01:06 +08:00
congqixia	eb63334312	enhance: Add try-catch and return CStatus for NewCollection (#39279 ) Related to #28795 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-15 19:17:01 +08:00
sre-ci-robot	55dcac375c	[automated] Update Knowhere Commit (#39263 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-01-15 02:52:59 +08:00
Cai Yudong	5bf1b2b929	feat: Support Int8Vector in go (#38990 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-14 20:43:06 +08:00
congqixia	da1b786ef8	enhance: Utilize "find0" in segment.find_first (#39229 ) Related to #39003 Previous PR #39004 has to clone & flip bitset due to bitset does not support find0 operator. #39176 added this feature so clone & flip could be removed now. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-14 14:14:58 +08:00
Zhen Ye	3e788f0fbd	enhance: record memory size (uncompressed) item for index (#38770 ) issue: #38715 - Milvus currently uses the serialized (compressed) index size to estimate resources for loading. - Add a new field `MemSize` (size before compression) for index to estimate resources. --------- Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-14 10:33:06 +08:00
Buqian Zheng	5e38f01e5b	enhance: update knowhere version (#39212 ) Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-01-14 10:21:05 +08:00
Alexander Guzhva	3447ff7310	enhance: [bitset] extend op_find() to be able to search both 0 and 1 (#39176 ) issue: #39124 `bitset::find_first()` and `bitset::find_next()` now accept one more parameter, which allows searching for a `0` bit instead of a `1` bit Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>	2025-01-14 09:50:58 +08:00
Bingyi Sun	a00ba861a4	fix: Fix in filter search result is empty if pk type is varchar (#39106 ) https://github.com/milvus-io/milvus/issues/39107 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-13 16:14:58 +08:00
smellthemoon	accc9e7fbf	fix: fail to get empty index num rows (#39155 ) #39125 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-01-13 16:04:58 +08:00
Zhen Ye	5f94954bb4	fix: data race when accessing field_ when retrieving (#39151 ) issue: #39148 Signed-off-by: chyezh <chyezh@outlook.com>	2025-01-13 11:23:04 +08:00
Buqian Zheng	640a49ffb6	fix: fix chunk cache madvise when sparse raw data is mmaped (#39145 ) instead of marking as not supported, `ChunkedSparseFloatColumn::DataByteSize` can simply use the impl of super class. issue: https://github.com/milvus-io/milvus/issues/39158 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-01-13 10:34:57 +08:00
Cai Yudong	2a02bbe3ee	enhance: Use template to remove unittest duplication (#39144 ) Issue: #38666 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-13 09:58:57 +08:00
Spade A	032292a432	feat: support phrase match query (#38869 ) The relevant issue: https://github.com/milvus-io/milvus/issues/38930 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-12 20:24:58 +08:00
Cai Yudong	d6206ad2de	fix: Remove duplicated Macro definition (#39076 ) Issue: #39102 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-09 15:26:56 +08:00
Spade A	8abf6c9149	fix: build text index when loading field data (#39070 ) fix: https://github.com/milvus-io/milvus/issues/39053 may fix https://github.com/milvus-io/milvus/issues/38644 which could be caused by https://github.com/milvus-io/milvus/issues/39053 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-09 15:24:56 +08:00
Gao	f0dae81494	fix: set iterative filter hint to false when no expr specified (#39033 ) issue: https://github.com/milvus-io/milvus/issues/39013 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-01-08 12:56:56 +08:00
Ted Xu	3dc95153b7	fix: build break under debug mode (#38790 ) See #38435 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2025-01-07 17:36:56 +08:00
congqixia	182cac03e5	enhance: Use bitset or instead of bitwise set (#39037 ) Related to #39003 Copying bitset value bit by bit is slow and CPU heavy, this PR utilizes bitset operator "|=" to accelerate this procedure Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-07 15:02:56 +08:00
Cai Yudong	84f8047a86	fix: Fix Milvus build error (#39008 ) Issue: #39005 Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>	2025-01-07 14:22:56 +08:00
Chun Han	3739446a33	enhance: refine array view to optimize memory usage(#38736 ) (#38808 ) related: #38736 700m data, array_length=10 non-mmap_offsets_uint64: 2.0G mmap_offsets_uint64: 1.1G mmap_offsets_uint32: 880MB Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2025-01-07 13:26:55 +08:00
congqixia	72f5b85c05	enhance: Accelerate `find_first` by utilizing bitset simd methods (#39004 ) Related to #39003 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-01-07 10:34:54 +08:00
zhagnlu	8165044b6d	fix: fix query incorrect in case of concurrent delete (#38991 ) #38961 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-01-06 15:14:54 +08:00
Bingyi Sun	f0cddfd160	fix: Fix panic caused by removing directory (#38622 ) https://github.com/milvus-io/milvus/issues/38604 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-06 10:54:54 +08:00
sre-ci-robot	11bfc93683	[automated] Update Knowhere Commit (#38993 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-01-04 01:16:53 +08:00
foxspy	af08b5b311	enhance: Update Knowhere version (#38942 ) Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-01-03 14:28:53 +08:00
Spade A	4245c5bed1	fix: text match panics when enable_match is set to false (#38950 ) fix: https://github.com/milvus-io/milvus/issues/38949 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-03 14:20:55 +08:00
Bingyi Sun	aa0a87eda7	fix: Block warmup submit if pool full in sync mode (#38690 ) https://github.com/milvus-io/milvus/issues/38692 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-02 15:04:58 +08:00
smellthemoon	907fc24f85	enhance: support null expr (#38772 ) #31728 --------- Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2025-01-02 14:16:54 +08:00
Bingyi Sun	3822819942	enhance: Remove an undefined behavior in index writer (#38657 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-31 10:42:52 +08:00
cai.zhang	ba3c2e6fb1	fix: Only generate the index_null_offset file when the field support null value (#38833 ) issue: #38832 Signed-off-by: cai.zhang <cai.zhang@zilliz.com>	2024-12-30 18:02:52 +08:00
Bingyi Sun	2557e3f2a9	enhance: Initialize field id to avoid negative number (#38789 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-27 18:00:50 +08:00
congqixia	19052ef3e5	enhance: Add buffered writer to reduce fwrite syscall (#38570 ) Related to previous PR #38157 If a mmapped row is too small, frequent fwrite calls still cost too much CPU time in context switching. This PR adds a buffered writer, with an extra buffer per variable field, to avoid this bad case. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-27 12:20:50 +08:00
Patrick Weizhi Xu	85f462be1a	enhance: speed up search iterator stage 1 (#37947 ) issue: #37548 Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>	2024-12-26 10:32:49 +08:00
aoiasd	bc15ad24f2	fix: sealed segment get empty index params when brute force search for bm25 (#38707 ) relate: https://github.com/milvus-io/milvus/issues/38236 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-12-25 19:06:51 +08:00
Gao	363d7f31ef	fix: report error when hints not supported (#38717 ) issue: #38705 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2024-12-25 19:02:56 +08:00
aoiasd	c7ea09a8be	enhance: return exception type name when segcore returns unknown exception (#38326 ) relate: https://github.com/milvus-io/milvus/issues/38265 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-12-25 18:58:50 +08:00
Ted Xu	acc8fb7af6	enhance: eliminate compile warnings (part2) (#38535 ) See #38435 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-12-25 15:30:50 +08:00
sre-ci-robot	407035c994	[automated] Update Knowhere Commit (#38641 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-22 00:44:46 +08:00
sre-ci-robot	cce25ecdbc	[automated] Update Knowhere Commit (#38635 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-21 00:50:46 +08:00
zhagnlu	8fcb33c21d	fix:fix delete record assert failed (#38580 ) #38472 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-19 18:22:47 +08:00
Zhen Ye	b537a72309	fix: inverted index out of range (#38577 ) issue: #38546, #38486 Signed-off-by: chyezh <chyezh@outlook.com>	2024-12-19 15:20:47 +08:00
foxspy	06d410b70f	enhance: update knowhere version (#38544 ) related: #37730 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-12-18 16:52:45 +08:00
zhagnlu	87056be748	fix: fix snapshot or size when query (#38549 ) #38472 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-18 16:42:45 +08:00
sre-ci-robot	ffd3c5d2f5	[automated] Update Knowhere Commit (#38542 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-18 01:24:47 +08:00
Chun Han	decdfdae10	fix: growing-groupby-crush(#38533 ) (#38538 ) related: #38533 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-12-17 21:05:12 +08:00
Bingyi Sun	f0096ec292	fix: Fix IsMmapSupported for scalar index (#38135 ) https://github.com/milvus-io/milvus/issues/38134 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-17 20:30:44 +08:00
zhagnlu	9afcc5bc5c	fix:fix incorrect dir operations when create or load inverted index (#38359 ) #37944 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-17 20:06:45 +08:00
zhagnlu	d0a7e98a27	fix:remove incorrect assert for delete query (#38509 ) #38472 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-17 17:48:44 +08:00
Bingyi Sun	dd4f33ae19	fix: Fix chunked segment can not warmup using mmap (#38492 ) issue: #38410 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-17 13:42:45 +08:00
Ted Xu	33aecb0655	fix: build break on target test_cpp under OSX (#38479 ) See: #38434 Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-12-17 13:38:45 +08:00
Bingyi Sun	3e2a2f278b	enhance: Handle rust error in c++ (#38113 ) https://github.com/milvus-io/milvus/issues/37930 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-16 19:40:45 +08:00
Ted Xu	4919ccf543	enhance: eliminate compile warnings (#38420 ) See: #38435 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-12-16 09:58:43 +08:00
zhagnlu	01de0afc4e	enhance: refactor delete mvcc function (#38066 ) #37413 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-15 18:02:43 +08:00
zhagnlu	6ea15265e1	enhance: add file info log when mmap failed. (#38386 ) #37944 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-15 17:36:43 +08:00
sre-ci-robot	1e274384cd	[automated] Update Knowhere Commit (#38458 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-14 00:54:43 +08:00
Chun Han	c1f9158996	fix: search-group-by failed to get data from multi-chunked-segment(##… (#38383 ) related: #38343 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-12-13 16:54:43 +08:00
Ted Xu	3038383e36	fix: UT compile broken under osx (#38432 ) See: #38434 --------- Signed-off-by: Ted Xu <ted.xu@zilliz.com>	2024-12-13 16:24:43 +08:00
zhagnlu	efbfa1cc3e	fix:fix ut failed for debug (#38384 ) #38382 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-13 14:38:43 +08:00
sre-ci-robot	e404123e3e	[automated] Update Knowhere Commit (#38422 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-13 02:40:44 +08:00
cqy123456	b14a0c4bf5	fix:GrowingDataGetter get the wrong string data (#38015 ) issue: https://github.com/milvus-io/milvus/issues/37994 2.4 pr: https://github.com/milvus-io/milvus/pull/37995 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-12-12 14:50:42 +08:00
Gao	994fc544e7	enhance: support iterative filter execution (#37363 ) issue: #37360 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2024-12-11 11:32:44 +08:00
zhagnlu	9ef76971ce	fix:add more info to local chunk manager log (#38357 ) #37944 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-11 10:24:49 +08:00
zhagnlu	32f575be0f	enhance: change bitmap index mmap mode to view mode (#38179 ) #38138 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-12-08 17:26:41 +08:00
Xianhui Lin	6d0a4fdb31	fix: Fix bug for Search fails with filter expression contains underscore (#38085 ) Enhance the matching for elements within the UnaryRangeArray https://github.com/milvus-io/milvus/issues/38068 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com>	2024-12-05 10:18:40 +08:00
tinswzy	262f6db3d8	enhance: Add mmap file usage metric (#38193 ) issue: #38156 Add mmap file usage metric Signed-off-by: tinswzy <zhenyuan.wei@zilliz.com>	2024-12-04 16:12:47 +08:00
aoiasd	87aa9a0f2d	fix: empty analyzer params not use standard tokenizer (#38148 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-12-04 14:58:39 +08:00
sre-ci-robot	3445b8bd44	[automated] Update Knowhere Commit (#38192 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-04 02:30:39 +08:00
congqixia	767b7e6218	enhance: Use fdopen, fwrite to reduce direct syscall (#38157 ) `File.Write` and `File.WriteInt` use `write`, which may be just a direct syscall on some systems. When mapping field data and writing it line by line, this can cost a lot of CPU time when the row number is large. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-12-03 15:24:39 +08:00
Bingyi Sun	90064cd47b	fix: Fix variable redeclaration in term filter (#38045 ) https://github.com/milvus-io/milvus/issues/38046 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-02 15:10:38 +08:00
Zhen Ye	c6dcef7b84	enhance: move segcore codes of segment into one package (#37722 ) issue: #33285 - move most cgo operations related to search/query into the segcore package for reuse by streamingnode. - add go unittest for segcore operations. Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-29 10:22:36 +08:00
sre-ci-robot	0894ed0016	[automated] Update Knowhere Commit (#38082 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-11-29 01:08:36 +08:00
Bingyi Sun	e6af806a0d	enhance: optimize self defined rust error (#37975 ) Prepare for issue: https://github.com/milvus-io/milvus/issues/37930 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-28 20:30:36 +08:00
congqixia	cb6542339e	enhance: Mark cgo thread with tag name (#38000 ) Related to #37999 This PR add `SetThreadName` API for marking cgo thread and utilize it when initializing cgo worker. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-26 11:22:35 +08:00
Zhen Ye	fbb68ca370	enhance: make all index operation async scheduled by tokio (#37946 ) issue: #37851 related pr: https://github.com/milvus-io/tantivy/pull/3 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-25 10:12:34 +08:00
sre-ci-robot	ed73dfca3f	[automated] Update Knowhere Commit (#37965 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-11-25 01:28:32 +08:00
zhagnlu	62af24c1a1	fix: change search latency metric from us unit to ms unit (#37806 ) #37805 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-11-24 17:26:33 +08:00
Bingyi Sun	700a448a54	fix: Escape prefix before search in inverted index (#37925 ) issue: https://github.com/milvus-io/milvus/issues/37912 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-22 14:10:33 +08:00
Bingyi Sun	06d73cf2e2	enhance: Remove raw tokenizer register. (#37886 ) tantivy already register raw tokenizer by default Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-22 12:02:32 +08:00
Zhen Ye	1dc1a97e65	fix: use different thread pool for scheduler and merger (#37911 ) issue: #37895 related pr: https://github.com/milvus-io/tantivy/pull/2 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-21 21:34:33 +08:00
Zhen Ye	f3a36f8a29	fix: use global pool but not dedicated pool for every index (#37852 ) issue: #37851 - make a global thread pool at tantivy temporarily. - set 1 but not 4 threads for inverted text index. Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-20 20:44:32 +08:00
cqy123456	8216345b07	enhance: reduce copy of bitset and id conversion of brute-force search (#37675 ) issue: https://github.com/milvus-io/milvus/issues/37798 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2024-11-19 15:48:40 +08:00
Bingyi Sun	6b82320953	fix: Fix using wrong upperbound when searching by pk (#37769 ) issue: https://github.com/milvus-io/milvus/issues/37649 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-19 10:16:31 +08:00
smellthemoon	3d28d99411	fix: to use the correct offset in span (#37780 ) #37734 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-18 21:56:30 +08:00
aoiasd	16e206167c	enhance: analyzer length filter max should be a closed interval instead of an open interval (#37770 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-18 19:30:31 +08:00
aoiasd	e9391acf80	fix: bm25 brute force search need index params k1 and b (#37721 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-18 15:44:31 +08:00
Zhen Ye	3f1614e9d9	enhance: add trace_id into segcore logs (#37656 ) issue: #37655 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-18 10:20:30 +08:00
aoiasd	3b5a0df159	enhance: Optimize chinese analyzer and support CnAlphaNumFilter (#37727 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-16 10:28:30 +08:00
foxspy	0ba868ae64	enhance: update knowhere version (#37730 ) release note draft : https://github.com/zilliztech/knowhere/releases/tag/v2.5.0 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-16 10:08:30 +08:00
smellthemoon	7999367c0c	fix: use not retried err when get wrong parameter (#37707 ) #37508 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-15 19:14:30 +08:00
zhagnlu	e4b6773d0a	fix: fix create text index dir conflict bug (#37693 ) #37623 Signed-off-by: luzhang <luzhang@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2024-11-15 18:26:30 +08:00
Bingyi Sun	65d3c6622a	enhance: Optimize GetChunkIDByOffset and add ut (#37704 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-15 14:16:31 +08:00
Bingyi Sun	d1596297d9	fix: Fix query failure with inverted index (#37686 ) https://github.com/milvus-io/milvus/issues/37649 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-15 10:28:31 +08:00
foxspy	5ae347aba0	enhance: update knowhere version (#37688 ) issue: #37665 #37631 #37620 #37587 #36906 knowhere has added a default nlist value, so some invalid-param UT cases with no nlist param will now be valid. Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-15 10:10:31 +08:00
Bingyi Sun	1b4f7e3ac1	enhance: Add more expr ut for chunked segment (#37600 ) related pr: #37570 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-14 18:40:32 +08:00
aoiasd	993051bb49	fix: brute force bm25 search lack avgdl param (#37650 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-14 14:58:31 +08:00
Buqian Zheng	0565300b7f	fix: Sparse to use CC index as growing/temp index (#37591 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-11-14 10:54:31 +08:00
aoiasd	1c5b5e1e3d	feat: Add chinese and english analyzer with refactor jieba tokenizer (#37494 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-14 10:34:31 +08:00
foxspy	cf883b114e	enhance: update knowhere version (#37510 ) issue: #36925 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-13 16:36:27 +08:00
smellthemoon	3389a6b500	enhance: support null in text match index (#37517 ) #37508 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-13 11:08:29 +08:00
Zhen Ye	3c225e5c94	fix: data race when using fields_ (#37612 ) issue: #37609 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-13 04:06:30 +08:00
Chun Han	2d29dcd30c	enhance:refine group_strict_size parameter(#37482 ) (#37483 ) related: #37482 Signed-off-by: MrPresent-Han <chun.han@gmail.com> Co-authored-by: MrPresent-Han <chun.han@gmail.com>	2024-11-12 09:56:28 +08:00
Bingyi Sun	c1eccce2fa	enhance: enable multiple chunked segment by default (#37570 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-12 09:20:28 +08:00
aoiasd	12951f0abb	enhance: rename tokenizer to analyzer and check analyzer params (#37478 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-10 16:12:26 +08:00
Bingyi Sun	40ba5a3414	fix: fix chunked segment term filter expression and add ut (#37392 ) issue: https://github.com/milvus-io/milvus/issues/37143 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-07 11:04:19 -08:00
congqixia	5310d3469f	fix: Escape brace of dumped JSON for index err message (#37504 ) Related to #37503 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2024-11-07 18:58:25 +08:00
smellthemoon	9b6dd23f8e	fix: wrong path spelling when use rootpath in segcore (#37453 ) #36532 Signed-off-by: lixinguo <xinguo.li@zilliz.com> Co-authored-by: lixinguo <xinguo.li@zilliz.com>	2024-11-07 11:22:25 +08:00
aoiasd	d67853fa89	feat: Tokenizer support build with params and clone for concurrency (#37048 ) relate: https://github.com/milvus-io/milvus/issues/35853 https://github.com/milvus-io/milvus/issues/36751 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-06 17:48:24 +08:00
cai.zhang	625b6176cd	fix: Search for pk using raw data to reduce the overhead caused by views (#37202 ) issue: #37152 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2024-11-05 20:36:24 +08:00

2126 Commits (master)