milvus

Commit Graph

Author	SHA1	Message	Date
Bingyi Sun	fbf5cb4e62	feat: Add json flat index (#39917 ) issue: https://github.com/milvus-io/milvus/issues/35528 This PR introduces a JSON flat index that allows indexing JSON fields and dynamic fields in the same way as other field types. In a previous PR (#36750), we implemented a JSON index that requires specifying a JSON path and casting a type. The only distinction lies in the json_cast_type parameter. When json_cast_type is set to JSON type, Milvus automatically creates a JSON flat index. For details on how Tantivy interprets JSON data, refer to the [tantivy documentation](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#pitfalls-limitation-and-corner-cases). Limitations Array handling: Arrays do not function as nested objects. See the [limitations section](https://github.com/quickwit-oss/tantivy/blob/main/doc/src/json.md#arrays-do-not-work-like-nested-object) for more details. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-06-10 19:14:35 +08:00
cqy123456	317bbfbf81	enhance: milvus support minhash vector and mhjaccard metric (#42036 ) issue: https://github.com/issues/assigned?issue=milvus-io%7Cmilvus%7C41746 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-06-10 14:38:34 +08:00
aoiasd	fd6e2b52ff	enhance: use english name as language name for all type language identifier (#42600 ) Set whatlang detect return language name as english name. Make sure same with lingua. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-10 10:24:35 +08:00
aoiasd	6e16653597	fix: update tantivy commit version to fix stemmer panic (#42171 ) relate: https://github.com/milvus-io/milvus/issues/42168 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-09 10:34:33 +08:00
foxspy	3dbad0306a	fix: Add bypass thread pool mode to avoid growing indexes blocking insert/load (#41012 ) issue: #40825 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-20 14:30:24 +08:00
congqixia	a22088a380	enhance: [StorageV2] Make packed reader use correct path (#41919 ) Related to #39173 This PR - Use updated path with bucketName for packedReader - Update milvus-storage commit to report reader/writer initialization failure, see also milvus-io/milvus-storage#192 --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-20 10:36:23 +08:00
congqixia	3bbc0fa560	enhance: [StorageV2] update storage to pass endpoint as-is (#41889 ) Related to milvus-io/milvus-storage#190 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-16 18:06:21 +08:00
Buqian Zheng	b0260d8676	feat: manual evict cache after built interim index (#41836 ) issue: https://github.com/milvus-io/milvus/issues/41435 this PR also makes HasRawData of ChunkedSegmentSealedImpl to return based on metadata, without needing to load the cache just to answer this simple question. --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-05-16 16:34:23 +08:00
congqixia	a6d09ff4cd	enhance: [StorageV2] fix issues integrating basic RW operations (#41834 ) Related to #39173 This PR: - Upgrade milvus-storage commit to fix filesystem finalized issue - Add bucket-name as prefix for all fs style access io - Initial arrow fs on querynodes startup - Fix timestamp access when loading sealed segment --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-05-15 09:52:23 +08:00
foxspy	358bc150df	enhance: add force rebuild index configuration (#41473 ) issue: #41431 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-14 10:52:21 +08:00
foxspy	e2ddbe4962	feat: add cachinglayer to index (#41653 ) issue: #41435 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-05-08 10:12:54 +08:00
Bingyi Sun	0dee3ccfd7	enhance: Make user specified doc id selectable for tantivy index writer (#41528 ) issue: https://github.com/milvus-io/milvus/issues/41527 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-05-07 10:48:53 +08:00
foxspy	1d99f8bd67	enhance: add force rebuild index configuration (#41473 ) issue: #41431 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-04-29 16:20:56 +08:00
Spade A	910f68c986	fix: update tantivy to fix tantivy doc out of order when merge (#41596 ) issue: #41597 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-29 13:46:49 +08:00
Spade A	f35e8f7420	fix: fix arm64 compile issue (#41593 ) issue: https://github.com/milvus-io/milvus/issues/41059, https://github.com/milvus-io/milvus/issues/41510 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-29 13:19:25 +08:00
cai.zhang	640f526301	fix: Update current scalar index version to compatible tantivy different versions (#41141 ) issue: #40823 Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>	2025-04-27 20:44:39 +08:00
Spade A	f3d878ab3f	fix: update tantivy for fixing phrase match (#41450 ) issue: #41454 https://github.com/zilliztech/tantivy/pull/8 fixes the problem, this PR update the tantivy. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-24 10:52:37 +08:00
aoiasd	a16bd6263b	feat: support more languages for built-in stop words and add remove punct, regex filter (#41412 ) relate: https://github.com/milvus-io/milvus/issues/41213 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-23 11:44:37 +08:00
aoiasd	11f2fae42e	feat: support extend default dict for jieba tokenizer (#41360 ) relate: https://github.com/milvus-io/milvus/issues/41213 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-22 20:34:37 +08:00
aoiasd	110c5aaaf4	feat: support icu and language identifier tokenizer (#41214 ) relate: https://github.com/milvus-io/milvus/issues/41213 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-22 15:56:37 +08:00
aoiasd	f166843c5e	enhance: support use lindera tag filter (#40416 ) relate: https://github.com/milvus-io/milvus/issues/39659 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-21 15:56:36 +08:00
Spade A	5b1430f27e	enhance: tantivy collector set bitset directly (#39748 ) fix: #39755 The following shows a simple benchmark where insert 1M docs where all rows are "hello", the latency is segcore level, CPU is 9900K: master: 2.62ms this PR: 2.11ms bench mark code: ``` TEST(TextMatch, TestPerf) { auto schema = GenTestSchema({}, true); auto seg = CreateSealedSegment(schema, empty_index_meta); int64_t N = 1000000; uint64_t seed = 19190504; auto raw_data = DataGen(schema, N, seed); auto str_col = raw_data.raw_->mutable_fields_data() ->at(1) .mutable_scalars() ->mutable_string_data() ->mutable_data(); for (int64_t i = 0; i < N - 1; i++) { str_col->at(i) = "hello"; } SealedLoadFieldData(raw_data, *seg); seg->CreateTextIndex(FieldId(101)); auto now = std::chrono::high_resolution_clock::now(); auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch); auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP); auto end = std::chrono::high_resolution_clock::now(); auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - now); std::cout << "TextMatch query time: " << duration.count() << "ms" << std::endl; } ``` --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-20 23:02:41 +08:00
Spade A	62293cb582	fix: revert batch add (#41374 ) issue: #41375 todo: to fix the problems fixed in the issue. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-17 22:32:38 +08:00
sthuang	1f1c836fb9	feat: Storage v2 growing segment load (#41001 ) support parallel loading sealed and growing segments with storage v2 format by async reading row groups. related: #39173 --------- Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-16 17:14:33 +08:00
Spade A	70d13dcf61	enhance: update tantivy for removing "doc_id" fast field (#41198 ) Issue: #41210 After https://github.com/zilliztech/tantivy/pull/5, we can provide milvus row id directly to tantivy rather than record it in the fast field "doc_id". So rather than search tantivy doc id and then get milvus row id from "doc_id", now, the searched tantivy doc id is the milvus row id, eliminating the expensive acquiring row id phase. The following shows a simple benchmark where insert 1M docs where all rows are "hello", the latency is segcore level, CPU is 9900K: ![image](https://github.com/user-attachments/assets/d8e72134-56b5-430b-8628-36c3bed8eaad) The latency is 2.02 and 2.1 times respectively. bench mark code: ``` TEST(TextMatch, TestPerf) { auto schema = GenTestSchema({}, true); auto seg = CreateSealedSegment(schema, empty_index_meta); int64_t N = 1000000; uint64_t seed = 19190504; auto raw_data = DataGen(schema, N, seed); auto str_col = raw_data.raw_->mutable_fields_data() ->at(1) .mutable_scalars() ->mutable_string_data() ->mutable_data(); for (int64_t i = 0; i < N - 1; i++) { str_col->at(i) = "hello"; } SealedLoadFieldData(raw_data, *seg); seg->CreateTextIndex(FieldId(101)); auto now = std::chrono::high_resolution_clock::now(); auto expr = GetMatchExpr(schema, "hello", OpType::TextMatch); auto final = ExecuteQueryExpr(expr, seg.get(), N, MAX_TIMESTAMP); auto end = std::chrono::high_resolution_clock::now(); auto duration = std::chrono::duration_cast<std::chrono::microseconds>(end - now); std::cout << "TextMatch query time: " << duration.count() << "ms" << std::endl; } ``` --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-15 20:20:32 +08:00
Spade A	9ce3e3cb44	enhance: add documents in batch for json key stats (#41228 ) issue: https://github.com/milvus-io/milvus/issues/40897 After this, the document add operations scheduling duration is decreased roughly from 6s to 0.9s for the case in the issue. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-11 14:08:26 +08:00
Xianhui Lin	3bc24c264f	enhance: Add json key inverted index in stats for optimization (#38039 ) Add json key inverted index in stats for optimization https://github.com/milvus-io/milvus/issues/36995 --------- Signed-off-by: Xianhui.Lin <xianhui.lin@zilliz.com> Co-authored-by: luzhang <luzhang@zilliz.com>	2025-04-10 15:20:28 +08:00
Spade A	e9fa30f462	fix: remove single segment logic in V7 (#41159 ) Ref: https://github.com/milvus-io/milvus/issues/40823 It does not make any sense to create single segment tantivy index for old version such as 2.4 by using tantivy V7. So, clean the relevant code. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-09 19:54:27 +08:00
sthuang	50e02e3598	enhance: update packed reader api (#41055 ) related: https://github.com/milvus-io/milvus/issues/39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-04-09 10:18:26 +08:00
Spade A	c6a0c2ab64	enhance: process tantivy document add by batch (#40124 ) issue: https://github.com/milvus-io/milvus/issues/40006 This PR makes tantivy add documents in batches. Adding documents in batches can greatly reduce the latency of scheduling the document add operation (calling tantivy `add_document` only schedules the add operation and returns immediately once it is scheduled), because each call involves a tokio block_on, which is relatively heavy. Reducing the scheduling cost does not necessarily reduce the overall latency if the index writer threads do not process indexing quickly enough. But if scheduling itself is slow, then even if the index writer threads process indexing very fast (by increasing the thread number), the overall performance can still be limited. The following code benchmarks the PR (note: the duration only counts scheduling, without commit) ``` fn test_performance() { let field_name = "text"; let dir = TempDir::new().unwrap(); let mut index_wrapper = IndexWriterWrapper::create_text_writer( field_name, dir.path().to_str().unwrap(), "default", "", 1, 50_000_000, false, TantivyIndexVersion::V7, ) .unwrap(); let mut batch = vec![]; for i in 0..1_000_000 { batch.push(format!("hello{:04}", i)); } let batch_ref = batch.iter().map(|s| s.as_str()).collect::<Vec<_>>(); let now = std::time::Instant::now(); index_wrapper .add_data_by_batch(&batch_ref, Some(0)) .unwrap(); let elapsed = now.elapsed(); println!("add_data_by_batch elapsed: {:?}", elapsed); } ``` Latency roughly reduces from 1.4s to 558ms. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-08 19:50:24 +08:00
aoiasd	6f17720e4e	enhance: support use jieba tokenizer with custom dictionary (#39854 ) relate: https://github.com/milvus-io/milvus/issues/40168 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-04-08 14:52:27 +08:00
Spade A	e4da2765ba	enhance: process batch of strings within one tantivy_index_add_string call (#40007 ) issue: #40006 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-08 01:20:25 +08:00
Spade A	f552ec67dd	fix: support building tantivy index with low version(5) (#40822 ) fix: https://github.com/milvus-io/milvus/issues/40823 To solve the problem in the issue, we have to support building tantivy index with low version for those query nodes with low tantivy version. This PR does two things: 1. refactor codes for IndexWriterWrapper to make it concise 2. enable IndexWriterWrapper to build tantivy index by different tantivy crate --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-04-02 18:46:20 +08:00
aoiasd	384d39ef5a	enhance: not build lindera features by default and support make milvus with tantivy features (#40813 ) relate: https://github.com/milvus-io/milvus/issues/39659 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-03-27 14:08:22 +08:00
sthuang	d7df78a6c9	feat: Storage v2 compaction (#40667 ) - Feat: Support Mix compaction. Covering tests include compatibility and rollback ability. - Read v1 segments and compact with v2 format. - Read both v1 and v2 segments and compact with v2 format. - Read v2 segments and compact with v2 format. - Compact with duplicate primary key test. - Compact with bm25 segments. - Compact with merge sort segments. - Compact with no expiration segments. - Compact with lack binlog segments. - Compact with nullable field segments. - Feat: Support Clustering compaction. Covering tests include compatibility and rollback ability. - Read v1 segments and compact with v2 format. - Read both v1 and v2 segments and compact with v2 format. - Read v2 segments and compact with v2 format. - Compact bm25 segments with v2 format. - Compact with memory limit. - Enhance: Use serdeMap serialize in BuildRecord function to support all Milvus data types. related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-03-21 10:16:12 +08:00
aoiasd	92bdf7a0c1	enhance: support run analyzer return detailed token (#40458 ) relate: https://github.com/milvus-io/milvus/issues/39705 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-03-19 15:48:15 +08:00
Zhen Ye	8db708f67d	enhance: enable memory prof based on jemalloc (#40731 ) issue: #40730 also see: https://github.com/milvus-io/cgosymbolizer/pull/2 After these PR, at linux: - the milvus will always enable jemalloc by default. - jemalloc will always compiled with --enable-prof options. - all image will always enable the jemalloc prof by default. - a pprof http service for jemalloc at `/debug/jemalloc/` will be registered into restful. - `jeprof` can remote profile the memory of milvus. Signed-off-by: chyezh <chyezh@outlook.com>	2025-03-19 14:46:18 +08:00
Spade A	001fc992df	enhance: get doc ids by batch (#40608 ) issue: #40607 tantivy change: https://github.com/zilliztech/tantivy/pull/3 Benchmarks: Test Environment: CPU 9900K. The data is inserted by: ``` for i in 0..N { for j in 0..UNIQUE { let key = format!("hello{}", j); index_writer.add_string(&key, i * UNIQUE + j).unwrap(); } } ``` So UNIQUE influences the locality of the matched docs. The latency is the average latency over 1000 repeated queries. The result shows 22.5%-34.8% latency reduction. ![image](https://github.com/user-attachments/assets/dd8af75a-ddc3-445d-92df-50d354dd5645) --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-03-14 15:48:09 +08:00
Bingyi Sun	0698d04f7d	enhance: Upgrade simdjson version (#40538 ) issue: https://github.com/milvus-io/milvus/issues/40519 simdjson returns better error code in newer version. Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-03-11 15:04:05 +08:00
sre-ci-robot	a6d4121034	[automated] Update Knowhere Commit (#40486 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-03-10 12:28:04 +08:00
sre-ci-robot	6a57a1973f	[automated] Update Knowhere Commit (#40283 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-03-03 01:11:58 +08:00
Micka	5cc104b412	fix: Change CMake variable for switch to knowhere-cuvs (#40105 ) issue: https://github.com/milvus-io/milvus/issues/39883 Signed-off-by: Mickael Ide <mide@nvidia.com>	2025-02-27 22:05:58 +08:00
sre-ci-robot	b2769fb357	[automated] Update Knowhere Commit (#40223 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-27 01:35:59 +08:00
aoiasd	38f1608910	enhance: pack analyzer code and support lindera tokenizer (#39660 ) relate: https://github.com/milvus-io/milvus/issues/39659 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-02-24 12:13:55 +08:00
sre-ci-robot	dd1347d041	[automated] Update Knowhere Commit (#40103 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-22 01:01:53 +08:00
sthuang	3eb3af5f08	feat: explicitly specify column groups for storage v2 api (#39790 ) * use the new packed reader and writer api to be compatible with current etcd meta * For the new packed writer API: column groups and paths are explicitly defined by users and won't split column groups by memory in storage v2. Packed writer follows the user-defined column groups to split arrow record and write into the corresponding file path. * For the new packed reader API: read paths are explicitly defined by users. related: #39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-02-21 22:03:54 +08:00
Spade A	d34d70582d	fix: fix misleading name _add_multi_ (#39997 ) fix: #39995 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-21 16:45:55 +08:00
sre-ci-robot	f0d3d98c3f	[automated] Update Knowhere Commit (#40063 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-21 01:19:54 +08:00
Spade A	52c7d7dd80	fix: offset combined with term should be based on Token positions in phrase match (#39931 ) fix: #39711 Unlike an English sentence, where each word is parsed exactly once, one after another, with position length 1, one Chinese word may be parsed into multiple tokens with position length larger than 1. For example, "badminton and skiing" will be parsed to Token{ start: 0, length: 1, text: "badminton" }, Token{ start: 1, length: 1, text: "and" }, and Token{ start: 2, length: 1, text: "skiing" }. For Chinese, by contrast, "羽毛球和滑雪" may be parsed to Token{ start: 0, length: 2, text: "羽毛" }, Token{ start: 0, length: 3, text: "羽毛球" }, Token{ start: 3, length: 1, text: "和" }, and Token{ start: 4, length: 2, text: "滑雪" }. This PR fixes the code, which did not handle this situation. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-18 20:38:51 +08:00
sre-ci-robot	61cc22354e	[automated] Update Knowhere Commit (#39898 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-16 01:32:13 +08:00
Bingyi Sun	b59555057d	feat: support json index (#36750 ) https://github.com/milvus-io/milvus/issues/35528 This PR adds json index support for json and dynamic fields. Now you can only do unary query like 'a["b"] > 1' using this index. We will support more filter type later. basic usage: ``` collection.create_index("json_field", {"index_type": "INVERTED", "params": {"json_cast_type": DataType.STRING, "json_path": 'json_field["a"]["b"]'}}) ``` There are some limits to use this index: 1. If a record does not have the json path you specify, it will be ignored and there will not be an error. 2. If a value of the json path fails to be cast to the type you specify, it will be ignored and there will not be an error. 3. A specific json path can have only one json index. 4. If you try to create more than one json indexes for one json field, sdk(pymilvus<=2.4.7) may return immediately because of internal implementation. This will be fixed in a later version. --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-15 14:06:15 +08:00
Spade A	f7d9587720	enhance: add tantivy collector for i64 (#39850 ) issue: #39852 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-02-14 15:50:15 +08:00
sre-ci-robot	ba03a435fb	[automated] Update Knowhere Commit (#39878 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-14 15:18:21 +08:00
Bingyi Sun	c13fc8cd19	enhance: update tantivy version (#39253 ) https://github.com/milvus-io/milvus/issues/39254 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-02-08 14:08:43 +08:00
sre-ci-robot	ba312427f2	[automated] Update Knowhere Commit (#39696 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-02-08 01:36:43 +08:00
Gao	c1794cc490	enhance: update knowhere version and IsAdditionalScalarSupported interface (#39573 ) Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-02-05 19:51:10 +08:00
sthuang	c4ae9f4ece	feat: introduce third-party milvus-storage (#39418 ) related: https://github.com/milvus-io/milvus/issues/39173 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-01-24 17:21:13 +08:00
Bingyi Sun	cb959cd1f9	enhance: upgrade rust version to 1.83 (#39295 ) #39254 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-20 11:15:03 +08:00
sre-ci-robot	fdb968d0ea	[automated] Update Knowhere Commit (#39420 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-01-20 01:17:02 +08:00
Spade A	8c4ba70a4c	fix: enable to build index with single segment (#39233 ) fix https://github.com/milvus-io/milvus/issues/39232 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-16 11:01:06 +08:00
sre-ci-robot	55dcac375c	[automated] Update Knowhere Commit (#39263 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-01-15 02:52:59 +08:00
Buqian Zheng	5e38f01e5b	enhance: update knowhere version (#39212 ) Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2025-01-14 10:21:05 +08:00
Spade A	032292a432	feat: support phrase match query (#38869 ) The relevant issue: https://github.com/milvus-io/milvus/issues/38930 --------- Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-01-12 20:24:58 +08:00
Bingyi Sun	f0cddfd160	fix: Fix panic caused by removing directory (#38622 ) https://github.com/milvus-io/milvus/issues/38604 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2025-01-06 10:54:54 +08:00
sre-ci-robot	11bfc93683	[automated] Update Knowhere Commit (#38993 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-01-04 01:16:53 +08:00
foxspy	af08b5b311	enhance: Update Knowhere version (#38942 ) Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-01-03 14:28:53 +08:00
Bingyi Sun	3822819942	enhance: Remove an undefined behavior in index writer (#38657 ) Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-31 10:42:52 +08:00
sre-ci-robot	407035c994	[automated] Update Knowhere Commit (#38641 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-22 00:44:46 +08:00
sre-ci-robot	cce25ecdbc	[automated] Update Knowhere Commit (#38635 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-21 00:50:46 +08:00
foxspy	06d410b70f	enhance: update knowhere version (#38544 ) related: #37730 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-12-18 16:52:45 +08:00
sre-ci-robot	ffd3c5d2f5	[automated] Update Knowhere Commit (#38542 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-18 01:24:47 +08:00
Bingyi Sun	3e2a2f278b	enhance: Handle rust error in c++ (#38113 ) https://github.com/milvus-io/milvus/issues/37930 --------- Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-12-16 19:40:45 +08:00
sre-ci-robot	1e274384cd	[automated] Update Knowhere Commit (#38458 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-14 00:54:43 +08:00
sre-ci-robot	e404123e3e	[automated] Update Knowhere Commit (#38422 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-13 02:40:44 +08:00
aoiasd	87aa9a0f2d	fix: empty analyzer params not use standard tokenizer (#38148 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-12-04 14:58:39 +08:00
sre-ci-robot	3445b8bd44	[automated] Update Knowhere Commit (#38192 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-12-04 02:30:39 +08:00
sre-ci-robot	0894ed0016	[automated] Update Knowhere Commit (#38082 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-11-29 01:08:36 +08:00
Bingyi Sun	e6af806a0d	enhance: optimize self defined rust error (#37975 ) Prepare for issue: https://github.com/milvus-io/milvus/issues/37930 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-28 20:30:36 +08:00
Zhen Ye	fbb68ca370	enhance: make all index operation async scheduled by tokio (#37946 ) issue: #37851 related pr: https://github.com/milvus-io/tantivy/pull/3 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-25 10:12:34 +08:00
sre-ci-robot	ed73dfca3f	[automated] Update Knowhere Commit (#37965 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-11-25 01:28:32 +08:00
Bingyi Sun	700a448a54	fix: Escape prefix before search in inverted index (#37925 ) issue: https://github.com/milvus-io/milvus/issues/37912 Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-22 14:10:33 +08:00
Bingyi Sun	06d73cf2e2	enhance: Remove raw tokenizer register. (#37886 ) tantivy already register raw tokenizer by default Signed-off-by: sunby <sunbingyi1992@gmail.com>	2024-11-22 12:02:32 +08:00
Zhen Ye	1dc1a97e65	fix: use different thread pool for scheduler and merger (#37911 ) issue: #37895 related pr: https://github.com/milvus-io/tantivy/pull/2 Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-21 21:34:33 +08:00
Zhen Ye	f3a36f8a29	fix: use global pool but not dedicated pool for every index (#37852 ) issue: #37851 - make a global thread pool at tantivy temporally. - set 1 but not 4 threads for inverted text index. Signed-off-by: chyezh <chyezh@outlook.com>	2024-11-20 20:44:32 +08:00
aoiasd	16e206167c	enhance: analyzer length filter max should be close interval instead open interval (#37770 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-18 19:30:31 +08:00
aoiasd	3b5a0df159	enhance: Optimize chinese analyzer and support CnAlphaNumFilter (#37727 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-16 10:28:30 +08:00
foxspy	0ba868ae64	enhance: update knowhere version (#37730 ) release note draft : https://github.com/zilliztech/knowhere/releases/tag/v2.5.0 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-16 10:08:30 +08:00
foxspy	5ae347aba0	enhance: update knowhere version (#37688 ) issue: #37665 #37631 #37620 #37587 #36906 knowhere has add default nlist value, so some invalid param test ut with no nlist param will be valid. Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-15 10:10:31 +08:00
aoiasd	1c5b5e1e3d	feat: Add chinese and english analyzer with refactor jieba tokenizer (#37494 ) relate: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-14 10:34:31 +08:00
foxspy	cf883b114e	enhance: update knowhere version (#37510 ) issue: #36925 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-13 16:36:27 +08:00
aoiasd	12951f0abb	enhance: rename tokenizer to analyzer and check analyzer params (#37478 ) relate: https://github.com/milvus-io/milvus/issues/35853 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-10 16:12:26 +08:00
aoiasd	d67853fa89	feat: Tokenizer support build with params and clone for concurrency (#37048 ) relate: https://github.com/milvus-io/milvus/issues/35853 https://github.com/milvus-io/milvus/issues/36751 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2024-11-06 17:48:24 +08:00
foxspy	c27f477b6c	enhance: Update Knowhere version (#37333 ) issue: #37269 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-11-04 11:56:31 +08:00
liliu-z	4bac2eb13e	enhance: Update Knowhere version (#37315 ) Signed-off-by: Li Liu <li.liu@zilliz.com>	2024-10-31 17:24:20 +08:00
foxspy	346510ed23	enhance: Update Knowhere version (#37000 ) Signed-off-by: foxspy <xian_hust@foxmail.com>	2024-10-21 11:39:26 +08:00
foxspy	3de57ec4fa	enhance: add vector index mgr to remove vector index type dependency (#36843 ) issue: #34298 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2024-10-17 22:15:25 +08:00
Buqian Zheng	9997c5de34	fix: remove excessive logging (#36859 ) issue: https://github.com/milvus-io/milvus/issues/35853 Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-10-16 10:47:22 +08:00
sre-ci-robot	e170991a10	[automated] Update Knowhere Commit (#36823 ) Update Knowhere Commit Signed-off-by: sre-ci-robot sre-ci-robot@users.noreply.github.com Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-10-13 01:21:20 +08:00
Min Tian	ef0c649bda	enhance: update knowhere version to support diskann iterator (#36813 ) issue: #36812 Signed-off-by: min.tian <min.tian.cn@gmail.com>	2024-10-12 18:05:22 +08:00
Buqian Zheng	f7b811450d	feat: add enable_tokenizer params to VarChar field (#36480 ) issue: #35922 add an enable_tokenizer param to varchar field: must be set to true so that a varchar field can enable_match or used as input of BM25 function --------- Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>	2024-10-10 20:33:21 +08:00
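
Note on 52c7d7dd80 (phrase match offsets): the sketch below illustrates the distinction that commit message describes, using the token examples it quotes. It is not Milvus or tantivy code. The `Token` class and both helper functions are hypothetical names introduced here, and the assumption that the earlier behavior amounted to using the enumeration index as the offset is an inference from the commit title, not something the log states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for a tokenizer output, mirroring the fields in the commit message."""
    start: int    # token position within the phrase
    length: int   # position length
    text: str


def offsets_by_index(tokens):
    """Pair each term with its enumeration index (assumed pre-fix behavior)."""
    return [(i, t.text) for i, t in enumerate(tokens)]


def offsets_by_position(tokens):
    """Pair each term with the position reported by the tokenizer."""
    return [(t.start, t.text) for t in tokens]


# Token lists copied from the commit message for 52c7d7dd80.
english = [
    Token(0, 1, "badminton"),
    Token(1, 1, "and"),
    Token(2, 1, "skiing"),
]
chinese = [
    Token(0, 2, "羽毛"),
    Token(0, 3, "羽毛球"),
    Token(3, 1, "和"),
    Token(4, 2, "滑雪"),
]

if __name__ == "__main__":
    # For English the two strategies agree; for Chinese they diverge.
    print(offsets_by_index(english) == offsets_by_position(english))
    print(offsets_by_index(chinese))
    print(offsets_by_position(chinese))
```

With the English tokens both strategies yield the same offsets, which is why the problem only shows up when a tokenizer emits overlapping tokens or tokens whose position length exceeds 1.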


391 Commits (master)