milvus

Commit Graph

Author	SHA1	Message	Date
Tianx	2c0c5ef41e	feat: timestamptz expression & index & timezone (#44080 )
issue: https://github.com/milvus-io/milvus/issues/27467

>My plan is as follows.
>- [x] M1 Create collection with timestamptz field
>- [x] M2 Insert timestamptz field data
>- [x] M3 Retrieve timestamptz field data
>- [x] M4 Implement handoff
>- [x] M5 Implement compare operator
>- [x] M6 Implement extract operator
>- [x] M8 Support database/collection level default timezone
>- [x] M7 Support STL-SORT index for datatype timestamptz

---

The third PR of issue: https://github.com/milvus-io/milvus/issues/27467, which completes M5, M6, M7, M8 described above.

## M8 Default Timezone

A future Python SDK release will let alter_collection() and alter_database() modify the default timezone at the collection or database level.
For insert requests, the timezone will be resolved using the following order of precedence: String Literal -> Collection Default -> Database Default.
For retrieval requests, the timezone will be resolved in this order: Query Parameters -> Collection Default -> Database Default.
In both cases, the final fallback timezone is UTC.

## M5: Comparison Operators

We can now use the following expression format to filter on the timestamptz field:
- `timestamptz_field [+/- INTERVAL 'interval_string'] {comparison_op} ISO 'iso_string'`
- The interval_string follows the ISO 8601 duration format, for example: P1Y2M3DT1H2M3S.
- The iso_string follows the ISO 8601 timestamp format, for example: 2025-01-03T00:00:00+08:00.
- Example expressions: "tsz + INTERVAL 'P0D' != ISO '2025-01-03T00:00:00+08:00'" or "tsz != ISO '2025-01-03T00:00:00+08:00'".

## M6: Extract

A future Python SDK release will allow extracting specific time fields via kwargs. The key is `time_fields`, and the value should be one or more of "year, month, day, hour, minute, second, microsecond", separated by commas or spaces. The result for each record is then an array of int64.

## M7: Indexing Support

Expressions without interval arithmetic can be accelerated using an STL-SORT index. However, expressions that include interval arithmetic cannot be indexed. This is because the result of an interval calculation depends on the specific timestamp value. For example, adding one month to a date in February results in a different number of added days than adding one month to a date in March.

---

After this PR, the input/output type of timestamptz is an ISO string. Timestamptz values are stored as timestamptz data, which is ultimately an int64_t.

> for more information, see https://en.wikipedia.org/wiki/ISO_8601

---------
Signed-off-by: xtx <xtianx@smail.nju.edu.cn>	2025-09-23 10:24:12 +08:00
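The timezone precedence (M8) and the reason interval arithmetic cannot use the index (M7) can be illustrated with a small standard-library sketch. Both `resolve_timezone` and `add_one_month` are hypothetical helpers written for this illustration, not Milvus functions, and clamping the day to the end of the target month is an assumption about month arithmetic rather than documented Milvus behavior.

```python
import calendar
from datetime import datetime, timezone


def resolve_timezone(*candidates, fallback="UTC"):
    """Return the first non-empty candidate in precedence order, else the fallback.

    For inserts the order would be (string literal, collection default,
    database default); for retrieval, (query parameter, collection default,
    database default).
    """
    for tz in candidates:
        if tz:
            return tz
    return fallback


def add_one_month(ts):
    """Add one calendar month (assumption: clamp the day to the month's end)."""
    year = ts.year + ts.month // 12
    month = ts.month % 12 + 1
    day = min(ts.day, calendar.monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


# No timezone in the literal, so the collection default wins.
print(resolve_timezone(None, "Asia/Shanghai", "America/New_York"))  # Asia/Shanghai
# Nothing is set anywhere, so the final fallback applies.
print(resolve_timezone(None, None, None))  # UTC

feb = datetime(2025, 2, 1, tzinfo=timezone.utc)
mar = datetime(2025, 3, 1, tzinfo=timezone.utc)
# The same interval shifts different rows by different amounts.
print((add_one_month(feb) - feb).days)  # 28
print((add_one_month(mar) - mar).days)  # 31
```

Because the shift is not a constant offset, a comparison that contains an interval cannot be rewritten as a single range lookup over the sorted stored values.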
Gao	539f17f1ad	enhance: tiered index updates (#44433 ) issue: #42032 #44212 - special case for warmup param and cell storage size for tiered index - add a config to enable/disable storage usage tracking --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-22 21:34:11 +08:00
sthuang	edd250ffef	fix: [StorageV2] force virtual host for oss and cos (#44484 ) related: #44481 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-22 16:58:11 +08:00
Gao	d3784c6515	enhance: add storage resource usage for vector search (#44308 ) issue: #44212 Implement search/query storage usage statistics on the Go side (result reduce); storage usage is currently recorded only in the vector search C++ path. The query C++ path still needs to be implemented in follow-up PRs. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com>	2025-09-19 20:20:02 +08:00
sangheee	bed94fc061	feat: support grpc tokenizer (#41994 ) relate: https://github.com/milvus-io/milvus/issues/41035 This PR adds support for a gRPC-based tokenizer. - The protobuf definition was added in [milvus-proto#445](https://github.com/milvus-io/milvus-proto/pull/445). - Based on this, the corresponding Rust client code was generated and added under `tantivy-binding`. - The generated file is `milvus.proto.tokenizer.rs`. I'm not very experienced with Rust, so there might be parts of the code that could be improved. I’d appreciate any suggestions or improvements. --------- Signed-off-by: park.sanghee <park.sanghee@navercorp.com>	2025-09-19 17:40:01 +08:00
congqixia	7b83314bf3	enhance: [StorageV2] Make datanode use non-singleton fs (#44418 ) Related to #39173 According to the current design, the datanode shall create its fs from the storage config in the request instead of using a singleton fs. This PR upgrades milvus-storage and makes the packed reader/writer compose a new fs from the storage config. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-18 20:06:00 +08:00
congqixia	aa861f55e6	enhance: [StorageV2] Reverts #44232 bucket name change (#44390 ) Related to #39173 - Put bucket name concatenation logic back for azure support This reverts commit `8f97eb355f`. Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-16 10:10:00 +08:00
sparknack	060fc61e80	fix: milvus-common commits update (#44339 ) issue: #41435 related: #44268 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-12 12:43:57 +08:00
aoiasd	fb58701cbb	enhance: update rust version (#44322 ) relate: https://github.com/milvus-io/milvus/issues/44321 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-12 10:53:57 +08:00
cqy123456	f5c6138793	enhance: update knowhere version (#44294 ) issue: https://github.com/milvus-io/milvus/issues/42937 Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>	2025-09-11 11:21:56 +08:00
sparknack	e821468d2a	fix: milvus-common commit update (#44304 ) issue: #41435 related: #44268 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-11 10:19:56 +08:00
sparknack	4a01c726f3	enhance: cachinglayer: some metric and params update (#44276 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-09-10 11:03:57 +08:00
sthuang	dfc2335144	enhance: [StorageV2] storage file system error messages (#44255 ) related: https://github.com/milvus-io/milvus/issues/44138 Bumps the milvus-storage version to include the following: * https://github.com/milvus-io/milvus-storage/pull/243 * https://github.com/milvus-io/milvus-storage/pull/240 * https://github.com/milvus-io/milvus-storage/pull/245 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-09-09 19:37:56 +08:00
aoiasd	92fedb8280	enhance: forbid panic when tantivy index path not exist (#44135 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-09-08 15:21:56 +08:00
congqixia	8f97eb355f	enhance: [StorageV2] Make bucket name concatenation transparent to user (#44232 ) Related to #39173 This PR: - Bump milvus-storage commit to handle bucket name concatenation logic in multipart s3 fs - Remove all user-side bucket name concatenation code Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-09-08 10:15:55 +08:00
Gao	2e98cb0103	enhance: load resource estimation for tiered index (#44171 ) issue: https://github.com/milvus-io/milvus/issues/42032 - Use bytes to estimate load resource in the whole estimation procedure - Add num_rows and dim info for vector index to better estimate - Disable eviction for tiered index's meta --------- Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-09-04 19:41:53 +08:00
foxspy	d55bf49bf1	enhance: update knowhere version (#44144 ) issue: #42937 --------- Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-09-03 01:31:53 +08:00
sparknack	70c8114e85	enhance: cachinglayer: resource management for segment loading (#43846 ) issue: #41435 --------- Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-08-29 11:37:50 +08:00
XuanYang-cn	37a447d166	feat: Add CMEK cipher plugin (#43722 ) 1. Enable Milvus to read cipher configs 2. Enable cipher plugin in binlog reader and writer 3. Add a testCipher for unittests 4. Support pooling for datanode 5. Add encryption in storagev2 See also: #40321 Signed-off-by: yangxuan <xuan.yang@zilliz.com> --------- Signed-off-by: yangxuan <xuan.yang@zilliz.com>	2025-08-27 11:15:52 +08:00
Spade A	90a7e63665	enhance: collect doc_id from posting list directly for text match (#43899 ) issue: https://github.com/milvus-io/milvus/issues/43898 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-08-27 10:39:52 +08:00
Gao	e97a618630	enhance: support readAt interface for remote input stream (#43997 ) #42032 Also, fix the cacheoptfield method to work in storagev2. Also, change the sparse related interface for knowhere version bump #43974 . Also, includes https://github.com/milvus-io/milvus/pull/44046 for metric lost. --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Signed-off-by: marcelo.chen <marcelo.chen@zilliz.com> Signed-off-by: Congqi Xia <congqi.xia@zilliz.com> Co-authored-by: marcelo.chen <marcelo.chen@zilliz.com> Co-authored-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-26 11:19:58 +08:00
Gao	b602b4187d	enhance: upgrade aws-sdk from 1.9.234 to 1.11.352 (#43916 ) issue: #43908 Signed-off-by: chasingegg <chao.gao@zilliz.com>	2025-08-19 11:11:45 +08:00
foxspy	647c2bca2d	enhance: Support streaming read and write of vector index files (#43824 ) issue: #42032 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-08-15 23:41:43 +08:00
sthuang	5e4eb4a6e0	enhance: [StorageV2] bump storage version (#43871 ) related: https://github.com/milvus-io/milvus/issues/43869 bump storage version. include the following feature: * https://github.com/milvus-io/milvus-storage/pull/231 * https://github.com/milvus-io/milvus-storage/pull/232 * https://github.com/milvus-io/milvus-storage/pull/233 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-08-15 17:59:43 +08:00
Gao	81a0915c29	enhance: add milvus-common module to decouple knowhere & segcore (#43624 ) issue: https://github.com/milvus-io/milvus/issues/42032 https://github.com/milvus-io/milvus/issues/41435 based on pr: https://github.com/milvus-io/milvus/pull/42124 --------- Signed-off-by: chasingegg <chao.gao@zilliz.com> Co-authored-by: xianliang.li <xianliang.li@zilliz.com>	2025-08-11 14:09:42 +08:00
congqixia	1561a4ae8c	enhance: [StorageV2] Avoid create local parent dir if fs remote (#43790 ) Related to #43752 milvus-storage pr: milvus-io/milvus-storage#230 Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-08-08 10:19:40 +08:00
aoiasd	4f02b06abc	enhance: support set lindera dict build dir and download url in yaml (#43541 ) relate: https://github.com/milvus-io/milvus/issues/43120 --------- Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-08-04 09:47:38 +08:00
sparknack	4aabe23a45	enhance: update flat_hash_map.hpp to a modified version (#43506 ) issue: #41435 Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-31 20:09:36 +08:00
sthuang	a2c7ed2780	fix: [StorageV2] sort field binlogs paths for packed reader and writer (#43585 ) key changes: * fix unstable storage v2 compaction unit test by guaranteeing the order of paths during sync. * bump milvus-storage version, include https://github.com/milvus-io/milvus-storage/pull/222 https://github.com/milvus-io/milvus-storage/pull/223 https://github.com/milvus-io/milvus-storage/pull/224 https://github.com/milvus-io/milvus-storage/pull/225 https://github.com/milvus-io/milvus-storage/pull/226 * Also fix the below related oom issue. related: https://github.com/milvus-io/milvus/issues/43310 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-30 08:09:36 +08:00
foxspy	d57890449f	enhance: update knowhere version (#43528 ) issue: #42937 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-29 17:21:36 +08:00
aoiasd	c9412434c8	enhance: add char group tokenizer (#42793 ) relate: https://github.com/milvus-io/milvus/issues/42792 Add a char group tokenizer, which supports using custom char groups or some built-in char groups as delimiters. Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-29 11:11:35 +08:00
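As a rough illustration of what splitting on a char group means, the sketch below tokenizes text on a caller-supplied set of delimiter characters. `char_group_tokenize` is a hypothetical Python function written for this note; it does not reflect the tokenizer's actual implementation or its configuration format.

```python
def char_group_tokenize(text, delimiters):
    """Split text wherever a character belongs to the delimiter group.

    Runs of consecutive delimiters produce no empty tokens.
    """
    tokens, current = [], []
    for ch in text:
        if ch in delimiters:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


# A custom char group: comma, semicolon and space all act as delimiters.
print(char_group_tokenize("vector,search;full text", {",", ";", " "}))
# ['vector', 'search', 'full', 'text']
```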
sthuang	f77571d5c1	fix: [StorageV2] file writer write row group split to default size (#43471 ) Bumped milvus storage version. related: https://github.com/milvus-io/milvus/issues/43310 * https://github.com/milvus-io/milvus-storage/pull/213 * https://github.com/milvus-io/milvus-storage/pull/217 * https://github.com/milvus-io/milvus-storage/pull/220 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-07-22 09:52:52 +08:00
aoiasd	e9fc140eaf	fix: jieba tokenizer cause panic when dict word was empty string (#43337 ) relate: https://github.com/milvus-io/milvus/issues/42779 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-21 16:34:53 +08:00
aoiasd	c7b53ed43b	enhance: run rust format (#43447 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-21 14:12:53 +08:00
aoiasd	f7e1f1c382	enhance: support download lindera system dictionary online (#43121 ) relate: https://github.com/milvus-io/milvus/issues/43120 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-07-20 23:24:52 +08:00
Spade A	42ad786f75	fix: update tantivy for fixing dir removing race condition (#43399 ) fix: https://github.com/milvus-io/milvus/issues/43258 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-18 15:44:56 +08:00
Spade A	8612a2c946	enhance: optimize in by batch-in (#43268 ) fix: https://github.com/milvus-io/milvus/issues/43267 --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-17 19:40:52 +08:00
sparknack	9b4081e110	enhance: cachinglayer: some performance optimization (#42858 )
issue: #41435

We compared the performance using the modified test_sealed.cpp, which randomly accesses all rows in all chunks and counts the number of runs within 3s.

## performance data comparison (ops/second)

chunk config: 1x1000

| Field Type | w/o cachinglayer (commit `640f526301`) | w/ cachinglayer | w/ cachinglayer + opt |
|---|---|---|---|
| Bool field | 82428 | -63.6% (29983) | +2.7% (84675) |
| Int8 field | 82228 | -63.3% (30166) | +2.4% (84163) |
| Int16 field | 82572 | -63.8% (29867) | +1.8% (84036) |
| Int32 field | 82797 | -63.7% (30031) | +1.5% (84043) |
| Int64 field | 81077 | -62.9% (30107) | +0.6% (81604) |
| Float field | 82678 | -63.4% (30266) | +1.8% (84146) |
| Double field | 81925 | -63.4% (29974) | +0.2% (82097) |
| Varchar field | 19933 | -19.6% (16027) | +18.9% (23690) |
| JSON field | 16519 | -96.8% (533) | +2.5% (16927) |
| Int array field | 7325 | -13.7% (6321) | -1.4% (7220) |
| Long array field | 6347 | -8.9% (5781) | -0.1% (6344) |
| Bool array field | 8275 | -14.0% (7116) | +0.4% (8311) |
| String array field | 2281 | -5.0% (2168) | +0.2% (2287) |
| Double array field | 6427 | -13.3% (5574) | -2.0% (6301) |
| Float array field | 7291 | -13.0% (6346) | -1.5% (7183) |
| Vector field | 27487 | -40.4% (16371) | -4.7% (26192) |
| Float16 vector field | 49773 | -54.6% (22601) | -5.9% (46834) |
| BFloat16 vector field | 49783 | -53.1% (23350) | -5.7% (46934) |
| Int8 vector field | 63871 | -59.0% (26179) | -6.2% (59926) |

---

chunk config: 10x1000

| Field Type | w/o cachinglayer (commit `640f526301`) | w/ cachinglayer | w/ cachinglayer + opt |
|---|---|---|---|
| Bool field | 3659 | -48.6% (1879) | +110.1% (7686) |
| Int8 field | 3410 | -45.3% (1864) | +123.9% (7636) |
| Int16 field | 3647 | -48.6% (1874) | +110.1% (7661) |
| Int32 field | 3647 | -48.8% (1866) | +109.6% (7645) |
| Int64 field | 3645 | -48.9% (1863) | +107.8% (7573) |
| Float field | 3647 | -49.0% (1861) | +109.5% (7639) |
| Double field | 3640 | -45.1% (1998) | +108.4% (7586) |
| Varchar field | 1594 | -23.9% (1213) | +20.6% (1922) |
| JSON field | 1202 | -26.5% (884) | +16.1% (1396) |
| Int array field | 602 | -12.3% (528) | +12.7% (678) |
| Long array field | 529 | -12.2% (465) | +7.5% (569) |
| Double array field | 537 | -13.0% (467) | +6.4% (571) |
| Vector field | 1520 | -37.9% (943) | -5.5% (1437) |
| Float16 vector field | 2607 | -47.0% (1382) | +6.4% (2774) |
| BFloat16 vector field | 2586 | -46.5% (1383) | +8.8% (2813) |
| Int8 vector field | 3101 | -47.3% (1633) | +41.9% (4400) |

---------
Signed-off-by: Shawn Wang <shawn.wang@zilliz.com>	2025-07-17 11:20:51 +08:00
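The percentages in these tables appear to be relative changes against the "w/o cachinglayer" baseline, with the measured ops/second in parentheses. A minimal sketch of that arithmetic, using the Bool field rows; `percent_change` is a helper name introduced here for illustration only:

```python
def percent_change(baseline, measured):
    """Relative change of measured vs. baseline, in percent, to one decimal."""
    return round((measured - baseline) / baseline * 100, 1)


# Bool field, chunk config 1x1000 (baseline 82428 ops/second)
print(percent_change(82428, 29983))  # -63.6
print(percent_change(82428, 84675))  # 2.7

# Bool field, chunk config 10x1000 (baseline 3659 ops/second)
print(percent_change(3659, 1879))  # -48.6
print(percent_change(3659, 7686))  # 110.1
```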
foxspy	58a9e49066	enhance: update knowhere version (#43331 ) issue: #42937 #43294 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-16 15:04:50 +08:00
Spade A	db91d85dbc	feat: more types of matches for ngram (#43081 ) Ref https://github.com/milvus-io/milvus/issues/42053 This PR enables ngram to support more kinds of matches, such as prefix and postfix match. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-07-14 20:34:50 +08:00
foxspy	8171a2a0b5	enhance: update knowhere version (#43246 ) issue: #42937 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-07-14 11:06:49 +08:00
Spade A	26ec841feb	feat: optimize `Like` query with n-gram (#41803 ) Ref #42053 This is the first PR for optimizing `LIKE` with an ngram inverted index. For now, only the VARCHAR data type and only InnerMatch LIKE (%xxx%) queries are supported. How to use it:
```
from pymilvus import MilvusClient, DataType

milvus_client = MilvusClient("http://localhost:19530")
schema = milvus_client.create_schema()
...
schema.add_field("content_ngram", DataType.VARCHAR, max_length=10000)
...
index_params = milvus_client.prepare_index_params()
# min_gram / max_gram bound the n-gram lengths used by the NGRAM index
index_params.add_index(field_name="content_ngram", index_type="NGRAM", index_name="ngram_index", min_gram=2, max_gram=3)
milvus_client.create_collection(COLLECTION_NAME, ...)
```
min_gram and max_gram control how we tokenize the documents. For example, for min_gram=2 and max_gram=4, we will tokenize each document with 2-gram, 3-gram and 4-gram. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com> Signed-off-by: SpadeA-Tang <tangchenjie1210@gmail.com>	2025-07-01 10:08:44 +08:00
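To make the min_gram=2, max_gram=4 example concrete, the sketch below generates character n-grams and shows the general idea of using them to prune candidates for an inner-match `LIKE '%xxx%'`. This is an assumed, simplified model of an n-gram index: `char_ngrams` and `may_match` are hypothetical names, and the real index may tokenize and match differently.

```python
def char_ngrams(text, min_gram, max_gram):
    """All character n-grams of text for n in [min_gram, max_gram]."""
    grams = []
    for n in range(min_gram, max_gram + 1):
        for i in range(len(text) - n + 1):
            grams.append(text[i:i + n])
    return grams


def may_match(document, literal, min_gram, max_gram):
    """Candidate filter: every n-gram of the literal must occur in the document.

    A True result only means "possibly contains the literal"; a real
    engine would still verify the match on the raw string.
    """
    doc_grams = set(char_ngrams(document, min_gram, max_gram))
    return all(g in doc_grams for g in char_ngrams(literal, min_gram, max_gram))


print(char_ngrams("milvus", 2, 4))
# ['mi', 'il', 'lv', 'vu', 'us', 'mil', 'ilv', 'lvu', 'vus', 'milv', 'ilvu', 'lvus']
print(may_match("milvus", "lvu", 2, 3))  # True
print(may_match("milvus", "xyz", 2, 3))  # False
```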
foxspy	be05b653c1	enhance: update knowhere version (#42938 ) issue: #42937 Signed-off-by: xianliang.li <xianliang.li@zilliz.com>	2025-06-26 01:22:41 +08:00
sthuang	ad6d620e9f	fix: [StorageV2] Compiling debug mode throw DCHECK s3 initialize error (#42922 ) related: https://github.com/milvus-io/milvus/issues/42844 Signed-off-by: shaoting-huang <shaoting.huang@zilliz.com>	2025-06-24 19:30:41 +08:00
Spade A	50f7579d8f	fix: fix some bugs discovered by chaos tests (#42906 ) fix: https://github.com/milvus-io/milvus/issues/42870 This PR fixes: 1. SetBitset fn should consider growing segments with concurrent writes 2. avoid using from_raw_parts directly --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-24 16:32:42 +08:00
Spade A	e15926b40c	enhance: optimize tantivy cargo config (#42880 ) fix: https://github.com/milvus-io/milvus/issues/42879 Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-20 16:17:49 +08:00
aoiasd	43a9f7a79e	enhance: Add and run rust format command in makefile (#42807 ) relate: https://github.com/milvus-io/milvus/issues/42806 Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-20 10:22:39 +08:00
Spade A	e2c85eec81	fix: load stats index based on mmap config (#42788 ) ref https://github.com/milvus-io/milvus/issues/42626 This PR makes text match index and json key stats index be loaded based on mmap config. --------- Signed-off-by: SpadeA <tangchenjie1210@gmail.com>	2025-06-19 10:10:39 +08:00
aoiasd	d49989345b	enhance: forbid regex filter clone regex for each streamer (#42781 ) Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>	2025-06-18 16:10:39 +08:00
congqixia	f01ff57f3f	fix: [StorageV2] Use correct offset filling null bitmap (#42774 ) Related to #39173 `null_bitmap_data()` returns the raw pointer to an Array's null bitmap. After slicing, this bitmap is not rewritten because of the zero-copy implementation, so the current start position may be non-zero when FillFieldData generates the column's `valid_data` array. This PR adds an `offset` param to the `FillFieldData` method and forces every invocation to pass the correct offset for the `null_bitmap_data` ptr. It also updates the milvus-storage commit to fix the reader failing to return data when the buffer size is smaller than the row group size. --------- Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>	2025-06-17 10:08:38 +08:00
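The offset problem described in this commit can be reproduced in a few lines. The sketch assumes an Arrow-style validity bitmap (one bit per value, least-significant bit first); `read_validity` is a hypothetical helper for illustration, not the signature of `FillFieldData`.

```python
def read_validity(bitmap, offset, length):
    """Decode `length` validity bits starting at bit `offset` (LSB-first)."""
    return [
        bool((bitmap[(offset + i) // 8] >> ((offset + i) % 8)) & 1)
        for i in range(length)
    ]


# Validity of 8 values: [T, F, T, T, F, F, T, T] packs into 0b11001101.
bitmap = bytes([0b11001101])

# A zero-copy slice of values 3..6 still shares the original bitmap,
# so the reader has to start at bit 3 rather than bit 0.
print(read_validity(bitmap, 3, 4))  # [True, False, False, True]

# Ignoring the slice offset silently returns the validity of values 0..3.
print(read_validity(bitmap, 0, 4))  # [True, False, True, True]
```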

391 Commits (master)