zhagnlu
804dd5409a
enhance: mark duplicated pk as deleted ( #34586 )
...
fix #34247
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-07-16 14:25:39 +08:00
zhagnlu
3030e4625e
enhance: refactor variable column to reduce memory cost ( #33875 )
...
#33874
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2024-06-30 20:16:06 +08:00
cqy123456
dc4437ff82
enhance: use segment id and type to register in MmapChunkManager and opt malloc in variableChunk ( #33993 )
...
issue: https://github.com/milvus-io/milvus/issues/32984
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-20 17:42:02 +08:00
cqy123456
32f685ff12
enhance: growing segment support mmap ( #32633 )
...
issue: https://github.com/milvus-io/milvus/issues/32984
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 14:42:00 +08:00
Chun Han
416a2cf507
fix: query iterator lack results( #33137 ) ( #33422 )
...
related: #33137
Add has_more_result_tag to the reduce step at each level to rectify
reduce_stop_for_best
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-05-30 17:51:44 +08:00
chyezh
e19d17076f
fix: delete may lost when enable lru cache, some field should be reset when ReleaseData ( #32012 )
...
issue: #30361
- Deletes may be lost when a segment in the LRU cache is not in the
data-loaded state. Skip filtering to fix it.
- `stats_` and `variable_fields_avg_size_` should be reset when
`ReleaseData`
- Remove the repeated delta log load operation in the LRU cache.
---------
Signed-off-by: chyezh <chyezh@outlook.com>
2024-04-16 11:17:20 +08:00
Cai Yudong
246586be27
enhance: Unify data type check APIs under internal/core ( #31800 )
...
Issue: #22837
Move and rename following C++ APIs:
datatype_sizeof() ==> GetDataTypeSize()
datatype_name() ==> GetDataTypeName()
datatype_is_vector() / IsVectorType() ==> IsVectorDataType()
datatype_is_variable() ==> IsVariableDataType()
datatype_is_sparse_vector() ==> IsSparseFloatVectorDataType()
datatype_is_string() / IsString() ==> IsDataTypeString()
datatype_is_floating() / IsFloat() ==> IsDataTypeFloat()
datatype_is_binary() ==> IsDataTypeBinary()
datatype_is_json() ==> IsDataTypeJson()
datatype_is_array() ==> IsDataTypeArray()
datatype_is_variable() ==> IsDataTypeVariable()
datatype_is_integer() / IsIntegral() ==> IsDataTypeInteger()
Signed-off-by: Cai Yudong <yudong.cai@zilliz.com>
2024-04-02 19:15:14 +08:00
Buqian Zheng
96cfae55a5
feat: [Sparse Float Vector] segcore to support sparse vector search and get raw vector by id ( #30629 )
...
This PR adds the ability to search/get sparse float vectors in segcore,
and adds unit tests by converting many existing tests into
parameterized ones.
https://github.com/milvus-io/milvus/issues/29419
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-12 09:16:30 -07:00
Buqian Zheng
070dfc77bf
feat: [Sparse Float Vector] segcore basics and index building ( #30357 )
...
This commit adds sparse float vector support to segcore with the
following:
1. data type enum declarations
2. Adds corresponding data structures for handling sparse float vectors
in various scenarios, including:
* FieldData as a bridge between the binlog and the in memory data
structures
* mmap::Column as the in memory representation of a sparse float vector
column of a sealed segment;
* ConcurrentVector as the in memory representation of a sparse float
vector of a growing segment which supports inserts.
3. Adds logic in payload reader/writer to serialize/deserialize from/to
binlog
4. Adds the ability to allow the index node to build sparse float vector
index
5. Adds the ability to allow the query node to build growing index for
growing segment and temp index for sealed segment without index built
This commit also includes some code cleanup, comment improvements, and
some unit tests for sparse vectors.
https://github.com/milvus-io/milvus/issues/29419
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-11 14:45:02 +08:00
xige-16
e9fdd2475d
fix: fix searchPlan metricType modified concurrently ( #30227 )
...
issue: #30225
/kind bug
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-26 14:03:09 +08:00
yah01
f542bdbf3c
enhance: calc the accurate mem size of segment ( #30093 )
...
This computes the real memory size of a segment and also reduces memory
usage in mmap mode.
resolve #30095
Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-19 12:32:53 +08:00
xige-16
fa7cf587b0
enhance: Opt metric type does not match error message ( #29927 )
...
issue: #29791
/kind improvement
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-17 20:25:03 +08:00
xige-16
9702cef2b5
feat: Support multiple vector search ( #29433 )
...
issue #25639
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2024-01-08 15:34:48 +08:00
MrPresent-Han
9e2e7157e9
feat: support search_group_by for milvus( #25324 ) ( #28983 )
...
related: #25324
Search GroupBy function, used to aggregate result entities based on a
specific scalar column.
Several points to mention:
1. Temporarily, the whole groupBy is implemented separately from the
iterative expr framework **for the first period**
2. In the long term, the groupBy operation will be incorporated into the
iterative expr framework: https://github.com/milvus-io/milvus/pull/28166
3. This PR includes some unrelated mocked interfaces regarding alterIndex
for reasons not worth mentioning. All of this unrelated content
will be removed before the final PR is merged. This version of the PR is
only for review
4. All other related details are commented in the file comparison
Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2024-01-05 15:50:47 +08:00
zhagnlu
a602171d06
enhance: Refactor runtime and expr framework ( #28166 )
...
#28165
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2023-12-18 12:04:42 +08:00
Bingyi Sun
36f69ea031
feat: integrate storagev2 in building index of segcore ( #28768 )
...
issue: https://github.com/milvus-io/milvus/issues/28655
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>
2023-12-05 16:48:54 +08:00
yah01
f7d2ab6677
enhance: reduce 1x copy for variable length field while retrieving ( #28345 )
...
- Reduce 1x copy for varchar/string/JSON/array types while retrieving
- Reduce 1x copy for int8/int16 while retrieving
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-15 18:08:20 +08:00
yah01
93e2eb78c9
Delete only if primary keys exist ( #25292 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-09-20 19:03:25 +08:00
cai.zhang
a362bb1457
Support array datatype ( #26369 )
...
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2023-09-19 14:23:23 +08:00
Enwei Jiao
c3f15c6b95
Refactor duplicate error class into one place ( #26985 )
...
Signed-off-by: Enwei Jiao <enwei.jiao@zilliz.com>
2023-09-11 20:43:17 +08:00
yah01
ba882b49b6
Optimize query/search on growing segment while output vector field ( #26542 )
...
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-08-24 09:46:24 +08:00
Jiquan Long
bafb183a2b
Optimize bitset usage ( #26096 )
...
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-08-03 15:25:09 +08:00
Jiquan Long
5c1f79dc54
Push down the limit operator to segcore ( #25959 )
...
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
2023-08-01 20:29:05 +08:00
foxspy
31173727b2
growing segment index memory opt & get vector bugfix ( #25272 )
...
Signed-off-by: xianliang <xianliang.li@zilliz.com>
2023-07-05 00:04:25 +08:00
xige-16
04082b3de2
Migrate the ability to upload and download binlog to cpp ( #22984 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2023-06-25 14:38:44 +08:00
yah01
a413842e38
Fix deleted data is still visible ( #24849 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-06-16 17:16:41 +08:00
zhagnlu
f3f3f8a849
Segcore retrieve by pk optimazation ( #24659 ) ( #24660 )
...
Signed-off-by: luzhang <luzhang@zilliz.com>
Co-authored-by: luzhang <luzhang@zilliz.com>
2023-06-15 15:04:39 +08:00
foxspy
6f4ed517de
add growing segment index ( #23615 )
...
Signed-off-by: xianliang <xianliang.li@zilliz.com>
2023-04-26 10:14:41 +08:00
yihao.dai
092d743917
Add support for getting vectors by ids ( #23450 )
...
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-04-23 09:00:32 +08:00
yah01
546080dcdd
Support to retrieve json ( #23563 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-04-21 11:46:32 +08:00
yah01
bdd6bc7695
Re-format cpp code ( #22513 )
...
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-03-02 15:55:49 +08:00
aoiasd
2b58bd5c0a
Optimize large memory usage of InsertRecord by using vector instead of unordered_map if InsertRecord used in sealed segment ( #19245 )
...
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2022-09-23 17:08:51 +08:00
xige-16
428840178c
Support diskann index for vector field ( #19093 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-09-21 20:16:51 +08:00
Cai Yudong
dcf45df029
Optimize API vector_search parameter in segcore ( #18827 )
...
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-08-25 16:16:54 +08:00
Jeng.Gwan
638f6c36e9
Support to get real row count of segment ( #18115 )
...
Signed-off-by: xaxys <zheng.guan@zilliz.com>
2022-07-18 09:58:28 +08:00
xige-16
54d17bc5da
Fix query too slow when insert multi repeated pk data ( #18231 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-07-13 10:22:26 +08:00
Cai Yudong
7385770014
Upgrade to knowhere-v1.1.12 ( #17692 )
...
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-06-24 10:34:18 +08:00
Cai Yudong
e78269f450
Optimize search related interface in segcore ( #17568 )
...
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-06-23 18:16:13 +08:00
xige-16
36ad989590
Fix segOffset grater than insert barrier when mark delete ( #17444 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-06-10 20:02:08 +08:00
xige-16
56778787be
Reverse data from scalar index ( #17145 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-05-26 14:58:01 +08:00
xige-16
08ad77c71b
Delete all repeated primary keys ( #16863 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
2022-05-12 14:09:53 +08:00
Letian Jiang
b3eb2b1d0d
Support deltaLog loading on growing segment ( #16903 )
...
Signed-off-by: Letian Jiang <letian.jiang@zilliz.com>
2022-05-12 11:59:53 +08:00
xige-16
515d0369de
Support string type in segcore ( #16546 )
...
Signed-off-by: xige-16 <xi.ge@zilliz.com>
Co-authored-by: dragondriver <jiquan.long@zilliz.com>
2022-04-29 13:35:49 +08:00
zhenshan.cao
58ea38142f
Use boost dynamic_bitset in segcore ( #16476 )
...
Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2022-04-14 22:37:34 +08:00
Cai Yudong
54b8b24151
Rename variable names for better readibility ( #15700 )
...
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2022-02-22 22:15:52 +08:00
jaime
307a8ce535
Support compile and run on Mac ( #15491 )
...
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: Jenny Li <jing.li@zilliz.com>
Co-authored-by: Nemo <yuchen.gao@zilliz.com>
Signed-off-by: yun.zhang <yun.zhang@zilliz.com>
2022-02-09 14:27:46 +08:00
bigsheeper
ebed1a68ff
Add log for segcore search ( #15159 )
...
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2022-01-11 18:07:34 +08:00
Cai Yudong
3aca73969f
Optimize segcore API arrangement ( #12135 )
...
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-11-19 17:23:12 +08:00
Cai Yudong
e920b6d6ff
Reorder header files for segcore/SegmentGrowingImpl.h ( #11830 )
...
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
2021-11-16 10:33:16 +08:00
yukun
0304a8014b
Support delete in query ( #10452 )
...
Signed-off-by: fishpenguin <kun.yu@zilliz.com>
2021-10-22 20:05:12 +08:00