Commit Graph

52 Commits (60dd55f2929263eb65ba593a4e392591131008ff)

Author SHA1 Message Date
congqixia 051bc280dd
enhance: Make dynamic load/release partition follow targets (#38059)
Related to #37849

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-12-05 16:24:40 +08:00
Zhen Ye c6dcef7b84
enhance: move segcore codes of segment into one package (#37722)
issue: #33285

- move most cgo opeartions related to search/query into segcore package
for reusing for streamingnode.
- add go unittest for segcore operations.

Signed-off-by: chyezh <chyezh@outlook.com>
2024-11-29 10:22:36 +08:00
congqixia 1ed686783f
enhance: Use `PrimaryKeys` to replace interface slice for segment delete (#37880)
Related to #35303

Reduce temporary memory usage for PK interface for segment delete.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-11-22 11:52:33 +08:00
congqixia 3106384fc4
enhance: Return deltadata for `DeleteCodec.Deserialize` (#37214)
Related to #35303 #30404

This PR change return type of `DeleteCodec.Deserialize` from
`storage.DeleteData` to `DeltaData`, which
reduces the memory usage of interface header.

Also refine `storage.DeltaData` methods to make it easier to usage.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-29 12:04:24 +08:00
congqixia 7774b7275e
enhance: Replace PrimaryKey slice with PrimaryKeys saving memory (#37127)
Related to #35303

Slice of `storage.PrimaryKey` will have extra interface cost for each
element, which may cause notable memory usage when delta row count
number is large.

This PR replaces PrimaryKey slice with PrimaryKeys interface saving the
extra interface cost.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-10-28 10:29:30 +08:00
aoiasd db34572c56
feat: support load and query with bm25 metric (#36071)
relate: https://github.com/milvus-io/milvus/issues/35853

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-10-11 10:23:20 +08:00
congqixia f87af9bc54
enhance: Exclude L0 segment from readable snapshot (#35507)
L0 segments now do not contain insert data and may cause confusion for
query hook optimizer if counted as sealed segment number.

This PR add segment level flag in segment entry and exclude L0 segments
while get readable segment snaphsot

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-08-16 15:28:53 +08:00
congqixia de8a266d8a
enhance: Enable linux code checker (#35084)
See also #34483

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-07-30 15:53:51 +08:00
jaime 3b62138c5c
fix: unstable UT for level0 deletion (#34524)
issue: #34533

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-11 10:02:56 +08:00
cqy123456 32f685ff12
enhance: growing segment support mmap (#32633)
issue: https://github.com/milvus-io/milvus/issues/32984

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 14:42:00 +08:00
wei liu ab93d9c23d
enhance: Use BatchPkExist to reduce bloom filter func call cost (#33611)
issue:#33610

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-13 17:57:56 +08:00
wayblink a1232fafda
feat: Major compaction (#33620)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-06-10 21:34:08 +08:00
wei liu c6a1c49e02
enhance: Use Blocked Bloom Filter instead of basic bloom fitler impl. (#33405)
issue: #32995
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with old version bf impl, but if fall back
to old milvus version, it may causes bloom filter deserialize failed.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.

- Block BF construct time	{"time": "54.128131ms"}
- Block BF size	                {"size": 3021578}
- Block BF Test cost	        {"time": "55.407352ms"}
- Basic BF construct time	{"time": "210.262183ms"}
- Basic BF size	                {"size": 2396308}
- Basic BF Test cost	        {"time": "192.596229ms"}

In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.

- Block BF TestLocation cost    {"time": "529.97183ms"}
- Basic BF TestLocation cost	{"time": "3.197430181s"}

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-31 17:49:45 +08:00
SimFG cb99e3db34
enhance: add the includeCurrentMsg param for the Seek method (#33326)
/kind improvement
- issue: #33325

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-05-27 10:31:41 +08:00
Xiaofan 3d105fcb4d
enhance: Remove l0 delete cache (#32990)
fix #32979
remove l0 cache and build delete pk and ts everytime. this reduce the
memory and also increase the code readability

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-05-21 22:53:40 +08:00
wei liu 5038036ece
enhance: Reuse hash locations during access bloom fitler (#32642)
issue: #32530 

when try to match segment bloom filter with pk, we can reuse the hash
locations. This PR maintain the max hash Func, and compute hash location
once for all segment, reuse hash location can speed up bf access

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-07 06:13:47 -07:00
congqixia 40728ce83d
enhance: Add `metautil.Channel` to convert string compare to int (#32749)
See also #32748

This PR:

- Add `metautil.Channel` utiltiy which convert virtual name to physical
channel name, collectionID and shard idx
- Add channel mapper interface & implementation to convert limited
physical channel name into int index
- Apply `metautil.Channel` filter in querynode segment manager logic

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-07 19:13:35 +08:00
Bingyi Sun fecd9c21ba
feat: LRU cache implementation (#32567)
issue: https://github.com/milvus-io/milvus/issues/32783
This pr is the implementation of lru cache on branch lru-dev.

Signed-off-by: sunby <sunbingyi1992@gmail.com>
Co-authored-by: chyezh <chyezh@outlook.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
Co-authored-by: Ted Xu <ted.xu@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: wayblink <anyang.wang@zilliz.com>
2024-05-06 20:29:30 +08:00
wei liu df208d538c
fix: Check exclude segment before add new growing segment (#31803)
issue: #31479 #31797

milvus will add released segment to excluded info, and filter out it's
stream data in filter_node. but for data buffered in insert_node's
channel, if it belongs to growing segment which already be released,
then it will all the growing segment back again.

This PR maintain `excluded segments` in delegator, and check excluded
segment before new growing segment.

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-04-10 15:29:17 +08:00
aoiasd 5b693c466d
fix: delegator filter out all partition's delete msg when loading segment (#31585)
May cause deleted data queryable a period of time.
relate: https://github.com/milvus-io/milvus/issues/31484
https://github.com/milvus-io/milvus/issues/31548

---------

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-04-09 15:21:24 +08:00
Chun Han c3264ca3e3
feat: support segment pruner (#31003)
related: #30376
2024-03-22 13:57:06 +08:00
congqixia a647b84f3e
enhance: Add AllPartitionsID const to replace InvalidPartitionID (#31438)
"-1" as `InvalidPartitionID` previously used as All partition place
holder in delete cases. It's confusing and hard to maintain when a const
var has more than one meaning.

This PR add `AllPartitionsID` to replace these usages in delete
scenarios.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-20 19:01:05 +08:00
Xiaofan a63b4cedcf
fix: remove some unnecessary unrecoverable errors (#31327)
use retry.handle when request is not able to service but don't throw
unrecoverable erros
fix #31323

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-03-20 11:35:07 +08:00
yah01 a0cec4047a
fix: make the entity num metric accurate (#29643)
fix #29642

Signed-off-by: yah01 <yang.cen@zilliz.com>
2024-01-05 18:24:47 +08:00
congqixia da7c3cbd88
enhance: make delegator delete buffer holding all delete from cp (#29626)
See also #29625

This PR:
- Add a new implemention of `DeleteBuffer`: listDeleteBuffer
  - holds cacheBlock slice
  - `Put` method append new delete data into last block
  - when a block is full, append a new block into the list
- Add `TryDiscard` method for `DeleteBuffer` interface
  - For doubleCacheBuffer, do nothing
- For listDeleteBuffer, try to evict "old" blocks, which are blocks
before the first block whose start ts is behind provided ts
- Add checkpoint field for `UpdateVersion` sync action, which shall be
used to discard old cache delete block

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-04 17:02:46 +08:00
MrPresent-Han ed644983e2
enhance: add param for bloomfilter(#29388) (#29490)
related: #29388

Signed-off-by: MrPresent-Han <chun.han@zilliz.com>
2023-12-28 18:10:46 +08:00
congqixia b251c3a682
enhance: add ctx for HandleCStatus and callers (#29517)
See also #29516

Make `HandleCStatus` print trace id for better logging

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-27 16:10:47 +08:00
congqixia 1eacdc591b
fix: delegator may mark segment offline by mistake (#29343)
See also #29332

The segment may be released before or during the request when delegator
tries to forward delete request to yet. Currently, these two situation
returns different error code.

In this particular case, `ErrSegmentNotLoaded` and `ErrSegmentNotFound`
shall both be ignored preventing return search service unavailable by
mistake.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-12-20 21:22:43 +08:00
yah01 9b3e06ae86
enhance: add more metrics for level zero segments (#29029)
- Add SegmentNum metric for level zero segments
- Add level zero segments size metirc

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-12-07 14:34:35 +08:00
Gao 3e77365de5
fix: correct autoindex segment num (#28387)
Fix #28386 
Current code snippet 
```
// get delegator
sd, ok := node.delegators.Get(channel)
if !ok {
err := merr.WrapErrChannelNotFound(channel)
log.Warn("Query failed, failed to get shard delegator for search", zap.Error(err))
return nil, err
}
req, err = node.optimizeSearchParams(ctx, req, sd)
if err != nil {
log.Warn("failed to optimize search params", zap.Error(err))
return nil, err
}
// do search
results, err := sd.Search(searchCtx, req)
```

We could move these into `ShardDelegator`, and directly use sealed
segment num in `Search` methods, also segment num got outside could be
wrong when we specify partitions.

Signed-off-by: chasingegg <chao.gao@zilliz.com>
2023-11-22 11:12:22 +08:00
yah01 cc952e0486
enhance: optimize forwarding level0 deletions by respecting partition (#28456)
- Cache the level 0 deletions after loading level0 segments
- Divide the level 0 deletions by partition
related: #27349

---------

Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-21 18:24:22 +08:00
yah01 d20ea061d6
Fix panic while forwarding empty deletions to growing segment (#28213)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-08 16:42:21 +08:00
yah01 ece592a42f
Deliver L0 segments delete records (#27722)
Signed-off-by: yah01 <yah2er0ne@outlook.com>
2023-11-07 01:44:18 +08:00
wei liu ecec5dfcfd
fix retry on offline node (#28079)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-11-03 10:14:16 +08:00
congqixia 13877a07ff
Add ctx parameter for tsafe pkg & NewDelegator method (#27877)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-10-26 19:14:10 +08:00
XuanYang-cn 7f1ae35e72
Add timeout in dispatcher, AsConsumer and Seek (#26686)
See also: #25309

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2023-09-08 09:51:17 +08:00
congqixia 4b58c71908
Add ctx parameter for organizeTask and GetWorker method (#26835)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-09-05 10:05:48 +08:00
congqixia 89fc9aad82
Improve sync target version logic (#26630)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-08-29 23:12:27 +08:00
wei liu 23baecd70f
set sealed segment to unreadable before sync target version (#26338)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-08-15 17:27:35 +08:00
wei liu fc19b85a40
fix count(*)retrieve redundant growing segment (#25825)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-07-24 14:09:00 +08:00
Cai Yudong 9a4761dcc7
Remove binary metrics TANIMOTO/SUPERSTRUCTURE/SUBSTRUCTURE (#25708)
Signed-off-by: Yudong Cai <yudong.cai@zilliz.com>
2023-07-19 16:16:58 +08:00
yah01 d216f9abda
Clear collection meta after all channels/segments released (#25486)
Signed-off-by: yah01 <yang.cen@zilliz.com>
2023-07-14 10:28:30 +08:00
SimFG f9e2d00f91
Prevent `exclusive consumer` exception in pulsar (#25376)
Signed-off-by: SimFG <bang.fu@zilliz.com>
2023-07-12 17:26:30 +08:00
wei liu 68ae199a9f
load segment with target version, avoid read redundant segment (#24929)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-06-27 11:48:45 +08:00
congqixia 41af0a98fa
Use go-api/v2 for milvus-proto (#24770)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-06-09 01:28:37 +08:00
congqixia 73a181d226
Fix get vector it timeout and improve some string const usage (#24141)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2023-05-16 17:41:22 +08:00
yihao.dai 7e0d1492c7
Load delete from channel checkpoint (#23961)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2023-05-09 19:10:41 +08:00
foxspy 6f4ed517de
add growing segment index (#23615)
Signed-off-by: xianliang <xianliang.li@zilliz.com>
2023-04-26 10:14:41 +08:00
wei liu 3ad9ff7a9a
fix release segment failed (#23462)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2023-04-17 18:46:30 +08:00
jaime c9d0c157ec
Move some modules from internal to public package (#22572)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2023-04-06 19:14:32 +08:00