Related to #35303
A slice of `storage.PrimaryKey` incurs an extra interface cost for each
element, which may cause notable memory usage when the delta row count
is large.
This PR replaces the PrimaryKey slice with a PrimaryKeys interface, saving
the extra interface cost.
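
To illustrate the cost difference, here is a minimal Go sketch. The types
below (`PrimaryKey`, `PrimaryKeys`, `Int64PrimaryKeys`) are hypothetical
stand-ins written for this note, not the actual definitions in the `storage`
package. The idea is that one interface value wraps a whole typed slice
instead of one interface value per element.

```go
package main

import "unsafe"

// PrimaryKey is a hypothetical per-element interface (stand-in only).
type PrimaryKey interface {
	Value() int64
}

// PrimaryKeys is a hypothetical container-level interface: a single
// interface value covers the whole typed slice.
type PrimaryKeys interface {
	Len() int
	At(i int) int64
	Append(v int64)
}

// Int64PrimaryKeys stores keys in a plain []int64 with no per-element boxing.
type Int64PrimaryKeys struct{ values []int64 }

func (p *Int64PrimaryKeys) Len() int       { return len(p.values) }
func (p *Int64PrimaryKeys) At(i int) int64 { return p.values[i] }
func (p *Int64PrimaryKeys) Append(v int64) { p.values = append(p.values, v) }

// slotSizes returns the size of one element slot in a []PrimaryKey and in a
// []int64. An interface value is two machine words, and that is before
// counting the boxed value it points to.
func slotSizes() (ifaceSlot, rawSlot uintptr) {
	var pk PrimaryKey
	var raw int64
	return unsafe.Sizeof(pk), unsafe.Sizeof(raw)
}
```

On a 64-bit platform `slotSizes()` reports 16 bytes per interface slot versus
8 bytes per raw `int64`, so the typed container at least halves the slice
footprint in this sketch.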
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #36621
1. Add API to access task runtime metrics, including:
- build index task
- compaction task
- import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
- sync task
2. Add a debug mode to the webpage; set debug=true or debug=false
in the URL query parameters to enable or disable it.
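
A minimal Go sketch of reading such a flag from a URL. The helper name
`debugEnabled` and the fallback behavior (a missing or malformed value leaves
debug off) are assumptions for illustration; the webpage itself may parse the
parameter differently.

```go
package main

import (
	"net/url"
	"strconv"
)

// debugEnabled reports whether rawURL carries a truthy debug query parameter.
// Assumption: a missing or unparsable value means debug mode stays off.
func debugEnabled(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	on, err := strconv.ParseBool(u.Query().Get("debug"))
	return err == nil && on
}
```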
Signed-off-by: jaime <yun.zhang@zilliz.com>
Related to #37112
The skip-load logic used to work only when there were multiple segment load
info entries in the load request. In the continuous delete case, the
delegator still loaded the l0 segment, which occupies a lot of memory.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
OSPP 2024 project:
https://summer-ospp.ac.cn/org/prodetail/247410235?list=org&navpage=org
Solutions:
- parser (planparserv2)
  - add CallExpr in planparserv2/Plan.g4
  - update parser_visitor and show_visitor
- grpc protobuf
  - add CallExpr in plan.proto
- execution (`core/src/exec`)
  - add `CallExpr`, `ValueExpr` and `ColumnExpr` (both logical and
    physical) for function call and function parameters
- function factory (`core/src/exec/expression/function`)
  - create a global hashmap when starting milvus (see server.go)
  - the global hashmap stores function signatures and their function
    pointers; the CallExpr in the execution engine can get the function
    pointer by function signature.
- custom functions
  - empty(string)
  - starts_with(string, string)
- add cpp/go unittests and E2E tests
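
The function factory idea above (a global map from function signature to
function pointer) can be sketched in Go as follows. The real factory lives in
the C++ execution engine; every name here (`ScalarFunc`, `registry`,
`register`, `lookup`) and the signature string format are assumptions made
for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// ScalarFunc is a hypothetical signature for a registered function.
type ScalarFunc func(args ...string) (bool, error)

// registry maps a signature such as "starts_with(string,string)" to its
// implementation. It is filled once at startup.
var registry = map[string]ScalarFunc{}

// signature builds the lookup key from a name and its argument types.
func signature(name string, argTypes ...string) string {
	return name + "(" + strings.Join(argTypes, ",") + ")"
}

func register(name string, argTypes []string, fn ScalarFunc) {
	registry[signature(name, argTypes...)] = fn
}

// lookup resolves a call expression to a function by its signature.
func lookup(name string, argTypes ...string) (ScalarFunc, error) {
	key := signature(name, argTypes...)
	fn, ok := registry[key]
	if !ok {
		return nil, fmt.Errorf("function %s not found", key)
	}
	return fn, nil
}

func init() {
	register("empty", []string{"string"}, func(args ...string) (bool, error) {
		if len(args) != 1 {
			return false, fmt.Errorf("empty expects 1 argument, got %d", len(args))
		}
		return args[0] == "", nil
	})
	register("starts_with", []string{"string", "string"}, func(args ...string) (bool, error) {
		if len(args) != 2 {
			return false, fmt.Errorf("starts_with expects 2 arguments, got %d", len(args))
		}
		return strings.HasPrefix(args[0], args[1]), nil
	})
}
```

Keying on the full signature rather than the bare name lets a lookup with the
wrong argument types fail before execution instead of inside the function.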
closes: #36559
Signed-off-by: Yinzuo Jiang <jiangyinzuo@foxmail.com>
issue: https://github.com/milvus-io/milvus/issues/37083
We used a vector of string_view to hold data temporarily, but the real
string data is released after the record batch is destructed.
Change it to a vector of string to avoid memory corruption.
---------
Signed-off-by: sunby <sunbingyi1992@gmail.com>
Related to #35303
Delta data is not needed when using the `RemoteLoad` l0 forward policy. By
skipping the delta data load, memory pressure can be eased when the l0
segment size/number is large.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #37054
After querycoord restarts, segment_checker may release segments by mistake
because the next target isn't ready yet.
This PR requires that segment release happens only after the next target is
ready.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Related to #35303
This PR adds metrics for querynode delegator delete buffer information,
which is related to the dml quota logic.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #36887
DirectForward streaming delete causes memory usage to explode if the
number of segments is large. This PR adds a batching delete API and uses it
for the direct forward implementation.
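
The message does not spell out how batches are formed, so the following is
only a generic Go sketch of bounding the size of each forwarded request. The
function `forwardInBatches`, its `send` callback, and the fixed batch size
are all hypothetical and not the API added by this PR.

```go
package main

// forwardInBatches hands pks to send in consecutive chunks of at most
// batchSize entries, so a single request never carries the whole delete set.
// Assumption: a non-positive batchSize means "send everything in one batch".
func forwardInBatches(pks []int64, batchSize int, send func(batch []int64) error) error {
	if batchSize <= 0 {
		batchSize = len(pks)
	}
	for start := 0; start < len(pks); start += batchSize {
		end := start + batchSize
		if end > len(pks) {
			end = len(pks)
		}
		// Stop at the first failed batch and report it to the caller.
		if err := send(pks[start:end]); err != nil {
			return err
		}
	}
	return nil
}
```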
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #33550
Because of a wrong implementation of UpdateCollectionNextTarget, if
ReleaseCollection and UpdateCollectionNextTarget happen at the same time,
the released partition's segment list may be added to the target again, and
the delegator will be marked as unserviceable due to the missing segments.
This PR fixes the implementation of UpdateCollectionNextTarget.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #35576
This PR covers the cases where queryHook optimizes the search params
and makes the result size insufficient: it adds a search retry mechanism
and related metrics for alerting.
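
A minimal Go sketch of such a retry, assuming a single fallback attempt with
the original (non-optimized) params. The names `searchFn` and
`searchWithRetry` are hypothetical, and the actual retry policy in the PR may
differ. The `retried` flag is where a metric counter could be incremented.

```go
package main

// searchFn is a hypothetical search call; optimized selects whether the
// hook-optimized params are used.
type searchFn func(optimized bool) ([]int64, error)

// searchWithRetry runs the optimized search first and retries once with the
// original params when fewer than topK results come back.
func searchWithRetry(search searchFn, topK int) (ids []int64, retried bool, err error) {
	ids, err = search(true)
	if err != nil {
		return nil, false, err
	}
	if len(ids) >= topK {
		return ids, false, nil
	}
	// Result size is insufficient: fall back to the non-optimized params.
	ids, err = search(false)
	if err != nil {
		return nil, true, err
	}
	return ids, true, nil
}
```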
---------
Signed-off-by: chasingegg <chao.gao@zilliz.com>
issue: #36686
This pr will remove pre-marking segments as L2 during clustering
compaction in version 2.5, and ensure compatibility with version 2.4.
The core of this change is to **ensure that the many-to-many lineage
derivation logic is correct, making sure that both the parent and child
cannot simultaneously exist in the target segment view.**
feature:
- Clustering compaction no longer marks the input segments as L2.
- Add a new field `is_invisible` to `segmentInfo`, and mark segments
that have completed clustering but have not yet built indexes as
`is_invisible` to prevent them from being loaded prematurely.
- Do not mark the input segment as `Dropped` before the clustering
compaction is completed.
- After compaction fails, only the result segment needs to be marked as
Dropped.
compatibility:
- If the upgraded task has not failed, there are no compatibility
issues.
- If the status after the upgrade is `MetaSaved`, then skip the stats
task based on whether TmpSegments is empty.
- If the failure occurs before `MetaSaved`:
  - there are no ResultSegments, and InputSegments have not been marked as
    dropped yet.
  - the level of input segments needs to revert to LastLevel
- If the failure occurs after `MetaSaved`:
  - ResultSegments have already been generated, and InputSegments have
    been marked as Dropped. At this point, simply make the ResultSegments
    visible.
  - the level of ResultSegments needs to be set to L1 (in order to
    participate in mixCompaction)
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue #34117
* Refactoring
* Added a capability to perform multiple bitwise `and` and `or`
operations in a single op
* AVX2, AVX512, ARM NEON, ARM SVE backed bitwise `and`, `or`, `xor` and
`sub` ops
* more unit tests for bitset
* fixed a bug in `or_with_count` for certain bitset sizes
* fixed a bug for certain offset values for inplace operations that take
two bitsets
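
For reference, a scalar Go sketch of two of the operations named above: an
`or` that also returns the population count, and several `and` operands
applied in a single pass. The real implementation is C++ with SIMD backends.
The function names mirror the text, but the signatures and the handling of
shorter operands are assumptions.

```go
package main

import "math/bits"

// orWithCount ORs src into dst word by word and returns the number of set
// bits in the resulting dst. Words of dst beyond len(src) are left unchanged.
func orWithCount(dst, src []uint64) int {
	count := 0
	for i := range dst {
		if i < len(src) {
			dst[i] |= src[i]
		}
		count += bits.OnesCount64(dst[i])
	}
	return count
}

// andMany ANDs every operand into dst in one pass over the words, instead of
// one full pass per operand. Assumption: a missing word in a shorter operand
// is treated as zero.
func andMany(dst []uint64, operands ...[]uint64) {
	for i := range dst {
		w := dst[i]
		for _, op := range operands {
			if i < len(op) {
				w &= op[i]
			} else {
				w = 0
			}
		}
		dst[i] = w
	}
}
```

Fusing the operands into one pass touches each word of `dst` once, which is
the main motivation for a multi-operand op in this sketch.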
Signed-off-by: Alexandr Guzhva <alexanderguzhva@gmail.com>
issue: #36686
bug reason:
- The clustering compaction tasks on the datanode were never cleaned up.
- The clustering compaction task contains a mapping from clustering key
to buffer, which caused a large memory leak.
fix:
- clean up the tasks on the datanode from datacoord when clustering
compaction finishes.
- reset the mapping from clustering key to buffer on the datanode when
clustering finishes.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #34553
When rootcoord triggers the graceful stop process, it blocks until all
RPCs have finished. For a create collection request, rootcoord needs to
block until datacoord finishes watching all channels, but datacoord needs to
call `rootcoord.Alloc` while watching channels, and rootcoord no longer
responds to new requests. This causes create collection to get stuck, and
the graceful stop process gets stuck with it.
This PR removes the `rootcoord.Alloc` call to resolve the logical deadlock
during the graceful stop process.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #36868
If datacoord is syncing segments to a datanode when datacoord is stopped,
the stop process is stuck until the segment sync finishes.
This PR adds a ctx to segment syncing, so the sync fails if datacoord is
stopping.
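
A minimal Go sketch of the pattern, assuming the sync loop checks the context
between segments. `syncSegments`, its `syncOne` callback, and `demoStop` are
hypothetical names for illustration, not the datacoord code.

```go
package main

import (
	"context"
	"errors"
)

// syncSegments calls syncOne for each segment and returns early with the
// context error once ctx is cancelled, instead of finishing the whole list.
func syncSegments(ctx context.Context, segmentIDs []int64, syncOne func(context.Context, int64) error) error {
	for _, id := range segmentIDs {
		if err := ctx.Err(); err != nil {
			return err
		}
		if err := syncOne(ctx, id); err != nil {
			return err
		}
	}
	return nil
}

// demoStop syncs five hypothetical segments and cancels the context while
// the second one is being handled, simulating a stop request.
func demoStop() (synced int, cancelled bool) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	err := syncSegments(ctx, []int64{1, 2, 3, 4, 5}, func(_ context.Context, id int64) error {
		synced++
		if id == 2 {
			cancel() // the stop request arrives here
		}
		return nil
	})
	return synced, errors.Is(err, context.Canceled)
}
```

In `demoStop` only two segments are synced before the loop observes the
cancellation, so a stop request no longer waits for the remaining three.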
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/36835
Currently, searching a BM25 output field using IP ends in an error in
segcore that is hard to understand. Now the query node delegator returns
the error and provides a more useful error message.
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
Related to #36887
Forwarding delete to an L0 segment would return an error and mark the l0
segment offline, making the delegator unserviceable.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Start position & level info are missing for growing segments loaded in the
watch dml channel operation.
Level is important for metrics, and start position is crucial for the
growing exclude logic.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
In quota center, ignore the "DB not found error" to prevent it from
affecting the rate limiting of other databases.
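
A minimal Go sketch of the intended behavior, assuming the missing database
is reported through a sentinel error. `errDatabaseNotFound` and `applyLimits`
are hypothetical names and do not reflect the quota center's actual types.

```go
package main

import (
	"errors"
	"fmt"
)

// errDatabaseNotFound is a hypothetical sentinel for a dropped database.
var errDatabaseNotFound = errors.New("database not found")

// applyLimits applies setLimit to every database. A database that no longer
// exists is skipped so the remaining databases still get their limits; any
// other error aborts the loop.
func applyLimits(dbIDs []int64, setLimit func(dbID int64) error) ([]int64, error) {
	var applied []int64
	for _, id := range dbIDs {
		if err := setLimit(id); err != nil {
			if errors.Is(err, errDatabaseNotFound) {
				continue
			}
			return applied, fmt.Errorf("set limit for db %d: %w", id, err)
		}
		applied = append(applied, id)
	}
	return applied, nil
}
```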
/kind improvement
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
1. support isClusteringKey in the restful api;
2. throw an error if invalid 'enableDynamicField' params are passed;
3. parameters in indexparams were not processed properly, related to
#36365
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/35853
* BM25 Function now takes no params; k1 and b should be passed via index
params
* support BM25 full text search when metric type is not present in
search request
* add stricter validation of functions at collection creation time
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
This PR splits sealed segments into chunked data to avoid unnecessary
memory copies and reduce memory usage when loading segments, so that
loading can be accelerated.
To support rollback to the previous version, we add an option
`multipleChunkedEnable`, which is false by default.
Signed-off-by: sunby <sunbingyi1992@gmail.com>