issue: #37156
1. Still need to record the current stats version.
2. Set it to 0 when the current stats version is not found.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Also remove conflit check when executing L0. The exclusive is already
guarenteed in scheduler
See also: #37140
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Timeout is a bad design for long running tasks, especially using a
static timeout config. We should monitor execution progress and fail the
task if the progress has been stale for a long time.
This pr is a small patch to stop DC from marking compaction tasks
timeout, while still waiting for DN to finish. The design is
self-conflicted. After this pr, mix and L0 compaction are no longer
controlled by DC timeout, but clustering is still under timeout control.
The compaction queue capacity grows larger for priority calc, hence
timeout compactions appears more often, and when timeout, the queuing
tasks will be timeout too, no compaction will success after.
See also: #37108, #37015
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #36621
1. Add API to access task runtime metrics, including:
- build index task
- compaction task
- import task
- balance (including load/release of segments/channels and some leader
tasks on querycoord)
- sync task
2. Add a debug model to the webpage by using debug=true or debug=false
in the URL query parameters to enable or disable debug mode.
Signed-off-by: jaime <yun.zhang@zilliz.com>
Related to #35303
This PR add metrics for querynode delegator delete buffer information,
which is related to dml quota logic.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #36686
This pr will remove pre-marking segments as L2 during clustering
compaction in version 2.5, and ensure compatibility with version 2.4.
The core of this change is to **ensure that the many-to-many lineage
derivation logic is correct, making sure that both the parent and child
cannot simultaneously exist in the target segment view.**
feature:
- Clustering compaction no longer marks the input segments as L2.
- Add a new field `is_invisible` to `segmentInfo`, and mark segments
that have completed clustering but have not yet built indexes as
`is_invisible` to prevent them from being loaded prematurely."
- Do not mark the input segment as `Dropped` before the clustering
compaction is completed.
- After compaction fails, only the result segment needs to be marked as
Dropped.
compatibility:
- If the upgraded task has not failed, there are no compatibility
issues.
- If the status after the upgrade is `MetaSaved`, then skip the stats
task based on whether TmpSegments is empty.
- If the failure occurs before `MetaSaved`:
- there are no ResultSegments, and InputSegments have not been marked as
dropped yet.
- the level of input segments need to revert to LastLevel
- If the failure occurs after `MetaSaved`:
- ResultSegments have already been generated, and InputSegments have
been marked as Dropped. At this point, simply make the ResultSegments
visible.
- the level of ResultSegments needs to be set to L1(in order to
participate in mixCompaction)
---------
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #36686
bug reason:
- The clustering compaction tasks on the datanode were never cleaned up.
- The clustering compaction task contains a mapping from clustering key
to buffer, this caused a large memory leak.
fix:
- clean the tasks on datanode by datacoord when clustering compaction
finished.
- reset the mapping that from clustering key to buffer on datanode when
clustering finished.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
issue: #34553
when rootcoord trigger graceful stop progress, it will block until all
rpc finished. for create collection request, rootcoord need to block
until datacoord finish to watch all channels, but datacoord need to call
`rootcoord.Alloc` during watch channel, and rootcoord doesn't respond to
new request anymore. which cause create collection stucks, and graceful
stop progress stucks.
This PR remove the func call `rootcoord.Alloc` to solve the logic dead
lock during graceful stop progress.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #36868
if datacoord is syncing segments to datanode, and stop datacoord
happens, datacoord's stop progress will stuck until syncing segment
finished.
This PR add ctx to syncing segment, which will failed if stopping
datacoord happens.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #35922
add an enable_tokenizer param to varchar field: must be set to true so
that a varchar field can enable_match or used as input of BM25 function
---------
Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
See #36550
This PR made 2 changes:
1. Introducing a prioritization mechanism, if
`dataCoord.compaction.taskPrioritizer` is set to `level`, compaction
tasks are always executed as the priority of L0>Mix>Clustering
2. `dataCoord.compaction.maxParallelTaskNum` now controls the
parallelism of executing tasks, not the task number of queue +
executing.
---------
Signed-off-by: Ted Xu <ted.xu@zilliz.com>
1. Optimize import scheduling strategic:
a. Revise slot weights, calculating them based on the number of files
and segments for both import and pre-import tasks.
b. Ensure that the DN executes tasks in ascending order of task ID.
2. Add time cost metric and log.
issue: https://github.com/milvus-io/milvus/issues/36600,
https://github.com/milvus-io/milvus/issues/36518
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Directly add import segments from the meta, eliminating the dependency
on the segment manager.
issue: https://github.com/milvus-io/milvus/issues/34648
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Native support for Google cloud storage using the Google Cloud Storage
libraries. Authentication is performed using GCS service account
credentials JSON.
Currently, Milvus supports Google Cloud Storage using S3-compatible APIs
via the AWS SDK. This approach has the following limitations:
1. Overhead: Translating requests between S3-compatible APIs and GCS can
introduce additional overhead.
2. Compatibility Limitations: Some features of the original S3 API may
not fully translate or work as expected with GCS.
To address these limitations, This enhancement is needed.
Related Issue: #36212