See also #31143
This PR adds a shortcut for the datanode metacache `WithSegmentIDs` filter,
which fetches segments directly from the map using the provided segmentIDs.
It also adds a benchmark comparing the new implementation with the old one.
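
A minimal sketch of the idea, assuming a simplified cache keyed by segment ID. `SegmentInfo`, `filterByScan` and `filterByLookup` are hypothetical names for illustration, not the metacache API:

```go
package main

import "fmt"

// SegmentInfo is a hypothetical stand-in for a metacache entry.
type SegmentInfo struct {
	ID      int64
	NumRows int64
}

// filterByScan is the slow path: visit every cached segment and test
// whether its ID is one of the requested IDs.
func filterByScan(cache map[int64]*SegmentInfo, ids []int64) []*SegmentInfo {
	want := make(map[int64]struct{}, len(ids))
	for _, id := range ids {
		want[id] = struct{}{}
	}
	var out []*SegmentInfo
	for id, seg := range cache {
		if _, ok := want[id]; ok {
			out = append(out, seg)
		}
	}
	return out
}

// filterByLookup is the shortcut: one map lookup per requested ID,
// independent of how many segments the cache holds.
func filterByLookup(cache map[int64]*SegmentInfo, ids []int64) []*SegmentInfo {
	out := make([]*SegmentInfo, 0, len(ids))
	for _, id := range ids {
		if seg, ok := cache[id]; ok {
			out = append(out, seg)
		}
	}
	return out
}

func main() {
	cache := map[int64]*SegmentInfo{1: {ID: 1}, 2: {ID: 2}, 3: {ID: 3}}
	ids := []int64{2, 4} // 4 is not cached
	fmt.Println(len(filterByScan(cache, ids)), len(filterByLookup(cache, ids)))
}
```

The lookup path costs O(len(ids)) instead of O(len(cache)), which matters when a few segments are requested from a large cache.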
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Replace the current import API v1 implementation with the v2
implementation.
issue: https://github.com/milvus-io/milvus/issues/28521
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
This PR adds metrics for load segment progress:
1. add metrics for load segment/index concurrency
2. add metrics for load index latency
3. change load segment latency's time unit to ms
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30890
When the leader checker finds that the leader view has an older load
version of a segment, it tries to correct the leader view. But the sync
action doesn't specify the latest load version, so the update operation
fails.
This PR fixes the leader checker being unable to update a segment's load
version and repeatedly generating the same task for the scheduler.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #31125
The delegator shall build its level zero delete cache from the l0 segments
that belong to it. Previously it built the cache from all existing level
zero segments in the querynode, which may lead to high memory usage and
even panics when pk types do not match.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #31103
Since querycoord needs only index meta information from datacoord, the
broker shall use `ListIndexes` to skip the segment index building check
logic in datacoord.
This PR is also related to #30538, in which DescribeIndex caused heavy
memory usage and eventually led to OOM.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.
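
The first two adjustments can be sketched as follows. This is a simplified, single-goroutine illustration: `checkpoint`, `cpUpdater`, `Submit` and `Drain` are hypothetical names, and the real updater also needs locking and the RPC call itself:

```go
package main

import (
	"fmt"
	"sort"
)

// checkpoint is a hypothetical, simplified channel checkpoint.
type checkpoint struct {
	VChannel  string
	Timestamp uint64
}

// cpUpdater keeps at most one pending checkpoint per vchannel, so a slow
// consumer never accumulates a backlog of stale updates.
// Not safe for concurrent use; a real updater would guard pending with a lock.
type cpUpdater struct {
	pending map[string]checkpoint
}

func newCPUpdater() *cpUpdater {
	return &cpUpdater{pending: make(map[string]checkpoint)}
}

// Submit retains only the newest checkpoint for each vchannel.
func (u *cpUpdater) Submit(cp checkpoint) {
	if old, ok := u.pending[cp.VChannel]; ok && old.Timestamp >= cp.Timestamp {
		return
	}
	u.pending[cp.VChannel] = cp
}

// Drain clears the pending set and returns it split into batches of at most
// batchSize entries; each batch would become one UpdateChannelCheckpoint RPC.
func (u *cpUpdater) Drain(batchSize int) [][]checkpoint {
	if batchSize <= 0 {
		batchSize = 128 // default batch size mentioned in the description
	}
	all := make([]checkpoint, 0, len(u.pending))
	for _, cp := range u.pending {
		all = append(all, cp)
	}
	// Sort only to make the output deterministic for this example.
	sort.Slice(all, func(i, j int) bool { return all[i].VChannel < all[j].VChannel })
	u.pending = make(map[string]checkpoint)

	var batches [][]checkpoint
	for start := 0; start < len(all); start += batchSize {
		end := start + batchSize
		if end > len(all) {
			end = len(all)
		}
		batches = append(batches, all[start:end])
	}
	return batches
}

func main() {
	u := newCPUpdater()
	u.Submit(checkpoint{VChannel: "ch-1", Timestamp: 10})
	u.Submit(checkpoint{VChannel: "ch-1", Timestamp: 20}) // replaces ts=10
	u.Submit(checkpoint{VChannel: "ch-2", Timestamp: 15})
	for _, batch := range u.Drain(128) {
		fmt.Println(len(batch), batch)
	}
}
```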
issue: https://github.com/milvus-io/milvus/issues/30004
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
See also #31103
This PR adds a `listIndexes` API to the datacoord server to list all
indexes for the provided collection.
Compared to the existing `DescribeIndex` API, the new one does NOT
check the segment index building progress, which eases the burden of
invoking it.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Define FieldValue, FieldStats, PartitionStats
FieldValue is largely copied from PrimaryKey
FieldStats is largely copied from PrimaryKeyStats
PartitionStats is map[segmentid][]FieldStats
Each partition can have a PartitionStats file
/kind feature
related: #30287
related: #30633
---------
Signed-off-by: wayblink <anyang.wang@zilliz.com>
Skip partition key naming & hash value preprocessing if the collection
schema does not have a partition key.
This PR removes the misleading warning when a collection has no partition
key.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Trigger l0 compaction when l0 views don't change,
so that leftover l0 segments are compacted in the end.
1. Refresh LevelZero plans in compactionPlanHandler, remove the meta
dependency of compaction trigger v2
2. Add ForceTrigger method for CompactionView interface
3. rename mu to taskGuard
4. Add a new TriggerTypeLevelZeroViewIDLE
5. Add an idleTicker for compaction view manager
See also: #30098, #30556
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Segment load memory usage is underestimated since the load
memory factor was removed. This PR adds it back to protect querynode from
OOM in some extreme memory cases.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #20553
This PR adds retry on all interfaces which belonged to indexcoord in milvus
2.2 and moved to datacoord in milvus 2.3, to prevent hitting the
`unimplemented` error during rolling upgrade from milvus 2.2 to 2.3.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30980, #22837
2.3 pr: #30873
should use GetBFloat16Field, GetBfloat16Field rather than GetBinaryField
Signed-off-by: PowderLi <min.li@zilliz.com>
In the cache of the timeTickSender, retain only the latest stats instead
of storing stats for every time tick.
issue: https://github.com/milvus-io/milvus/issues/30967
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
This PR introduces new management roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To check whether all tasks have completed and trigger
the relevant operations.
issue: https://github.com/milvus-io/milvus/issues/28521
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #30950
The segment version doesn't update as expected.
This PR keeps updating the segment version until the segment becomes loaded.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also #30538
Previously `SelectSegments` was changed to clone all returned values,
preventing possible updates to the returned info.
Since meta is implemented following COW rules, this shall not happen, and
any update to a segment shall make a copy first.
This PR:
- Removes the clone for read-only Get segment info
- Adds a Segment Operator abstraction for changing segments
- Implements COW for updating MaxRowNum
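
A minimal sketch of the copy-on-write pattern described above. `meta`, `SegmentOperator` and `SetMaxRowNum` here are simplified, hypothetical versions for illustration; the real types carry far more state and locking:

```go
package main

import "fmt"

// SegmentInfo is a simplified, hypothetical segment record.
type SegmentInfo struct {
	ID        int64
	MaxRowNum int64
}

// SegmentOperator mutates a private copy of a segment and reports whether
// anything actually changed.
type SegmentOperator func(seg *SegmentInfo) bool

// SetMaxRowNum returns an operator that updates MaxRowNum.
func SetMaxRowNum(n int64) SegmentOperator {
	return func(seg *SegmentInfo) bool {
		if seg.MaxRowNum == n {
			return false
		}
		seg.MaxRowNum = n
		return true
	}
}

type meta struct {
	segments map[int64]*SegmentInfo
}

// Get returns the stored pointer without cloning. Callers must treat the
// result as read-only; all writes go through Update.
func (m *meta) Get(id int64) *SegmentInfo {
	return m.segments[id]
}

// Update applies the operators to a copy and swaps the copy in, so readers
// holding the old pointer never observe a partial update.
func (m *meta) Update(id int64, ops ...SegmentOperator) bool {
	old, ok := m.segments[id]
	if !ok {
		return false
	}
	clone := *old // shallow copy is enough for this flat struct
	changed := false
	for _, op := range ops {
		if op(&clone) {
			changed = true
		}
	}
	if changed {
		m.segments[id] = &clone
	}
	return changed
}

func main() {
	m := &meta{segments: map[int64]*SegmentInfo{1: {ID: 1, MaxRowNum: 100}}}
	before := m.Get(1)
	m.Update(1, SetMaxRowNum(200))
	fmt.Println(before.MaxRowNum, m.Get(1).MaxRowNum)
}
```

Because writers never touch the object a reader already holds, the read path can skip the defensive clone.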
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
If DC restarts, those unknown compaction tasks
never get a callback in DN, so the segments in the compaction
task stay locked, unable to sync or be compacted again, blocking cp
advance and compaction execution.
See also: #30137
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
1. set coordinator graceful stop timeout to 5s
2. change the stop order of datacoord components
3. change querynode graceful stop timeout to 900s; we should
potentially change this to 600s once graceful stop is smooth
issue: #30310
also see pr: #30306
---------
Signed-off-by: chyezh <chyezh@outlook.com>
Compaction copies logPaths from compactFrom segA to compactTo segB.
Previous code copied the logPath directly, leaving the full logPaths of
segA in compactTo segB's meta. So, on the next
compaction of segB, if segA has been GCed, Download reports the error
"The specified key not found".
This PR makes sure the compactTo segment's meta contains logID only. It
also refines CompleteCompactionMutation, improving
readability and merging two methods into one.
See also: #30496
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
Support getting the sdk type from the user agent when the sdk version
can't be obtained from the connection in the access log.
---------
Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
/kind improvement
This removes one copy while loading variable-length data and also
avoids constructing std::string, which could lead to memory
fragmentation.
---------
Signed-off-by: yah01 <yah2er0ne@outlook.com>
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
Co-authored-by: yah01 <yah2er0ne@outlook.com>
issue: #30150
`checkLeaderTaskStale` checks whether the segment exists in the next target
for a leaderTask's growing action, which causes the promote leader task
to fail when the segment exists only in the current target.
This PR checks the segment against both the current and next target.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
If the check is done in segcore, it is skipped when no data has been
inserted. So check "radius" and "range_filter" in the proxy.
related to #30365
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
See also #30832
This PR removes time tick delay metrics when rootcoord GetMetrics
response does not have previously existed querynode/datanode
Also add unit tests for this case
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Signed-off-by: Congqi.Xia <congqi.xia@zilliz.com>
bug list
1. element data type is needed while create a collection with an array
field #30638
2. spelling mistake about metricsType #30643
3. new collection enable dynamic field / auto id as user defined #30665
4. convert rowCount from string to int #30661
5. try its best to create a new collection #30652
6. int64 precision #20415
7. insert into collection which has multi vector fields #30674
8. cannot rename a collection to other database #30700
9. update the request body of "indexes/create" #30769
10. got [] while list indexes of a collection which has no index #30722
11. restful need encode password before call
CreateCredential/UpdateCredential #30730
12. some parameters missed the required label #30737
13. define whether the field is a partition key while creating a
collection #30797
enhance: support opentelemetry
enhance: support dataType: Float16Vector & BFloat16Vector #22837
enhance: describe collection will show the field is partition key or not
#30789
Signed-off-by: PowderLi <min.li@zilliz.com>
See also #30756
This PR:
- Request disk resource when index type, version loaded with disk
- Add attribute cache for index utility
- Add `typeutil.Pair`
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Patch the search cache param from index configs when the index meta does
not contain the search cache size key.
issue: #30113
Signed-off-by: xianliang.li <xianliang.li@zilliz.com>
issue: #30150
see also: #30258
`SyncDataDistribution` tries to load delta for the segment. If the
indexInfo is missing from the request, the sync action fails due to lack
of index info.
This PR sets indexInfo when setting a segment to the leader view.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30723
This PR skips generating balance tasks when the collection's target isn't
ready. It also refines the stale check logic in querycoord's scheduler: if
the channel exists in the current or next target, the task won't be canceled.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
issue: #30715
- Bug: Set nil struct pointer to describe nil interface.
Panic with segment violation when calling method on this nil struct
pointer.
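
The pitfall can be reproduced in a few lines. `Namer`, `conn`, `newBroken` and `newFixed` are hypothetical names chosen for this illustration:

```go
package main

import "fmt"

// Namer is a hypothetical interface used only for this illustration.
type Namer interface {
	Name() string
}

type conn struct {
	name string
}

// Name dereferences c, so calling it on a nil *conn panics.
func (c *conn) Name() string { return c.name }

// newBroken returns an interface that wraps a typed nil pointer.
// The interface value itself is NOT nil, so callers' nil checks pass
// and the later method call crashes.
func newBroken() Namer {
	var c *conn
	return c
}

// newFixed returns a true nil interface when there is nothing to return.
func newFixed() Namer {
	var c *conn
	if c == nil {
		return nil
	}
	return c
}

func main() {
	fmt.Println(newBroken() == nil) // false: typed nil inside an interface
	fmt.Println(newFixed() == nil)  // true
}
```

An interface value is nil only when both its type and value parts are nil, which is why the broken variant slips past `!= nil` guards.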
Signed-off-by: chyezh <chyezh@outlook.com>
issue: https://github.com/milvus-io/milvus/issues/30687
We store all the varchar data in a contiguous address range and use
string_view to quickly find each value. In this case, using
string_view.data() directly points to all the remaining varchar data.
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
See also #30191
It turned out that in the auto id and batch delete scenario, the actual
memory size of a deltalog may be far larger than the deltalog file size.
This PR adds a configurable expansion rate for deltalog memory usage to
prevent out-of-memory panics while loading deltalogs.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #30150
For leader view distribution with offline nodes, a release task can
never be sent to querynode due to targetNode online check logic. Even
if the request is dispatched, a normal release task does not have the
"force" flag when calling `delegator.ReleaseSegment`.
This PR adds a new type of querycoord task: LeaderTask, the
responsibility of which is to rectify leader view distribution.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
The old version of Knowhere copies the index data while loading; we
need to account for this to avoid OOM.
Knowhere provides a util function to indicate whether it will load the
index with disk; if not, we need to double the memory usage prediction
for index data.
Signed-off-by: yah01 <yang.cen@zilliz.com>
Related to #30191
When loading segment, segment loader shall check memory usage for
current loading task. Previously l0 segment was ignored but level zero
segment may actually cost lots of memory.
This PR adds back memory resource check for Level zero segment loading.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #30651
The append operator of `std::filesystem::path` replaces the whole path when
the operand of the "/" operation is an absolute path.
In "All-in-one" mode, this causes ChunkCache to remove the original
vector data file when building the chunk cache during/after the load
procedure.
This PR moves the ChunkCache path generation logic to a separate
function which checks whether the file path is absolute or not.
If the file path is absolute, it removes the root path prefix and returns
the concatenated file path.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Refine compaction interfaces in datacoord, support compaction result
with more than one segment. Prepare for major compaction.
related: #30633
Signed-off-by: wayblink <anyang.wang@zilliz.com>
1. Increase maxCount of L0 compaction tasks to 30
This could reduce the number of l0 compaction tasks by 30% for
frequently generated small l0 segments, with the maximum size of 64MB
unchanged, so that l0 segments accumulate more slowly and the memory
pressure that L0 segments put on QueryNode decreases
2. Add force Trigger for later manual timely l0 compaction triggers.
See also: #30191, #30556
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #30553
When datacoord with version 2.2 and querycoord with version 2.3 coexist
during rolling upgrade, `DescribeIndex/GetIndexInfo` returns the
`unimplemented` error.
This PR adds retry on `DescribeIndex/GetIndexInfo` to prevent load
collection failures during rolling upgrade from milvus 2.2 to 2.3.
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Add flush rate control at collection level to avoid generating too many
segments.
0.1 qps by default.
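
To illustrate what 0.1 qps per collection means in practice, here is a minimal minimum-interval limiter. It is only a sketch: `flushLimiter`, `Allow` and the helper `at` are hypothetical names, and the actual rate limiter used by Milvus may be a token bucket with different semantics:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// flushLimiter allows at most one flush per collection per interval.
type flushLimiter struct {
	mu       sync.Mutex
	interval time.Duration
	last     map[int64]time.Time
}

// newFlushLimiter converts a positive qps into a minimum interval,
// e.g. 0.1 qps means one flush every 10 seconds.
func newFlushLimiter(qps float64) *flushLimiter {
	return &flushLimiter{
		interval: time.Duration(float64(time.Second) / qps),
		last:     make(map[int64]time.Time),
	}
}

// Allow reports whether a flush for collectionID may proceed at time now.
// The clock is passed in so the behaviour is easy to test.
func (l *flushLimiter) Allow(collectionID int64, now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if prev, ok := l.last[collectionID]; ok && now.Sub(prev) < l.interval {
		return false
	}
	l.last[collectionID] = now
	return true
}

// at is a small helper that builds a timestamp from seconds.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

func main() {
	l := newFlushLimiter(0.1)
	// Collection 1 at t=0s and t=5s, collection 2 at t=5s, collection 1 at t=11s.
	fmt.Println(l.Allow(1, at(0)), l.Allow(1, at(5)), l.Allow(2, at(5)), l.Allow(1, at(11)))
	// prints: true false true true
}
```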
issue: #29477
Signed-off-by: chyezh <ye.zhen@zilliz.com>
See also #30571
When `compactionExecutor` stops a compaction task, the `stop` method
causes `injectDone` to be called.
However, in `executeTask`, when the `compact` method returns an error, it
also invokes `injectDone`. That is the reason the `Unlock of unlocked
RWMutex` panic happened.
This PR adds sync.Once to make sure that `injectDone` is called only
once. We did not remove any of the `injectDone` calls, since removing any
of those invocations may cause logic problems.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Data written through the rawkv API may pollute tikv data. It should be
disallowed.
We will add this check to all repos that involve metadata access.
In the longer term, we should have a metadata service that implements
access control.
relate: #30029
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
This PR mainly improves two items:
1. Target observer should refresh loading status during init time. An
uninitialized loading status blocks search/query. Currently, the target
observer refreshes every 10 seconds, i.e. we'd need to wait for 10s for
no reason. That's also the reason why we constantly see false log
"collection unloaded" upon mixcoord restarts.
2. Delete session when service is stopped. So that the new service
doesn't need to wait for the previous session to expire (~10s).
Item 1 is the major improvement of this PR, which should speed up init
time by 10s.
Item 2 is not a big concern in most cases as coordinators usually shut
down after stop(). In those cases, coordinator restart triggers serverID
change which further triggers an existing logic that deletes expired
session. This PR only fixes rare cases where serverID doesn't change.
integration test:
`go test -tags dynamic -v -coverprofile=profile.out -covermode=atomic
tests/integration/coordrecovery/coord_recovery_test.go -timeout=20m`
Performance after the change:
Average init time of coordinators: 10s
Hardware: M2 Pro
Test setup: 1000 collections with 1000 rows (dim=128) per collection.
issue: #29409
Signed-off-by: yiwangdr <yiwangdr@gmail.com>
See also #27675, #30469
For a sync task, the segment could be compacted while the sync task runs.
In the previous implementation, the sync task holds only the old segment
id as KeyLock, in which case compaction on the compacted-to segment may run
in parallel with the delta sync of this sync task.
This PR introduces sync target segment verification logic. It
checks the target segment lock it is holding before the actual syncing
logic. If this check fails, the sync task returns the
`errTargetSegementNotMatch` error and makes the manager re-fetch the
current target segment id.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR changes the following to speed up L0 compaction and
prevent OOM:
1. Lower deltabuf limit to 16MB by default, so that each L0 segment
would be 4X smaller than before.
2. Add BatchProcess, use it if memory is sufficient
3. Iterator deserializes when HasNext is called, to avoid a massive memory
peak
4. Add tracing in splitDelta
See also: #30191
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
This PR decouples importing segments from the flush process by:
1. Excluding the importing segment from the flush policy. This approach
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord calls Flush, DataCoord directly sets the importing
segment state to `Flushed`.
issue: https://github.com/milvus-io/milvus/issues/30359
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
syncMgr.Block() locks the segment when executing compaction.
The previous implementation was unable to Unblock those segments when
compaction failed. If the next compaction of the same segments arrives,
it gets stuck forever and blocks all later compaction tasks.
This PR makes sure the compaction executor Unblocks these segments
after a failed compaction.
Apart from that, this PR also refines some logs and cleans up some code in
compaction and compactor:
1. Log segment count instead of segmentIDs to avoid logging too many
segments
2. Flush RPC returns L1 segments only, skip L0 and L2
3. CompactionType is checked in `Compaction`, no need to check again
inside compactor
4. Use a lighter method to replace `getSegmentMeta`
5. Log information for L0 compaction when it encounters an error
See also: #30213
---------
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
issue: #29988
This PR adds full end-to-end support for wildcard pattern matching.
Before this PR, users could only use prefix matching in their expressions,
for example, "like 'prefix%'". With this PR, more flexible patterns can be
used.
To do so, this PR makes these changes:
- 1. support regex query on both index and raw data;
- 2. translate pattern matching to a regex query, so that it can be
handled by the regex query logic;
- 3. loosen the limit of the expression parsing, which allows general
pattern matching syntax;
With the support of regex query in segcore backend, we can also add
mysql-like `REGEXP` syntax later easily.
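
The translation step can be sketched as below. This is an illustration only: `likeToRegex` and `likeMatch` are hypothetical helpers, and the wildcard and escape rules (`%` for any run, `_` for one character, backslash as escape) are assumed SQL-style conventions that may differ from the rules implemented in the parser and segcore:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// likeToRegex translates a LIKE-style pattern into an anchored regular
// expression: % becomes .*, _ becomes ., a backslash makes the next
// character literal, and everything else is quoted literally.
func likeToRegex(pattern string) string {
	var b strings.Builder
	b.WriteString("(?s)^") // (?s) lets . match newlines too
	escaped := false
	for _, r := range pattern {
		switch {
		case escaped:
			b.WriteString(regexp.QuoteMeta(string(r)))
			escaped = false
		case r == '\\':
			escaped = true
		case r == '%':
			b.WriteString(".*")
		case r == '_':
			b.WriteString(".")
		default:
			b.WriteString(regexp.QuoteMeta(string(r)))
		}
	}
	if escaped {
		b.WriteString(`\\`) // a trailing backslash is treated as a literal
	}
	b.WriteString("$")
	return b.String()
}

// likeMatch reports whether s matches the LIKE-style pattern.
func likeMatch(pattern, s string) bool {
	return regexp.MustCompile(likeToRegex(pattern)).MatchString(s)
}

func main() {
	fmt.Println(likeToRegex("prefix%"))
	fmt.Println(likeMatch("%mid%", "a mid b"), likeMatch("a_c", "abbc"))
}
```

Quoting every literal character keeps regex metacharacters in user data (such as `.` or `*`) from changing the meaning of the pattern.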
---------
Signed-off-by: longjiquan <jiquan.long@zilliz.com>
See also #27606
Previously l0 linear compaction scanned all target segment ids from
metacache for each delta entry, which is not needed since
compaction target segments are all immutable.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #30404
`PrimaryKey` is used to hold pk values for both int64 & varchar data
types. Since it is an interface, it may occupy more memory than plain
slices when holding a group of pks.
This PR adds a `PrimaryKeys` interface for modules that need to hold
lots of PrimaryKeys.
Using this interface could cut the memory of a pk slice in half
for the Int64 pk data type and reduce the per-row interface cost for
varchar as well.
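
The memory argument can be sketched as follows. The method sets below are hypothetical and much smaller than the real `PrimaryKey` / `PrimaryKeys` interfaces; the point is the storage layout of a slice of interfaces versus a typed slice:

```go
package main

import (
	"fmt"
	"unsafe"
)

// PrimaryKey is a simplified, hypothetical per-row interface.
type PrimaryKey interface {
	Value() interface{}
}

// Int64PK is a single boxed int64 primary key.
type Int64PK int64

func (p Int64PK) Value() interface{} { return int64(p) }

// PrimaryKeys is a simplified batch container interface.
type PrimaryKeys interface {
	Len() int
	Get(i int) PrimaryKey
}

// Int64PrimaryKeys stores keys in a plain []int64 and only boxes a key
// into the PrimaryKey interface when a caller asks for one.
type Int64PrimaryKeys struct {
	values []int64
}

func (p *Int64PrimaryKeys) Append(vs ...int64)   { p.values = append(p.values, vs...) }
func (p *Int64PrimaryKeys) Len() int             { return len(p.values) }
func (p *Int64PrimaryKeys) Get(i int) PrimaryKey { return Int64PK(p.values[i]) }

func main() {
	var boxed PrimaryKey
	// Each element of []PrimaryKey is a two-word interface header, while
	// each element of []int64 is just the value itself.
	fmt.Println("bytes per element in []PrimaryKey:", unsafe.Sizeof(boxed))
	fmt.Println("bytes per element in []int64:", unsafe.Sizeof(int64(0)))

	pks := &Int64PrimaryKeys{}
	pks.Append(7, 8, 9)
	fmt.Println(pks.Len(), pks.Get(1).Value())
}
```

On a 64-bit platform the interface header is twice the size of an int64, before counting the value the interface points to.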
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR introduces new importv2 roles for datanode:
1. Executor: To execute tasks; an import task is divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;
issue: https://github.com/milvus-io/milvus/issues/28521
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
See also #27675
This PR adds back the MemoryHighSyncPolicy implementation. It also changes
MinSegmentSize & CheckInterval to configurable param items.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #27675
BloomFilterSet.current shall be reset after RollStats, otherwise it will
keep tracking the whole segment's data, causing a false positive ratio
larger than expected.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #28521, #29732
include
1. list collection's import jobs
2. create a new import job
3. get the progress of an import job
fix:
1. mix the order of dbName & collectionName #29728
2. trace log keep the same as v1
3. support traceID
4. azure precheck, blob name cannot end with / #29703
---------
Signed-off-by: PowderLi <min.li@zilliz.com>
The proxy mistakenly returned nil when it failed to listen on the port, so
the server continued to run but the service could not be connected to.
resolve #30034
Signed-off-by: yah01 <yah2er0ne@outlook.com>
according to our benchmark, concurrency level 16 is enough to fully
utilize the object storage network bandwidth
Signed-off-by: yah01 <yang.cen@zilliz.com>
issue: #29772
1. `DropPartition` only invalidates the cache related to the partition.
2. `CreateAlias` does not invalidate the cache.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
See also #30167
After supporting OpenTelemetry tracing, we want to have traceID as well.
This PR adds util functions to set traceID with span & propagate traceID
between different contexts.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #27606
`MultiRead` actually downloads files in sequence, which may lead to long
download times during the l0 compaction download phase.
This PR makes the l0 compactor download deltalogs in parallel, utilizing
the conc package & io pool.
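
A minimal sketch of bounded parallel download using only the standard library. It does not use the conc package or the io pool from the PR; `downloadAll`, `downloadFunc` and `errNotFound` are hypothetical names for illustration:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errNotFound is a hypothetical error for a missing object.
var errNotFound = errors.New("object not found")

// downloadFunc fetches one object; it stands in for a storage client call.
type downloadFunc func(path string) ([]byte, error)

// downloadAll fetches all paths with at most `workers` downloads in flight
// and returns the blobs in the same order as paths.
func downloadAll(paths []string, workers int, fetch downloadFunc) ([][]byte, error) {
	if workers < 1 {
		workers = 1
	}
	results := make([][]byte, len(paths))
	errs := make([]error, len(paths))
	sem := make(chan struct{}, workers) // counting semaphore
	var wg sync.WaitGroup
	for i, p := range paths {
		wg.Add(1)
		sem <- struct{}{} // wait for a free slot
		go func(i int, p string) {
			defer wg.Done()
			defer func() { <-sem }()
			// Each goroutine writes only its own index, so no lock is needed.
			results[i], errs[i] = fetch(p)
		}(i, p)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}

func main() {
	store := map[string][]byte{"delta/1": []byte("a"), "delta/2": []byte("bb")}
	fetch := func(p string) ([]byte, error) {
		if b, ok := store[p]; ok {
			return b, nil
		}
		return nil, errNotFound
	}
	blobs, err := downloadAll([]string{"delta/1", "delta/2"}, 4, fetch)
	fmt.Println(len(blobs), err)
}
```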
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #25639
/kind improvement
When the number of vector columns increases, the number of rows per
segment will decrease. In order to reduce the impact on vector indexing
performance, it is necessary to increase the segment max limit.
If a collection has multiple vector fields with memory and disk indices
on different vector fields, the size limit after segment compaction is
the minimum of segment.maxSize and segment.diskSegmentMaxSize.
Signed-off-by: xige-16 <xi.ge@zilliz.com>
issue: #30102, #30225
we should read MetricType from SearchResult,
because query node never
1. read metricType from LoadMeta
2. store to collection
3. set SearchRequest.MetricType
Signed-off-by: PowderLi <min.li@zilliz.com>
issue: #29772
The shardLeaders cache does not actively expire, so update the cache when
search/query fails.
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
After #28873, PartitionID and CollectionID should be filled in
CompactionSegmentBinlog so that DataNode can compose
the correct logPath. However, some places forgot to fill
in the information, causing DataNode to download `xxx/0/0/xxxx/xxxx`
binlogs during compaction.
See also: #30213
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
See also #30273
This PR:
- Rename confusing `LoadIndexInfo` to `UpdateIndexInfo` for LocalSegment
- Use `DynamicPool` instead of `LoadPool` for `UpdateSealedSegmentIndex`
- Fix cgo call missing pool control
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Allows proactive warming up of chunk cache. Original vector data will be
asynchronously loaded into the chunk cache during the load process. It
has the potential to significantly reduce query/search latency for a
certain duration after the load, albeit with a concurrent increase in
disk usage.
issue: https://github.com/milvus-io/milvus/issues/30181
---------
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Before this, every time index chunk data was written to disk,
there were 4 I/O operations:
- open the file
- seek to the offset
- write the data
- close the file
This PR opens the file only once and writes all data continuously.
It also loads the files from object storage concurrently.
Signed-off-by: yah01 <yang.cen@zilliz.com>
See also #30250
This PR adds a requery flag to the query task. When the reQuery flag is
true, the query task shall skip partition name conversion and use the
pre-calculated partitionIDs passed from the search task.
TODO: hybrid search does not have partition id information. We shall
apply the same logic to hybrid search later.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also #30111
Segments could be "Flushed" only by the `FlushSegments` grpc call from
datacoord by design. There are two possible reasons for one segment
getting flushed multiple times:
- Segment is in flushing state during multiple epochs in flowgraph
- Segment is flushed by flushTs & Flush segments
So this PR:
- Removes state change logic from the FlushTs policy
- Changes Flush segment into a three-stage flow: Sealed->Flushing->Flushed,
preventing multiple Flushed=true operations.
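
The three-stage flow can be sketched as a tiny state machine that only permits the immediate next state. `State`, `segment` and `transition` are hypothetical names for this illustration:

```go
package main

import "fmt"

// State is a hypothetical segment state for this illustration.
type State int

const (
	Growing State = iota
	Sealed
	Flushing
	Flushed
)

// nextState lists the only transition allowed out of each flush-related state.
var nextState = map[State]State{
	Sealed:   Flushing,
	Flushing: Flushed,
}

type segment struct {
	id    int64
	state State
}

// transition moves the segment to `to` only when `to` is the immediate
// successor of the current state, so a segment can reach Flushed once.
func (s *segment) transition(to State) bool {
	if allowed, ok := nextState[s.state]; !ok || allowed != to {
		return false
	}
	s.state = to
	return true
}

func main() {
	seg := &segment{id: 1, state: Sealed}
	fmt.Println(seg.transition(Flushing), seg.transition(Flushed), seg.transition(Flushed))
	// prints: true true false
}
```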
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/29020
JSON cannot pass a max_int32 value to int32_t, so let knowhere check the
value range by itself.
After this fix, pymilvus will report:
pymilvus.exceptions.MilvusException: <MilvusException: (code=65535,
message=fail to search on QueryNode 6: worker(6) query failed: => failed
to search: arithmetic overflow: param search_list_size should be at most
2147483647)>
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
Resolves #30167
This PR adds tracing for all compaction, from the task start in datacoord
to the execution procedures in datanode.
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR discontinues the subscription to the mq and instead employs
the channel checkpoint as the DML and starting position for the import
segments.
issue: https://github.com/milvus-io/milvus/issues/30106
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Add a counter monitoring metric for the ratelimited rpc requests with
labels: proxy nodeID, rpc request type, and state.
issue: https://github.com/milvus-io/milvus/issues/30052
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
issue: #30150
This PR fixes these problems:
1. the leader checker uses the wrong node id when generating a release
task, which causes the release task to finish immediately
2. the release request generated by leader_checker doesn't set the
`force` flag, so the operation to clean the leader view on the delegator
fails.
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
See also: #30121, #27675
This PR changes the delete buffering logic:
- Write buffer shall buffer insert first
- Then the delete messages shall be evaluated
- Whether PK matches previous Bloom filter, whose ts is always smaller
- Whether PK matches insert data which has a smaller timestamp
- Then the segment bloom filter is updated with the newly buffered pk rows
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
See also: #28873
When the datanode returns an error or goes offline during the
GetCompactionResult call, the compress binlog logic panics since it uses a
nil result.
This PR moves it after the CheckRPCCall error check to prevent this case.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
This PR also fixes bugs in l0 compactor where
l0 results would never be removed from datanode
See also: #30099
Signed-off-by: yangxuan <xuan.yang@zilliz.com>