Commit Graph

1140 Commits (8fadcde403067812d3b0a7822d5178d980306e65)

Author SHA1 Message Date
jaime d6afb31b94
enhance: make subfunctions of datanode component modular (#33992)
issue: #33994

also remove deprecated channel manager based on the etcd implementation

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-07-01 14:46:07 +08:00
wayblink e5d691d854
Use new stream segment reader in clustering compaction (#34232)
#32939

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-06-30 20:26:07 +08:00
wayblink fbe3231b1f
fix: fix error ignore in compactor (#34169)
#34170

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-06-26 10:24:03 +08:00
jaime 9630974fbb
enhance: move rocksmq from internal to pkg module (#33881)
issue: #33956

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-06-25 21:18:15 +08:00
cai.zhang c65f41dc60
fix: Only sync flushed segments to datanode (#34156)
issue: #33540

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-25 20:46:07 +08:00
congqixia 962a5446f8
enhance: Add ctx in `SyncTask.Run` to be cancellable (#34042)
Related to #33716

This PR add context param in SyncTask.Run execution functions to make it
cancellable from the caller.

This make it possible to cancel task when datanode/data sync service is
beeing shut down.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-25 14:22:04 +08:00
congqixia 506a915272
fix: Deep copy ImportTask.segmentsInfo to prevent data race (#34090)
See also #34089

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-25 10:06:02 +08:00
congqixia b5c9a7364b
fix: Prevent remove new growing L1 segment when SyncSegments (#34056)
Related to #34018

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-24 10:18:01 +08:00
yihao.dai 6c1d815894
enhance: Remove the unused compaction logic from shard (#33932)
1. Remove the `compactTo` field in `SegmentInfo`.
2. Remove the target segment not match and its retry logic in
`SyncManager`.

issue: https://github.com/milvus-io/milvus/issues/32809

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-23 21:12:01 +08:00
wayblink 380d3f4469
fix: Fix memory buffer error & some renaming (#33850)
#30633

---------

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-06-21 17:30:01 +08:00
XuanYang-cn 04edb07d82
enhance: Add deltaRowCount in l0 compaction (#33997)
See also: #33998

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-20 17:46:01 +08:00
cqy123456 dc4437ff82
enhance: use segment id and type to register in MmapChunkManager and opt malloc in variableChunk (#33993)
issue: https://github.com/milvus-io/milvus/issues/32984

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-20 17:42:02 +08:00
wei liu 31ef0a1fe8
enhance: Add trace for bf cost in l0 compactor (#33860)
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-20 10:10:05 +08:00
smellthemoon 2a1356985d
enhance: support null in go payload (#32296)
#31728

---------

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-06-19 17:08:00 +08:00
wayblink 5cb0760187
fix: Small fixs of major compaction (#33929)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-06-18 16:53:58 +08:00
cqy123456 32f685ff12
enhance: growing segment support mmap (#32633)
issue: https://github.com/milvus-io/milvus/issues/32984

Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
2024-06-18 14:42:00 +08:00
cai.zhang 95148866ed
fix: Don't remove growing L0 segment in datanode metacache (#33829)
issue: #33540 
1. gorwing L0 segments is invisible to datacoord.
2. flushed L0 segments need to clean by datacoord.

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-17 10:09:57 +08:00
yihao.dai 1a9ab52f66
enhance: Ensure the idempotency of compaction task (#33872)
/kind enhancement

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-16 22:09:57 +08:00
yihao.dai 8537f3daeb
enhance: Rename Compaction to CompactionV2 (#33858)
Due to the removal of injection and syncSegments from the compaction, we
need to ensure that no compaction is successfully executed during the
rolling upgrade. This PR renames Compaction to CompactionV2, with the
following effects:
- New datacoord + old datanode: Utilizes the CompactionV2 interface,
resulting in the datanode error "CompactionV2 not implemented," causing
compaction to fail;
- Old datacoord + new datanode: Utilizes the CompactionV1 interface,
resulting in the datanode error "CompactionV1 not implemented," causing
compaction to fail.

issue: https://github.com/milvus-io/milvus/issues/32809

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-16 22:07:57 +08:00
yihao.dai 86a36b105a
enhance: Tidy compaction executor (#33778)
Move compaction executor to compaction pacakge.

issue: https://github.com/milvus-io/milvus/issues/32451

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-14 14:34:01 +08:00
wei liu 4987067375
enhance: Execute bloom filter apply in parallel to speed up segment predict (#33792)
issue: #33610

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-14 11:37:56 +08:00
wei liu ab93d9c23d
enhance: Use BatchPkExist to reduce bloom filter func call cost (#33611)
issue:#33610

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-06-13 17:57:56 +08:00
ArenaSu 2dfa752527
doc: [skip-e2e] add comments for event manager (#33444)
Add comments for event manager(internal/datanode/event_manager.go).

Signed-off-by: ArenaSu <704427617@qq.com>
2024-06-13 17:56:06 +08:00
congqixia 512ea6be5f
enhance: Avoid merging insert data when buffering insert msgs (#33562)
See also #33561

This PR:
- Use zero copy when buffering insert messages
- Make `storage.InsertCodec` support serialize multiple insert data
chunk into same batch binlog files

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-13 11:15:56 +08:00
congqixia 9ab3058da2
fix: Prevent restart timetick sender creating ut datanode (#33790)
See also #33789

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-12 22:29:58 +08:00
yihao.dai 9a3e4080f1
enhance: Add comment for channel cp updater (#33759)
/kind enhancement

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-12 20:01:55 +08:00
coldWater 6b9901c59f
enhance: add a semaphore for CompactionExecutor (#33558)
#33182

---------

Signed-off-by: coldWater <254244460@qq.com>
2024-06-11 17:25:55 +08:00
yihao.dai eb5d4de390
fix: Check if the import job exists (#33672)
issue: https://github.com/milvus-io/milvus/issues/33671

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-10 21:51:55 +08:00
wayblink a1232fafda
feat: Major compaction (#33620)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
Co-authored-by: MrPresent-Han <chun.han@zilliz.com>
2024-06-10 21:34:08 +08:00
yihao.dai 3540eee977
enhance: Support L0 import (#33514)
issue: https://github.com/milvus-io/milvus/issues/33157

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-07 14:17:20 +08:00
XuanYang-cn 4dd0c54ca0
fix: Fix l0 compactor may cause DN from OOM (#33554)
See also: #33547

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-06 14:33:52 +08:00
XuanYang-cn 68c9e7db8c
fix: Sync dropped segment for dropped partition (#33331)
See also: #33330

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-06 10:25:52 +08:00
cai.zhang feeb869ff9
enhance: Remove compaction plans on the datanode (#33548)
issue: #33546

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-06-05 20:27:51 +08:00
yihao.dai bbdf99a45e
fix: Fix import segment size is uneven (#33605)
The data coordinator computed the appropriate number of import segments,
thus when importing in the data node, one can randomly select a segment.

issue: https://github.com/milvus-io/milvus/issues/33604

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-06-05 15:41:51 +08:00
zhenshan.cao ac4f3997ce
enhance: Reconstructing Compaction to possess persistence capability (#33265)
issue #33586

Signed-off-by: zhenshan.cao <zhenshan.cao@zilliz.com>
2024-06-05 10:17:50 +08:00
aoiasd 387b7cd7f4
enhance:avoid maintain checkpoint info in sync manager (#33413)
relate: https://github.com/milvus-io/milvus/issues/32915

Signed-off-by: aoiasd <zhicheng.yue@zilliz.com>
2024-06-05 10:05:50 +08:00
congqixia 7f4698f4a7
enhance: Use map PK to timestamp in buffer insert (#33566)
Related to #27675

Store pk to minimal timestamp in `inData` instead of bloom filter to
check whether some delete entry hit current insert batch

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-06-04 10:07:48 +08:00
XuanYang-cn 0382628668
enhance: Add more tracing for l0 compactor (#33435)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-06-03 10:19:49 +08:00
wei liu c6a1c49e02
enhance: Use Blocked Bloom Filter instead of basic bloom fitler impl. (#33405)
issue: #32995
To speed up the construction and querying of Bloom filters, we chose a
blocked Bloom filter instead of a basic Bloom filter implementation.

WARN: This PR is compatible with old version bf impl, but if fall back
to old milvus version, it may causes bloom filter deserialize failed.

In single Bloom filter test cases with a capacity of 1,000,000 and a
false positive rate (FPR) of 0.001, the blocked Bloom filter is 5 times
faster than the basic Bloom filter in both querying and construction, at
the cost of a 30% increase in memory usage.

- Block BF construct time	{"time": "54.128131ms"}
- Block BF size	                {"size": 3021578}
- Block BF Test cost	        {"time": "55.407352ms"}
- Basic BF construct time	{"time": "210.262183ms"}
- Basic BF size	                {"size": 2396308}
- Basic BF Test cost	        {"time": "192.596229ms"}

In multi Bloom filter test cases with a capacity of 100,000, an FPR of
0.001, and 100 Bloom filters, we reuse the primary key locations for all
Bloom filters to avoid repeated hash computations. As a result, the
blocked Bloom filter is also 5 times faster than the basic Bloom filter
in querying.

- Block BF TestLocation cost    {"time": "529.97183ms"}
- Basic BF TestLocation cost	{"time": "3.197430181s"}

---------

Signed-off-by: Wei Liu <wei.liu@zilliz.com>
2024-05-31 17:49:45 +08:00
cai.zhang 77637180fa
enhance: Periodically synchronize segments to datanode watcher (#33420)
issue: #32809

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
2024-05-30 13:37:44 +08:00
smellthemoon 08b94ea81d
enhance:change wrong log (#33447)
Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-05-29 18:35:44 +08:00
congqixia 73c9b80a7d
enhance: Store locations for largest K in `LocationCache` (#33429)
See also #32642

`LocationCache` used map to store different locations for different K
which may cause lots of CPU time when get locations many times.

This PR change the implementation of LocationCache to store only the
location for the largest K used to totally remove the map access
operation.

See pprof from test of @XuanYang-cn 

![image](https://github.com/milvus-io/milvus/assets/84113973/ad17cff8-62ad-4d78-9bb0-f6df0512f4ea)

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-29 10:05:42 +08:00
congqixia e71b7c7cc9
enhance: Reduce datanode metacache frequent scan range (#33400)
See also #32165

There were some frequent scan in metacache:
- List all segments whose start positions not synced
- List compacted segments

Those scan shall cause lots of CPU time when flushed segment number is
large meanwhile `Flushed` segments can be skipped in those two scenarios

This PR make:
- Add segment state shortcut in metacache
- List start positions state before `Flushed`
- Make compacted segments state to be `Dropped` and use `Dropped` state
while scanning them

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-28 14:19:42 +08:00
SimFG cb99e3db34
enhance: add the includeCurrentMsg param for the Seek method (#33326)
/kind improvement
- issue: #33325

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-05-27 10:31:41 +08:00
Xiaofan 36cbce4def
enhance: optimize datanode cpu usage under large collection number (#33267)
fix #33266 
try to improve cpu usage by refactoring the ttchecker logic and caching
string

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-05-25 04:43:41 +08:00
yihao.dai 7730b910b9
enhance: Decouple compaction from shard (#33138)
Decouple compaction from shard, remove dependencies on shards (e.g.
SyncSegments, injection).

issue: https://github.com/milvus-io/milvus/issues/32809

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-24 09:07:41 +08:00
congqixia 5452376e90
fix: Remove task from syncmgr after task done (#33302)
See also #33247
Introduced in PR #32865

Remove task after task done to keep checkpoint sound and safe

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-23 14:33:40 +08:00
yihao.dai 895799ec61
enhance: Abstract Execute interface for import/preimport task (#33234)
Abstract Execute interface for import/preimport task, simplify import
scheduler.

issue: https://github.com/milvus-io/milvus/issues/33157

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-23 11:29:41 +08:00
XuanYang-cn 22bddde5ff
enhance: Tidy compactor and remove dup codes (#32198)
See also: #32451

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-23 09:53:40 +08:00
congqixia e1bafd7105
enhance: Use pre-built logger for write buffer frequent ops (#33273)
See also #33266

Each `WriteBuffer` shall have same channel/collection id attribute, so
use same logger will do and reduce logger allocation & frequent name
composition

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-22 21:11:40 +08:00
XuanYang-cn 2d6f12d48b
fix: channel manager's goroutine run order (#33118)
See also: #33117

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-21 14:35:39 +08:00
XuanYang-cn b3bcc107bb
fix: Remove L0 compactor in completedCompactor (#33169)
See also: #33168

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-21 11:35:38 +08:00
yihao.dai 32560263fa
enhance: Query slot for compaction task (#32881)
Query slot of compaction in datanode, and transfer the control logic for
limiting compaction tasks from datacoord to the datanode.

issue: https://github.com/milvus-io/milvus/issues/32809

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-17 18:19:38 +08:00
yihao.dai bcaacf6fe6
enhance: Load BF from storage instead of memory during L0 compaction (#32913)
To decouple compaction from shard, loading BF from storage instead of
memory during L0 compaction in datanode.

issue: https://github.com/milvus-io/milvus/issues/32809

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-17 17:25:36 +08:00
Ted Xu a9c7ce72b8
enhance: enable stream writer in compactions (#32612)
See #31679

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-05-17 15:05:37 +08:00
congqixia 892fe66b57
enhance: Refine channelCpUpdater field & test (#33083)
Avoid passing datanode around preparing datanode code directory
refactory.

Also refine unit test code for same component. The `Await` shall return
first before checking the counter number since when lock cost is heavy
(using deadlock.RWMutex See PR #33069.) case may fail due to long
running time submitting tasks.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-16 14:19:34 +08:00
cai.zhang 6ea7633bd5
enhance: Add memory size for binlog (#33025)
issue: #33005
1. add `MemorySize` field for insert binlog.
2. `LogSize` means the file size in the storage object.
3. `MemorySize` means the size of the data in the memory.

---------

Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Signed-off-by: cai.zhang <cai.zhang@zilliz.com>
2024-05-15 12:59:34 +08:00
XuanYang-cn d4837307b3
fix: Make submit idempotent (#33053)
issue: #33054

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-14 22:09:34 +08:00
congqixia 4ae7cabb04
fix: Remove channel when create flowgraph timeout (#33014)
See also #33013

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-14 10:07:33 +08:00
yihao.dai a984e46a29
enhance: Remove rootcoord from datanode broker (#32818)
issue: https://github.com/milvus-io/milvus/issues/32827

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-05-14 10:03:32 +08:00
XuanYang-cn efdbd8e7c1
enhance: Enable to upload by batch (#32788)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-13 19:07:32 +08:00
XuanYang-cn 29b621f759
fix: Make compactor able to clear empty segments (#32821)
See also: #32553

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-05-13 18:21:32 +08:00
congqixia 12ec3d61d9
fix: Fill deltalog entry num & time range in L0 compactions (#33004)
Resolves #33003

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-13 14:13:31 +08:00
congqixia 0e5765b116
enhance: Utilize `TestLocations` ability to accelerate write & compaction (#32948)
See also #32642

This PR reuses hash locations for bloom filter prediction utilizing
`storage.Location`, like enhancement #32642.

Also adds a utility struct in storage: `LocationCache` to storage
locations for variable K (numbers of hash functions)

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-13 10:15:32 +08:00
congqixia 77fa615772
fix: Make SyncManager callback func ignore nil error (#32891)
introduced by #32865

sync manager callback handler panicked when error is nil

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-09 18:03:31 +08:00
congqixia a06f601c6e
fix: Make syncmgr lock key before returning future (#32865)
See also #32860

SyncMgr did not ensure task key is locked before `SyncData` returning
which may cause concurrent problem during sync wich multiple policies.

This PR change sync mgr implementation to make sure the key is locked
before returning task result `*conc.Future`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-05-09 10:09:30 +08:00
yiwangdr d6e537c91c
fix: allow datanode's server id to be updated (#31597)
issue: #31516

background: the server id field in data node is redundant. session id
already provides the source of truth.

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-08 14:03:29 +08:00
yiwangdr b1eacb2ae8
feat: datacoord/node watch based on rpc (#32036)
issue: https://github.com/milvus-io/milvus/issues/25309

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-05-07 15:49:30 +08:00
Xiaofan 1e47d7afc4
improve: change some frequent log to debug (#32779)
remove the frequent log "filter insert messages"

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-05-06 20:51:29 +08:00
congqixia 2c1e8f4774
enhance: Use `struct{}` for sync task future result (#32673)
Related to #27675

Use `struct{}` instead `error` for sync task future result type to
reduce result size and preventing logci error.

Also change some unused parameter to `_` to suppress lint warning

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-29 10:59:26 +08:00
Buqian Zheng 8a1017a152
enhance: add helpers to parse sparse float vector in JSON (#32543)
issue: #29419

added helper functions to parse JSON representation of sparse float
vectors, will be used by both the restful server and the import utils.

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-04-25 14:47:24 +08:00
XuanYang-cn 15b989bb80
fix: Zero flushReq metric for all sealed segs (#32404)
See also: #32399

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-24 14:27:23 +08:00
congqixia 6ef677f79e
fix: Remove metrics after flowgraph closed (#32515)
See also #32403

`fg_buffer_size` was decreased after metrics removed in flowgraph
ddnode, which make metrics value negative.

This PR move remove metrics logic into `dataSyncService.Close`

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-23 17:33:23 +08:00
yihao.dai 558feed5ed
fix: Use pk from binlog during import (#32118)
During binlog import, even if the primary key's autoID is set to true,
the primary key from the binlog should be used instead of being
reassigned.

issue: https://github.com/milvus-io/milvus/discussions/31943,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-16 14:51:20 +08:00
SimFG 1af084ea6b
enhance: Make datanode exit and case `TestProxy` faster (#32218)
/kind improvement
issue: #32219

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-04-16 10:49:20 +08:00
yihao.dai aa96843d31
fix: Fix import hanging and improve logging output (#32166)
Fix import hanging when the previous import task failed, and improve
parquet import logging outout.

issue: https://github.com/milvus-io/milvus/issues/31834

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-13 22:03:23 +08:00
SimFG c012e6786f
feat: support rate limiter based on db and partition levels (#31070)
issue: https://github.com/milvus-io/milvus/issues/30577
co-author: @jaime0815

---------

Signed-off-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
Signed-off-by: SimFG <bang.fu@zilliz.com>
Co-authored-by: Patrick Weizhi Xu <weizhi.xu@zilliz.com>
2024-04-12 16:01:19 +08:00
congqixia c0fa169d9a
enhance: Make write buffer memory check do until safe (#32172)
See also #27675 #26177

Make memory check evict memory buffer until memory water level is safe.
Also make `EvictBuffer` wait until sync task done.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-12 10:55:18 +08:00
jaime 371e6d2c1a
enhance: refine sync memory watermark configuration (#32140)
Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-04-11 20:07:24 +08:00
XuanYang-cn aad3ed3835
fix: [cherry-pick]Skip changing meta if nodeID not match with channel (#31672)
See also: #31648
pr: #31665, #31694

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-10 15:09:18 +08:00
yihao.dai 49d109de18
enhance: Use an individual buffer size parameter for imports (#31833)
Use an individual buffer size parameter for imports and set buffer size
to 64MB.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-08 21:07:18 +08:00
yihao.dai d6cdcf74db
fix: Return err for conc.Future in sync manager (#31790)
Should not return `err, nil` when using conc.Future, as the error will
be lost/ignored when using `AwaitAll` to wait for the future.

issue: https://github.com/milvus-io/milvus/issues/31788

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-06 11:36:57 -07:00
congqixia 49b8ee4339
fix: Make FlushTs Sync Policy apply to all buffers (#31839)
See also #30552

FlushTS policy was orignally designed to flushed/L0 segments only, but
in some edge case, new growing segment buffer would by-pass flush
request and hold a buffer before flush ts, which caused flush timeout

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-03 11:47:13 +08:00
congqixia 0feee53631
enhance: Add back unit test for compactor and fix some TODOs (#31829)
This PR adds back compactor "Unhandled" data type unit test and fixes
some TODOs behvaior

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-04-02 20:35:14 +08:00
yihao.dai 4e264003bf
enhance: Ensure ImportV2 waits for the index to be built and refine some logic (#31629)
Feature Introduced:
1. Ensure ImportV2 waits for the index to be built

Enhancements Introduced:
1. Utilization of local time for timeout ts instead of allocating ts
from rootcoord.
3. Enhanced input file length check for binlog import.
4. Removal of duplicated manager in datanode.
5. Renaming of executor to scheduler in datanode.
6. Utilization of a thread pool in the scheduler in datanode.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-04-01 20:09:13 +08:00
XuanYang-cn 39337e09b8
fix: Using zero serverID for metrics (#31518)
Fixes: #31516

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-04-01 16:55:19 +08:00
yihao.dai 78fbb87b3a
enhance: Release blobs in sync task once sync is completed (#31661)
Once the synchronization of the sync task is completed, it's necessary
to release the blob within the sync task, as the caller may continue to
reference it.

issue: https://github.com/milvus-io/milvus/issues/31545

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-28 10:23:11 +08:00
SimFG b1a1cca10b
feat: add more operation detail info for better allocation (#30438)
issue: #30436

---------

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-03-28 06:33:11 +08:00
yihao.dai 31cf849f68
enhance: Support retriving file size from importutilv2.Reader (#31533)
To reduce the overhead caused by listing the S3 objects, add an
interface to importutil.Reader to retrieve file sizes.

issue: https://github.com/milvus-io/milvus/issues/31532,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-25 20:29:07 +08:00
Bingyi Sun 8e661f791a
fix: lazy load index data in cache (#31094)
issue: https://github.com/milvus-io/milvus/issues/31571

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-03-25 15:43:07 +08:00
yihao.dai f65a796d18
enhance: Add max file num limit and max file size limit for import (#31497)
The max number of import files per request should not exceed 1024 by
default (configurable).
The import file size allowed for importing should not exceed 16GB by
default (configurable).

issue: https://github.com/milvus-io/milvus/issues/28521

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-22 18:13:06 +08:00
yihao.dai 0fe5e90e8b
enhance: Remove import v1 (#31403)
Remove all code and logic related to import v1.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-22 15:29:09 +08:00
congqixia a647b84f3e
enhance: Add AllPartitionsID const to replace InvalidPartitionID (#31438)
"-1" as `InvalidPartitionID` previously used as All partition place
holder in delete cases. It's confusing and hard to maintain when a const
var has more than one meaning.

This PR add `AllPartitionsID` to replace these usages in delete
scenarios.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-20 19:01:05 +08:00
Xiaofan a63b4cedcf
fix: remove some unnecessary unrecoverable errors (#31327)
use retry.handle when request is not able to service but don't throw
unrecoverable erros
fix #31323

Signed-off-by: xiaofanluan <xiaofan.luan@zilliz.com>
2024-03-20 11:35:07 +08:00
congqixia d9efea2fea
fix: Cleanup write buffer when flowgraph released (#31376)
See also #30137

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-19 01:33:05 +08:00
yihao.dai 776709e5ff
fix: Fix binlog import (#31310)
Fix binlog import functionality by removing the existing check and
refining the size retrieval process.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-17 20:59:04 +08:00
yihao.dai 811316d2ba
fix: Fix binlog import and refine error reporting (#31241)
1. Fix binlog import with partition key.
2. Refine binlog import error reportins.
3. Avoid division by zero when retrieving import progress.

issue: https://github.com/milvus-io/milvus/issues/31221,
https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-15 10:55:05 +08:00
jaime db79be3ae0
fix: ctx cancel should be the last step while stopping server (#31220)
issue: #31219

Signed-off-by: jaime <yun.zhang@zilliz.com>
2024-03-15 10:33:05 +08:00
XuanYang-cn a52a52064d
fix: Use lock and map instead of concurrentMap (#31212)
See also: #31209

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-14 18:39:04 +08:00
Buqian Zheng 3c80083f51
feat: [Sparse Float Vector] add sparse vector support to milvus components (#30630)
add sparse float vector support to different milvus components,
including proxy, data node to receive and write sparse float vectors to
binlog, query node to handle search requests, index node to build index
for sparse float column, etc.

https://github.com/milvus-io/milvus/issues/29419

---------

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-03-13 14:32:54 -07:00
yihao.dai b5c67948b7
enhance: Enhance and modify the return content of ImportV2 (#31192)
1. The Import APIs now provide detailed progress information for each
imported file, including details such as file name, file size, progress,
and more.
2. The APIs now return the collection name and the completion time.
3. Other modifications include changing jobID to jobId and other similar
adjustments.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-13 19:51:03 +08:00
congqixia 937f2440ab
fix: TestBlock case use different segment id in testcase (#31173)
Resolves: #31172

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-11 17:51:03 +08:00
congqixia ff1e967e89
enhance: Add segment id short cut for WithSegmentID filter (#31144)
See also #31143

This PR add short cut for datanoe metacache `WithSegmentIDs` filter,
which could just fetch segment from map with provided segmentIDs. Also
add benchmark for new implementation vs old one.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-03-11 10:55:02 +08:00
Ted Xu 987d9023a5
enhance: Enable binlog deserialize reader in datanode compaction (#31036)
See #30863

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2024-03-08 18:25:02 +08:00
yihao.dai c411cb4a49
enhance: Prevent the backlog of channelCP update tasks, perform batch updates of channelCPs (#30941)
This PR includes the following adjustments:
1. To prevent channelCP update task backlog, only one task with the same
vchannel is retained in the updater. Additionally, the lastUpdateTime is
refreshed after the flowgraph submits the update task, rather than in
the callBack function.
2. Batch updates of multiple vchannel checkpoints are performed in the
UpdateChannelCheckpoint RPC (default batch size is 128). Additionally,
the lock for channelCPs in DataCoord meta has been switched from key
lock to global lock.
3. The concurrency of UpdateChannelCheckpoint RPCs in the datanode has
been reduced from 1000 to 10.

issue: https://github.com/milvus-io/milvus/issues/30004

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
Co-authored-by: jaime <yun.zhang@zilliz.com>
Co-authored-by: congqixia <congqi.xia@zilliz.com>
2024-03-07 20:39:02 +08:00
yihao.dai 0a2c255630
enhance: Reduce the memory usage of the timeTickSender (#30968)
In the cache of the timeTickSender, retain only the latest stats instead
of storing stats for every time tick.

issue: https://github.com/milvus-io/milvus/issues/30967

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-02 10:13:01 +08:00
yihao.dai a434d33e75
feat: Add import scheduler and manager (#29367)
This PR introduces novel managerial roles for importv2:
1. ImportMeta: To manage all the import tasks;
2. ImportScheduler: To process tasks and modify their states;
3. ImportChecker: To ascertain the completion of all tasks and instigate
relevant operations.

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 18:31:02 +08:00
wayblink f3c56c83ab
fix: Fix binlog_io metric name conflict (#30689)
follow: #29725

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-03-01 18:13:02 +08:00
XuanYang-cn 2867f50fcc
fix: Clear DN unkown compaction tasks (#30850)
If DC restarted,  those unkonwn compaction tasks
will never get call back in DN, so that the segments in the compaction
task will be locked, unable to sync and compaction again, blocking cp
advance and compaction executing.

See also: #30137

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-03-01 11:31:00 +08:00
yihao.dai a5a4ca8459
enhance: Remove debug log (#30955)
Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-03-01 10:02:59 +08:00
chyezh 0c7474d7e8
enhance: add graceful stop timeout to avoid node stop hang under extreme cases (#30317)
1. add coordinator graceful stop timeout to 5s
2. change the order of datacoord component while stop
3. change querynode grace stop timeout to 900s, and we should
potentially change this to 600s when graceful stop is smooth

issue: #30310
also see pr: #30306

---------

Signed-off-by: chyezh <chyezh@outlook.com>
2024-02-29 17:01:50 +08:00
yiwangdr c6665c2a4c
test: support multiple data/querynodes in integration test (#30618)
issue: https://github.com/milvus-io/milvus/issues/29507

Signed-off-by: yiwangdr <yiwangdr@gmail.com>
2024-02-21 11:54:53 +08:00
wayblink f976385421
enhance: replace binlogIO with io.BinlogIO in datanode (#29725)
#30633

Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-20 14:38:51 +08:00
wayblink 6c89609de7
enhance: Reduce unnessary log in binlog_io (#30625)
Signed-off-by: wayblink <anyang.wang@zilliz.com>
2024-02-18 16:50:51 +08:00
congqixia 3c2e0375df
fix: make compactor inject done called no more than once (#30603)
See also #30571

When `compactionExecutor` stops one compaction task, the `stop` method
will case `injectDone` called.

However in `executeTask` when `compact` method returns error, it shall
also invoke `injectDone` as well. That the reason `Unlock of unlocked
RWMutex` panicking happened.

This PR add sync.Once to make sure that `injectDone` is called only
once. We did not remove any of the `injectDone` since removal any of
those invocation may cause logic problem.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-18 14:08:49 +08:00
congqixia 91b02b5d22
enhance: Add param item for datanode l0 batch/linear mode memory ratio (#30523)
See also #27606

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-18 13:02:50 +08:00
congqixia b111f3b110
enhance: Use RWMutex and change WLock to RLock (#30557)
Related to #27675

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-06 17:13:56 +08:00
congqixia d4100d5442
enhance: Change update channel cp magic number to param item (#30555)
See also #28817

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-06 16:02:00 +08:00
congqixia a68b32134a
fix: Verify sync task target segment and retry if not match (#30500)
See also #27675 #30469

For a sync task, the segment could be compacted during sync task. In
previous implementation, this sync task will hold only the old segment
id as KeyLock, in which case compaction on compacted to segment may run
in parallel with delta sync of this sync task.

This PR introduces sync target segment verification logic. It shall
check target segment lock it's holding beforing actually syncing logic.
If this check failed, sync task shall return`errTargetSegementNotMatch`
error and make manager re-fetch the current target segment id.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-05 11:33:43 +08:00
yihao.dai 18b979d9b4
enhance: Extend support for varchar autoID to BulkInsertV2 (#30477)
issue: https://github.com/milvus-io/milvus/issues/30476

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-04 16:57:05 +08:00
XuanYang-cn e6eb6f2c78
enhance: Speed up L0 compaction (#30410)
This PR changes the following to speed up L0 compaction and
prevent OOM:

1. Lower deltabuf limit to 16MB by default, so that each L0 segment
would be 4X smaller than before.
2. Add BatchProcess, use it if memory is sufficient
3. Iterator will Deserialize when called HasNext to avoid massive memory
peek
4. Add tracing in spiltDelta

See also: #30191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-04 10:49:05 +08:00
yihao.dai 7ce876a072
fix: Decoupling importing segment from flush process (#30402)
This pr decoups importing segment from flush process by:
1. Exclude the importing segment from the flush policy, this approch
avoids notifying the datanode to flush the importing segment, which may
not exist.
2. When RootCoord call Flush, DataCoord directly set the importing
segment state to `Flushed`.

issue: https://github.com/milvus-io/milvus/issues/30359

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-02-03 13:01:12 +08:00
congqixia 1ab851d73f
enhance: Remove useless frequent log in Mintimestamp (#30471)
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-02 20:39:05 +08:00
XuanYang-cn d744962aa1
fix: Correct Size calculation of DeleteData (#30397)
This PR would correct the actual deltalog size

See also: #30191

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-02 10:47:04 +08:00
XuanYang-cn fb5e09d94d
fix: call injectDone after compaction failed (#30277)
syncMgr.Block() will lock the segment when executing compaction.

Previous implementation was unable to Unblock thoese segments when
compaction failed. If next compaction of the same segments arrives,
it'll stuck forever and block all later compation tasks.

This PR makes sure compaction executor would Unblock these segments
after a failure compaction.

Apart form that, this PR also refines some logs and clean some codes of
compaction, compactor:

1. Log segment count instead of segmentIDs to avoid logging too many
segments
2. Flush RPC returns L1 segments only, skip L0 and L2
3. CompactionType is checked in `Compaction`, no need to check again
inside compactor
4. Use ligter method to replace `getSegmentMeta`
5. Log information for L0 compaction when encounters an error

See also: #30213

---------

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-02-01 14:25:04 +08:00
congqixia be8831b311
enhance: Reduce get segments scan during l0 compaction (#30408)
See also #27606

Previously l0 linear compaction will scan all target segment id from
metacache for each line of delta entry, which is not needed since
compaction target segments shall be all immutable.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-02-01 10:59:03 +08:00
yihao.dai c5918290e6
feat: Add import executor and manager for datanode (#29438)
This PR introduces novel importv2 roles for datanode:
1. Executor: To execute tasks, a import task will be divided into the
following steps: read data -> hash data -> sync data;
2. Manager: To manage all the tasks;

issue: https://github.com/milvus-io/milvus/issues/28521

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-31 20:45:04 +08:00
congqixia fc0d007bd1
enhance: Add `MemoryHighSyncPolicy` back to write buffer manager (#29997)
See also #27675

This PR adds back MemoryHighSyncPolicy implementation. Also change
MinSegmentSize & CheckInterval to configurable param item.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-31 19:03:04 +08:00
congqixia b5e078c4d3
enhance: Remove current stats after RollStats action (#30391)
See also #27675

BloomFilterSet.current shall be reset after RollStats, otherwise it will
keep tracking whole segment data causing the false positive ratio larger
than expected.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-31 18:55:04 +08:00
chyezh 6d63fb5d3f
fix: panic with datanode negetive wait group counter (#30135)
issue: #29170

Signed-off-by: chyezh <chyezh@outlook.com>
2024-01-30 18:15:04 +08:00
congqixia 0c7a96b48d
enhance: Make compaction log has traceID (#30338)
See also #30167

After support open telemetry tracing, we want to have traceID as well,
this PR adds util functions to set traceID with span & propagate traceID
between different context.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-30 10:09:03 +08:00
congqixia 743bdf1434
enhance: Make l0 compactor download files in parallel (#30309)
See also #27606

`MultiRead` actually download file in sequence, which may lead to large
time consumption during l0 compaction download phase.

This PR make l0 compactor download deltalogs in parallel utilizing conc
package & io pool.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-30 10:07:09 +08:00
congqixia 6445880753
fix: prevent segments got flushed multiple times (#30240)
See also #30111

Segments could be "Flushed" only by `FlushSegments` grpc call from
datacoord by design. There are two possible reason to cause one segment
got flushed multiple times.

- Segment is in flushing state during multiple epoch in flowgraph
- Segment is flushed by flushTs & Flush segments

So this pr fix:

- Remove state change logic form FlushTs policy
- Change Flush segment into three stage way: Sealed->Flushing->Flushed
preventing multiple Flushed=true operations.

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-24 14:19:00 +08:00
congqixia 6a73860815
enhance: Add open telemetry tracing for compaction (#30168)
Resolves #30167

This PR add tracing for all compaction from the task start in datacoord
and execution procedures in datanode.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-23 10:37:00 +08:00
congqixia 8a6de3d2b1
fix: decompress deltelog path for level zero compaction (#30164)
Resolves: #30161
See also: #28873

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-22 14:44:55 +08:00
yihao.dai 8780d65b66
fix: Use channel cp as the dml&start position for import segments (#30107)
This PR discontinuing the subscription to the mq and, instead, employing
the channel checkpoint as the DML and starting position for the import
segments.

issue: https://github.com/milvus-io/milvus/issues/30106

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2024-01-22 14:36:55 +08:00
XuanYang-cn 3d46096f86
fix: Set segment level for comapct to segment (#30129)
See also: #29204

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-19 18:52:53 +08:00
congqixia 0d356b0545
enhance: only buffer delete if it match insert has smaller timestamp (#30122)
See also: #30121 #27675

This PR changes the delete buffering logic:
- Write buffer shall buffer insert first
- Then the delete messages shall be evaluated
  - Whether PK matches previous Bloom filter, which ts is always smaller
  - Whether PK matches insert data which has smaller timestamp
- Then the segment bloom filter is updates by the newly buffered pk rows

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-19 17:28:53 +08:00
XuanYang-cn 86f48861c1
fix: Add more throughput in related metrics (#30038)
This PR also fixes bugs in l0 compactor where
l0 results would never be removed from datanode

See also: #30099

Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-19 11:34:54 +08:00
smellthemoon e52ce370b6
enhance:don't store logPath in meta to reduce memory (#28873)
don't store logPath in meta to reduce memory, when service get
segmentinfo, generate logpath from logid.
#28885

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-18 22:06:31 +08:00
XuanYang-cn ad7a0b4091
fix: Change finish log level to info (#30031)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-17 10:12:55 +08:00
SimFG d9edd50f97
fix: the delete msg disorder issue (#29915)
/kind improvement

Signed-off-by: SimFG <bang.fu@zilliz.com>
2024-01-14 10:26:52 +08:00
congqixia ed89c6a2ee
enhance: make compactor use actual buffer size to decide when to sync (#29945)
See also: #29657

Datanode Compactor use estimated row number from schema to decide when
to sync the batch of data when executing compaction. This est value
could go way from actual size when the schema contains variable field(
say VarChar, JSON, etc.)

This PR make compactor able to check the actual buffer data size and
make it possible to sync when buffer is actually beyong max binglog
size.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-13 01:32:52 +08:00
Bingyi Sun e1258b8cad
feat: integrate storagev2 into loading segment (#29336)
issue: #29335

---------

Signed-off-by: sunby <sunbingyi1992@gmail.com>
2024-01-12 18:10:51 +08:00
Xu Tong e429965f32
Add float16 approve for multi-type part (#28427)
issue:https://github.com/milvus-io/milvus/issues/22837

Add bfloat16 vector, add the index part of float16 vector.

Signed-off-by: Writer-X <1256866856@qq.com>
2024-01-11 15:48:51 +08:00
XuanYang-cn 9c8fd5e51d
fix: Save lite WatchInfo into etcd in DataNode (#29687)
Signed-off-by: yangxuan <xuan.yang@zilliz.com>
2024-01-10 21:18:49 +08:00
congqixia a040692129
enhance: Use estimated batch size to initalize BF (#29842)
See also: #27675

The bloom filter set initialized new BF with fixed configured `n`. This
value is always larger than the actual batch size and causes generated
BF using more memory.

This PR make write buffer to initialize BF with estimated batch size
from schema & configuration value.

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-10 20:36:50 +08:00
Buqian Zheng d506d33a8d
fix: meta cache in datanode incorrectly tracking row nums (#29817)
... of compacted segments

issue: #29816

Signed-off-by: Buqian Zheng <zhengbuqian@gmail.com>
2024-01-10 13:22:48 +08:00
congqixia f18a7191f2
enhance: make `ColumnBasedInsertMsgToInsertData` check field missing (#29758)
fix: #29757

In previous code, `ColumnBasedInsertMsgToInsertData` adds empty field if
the insertMsg parameter does not have the column schema defined. This
may lead to unexpected behavior of caller functions.

This PR:
- Add column missing check
- Add column length check
- Generate BlobInfo for ColumnBasedInsertMsgToInsertData result

---------

Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
2024-01-09 11:50:48 +08:00
smellthemoon 1c1f2a1371
enhance:change some logs (#29579)
related #29588

Signed-off-by: lixinguo <xinguo.li@zilliz.com>
Co-authored-by: lixinguo <xinguo.li@zilliz.com>
2024-01-05 16:12:48 +08:00