milvus/internal/datacoord
Ted Xu ebb648f647
feat: integrate manifest-based statistics for V3 storage segments (#48005)
See: #48006

design doc:
https://github.com/milvus-io/milvus-design-docs/blob/main/design_docs/20260226-manifest-format.md

Integrate manifest-based statistics storage (bloom filter, BM25,
text index, JSON key index) for V3 LOON segments across all write
and read paths.

Key changes:
- Add StatsResolver to centralize V2/V3 stats branching with lazy
  manifest caching, shared across segment loading, L0 compaction,
  and flush recovery
- Write-side: pack_writer_v3, sort_compaction, task_stats register
  stats in manifest with memory_size metadata
- Read-side: segment_loader, l0_compactor, data_sync_service use
  StatsResolver for unified path resolution
- New FFI layer for manifest stats reading/writing
- PackSegmentLoadInfo clears legacy fields when manifest_path set
- Fix manifest version chaining across sequential stats registrations
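
The StatsResolver idea above (one place that branches between V2 and V3 stats and reads the manifest lazily, at most once) could be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: SegmentMeta, Manifest, ManifestLoader, and the StatsPaths method are hypothetical names, and the rule "a non-empty manifest path means a V3 segment" is an assumption made for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// SegmentMeta is a hypothetical, simplified view of a segment's metadata.
type SegmentMeta struct {
	ID           int64
	ManifestPath string   // assumed to be non-empty only for V3 segments
	LegacyStats  []string // V2-style stats paths stored directly in the meta
}

// Manifest is a hypothetical stand-in for a parsed V3 manifest.
type Manifest struct {
	StatsPaths []string
}

// ManifestLoader reads a manifest from storage. It is injected so that the
// sketch does not depend on any real storage or FFI layer.
type ManifestLoader func(path string) (*Manifest, error)

// StatsResolver hides the V2/V3 branch from callers and caches the manifest
// after the first successful or failed read.
type StatsResolver struct {
	seg    SegmentMeta
	load   ManifestLoader
	once   sync.Once
	cached *Manifest
	err    error
}

// NewStatsResolver builds a resolver for one segment.
func NewStatsResolver(seg SegmentMeta, load ManifestLoader) *StatsResolver {
	return &StatsResolver{seg: seg, load: load}
}

// StatsPaths returns the stats paths for the segment. V2 segments answer
// from the meta directly; V3 segments read the manifest on first use only.
func (r *StatsResolver) StatsPaths() ([]string, error) {
	if r.seg.ManifestPath == "" {
		return r.seg.LegacyStats, nil
	}
	r.once.Do(func() {
		r.cached, r.err = r.load(r.seg.ManifestPath)
	})
	if r.err != nil {
		return nil, r.err
	}
	return r.cached.StatsPaths, nil
}

func main() {
	loads := 0
	loader := func(path string) (*Manifest, error) {
		loads++ // count how often storage is actually touched
		return &Manifest{StatsPaths: []string{path + "/bloom_filter"}}, nil
	}

	v3 := NewStatsResolver(SegmentMeta{ID: 1, ManifestPath: "seg1/manifest"}, loader)
	v3.StatsPaths() // first call reads the manifest
	paths, _ := v3.StatsPaths()
	fmt.Println("v3:", paths, "manifest loads:", loads)

	v2 := NewStatsResolver(SegmentMeta{ID: 2, LegacyStats: []string{"seg2/stats/1"}}, loader)
	legacy, _ := v2.StatsPaths()
	fmt.Println("v2:", legacy, "manifest loads:", loads)
}
```

Using sync.Once keeps the cached read safe if several callers (for example loading and compaction) share one resolver; a real implementation may need to invalidate the cache when the manifest version changes, which this sketch does not attempt.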

---------

Signed-off-by: Ted Xu <ted.xu@zilliz.com>
2026-03-23 12:01:28 +08:00
allocator
broker enhance: add CMEK compatibility validation for snapshot restore (#47089) 2026-02-05 10:57:50 +08:00
session
task fix: pickNode should deduct task slots instead of zeroing available slots (#48170) 2026-03-15 21:17:25 +08:00
.mockery.yaml
OWNERS
README.md
analyze_inspector.go
analyze_meta.go
analyze_meta_test.go
build_index_policy.go
channel.go
compaction_inspector.go fix: prevent node panic when unsupported types used as ClusteringKey (#48184) 2026-03-15 21:21:25 +08:00
compaction_inspector_test.go fix: prevent node panic when unsupported types used as ClusteringKey (#48184) 2026-03-15 21:21:25 +08:00
compaction_l0_view.go
compaction_l0_view_test.go
compaction_policy_clustering.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
compaction_policy_clustering_test.go feat: [ExternalTable Part2] Disable all write operations on external collections (#46673) 2026-01-15 19:05:28 +08:00
compaction_policy_forcemerge.go fix: convert ForceMerge targetSize from MB to bytes for proper validation (#47327) 2026-02-03 19:31:53 +08:00
compaction_policy_forcemerge_test.go fix: convert ForceMerge targetSize from MB to bytes for proper validation (#47327) 2026-02-03 19:31:53 +08:00
compaction_policy_l0.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
compaction_policy_l0_test.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
compaction_policy_single.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
compaction_policy_single_test.go feat: [ExternalTable Part2] Disable all write operations on external collections (#46673) 2026-01-15 19:05:28 +08:00
compaction_policy_storage_version.go enhance: add session version requirement for storage version upgrade compaction (#47376) 2026-01-30 12:05:47 +08:00
compaction_policy_storage_version_test.go enhance: add session version requirement for storage version upgrade compaction (#47376) 2026-01-30 12:05:47 +08:00
compaction_queue.go
compaction_queue_test.go
compaction_task.go
compaction_task_clustering.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
compaction_task_clustering_test.go
compaction_task_l0.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
compaction_task_l0_test.go fix: Fast finish compaction when L0Comp hit zero L1/L2 (#47154) 2026-01-20 11:21:30 +08:00
compaction_task_meta.go fix: Ignore L0Compaction when check PreAllocSegmentIDs (#47117) 2026-01-19 14:45:29 +08:00
compaction_task_meta_test.go fix: Ignore L0Compaction when check PreAllocSegmentIDs (#47117) 2026-01-19 14:45:29 +08:00
compaction_task_mix.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
compaction_task_mix_test.go
compaction_trigger.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
compaction_trigger_test.go fix: pass scalar index version through load path to fix version 3 index loading (#47342) 2026-01-29 14:11:32 +08:00
compaction_trigger_v2.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
compaction_trigger_v2_test.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
compaction_util.go
compaction_view.go
compaction_view_forcemerge.go enhance: add logging for force merge algorithm selection (#47210) 2026-01-22 15:21:30 +08:00
compaction_view_forcemerge_test.go fix: Use user-provided target size (#46835) 2026-01-16 11:41:28 +08:00
const.go
copy_segment_checker.go fix: prevent DropSnapshot while restore jobs reference the snapshot (#47608) 2026-02-13 11:54:41 +08:00
copy_segment_checker_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
copy_segment_inspector.go
copy_segment_inspector_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
copy_segment_job.go
copy_segment_job_test.go
copy_segment_meta.go fix: prevent DropSnapshot while restore jobs reference the snapshot (#47608) 2026-02-13 11:54:41 +08:00
copy_segment_meta_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
copy_segment_task.go enhance: unify copy segment logic with manifest path support (#47751) 2026-03-02 15:37:19 +08:00
copy_segment_task_test.go
create_meta_test.go
ddl_callbacks.go fix: prevent TOCTOU race and deadlock in snapshot restore broadcast (#47870) 2026-03-05 15:41:22 +08:00
ddl_callbacks_alter_index.go
ddl_callbacks_create_index.go
ddl_callbacks_drop_index.go
ddl_callbacks_external_collection.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
ddl_callbacks_external_collection_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
ddl_callbacks_flushall.go
ddl_callbacks_snapshot.go fix: Improve snapshot restore and RefIndex loading reliability (#46883) 2026-01-13 21:53:27 +08:00
ddl_callbacks_snapshot_test.go fix: prevent TOCTOU race and deadlock in snapshot restore broadcast (#47870) 2026-03-05 15:41:22 +08:00
errors.go
errors_test.go
external_collection_refresh_checker.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
external_collection_refresh_checker_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
external_collection_refresh_inspector.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
external_collection_refresh_inspector_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
external_collection_refresh_manager.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
external_collection_refresh_manager_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
external_collection_refresh_meta.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
external_collection_refresh_meta_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
garbage_collector.go fix: Prevent GC from deleting index files referenced by snapshots (#48022) 2026-03-06 18:39:21 +08:00
garbage_collector_test.go fix: Prevent GC from deleting index files referenced by snapshots (#48022) 2026-03-06 18:39:21 +08:00
go_channel_singleton.go
handler.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
handler_test.go fix: populate LevelZeroSegmentIds in GetDataVChanPositions (#47579) 2026-02-05 19:39:50 +08:00
import_checker.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
import_checker_test.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
import_inspector.go
import_inspector_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
import_job.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
import_meta.go enhance: Prevent import job or task rolling back state (#47100) 2026-01-16 17:49:29 +08:00
import_meta_test.go
import_task.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
import_task_import.go fix: Use actual data timestamps for imported segment positions (#47276) 2026-01-28 14:13:32 +08:00
import_task_import_test.go fix: Use actual data timestamps for imported segment positions (#47276) 2026-01-28 14:13:32 +08:00
import_task_preimport.go
import_task_preimport_test.go
import_util.go fix: Use actual data timestamps for imported segment positions (#47276) 2026-01-28 14:13:32 +08:00
import_util_test.go feat: [ExternalTable Part3] Support manual refresh for external collections (#47492) 2026-02-26 11:20:46 +08:00
index_engine_version_manager.go enhance: add session version requirement for storage version upgrade compaction (#47376) 2026-01-30 12:05:47 +08:00
index_engine_version_manager_test.go enhance: add session version requirement for storage version upgrade compaction (#47376) 2026-01-30 12:05:47 +08:00
index_inspector.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
index_inspector_test.go
index_meta.go fix: add WarmupKey to checkParams filter for CreateIndex idempotency (#47595) 2026-02-06 11:09:50 +08:00
index_meta_test.go fix: add WarmupKey to checkParams filter for CreateIndex idempotency (#47595) 2026-02-06 11:09:50 +08:00
index_service.go enhance: support user specified warmup (#47373) 2026-02-02 14:34:00 +08:00
index_service_test.go
knapsack.go
knapsack_test.go
meta.go fix: generate ManifestPath for StorageV3 growing segments at creation time (#48203) 2026-03-11 22:47:25 +08:00
meta_test.go fix: generate ManifestPath for StorageV3 growing segments at creation time (#48203) 2026-03-11 22:47:25 +08:00
meta_util.go
metrics_info.go enhance: optimize mixcoord cpu usage (#47618) 2026-03-08 22:31:21 +08:00
metrics_info_test.go enhance: Improve the consistency of file resource sync (#47113) 2026-01-26 11:45:32 +08:00
mock_collection_topology_querier.go
mock_compaction_inspector.go
mock_compaction_meta.go
mock_handler.go
mock_import_meta.go
mock_index_engine_version_manager.go enhance: add session version requirement for storage version upgrade compaction (#47376) 2026-01-30 12:05:47 +08:00
mock_segment_manager.go
mock_stats_job_manager.go
mock_sub_cluster.go
mock_test.go enhance: optimize mixcoord cpu usage (#47618) 2026-03-08 22:31:21 +08:00
mock_trigger.go
mock_trigger_manager.go fix: remove interactive logic between import and L0 compaction (#47768) 2026-03-10 15:57:22 +08:00
partition_stats_meta.go
partition_stats_meta_test.go
segment_allocation_policy.go
segment_allocation_policy_test.go
segment_info.go feat: integrate manifest-based statistics for V3 storage segments (#48005) 2026-03-23 12:01:28 +08:00
segment_info_test.go feat: integrate manifest-based statistics for V3 storage segments (#48005) 2026-03-23 12:01:28 +08:00
segment_manager.go fix: generate ManifestPath for StorageV3 growing segments at creation time (#48203) 2026-03-11 22:47:25 +08:00
segment_manager_test.go fix: generate ManifestPath for StorageV3 growing segments at creation time (#48203) 2026-03-11 22:47:25 +08:00
segment_operator.go feat: integrate manifest-based statistics for V3 storage segments (#48005) 2026-03-23 12:01:28 +08:00
segment_operator_test.go fix: Use actual data timestamps for imported segment positions (#47276) 2026-01-28 14:13:32 +08:00
server.go fix: remove IsTriggerKill SIGINT from datacoord and querycoord session watchers (#48252) 2026-03-19 22:29:28 +08:00
server_test.go fix: remove IsTriggerKill SIGINT from datacoord and querycoord session watchers (#48252) 2026-03-19 22:29:28 +08:00
services.go feat: integrate manifest-based statistics for V3 storage segments (#48005) 2026-03-23 12:01:28 +08:00
services_test.go fix: prevent TOCTOU race and deadlock in snapshot restore broadcast (#47870) 2026-03-05 15:41:22 +08:00
snapshot.go fix: Prevent GC from deleting index files referenced by snapshots (#48022) 2026-03-06 18:39:21 +08:00
snapshot_manager.go fix: prevent TOCTOU race and deadlock in snapshot restore broadcast (#47870) 2026-03-05 15:41:22 +08:00
snapshot_manager_test.go fix: prevent TOCTOU race and deadlock in snapshot restore broadcast (#47870) 2026-03-05 15:41:22 +08:00
snapshot_meta.go fix: Prevent GC from deleting index files referenced by snapshots (#48022) 2026-03-06 18:39:21 +08:00
snapshot_meta_test.go fix: Prevent GC from deleting index files referenced by snapshots (#48022) 2026-03-06 18:39:21 +08:00
snapshot_test.go
stats_inspector.go fix: Optimize namespace compaction and query implementation (#46512) 2026-03-09 21:11:23 +08:00
stats_inspector_test.go feat: [ExternalTable Part2] Disable all write operations on external collections (#46673) 2026-01-15 19:05:28 +08:00
stats_task_meta.go
stats_task_meta_test.go
task_analyze.go
task_analyze_test.go
task_index.go fix: skip index creation for nullable vector fields with insufficient valid rows (#46903) 2026-01-13 10:11:27 +08:00
task_index_test.go
task_refresh_external_collection.go feat: [ExternalTable Part4] Support data mapping for external collections (#47730) 2026-03-04 18:09:21 +08:00
task_refresh_external_collection_test.go feat: [ExternalTable Part4] Support data mapping for external collections (#47730) 2026-03-04 18:09:21 +08:00
task_stats.go feat: integrate manifest-based statistics for V3 storage segments (#48005) 2026-03-23 12:01:28 +08:00
task_stats_test.go fix: fix non-atomic CreateCollection causes schema loss (#47900) 2026-03-01 21:31:20 +08:00
util.go enhance: support configurable TLS minimum version for object storage connections (#48000) 2026-03-04 19:45:21 +08:00
util_test.go

README.md

Data Coordinator

Data coordinator (datacoord for short) is the component that organizes DataNodes and segment allocations.

Dependency

  • KV store: holds all the meta information datacoord needs to operate. (etcd)
  • Message stream: used to exchange statistics information with data nodes. (Pulsar)
  • Root Coordinator: the source of timestamps, IDs, and meta.
  • Data Node(s): a single instance or a cluster; the worker group that actually handles data modification operations.