design doc:
https://github.com/milvus-io/milvus-design-docs/blob/main/design_docs/20260105-external_table.md
issue: #45881
This change introduces manual refresh capability for external
collections, allowing users to trigger on-demand data synchronization
from external sources. It replaces the legacy update mechanism with a
more robust job-task hierarchy and persistent state management.
Key changes:
- Add RefreshExternalCollection, GetRefreshExternalCollectionProgress,
and ListRefreshExternalCollectionJobs APIs across Client, Proxy,
and DataCoord
- Implement ExternalCollectionRefreshManager to manage refresh jobs
with a 1:N Job-Task hierarchy
- Add ExternalCollectionRefreshMeta for persistent storage of jobs and
tasks in the metastore
- Add ExternalCollectionRefreshChecker for task state management and
worker assignment
- Implement ExternalCollectionRefreshInspector for periodic job
cleanup
- Use WAL Broadcast mechanism for distributed consistency and
idempotency
- Replace legacy external_collection_inspector and update tasks with
the new refresh-based implementation
- Add comprehensive unit tests for refresh job lifecycle and state
transitions
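The 1:N Job-Task hierarchy can be pictured with a small sketch. This is only an illustration: the type names, state names, and aggregation rule below are assumptions and are not taken from ExternalCollectionRefreshManager.

```go
package main

import "fmt"

// taskState is a hypothetical task state; the real states may differ.
type taskState int

const (
	taskPending taskState = iota
	taskInProgress
	taskFinished
	taskFailed
)

// refreshJob is a hypothetical job that owns N tasks (the 1:N hierarchy).
type refreshJob struct {
	JobID int64
	Tasks []taskState
}

// summarize derives a job-level state and a progress percentage from the
// task states. Assumed rule: any failed task fails the job, and the job is
// completed only when every task has finished.
func (j refreshJob) summarize() (string, int) {
	if len(j.Tasks) == 0 {
		return "pending", 0
	}
	finished, failed := 0, false
	for _, t := range j.Tasks {
		switch t {
		case taskFinished:
			finished++
		case taskFailed:
			failed = true
		}
	}
	progress := finished * 100 / len(j.Tasks)
	switch {
	case failed:
		return "failed", progress
	case finished == len(j.Tasks):
		return "completed", progress
	default:
		return "in_progress", progress
	}
}

func main() {
	job := refreshJob{JobID: 1, Tasks: []taskState{taskFinished, taskInProgress}}
	state, progress := job.summarize()
	fmt.Println(state, progress)
}
```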
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Related to #47303
The Go SDK had two issues with the Timestamptz field type:
1. FieldTypeTimestamptz was incorrectly defined as 15, but the server
expects 26
2. Timestamptz data was serialized as int64 via TimestamptzData, but the
server expects ISO 8601 strings (RFC3339Nano format) via StringData
Changes:
- Update FieldTypeTimestamptz value from 15 to 26
- Modify ColumnTimestamptz to store data as RFC3339Nano strings
internally
- Change NewColumnTimestamptz to accept []time.Time and convert to ISO
strings
- Add ColumnTimestampTzIsoString for direct ISO string input
- Update FieldDataColumn to parse Timestamptz from StringData
- Update values2Scalars to handle Timestamptz as string type
- Add NewNullableColumnTimestamptz for nullable time.Time input
- Update NewNullableColumnTimestamptzIsoString for nullable ISO string
input
- Add corresponding unit tests
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: https://github.com/milvus-io/milvus/issues/41746
This PR adds MinHash "DIDO" (Data In, Data Out) support to Milvus, which
allows computing MinHash signatures on-the-fly during search operations
instead of requiring pre-stored vectors.
Key changes:
- Implemented SIMD-optimized C++ MinHash computation (AVX2/AVX512 for
x86, NEON/SVE for ARM)
- Added runtime CPU detection and function hooks to automatically select
the best SIMD implementation
- Integrated MinHash computation into search pipeline (brute force
search, growing segment search)
- Added support for LSH-based MinHash search with configurable band
width and bit width parameters
- Enabled direct text-to-signature conversion during query execution,
reducing storage overhead
This enables efficient text deduplication and similarity search without
storing pre-computed MinHash vectors.
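For readers unfamiliar with MinHash, a minimal pure-Go sketch of computing a signature from text follows. It is not the SIMD C++ code from this PR: whitespace tokenization, FNV-1a base hashing, the affine mixing, and all names are assumptions chosen for brevity.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"math/rand"
	"strings"
)

// minHasher holds one (a, b) pair per hash function; illustrative only.
type minHasher struct {
	a, b []uint64
}

// newMinHasher derives numHashes affine mixers from a fixed seed so that
// signatures are reproducible across calls.
func newMinHasher(numHashes int, seed int64) *minHasher {
	rng := rand.New(rand.NewSource(seed))
	m := &minHasher{a: make([]uint64, numHashes), b: make([]uint64, numHashes)}
	for i := 0; i < numHashes; i++ {
		m.a[i] = rng.Uint64() | 1 // odd multiplier is a bijection mod 2^64
		m.b[i] = rng.Uint64()
	}
	return m
}

// signature keeps, for every hash function, the minimum mixed hash over
// all whitespace-separated tokens of the text.
func (m *minHasher) signature(text string) []uint64 {
	sig := make([]uint64, len(m.a))
	for i := range sig {
		sig[i] = math.MaxUint64
	}
	for _, tok := range strings.Fields(text) {
		h := fnv.New64a()
		h.Write([]byte(tok))
		base := h.Sum64()
		for i := range sig {
			if v := m.a[i]*base + m.b[i]; v < sig[i] {
				sig[i] = v
			}
		}
	}
	return sig
}

// estimateJaccard returns the fraction of matching signature positions.
func estimateJaccard(x, y []uint64) float64 {
	if len(x) == 0 || len(x) != len(y) {
		return 0
	}
	same := 0
	for i := range x {
		if x[i] == y[i] {
			same++
		}
	}
	return float64(same) / float64(len(x))
}

func main() {
	m := newMinHasher(128, 42)
	s1 := m.signature("milvus supports minhash search")
	s2 := m.signature("milvus supports vector search")
	fmt.Printf("estimated similarity: %.2f\n", estimateJaccard(s1, s2))
}
```

Because the signature depends only on the set of tokens, reordering words does not change it.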
Signed-off-by: cqy123456 <qianya.cheng@zilliz.com>
issue: #45881
- Add ExternalSource and ExternalSpec fields to collection schema
- Add ExternalField mapping for field schema to map external columns
- Implement ValidateExternalCollectionSchema() to enforce restrictions:
  - No primary key (virtual PK generated automatically)
  - No dynamic fields, partition keys, clustering keys, or auto ID
  - No text match or function features
  - All user fields must have external_field mapping
- Return virtual PK schema for external collections in
GetPrimaryFieldSchema()
- Skip primary key validation for external collections during creation
- Add comprehensive unit tests and integration tests
- Add design document and user guide
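The restrictions listed for ValidateExternalCollectionSchema() can be sketched as a standalone check. The struct layout, field names, and error messages here are hypothetical stand-ins for the real schema types, and the source URI in main is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// fieldSchema and collectionSchema are simplified, hypothetical stand-ins
// for the real schema messages.
type fieldSchema struct {
	Name            string
	ExternalField   string
	IsPrimaryKey    bool
	IsPartitionKey  bool
	IsClusteringKey bool
	AutoID          bool
	EnableMatch     bool
}

type collectionSchema struct {
	ExternalSource     string
	EnableDynamicField bool
	FunctionCount      int
	Fields             []fieldSchema
}

// validateExternalSchema applies the restrictions named in the commit
// message to the simplified schema above.
func validateExternalSchema(s collectionSchema) error {
	if s.EnableDynamicField {
		return errors.New("dynamic fields are not allowed")
	}
	if s.FunctionCount > 0 {
		return errors.New("functions are not allowed")
	}
	for _, f := range s.Fields {
		switch {
		case f.IsPrimaryKey:
			return fmt.Errorf("field %q: primary key is not allowed", f.Name)
		case f.IsPartitionKey:
			return fmt.Errorf("field %q: partition key is not allowed", f.Name)
		case f.IsClusteringKey:
			return fmt.Errorf("field %q: clustering key is not allowed", f.Name)
		case f.AutoID:
			return fmt.Errorf("field %q: auto ID is not allowed", f.Name)
		case f.EnableMatch:
			return fmt.Errorf("field %q: text match is not allowed", f.Name)
		case f.ExternalField == "":
			return fmt.Errorf("field %q: missing external_field mapping", f.Name)
		}
	}
	return nil
}

func main() {
	s := collectionSchema{
		ExternalSource: "s3://example-bucket/data", // placeholder value
		Fields:         []fieldSchema{{Name: "vec", ExternalField: "embedding"}},
	}
	fmt.Println(validateExternalSchema(s))
}
```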
---------
Signed-off-by: Wei Liu <wei.liu@zilliz.com>
Co-authored-by: sunby <sunbingyi1992@gmail.com>
related: #45993
This commit extends nullable vector support to the proxy layer and
querynode, and adds comprehensive validation, search reduce, and field
data handling for nullable vectors with sparse storage.
Proxy layer changes:
- Update validate_util.go checkAligned() with getExpectedVectorRows()
helper to validate nullable vector field alignment using valid data
count
- Update checkFloatVectorFieldData/checkSparseFloatVectorFieldData for
nullable vector validation with proper row count expectations
- Add FieldDataIdxComputer in typeutil/schema.go for logical-to-physical
index translation during search reduce operations
- Update search_reduce_util.go reduceSearchResultData to use
idxComputers for correct field data indexing with nullable vectors
- Update task.go, task_query.go, task_upsert.go for nullable vector
handling
- Update msg_pack.go with nullable vector field data processing
QueryNode layer changes:
- Update segments/result.go for nullable vector result handling
- Update segments/search_reduce.go with nullable vector offset
translation
Storage and index changes:
- Update data_codec.go and utils.go for nullable vector serialization
- Update indexcgowrapper/dataset.go and index.go for nullable vector
indexing
Utility changes:
- Add FieldDataIdxComputer struct with Compute() method for efficient
logical-to-physical index mapping across multiple field data
- Update EstimateEntitySize() and AppendFieldData() with fieldIdxs
parameter
- Update funcutil.go with nullable vector support functions
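The logical-to-physical index translation can be illustrated with a short sketch. It assumes that sparse storage keeps only the valid (non-null) vectors, packed in row order. The function names are illustrative and do not mirror FieldDataIdxComputer's actual API.

```go
package main

import "fmt"

// buildPhysicalIndex maps each logical row to its position in the compact
// vector storage, or -1 when the row is null.
// Assumption: only valid rows are stored, in their original order.
func buildPhysicalIndex(valid []bool) []int {
	idx := make([]int, len(valid))
	next := 0
	for i, ok := range valid {
		if ok {
			idx[i] = next
			next++
		} else {
			idx[i] = -1
		}
	}
	return idx
}

// expectedVectorRows counts the vectors that must be physically present,
// which is the number of valid rows rather than the logical row count.
func expectedVectorRows(valid []bool) int {
	n := 0
	for _, ok := range valid {
		if ok {
			n++
		}
	}
	return n
}

func main() {
	valid := []bool{true, false, true, true}
	fmt.Println(buildPhysicalIndex(valid)) // [0 -1 1 2]
	fmt.Println(expectedVectorRows(valid)) // 3
}
```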
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **New Features**
* Full support for nullable vector fields (float, binary, float16,
bfloat16, int8, sparse) across ingest, storage, indexing, search and
retrieval; logical↔physical offset mapping preserves row semantics.
* Client: compaction control and compaction-state APIs.
* **Bug Fixes**
* Improved validation for adding vector fields (nullable + dimension
checks) and corrected search/query behavior for nullable vectors.
* **Chores**
* Persisted validity maps with indexes and on-disk formats.
* **Tests**
* Extensive new and updated end-to-end nullable-vector tests.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: marcelo-cjl <marcelo.chen@zilliz.com>
Related to #42148
Add comprehensive support for struct array field type in the Go SDK,
including data structure definitions, column operations, schema
construction, and full test coverage.
**Struct Array Column Implementation (`client/column/struct.go`)**
- Add `columnStructArray` type to handle struct array fields
- Implement `Column` interface methods:
  - `NewColumnStructArray()`: Create new struct array column from
  sub-fields
  - `Name()`, `Type()`: Basic metadata accessors
  - `Slice()`: Support slicing across all sub-fields
  - `FieldData()`: Convert to protobuf `StructArrayField` format
  - `Get()`: Retrieve struct values as `map[string]any`
  - `ValidateNullable()`, `CompactNullableValues()`: Nullable support
- Placeholder implementations for unsupported operations (AppendValue,
GetAsX, IsNull, AppendNull)
**Struct Array Parsing (`client/column/columns.go`)**
- Add `parseStructArrayData()` function to parse `StructArrayField` from
protobuf
- Update `FieldDataColumn()` to detect and parse struct array fields
- Support range-based slicing for struct array data
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
issue: #43427
The main goal of this PR is to merge #37417 into Milvus 2.5 without
conflicts.
# Main Goals
1. Create and describe collections with geospatial type
2. Insert geospatial data into the insert binlog
3. Load segments containing geospatial data into memory
4. Enable query and search to return geospatial data
5. Support using GIS functions like ST_EQUALS in query
6. Support R-Tree index for geometry type
# Solution
1. **Add Type**: Modify the Milvus core by adding a Geospatial type in
both the C++ and Go code layers, defining the Geospatial data structure
and the corresponding interfaces.
2. **Dependency Libraries**: Introduce necessary geospatial data
processing libraries. In the C++ source code, use Conan package
management to include the GDAL library. In the Go source code, add the
go-geom library to the go.mod file.
3. **Protocol Interface**: Revise the Milvus protocol to provide
mechanisms for Geospatial message serialization and deserialization.
4. **Data Pipeline**: Facilitate interaction between the client and
proxy using the WKT format for geospatial data. The proxy will convert
all data into WKB format for downstream processing, providing column
data interfaces, segment encapsulation, segment loading, payload
writing, and cache block management.
5. **Query Operators**: Implement simple display and support for filter
queries. Initially, focus on filtering based on spatial relationships
for a single column of geospatial literal values, providing parsing and
execution for query expressions. Currently only brute-force search is
supported.
6. **Client Modification**: Enable the client to handle user input for
geospatial data and facilitate end-to-end testing. See the corresponding
changes in pymilvus.
---------
Signed-off-by: Yinwei Li <yinwei.li@zilliz.com>
Signed-off-by: Cai Zhang <cai.zhang@zilliz.com>
Co-authored-by: ZhuXi <150327960+Yinwei-Yu@users.noreply.github.com>
See failure run in #40352
This PR:
- Move the metaheader map from the config to the client struct
- Set default value for field schema
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #39093
This PR adds an update timestamp check and a retry policy according to
the design in the related issue
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #31293 #37031
This PR:
- Add DescribeReplica API
- Add unified RBAC v2 API names (AddPrivilegesToGroup,
RemovePrivilegesFromGroup, GrantPrivilegeV2, RevokePrivilegeV2)
- Mark old ones deprecated
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #39095
Previous PR #39990 updated the pkg module path to use the "/v2" package
name. This PR updates the milvusclient Go SDK dependency accordingly.
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #31293
This PR:
- Add AlterDatabaseProperties API
- Add DropDatabaseProperties API
- Add DescribeDatabase API
- Rename AlterCollection to AlterCollectionProperties
- Add DropCollectionProperties API
- Add AlterCollectionFieldProperties API
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #31293
- Rename `UsingDatabase` to `UseDatabase`
- Uncomment default value methods
- Add missing RBAC APIs
- Add some resource group APIs
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Previous PR: #37978
This unit test is unstable because dim is a random number. When dim is
large enough, the precision loss exceeds 0.04
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>
Related to #35853
This PR contains the following changes:
- Add function and related proto and helper functions
- Remove the insert column missing check and leave it to server
- Add text as search input data
- Add some unit tests for logic above
---------
Signed-off-by: Congqi Xia <congqi.xia@zilliz.com>