Commit Graph

72 Commits (17532517c611cafe5ec7a79bda47c9f296e82682)

Author SHA1 Message Date
zhuwenxing 96615cce41
test: add MinHash DIDO function test suite (#47324)
/kind improvement

add testcases and fix a related issue

issue: #47928

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 17:17:30 +08:00
zhuwenxing e6e2645991
test: set Strong consistency level for chaos test collections (#47931)
## Summary
- Set `consistency_level="Strong"` at collection creation time in chaos
`checker.py` (9 places) to ensure data correctness during chaos testing
- Add explicit `consistency_level="Strong"` to all search/query calls in
`test_all_collections_after_chaos.py` (8 places) since it operates on
pre-existing collections

## Test plan
- [x] Pod logs confirm `[consistency_level=Strong]` at CreateCollection
- [x] `describe_collection` returns `consistency_level: 0` (Strong)
- [x] Insert-then-query returns all data immediately without explicit
consistency_level param
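
The `consistency_level: 0` check in the test plan can be made more readable with a small decoder. This is a minimal sketch, not code from the PR: the helper names are hypothetical, only the `0` = Strong mapping is confirmed by the text above, and the remaining values are assumed from the Milvus `ConsistencyLevel` enum.

```python
# Only 0 -> "Strong" is confirmed by the commit message; the other
# entries are an assumption based on the Milvus ConsistencyLevel enum.
CONSISTENCY_LEVEL_NAMES = {
    0: "Strong",
    1: "Session",
    2: "Bounded",
    3: "Eventually",
    4: "Customized",
}


def consistency_level_name(description):
    """Return a readable name for the level in a describe_collection-style dict."""
    level = description.get("consistency_level")
    return CONSISTENCY_LEVEL_NAMES.get(level, f"Unknown({level})")


def is_strong(description):
    """True when the described collection reports Strong consistency."""
    return consistency_level_name(description) == "Strong"
```

A check such as `assert is_strong(client.describe_collection(name))` would then fail with a readable level name rather than a bare integer, assuming the client returns a dict with a `consistency_level` key.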

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 17:11:24 +08:00
Feilong Hou 2f8456c54d
test: add entity TTL chaos tests for upsert and insert during node kill (#48057)
## Summary
- Add chaos test for upsert-during-node-kill: verifies TTL extension via
upsert persists after querynode kill and WAL replay
- Add chaos test for insert-during-node-kill: verifies new data with TTL
inserted during chaos is correctly handled after recovery
- Fix `_verify_correctness` to distinguish "service never recovered"
from "wrong counts"
- Strengthen `_verify_search_consistency` to require non-zero valid
results from non-expired groups

## Test plan
- [x] All 5 chaos tests pass against a Milvus cluster with Chaos Mesh
- [x] Upsert test confirms TTL extension survives querynode kill
- [x] Insert test confirms both long-TTL (alive) and short-TTL (expired)
data inserted during chaos are correctly handled after recovery

issue: https://github.com/milvus-io/milvus/issues/47482

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 20:51:24 +08:00
yanliang567 4322a8cae1
test: Add null vector checkers to chaos test pipeline (#47979)
## Summary

Wire three null-vector checkers into the chaos test concurrent operation
pipeline (`test_concurrent_operation.py`):

- **NullVectorSearchChecker** — detects NaN distances from null vector
leaks in search index
- **NullVectorQueryChecker** — validates null/non-null ratio consistency
in queries
- **AddVectorFieldChecker** — dynamically adds nullable FLOAT_VECTOR
fields, creates index, inserts data, and verifies

These checkers are already implemented in `checker.py` but were not
registered in `init_health_checkers()`. The default collection schema
already has nullable vector fields (`float_vector` and `image_emb` both
`nullable=True`), so the checkers use the shared collection name.

issue: #47867

### Changes

**Modified:**
- `tests/python_client/chaos/testcases/test_concurrent_operation.py`:
Added imports and registered 3 checkers

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Signed-off-by: yanliang567 <82361606+yanliang567@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 20:15:20 +08:00
zhuwenxing 8fc0b2d238
test: fix snapshot checker row count mismatch in chaos test (#47826)
## Summary
- Split snapshot testing into two checkers to fix row count mismatch
failures in chaos tests
- **SnapshotChecker** (`Op.snapshot`): lightweight, shares collection
with other checkers, only verifies snapshot create/restore operations
succeed
- **SnapshotRestoreChecker** (`Op.restore_snapshot`): uses independent
collection with internal DML operations, verifies data correctness (row
count + PK) after restore
- Removed `_snapshot_lock` from `Checker` class and all DML checkers
(Insert/Upsert/Delete) to eliminate coupling

## Root Cause
The `SnapshotRestoreChecker` shared a collection with other concurrent
checkers and relied on a global `_snapshot_lock` to prevent data
modifications during snapshot creation. However, some code paths (e.g.,
`Checker.insert_data()` base method,
`DeleteChecker.update_delete_expr()` refill logic) bypassed the lock,
causing row count mismatches after restore. The delta was exactly 3000
(`DELTA_PER_INS`).

Failure log: `expected=158188, actual=155188` (delta=3000)
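
The arithmetic behind the failure log can be checked with a short helper. This is a sketch only: `explain_row_count_mismatch` is a hypothetical name, and the only values taken from the commit message are `DELTA_PER_INS = 3000` and the two row counts.

```python
DELTA_PER_INS = 3000  # batch size quoted in the commit message


def explain_row_count_mismatch(expected, actual, batch_size=DELTA_PER_INS):
    """Report whether a row count mismatch is a whole number of insert batches.

    A mismatch that is an exact multiple of the batch size hints that one or
    more insert batches ran outside the snapshot lock.
    """
    delta = expected - actual
    whole_batches, remainder = divmod(abs(delta), batch_size)
    return {
        "delta": delta,
        "whole_batches": whole_batches,
        "is_batch_multiple": delta != 0 and remainder == 0,
    }


# Values from the failure log above.
report = explain_row_count_mismatch(expected=158188, actual=155188)
```

For the logged counts, the difference is exactly one batch of `DELTA_PER_INS` rows, which is consistent with a single insert path bypassing the lock.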

ref:
https://qa-jenkins.milvus.io/blue/organizations/jenkins/chaos-test-cron/detail/chaos-test-cron/23943/pipeline

## Test plan
- [ ] Run chaos-test-cron pipeline and verify `Op.snapshot` (shared
collection) passes
- [ ] Verify `Op.restore_snapshot` (independent collection) passes with
no row count mismatch
- [ ] Confirm other checkers (insert/upsert/delete) are not impacted by
lock removal

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2026-02-25 15:58:47 +08:00
zhuwenxing c7bacaf325
test: add SnapshotRestoreChecker for chaos testing (#47418)
## Summary
- Add `SnapshotRestoreChecker` to test snapshot restore functionality in
chaos testing
- Support concurrent execution with other checkers (shared collection)
- Use `self.get_schema()` to get latest schema for DML operations
- Simplified data verification for concurrent scenarios

## Test plan
- [x] Single run test passed
- [x] Concurrent operation test passed (100% success rate)
- [x] Added to test_concurrent_operation.py
- [x] Added to test_single_request_operation.py

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2026-02-03 15:34:24 +08:00
zhuwenxing e3a85be435
test: replace parquet with jsonl for EventRecords and RequestRecords in checker (#46671)
/kind improvement

- Core invariant: tests' persistence of EventRecords and RequestRecords
must be append-safe under concurrent writers; this PR replaces Parquet
with JSONL and uses per-file locks and explicit buffer flushes to
guarantee atomic, append-safe writes (EventRecords uses event_lock +
append per line; RequestRecords buffers under request_lock and flushes
to file when threshold or on sink()).

- Logic removed/simplified and rationale: DataFrame-based parquet
append/read logic (pyarrow/fastparquet) and implicit parquet buffering
were removed in favor of simple line-oriented JSON writes and explicit
buffer management. The complex Parquet append/merge paths were redundant
because parquet append under concurrent test-writer patterns caused
corruption; JSONL removes the append-mode complexity and the
parquet-specific buffering/serialization code.

- Why no data loss or behavior regression (concrete code paths):
EventRecords.insert writes a complete JSON object per event under
event_lock to /tmp/ci_logs/event_records_*.jsonl and get_records_df
reads every JSON line under the same lock (or returns an empty DataFrame
with the same schema on FileNotFound/Error), preserving all fields
event_name/event_status/event_ts. RequestRecords.insert appends to an
in-memory buffer under request_lock and triggers _flush_buffer() when
len(buffer) >= 100; _flush_buffer() writes each buffered JSON line to
/tmp/ci_logs/request_records_*.jsonl and clears the buffer; sink() calls
_flush_buffer() under request_lock before get_records_df() reads the
file — ensuring all buffered records are persisted before reads. Both
read paths handle FileNotFoundError and exceptions by returning empty
DataFrames with identical column schemas, so external callers see the
same API and no silent record loss.

- Enhancement summary (concrete): Replaces flaky Parquet append/read
with JSONL + explicit locking and deterministic flush semantics,
removing the root cause of parquet append corruption in tests while
keeping the original DataFrame-based analysis consumers unchanged
(get_records_df returns equivalent schemas).
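
The buffered, lock-protected JSONL pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the code from `checker.py`: the class name `BufferedJsonlRecords` and the `read_all` method are hypothetical, the file is written to a temporary directory instead of `/tmp/ci_logs`, and records are read back as a plain list rather than a DataFrame to keep the example dependency-free.

```python
import json
import os
import tempfile
import threading


class BufferedJsonlRecords:
    """Hypothetical stand-in for a RequestRecords-style JSONL writer."""

    def __init__(self, path, flush_threshold=100):
        self.path = path
        self.flush_threshold = flush_threshold
        self.buffer = []
        self.lock = threading.Lock()

    def _flush_buffer(self):
        # The caller must already hold self.lock.
        if not self.buffer:
            return
        with open(self.path, "a", encoding="utf-8") as f:
            for record in self.buffer:
                f.write(json.dumps(record) + "\n")  # one JSON object per line
        self.buffer.clear()

    def insert(self, record):
        with self.lock:
            self.buffer.append(record)
            if len(self.buffer) >= self.flush_threshold:
                self._flush_buffer()

    def sink(self):
        # Persist whatever is still buffered before anyone reads the file.
        with self.lock:
            self._flush_buffer()

    def read_all(self):
        with self.lock:
            try:
                with open(self.path, encoding="utf-8") as f:
                    return [json.loads(line) for line in f if line.strip()]
            except FileNotFoundError:
                return []  # same "empty result" contract as a missing file


def demo_concurrent_writes(n_threads=4, per_thread=250):
    """Write from several threads, then flush and read everything back."""
    path = os.path.join(tempfile.mkdtemp(), "request_records_demo.jsonl")
    records = BufferedJsonlRecords(path)

    def worker(tid):
        for i in range(per_thread):
            records.insert({"thread": tid, "seq": i})

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    records.sink()
    return records.read_all()
```

Calling `sink()` before reading matters here: without it, up to `flush_threshold - 1` records could still sit in memory when the file is read.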

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-12-30 14:13:21 +08:00
zhuwenxing 2d7574b5a3
test: refactor connection method to prioritize uri/token and add query limit (#45901)
- Refactor connection logic to prioritize uri and token parameters over
host/port/user/password for a more modern connection approach
- Add explicit limit parameter (limit=5) to search and query operations
in chaos checkers to avoid returning unlimited results
- Migrate test_all_collections_after_chaos.py from Collection wrapper to
MilvusClient API style
- Update pytest fixtures in chaos test files to support uri/token params
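
The connection priority described above might look like the sketch below. The function name `resolve_connection_params`, the `http://host:port` URI format, and the `user:password` token format are assumptions for illustration; the actual fixture logic in the PR may differ.

```python
def resolve_connection_params(uri=None, token=None, host=None, port=None,
                              user=None, password=None):
    """Prefer uri/token; fall back to host/port/user/password (hypothetical helper)."""
    if uri:
        params = {"uri": uri}
        if token:
            params["token"] = token
        return params
    if host and port:
        # Assumed fallback: build a URI and token from the legacy parameters.
        params = {"uri": f"http://{host}:{port}"}
        if user and password:
            params["token"] = f"{user}:{password}"
        return params
    raise ValueError("either uri or host/port must be provided")
```

The returned dict is meant to be unpacked into a client constructor, for example `MilvusClient(**params)`, which keeps the priority rule in one place instead of repeating it in each fixture.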

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-27 19:25:07 +08:00
zhuwenxing e0df44481d
test: refactor checker to using milvus client (#45524)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-11-20 11:59:08 +08:00
zhuwenxing 1e130683be
test: add geometry datatype in checker (#44794)
/kind improvement

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-10-24 11:28:04 +08:00
Feilong Hou f9afde23d1
test: Add New Test Cases for Partial Update (#44483)
Issue: #43872 
<fix>: <fix after accidental force pull>

 Changes to be committed:
	modified:   chaos/checker.py
	modified:   chaos/testcases/test_single_request_operation.py
	modified:   common/common_func.py
	modified:   common/common_type.py
	modified:   milvus_client/test_milvus_client_insert.py

This includes e2e cases and a chaos checker.
All the cases are currently skipped because the partial update feature is
not ready yet.

1. test_milvus_client_partial_update_insert_delete_upsert_with_flush():
insert -> delete -> flush -> query -> upsert -> flush -> query
2. test_milvus_client_partial_update_insert_upsert_delete_upsert_flush():
insert -> upsert -> delete -> upsert -> flush -> query
3. test_milvus_client_partial_update_insert_upsert_flush_delete_upsert_flush():
insert -> upsert -> flush -> delete -> upsert -> flush -> query

Also update requirements.txt to use the latest pymilvus version.

---------

Signed-off-by: Eric Hou <eric.hou@zilliz.com>
Co-authored-by: Eric Hou <eric.hou@zilliz.com>
2025-09-23 09:06:12 +08:00
zhuwenxing b619684ca2
test: add collection rename checker in chaos test (#43412)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-07-21 11:34:53 +08:00
zhuwenxing 21008c1bd2
test: add rolling upgrade test scripts (#43109)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-07-17 14:26:52 +08:00
zhuwenxing b043ff14c2
test: add `add_collection_field` feature in checker for chaos test (#42085)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-06-20 15:24:39 +08:00
zhuwenxing 69be718105
test: [skip e2e]remove tls connection (#42799)
/kind improvement

---------

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-06-18 10:34:43 +08:00
Emmanuel Ferdman c5adc09127
test: fix: resolve Python Logger warnings (#41827)
# PR Summary
This PR resolves the deprecation warnings raised by Python's `logging` module:
```python
DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
```
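
A minimal sketch of the replacement call, assuming the standard library `logging` module is the logger in question. The logger name and message are made up for the example.

```python
import io
import logging


def build_logger(stream):
    """Create an isolated logger that writes 'LEVEL:message' lines to stream."""
    logger = logging.getLogger("chaos_demo")  # hypothetical logger name
    logger.setLevel(logging.WARNING)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
log = build_logger(stream)

# Use warning() instead of the deprecated warn() alias.
log.warning("index not ready, retrying")
```

Both spellings log at the same `WARNING` level, so the change only removes the `DeprecationWarning` and does not alter log output.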

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-06-03 14:47:51 +08:00
zhuwenxing 6a12304d1e
test: add alter collection checker (#41281)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-04-15 11:02:34 +08:00
zhuwenxing 8b9bb5dd68
test: add JsonQueryChecker in test (#41096)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-04-10 10:28:29 +08:00
yihao.dai b2a8694686
enhance: Merge IndexNode and DataNode (#40272)
Merge DataNode and IndexNode into DataNode.

issue: https://github.com/milvus-io/milvus/issues/39115

---------

Signed-off-by: bigsheeper <yihao.dai@zilliz.com>
2025-03-13 14:26:11 +08:00
zhuwenxing 45256c41d6
test: fix text match (#40295)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-03-03 21:26:00 +08:00
zhuwenxing 828ecacadc
test: fix checker function name, release mistake and add nullable (#40135)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-02-26 10:27:56 +08:00
zhuwenxing 9d37f0f9ee
test: add fts and text match verification in second test (#39970)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-02-19 14:12:58 +08:00
zhuwenxing 31fe8cc1c3
test: add phrase match in chaos test (#39765)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2025-02-11 13:56:45 +08:00
zhuwenxing 6e37372619
test: update checker (#37275)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-31 09:50:20 +08:00
zhuwenxing cdee149191
test: fix testcases for verification after chaos (#37153)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-28 10:33:29 +08:00
zhuwenxing ac2858d418
test: add full text search checker in test (#37122)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-25 14:09:29 +08:00
zhuwenxing 80d48f1e53
test: add text match checker in test (#37052)
/kind improvement

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-10-23 09:49:27 +08:00
zhuwenxing 9a269f1489
test: add import checker to chaos test (#32908)
add import checker to chaos test

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-05-10 11:43:30 +08:00
zhuwenxing d2c286c536
test: skip building index when field already has index (#30820)
skip building index when field already has index

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-02-28 14:08:58 +08:00
zhuwenxing f0bff1e1a8
test: add hybrid search in checker for test (#30341)
add hybrid search in checker for test

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-02-23 10:46:52 +08:00
zhuwenxing e6daff49a6
test: fix query result verification (#30351)
fix query result verification:
changed the query expression and adopted a more lenient validation
method, because frequent delete operations mean that specific IDs
cannot be guaranteed to be retrievable

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-01-31 14:11:04 +08:00
zhuwenxing aab7cc9ecd
test: add freshness checker (#30280)
add freshness checker

insert/upsert --> query: measure the time until the data can be queried

delete --> query: measure the time until the data can no longer be queried
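
One way to measure this kind of freshness is to poll until a visibility condition holds. The sketch below is an assumption about the approach, not the checker's actual code: `measure_freshness` and the `is_visible` callback are hypothetical, and a real checker would wrap a query call in that callback.

```python
import time


def measure_freshness(is_visible, timeout=5.0, interval=0.01):
    """Poll is_visible() and return the seconds elapsed until it is True.

    Returns None if the condition does not hold within the timeout.
    For the delete case, pass a callback that is True once the row is gone.
    """
    start = time.monotonic()
    while True:
        if is_visible():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)


def make_visible_after(n_polls):
    """Build a fake visibility check that succeeds on the n-th poll."""
    state = {"calls": 0}

    def check():
        state["calls"] += 1
        return state["calls"] >= n_polls

    return check


latency = measure_freshness(make_visible_after(3))
timed_out = measure_freshness(lambda: False, timeout=0.05)
```

Using `time.monotonic()` rather than wall-clock time keeps the measurement unaffected by system clock adjustments during a chaos run.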

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-01-29 12:09:01 +08:00
zhuwenxing 72c81c8ae4
test: add multi-tenancy checker (#29635)
add multi-tenancy checker

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2024-01-03 15:20:49 +08:00
zhuwenxing 6efb7afd3f
test: add more request type checker for test (#29210)
add more request type checker for test
* partition 
* database
* upsert

Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-12-14 19:38:45 +08:00
zhuwenxing 5b405ca28a
[skip e2e]Remove compact in concurrent test (#27666)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-10-12 15:13:34 +08:00
zhuwenxing 567fb23126
[test]Update health_checkers assertion for standby test (#26986)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-09-11 17:55:23 +08:00
zhuwenxing 0f4475e5e3
[test]Refine health checker in test (#26920)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-09-08 10:09:16 +08:00
zhuwenxing b3de99e336
[test]Add method to analyze chaos test result (#26724)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-09-01 10:31:01 +08:00
zhuwenxing 64a9762cf3
[test]Add all succ check after rolling update (#26638)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-08-29 14:30:27 +08:00
zhuwenxing cb34edde88
[test]Add database for chaos test (#26636)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-08-29 14:28:40 +08:00
zhuwenxing e7d5196f68
[test]Wait index building complete before rolling update (#26377)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-08-17 16:38:18 +08:00
zhuwenxing 037a58a60d
[test]Enable standby during rolling update and refine bulk insert (#26039)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-08-01 09:37:04 +08:00
zhuwenxing ee5da73fae
[test]Add bulk insert for test and refactoring the checker function (#25997)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-07-31 12:45:03 +08:00
zhuwenxing 7603fd2bd4
[test]Use pytest assume to assert (#25890)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-07-25 17:07:02 +08:00
zhuwenxing b70da0859a
[test]Add RPO and RTO metric for rolling update test (#25612)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-07-14 19:22:36 +08:00
zhuwenxing f56c65efa8
[test]Update the insert number in verification (#25047)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-06-21 13:52:42 +08:00
zhuwenxing 2bcd1bb0d8
[test]Add standby test and adapt to different schemas (#24781)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-06-09 15:20:36 +08:00
zhuwenxing bede8f6171
[test]Skip new feature in upgrade deploy test cases (#24748)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-06-08 18:52:36 +08:00
zhuwenxing f73f4d5ff1
[test]Update the method for check insert number (#24289)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-05-23 13:25:31 +08:00
zhuwenxing 4a03fb8bb5
[test]Skip compact check (#24137)
Signed-off-by: zhuwenxing <wenxing.zhu@zilliz.com>
2023-05-16 17:57:31 +08:00