Merge branch '0.6.0' into badge

pull/514/head
quicksilver 2019-11-25 11:57:35 +08:00 committed by GitHub
commit 2dbc5f4e10
83 changed files with 2243 additions and 606 deletions

View File

@ -18,6 +18,9 @@ Please mark all changes in the change log and use the ticket from JIRA.
- \#412 - Message returned is confusing when a partition is created with a null partition name
- \#416 - Dropping the same partition repeatedly reports success
- \#440 - Query API in customization still uses old version
- \#440 - Server cannot start up with gpu_resource_config.enable=false in GPU version
- \#458 - Index data is not compatible between 0.5 and 0.6
- \#486 - GPU not used during index building
## Feature
- \#12 - Pure CPU version for Milvus
@ -26,6 +29,8 @@ Please mark all changes in the change log and use the ticket from JIRA.
- \#226 - Experimental shards middleware for Milvus
- \#227 - Support new index types SPTAG-KDT and SPTAG-BKT
- \#346 - Support building index with multiple GPUs
- \#488 - Add log in scheduler/optimizer
- \#502 - C++ SDK support IVFPQ and SPTAG
## Improvement
- \#255 - Add ivfsq8 test report detailed version
@ -41,6 +46,8 @@ Please mark all changes in the change log and use the ticket from JIRA.
- \#404 - Add virtual method Init() in Pass abstract class
- \#409 - Add a Fallback pass in optimizer
- \#433 - C++ SDK query result is not easy to use
- \#449 - Add ShowPartitions example for C++ SDK
- \#470 - Small raw files should not be built into index
## Task

View File

@ -5,12 +5,11 @@
![LICENSE](https://img.shields.io/badge/license-Apache--2.0-brightgreen)
![Language](https://img.shields.io/badge/language-C%2B%2B-blue)
[![codebeat badge](https://codebeat.co/badges/e030a4f6-b126-4475-a938-4723d54ec3a7?style=plastic)](https://codebeat.co/projects/github-com-jinhai-cn-milvus-master)
![Release](https://img.shields.io/badge/release-v0.5.1-yellowgreen)
![Release](https://img.shields.io/badge/release-v0.5.3-yellowgreen)
![Release_date](https://img.shields.io/badge/release%20date-November-yellowgreen)
[![codecov](https://codecov.io/gh/milvus-io/milvus/branch/master/graph/badge.svg)](https://codecov.io/gh/milvus-io/milvus)
[中文版](README_CN.md)
[中文版](README_CN.md) | [日本語版](README_JP.md)
## What is Milvus
@ -20,7 +19,7 @@ For more detailed introduction of Milvus and its architecture, see [Milvus overv
Milvus provides stable [Python](https://github.com/milvus-io/pymilvus), [Java](https://github.com/milvus-io/milvus-sdk-java) and [C++](https://github.com/milvus-io/milvus/tree/master/core/src/sdk) APIs.
Keep up-to-date with newest releases and latest updates by reading Milvus [release notes](https://www.milvus.io/docs/en/release/v0.5.0/).
Keep up-to-date with newest releases and latest updates by reading Milvus [release notes](https://www.milvus.io/docs/en/release/v0.5.3/).
## Get started
@ -54,11 +53,13 @@ We use [GitHub issues](https://github.com/milvus-io/milvus/issues) to track issu
To connect with other users and contributors, welcome to join our [Slack channel](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk).
## Thanks
## Contributors
We greatly appreciate the help of the following people.
Below is a list of Milvus contributors. We greatly appreciate your contributions!
- [akihoni](https://github.com/akihoni) found a broken link and a small typo in the README file.
- [akihoni](https://github.com/akihoni) provided the CN version of README, and found a broken link in the doc.
- [goodhamgupta](https://github.com/goodhamgupta) fixed a filename typo in the bootcamp doc.
- [erdustiggen](https://github.com/erdustiggen) changed error messages from std::cout to LOG, and fixed a clang format issue as well as some grammatical errors.
## Resources
@ -66,6 +67,8 @@ We greatly appreciate the help of the following people.
- [Milvus bootcamp](https://github.com/milvus-io/bootcamp)
- [Milvus test reports](https://github.com/milvus-io/milvus/tree/master/docs)
- [Milvus Medium](https://medium.com/@milvusio)
- [Milvus CSDN](https://zilliz.blog.csdn.net/)
@ -76,6 +79,4 @@ We greatly appreciate the help of the following people.
## License
[Apache License 2.0](LICENSE)

View File

@ -1,157 +1,36 @@
![Milvuslogo](https://raw.githubusercontent.com/milvus-io/docs/master/assets/milvus_logo.png)
[![Slack](https://img.shields.io/badge/Join-Slack-orange)](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk)
![LICENSE](https://img.shields.io/badge/license-Apache--2.0-brightgreen)
![Language](https://img.shields.io/badge/language-C%2B%2B-blue)
[![codebeat badge](https://codebeat.co/badges/e030a4f6-b126-4475-a938-4723d54ec3a7?style=plastic)](https://codebeat.co/projects/github-com-jinhai-cn-milvus-master)
![Release](https://img.shields.io/badge/release-v0.5.0-orange)
![Release](https://img.shields.io/badge/release-v0.5.3-yellowgreen)
![Release_date](https://img.shields.io/badge/release_date-October-yellowgreen)
[![codecov](https://codecov.io/gh/milvus-io/milvus/branch/master/graph/badge.svg)](https://codecov.io/gh/milvus-io/milvus)
- [Slack channel](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk)
- [Twitter](https://twitter.com/milvusio)
- [Facebook](https://www.facebook.com/io.milvus.5)
- [Blog](https://www.milvus.io/blog/)
- [CSDN](https://zilliz.blog.csdn.net/)
- [Chinese official website](https://www.milvus.io/zh-CN/)
# Welcome to Milvus
## What is Milvus
Milvus is an open-source similarity search engine for massive-scale feature vectors. Designed on a heterogeneous many-core computing framework, it offers lower cost and better performance, searching billions of vectors with millisecond response on limited computing resources.
Milvus provides stable [Python](https://github.com/milvus-io/pymilvus), [Java](https://github.com/milvus-io/milvus-sdk-java) and [C++](https://github.com/milvus-io/milvus/tree/master/core/src/sdk) APIs.
For a detailed introduction to Milvus and its architecture, see [Milvus overview](https://www.milvus.io/docs/zh-CN/aboutmilvus/overview/).
Read the [release notes](https://milvus.io/docs/zh-CN/release/v0.5.0/) to get the latest Milvus release.
- Heterogeneous many-core
Milvus is designed on a heterogeneous many-core computing framework, offering lower cost and better performance.
- Diverse indexing
Milvus supports multiple indexing methods, including quantization-based, tree-based, and graph-based algorithms.
- Intelligent resource management
Milvus automatically tunes query computation and index building according to the actual data scale and available resources.
- Horizontal scalability
Milvus supports online/offline scaling; compute and storage nodes can be elastically scaled with simple commands.
- High availability
Milvus is integrated with Kubernetes, effectively avoiding single points of failure.
- Ease of use
Milvus is easy to install and use, letting you focus on feature vectors.
- Visualized monitoring
You can track system performance in real time with Prometheus-based graphical monitoring.
## Overall architecture
![Milvus_arch](https://github.com/milvus-io/docs/blob/master/assets/milvus_arch.png)
Read the [release notes](https://milvus.io/docs/zh-CN/release/v0.5.3/) for the features and updates in the latest release.
## Get started
### Hardware requirements
See the [Milvus install guide](https://www.milvus.io/docs/zh-CN/userguide/install_milvus/) to install Milvus with Docker. To build from source, see [Build from source](install.md).
| Component | Recommended configuration |
| --------- | ------------------------------------------------ |
| CPU | Intel CPU Haswell or higher |
| GPU | NVIDIA Pascal series or higher |
| Memory | 8 GB or more (depends on the vector data volume) |
| Storage | SATA 3.0 SSD or higher |
### Use Docker
You can easily install Milvus with Docker. See the [Milvus install guide](https://milvus.io/docs/zh-CN/userguide/install_milvus/) for details.
### Build from source
#### Software requirements
- Ubuntu 18.04 or higher
- CMake 3.14 or higher
- CUDA 10.0 or higher
- NVIDIA driver 418 or higher
#### Compilation
##### Step 1 Install dependencies
```shell
$ cd [Milvus sourcecode path]/core
$ ./ubuntu_build_deps.sh
```
##### Step 2 Build
```shell
$ cd [Milvus sourcecode path]/core
$ ./build.sh -t Debug
or
$ ./build.sh -t Release
```
After a successful build, all required Milvus components are installed under `[Milvus root path]/core/milvus`.
##### Launch the Milvus server
```shell
$ cd [Milvus root path]/core/milvus
```
Add the `lib/` directory to `LD_LIBRARY_PATH`:
```shell
$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/milvus/lib
```
Start the Milvus server:
```shell
$ cd scripts
$ ./start_server.sh
```
To stop the Milvus server, run:
```shell
$ ./stop_server.sh
```
To modify the Milvus configuration files `conf/server_config.yaml` and `conf/log_config.conf`, see [Milvus configuration](https://milvus.io/docs/zh-CN/reference/milvus_config/).
To change Milvus settings, see [Milvus configuration](https://www.milvus.io/docs/zh-CN/reference/milvus_config/).
### Try your first Milvus program
#### Run Python example code
You can try running the Milvus example code in [Python](https://www.milvus.io/docs/en/userguide/example_code/) or [Java](https://github.com/milvus-io/milvus-sdk-java/tree/master/examples).
Make sure your system runs [Python 3.5](https://www.python.org/downloads/) or higher.
Install the Milvus Python SDK.
```shell
# Install Milvus Python SDK
$ pip install pymilvus==0.2.3
```
Create an `example.py` file and add the [Python example code](https://github.com/milvus-io/pymilvus/blob/master/examples/advanced_example.py) to it.
Run the example code:
```shell
# Run Milvus Python example
$ python3 example.py
```
#### Run C++ example code
To run the C++ example code, use the following commands:
```shell
# Run Milvus C++ example
@ -159,41 +38,44 @@ $ python3 example.py
$ ./sdk_simple
```
#### Run Java example code
## Roadmap
Make sure your system runs Java 8 or higher.
Get the Java example code [here](https://github.com/milvus-io/milvus-sdk-java/tree/master/examples).
Read our [roadmap](https://milvus.io/docs/zh-CN/roadmap/) to learn about upcoming features.
## Contribution guidelines
Contributions are warmly welcomed. See the [contribution guidelines](https://github.com/milvus-io/milvus/blob/master/CONTRIBUTING.md) for details about the contribution workflow. This project follows the Milvus [code of conduct](https://github.com/milvus-io/milvus/blob/master/CODE_OF_CONDUCT.md); if you wish to participate, please abide by it.
We use [GitHub issues](https://github.com/milvus-io/milvus/issues/new/choose) to track issues and patches. To raise questions or start discussions, please join our community.
We use [GitHub issues](https://github.com/milvus-io/milvus/issues) to track issues and patches. To raise questions or start discussions, please join our community.
## Join the Milvus community
To connect with other users and contributors, welcome to join our [Slack channel](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk).
## Milvus roadmap
## Contributors
Read our [roadmap](https://milvus.io/docs/zh-CN/roadmap/) to learn about upcoming features.
Below is a list of Milvus contributors; we deeply appreciate their contributions:
- [akihoni](https://github.com/akihoni) provided the Chinese version of the README and found a broken link in it.
- [goodhamgupta](https://github.com/goodhamgupta) found and fixed a filename typo in the bootcamp documentation.
- [erdustiggen](https://github.com/erdustiggen) changed error messages from std::cout to LOG, and fixed a Clang format issue and some grammatical errors.
## Related links
[Milvus official website](https://www.milvus.io/)
- [Milvus.io](https://www.milvus.io)
[Milvus documentation](https://www.milvus.io/docs/en/userguide/install_milvus/)
- [Milvus bootcamp](https://github.com/milvus-io/bootcamp)
[Milvus bootcamp](https://github.com/milvus-io/bootcamp)
- [Milvus test reports](https://github.com/milvus-io/milvus/tree/master/docs)
[Milvus blog](https://www.milvus.io/blog/)
- [Milvus Medium](https://medium.com/@milvusio)
[Milvus CSDN](https://zilliz.blog.csdn.net/)
- [Milvus CSDN](https://zilliz.blog.csdn.net/)
[Milvus roadmap](https://milvus.io/docs/en/roadmap/)
- [Milvus Twitter](https://twitter.com/milvusio)
- [Milvus Facebook](https://www.facebook.com/io.milvus.5)
## License
[Apache License 2.0](https://github.com/milvus-io/milvus/blob/master/LICENSE)

README_JP.md (new file, 75 lines)
View File

@ -0,0 +1,75 @@
![Milvuslogo](https://github.com/milvus-io/docs/blob/master/assets/milvus_logo.png)
[![Slack](https://img.shields.io/badge/Join-Slack-orange)](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk)
![LICENSE](https://img.shields.io/badge/license-Apache--2.0-brightgreen)
![Language](https://img.shields.io/badge/language-C%2B%2B-blue)
[![codebeat badge](https://codebeat.co/badges/e030a4f6-b126-4475-a938-4723d54ec3a7?style=plastic)](https://codebeat.co/projects/github-com-jinhai-cn-milvus-master)
![Release](https://img.shields.io/badge/release-v0.5.3-yellowgreen)
![Release_date](https://img.shields.io/badge/release%20date-November-yellowgreen)
# Welcome to Milvus
## Overview
Milvus is the world's fastest similarity search engine for feature vectors. Designed on a heterogeneous computing architecture for maximum efficiency, it can search among billions of vectors in just milliseconds with minimal computing resources.
Milvus provides stable [Python](https://github.com/milvus-io/pymilvus), [Java](https://github.com/milvus-io/milvus-sdk-java) and [C++](https://github.com/milvus-io/milvus/tree/master/core/src/sdk) APIs.
Read the Milvus [release notes](https://milvus.io/docs/en/release/v0.5.3/) for the latest version and updates.
## Get started
Installing Milvus with Docker is easy. See the [Milvus install guide](https://milvus.io/docs/en/userguide/install_milvus/) for details. To build Milvus from source, see [Build from source](install.md).
To configure Milvus, read [Milvus configuration](https://github.com/milvus-io/docs/blob/master/reference/milvus_config.md).
### Try your first Milvus program
Try running a Milvus program with the [Python](https://www.milvus.io/docs/en/userguide/example_code/) or [Java](https://github.com/milvus-io/milvus-sdk-java/tree/master/examples) example code.
To run the C++ example code, use the following commands:
```shell
# Run Milvus C++ example
$ cd [Milvus root path]/core/milvus/bin
$ ./sdk_simple
```
## Milvus roadmap
Read the [roadmap](https://milvus.io/docs/en/roadmap/) to learn about features planned for future releases.
## Contribution guidelines
Contributions to this project are sincerely appreciated. To contribute to Milvus, read the [contribution guidelines](CONTRIBUTING.md). This project follows the Milvus [code of conduct](CODE_OF_CONDUCT.md); if you wish to participate, please abide by it.
Use [GitHub issues](https://github.com/milvus-io/milvus/issues) to report issues and bugs. For general questions, join the Milvus community.
## Join the Milvus community
To connect with other contributors, join the Milvus [Slack channel](https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk).
## Resources
- [Milvus.io](https://www.milvus.io)
- [Milvus bootcamp](https://github.com/milvus-io/bootcamp)
- [Milvus test reports](https://github.com/milvus-io/milvus/tree/master/docs)
- [Milvus Medium](https://medium.com/@milvusio)
- [Milvus CSDN](https://zilliz.blog.csdn.net/)
- [Milvus Twitter](https://twitter.com/milvusio)
- [Milvus Facebook](https://www.facebook.com/io.milvus.5)
## License
[Apache License 2.0](LICENSE)

View File

@ -25,6 +25,7 @@
namespace milvus {
namespace cache {
#ifdef MILVUS_GPU_VERSION
std::mutex GpuCacheMgr::mutex_;
std::unordered_map<uint64_t, GpuCacheMgrPtr> GpuCacheMgr::instance_;
@ -76,6 +77,7 @@ GpuCacheMgr::GetIndex(const std::string& key) {
DataObjPtr obj = GetItem(key);
return obj;
}
#endif
} // namespace cache
} // namespace milvus

View File

@ -25,6 +25,7 @@
namespace milvus {
namespace cache {
#ifdef MILVUS_GPU_VERSION
class GpuCacheMgr;
using GpuCacheMgrPtr = std::shared_ptr<GpuCacheMgr>;
@ -42,6 +43,7 @@ class GpuCacheMgr : public CacheMgr<DataObjPtr> {
static std::mutex mutex_;
static std::unordered_map<uint64_t, GpuCacheMgrPtr> instance_;
};
#endif
} // namespace cache
} // namespace milvus
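
Related to fix \#440 above: the `#ifdef MILVUS_GPU_VERSION` guards compile the GPU cache manager out of CPU builds entirely. A minimal sketch of the pattern (the types and names below are illustrative stand-ins, not the actual Milvus sources):

```cpp
#include <string>

#ifdef MILVUS_GPU_VERSION
// Only declared and defined in GPU builds.
struct GpuCache {
    void Insert(const std::string& key) { /* device-side caching would go here */ }
};
#endif

void CacheIndex(const std::string& key) {
#ifdef MILVUS_GPU_VERSION
    static GpuCache cache;  // exists only when compiled with MILVUS_GPU_VERSION
    cache.Insert(key);
#endif
    // CPU builds reference no GPU symbols at all, so the server can start
    // even when gpu_resource_config.enable is false.
}
```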

View File

@ -838,6 +838,25 @@ DBImpl::BackgroundBuildIndex() {
// ENGINE_LOG_TRACE << "Background build index thread exit";
}
Status
DBImpl::GetFilesToBuildIndex(const std::string& table_id, const std::vector<int>& file_types,
meta::TableFilesSchema& files) {
files.clear();
auto status = meta_ptr_->FilesByType(table_id, file_types, files);
// only build index for files whose row count is greater than a certain threshold
for (auto it = files.begin(); it != files.end();) {
if ((*it).file_type_ == static_cast<int>(meta::TableFileSchema::RAW) &&
(*it).row_count_ < meta::BUILD_INDEX_THRESHOLD) {
it = files.erase(it);
} else {
it++;
}
}
return Status::OK();
}
Status
DBImpl::GetFilesToSearch(const std::string& table_id, const std::vector<size_t>& file_ids, const meta::DatesT& dates,
meta::TableFilesSchema& files) {
@ -946,18 +965,18 @@ DBImpl::BuildTableIndexRecursively(const std::string& table_id, const TableIndex
}
// get files to build index
std::vector<std::string> file_ids;
auto status = meta_ptr_->FilesByType(table_id, file_types, file_ids);
meta::TableFilesSchema table_files;
auto status = GetFilesToBuildIndex(table_id, file_types, table_files);
int times = 1;
while (!file_ids.empty()) {
while (!table_files.empty()) {
ENGINE_LOG_DEBUG << "Non index files detected! Will build index " << times;
if (index.engine_type_ != (int)EngineType::FAISS_IDMAP) {
status = meta_ptr_->UpdateTableFilesToIndex(table_id);
}
std::this_thread::sleep_for(std::chrono::milliseconds(std::min(10 * 1000, times * 100)));
status = meta_ptr_->FilesByType(table_id, file_types, file_ids);
GetFilesToBuildIndex(table_id, file_types, table_files);
times++;
}

View File

@ -152,6 +152,10 @@ class DBImpl : public DB {
Status
MemSerialize();
Status
GetFilesToBuildIndex(const std::string& table_id, const std::vector<int>& file_types,
meta::TableFilesSchema& files);
Status
GetFilesToSearch(const std::string& table_id, const std::vector<size_t>& file_ids, const meta::DatesT& dates,
meta::TableFilesSchema& files);

View File

@ -151,6 +151,7 @@ ExecutionEngineImpl::HybridLoad() const {
return;
}
#ifdef MILVUS_GPU_VERSION
const std::string key = location_ + ".quantizer";
server::Config& config = server::Config::GetInstance();
@ -205,6 +206,7 @@ ExecutionEngineImpl::HybridLoad() const {
auto cache_quantizer = std::make_shared<CachedQuantizer>(quantizer);
cache::GpuCacheMgr::GetInstance(best_device_id)->InsertItem(key, cache_quantizer);
}
#endif
}
void
@ -342,6 +344,7 @@ ExecutionEngineImpl::CopyToGpu(uint64_t device_id, bool hybrid) {
}
#endif
#ifdef MILVUS_GPU_VERSION
auto index = std::static_pointer_cast<VecIndex>(cache::GpuCacheMgr::GetInstance(device_id)->GetIndex(location_));
bool already_in_cache = (index != nullptr);
if (already_in_cache) {
@ -364,16 +367,19 @@ ExecutionEngineImpl::CopyToGpu(uint64_t device_id, bool hybrid) {
if (!already_in_cache) {
GpuCache(device_id);
}
#endif
return Status::OK();
}
Status
ExecutionEngineImpl::CopyToIndexFileToGpu(uint64_t device_id) {
#ifdef MILVUS_GPU_VERSION
gpu_num_ = device_id;
auto to_index_data = std::make_shared<ToIndexData>(PhysicalSize());
cache::DataObjPtr obj = std::static_pointer_cast<cache::DataObj>(to_index_data);
milvus::cache::GpuCacheMgr::GetInstance(device_id)->InsertItem(location_, obj);
#endif
return Status::OK();
}
@ -584,15 +590,17 @@ ExecutionEngineImpl::Cache() {
Status
ExecutionEngineImpl::GpuCache(uint64_t gpu_id) {
#ifdef MILVUS_GPU_VERSION
cache::DataObjPtr obj = std::static_pointer_cast<cache::DataObj>(index_);
milvus::cache::GpuCacheMgr::GetInstance(gpu_id)->InsertItem(location_, obj);
#endif
return Status::OK();
}
// TODO(linxj): remove.
Status
ExecutionEngineImpl::Init() {
#ifdef MILVUS_GPU_VERSION
server::Config& config = server::Config::GetInstance();
std::vector<int64_t> gpu_ids;
Status s = config.GetGpuResourceConfigBuildIndexResources(gpu_ids);
@ -604,6 +612,9 @@ ExecutionEngineImpl::Init() {
std::string msg = "Invalid gpu_num";
return Status(SERVER_INVALID_ARGUMENT, msg);
#else
return Status::OK();
#endif
}
} // namespace engine

View File

@ -109,8 +109,7 @@ class Meta {
FilesToIndex(TableFilesSchema&) = 0;
virtual Status
FilesByType(const std::string& table_id, const std::vector<int>& file_types,
std::vector<std::string>& file_ids) = 0;
FilesByType(const std::string& table_id, const std::vector<int>& file_types, TableFilesSchema& table_files) = 0;
virtual Status
Size(uint64_t& result) = 0;

View File

@ -32,6 +32,13 @@ const size_t H_SEC = 60 * M_SEC;
const size_t D_SEC = 24 * H_SEC;
const size_t W_SEC = 7 * D_SEC;
// This value is used to skip small raw files when building index.
// The reasons are:
// 1. Brute-force search on a small raw file can perform better than search on a small index file.
// 2. Small raw files can be merged into larger files, reducing the number of fragmented files.
// The value was chosen based on tests with small raw/index files.
const size_t BUILD_INDEX_THRESHOLD = 5000;
} // namespace meta
} // namespace engine
} // namespace milvus
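
A standalone sketch of the skip-small-raw-files rule that `DBImpl::GetFilesToBuildIndex` applies with this constant (the `FileInfo` struct and `RAW` value below are illustrative, not the real schema types):

```cpp
#include <cstddef>
#include <vector>

constexpr int RAW = 0;  // illustrative stand-in for TableFileSchema::RAW
constexpr std::size_t BUILD_INDEX_THRESHOLD = 5000;

struct FileInfo {
    int file_type;
    std::size_t row_count;
};

// Drop RAW files whose row count is below the threshold; brute-force
// search serves them until they are merged into larger files.
void FilterSmallRawFiles(std::vector<FileInfo>& files) {
    for (auto it = files.begin(); it != files.end();) {
        if (it->file_type == RAW && it->row_count < BUILD_INDEX_THRESHOLD) {
            it = files.erase(it);  // erase returns the next valid iterator
        } else {
            ++it;
        }
    }
}
```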

View File

@ -959,6 +959,7 @@ MySQLMetaImpl::UpdateTableFilesToIndex(const std::string& table_id) {
updateTableFilesToIndexQuery << "UPDATE " << META_TABLEFILES
<< " SET file_type = " << std::to_string(TableFileSchema::TO_INDEX)
<< " WHERE table_id = " << mysqlpp::quote << table_id
<< " AND row_count >= " << std::to_string(meta::BUILD_INDEX_THRESHOLD)
<< " AND file_type = " << std::to_string(TableFileSchema::RAW) << ";";
ENGINE_LOG_DEBUG << "MySQLMetaImpl::UpdateTableFilesToIndex: " << updateTableFilesToIndexQuery.str();
@ -1527,13 +1528,13 @@ MySQLMetaImpl::FilesToIndex(TableFilesSchema& files) {
Status
MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>& file_types,
std::vector<std::string>& file_ids) {
TableFilesSchema& table_files) {
if (file_types.empty()) {
return Status(DB_ERROR, "file types array is empty");
}
try {
file_ids.clear();
table_files.clear();
mysqlpp::StoreQueryResult res;
{
@ -1553,9 +1554,10 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
mysqlpp::Query hasNonIndexFilesQuery = connectionPtr->query();
// since table_id is a unique column we just need to check whether it exists or not
hasNonIndexFilesQuery << "SELECT file_id, file_type"
<< " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id
<< " AND file_type in (" << types << ");";
hasNonIndexFilesQuery
<< "SELECT id, engine_type, file_id, file_type, file_size, row_count, date, created_on"
<< " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id
<< " AND file_type in (" << types << ");";
ENGINE_LOG_DEBUG << "MySQLMetaImpl::FilesByType: " << hasNonIndexFilesQuery.str();
@ -1566,9 +1568,18 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
int raw_count = 0, new_count = 0, new_merge_count = 0, new_index_count = 0;
int to_index_count = 0, index_count = 0, backup_count = 0;
for (auto& resRow : res) {
std::string file_id;
resRow["file_id"].to_string(file_id);
file_ids.push_back(file_id);
TableFileSchema file_schema;
file_schema.id_ = resRow["id"];
file_schema.table_id_ = table_id;
file_schema.engine_type_ = resRow["engine_type"];
resRow["file_id"].to_string(file_schema.file_id_);
file_schema.file_type_ = resRow["file_type"];
file_schema.file_size_ = resRow["file_size"];
file_schema.row_count_ = resRow["row_count"];
file_schema.date_ = resRow["date"];
file_schema.created_on_ = resRow["created_on"];
table_files.emplace_back(file_schema);
int32_t file_type = resRow["file_type"];
switch (file_type) {

View File

@ -108,7 +108,7 @@ class MySQLMetaImpl : public Meta {
Status
FilesByType(const std::string& table_id, const std::vector<int>& file_types,
std::vector<std::string>& file_ids) override;
TableFilesSchema& table_files) override;
Status
Archive() override;

View File

@ -58,7 +58,7 @@ HandleException(const std::string& desc, const char* what = nullptr) {
} // namespace
inline auto
StoragePrototype(const std::string &path) {
StoragePrototype(const std::string& path) {
return make_storage(path,
make_table(META_TABLES,
make_column("id", &TableSchema::id_, primary_key()),
@ -160,7 +160,7 @@ SqliteMetaImpl::Initialize() {
}
Status
SqliteMetaImpl::CreateTable(TableSchema &table_schema) {
SqliteMetaImpl::CreateTable(TableSchema& table_schema) {
try {
server::MetricCollector metric;
@ -188,20 +188,20 @@ SqliteMetaImpl::CreateTable(TableSchema &table_schema) {
try {
auto id = ConnectorPtr->insert(table_schema);
table_schema.id_ = id;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when create table", e.what());
}
ENGINE_LOG_DEBUG << "Successfully create table: " << table_schema.table_id_;
return utils::CreateTablePath(options_, table_schema.table_id_);
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when create table", e.what());
}
}
Status
SqliteMetaImpl::DescribeTable(TableSchema &table_schema) {
SqliteMetaImpl::DescribeTable(TableSchema& table_schema) {
try {
server::MetricCollector metric;
@ -218,7 +218,7 @@ SqliteMetaImpl::DescribeTable(TableSchema &table_schema) {
&TableSchema::partition_tag_,
&TableSchema::version_),
where(c(&TableSchema::table_id_) == table_schema.table_id_
and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
if (groups.size() == 1) {
table_schema.id_ = std::get<0>(groups[0]);
@ -236,7 +236,7 @@ SqliteMetaImpl::DescribeTable(TableSchema &table_schema) {
} else {
return Status(DB_NOT_FOUND, "Table " + table_schema.table_id_ + " not found");
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when describe table", e.what());
}
@ -244,20 +244,20 @@ SqliteMetaImpl::DescribeTable(TableSchema &table_schema) {
}
Status
SqliteMetaImpl::HasTable(const std::string &table_id, bool &has_or_not) {
SqliteMetaImpl::HasTable(const std::string& table_id, bool& has_or_not) {
has_or_not = false;
try {
server::MetricCollector metric;
auto tables = ConnectorPtr->select(columns(&TableSchema::id_),
where(c(&TableSchema::table_id_) == table_id
and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
if (tables.size() == 1) {
has_or_not = true;
} else {
has_or_not = false;
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when lookup table", e.what());
}
@ -265,7 +265,7 @@ SqliteMetaImpl::HasTable(const std::string &table_id, bool &has_or_not) {
}
Status
SqliteMetaImpl::AllTables(std::vector<TableSchema> &table_schema_array) {
SqliteMetaImpl::AllTables(std::vector<TableSchema>& table_schema_array) {
try {
server::MetricCollector metric;
@ -281,8 +281,8 @@ SqliteMetaImpl::AllTables(std::vector<TableSchema> &table_schema_array) {
&TableSchema::owner_table_,
&TableSchema::partition_tag_,
&TableSchema::version_),
where(c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
for (auto &table : selected) {
where(c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
for (auto& table : selected) {
TableSchema schema;
schema.id_ = std::get<0>(table);
schema.table_id_ = std::get<1>(table);
@ -299,7 +299,7 @@ SqliteMetaImpl::AllTables(std::vector<TableSchema> &table_schema_array) {
table_schema_array.emplace_back(schema);
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when lookup all tables", e.what());
}
@ -307,7 +307,7 @@ SqliteMetaImpl::AllTables(std::vector<TableSchema> &table_schema_array) {
}
Status
SqliteMetaImpl::DropTable(const std::string &table_id) {
SqliteMetaImpl::DropTable(const std::string& table_id) {
try {
server::MetricCollector metric;
@ -317,13 +317,13 @@ SqliteMetaImpl::DropTable(const std::string &table_id) {
//soft delete table
ConnectorPtr->update_all(
set(
c(&TableSchema::state_) = (int) TableSchema::TO_DELETE),
c(&TableSchema::state_) = (int)TableSchema::TO_DELETE),
where(
c(&TableSchema::table_id_) == table_id and
c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
ENGINE_LOG_DEBUG << "Successfully delete table, table id = " << table_id;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when delete table", e.what());
}
@ -331,7 +331,7 @@ SqliteMetaImpl::DropTable(const std::string &table_id) {
}
Status
SqliteMetaImpl::DeleteTableFiles(const std::string &table_id) {
SqliteMetaImpl::DeleteTableFiles(const std::string& table_id) {
try {
server::MetricCollector metric;
@ -341,14 +341,14 @@ SqliteMetaImpl::DeleteTableFiles(const std::string &table_id) {
//soft delete table files
ConnectorPtr->update_all(
set(
c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE,
c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE,
c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()),
where(
c(&TableFileSchema::table_id_) == table_id and
c(&TableFileSchema::file_type_) != (int) TableFileSchema::TO_DELETE));
c(&TableFileSchema::file_type_) != (int)TableFileSchema::TO_DELETE));
ENGINE_LOG_DEBUG << "Successfully delete table files, table id = " << table_id;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when delete table files", e.what());
}
@ -356,7 +356,7 @@ SqliteMetaImpl::DeleteTableFiles(const std::string &table_id) {
}
Status
SqliteMetaImpl::CreateTableFile(TableFileSchema &file_schema) {
SqliteMetaImpl::CreateTableFile(TableFileSchema& file_schema) {
if (file_schema.date_ == EmptyDate) {
file_schema.date_ = utils::GetDate();
}
@ -389,7 +389,7 @@ SqliteMetaImpl::CreateTableFile(TableFileSchema &file_schema) {
ENGINE_LOG_DEBUG << "Successfully create table file, file id = " << file_schema.file_id_;
return utils::CreateTableFilePath(options_, file_schema);
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when create table file", e.what());
}
@ -398,8 +398,8 @@ SqliteMetaImpl::CreateTableFile(TableFileSchema &file_schema) {
// TODO(myh): Delete single vector by id
Status
SqliteMetaImpl::DropDataByDate(const std::string &table_id,
const DatesT &dates) {
SqliteMetaImpl::DropDataByDate(const std::string& table_id,
const DatesT& dates) {
if (dates.empty()) {
return Status::OK();
}
@ -440,7 +440,7 @@ SqliteMetaImpl::DropDataByDate(const std::string &table_id,
}
ENGINE_LOG_DEBUG << "Successfully drop data by date, table id = " << table_schema.table_id_;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when drop partition", e.what());
}
@ -448,9 +448,9 @@ SqliteMetaImpl::DropDataByDate(const std::string &table_id,
}
Status
SqliteMetaImpl::GetTableFiles(const std::string &table_id,
const std::vector<size_t> &ids,
TableFilesSchema &table_files) {
SqliteMetaImpl::GetTableFiles(const std::string& table_id,
const std::vector<size_t>& ids,
TableFilesSchema& table_files) {
try {
table_files.clear();
auto files = ConnectorPtr->select(columns(&TableFileSchema::id_,
@ -463,7 +463,7 @@ SqliteMetaImpl::GetTableFiles(const std::string &table_id,
&TableFileSchema::created_on_),
where(c(&TableFileSchema::table_id_) == table_id and
in(&TableFileSchema::id_, ids) and
c(&TableFileSchema::file_type_) != (int) TableFileSchema::TO_DELETE));
c(&TableFileSchema::file_type_) != (int)TableFileSchema::TO_DELETE));
TableSchema table_schema;
table_schema.table_id_ = table_id;
auto status = DescribeTable(table_schema);
@ -472,7 +472,7 @@ SqliteMetaImpl::GetTableFiles(const std::string &table_id,
}
Status result;
for (auto &file : files) {
for (auto& file : files) {
TableFileSchema file_schema;
file_schema.table_id_ = table_id;
file_schema.id_ = std::get<0>(file);
@ -495,13 +495,13 @@ SqliteMetaImpl::GetTableFiles(const std::string &table_id,
ENGINE_LOG_DEBUG << "Get table files by id";
return result;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when lookup table files", e.what());
}
}
Status
SqliteMetaImpl::UpdateTableFlag(const std::string &table_id, int64_t flag) {
SqliteMetaImpl::UpdateTableFlag(const std::string& table_id, int64_t flag) {
try {
server::MetricCollector metric;
@ -512,7 +512,7 @@ SqliteMetaImpl::UpdateTableFlag(const std::string &table_id, int64_t flag) {
where(
c(&TableSchema::table_id_) == table_id));
ENGINE_LOG_DEBUG << "Successfully update table flag, table id = " << table_id;
} catch (std::exception &e) {
} catch (std::exception& e) {
std::string msg = "Encounter exception when update table flag: table_id = " + table_id;
return HandleException(msg, e.what());
}
@ -521,7 +521,7 @@ SqliteMetaImpl::UpdateTableFlag(const std::string &table_id, int64_t flag) {
}
Status
SqliteMetaImpl::UpdateTableFile(TableFileSchema &file_schema) {
SqliteMetaImpl::UpdateTableFile(TableFileSchema& file_schema) {
file_schema.updated_time_ = utils::GetMicroSecTimeStamp();
try {
server::MetricCollector metric;
@ -534,14 +534,14 @@ SqliteMetaImpl::UpdateTableFile(TableFileSchema &file_schema) {
//if the table has been deleted, just mark the table file as TO_DELETE
//clean thread will delete the file later
if (tables.size() < 1 || std::get<0>(tables[0]) == (int) TableSchema::TO_DELETE) {
if (tables.size() < 1 || std::get<0>(tables[0]) == (int)TableSchema::TO_DELETE) {
file_schema.file_type_ = TableFileSchema::TO_DELETE;
}
ConnectorPtr->update(file_schema);
ENGINE_LOG_DEBUG << "Update single table file, file id = " << file_schema.file_id_;
} catch (std::exception &e) {
} catch (std::exception& e) {
std::string msg = "Exception update table file: table_id = " + file_schema.table_id_
+ " file_id = " + file_schema.file_id_;
return HandleException(msg, e.what());
@ -550,7 +550,7 @@ SqliteMetaImpl::UpdateTableFile(TableFileSchema &file_schema) {
}
Status
SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) {
SqliteMetaImpl::UpdateTableFiles(TableFilesSchema& files) {
try {
server::MetricCollector metric;
@ -558,13 +558,13 @@ SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) {
std::lock_guard<std::mutex> meta_lock(meta_mutex_);
std::map<std::string, bool> has_tables;
for (auto &file : files) {
for (auto& file : files) {
if (has_tables.find(file.table_id_) != has_tables.end()) {
continue;
}
auto tables = ConnectorPtr->select(columns(&TableSchema::id_),
where(c(&TableSchema::table_id_) == file.table_id_
and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
if (tables.size() >= 1) {
has_tables[file.table_id_] = true;
} else {
@ -573,7 +573,7 @@ SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) {
}
auto commited = ConnectorPtr->transaction([&]() mutable {
for (auto &file : files) {
for (auto& file : files) {
if (!has_tables[file.table_id_]) {
file.file_type_ = TableFileSchema::TO_DELETE;
}
@ -589,7 +589,7 @@ SqliteMetaImpl::UpdateTableFiles(TableFilesSchema &files) {
}
ENGINE_LOG_DEBUG << "Update " << files.size() << " table files";
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when update table files", e.what());
}
return Status::OK();
@ -613,7 +613,7 @@ SqliteMetaImpl::UpdateTableIndex(const std::string& table_id, const TableIndex&
&TableSchema::partition_tag_,
&TableSchema::version_),
where(c(&TableSchema::table_id_) == table_id
and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
if (tables.size() > 0) {
meta::TableSchema table_schema;
@ -639,11 +639,11 @@ SqliteMetaImpl::UpdateTableIndex(const std::string& table_id, const TableIndex&
//set all backup file to raw
ConnectorPtr->update_all(
set(
c(&TableFileSchema::file_type_) = (int) TableFileSchema::RAW,
c(&TableFileSchema::file_type_) = (int)TableFileSchema::RAW,
c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()),
where(
c(&TableFileSchema::table_id_) == table_id and
c(&TableFileSchema::file_type_) == (int) TableFileSchema::BACKUP));
c(&TableFileSchema::file_type_) == (int)TableFileSchema::BACKUP));
ENGINE_LOG_DEBUG << "Successfully update table index, table id = " << table_id;
} catch (std::exception& e) {
@ -655,7 +655,7 @@ SqliteMetaImpl::UpdateTableIndex(const std::string& table_id, const TableIndex&
}
Status
SqliteMetaImpl::UpdateTableFilesToIndex(const std::string &table_id) {
SqliteMetaImpl::UpdateTableFilesToIndex(const std::string& table_id) {
try {
server::MetricCollector metric;
@ -664,13 +664,14 @@ SqliteMetaImpl::UpdateTableFilesToIndex(const std::string &table_id) {
ConnectorPtr->update_all(
set(
c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_INDEX),
c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_INDEX),
where(
c(&TableFileSchema::table_id_) == table_id and
c(&TableFileSchema::file_type_) == (int) TableFileSchema::RAW));
c(&TableFileSchema::row_count_) >= meta::BUILD_INDEX_THRESHOLD and
c(&TableFileSchema::file_type_) == (int)TableFileSchema::RAW));
ENGINE_LOG_DEBUG << "Update files to to_index, table id = " << table_id;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when update table files to to_index", e.what());
}
@ -686,7 +687,7 @@ SqliteMetaImpl::DescribeTableIndex(const std::string& table_id, TableIndex& inde
&TableSchema::nlist_,
&TableSchema::metric_type_),
where(c(&TableSchema::table_id_) == table_id
and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
if (groups.size() == 1) {
index.engine_type_ = std::get<0>(groups[0]);
@ -713,20 +714,20 @@ SqliteMetaImpl::DropTableIndex(const std::string& table_id) {
//soft delete index files
ConnectorPtr->update_all(
set(
c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE,
c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE,
c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()),
where(
c(&TableFileSchema::table_id_) == table_id and
c(&TableFileSchema::file_type_) == (int) TableFileSchema::INDEX));
c(&TableFileSchema::file_type_) == (int)TableFileSchema::INDEX));
//set all backup file to raw
ConnectorPtr->update_all(
set(
c(&TableFileSchema::file_type_) = (int) TableFileSchema::RAW,
c(&TableFileSchema::file_type_) = (int)TableFileSchema::RAW,
c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()),
where(
c(&TableFileSchema::table_id_) == table_id and
c(&TableFileSchema::file_type_) == (int) TableFileSchema::BACKUP));
c(&TableFileSchema::file_type_) == (int)TableFileSchema::BACKUP));
//set table index type to raw
ConnectorPtr->update_all(
@ -738,7 +739,7 @@ SqliteMetaImpl::DropTableIndex(const std::string& table_id) {
c(&TableSchema::table_id_) == table_id));
ENGINE_LOG_DEBUG << "Successfully drop table index, table id = " << table_id;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when delete table index files", e.what());
}
@ -746,7 +747,9 @@ SqliteMetaImpl::DropTableIndex(const std::string& table_id) {
}
Status
SqliteMetaImpl::CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag) {
SqliteMetaImpl::CreatePartition(const std::string& table_id,
const std::string& partition_name,
const std::string& tag) {
server::MetricCollector metric;
TableSchema table_schema;
@ -757,7 +760,7 @@ SqliteMetaImpl::CreatePartition(const std::string& table_id, const std::string&
}
// not allow create partition under partition
if(!table_schema.owner_table_.empty()) {
if (!table_schema.owner_table_.empty()) {
return Status(DB_ERROR, "Nested partition is not allowed");
}
@ -769,7 +772,7 @@ SqliteMetaImpl::CreatePartition(const std::string& table_id, const std::string&
// not allow duplicated partition
std::string exist_partition;
GetPartitionName(table_id, valid_tag, exist_partition);
if(!exist_partition.empty()) {
if (!exist_partition.empty()) {
return Status(DB_ERROR, "Duplicate partition is not allowed");
}
@ -805,16 +808,16 @@ SqliteMetaImpl::ShowPartitions(const std::string& table_id, std::vector<meta::Ta
server::MetricCollector metric;
auto partitions = ConnectorPtr->select(columns(&TableSchema::table_id_),
where(c(&TableSchema::owner_table_) == table_id
and c(&TableSchema::state_) != (int) TableSchema::TO_DELETE));
for(size_t i = 0; i < partitions.size(); i++) {
where(c(&TableSchema::owner_table_) == table_id
and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
for (size_t i = 0; i < partitions.size(); i++) {
std::string partition_name = std::get<0>(partitions[i]);
meta::TableSchema partition_schema;
partition_schema.table_id_ = partition_name;
DescribeTable(partition_schema);
partiton_schema_array.emplace_back(partition_schema);
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when show partitions", e.what());
}
@ -832,14 +835,15 @@ SqliteMetaImpl::GetPartitionName(const std::string& table_id, const std::string&
server::StringHelpFunctions::TrimStringBlank(valid_tag);
auto name = ConnectorPtr->select(columns(&TableSchema::table_id_),
where(c(&TableSchema::owner_table_) == table_id
and c(&TableSchema::partition_tag_) == valid_tag));
where(c(&TableSchema::owner_table_) == table_id
and c(&TableSchema::partition_tag_) == valid_tag
and c(&TableSchema::state_) != (int)TableSchema::TO_DELETE));
if (name.size() > 0) {
partition_name = std::get<0>(name[0]);
} else {
return Status(DB_NOT_FOUND, "Table " + table_id + "'s partition " + valid_tag + " not found");
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when get partition name", e.what());
}
@ -1032,7 +1036,7 @@ SqliteMetaImpl::FilesToMerge(const std::string& table_id, DatePartionedTableFile
}
Status
SqliteMetaImpl::FilesToIndex(TableFilesSchema &files) {
SqliteMetaImpl::FilesToIndex(TableFilesSchema& files) {
files.clear();
try {
@ -1048,13 +1052,13 @@ SqliteMetaImpl::FilesToIndex(TableFilesSchema &files) {
&TableFileSchema::engine_type_,
&TableFileSchema::created_on_),
where(c(&TableFileSchema::file_type_)
== (int) TableFileSchema::TO_INDEX));
== (int)TableFileSchema::TO_INDEX));
std::map<std::string, TableSchema> groups;
TableFileSchema table_file;
Status ret;
for (auto &file : selected) {
for (auto& file : selected) {
table_file.id_ = std::get<0>(file);
table_file.table_id_ = std::get<1>(file);
table_file.file_id_ = std::get<2>(file);
@ -1090,48 +1094,66 @@ SqliteMetaImpl::FilesToIndex(TableFilesSchema &files) {
ENGINE_LOG_DEBUG << "Collect " << selected.size() << " to-index files";
}
return ret;
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when iterate raw files", e.what());
}
}
Status
SqliteMetaImpl::FilesByType(const std::string &table_id,
const std::vector<int> &file_types,
std::vector<std::string> &file_ids) {
SqliteMetaImpl::FilesByType(const std::string& table_id,
const std::vector<int>& file_types,
TableFilesSchema& table_files) {
if (file_types.empty()) {
return Status(DB_ERROR, "file types array is empty");
}
try {
file_ids.clear();
auto selected = ConnectorPtr->select(columns(&TableFileSchema::file_id_,
&TableFileSchema::file_type_),
table_files.clear();
auto selected = ConnectorPtr->select(columns(&TableFileSchema::id_,
&TableFileSchema::file_id_,
&TableFileSchema::file_type_,
&TableFileSchema::file_size_,
&TableFileSchema::row_count_,
&TableFileSchema::date_,
&TableFileSchema::engine_type_,
&TableFileSchema::created_on_),
where(in(&TableFileSchema::file_type_, file_types)
and c(&TableFileSchema::table_id_) == table_id));
if (selected.size() >= 1) {
int raw_count = 0, new_count = 0, new_merge_count = 0, new_index_count = 0;
int to_index_count = 0, index_count = 0, backup_count = 0;
for (auto &file : selected) {
file_ids.push_back(std::get<0>(file));
switch (std::get<1>(file)) {
case (int) TableFileSchema::RAW:raw_count++;
for (auto& file : selected) {
TableFileSchema file_schema;
file_schema.table_id_ = table_id;
file_schema.id_ = std::get<0>(file);
file_schema.file_id_ = std::get<1>(file);
file_schema.file_type_ = std::get<2>(file);
file_schema.file_size_ = std::get<3>(file);
file_schema.row_count_ = std::get<4>(file);
file_schema.date_ = std::get<5>(file);
file_schema.engine_type_ = std::get<6>(file);
file_schema.created_on_ = std::get<7>(file);
switch (file_schema.file_type_) {
case (int)TableFileSchema::RAW:raw_count++;
break;
case (int) TableFileSchema::NEW:new_count++;
case (int)TableFileSchema::NEW:new_count++;
break;
case (int) TableFileSchema::NEW_MERGE:new_merge_count++;
case (int)TableFileSchema::NEW_MERGE:new_merge_count++;
break;
case (int) TableFileSchema::NEW_INDEX:new_index_count++;
case (int)TableFileSchema::NEW_INDEX:new_index_count++;
break;
case (int) TableFileSchema::TO_INDEX:to_index_count++;
case (int)TableFileSchema::TO_INDEX:to_index_count++;
break;
case (int) TableFileSchema::INDEX:index_count++;
case (int)TableFileSchema::INDEX:index_count++;
break;
case (int) TableFileSchema::BACKUP:backup_count++;
case (int)TableFileSchema::BACKUP:backup_count++;
break;
default:break;
}
table_files.emplace_back(file_schema);
}
ENGINE_LOG_DEBUG << "Table " << table_id << " currently has raw files:" << raw_count
@ -1139,13 +1161,12 @@ SqliteMetaImpl::FilesByType(const std::string &table_id,
<< " new_index files:" << new_index_count << " to_index files:" << to_index_count
<< " index files:" << index_count << " backup files:" << backup_count;
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when check non index files", e.what());
}
return Status::OK();
}
// TODO(myh): Support swap to cloud storage
Status
SqliteMetaImpl::Archive() {
@ -1166,11 +1187,11 @@ SqliteMetaImpl::Archive() {
ConnectorPtr->update_all(
set(
c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE),
c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE),
where(
c(&TableFileSchema::created_on_) < (int64_t) (now - usecs) and
c(&TableFileSchema::file_type_) != (int) TableFileSchema::TO_DELETE));
} catch (std::exception &e) {
c(&TableFileSchema::created_on_) < (int64_t)(now - usecs) and
c(&TableFileSchema::file_type_) != (int)TableFileSchema::TO_DELETE));
} catch (std::exception& e) {
return HandleException("Encounter exception when update table files", e.what());
}
@ -1218,15 +1239,15 @@ SqliteMetaImpl::CleanUp() {
std::lock_guard<std::mutex> meta_lock(meta_mutex_);
std::vector<int> file_types = {
(int) TableFileSchema::NEW,
(int) TableFileSchema::NEW_INDEX,
(int) TableFileSchema::NEW_MERGE
(int)TableFileSchema::NEW,
(int)TableFileSchema::NEW_INDEX,
(int)TableFileSchema::NEW_MERGE
};
auto files =
ConnectorPtr->select(columns(&TableFileSchema::id_), where(in(&TableFileSchema::file_type_, file_types)));
auto commited = ConnectorPtr->transaction([&]() mutable {
for (auto &file : files) {
for (auto& file : files) {
ENGINE_LOG_DEBUG << "Remove table file type as NEW";
ConnectorPtr->remove<TableFileSchema>(std::get<0>(file));
}
@ -1240,7 +1261,7 @@ SqliteMetaImpl::CleanUp() {
if (files.size() > 0) {
ENGINE_LOG_DEBUG << "Clean " << files.size() << " files";
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when clean table file", e.what());
}
@ -1265,7 +1286,7 @@ SqliteMetaImpl::CleanUpFilesWithTTL(uint16_t seconds) {
&TableFileSchema::date_),
where(
c(&TableFileSchema::file_type_) ==
(int) TableFileSchema::TO_DELETE
(int)TableFileSchema::TO_DELETE
and
c(&TableFileSchema::updated_time_)
< now - seconds * US_PS));
@ -1354,7 +1375,7 @@ SqliteMetaImpl::CleanUpFilesWithTTL(uint16_t seconds) {
}
Status
SqliteMetaImpl::Count(const std::string &table_id, uint64_t &result) {
SqliteMetaImpl::Count(const std::string& table_id, uint64_t& result) {
try {
server::MetricCollector metric;
@ -1414,14 +1435,14 @@ SqliteMetaImpl::DiscardFiles(int64_t to_discard_size) {
auto selected = ConnectorPtr->select(columns(&TableFileSchema::id_,
&TableFileSchema::file_size_),
where(c(&TableFileSchema::file_type_)
!= (int) TableFileSchema::TO_DELETE),
!= (int)TableFileSchema::TO_DELETE),
order_by(&TableFileSchema::id_),
limit(10));
std::vector<int> ids;
TableFileSchema table_file;
for (auto &file : selected) {
for (auto& file : selected) {
if (to_discard_size <= 0) break;
table_file.id_ = std::get<0>(file);
table_file.file_size_ = std::get<1>(file);
@ -1437,7 +1458,7 @@ SqliteMetaImpl::DiscardFiles(int64_t to_discard_size) {
ConnectorPtr->update_all(
set(
c(&TableFileSchema::file_type_) = (int) TableFileSchema::TO_DELETE,
c(&TableFileSchema::file_type_) = (int)TableFileSchema::TO_DELETE,
c(&TableFileSchema::updated_time_) = utils::GetMicroSecTimeStamp()),
where(
in(&TableFileSchema::id_, ids)));
@ -1448,7 +1469,7 @@ SqliteMetaImpl::DiscardFiles(int64_t to_discard_size) {
if (!commited) {
return HandleException("DiscardFiles error: sqlite transaction failed");
}
} catch (std::exception &e) {
} catch (std::exception& e) {
return HandleException("Encounter exception when discard table file", e.what());
}

View File

@ -108,7 +108,7 @@ class SqliteMetaImpl : public Meta {
Status
FilesByType(const std::string& table_id, const std::vector<int>& file_types,
std::vector<std::string>& file_ids) override;
TableFilesSchema& table_files) override;
Status
Size(uint64_t& result) override;

View File

@ -33,7 +33,7 @@ FaissBaseIndex::SerializeImpl() {
try {
faiss::Index* index = index_.get();
SealImpl();
// SealImpl();
MemoryIOWriter writer;
faiss::write_index(index, &writer);
@ -60,6 +60,8 @@ FaissBaseIndex::LoadImpl(const BinarySet& index_binary) {
faiss::Index* index = faiss::read_index(&reader);
index_.reset(index);
SealImpl();
}
void

View File

@ -86,9 +86,6 @@ GPUIVF::SerializeImpl() {
faiss::Index* index = index_.get();
faiss::Index* host_index = faiss::gpu::index_gpu_to_cpu(index);
// TODO(linxj): support seal
// SealImpl();
faiss::write_index(host_index, &writer);
delete host_index;
}

View File

@ -97,7 +97,6 @@ IVF::Serialize() {
}
std::lock_guard<std::mutex> lk(mutex_);
Seal();
return SerializeImpl();
}

View File

@ -59,9 +59,9 @@ print_banner() {
#endif
<< " library." << std::endl;
#ifdef MILVUS_CPU_VERSION
std::cout << "You are using Milvus CPU version" << std::endl;
std::cout << "You are using Milvus CPU edition" << std::endl;
#else
std::cout << "You are using Milvus GPU version" << std::endl;
std::cout << "You are using Milvus GPU edition" << std::endl;
#endif
std::cout << std::endl;
}

View File

@ -54,36 +54,40 @@ load_simple_config() {
// get resources
#ifdef MILVUS_GPU_VERSION
bool enable_gpu = false;
server::Config& config = server::Config::GetInstance();
std::vector<int64_t> gpu_ids;
config.GetGpuResourceConfigSearchResources(gpu_ids);
std::vector<int64_t> build_gpu_ids;
config.GetGpuResourceConfigBuildIndexResources(build_gpu_ids);
auto pcie = Connection("pcie", 12000);
config.GetGpuResourceConfigEnable(enable_gpu);
if (enable_gpu) {
std::vector<int64_t> gpu_ids;
config.GetGpuResourceConfigSearchResources(gpu_ids);
std::vector<int64_t> build_gpu_ids;
config.GetGpuResourceConfigBuildIndexResources(build_gpu_ids);
auto pcie = Connection("pcie", 12000);
std::vector<int64_t> not_find_build_ids;
for (auto& build_id : build_gpu_ids) {
bool find_gpu_id = false;
for (auto& gpu_id : gpu_ids) {
if (gpu_id == build_id) {
find_gpu_id = true;
break;
std::vector<int64_t> not_find_build_ids;
for (auto& build_id : build_gpu_ids) {
bool find_gpu_id = false;
for (auto& gpu_id : gpu_ids) {
if (gpu_id == build_id) {
find_gpu_id = true;
break;
}
}
if (not find_gpu_id) {
not_find_build_ids.emplace_back(build_id);
}
}
if (not find_gpu_id) {
not_find_build_ids.emplace_back(build_id);
for (auto& gpu_id : gpu_ids) {
ResMgrInst::GetInstance()->Add(ResourceFactory::Create(std::to_string(gpu_id), "GPU", gpu_id, true, true));
ResMgrInst::GetInstance()->Connect("cpu", std::to_string(gpu_id), pcie);
}
}
for (auto& gpu_id : gpu_ids) {
ResMgrInst::GetInstance()->Add(ResourceFactory::Create(std::to_string(gpu_id), "GPU", gpu_id, true, true));
ResMgrInst::GetInstance()->Connect("cpu", std::to_string(gpu_id), pcie);
}
for (auto& not_find_id : not_find_build_ids) {
ResMgrInst::GetInstance()->Add(
ResourceFactory::Create(std::to_string(not_find_id), "GPU", not_find_id, true, true));
ResMgrInst::GetInstance()->Connect("cpu", std::to_string(not_find_id), pcie);
for (auto& not_find_id : not_find_build_ids) {
ResMgrInst::GetInstance()->Add(
ResourceFactory::Create(std::to_string(not_find_id), "GPU", not_find_id, true, true));
ResMgrInst::GetInstance()->Connect("cpu", std::to_string(not_find_id), pcie);
}
}
#endif
}

View File

@ -102,11 +102,35 @@ class OptimizerInst {
if (instance == nullptr) {
std::vector<PassPtr> pass_list;
#ifdef MILVUS_GPU_VERSION
pass_list.push_back(std::make_shared<BuildIndexPass>());
pass_list.push_back(std::make_shared<FaissFlatPass>());
pass_list.push_back(std::make_shared<FaissIVFFlatPass>());
pass_list.push_back(std::make_shared<FaissIVFSQ8Pass>());
pass_list.push_back(std::make_shared<FaissIVFSQ8HPass>());
bool enable_gpu = false;
server::Config& config = server::Config::GetInstance();
config.GetGpuResourceConfigEnable(enable_gpu);
if (enable_gpu) {
std::vector<int64_t> build_gpus;
std::vector<int64_t> search_gpus;
int64_t gpu_search_threshold;
config.GetGpuResourceConfigBuildIndexResources(build_gpus);
config.GetGpuResourceConfigSearchResources(search_gpus);
config.GetEngineConfigGpuSearchThreshold(gpu_search_threshold);
std::string build_msg = "Build index gpu:";
for (auto build_id : build_gpus) {
build_msg.append(" gpu" + std::to_string(build_id));
}
SERVER_LOG_DEBUG << build_msg;
std::string search_msg = "Search gpu:";
for (auto search_id : search_gpus) {
search_msg.append(" gpu" + std::to_string(search_id));
}
search_msg.append(". gpu_search_threshold:" + std::to_string(gpu_search_threshold));
SERVER_LOG_DEBUG << search_msg;
pass_list.push_back(std::make_shared<BuildIndexPass>());
pass_list.push_back(std::make_shared<FaissFlatPass>());
pass_list.push_back(std::make_shared<FaissIVFFlatPass>());
pass_list.push_back(std::make_shared<FaissIVFSQ8Pass>());
pass_list.push_back(std::make_shared<FaissIVFSQ8HPass>());
}
#endif
pass_list.push_back(std::make_shared<FallbackPass>());
instance = std::make_shared<Optimizer>(pass_list);
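
For orientation, a minimal sketch of the pass-chain dispatch the optimizer performs over this list: each pass either claims the task or defers to the next, and the trailing FallbackPass guarantees every task gets a label (the `Task`/`Pass` types here are simplified stand-ins):

```cpp
#include <memory>
#include <vector>

struct Task {};

struct Pass {
    virtual ~Pass() = default;
    virtual bool Run(Task& task) = 0;  // true if this pass handled the task
};

// Try each pass in registration order; the first one that accepts wins.
bool Dispatch(const std::vector<std::shared_ptr<Pass>>& pass_list, Task& task) {
    for (const auto& pass : pass_list) {
        if (pass->Run(task)) {
            return true;
        }
    }
    return false;  // unreachable when a FallbackPass is registered last
}
```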

View File

@ -25,12 +25,13 @@ namespace scheduler {
void
BuildIndexPass::Init() {
#ifdef MILVUS_GPU_VERSION
server::Config& config = server::Config::GetInstance();
std::vector<int64_t> build_resources;
Status s = config.GetGpuResourceConfigBuildIndexResources(build_resources);
Status s = config.GetGpuResourceConfigBuildIndexResources(build_gpu_ids_);
if (!s.ok()) {
throw;
}
#endif
}
bool
@ -38,13 +39,16 @@ BuildIndexPass::Run(const TaskPtr& task) {
if (task->Type() != TaskType::BuildIndexTask)
return false;
if (build_gpu_ids_.empty())
if (build_gpu_ids_.empty()) {
SERVER_LOG_WARNING << "BuildIndexPass cannot get build index gpu!";
return false;
}
ResourcePtr res_ptr;
res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, build_gpu_ids_[specified_gpu_id_]);
auto label = std::make_shared<SpecResLabel>(std::weak_ptr<Resource>(res_ptr));
task->label() = label;
SERVER_LOG_DEBUG << "Specify gpu" << specified_gpu_id_ << " to build index!";
specified_gpu_id_ = (specified_gpu_id_ + 1) % build_gpu_ids_.size();
return true;

View File

@ -45,7 +45,7 @@ class BuildIndexPass : public Pass {
private:
uint64_t specified_gpu_id_ = 0;
std::vector<int32_t> build_gpu_ids_;
std::vector<int64_t> build_gpu_ids_;
};
using BuildIndexPassPtr = std::shared_ptr<BuildIndexPass>;

View File

@ -29,6 +29,7 @@ namespace scheduler {
void
FaissFlatPass::Init() {
#ifdef MILVUS_GPU_VERSION
server::Config& config = server::Config::GetInstance();
Status s = config.GetEngineConfigGpuSearchThreshold(threshold_);
if (!s.ok()) {
@ -38,6 +39,7 @@ FaissFlatPass::Init() {
if (!s.ok()) {
throw;
}
#endif
}
bool
@ -54,9 +56,11 @@ FaissFlatPass::Run(const TaskPtr& task) {
auto search_job = std::static_pointer_cast<SearchJob>(search_task->job_.lock());
ResourcePtr res_ptr;
if (search_job->nq() < threshold_) {
SERVER_LOG_DEBUG << "FaissFlatPass: nq < gpu_search_threshold, specify cpu to search!";
res_ptr = ResMgrInst::GetInstance()->GetResource("cpu");
} else {
auto best_device_id = count_ % gpus.size();
SERVER_LOG_DEBUG << "FaissFlatPass: nq > gpu_search_threshold, specify gpu" << best_device_id << " to search!";
count_++;
res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id);
}
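
The same nq-threshold dispatch recurs in FaissIVFFlatPass, FaissIVFSQ8HPass, and FaissIVFSQ8Pass below; a minimal sketch of the decision (the resource naming here is illustrative):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Route small queries to the CPU and spread large ones across GPUs,
// mirroring the nq-vs-gpu_search_threshold check in the scheduler passes.
std::string PickResource(int64_t nq, int64_t threshold, std::size_t& counter, std::size_t gpu_count) {
    if (nq < threshold) {
        return "cpu";  // below the threshold, CPU search is cheaper
    }
    std::size_t device_id = counter++ % gpu_count;  // round-robin over search GPUs
    return "gpu" + std::to_string(device_id);
}
```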

View File

@ -29,6 +29,7 @@ namespace scheduler {
void
FaissIVFFlatPass::Init() {
#ifdef MILVUS_GPU_VERSION
server::Config& config = server::Config::GetInstance();
Status s = config.GetEngineConfigGpuSearchThreshold(threshold_);
if (!s.ok()) {
@ -38,6 +39,7 @@ FaissIVFFlatPass::Init() {
if (!s.ok()) {
throw;
}
#endif
}
bool
@ -54,9 +56,12 @@ FaissIVFFlatPass::Run(const TaskPtr& task) {
auto search_job = std::static_pointer_cast<SearchJob>(search_task->job_.lock());
ResourcePtr res_ptr;
if (search_job->nq() < threshold_) {
SERVER_LOG_DEBUG << "FaissIVFFlatPass: nq < gpu_search_threshold, specify cpu to search!";
res_ptr = ResMgrInst::GetInstance()->GetResource("cpu");
} else {
auto best_device_id = count_ % gpus.size();
SERVER_LOG_DEBUG << "FaissIVFFlatPass: nq > gpu_search_threshold, specify gpu" << best_device_id
<< " to search!";
count_++;
res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id);
}

View File

@ -29,12 +29,14 @@ namespace scheduler {
void
FaissIVFSQ8HPass::Init() {
#ifdef MILVUS_GPU_VERSION
server::Config& config = server::Config::GetInstance();
Status s = config.GetEngineConfigGpuSearchThreshold(threshold_);
if (!s.ok()) {
threshold_ = std::numeric_limits<int64_t>::max();
}
s = config.GetGpuResourceConfigSearchResources(gpus);
#endif
}
bool
@ -51,9 +53,12 @@ FaissIVFSQ8HPass::Run(const TaskPtr& task) {
auto search_job = std::static_pointer_cast<SearchJob>(search_task->job_.lock());
ResourcePtr res_ptr;
if (search_job->nq() < threshold_) {
SERVER_LOG_DEBUG << "FaissIVFSQ8HPass: nq < gpu_search_threshold, specify cpu to search!";
res_ptr = ResMgrInst::GetInstance()->GetResource("cpu");
} else {
auto best_device_id = count_ % gpus.size();
SERVER_LOG_DEBUG << "FaissIVFSQ8HPass: nq > gpu_search_threshold, specify gpu" << best_device_id
<< " to search!";
count_++;
res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id);
}

View File

@ -29,6 +29,7 @@ namespace scheduler {
void
FaissIVFSQ8Pass::Init() {
#ifdef MILVUS_GPU_VERSION
server::Config& config = server::Config::GetInstance();
Status s = config.GetEngineConfigGpuSearchThreshold(threshold_);
if (!s.ok()) {
@ -38,6 +39,7 @@ FaissIVFSQ8Pass::Init() {
if (!s.ok()) {
throw;
}
#endif
}
bool
@ -54,9 +56,12 @@ FaissIVFSQ8Pass::Run(const TaskPtr& task) {
auto search_job = std::static_pointer_cast<SearchJob>(search_task->job_.lock());
ResourcePtr res_ptr;
if (search_job->nq() < threshold_) {
SERVER_LOG_DEBUG << "FaissIVFSQ8Pass: nq < gpu_search_threshold, specify cpu to search!";
res_ptr = ResMgrInst::GetInstance()->GetResource("cpu");
} else {
auto best_device_id = count_ % gpus.size();
SERVER_LOG_DEBUG << "FaissIVFSQ8Pass: nq > gpu_search_threshold, specify gpu" << best_device_id
<< " to search!";
count_++;
res_ptr = ResMgrInst::GetInstance()->GetResource(ResourceType::GPU, best_device_id);
}

View File

@ -33,6 +33,7 @@ FallbackPass::Run(const TaskPtr& task) {
return false;
}
// NEVER be empty
SERVER_LOG_DEBUG << "FallbackPass!";
auto cpu = ResMgrInst::GetInstance()->GetCpuResources()[0];
auto label = std::make_shared<SpecResLabel>(cpu);
task->label() = label;

View File

@ -85,7 +85,7 @@ XBuildIndexTask::Load(milvus::scheduler::LoadType type, uint8_t device_id) {
size_t file_size = to_index_engine_->PhysicalSize();
std::string info = "Load file id:" + std::to_string(file_->id_) +
std::string info = "Load file id:" + std::to_string(file_->id_) + " " + type_str +
" file type:" + std::to_string(file_->file_type_) + " size:" + std::to_string(file_size) +
" bytes from location: " + file_->location_ + " totally cost";
double span = rc.ElapseFromBegin(info);

View File

@ -93,6 +93,15 @@ ClientTest::Test(const std::string& address, const std::string& port) {
std::cout << "CreatePartition function call status: " << stat.message() << std::endl;
milvus_sdk::Utils::PrintPartitionParam(partition_param);
}
// show partitions
milvus::PartitionList partition_array;
stat = conn->ShowPartitions(TABLE_NAME, partition_array);
std::cout << partition_array.size() << " partitions created:" << std::endl;
for (auto& partition : partition_array) {
std::cout << "\t" << partition.partition_name << "\t tag = " << partition.partition_tag << std::endl;
}
}
{ // insert vectors

View File

@ -99,6 +99,12 @@ Utils::IndexTypeName(const milvus::IndexType& index_type) {
return "NSG";
case milvus::IndexType::IVFSQ8H:
return "IVFSQ8H";
case milvus::IndexType::IVFPQ:
return "IVFPQ";
case milvus::IndexType::SPTAGKDT:
return "SPTAGKDT";
case milvus::IndexType::SPTAGBKT:
return "SPTAGBKT";
default:
return "Unknown index type";
}
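
The new cases simply extend the name table; mirrored here as a lookup for reference (numeric values taken from the IndexType enum in the next hunk, earlier entries omitted):

```python
# Visible cases of Utils::IndexTypeName after this change.
INDEX_TYPE_NAMES = {
    3: "IVFSQ8",
    4: "NSG",
    5: "IVFSQ8H",
    6: "IVFPQ",     # new
    7: "SPTAGKDT",  # new
    8: "SPTAGBKT",  # new
}

def index_type_name(value):
    return INDEX_TYPE_NAMES.get(value, "Unknown index type")

assert index_type_name(6) == "IVFPQ"
assert index_type_name(99) == "Unknown index type"
```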

View File

@ -37,6 +37,9 @@ enum class IndexType {
IVFSQ8 = 3,
NSG = 4,
IVFSQ8H = 5,
IVFPQ = 6,
SPTAGKDT = 7,
SPTAGBKT = 8,
};
enum class MetricType {

View File

@ -182,6 +182,7 @@ Config::ValidateConfig() {
return s;
}
#ifdef MILVUS_GPU_VERSION
int64_t engine_gpu_search_threshold;
s = GetEngineConfigGpuSearchThreshold(engine_gpu_search_threshold);
if (!s.ok()) {
@ -189,35 +190,36 @@ Config::ValidateConfig() {
}
/* gpu resource config */
#ifdef MILVUS_GPU_VERSION
bool gpu_resource_enable;
s = GetGpuResourceConfigEnable(gpu_resource_enable);
if (!s.ok()) {
return s;
}
int64_t resource_cache_capacity;
s = GetGpuResourceConfigCacheCapacity(resource_cache_capacity);
if (!s.ok()) {
return s;
}
if (gpu_resource_enable) {
int64_t resource_cache_capacity;
s = GetGpuResourceConfigCacheCapacity(resource_cache_capacity);
if (!s.ok()) {
return s;
}
float resource_cache_threshold;
s = GetGpuResourceConfigCacheThreshold(resource_cache_threshold);
if (!s.ok()) {
return s;
}
float resource_cache_threshold;
s = GetGpuResourceConfigCacheThreshold(resource_cache_threshold);
if (!s.ok()) {
return s;
}
std::vector<int64_t> search_resources;
s = GetGpuResourceConfigSearchResources(search_resources);
if (!s.ok()) {
return s;
}
std::vector<int64_t> search_resources;
s = GetGpuResourceConfigSearchResources(search_resources);
if (!s.ok()) {
return s;
}
std::vector<int64_t> index_build_resources;
s = GetGpuResourceConfigBuildIndexResources(index_build_resources);
if (!s.ok()) {
return s;
std::vector<int64_t> index_build_resources;
s = GetGpuResourceConfigBuildIndexResources(index_build_resources);
if (!s.ok()) {
return s;
}
}
#endif
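
The reshuffled block makes the cache, search, and build-resource checks conditional on gpu_resource_config.enable, so a configuration with the GPU section disabled now validates cleanly. Schematically (a sketch, not the real Config class):

```python
# GPU sub-settings are validated only when the section is enabled.
def validate_gpu_resource_config(cfg):
    gpu_cfg = cfg.get("gpu_resource_config", {})
    if not gpu_cfg.get("enable", False):   # checked unconditionally
        return "ok"                        # everything below is skipped
    for key in ("cache_capacity", "cache_threshold",
                "search_resources", "build_index_resources"):
        if key not in gpu_cfg:
            return "missing " + key
    return "ok"

assert validate_gpu_resource_config({"gpu_resource_config": {"enable": False}}) == "ok"
```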
@ -323,13 +325,13 @@ Config::ResetDefaultConfig() {
return s;
}
#ifdef MILVUS_GPU_VERSION
/* gpu resource config */
s = SetEngineConfigGpuSearchThreshold(CONFIG_ENGINE_GPU_SEARCH_THRESHOLD_DEFAULT);
if (!s.ok()) {
return s;
}
/* gpu resource config */
#ifdef MILVUS_GPU_VERSION
s = SetGpuResourceConfigEnable(CONFIG_GPU_RESOURCE_ENABLE_DEFAULT);
if (!s.ok()) {
return s;
@ -630,6 +632,7 @@ Config::CheckEngineConfigOmpThreadNum(const std::string& value) {
return Status::OK();
}
#ifdef MILVUS_GPU_VERSION
Status
Config::CheckEngineConfigGpuSearchThreshold(const std::string& value) {
if (!ValidationUtil::ValidateStringIsNumber(value).ok()) {
@ -759,6 +762,7 @@ Config::CheckGpuResourceConfigBuildIndexResources(const std::vector<std::string>
return Status::OK();
}
#endif
////////////////////////////////////////////////////////////////////////////////
ConfigNode&
@ -979,6 +983,7 @@ Config::GetEngineConfigOmpThreadNum(int64_t& value) {
return Status::OK();
}
#ifdef MILVUS_GPU_VERSION
Status
Config::GetEngineConfigGpuSearchThreshold(int64_t& value) {
std::string str =
@ -1095,6 +1100,7 @@ Config::GetGpuResourceConfigBuildIndexResources(std::vector<int64_t>& value) {
}
return Status::OK();
}
#endif
///////////////////////////////////////////////////////////////////////////////
/* server config */
@ -1282,6 +1288,8 @@ Config::SetEngineConfigOmpThreadNum(const std::string& value) {
return Status::OK();
}
#ifdef MILVUS_GPU_VERSION
/* gpu resource config */
Status
Config::SetEngineConfigGpuSearchThreshold(const std::string& value) {
Status s = CheckEngineConfigGpuSearchThreshold(value);
@ -1292,7 +1300,6 @@ Config::SetEngineConfigGpuSearchThreshold(const std::string& value) {
return Status::OK();
}
/* gpu resource config */
Status
Config::SetGpuResourceConfigEnable(const std::string& value) {
Status s = CheckGpuResourceConfigEnable(value);
@ -1346,6 +1353,7 @@ Config::SetGpuResourceConfigBuildIndexResources(const std::string& value) {
SetConfigValueInMem(CONFIG_GPU_RESOURCE, CONFIG_GPU_RESOURCE_BUILD_INDEX_RESOURCES, value);
return Status::OK();
} // namespace server
#endif
} // namespace server
} // namespace milvus

View File

@ -170,6 +170,8 @@ class Config {
CheckEngineConfigUseBlasThreshold(const std::string& value);
Status
CheckEngineConfigOmpThreadNum(const std::string& value);
#ifdef MILVUS_GPU_VERSION
Status
CheckEngineConfigGpuSearchThreshold(const std::string& value);
@ -184,6 +186,7 @@ class Config {
CheckGpuResourceConfigSearchResources(const std::vector<std::string>& value);
Status
CheckGpuResourceConfigBuildIndexResources(const std::vector<std::string>& value);
#endif
std::string
GetConfigStr(const std::string& parent_key, const std::string& child_key, const std::string& default_value = "");
@ -239,6 +242,8 @@ class Config {
GetEngineConfigUseBlasThreshold(int64_t& value);
Status
GetEngineConfigOmpThreadNum(int64_t& value);
#ifdef MILVUS_GPU_VERSION
Status
GetEngineConfigGpuSearchThreshold(int64_t& value);
@ -253,6 +258,7 @@ class Config {
GetGpuResourceConfigSearchResources(std::vector<int64_t>& value);
Status
GetGpuResourceConfigBuildIndexResources(std::vector<int64_t>& value);
#endif
public:
/* server config */
@ -300,6 +306,8 @@ class Config {
SetEngineConfigUseBlasThreshold(const std::string& value);
Status
SetEngineConfigOmpThreadNum(const std::string& value);
#ifdef MILVUS_GPU_VERSION
Status
SetEngineConfigGpuSearchThreshold(const std::string& value);
@ -314,6 +322,7 @@ class Config {
SetGpuResourceConfigSearchResources(const std::string& value);
Status
SetGpuResourceConfigBuildIndexResources(const std::string& value);
#endif
private:
std::unordered_map<std::string, std::unordered_map<std::string, std::string>> config_map_;

View File

@ -183,7 +183,11 @@ Server::Start() {
// print version information
SERVER_LOG_INFO << "Milvus " << BUILD_TYPE << " version: v" << MILVUS_VERSION << ", built at " << BUILD_TIME;
#ifdef MILVUS_CPU_VERSION
SERVER_LOG_INFO << "CPU edition";
#else
SERVER_LOG_INFO << "GPU edition";
#endif
server::Metrics::GetInstance().Init();
server::SystemInfo::GetInstance().Init();

View File

@ -90,8 +90,8 @@ GrpcBaseRequest::SetStatus(ErrorCode error_code, const std::string& error_msg) {
std::string
GrpcBaseRequest::TableNotExistMsg(const std::string& table_name) {
return "Table " + table_name +
" not exist. Use milvus.has_table to verify whether the table exists. You also can check if the table name "
"exists.";
" does not exist. Use milvus.has_table to verify whether the table exists. "
"You also can check whether the table name exists.";
}
Status

View File

@ -30,9 +30,13 @@ class StringHelpFunctions {
StringHelpFunctions() = default;
public:
// trim blanks from the beginning and end
// " a b c " => "a b c"
static void
TrimStringBlank(std::string& string);
// trim quotes from the beginning and end
// "'abc'" => "abc"
static void
TrimStringQuote(std::string& string, const std::string& quote);
@ -46,6 +50,8 @@ class StringHelpFunctions {
static void
SplitStringByDelimeter(const std::string& str, const std::string& delimeter, std::vector<std::string>& result);
// merge strings with a delimiter
// "a", "b", "c" => "a,b,c"
static void
MergeStringWithDelimeter(const std::vector<std::string>& strs, const std::string& delimeter, std::string& result);
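
The added comments document the intended behavior; the same transformations replayed as Python one-liners, for illustration only:

```python
assert " a b c ".strip() == "a b c"            # TrimStringBlank
assert "'abc'".strip("'") == "abc"             # TrimStringQuote with quote "'"
assert "a,b,c".split(",") == ["a", "b", "c"]   # SplitStringByDelimeter
assert ",".join(["a", "b", "c"]) == "a,b,c"    # MergeStringWithDelimeter
```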

View File

@ -218,10 +218,9 @@ ValidationUtil::ValidateGpuIndex(int32_t gpu_index) {
return Status::OK();
}
#ifdef MILVUS_GPU_VERSION
Status
ValidationUtil::GetGpuMemory(int32_t gpu_index, size_t& memory) {
#ifdef MILVUS_GPU_VERSION
cudaDeviceProp deviceProp;
auto cuda_err = cudaGetDeviceProperties(&deviceProp, gpu_index);
if (cuda_err) {
@ -232,10 +231,9 @@ ValidationUtil::GetGpuMemory(int32_t gpu_index, size_t& memory) {
}
memory = deviceProp.totalGlobalMem;
#endif
return Status::OK();
}
#endif
Status
ValidationUtil::ValidateIpAddress(const std::string& ip_address) {

View File

@ -64,8 +64,10 @@ class ValidationUtil {
static Status
ValidateGpuIndex(int32_t gpu_index);
#ifdef MILVUS_GPU_VERSION
static Status
GetGpuMemory(int32_t gpu_index, size_t& memory);
#endif
static Status
ValidateIpAddress(const std::string& ip_address);

View File

@ -37,6 +37,16 @@ constexpr int64_t M_BYTE = 1024 * 1024;
Status
KnowhereResource::Initialize() {
#ifdef MILVUS_GPU_VERSION
Status s;
bool enable_gpu = false;
server::Config& config = server::Config::GetInstance();
s = config.GetGpuResourceConfigEnable(enable_gpu);
if (!s.ok())
return s;
if (not enable_gpu)
return Status::OK();
struct GpuResourceSetting {
int64_t pinned_memory = 300 * M_BYTE;
int64_t temp_memory = 300 * M_BYTE;
@ -44,10 +54,8 @@ KnowhereResource::Initialize() {
};
using GpuResourcesArray = std::map<int64_t, GpuResourceSetting>;
GpuResourcesArray gpu_resources;
Status s;
// get build index gpu resource
server::Config& config = server::Config::GetInstance();
std::vector<int64_t> build_index_gpus;
s = config.GetGpuResourceConfigBuildIndexResources(build_index_gpus);
if (!s.ok())

View File

@ -306,9 +306,9 @@ TEST_F(MetaTest, TABLE_FILES_TEST) {
ASSERT_EQ(dated_files[table_file.date_].size(), 0);
std::vector<int> file_types;
std::vector<std::string> file_ids;
status = impl_->FilesByType(table.table_id_, file_types, file_ids);
ASSERT_TRUE(file_ids.empty());
milvus::engine::meta::TableFilesSchema table_files;
status = impl_->FilesByType(table.table_id_, file_types, table_files);
ASSERT_TRUE(table_files.empty());
ASSERT_FALSE(status.ok());
file_types = {
@ -317,11 +317,11 @@ TEST_F(MetaTest, TABLE_FILES_TEST) {
milvus::engine::meta::TableFileSchema::INDEX, milvus::engine::meta::TableFileSchema::RAW,
milvus::engine::meta::TableFileSchema::BACKUP,
};
status = impl_->FilesByType(table.table_id_, file_types, file_ids);
status = impl_->FilesByType(table.table_id_, file_types, table_files);
ASSERT_TRUE(status.ok());
uint64_t total_cnt = new_index_files_cnt + new_merge_files_cnt + backup_files_cnt + new_files_cnt + raw_files_cnt +
to_index_files_cnt + index_files_cnt;
ASSERT_EQ(file_ids.size(), total_cnt);
ASSERT_EQ(table_files.size(), total_cnt);
status = impl_->DeleteTableFiles(table_id);
ASSERT_TRUE(status.ok());

View File

@ -169,9 +169,9 @@ TEST_F(MySqlMetaTest, ARCHIVE_TEST_DAYS) {
std::vector<int> file_types = {
(int)milvus::engine::meta::TableFileSchema::NEW,
};
std::vector<std::string> file_ids;
status = impl.FilesByType(table_id, file_types, file_ids);
ASSERT_FALSE(file_ids.empty());
milvus::engine::meta::TableFilesSchema table_files;
status = impl.FilesByType(table_id, file_types, table_files);
ASSERT_FALSE(table_files.empty());
status = impl.UpdateTableFilesToIndex(table_id);
ASSERT_TRUE(status.ok());
@ -326,9 +326,9 @@ TEST_F(MySqlMetaTest, TABLE_FILES_TEST) {
ASSERT_EQ(dated_files[table_file.date_].size(), 0);
std::vector<int> file_types;
std::vector<std::string> file_ids;
status = impl_->FilesByType(table.table_id_, file_types, file_ids);
ASSERT_TRUE(file_ids.empty());
milvus::engine::meta::TableFilesSchema table_files;
status = impl_->FilesByType(table.table_id_, file_types, table_files);
ASSERT_TRUE(table_files.empty());
ASSERT_FALSE(status.ok());
file_types = {
@ -337,11 +337,11 @@ TEST_F(MySqlMetaTest, TABLE_FILES_TEST) {
milvus::engine::meta::TableFileSchema::INDEX, milvus::engine::meta::TableFileSchema::RAW,
milvus::engine::meta::TableFileSchema::BACKUP,
};
status = impl_->FilesByType(table.table_id_, file_types, file_ids);
status = impl_->FilesByType(table.table_id_, file_types, table_files);
ASSERT_TRUE(status.ok());
uint64_t total_cnt = new_index_files_cnt + new_merge_files_cnt + backup_files_cnt + new_files_cnt + raw_files_cnt +
to_index_files_cnt + index_files_cnt;
ASSERT_EQ(file_ids.size(), total_cnt);
ASSERT_EQ(table_files.size(), total_cnt);
status = impl_->DeleteTableFiles(table_id);
ASSERT_TRUE(status.ok());

View File

@ -132,8 +132,8 @@ BaseTest::SetUp() {
void
BaseTest::TearDown() {
milvus::cache::CpuCacheMgr::GetInstance()->ClearCache();
milvus::cache::GpuCacheMgr::GetInstance(0)->ClearCache();
#ifdef MILVUS_GPU_VERSION
milvus::cache::GpuCacheMgr::GetInstance(0)->ClearCache();
knowhere::FaissGpuResourceMgr::GetInstance().Free();
#endif
}

View File

@ -98,24 +98,25 @@ class SchedulerTest : public testing::Test {
protected:
void
SetUp() override {
res_mgr_ = std::make_shared<ResourceMgr>();
ResourcePtr disk = ResourceFactory::Create("disk", "DISK", 0, true, false);
ResourcePtr cpu = ResourceFactory::Create("cpu", "CPU", 0, true, false);
disk_resource_ = res_mgr_->Add(std::move(disk));
cpu_resource_ = res_mgr_->Add(std::move(cpu));
#ifdef MILVUS_GPU_VERSION
constexpr int64_t cache_cap = 1024 * 1024 * 1024;
cache::GpuCacheMgr::GetInstance(0)->SetCapacity(cache_cap);
cache::GpuCacheMgr::GetInstance(1)->SetCapacity(cache_cap);
ResourcePtr disk = ResourceFactory::Create("disk", "DISK", 0, true, false);
ResourcePtr cpu = ResourceFactory::Create("cpu", "CPU", 0, true, false);
ResourcePtr gpu_0 = ResourceFactory::Create("gpu0", "GPU", 0);
ResourcePtr gpu_1 = ResourceFactory::Create("gpu1", "GPU", 1);
res_mgr_ = std::make_shared<ResourceMgr>();
disk_resource_ = res_mgr_->Add(std::move(disk));
cpu_resource_ = res_mgr_->Add(std::move(cpu));
gpu_resource_0_ = res_mgr_->Add(std::move(gpu_0));
gpu_resource_1_ = res_mgr_->Add(std::move(gpu_1));
auto PCIE = Connection("IO", 11000.0);
res_mgr_->Connect("cpu", "gpu0", PCIE);
res_mgr_->Connect("cpu", "gpu1", PCIE);
#endif
scheduler_ = std::make_shared<Scheduler>(res_mgr_);
@ -138,17 +139,6 @@ class SchedulerTest : public testing::Test {
std::shared_ptr<Scheduler> scheduler_;
};
void
insert_dummy_index_into_gpu_cache(uint64_t device_id) {
MockVecIndex* mock_index = new MockVecIndex();
mock_index->ntotal_ = 1000;
engine::VecIndexPtr index(mock_index);
cache::DataObjPtr obj = std::static_pointer_cast<cache::DataObj>(index);
cache::GpuCacheMgr::GetInstance(device_id)->InsertItem("location", obj);
}
class SchedulerTest2 : public testing::Test {
protected:
void
@ -157,16 +147,13 @@ class SchedulerTest2 : public testing::Test {
ResourcePtr cpu0 = ResourceFactory::Create("cpu0", "CPU", 0, true, false);
ResourcePtr cpu1 = ResourceFactory::Create("cpu1", "CPU", 1, true, false);
ResourcePtr cpu2 = ResourceFactory::Create("cpu2", "CPU", 2, true, false);
ResourcePtr gpu0 = ResourceFactory::Create("gpu0", "GPU", 0, true, true);
ResourcePtr gpu1 = ResourceFactory::Create("gpu1", "GPU", 1, true, true);
res_mgr_ = std::make_shared<ResourceMgr>();
disk_ = res_mgr_->Add(std::move(disk));
cpu_0_ = res_mgr_->Add(std::move(cpu0));
cpu_1_ = res_mgr_->Add(std::move(cpu1));
cpu_2_ = res_mgr_->Add(std::move(cpu2));
gpu_0_ = res_mgr_->Add(std::move(gpu0));
gpu_1_ = res_mgr_->Add(std::move(gpu1));
auto IO = Connection("IO", 5.0);
auto PCIE1 = Connection("PCIE", 11.0);
auto PCIE2 = Connection("PCIE", 20.0);
@ -174,8 +161,15 @@ class SchedulerTest2 : public testing::Test {
res_mgr_->Connect("cpu0", "cpu1", IO);
res_mgr_->Connect("cpu1", "cpu2", IO);
res_mgr_->Connect("cpu0", "cpu2", IO);
#ifdef MILVUS_GPU_VERSION
ResourcePtr gpu0 = ResourceFactory::Create("gpu0", "GPU", 0, true, true);
ResourcePtr gpu1 = ResourceFactory::Create("gpu1", "GPU", 1, true, true);
gpu_0_ = res_mgr_->Add(std::move(gpu0));
gpu_1_ = res_mgr_->Add(std::move(gpu1));
res_mgr_->Connect("cpu1", "gpu0", PCIE1);
res_mgr_->Connect("cpu2", "gpu1", PCIE2);
#endif
scheduler_ = std::make_shared<Scheduler>(res_mgr_);

View File

@ -175,6 +175,7 @@ TEST(CacheTest, CPU_CACHE_TEST) {
cpu_mgr->PrintInfo();
}
#ifdef MILVUS_GPU_VERSION
TEST(CacheTest, GPU_CACHE_TEST) {
auto gpu_mgr = milvus::cache::GpuCacheMgr::GetInstance(0);
@ -202,6 +203,7 @@ TEST(CacheTest, GPU_CACHE_TEST) {
gpu_mgr->ClearCache();
ASSERT_EQ(gpu_mgr->ItemCount(), 0);
}
#endif
TEST(CacheTest, INVALID_TEST) {
{

View File

@ -25,6 +25,8 @@
#include "utils/StringHelpFunctions.h"
#include "utils/ValidationUtil.h"
#include <limits>
namespace {
static constexpr uint64_t KB = 1024;
@ -63,9 +65,21 @@ TEST_F(ConfigTest, CONFIG_TEST) {
int64_t port = server_config.GetInt64Value("port");
ASSERT_NE(port, 0);
server_config.SetValue("test", "2.5");
double test = server_config.GetDoubleValue("test");
ASSERT_EQ(test, 2.5);
server_config.SetValue("float_test", "2.5");
double dbl = server_config.GetDoubleValue("float_test");
ASSERT_LE(abs(dbl - 2.5), std::numeric_limits<double>::epsilon());
float flt = server_config.GetFloatValue("float_test");
ASSERT_LE(abs(flt - 2.5), std::numeric_limits<float>::epsilon());
server_config.SetValue("bool_test", "true");
bool blt = server_config.GetBoolValue("bool_test");
ASSERT_TRUE(blt);
server_config.SetValue("int_test", "34");
int32_t it32 = server_config.GetInt32Value("int_test");
ASSERT_EQ(it32, 34);
int64_t it64 = server_config.GetInt64Value("int_test");
ASSERT_EQ(it64, 34);
milvus::server::ConfigNode fake;
server_config.AddChild("fake", fake);
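
The new assertions exercise the typed getters on string-backed values; a rough Python analogue of the semantics:

```python
# ConfigNode stores raw strings; the typed getters parse on access.
store = {"float_test": "2.5", "bool_test": "true", "int_test": "34"}
assert float(store["float_test"]) == 2.5  # GetDoubleValue / GetFloatValue
assert store["bool_test"] == "true"       # GetBoolValue parses this as true
assert int(store["int_test"]) == 34       # GetInt32Value / GetInt64Value
```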
@ -236,6 +250,7 @@ TEST_F(ConfigTest, SERVER_CONFIG_VALID_TEST) {
ASSERT_TRUE(s.ok());
ASSERT_TRUE(int64_val == engine_omp_thread_num);
#ifdef MILVUS_GPU_VERSION
int64_t engine_gpu_search_threshold = 800;
s = config.SetEngineConfigGpuSearchThreshold(std::to_string(engine_gpu_search_threshold));
ASSERT_TRUE(s.ok());
@ -251,7 +266,6 @@ TEST_F(ConfigTest, SERVER_CONFIG_VALID_TEST) {
ASSERT_TRUE(s.ok());
ASSERT_TRUE(bool_val == resource_enable_gpu);
#ifdef MILVUS_GPU_VERSION
int64_t gpu_cache_capacity = 1;
s = config.SetGpuResourceConfigCacheCapacity(std::to_string(gpu_cache_capacity));
ASSERT_TRUE(s.ok());
@ -389,6 +403,7 @@ TEST_F(ConfigTest, SERVER_CONFIG_INVALID_TEST) {
s = config.SetEngineConfigOmpThreadNum("10000");
ASSERT_FALSE(s.ok());
#ifdef MILVUS_GPU_VERSION
s = config.SetEngineConfigGpuSearchThreshold("-1");
ASSERT_FALSE(s.ok());
@ -396,7 +411,6 @@ TEST_F(ConfigTest, SERVER_CONFIG_INVALID_TEST) {
s = config.SetGpuResourceConfigEnable("ok");
ASSERT_FALSE(s.ok());
#ifdef MILVUS_GPU_VERSION
s = config.SetGpuResourceConfigCacheCapacity("a");
ASSERT_FALSE(s.ok());
s = config.SetGpuResourceConfigCacheCapacity("128");

View File

@ -313,6 +313,9 @@ TEST_F(RpcHandlerTest, TABLES_TEST) {
std::vector<std::vector<float>> record_array;
BuildVectors(0, VECTOR_COUNT, record_array);
::milvus::grpc::VectorIds vector_ids;
for (int64_t i = 0; i < VECTOR_COUNT; i++) {
vector_ids.add_vector_id_array(i);
}
// Insert vectors
// test invalid table name
handler->Insert(&context, &request, &vector_ids);

View File

@ -120,7 +120,13 @@ TEST(UtilTest, STRINGFUNCTIONS_TEST) {
milvus::server::StringHelpFunctions::SplitStringByDelimeter(str, ",", result);
ASSERT_EQ(result.size(), 3UL);
std::string merge_str;
milvus::server::StringHelpFunctions::MergeStringWithDelimeter(result, ",", merge_str);
ASSERT_EQ(merge_str, "a,b,c");
result.clear();
milvus::server::StringHelpFunctions::MergeStringWithDelimeter(result, ",", merge_str);
ASSERT_TRUE(merge_str.empty());
auto status = milvus::server::StringHelpFunctions::SplitStringByQuote(str, ",", "\"", result);
ASSERT_TRUE(status.ok());
ASSERT_EQ(result.size(), 3UL);
@ -211,6 +217,11 @@ TEST(UtilTest, STATUS_TEST) {
str = status.ToString();
ASSERT_FALSE(str.empty());
status = milvus::Status(milvus::DB_INVALID_PATH, "mistake");
ASSERT_EQ(status.code(), milvus::DB_INVALID_PATH);
str = status.ToString();
ASSERT_FALSE(str.empty());
status = milvus::Status(milvus::DB_META_TRANSACTION_FAILED, "mistake");
ASSERT_EQ(status.code(), milvus::DB_META_TRANSACTION_FAILED);
str = status.ToString();
@ -261,6 +272,10 @@ TEST(ValidationUtilTest, VALIDATE_TABLENAME_TEST) {
table_name = std::string(10000, 'a');
status = milvus::server::ValidationUtil::ValidateTableName(table_name);
ASSERT_EQ(status.code(), milvus::SERVER_INVALID_TABLE_NAME);
table_name = "";
status = milvus::server::ValidationUtil::ValidatePartitionName(table_name);
ASSERT_EQ(status.code(), milvus::SERVER_INVALID_TABLE_NAME);
}
TEST(ValidationUtilTest, VALIDATE_DIMENSION_TEST) {

View File

@ -16,25 +16,25 @@
### Hardware and software environment
OS: CentOS Linux release 7.6.1810 (Core)
CPU: Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz
GPU0: GeForce GTX 1080
GPU1: GeForce GTX 1080
Memory: 503 GB
Docker version: 18.09
NVIDIA Driver version: 430.34
Milvus version: 0.5.3
SDK interface: Python 3.6.8
pymilvus version: 0.2.5
@ -51,7 +51,7 @@ pymilvus version: 0.2.5
### Test metrics
- Query Elapsed Time: time, in seconds, for the database to query all vectors. Variables that affect Query Elapsed Time:
- nq (number of vectors queried)
@ -59,7 +59,7 @@ pymilvus version: 0.2.5
>
> The number of queried vectors, nq, is grouped as [1, 5, 10, 200, 400, 600, 800, 1000].
- Recall: the proportion of correct results among those returned. Variables that affect Recall:
- nq (number of vectors queried)
- topk (the K most similar results per query)
@ -76,7 +76,7 @@ pymilvus version: 0.2.5
### Test environment
Dataset: sift1b, 1,000,000,000 vectors, 128-dimensional
Table attributes:
@ -143,7 +143,7 @@ search_resources: cpu, gpu0
| nq=800 | 23.24 |
| nq=1000 | 27.41 |
When nq is 1000, querying a single 128-dimensional vector in GPU mode takes about 27 ms.
When nq is 1000, querying a single 128-dimensional vector in CPU mode takes about 27 ms (27.41 s / 1000 queries ≈ 27 ms per query).

View File

@ -139,7 +139,7 @@ topk = 100
**Summary**
When nq is below 1200, query time grows quickly with nq; beyond 1200 it grows much more slowly. This is because gpu_search_threshold is set to 1200: when nq is below 1200 the CPU handles the query, otherwise the GPU does.
Query time in GPU mode has two components: (1) the time to copy the index from CPU to GPU, and (2) the time to search all buckets. When nq is below 500 the copy time cannot be amortized effectively, so CPU mode is the better choice; when nq is above 500, GPU mode is more reasonable. Compared with the CPU, the GPU has more cores and more compute power, and this computational advantage shows when nq is large.

View File

@ -54,7 +54,7 @@ Follow below steps to start a standalone Milvus instance with Mishards from sour
3. Start Milvus server.
```shell
$ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus:0.5.0-d102119-ede20b
$ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus
```
4. Update path permissions.

View File

@ -48,7 +48,7 @@ Python 3.6 or higher is required.
3. Start the Milvus server.
```shell
$ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus:0.5.0-d102119-ede20b
$ sudo nvidia-docker run --rm -d -p 19530:19530 -v /tmp/milvus/db:/opt/milvus/db milvusdb/milvus
```
4. Update the directory permissions.

View File

@ -3,14 +3,15 @@ services:
milvus_wr:
runtime: nvidia
restart: always
image: milvusdb/milvus:0.5.0-d102119-ede20b
image: milvusdb/milvus
volumes:
- /tmp/milvus/db:/opt/milvus/db
- ./wr_server.yml:/opt/milvus/conf/server_config.yaml
milvus_ro:
runtime: nvidia
restart: always
image: milvusdb/milvus:0.5.0-d102119-ede20b
image: milvusdb/milvus
volumes:
- /tmp/milvus/db:/opt/milvus/db
- ./ro_server.yml:/opt/milvus/conf/server_config.yaml

View File

@ -12,7 +12,7 @@ db_config:
# Keep 'dialect://:@:/', and replace other texts with real values
# Replace 'dialect' with 'mysql' or 'sqlite'
insert_buffer_size: 4 # GB, maximum insert buffer size allowed
insert_buffer_size: 1 # GB, maximum insert buffer size allowed
# sum of insert_buffer_size and cpu_cache_capacity cannot exceed total memory
preload_table: # preload data at startup, '*' means load all tables, empty value means no preload
@ -25,14 +25,14 @@ metric_config:
port: 8080 # port prometheus uses to fetch metrics
cache_config:
cpu_cache_capacity: 16 # GB, CPU memory used for cache
cpu_cache_capacity: 4 # GB, CPU memory used for cache
cpu_cache_threshold: 0.85 # percentage of data that will be kept when cache cleanup is triggered
gpu_cache_capacity: 4 # GB, GPU memory used for cache
gpu_cache_capacity: 1 # GB, GPU memory used for cache
gpu_cache_threshold: 0.85 # percentage of data that will be kept when cache cleanup is triggered
cache_insert_data: false # whether to load inserted data into cache
engine_config:
use_blas_threshold: 20 # if nq < use_blas_threshold, use SSE, faster with fluctuated response times
use_blas_threshold: 800 # if nq < use_blas_threshold, use SSE, faster with fluctuated response times
# if nq >= use_blas_threshold, use OpenBlas, slower with stable response times
resource_config:

View File

@ -0,0 +1,41 @@
server_config:
address: 0.0.0.0 # milvus server ip address (IPv4)
port: 19530 # port range: 1025 ~ 65534
deploy_mode: cluster_writable # deployment type: single, cluster_readonly, cluster_writable
time_zone: UTC+8
db_config:
primary_path: /opt/milvus # path used to store data and meta
secondary_path: # path used to store data only, split by semicolon
backend_url: sqlite://:@:/ # URI format: dialect://username:password@host:port/database
# Keep 'dialect://:@:/', and replace other texts with real values
# Replace 'dialect' with 'mysql' or 'sqlite'
insert_buffer_size: 2 # GB, maximum insert buffer size allowed
# sum of insert_buffer_size and cpu_cache_capacity cannot exceed total memory
preload_table: # preload data at startup, '*' means load all tables, empty value means no preload
# you can specify preload tables like this: table1,table2,table3
metric_config:
enable_monitor: false # enable monitoring or not
collector: prometheus # prometheus
prometheus_config:
port: 8080 # port prometheus uses to fetch metrics
cache_config:
cpu_cache_capacity: 2 # GB, CPU memory used for cache
cpu_cache_threshold: 0.85 # percentage of data that will be kept when cache cleanup is triggered
gpu_cache_capacity: 2 # GB, GPU memory used for cache
gpu_cache_threshold: 0.85 # percentage of data that will be kept when cache cleanup is triggered
cache_insert_data: false # whether to load inserted data into cache
engine_config:
use_blas_threshold: 800 # if nq < use_blas_threshold, use SSE, faster with fluctuated response times
# if nq >= use_blas_threshold, use OpenBlas, slower with stable response times
resource_config:
search_resources: # define the GPUs used for search computation, valid value: gpux
- gpu0
index_build_device: gpu0 # GPU used for building index

View File

@ -1,7 +1,7 @@
DEBUG=True
WOSERVER=tcp://127.0.0.1:19530
SERVER_PORT=19532
SERVER_PORT=19535
SERVER_TEST_PORT=19888
#SQLALCHEMY_DATABASE_URI=mysql+pymysql://root:root@127.0.0.1:3306/milvus?charset=utf8mb4
@ -19,7 +19,7 @@ TRACER_CLASS_NAME=jaeger
TRACING_SERVICE_NAME=fortest
TRACING_SAMPLER_TYPE=const
TRACING_SAMPLER_PARAM=1
TRACING_LOG_PAYLOAD=True
TRACING_LOG_PAYLOAD=False
#TRACING_SAMPLER_TYPE=probabilistic
#TRACING_SAMPLER_PARAM=0.5

View File

@ -0,0 +1,10 @@
def FileTransfer (sourceFiles, remoteDirectory, remoteIP, protocol = "ftp", makeEmptyDirs = true) {
if (protocol == "ftp") {
ftpPublisher masterNodeName: '', paramPublish: [parameterName: ''], alwaysPublishFromMaster: false, continueOnError: false, failOnError: true, publishers: [
[configName: "${remoteIP}", transfers: [
[asciiMode: false, cleanRemote: false, excludes: '', flatten: false, makeEmptyDirs: "${makeEmptyDirs}", noDefaultExcludes: false, patternSeparator: '[, ]+', remoteDirectory: "${remoteDirectory}", remoteDirectorySDF: false, removePrefix: '', sourceFiles: "${sourceFiles}"]], usePromotionTimestamp: true, useWorkspaceInPromotion: false, verbose: true
]
]
}
}
return this

View File

@ -0,0 +1,16 @@
timeout(time: 7200, unit: 'MINUTES') {
try {
dir ("milvu_ann_acc") {
print "Git clone url: ${TEST_URL}:${TEST_BRANCH}"
checkout([$class: 'GitSCM', branches: [[name: "${TEST_BRANCH}"]], doGenerateSubmoduleConfigurations: false, extensions: [], submoduleCfg: [], userRemoteConfigs: [[credentialsId: "${params.GIT_USER}", url: "${TEST_URL}", name: 'origin', refspec: "+refs/heads/${TEST_BRANCH}:refs/remotes/origin/${TEST_BRANCH}"]]])
print "Install requirements"
sh 'python3 -m pip install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com'
// sleep(120000)
sh "python3 main.py --suite=${params.SUITE} --host=acc-test-${env.JOB_NAME}-${env.BUILD_NUMBER}-engine.milvus.svc.cluster.local --port=19530"
}
} catch (exc) {
echo 'Milvus Ann Accuracy Test Failed !'
throw exc
}
}

View File

@ -0,0 +1,13 @@
try {
def result = sh script: "helm status ${env.JOB_NAME}-${env.BUILD_NUMBER}", returnStatus: true
if (!result) {
sh "helm del --purge ${env.JOB_NAME}-${env.BUILD_NUMBER}"
}
} catch (exc) {
def result = sh script: "helm status ${env.JOB_NAME}-${env.BUILD_NUMBER}", returnStatus: true
if (!result) {
sh "helm del --purge ${env.JOB_NAME}-${env.BUILD_NUMBER}"
}
throw exc
}

View File

@ -0,0 +1,22 @@
timeout(time: 30, unit: 'MINUTES') {
try {
dir ("milvus") {
sh 'helm init --client-only --skip-refresh --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts'
sh 'helm repo update'
checkout([$class: 'GitSCM', branches: [[name: "${HELM_BRANCH}"]], userRemoteConfigs: [[url: "${HELM_URL}", name: 'origin', refspec: "+refs/heads/${HELM_BRANCH}:refs/remotes/origin/${HELM_BRANCH}"]]])
dir ("milvus") {
sh "helm install --wait --timeout 300 --set engine.image.tag=${IMAGE_TAG} --set expose.type=clusterIP --name acc-test-${env.JOB_NAME}-${env.BUILD_NUMBER} -f ci/db_backend/sqlite_${params.IMAGE_TYPE}_values.yaml -f ci/filebeat/values.yaml --namespace milvus --version ${HELM_BRANCH} ."
}
}
// dir ("milvus") {
// checkout([$class: 'GitSCM', branches: [[name: "${env.SERVER_BRANCH}"]], userRemoteConfigs: [[url: "${env.SERVER_URL}", name: 'origin', refspec: "+refs/heads/${env.SERVER_BRANCH}:refs/remotes/origin/${env.SERVER_BRANCH}"]]])
// dir ("milvus") {
// load "ci/jenkins/step/deploySingle2Dev.groovy"
// }
// }
} catch (exc) {
echo 'Deploy Milvus Server Failed !'
throw exc
}
}

View File

@ -0,0 +1,15 @@
def notify() {
if (!currentBuild.resultIsBetterOrEqualTo('SUCCESS')) {
// Send an email only if the build status has changed from green/unstable to red
emailext subject: '$DEFAULT_SUBJECT',
body: '$DEFAULT_CONTENT',
recipientProviders: [
[$class: 'DevelopersRecipientProvider'],
[$class: 'RequesterRecipientProvider']
],
replyTo: '$DEFAULT_REPLYTO',
to: '$DEFAULT_RECIPIENTS'
}
}
return this

View File

@ -0,0 +1,130 @@
pipeline {
agent none
options {
timestamps()
}
parameters{
choice choices: ['cpu', 'gpu'], description: 'cpu or gpu version', name: 'IMAGE_TYPE'
string defaultValue: '0.6.0', description: 'server image version', name: 'IMAGE_VERSION', trim: true
string defaultValue: 'suite.yaml', description: 'test suite config yaml', name: 'SUITE', trim: true
string defaultValue: '09509e53-9125-4f5d-9ce8-42855987ad67', description: 'git credentials', name: 'GIT_USER', trim: true
}
environment {
IMAGE_TAG = "${params.IMAGE_VERSION}-${params.IMAGE_TYPE}-ubuntu18.04-release"
HELM_URL = "https://github.com/milvus-io/milvus-helm.git"
HELM_BRANCH = "0.6.0"
TEST_URL = "git@192.168.1.105:Test/milvus_ann_acc.git"
TEST_BRANCH = "0.6.0"
}
stages {
stage("Setup env") {
agent {
kubernetes {
label 'dev-test'
defaultContainer 'jnlp'
yaml """
apiVersion: v1
kind: Pod
metadata:
labels:
app: milvus
componet: test
spec:
containers:
- name: milvus-testframework
image: registry.zilliz.com/milvus/milvus-test:v0.2
command:
- cat
tty: true
volumeMounts:
- name: kubeconf
mountPath: /root/.kube/
readOnly: true
- name: hdf5-path
mountPath: /test
readOnly: true
volumes:
- name: kubeconf
secret:
secretName: test-cluster-config
- name: hdf5-path
flexVolume:
driver: "fstab/cifs"
fsType: "cifs"
secretRef:
name: "cifs-test-secret"
options:
networkPath: "//192.168.1.126/test"
mountOptions: "vers=1.0"
"""
}
}
stages {
stage("Deploy Default Server") {
steps {
gitlabCommitStatus(name: 'Accuracy Test') {
container('milvus-testframework') {
script {
print "In Deploy Default Server Stage"
load "${env.WORKSPACE}/ci/jenkinsfile/deploy_default_server.groovy"
}
}
}
}
}
stage("Acc Test") {
steps {
gitlabCommitStatus(name: 'Accuracy Test') {
container('milvus-testframework') {
script {
print "In Acc test stage"
load "${env.WORKSPACE}/ci/jenkinsfile/acc_test.groovy"
}
}
}
}
}
stage ("Cleanup Env") {
steps {
gitlabCommitStatus(name: 'Cleanup Env') {
container('milvus-testframework') {
script {
load "${env.WORKSPACE}/ci/jenkinsfile/cleanup.groovy"
}
}
}
}
}
}
post {
always {
container('milvus-testframework') {
script {
load "${env.WORKSPACE}/ci/jenkinsfile/cleanup.groovy"
}
}
}
success {
script {
echo "Milvus ann-accuracy test success !"
}
}
aborted {
script {
echo "Milvus ann-accuracy test aborted !"
}
}
failure {
script {
echo "Milvus ann-accuracy test failed !"
}
}
}
}
}
}

View File

@ -0,0 +1,13 @@
apiVersion: v1
kind: Pod
metadata:
labels:
app: milvus
componet: testframework
spec:
containers:
- name: milvus-testframework
image: registry.zilliz.com/milvus/milvus-test:v0.2
command:
- cat
tty: true

View File

@ -8,7 +8,7 @@ import numpy
import sklearn.preprocessing
from milvus import Milvus, IndexType, MetricType
logger = logging.getLogger("milvus_ann_acc.client")
logger = logging.getLogger("milvus_acc.client")
SERVER_HOST_DEFAULT = "127.0.0.1"
SERVER_PORT_DEFAULT = 19530
@ -28,17 +28,17 @@ def time_wrapper(func):
class MilvusClient(object):
def __init__(self, table_name=None, ip=None, port=None):
def __init__(self, table_name=None, host=None, port=None):
self._milvus = Milvus()
self._table_name = table_name
try:
if not ip:
if not host:
self._milvus.connect(
host = SERVER_HOST_DEFAULT,
port = SERVER_PORT_DEFAULT)
else:
self._milvus.connect(
host = ip,
host = host,
port = port)
except Exception as e:
raise e
@ -113,7 +113,6 @@ class MilvusClient(object):
X = X.astype(numpy.float32)
status, results = self._milvus.search_vectors(self._table_name, top_k, nprobe, X.tolist())
self.check_status(status)
# logger.info(results[0])
ids = []
for result in results:
tmp_ids = []
@ -125,24 +124,20 @@ class MilvusClient(object):
def count(self):
return self._milvus.get_table_row_count(self._table_name)[1]
def delete(self, timeout=60):
logger.info("Start delete table: %s" % self._table_name)
self._milvus.delete_table(self._table_name)
i = 0
while i < timeout:
if self.count():
time.sleep(1)
i = i + 1
else:
break
if i >= timeout:
logger.error("Delete table timeout")
def delete(self, table_name):
logger.info("Start delete table: %s" % table_name)
return self._milvus.delete_table(table_name)
def describe(self):
return self._milvus.describe_table(self._table_name)
def exists_table(self):
return self._milvus.has_table(self._table_name)
def exists_table(self, table_name):
return self._milvus.has_table(table_name)
def get_server_version(self):
status, res = self._milvus.server_version()
self.check_status(status)
return res
@time_wrapper
def preload_table(self):

View File

@ -1,26 +1,57 @@
import os
import sys
import argparse
from yaml import load, dump
import logging
from logging import handlers
from client import MilvusClient
import runner
LOG_FOLDER = "logs"
logger = logging.getLogger("milvus_acc")
formatter = logging.Formatter('[%(asctime)s] [%(levelname)-4s] [%(pathname)s:%(lineno)d] %(message)s')
if not os.path.exists(LOG_FOLDER):
os.system('mkdir -p %s' % LOG_FOLDER)
fileTimeHandler = handlers.TimedRotatingFileHandler(os.path.join(LOG_FOLDER, 'acc'), "D", 1, 10)
fileTimeHandler.suffix = "%Y%m%d.log"
fileTimeHandler.setFormatter(formatter)
logging.basicConfig(level=logging.DEBUG)
fileTimeHandler.setFormatter(formatter)
logger.addHandler(fileTimeHandler)
def main():
parser = argparse.ArgumentParser(
formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument(
'--dataset',
metavar='NAME',
help='the dataset to load training points from',
default='glove-100-angular',
choices=DATASETS.keys())
"--host",
default="127.0.0.1",
help="server host")
parser.add_argument(
"-k", "--count",
default=10,
type=positive_int,
help="the number of near neighbours to search for")
"--port",
default=19530,
help="server port")
parser.add_argument(
'--definitions',
'--suite',
metavar='FILE',
help='load algorithm definitions from FILE',
default='algos.yaml')
parser.add_argument(
'--image-tag',
default=None,
help='pull image first')
help='load config definitions from suite_czr'
'.yaml',
default='suite_czr.yaml')
args = parser.parse_args()
if args.suite:
with open(args.suite, "r") as f:
suite = load(f)
hdf5_path = suite["hdf5_path"]
dataset_configs = suite["datasets"]
if not hdf5_path or not dataset_configs:
logger.warning("No datasets given")
sys.exit()
f.close()
for dataset_config in dataset_configs:
logger.debug(dataset_config)
milvus_instance = MilvusClient(host=args.host, port=args.port)
runner.run(milvus_instance, dataset_config, hdf5_path)
if __name__ == "__main__":
main()
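
With these options the runner is typically invoked as `python3 main.py --suite=suite.yaml --host=<server-ip> --port=19530`, matching the call in the acc_test.groovy stage earlier in this change.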

View File

@ -2,3 +2,8 @@ numpy==1.16.3
pymilvus>=0.2.0
scikit-learn==0.19.1
h5py==2.7.1
influxdb==5.2.2
pyyaml==3.12
tableprint==0.8.0
ansicolors==1.1.8
scipy==1.3.1

View File

@ -0,0 +1,162 @@
import os
import pdb
import time
import random
import sys
import logging
import h5py
import numpy
from influxdb import InfluxDBClient
INSERT_INTERVAL = 100000
# s
DELETE_INTERVAL_TIME = 5
INFLUXDB_HOST = "192.168.1.194"
INFLUXDB_PORT = 8086
INFLUXDB_USER = "admin"
INFLUXDB_PASSWD = "admin"
INFLUXDB_NAME = "test_result"
influxdb_client = InfluxDBClient(host=INFLUXDB_HOST, port=INFLUXDB_PORT, username=INFLUXDB_USER, password=INFLUXDB_PASSWD, database=INFLUXDB_NAME)
logger = logging.getLogger("milvus_acc.runner")
def parse_dataset_name(dataset_name):
data_type = dataset_name.split("-")[0]
dimension = int(dataset_name.split("-")[1])
metric = dataset_name.split("-")[-1]
# metric = dataset.attrs['distance']
# dimension = len(dataset["train"][0])
if metric == "euclidean":
metric_type = "l2"
elif metric == "angular":
metric_type = "ip"
return ("ann"+data_type, dimension, metric_type)
def get_dataset(hdf5_path, dataset_name):
file_path = os.path.join(hdf5_path, '%s.hdf5' % dataset_name)
if not os.path.exists(file_path):
raise Exception("%s not existed" % file_path)
dataset = h5py.File(file_path)
return dataset
def get_table_name(hdf5_path, dataset_name, index_file_size):
data_type, dimension, metric_type = parse_dataset_name(dataset_name)
dataset = get_dataset(hdf5_path, dataset_name)
table_size = len(dataset["train"])
table_size = str(table_size // 1000000)+"m"
table_name = data_type+'_'+table_size+'_'+str(index_file_size)+'_'+str(dimension)+'_'+metric_type
return table_name
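# Worked example (hypothetical 1,000,000-row train set):
# parse_dataset_name("sift-128-euclidean") returns ("annsift", 128, "l2"),
# table_size becomes str(1000000 // 1000000) + "m" == "1m", and with
# index_file_size=1024 the table name is "annsift_1m_1024_128_l2".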
def recall_calc(result_ids, true_ids, top_k, recall_k):
sum_intersect_num = 0
recall = 0.0
for index, result_item in enumerate(result_ids):
if len(set(true_ids[index][:top_k])) != len(set(result_item)):
logger.warning("Error happened: query result length is wrong")
continue
tmp = set(true_ids[index][:recall_k]).intersection(set(result_item))
sum_intersect_num = sum_intersect_num + len(tmp)
recall = round(sum_intersect_num / (len(result_ids) * recall_k), 4)
return recall
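# Worked example: result_ids = [[1, 2, 3], [7, 8, 9]],
# true_ids = [[1, 5, 2], [9, 7, 8]], top_k=3, recall_k=2.
# Query 0 hits {1}; query 1 hits {7, 9}; recall = (1 + 2) / (2 * 2) = 0.75.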
def run(milvus, config, hdf5_path, force=True):
server_version = milvus.get_server_version()
logger.info(server_version)
for dataset_name, config_value in config.items():
dataset = get_dataset(hdf5_path, dataset_name)
index_file_sizes = config_value["index_file_sizes"]
index_types = config_value["index_types"]
nlists = config_value["nlists"]
search_param = config_value["search_param"]
top_ks = search_param["top_ks"]
nprobes = search_param["nprobes"]
nqs = search_param["nqs"]
for index_file_size in index_file_sizes:
table_name = get_table_name(hdf5_path, dataset_name, index_file_size)
if milvus.exists_table(table_name):
if force is True:
logger.info("Re-create table: %s" % table_name)
milvus.delete(table_name)
time.sleep(DELETE_INTERVAL_TIME)
else:
logger.warning("Table name: %s existed" % table_name)
continue
data_type, dimension, metric_type = parse_dataset_name(dataset_name)
milvus.create_table(table_name, dimension, index_file_size, metric_type)
logger.info(milvus.describe())
insert_vectors = numpy.array(dataset["train"])
# milvus.insert(insert_vectors)
loops = len(insert_vectors) // INSERT_INTERVAL + 1
for i in range(loops):
start = i*INSERT_INTERVAL
end = min((i+1)*INSERT_INTERVAL, len(insert_vectors))
tmp_vectors = insert_vectors[start:end]
if start < end:
milvus.insert(tmp_vectors, ids=[i for i in range(start, end)])
time.sleep(20)
row_count = milvus.count()
logger.info("Table: %s, row count: %s" % (table_name, row_count))
if milvus.count() != len(insert_vectors):
logger.error("Table row count is not equal to insert vectors")
return
for index_type in index_types:
for nlist in nlists:
milvus.create_index(index_type, nlist)
logger.info(milvus.describe_index())
logger.info("Start preload table: %s, index_type: %s, nlist: %s" % (table_name, index_type, nlist))
milvus.preload_table()
true_ids = numpy.array(dataset["neighbors"])
for nprobe in nprobes:
for nq in nqs:
query_vectors = numpy.array(dataset["test"][:nq])
for top_k in top_ks:
rec1 = 0.0
rec10 = 0.0
rec100 = 0.0
result_ids = milvus.query(query_vectors, top_k, nprobe)
logger.info("Query result: %s" % len(result_ids))
rec1 = recall_calc(result_ids, true_ids, top_k, 1)
if top_k == 10:
rec10 = recall_calc(result_ids, true_ids, top_k, 10)
if top_k == 100:
rec10 = recall_calc(result_ids, true_ids, top_k, 10)
rec100 = recall_calc(result_ids, true_ids, top_k, 100)
avg_radio = recall_calc(result_ids, true_ids, top_k, top_k)
logger.debug("Recall_1: %s" % rec1)
logger.debug("Recall_10: %s" % rec10)
logger.debug("Recall_100: %s" % rec100)
logger.debug("Accuracy: %s" % avg_radio)
acc_record = [{
"measurement": "accuracy",
"tags": {
"server_version": server_version,
"dataset": dataset_name,
"index_file_size": index_file_size,
"index_type": index_type,
"nlist": nlist,
"search_nprobe": nprobe,
"top_k": top_k,
"nq": len(query_vectors)
},
# "time": time.ctime(),
"time": time.strftime("%Y-%m-%dT%H:%M:%SZ"),
"fields": {
"recall1": rec1,
"recall10": rec10,
"recall100": rec100,
"avg_radio": avg_radio
}
}]
logger.info(acc_record)
try:
res = influxdb_client.write_points(acc_record)
except Exception as e:
logger.error("Insert infuxdb failed: %s" % str(e))

View File

@ -0,0 +1,29 @@
datasets:
- sift-128-euclidean:
index_file_sizes: [50, 1024]
index_types: ['ivf_flat', 'ivf_sq8', 'ivf_sq8h']
# index_types: ['ivf_sq8']
nlists: [16384]
search_param:
nprobes: [1, 32, 128, 256]
top_ks: [10]
nqs: [10000]
- glove-25-angular:
index_file_sizes: [50, 1024]
index_types: ['ivf_flat', 'ivf_sq8', 'ivf_sq8h']
# index_types: ['ivf_sq8']
nlists: [16384]
search_param:
nprobes: [1, 32, 128, 256]
top_ks: [10]
nqs: [10000]
- glove-200-angular:
index_file_sizes: [50, 1024]
index_types: ['ivf_flat', 'ivf_sq8', 'ivf_sq8h']
# index_types: ['ivf_sq8']
nlists: [16384]
search_param:
nprobes: [1, 32, 128, 256]
top_ks: [10]
nqs: [10000]
hdf5_path: /test/milvus/ann_hdf5/
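
Each suite entry names a dataset plus the index/search grid to sweep, and hdf5_path points at the directory holding the <dataset>.hdf5 files. A minimal sketch of how main.py consumes such a file (file name assumed):

```python
import yaml

with open("suite.yaml") as f:              # any of the suite_*.yaml files above
    suite = yaml.safe_load(f)

hdf5_path = suite["hdf5_path"]             # directory with <dataset>.hdf5 files
for dataset_config in suite["datasets"]:   # each entry is {name: sweep grid}
    (name, grid), = dataset_config.items()
    print(name, grid["index_types"], grid["search_param"]["nqs"])
```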

View File

@ -0,0 +1,11 @@
datasets:
- glove-200-angular:
index_file_sizes: [1024]
index_types: ['ivf_sq8']
# index_types: ['ivf_sq8']
nlists: [16384]
search_param:
nprobes: [256, 400, 256]
top_ks: [100]
nqs: [10000]
hdf5_path: /test/milvus/ann_hdf5/

View File

@ -0,0 +1,20 @@
datasets:
- sift-128-euclidean:
index_file_sizes: [1024]
index_types: ['ivf_sq8', 'ivf_sq8h']
# index_types: ['ivf_sq8']
nlists: [16384]
search_param:
nprobes: [16, 128, 1024]
top_ks: [1, 10, 100]
nqs: [10, 100, 1000]
- glove-200-angular:
index_file_sizes: [1024]
index_types: ['ivf_sq8', 'ivf_sq8h']
# index_types: ['ivf_sq8']
nlists: [16384]
search_param:
nprobes: [16, 128, 1024]
top_ks: [1, 10, 100]
nqs: [10, 100, 1000]
hdf5_path: /test/milvus/ann_hdf5/

View File

@ -0,0 +1,10 @@
datasets:
- sift-128-euclidean:
index_file_sizes: [1024]
index_types: ['ivf_flat']
nlists: [16384]
search_param:
nprobes: [1, 256]
top_ks: [10]
nqs: [10000]
hdf5_path: /test/milvus/ann_hdf5/

View File

@ -1,132 +1,33 @@
import os
import pdb
import time
import random
import sys
import h5py
import numpy
import logging
from logging import handlers
from influxdb import InfluxDBClient
from client import MilvusClient
INFLUXDB_HOST = "192.168.1.194"
INFLUXDB_PORT = 8086
INFLUXDB_USER = "admin"
INFLUXDB_PASSWD = "admin"
INFLUXDB_NAME = "test_result"
LOG_FOLDER = "logs"
logger = logging.getLogger("milvus_ann_acc")
client = InfluxDBClient(host=INFLUXDB_HOST, port=INFLUXDB_PORT, username=INFLUXDB_USER, password=INFLUXDB_PASSWD, database=INFLUXDB_NAME)
formatter = logging.Formatter('[%(asctime)s] [%(levelname)-4s] [%(pathname)s:%(lineno)d] %(message)s')
if not os.path.exists(LOG_FOLDER):
os.system('mkdir -p %s' % LOG_FOLDER)
fileTimeHandler = handlers.TimedRotatingFileHandler(os.path.join(LOG_FOLDER, 'acc'), "D", 1, 10)
fileTimeHandler.suffix = "%Y%m%d.log"
fileTimeHandler.setFormatter(formatter)
logging.basicConfig(level=logging.DEBUG)
fileTimeHandler.setFormatter(formatter)
logger.addHandler(fileTimeHandler)
def get_dataset_fn(dataset_name):
file_path = "/test/milvus/ann_hdf5/"
if not os.path.exists(file_path):
raise Exception("%s not exists" % file_path)
return os.path.join(file_path, '%s.hdf5' % dataset_name)
def get_dataset(dataset_name):
hdf5_fn = get_dataset_fn(dataset_name)
hdf5_f = h5py.File(hdf5_fn)
return hdf5_f
def parse_dataset_name(dataset_name):
data_type = dataset_name.split("-")[0]
dimension = int(dataset_name.split("-")[1])
metric = dataset_name.split("-")[-1]
# metric = dataset.attrs['distance']
# dimension = len(dataset["train"][0])
if metric == "euclidean":
metric_type = "l2"
elif metric == "angular":
metric_type = "ip"
return ("ann"+data_type, dimension, metric_type)
def get_table_name(dataset_name, index_file_size):
data_type, dimension, metric_type = parse_dataset_name(dataset_name)
dataset = get_dataset(dataset_name)
table_size = len(dataset["train"])
table_size = str(table_size // 1000000)+"m"
table_name = data_type+'_'+table_size+'_'+str(index_file_size)+'_'+str(dimension)+'_'+metric_type
return table_name
def main(dataset_name, index_file_size, nlist=16384, force=False):
top_k = 10
nprobes = [32, 128]
dataset = get_dataset(dataset_name)
table_name = get_table_name(dataset_name, index_file_size)
m = MilvusClient(table_name)
if m.exists_table():
if force is True:
logger.info("Re-create table: %s" % table_name)
m.delete()
time.sleep(10)
else:
logger.info("Table name: %s existed" % table_name)
return
data_type, dimension, metric_type = parse_dataset_name(dataset_name)
m.create_table(table_name, dimension, index_file_size, metric_type)
print(m.describe())
vectors = numpy.array(dataset["train"])
query_vectors = numpy.array(dataset["test"])
# m.insert(vectors)
interval = 100000
loops = len(vectors) // interval + 1
for i in range(loops):
start = i*interval
end = min((i+1)*interval, len(vectors))
tmp_vectors = vectors[start:end]
if start < end:
m.insert(tmp_vectors, ids=[i for i in range(start, end)])
time.sleep(60)
print(m.count())
for index_type in ["ivf_flat", "ivf_sq8", "ivf_sq8h"]:
m.create_index(index_type, nlist)
print(m.describe_index())
if m.count() != len(vectors):
return
m.preload_table()
true_ids = numpy.array(dataset["neighbors"])
for nprobe in nprobes:
print("nprobe: %s" % nprobe)
sum_radio = 0.0; avg_radio = 0.0
result_ids = m.query(query_vectors, top_k, nprobe)
# print(result_ids[:10])
for index, result_item in enumerate(result_ids):
if len(set(true_ids[index][:top_k])) != len(set(result_item)):
logger.info("Error happened")
# logger.info(query_vectors[index])
# logger.info(true_ids[index][:top_k], result_item)
tmp = set(true_ids[index][:top_k]).intersection(set(result_item))
sum_radio = sum_radio + (len(tmp) / top_k)
avg_radio = round(sum_radio / len(result_ids), 4)
logger.info(avg_radio)
m.drop_index()
if __name__ == "__main__":
print("glove-25-angular")
# main("sift-128-euclidean", 1024, force=True)
for index_file_size in [50, 1024]:
print("Index file size: %d" % index_file_size)
main("glove-25-angular", index_file_size, force=True)
print("sift-128-euclidean")
for index_file_size in [50, 1024]:
print("Index file size: %d" % index_file_size)
main("sift-128-euclidean", index_file_size, force=True)
# m = MilvusClient()
print(client.get_list_database())
acc_record = [{
"measurement": "accuracy",
"tags": {
"server_version": "0.4.3",
"dataset": "test",
"index_type": "test",
"nlist": 12,
"search_nprobe": 12,
"top_k": 1,
"nq": 1
},
"time": time.ctime(),
"fields": {
"accuracy": 0.1
}
}]
try:
res = client.write_points(acc_record)
print(res)
except Exception as e:
print(str(e))

View File

@ -4,6 +4,6 @@ log_format = [%(asctime)s-%(levelname)s-%(name)s]: %(message)s (%(filename)s:%(l
log_cli = true
log_level = 20
timeout = 300
timeout = 600
level = 1

View File

@ -22,4 +22,4 @@ wcwidth==0.1.7
wrapt==1.11.1
zipp==0.5.1
scikit-learn>=0.19.1
pymilvus-test>=0.2.0
pymilvus-test>=0.2.0

View File

@ -17,7 +17,6 @@ allure-pytest==2.7.0
pytest-print==0.1.2
pytest-level==0.1.1
six==1.12.0
thrift==0.11.0
typed-ast==1.3.5
wcwidth==0.1.7
wrapt==1.11.1

View File

@ -15,7 +15,7 @@ table_id = "test_add"
ADD_TIMEOUT = 60
nprobe = 1
epsilon = 0.0001
tag = "1970-01-01"
class TestAddBase:
"""
@ -186,6 +186,7 @@ class TestAddBase:
expected: status ok
'''
index_param = get_simple_index_params
logging.getLogger().info(index_param)
vector = gen_single_vector(dim)
status, ids = connect.add_vectors(table, vector)
status = connect.create_index(table, index_param)
@ -439,6 +440,80 @@ class TestAddBase:
assert status.OK()
assert len(ids) == nq
@pytest.mark.timeout(ADD_TIMEOUT)
def test_add_vectors_tag(self, connect, table):
'''
target: test add vectors in table created before
method: create table and add vectors in it, with the partition_tag param
expected: the table row count equals to nq
'''
nq = 5
partition_name = gen_unique_str()
vectors = gen_vectors(nq, dim)
status = connect.create_partition(table, partition_name, tag)
status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
assert status.OK()
assert len(ids) == nq
@pytest.mark.timeout(ADD_TIMEOUT)
def test_add_vectors_tag_A(self, connect, table):
'''
target: test add vectors in table created before
method: create partition and add vectors in it
expected: the table row count equals to nq
'''
nq = 5
partition_name = gen_unique_str()
vectors = gen_vectors(nq, dim)
status = connect.create_partition(table, partition_name, tag)
status, ids = connect.add_vectors(partition_name, vectors)
assert status.OK()
assert len(ids) == nq
@pytest.mark.timeout(ADD_TIMEOUT)
def test_add_vectors_tag_not_existed(self, connect, table):
'''
target: test add vectors in table created before
method: create table and add vectors in it, with a nonexistent partition_tag param
expected: status not ok
'''
nq = 5
vectors = gen_vectors(nq, dim)
status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
assert not status.OK()
@pytest.mark.timeout(ADD_TIMEOUT)
def test_add_vectors_tag_not_existed_A(self, connect, table):
'''
target: test add vectors in table created before
method: create partition, add vectors with a nonexistent partition_tag param
expected: status not ok
'''
nq = 5
vectors = gen_vectors(nq, dim)
new_tag = "new_tag"
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status, ids = connect.add_vectors(table, vectors, partition_tag=new_tag)
assert not status.OK()
@pytest.mark.timeout(ADD_TIMEOUT)
def test_add_vectors_tag_existed(self, connect, table):
'''
target: test add vectors in table created before
method: create table and add vectors in it repeatedly, with the partition_tag param
expected: the table row count equals to nq
'''
nq = 5
partition_name = gen_unique_str()
vectors = gen_vectors(nq, dim)
status = connect.create_partition(table, partition_name, tag)
status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
for i in range(5):
status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
assert status.OK()
assert len(ids) == nq
@pytest.mark.level(2)
def test_add_vectors_without_connect(self, dis_connect, table):
'''
@ -1198,7 +1273,8 @@ class TestAddAdvance:
assert len(ids) == nb
assert status.OK()
class TestAddTableNameInvalid(object):
class TestNameInvalid(object):
"""
Test adding vectors with invalid table names
"""
@ -1209,13 +1285,27 @@ class TestAddTableNameInvalid(object):
def get_table_name(self, request):
yield request.param
@pytest.fixture(
scope="function",
params=gen_invalid_table_names()
)
def get_tag_name(self, request):
yield request.param
@pytest.mark.level(2)
def test_add_vectors_with_invalid_tablename(self, connect, get_table_name):
def test_add_vectors_with_invalid_table_name(self, connect, get_table_name):
table_name = get_table_name
vectors = gen_vectors(1, dim)
status, result = connect.add_vectors(table_name, vectors)
assert not status.OK()
@pytest.mark.level(2)
def test_add_vectors_with_invalid_tag_name(self, connect, table, get_tag_name):
tag_name = get_tag_name
vectors = gen_vectors(1, dim)
status, result = connect.add_vectors(table, vectors, partition_tag=tag_name)
assert not status.OK()
class TestAddTableVectorsInvalid(object):
single_vector = gen_single_vector(dim)

View File

@ -149,7 +149,7 @@ class TestConnect:
milvus.connect(uri=uri_value, timeout=1)
assert not milvus.connected()
# TODO: enable
# disable
def _test_connect_with_multiprocess(self, args):
'''
target: test uri connect with multiprocess
@ -157,7 +157,7 @@ class TestConnect:
expected: all connection is connected
'''
uri_value = "tcp://%s:%s" % (args["ip"], args["port"])
process_num = 4
process_num = 10
processes = []
def connect(milvus):
@ -248,7 +248,7 @@ class TestConnect:
expected: connect raise an exception and connected is false
'''
milvus = Milvus()
uri_value = "tcp://%s:19540" % args["ip"]
uri_value = "tcp://%s:39540" % args["ip"]
with pytest.raises(Exception) as e:
milvus.connect(host=args["ip"], port="", uri=uri_value)
@ -264,6 +264,7 @@ class TestConnect:
milvus.connect(host="", port=args["port"], uri=uri_value, timeout=1)
assert not milvus.connected()
# Disable, (issue: https://github.com/milvus-io/milvus/issues/288)
def test_connect_param_priority_both_hostip_uri(self, args):
'''
target: both host_ip_port / uri are both given, and not null, use the uri params
@ -273,8 +274,9 @@ class TestConnect:
milvus = Milvus()
uri_value = "tcp://%s:%s" % (args["ip"], args["port"])
with pytest.raises(Exception) as e:
milvus.connect(host=args["ip"], port=19540, uri=uri_value, timeout=1)
assert not milvus.connected()
res = milvus.connect(host=args["ip"], port=39540, uri=uri_value, timeout=1)
logging.getLogger().info(res)
# assert not milvus.connected()
def _test_add_vector_and_disconnect_concurrently(self):
'''

View File

@ -20,6 +20,7 @@ vectors = sklearn.preprocessing.normalize(vectors, axis=1, norm='l2')
vectors = vectors.tolist()
BUILD_TIMEOUT = 60
nprobe = 1
tag = "1970-01-01"
class TestIndexBase:
@ -62,6 +63,21 @@ class TestIndexBase:
status = connect.create_index(table, index_params)
assert status.OK()
@pytest.mark.timeout(BUILD_TIMEOUT)
def test_create_index_partition(self, connect, table, get_index_params):
'''
target: test create index interface
method: create table, create partition, and add vectors in it, create index
expected: return code equals to 0, and search success
'''
partition_name = gen_unique_str()
index_params = get_index_params
logging.getLogger().info(index_params)
status = connect.create_partition(table, partition_name, tag)
status, ids = connect.add_vectors(table, vectors, partition_tag=tag)
status = connect.create_index(table, index_params)
assert status.OK()
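# create_index takes the table (or partition) name plus a params dict; a
# minimal sketch with the IVFLAT/nlist pair used throughout this file,
# assuming a local server and an existing table named "demo_table":
from milvus import Milvus, IndexType

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
index_params = {'index_type': IndexType.IVFLAT, 'nlist': 16384}
status = client.create_index("demo_table", index_params)
assert status.OK()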
@pytest.mark.level(2)
def test_create_index_without_connect(self, dis_connect, table):
'''
@ -555,6 +571,21 @@ class TestIndexIP:
status = connect.create_index(ip_table, index_params)
assert status.OK()
@pytest.mark.timeout(BUILD_TIMEOUT)
def test_create_index_partition(self, connect, ip_table, get_index_params):
'''
target: test create index interface
method: create table, create partition, and add vectors in it, create index
expected: return code equals to 0, and search success
'''
partition_name = gen_unique_str()
index_params = get_index_params
logging.getLogger().info(index_params)
status = connect.create_partition(ip_table, partition_name, tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
status = connect.create_index(partition_name, index_params)
assert status.OK()
@pytest.mark.level(2)
def test_create_index_without_connect(self, dis_connect, ip_table):
'''
@ -583,9 +614,9 @@ class TestIndexIP:
query_vecs = [vectors[0], vectors[1], vectors[2]]
top_k = 5
status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vecs)
logging.getLogger().info(result)
assert status.OK()
assert len(result) == len(query_vecs)
# logging.getLogger().info(result)
# TODO: enable
@pytest.mark.timeout(BUILD_TIMEOUT)
@ -743,13 +774,13 @@ class TestIndexIP:
******************************************************************
"""
def test_describe_index(self, connect, ip_table, get_index_params):
def test_describe_index(self, connect, ip_table, get_simple_index_params):
'''
target: test describe index interface
method: create table and add vectors in it, create index, call describe index
expected: return code 0, and correct index structure
'''
index_params = get_index_params
index_params = get_simple_index_params
logging.getLogger().info(index_params)
status, ids = connect.add_vectors(ip_table, vectors)
status = connect.create_index(ip_table, index_params)
@ -759,6 +790,80 @@ class TestIndexIP:
assert result._table_name == ip_table
assert result._index_type == index_params["index_type"]
def test_describe_index_partition(self, connect, ip_table, get_simple_index_params):
'''
target: test describe index interface
method: create table, create partition and add vectors in it, create index, call describe index
expected: return code 0, and correct index structure
'''
partition_name = gen_unique_str()
index_params = get_simple_index_params
logging.getLogger().info(index_params)
status = connect.create_partition(ip_table, partition_name, tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
status = connect.create_index(ip_table, index_params)
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
assert result._nlist == index_params["nlist"]
assert result._table_name == ip_table
assert result._index_type == index_params["index_type"]
status, result = connect.describe_index(partition_name)
logging.getLogger().info(result)
assert result._nlist == index_params["nlist"]
assert result._table_name == partition_name
assert result._index_type == index_params["index_type"]
def test_describe_index_partition_A(self, connect, ip_table, get_simple_index_params):
'''
target: test describe index interface
method: create table, create partition and add vectors in it, create index on partition, call describe index
expected: return code 0, and correct index structure
'''
partition_name = gen_unique_str()
index_params = get_simple_index_params
logging.getLogger().info(index_params)
status = connect.create_partition(ip_table, partition_name, tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
status = connect.create_index(partition_name, index_params)
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == ip_table
assert result._index_type == IndexType.FLAT
status, result = connect.describe_index(partition_name)
logging.getLogger().info(result)
assert result._nlist == index_params["nlist"]
assert result._table_name == partition_name
assert result._index_type == index_params["index_type"]
def test_describe_index_partition_B(self, connect, ip_table, get_simple_index_params):
'''
target: test describe index interface
method: create table, create partitions and add vectors in it, create index on partitions, call describe index
expected: return code 0, and correct index structure
'''
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
new_tag = "new_tag"
index_params = get_simple_index_params
logging.getLogger().info(index_params)
status = connect.create_partition(ip_table, partition_name, tag)
status = connect.create_partition(ip_table, new_partition_name, new_tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=new_tag)
status = connect.create_index(partition_name, index_params)
status = connect.create_index(new_partition_name, index_params)
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == ip_table
assert result._index_type == IndexType.FLAT
status, result = connect.describe_index(new_partition_name)
logging.getLogger().info(result)
assert result._nlist == index_params["nlist"]
assert result._table_name == new_partition_name
assert result._index_type == index_params["index_type"]
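# The cases above pin down the scoping rule: an index built on a partition
# is not reported on the parent table, which keeps showing the default
# FLAT index with nlist 16384. A hedged sketch of that check, assuming a
# local server and illustrative names:
from milvus import Milvus, IndexType

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
client.create_index("demo_partition", {'index_type': IndexType.IVFLAT,
                                       'nlist': 4096})
status, info = client.describe_index("demo_table")
assert info._index_type == IndexType.FLAT and info._nlist == 16384
status, info = client.describe_index("demo_partition")
assert info._index_type == IndexType.IVFLAT and info._nlist == 4096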
def test_describe_and_drop_index_multi_tables(self, connect, get_simple_index_params):
'''
target: test create, describe and drop index interface with multiple tables of IP
@ -849,6 +954,111 @@ class TestIndexIP:
assert result._table_name == ip_table
assert result._index_type == IndexType.FLAT
def test_drop_index_partition(self, connect, ip_table, get_simple_index_params):
'''
target: test drop index interface
method: create table, create partition and add vectors in it, create index on table, call drop table index
expected: return code 0, and default index param
'''
partition_name = gen_unique_str()
index_params = get_simple_index_params
status = connect.create_partition(ip_table, partition_name, tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
status = connect.create_index(ip_table, index_params)
assert status.OK()
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
status = connect.drop_index(ip_table)
assert status.OK()
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == ip_table
assert result._index_type == IndexType.FLAT
def test_drop_index_partition_A(self, connect, ip_table, get_simple_index_params):
'''
target: test drop index interface
method: create table, create partition and add vectors in it, create index on partition, call drop table index
expected: return code 0, and default index param
'''
partition_name = gen_unique_str()
index_params = get_simple_index_params
status = connect.create_partition(ip_table, partition_name, tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
status = connect.create_index(partition_name, index_params)
assert status.OK()
status = connect.drop_index(ip_table)
assert status.OK()
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == ip_table
assert result._index_type == IndexType.FLAT
status, result = connect.describe_index(partition_name)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == partition_name
assert result._index_type == IndexType.FLAT
def test_drop_index_partition_B(self, connect, ip_table, get_simple_index_params):
'''
target: test drop index interface
method: create table, create partition and add vectors in it, create index on partition, call drop partition index
expected: return code 0, and default index param
'''
partition_name = gen_unique_str()
index_params = get_simple_index_params
status = connect.create_partition(ip_table, partition_name, tag)
status, ids = connect.add_vectors(ip_table, vectors, partition_tag=tag)
status = connect.create_index(partition_name, index_params)
assert status.OK()
status = connect.drop_index(partition_name)
assert status.OK()
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == ip_table
assert result._index_type == IndexType.FLAT
status, result = connect.describe_index(partition_name)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == partition_name
assert result._index_type == IndexType.FLAT
def test_drop_index_partition_C(self, connect, ip_table, get_simple_index_params):
'''
target: test drop index interface
method: create table, create partitions and add vectors in it, create index on partitions, call drop partition index
expected: return code 0, and default index param
'''
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
new_tag = "new_tag"
index_params = get_simple_index_params
status = connect.create_partition(ip_table, partition_name, tag)
status = connect.create_partition(ip_table, new_partition_name, new_tag)
status, ids = connect.add_vectors(ip_table, vectors)
status = connect.create_index(ip_table, index_params)
assert status.OK()
status = connect.drop_index(new_partition_name)
assert status.OK()
status, result = connect.describe_index(new_partition_name)
logging.getLogger().info(result)
assert result._nlist == 16384
assert result._table_name == new_partition_name
assert result._index_type == IndexType.FLAT
status, result = connect.describe_index(partition_name)
logging.getLogger().info(result)
assert result._nlist == index_params["nlist"]
assert result._table_name == partition_name
assert result._index_type == index_params["index_type"]
status, result = connect.describe_index(ip_table)
logging.getLogger().info(result)
assert result._nlist == index_params["nlist"]
assert result._table_name == ip_table
assert result._index_type == index_params["index_type"]
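# drop_index is scoped the same way: dropping on one name resets that name
# to the FLAT/16384 default while siblings keep their index, which is what
# the asserts above verify. A minimal sketch (local server, names
# illustrative):
from milvus import Milvus, IndexType

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
status = client.drop_index("demo_partition")
assert status.OK()
status, info = client.describe_index("demo_partition")
assert info._index_type == IndexType.FLAT and info._nlist == 16384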
def test_drop_index_repeatly(self, connect, ip_table, get_simple_index_params):
'''
target: test drop index repeatedly


@ -25,9 +25,9 @@ index_params = {'index_type': IndexType.IVFLAT, 'nlist': 16384}
class TestMixBase:
# TODO: enable
def test_search_during_createIndex(self, args):
loops = 100000
# disable
def _test_search_during_createIndex(self, args):
loops = 10000
table = gen_unique_str()
query_vecs = [vectors[0], vectors[1]]
uri = "tcp://%s:%s" % (args["ip"], args["port"])


@ -0,0 +1,431 @@
import time
import random
import pdb
import threading
import logging
from multiprocessing import Pool, Process
import pytest
from milvus import Milvus, IndexType, MetricType
from utils import *
dim = 128
index_file_size = 10
table_id = "test_add"
ADD_TIMEOUT = 60
nprobe = 1
epsilon = 0.0001
tag = "1970-01-01"
class TestCreateBase:
"""
******************************************************************
The following cases are used to test `create_partition` function
******************************************************************
"""
def test_create_partition(self, connect, table):
'''
target: test create partition, check status returned
method: call function: create_partition
expected: status ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
def test_create_partition_repeat(self, connect, table):
'''
target: test create the same partition twice, check status returned
method: call function: create_partition
expected: status ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.create_partition(table, partition_name, tag)
assert not status.OK()
def test_create_partition_recursively(self, connect, table):
'''
target: test create partition, and create partition in parent partition, check status returned
method: call function: create_partition
expected: status not ok
'''
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
new_tag = "new_tag"
status = connect.create_partition(table, partition_name, tag)
status = connect.create_partition(partition_name, new_partition_name, new_tag)
assert not status.OK()
def test_create_partition_table_not_existed(self, connect):
'''
target: test create partition whose owner table does not exist in db, check status returned
method: call function: create_partition
expected: status not ok
'''
table_name = gen_unique_str()
partition_name = gen_unique_str()
status = connect.create_partition(table_name, partition_name, tag)
assert not status.OK()
def test_create_partition_partition_name_existed(self, connect, table):
'''
target: test create partition, and create another partition with the same name, check status returned
method: call function: create_partition
expected: status not ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
tag_new = "tag_new"
status = connect.create_partition(table, partition_name, tag_new)
assert not status.OK()
def test_create_partition_partition_name_equals_table(self, connect, table):
'''
target: test create partition, with the partition name equal to the table name, check status returned
method: call function: create_partition
expected: status not ok
'''
status = connect.create_partition(table, table, tag)
assert not status.OK()
def test_create_partition_partition_name_None(self, connect, table):
'''
target: test create partition, partition name set None, check status returned
method: call function: create_partition
expected: status not ok
'''
partition_name = None
status = connect.create_partition(table, partition_name, tag)
assert not status.OK()
def test_create_partition_tag_name_None(self, connect, table):
'''
target: test create partition, tag name set None, check status returned
method: call function: create_partition
expected: status not ok
'''
tag_name = None
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag_name)
assert not status.OK()
def test_create_different_partition_tag_name_existed(self, connect, table):
'''
target: test create partition, and create the same partition tag again, check status returned
method: call function: create_partition with the same tag name
expected: status not ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
new_partition_name = gen_unique_str()
status = connect.create_partition(table, new_partition_name, tag)
assert not status.OK()
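# A tag is unique within a table: two partitions of one table cannot share
# a tag, while the same tag may be reused across tables (see the two-table
# case below). A hedged sketch, assuming a local server and an existing
# table "demo_part":
from milvus import Milvus

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
status = client.create_partition("demo_part", "p_a", "tag-x")
assert status.OK()
status = client.create_partition("demo_part", "p_b", "tag-x")
assert not status.OK()          # same tag twice in one table is rejected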
def test_create_partition_add_vectors(self, connect, table):
'''
target: test create partition, and insert vectors, check status returned
method: call function: create_partition
expected: status ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
nq = 100
vectors = gen_vectors(nq, dim)
ids = [i for i in range(nq)]
status, ids = connect.insert(table, vectors, ids)
assert status.OK()
def test_create_partition_insert_with_tag(self, connect, table):
'''
target: test create partition, and insert vectors, check status returned
method: call function: create_partition
expected: status ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
nq = 100
vectors = gen_vectors(nq, dim)
ids = [i for i in range(nq)]
status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
assert status.OK()
def test_create_partition_insert_with_tag_not_existed(self, connect, table):
'''
target: test create partition, and insert vectors, check status returned
method: call function: create_partition
expected: status not ok
'''
tag_new = "tag_new"
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
nq = 100
vectors = gen_vectors(nq, dim)
ids = [i for i in range(nq)]
status, ids = connect.insert(table, vectors, ids, partition_tag=tag_new)
assert not status.OK()
def test_create_partition_insert_same_tags(self, connect, table):
'''
target: test create partition, and insert vectors, check status returned
method: call function: create_partition
expected: status ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
nq = 100
vectors = gen_vectors(nq, dim)
ids = [i for i in range(nq)]
status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
ids = [(i+100) for i in range(nq)]
status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
assert status.OK()
time.sleep(1)
status, res = connect.get_table_row_count(partition_name)
assert res == nq * 2
def test_create_partition_insert_same_tags_two_tables(self, connect, table):
'''
target: test creating a partition in each of two tables, insert vectors with the same tag into each table, check status returned
method: call function: create_partition
expected: status ok, table length is correct
'''
partition_name = gen_unique_str()
table_new = gen_unique_str()
new_partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
param = {'table_name': table_new,
'dimension': dim,
'index_file_size': index_file_size,
'metric_type': MetricType.L2}
status = connect.create_table(param)
status = connect.create_partition(table_new, new_partition_name, tag)
assert status.OK()
nq = 100
vectors = gen_vectors(nq, dim)
ids = [i for i in range(nq)]
status, ids = connect.insert(table, vectors, ids, partition_tag=tag)
ids = [(i+100) for i in range(nq)]
status, ids = connect.insert(table_new, vectors, ids, partition_tag=tag)
assert status.OK()
time.sleep(1)
status, res = connect.get_table_row_count(new_partition_name)
assert res == nq
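# get_table_row_count accepts a partition name as well as a table name, so
# a partition's size can be checked directly; a minimal sketch of the flow
# these cases use (local server assumed; names and sizes illustrative):
import random
import time
from milvus import Milvus, MetricType

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
table, tag = "demo_part", "1970-01-01"
client.create_table({'table_name': table, 'dimension': 128,
                     'index_file_size': 10, 'metric_type': MetricType.L2})
client.create_partition(table, "demo_part_p1", tag)
vectors = [[random.random() for _ in range(128)] for _ in range(100)]
client.insert(table, vectors, [i for i in range(100)], partition_tag=tag)
time.sleep(1)                   # give the server a moment to flush
status, count = client.get_table_row_count("demo_part_p1")
assert count == 100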
class TestShowBase:
"""
******************************************************************
The following cases are used to test `show_partitions` function
******************************************************************
"""
def test_show_partitions(self, connect, table):
'''
target: test show partitions, check status and partitions returned
method: create partition first, then call function: show_partitions
expected: status ok, partition correct
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status, res = connect.show_partitions(table)
assert status.OK()
def test_show_partitions_no_partition(self, connect, table):
'''
target: test show partitions with table name, check status and partitions returned
method: call function: show_partitions
expected: status ok, partitions correct
'''
partition_name = gen_unique_str()
status, res = connect.show_partitions(table)
assert status.OK()
def test_show_partitions_no_partition_recursive(self, connect, table):
'''
target: test show partitions with partition name, check status and partitions returned
method: call function: show_partitions
expected: status ok, no partitions
'''
partition_name = gen_unique_str()
status, res = connect.show_partitions(partition_name)
assert status.OK()
assert len(res) == 0
def test_show_multi_partitions(self, connect, table):
'''
target: test show partitions, check status and partitions returned
method: create partitions first, then call function: show_partitions
expected: status ok, partitions correct
'''
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.create_partition(table, new_partition_name, tag)
status, res = connect.show_partitions(table)
assert status.OK()
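# show_partitions returns a status plus a list of partition records; each
# record exposes at least .partition_name, which the drop tests below use
# to check for absence. A hedged sketch (local server, illustrative name):
from milvus import Milvus

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
status, partitions = client.show_partitions("demo_part")
assert status.OK()
for p in partitions:
    print(p.partition_name)     # one record per partition of the table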
class TestDropBase:
"""
******************************************************************
The following cases are used to test `drop_partition` function
******************************************************************
"""
def test_drop_partition(self, connect, table):
'''
target: test drop partition, check status and whether the partition still exists
method: create partitions first, then call function: drop_partition
expected: status ok, no partitions in db
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.drop_partition(table, tag)
assert status.OK()
# check if the partition existed
status, res = connect.show_partitions(table)
assert partition_name not in res
def test_drop_partition_tag_not_existed(self, connect, table):
'''
target: test drop partition with a tag that does not exist
method: create partitions first, then call function: drop_partition
expected: status not ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
new_tag = "new_tag"
status = connect.drop_partition(table, new_tag)
assert not status.OK()
def test_drop_partition_tag_not_existed_A(self, connect, table):
'''
target: test drop partition with a table that does not exist
method: create partitions first, then call function: drop_partition
expected: status not ok
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
new_table = gen_unique_str()
status = connect.drop_partition(new_table, tag)
assert not status.OK()
def test_drop_partition_repeatedly(self, connect, table):
'''
target: test drop partition twice, check status and whether the partition still exists
method: create partitions first, then call function: drop_partition
expected: status not ok, no partitions in db
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.drop_partition(table, tag)
status = connect.drop_partition(table, tag)
time.sleep(2)
assert not status.OK()
status, res = connect.show_partitions(table)
assert partition_name not in res
def test_drop_partition_create(self, connect, table):
'''
target: test drop partition, and create again, check status
method: create partitions first, then call function: drop_partition, create_partition
expected: status ok, partition in db
'''
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.drop_partition(table, tag)
time.sleep(2)
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
status, res = connect.show_partitions(table)
assert partition_name == res[0].partition_name
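# Note that drop_partition is keyed by (table, tag), not by the generated
# partition name, and a dropped tag can be reused afterwards, as the case
# above shows. A minimal sketch (local server, illustrative names):
from milvus import Milvus

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
status = client.drop_partition("demo_part", "1970-01-01")
assert status.OK()
status = client.create_partition("demo_part", "demo_part_p2", "1970-01-01")
assert status.OK()              # the tag is free again after the drop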
class TestNameInvalid(object):
@pytest.fixture(
scope="function",
params=gen_invalid_table_names()
)
def get_partition_name(self, request):
yield request.param
@pytest.fixture(
scope="function",
params=gen_invalid_table_names()
)
def get_tag_name(self, request):
yield request.param
@pytest.fixture(
scope="function",
params=gen_invalid_table_names()
)
def get_table_name(self, request):
yield request.param
def test_create_partition_with_invalid_partition_name(self, connect, table, get_partition_name):
'''
target: test create partition, with invalid partition name, check status returned
method: call function: create_partition
expected: status not ok
'''
partition_name = get_partition_name
status = connect.create_partition(table, partition_name, tag)
assert not status.OK()
def test_create_partition_with_invalid_tag_name(self, connect, table):
'''
target: test create partition, with invalid tag name, check status returned
method: call function: create_partition
expected: status not ok
'''
tag_name = " "
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag_name)
assert not status.OK()
def test_drop_partition_with_invalid_table_name(self, connect, table, get_table_name):
'''
target: test drop partition, with invalid table name, check status returned
method: call function: drop_partition
expected: status not ok
'''
table_name = get_table_name
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.drop_partition(table_name, tag)
assert not status.OK()
def test_drop_partition_with_invalid_tag_name(self, connect, table, get_tag_name):
'''
target: test drop partition, with invalid tag name, check status returned
method: call function: drop_partition
expected: status not ok
'''
tag_name = get_tag_name
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.drop_partition(table, tag_name)
assert not status.OK()
def test_show_partitions_with_invalid_table_name(self, connect, table, get_table_name):
'''
target: test show partitions, with invalid table name, check status returned
method: call function: show_partitions
expected: status not ok
'''
table_name = get_table_name
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status, res = connect.show_partitions(table_name)
assert not status.OK()


@ -16,8 +16,9 @@ add_interval_time = 2
vectors = gen_vectors(100, dim)
# vectors /= numpy.linalg.norm(vectors)
# vectors = vectors.tolist()
nrpobe = 1
nprobe = 1
epsilon = 0.001
tag = "1970-01-01"
class TestSearchBase:
@ -49,6 +50,15 @@ class TestSearchBase:
pytest.skip("sq8h not support in open source")
return request.param
@pytest.fixture(
scope="function",
params=gen_simple_index_params()
)
def get_simple_index_params(self, request, args):
if "internal" not in args:
if request.param["index_type"] == IndexType.IVF_SQ8H:
pytest.skip("sq8h not support in open source")
return request.param
"""
generate top-k params
"""
@ -70,7 +80,7 @@ class TestSearchBase:
query_vec = [vectors[0]]
top_k = get_top_k
nprobe = 1
status, result = connect.search_vectors(table, top_k, nrpobe, query_vec)
status, result = connect.search_vectors(table, top_k, nprobe, query_vec)
if top_k <= 2048:
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
@ -85,7 +95,6 @@ class TestSearchBase:
method: search with the given vectors, check the result
expected: search status ok, and the length of the result is top_k
'''
index_params = get_index_params
logging.getLogger().info(index_params)
vectors, ids = self.init_data(connect, table)
@ -93,7 +102,7 @@ class TestSearchBase:
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(table, top_k, nrpobe, query_vec)
status, result = connect.search_vectors(table, top_k, nprobe, query_vec)
logging.getLogger().info(result)
if top_k <= 1024:
assert status.OK()
@ -103,6 +112,160 @@ class TestSearchBase:
else:
assert not status.OK()
def test_search_l2_index_params_partition(self, connect, table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: add vectors into table, search with the given vectors, check the result
expected: search status ok, and the length of the result is top_k; searching the table with the partition tag returns empty
'''
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
vectors, ids = self.init_data(connect, table)
status = connect.create_index(table, index_params)
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(table, top_k, nprobe, query_vec)
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
assert result[0][0].distance <= epsilon
status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result) == 0
def test_search_l2_index_params_partition_A(self, connect, table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: search partition with the given vectors, check the result
expected: search status ok, and the length of the result is 0
'''
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
vectors, ids = self.init_data(connect, table)
status = connect.create_index(table, index_params)
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(partition_name, top_k, nprobe, query_vec, partition_tags=[tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result) == 0
def test_search_l2_index_params_partition_B(self, connect, table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: search with the given vectors, check the result
expected: search status ok, and the length of the result is top_k
'''
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
vectors, ids = self.init_data(connect, partition_name)
status = connect.create_index(table, index_params)
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(table, top_k, nprobe, query_vec)
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
assert result[0][0].distance <= epsilon
status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
assert result[0][0].distance <= epsilon
status, result = connect.search_vectors(partition_name, top_k, nprobe, query_vec, partition_tags=[tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result) == 0
def test_search_l2_index_params_partition_C(self, connect, table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: search with the given vectors and tags (one of the tags does not exist in the table), check the result
expected: search status ok, and the length of the result is top_k
'''
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
vectors, ids = self.init_data(connect, partition_name)
status = connect.create_index(table, index_params)
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag, "new_tag"])
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
assert result[0][0].distance <= epsilon
def test_search_l2_index_params_partition_D(self, connect, table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: search with the given vectors and a tag that does not exist in the table, check the result
expected: search status ok, and the result is empty
'''
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
vectors, ids = self.init_data(connect, partition_name)
status = connect.create_index(table, index_params)
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=["new_tag"])
logging.getLogger().info(result)
assert status.OK()
assert len(result) == 0
def test_search_l2_index_params_partition_E(self, connect, table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: search table with the given vectors and tags, check the result
expected: search status ok, and the length of the result is top_k
'''
new_tag = "new_tag"
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
status = connect.create_partition(table, partition_name, tag)
status = connect.create_partition(table, new_partition_name, new_tag)
vectors, ids = self.init_data(connect, partition_name)
new_vectors, new_ids = self.init_data(connect, new_partition_name, nb=1000)
status = connect.create_index(table, index_params)
query_vec = [vectors[0], new_vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[tag, new_tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
assert check_result(result[1], new_ids[0])
assert result[0][0].distance <= epsilon
assert result[1][0].distance <= epsilon
status, result = connect.search_vectors(table, top_k, nprobe, query_vec, partition_tags=[new_tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[1], new_ids[0])
assert result[1][0].distance <= epsilon
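# search_vectors narrows a query with an optional partition_tags list; a
# tag that matches nothing simply contributes no results, which is what
# the empty-result asserts above check. A hedged sketch (local server,
# illustrative names):
import random
from milvus import Milvus

client = Milvus()
client.connect(host="127.0.0.1", port="19530")
query = [[random.random() for _ in range(128)]]
top_k, nprobe = 10, 1
# no tags: search the whole table
status, results = client.search_vectors("demo_part", top_k, nprobe, query)
# restricted to one tag; an unknown tag would yield an empty result set
status, results = client.search_vectors("demo_part", top_k, nprobe, query,
                                        partition_tags=["1970-01-01"])
assert status.OK()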
def test_search_ip_index_params(self, connect, ip_table, get_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
@ -117,7 +280,7 @@ class TestSearchBase:
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(ip_table, top_k, nrpobe, query_vec)
status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec)
logging.getLogger().info(result)
if top_k <= 1024:
@ -128,6 +291,59 @@ class TestSearchBase:
else:
assert not status.OK()
def test_search_ip_index_params_partition(self, connect, ip_table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: search with the given vectors, check the result
expected: search status ok, and the length of the result is top_k; searching with the partition tag returns empty
'''
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
status = connect.create_partition(ip_table, partition_name, tag)
vectors, ids = self.init_data(connect, ip_table)
status = connect.create_index(ip_table, index_params)
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec)
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
assert abs(result[0][0].distance - numpy.inner(numpy.array(query_vec[0]), numpy.array(query_vec[0]))) <= gen_inaccuracy(result[0][0].distance)
status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec, partition_tags=[tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result) == 0
def test_search_ip_index_params_partition_A(self, connect, ip_table, get_simple_index_params):
'''
target: test basic search function, all the search params are correct, test all index params, and build
method: search with the given vectors and tag, check the result
expected: search status ok, and the length of the result is top_k
'''
index_params = get_simple_index_params
logging.getLogger().info(index_params)
partition_name = gen_unique_str()
status = connect.create_partition(ip_table, partition_name, tag)
vectors, ids = self.init_data(connect, partition_name)
status = connect.create_index(ip_table, index_params)
query_vec = [vectors[0]]
top_k = 10
nprobe = 1
status, result = connect.search_vectors(ip_table, top_k, nprobe, query_vec, partition_tags=[tag])
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
assert abs(result[0][0].distance - numpy.inner(numpy.array(query_vec[0]), numpy.array(query_vec[0]))) <= gen_inaccuracy(result[0][0].distance)
status, result = connect.search_vectors(partition_name, top_k, nprobe, query_vec)
logging.getLogger().info(result)
assert status.OK()
assert len(result[0]) == min(len(vectors), top_k)
assert check_result(result[0], ids[0])
@pytest.mark.level(2)
def test_search_vectors_without_connect(self, dis_connect, table):
'''
@ -518,6 +734,14 @@ class TestSearchParamsInvalid(object):
status, result = connect.search_vectors(table_name, top_k, nprobe, query_vecs)
assert not status.OK()
@pytest.mark.level(1)
def test_search_with_invalid_tag_format(self, connect, table):
top_k = 1
nprobe = 1
query_vecs = gen_vectors(1, dim)
with pytest.raises(Exception) as e:
status, result = connect.search_vectors(table, top_k, nprobe, query_vecs, partition_tags="tag")
"""
Test search table with invalid top-k
"""
@ -574,7 +798,7 @@ class TestSearchParamsInvalid(object):
yield request.param
@pytest.mark.level(1)
def test_search_with_invalid_nrpobe(self, connect, table, get_nprobes):
def test_search_with_invalid_nprobe(self, connect, table, get_nprobes):
'''
target: test search function, with invalid nprobe
method: search with invalid nprobe
@ -592,7 +816,7 @@ class TestSearchParamsInvalid(object):
status, result = connect.search_vectors(table, top_k, nprobe, query_vecs)
@pytest.mark.level(2)
def test_search_with_invalid_nrpobe_ip(self, connect, ip_table, get_nprobes):
def test_search_with_invalid_nprobe_ip(self, connect, ip_table, get_nprobes):
'''
target: test search function, with invalid nprobe
method: search with invalid nprobe


@ -297,7 +297,7 @@ class TestTable:
'''
table_name = gen_unique_str("test_table")
status = connect.delete_table(table_name)
assert not status.code==0
assert not status.OK()
def test_delete_table_repeatedly(self, connect):
'''


@ -13,8 +13,8 @@ from milvus import IndexType, MetricType
dim = 128
index_file_size = 10
add_time_interval = 5
add_time_interval = 3
tag = "1970-01-01"
class TestTableCount:
"""
@ -58,6 +58,90 @@ class TestTableCount:
status, res = connect.get_table_row_count(table)
assert res == nb
def test_table_rows_count_partition(self, connect, table, add_vectors_nb):
'''
target: test table rows_count is correct or not
method: create table, create partition and add vectors in it,
assert the value returned by get_table_row_count method is equal to length of vectors
expected: the count is equal to the length of vectors
'''
nb = add_vectors_nb
partition_name = gen_unique_str()
vectors = gen_vectors(nb, dim)
status = connect.create_partition(table, partition_name, tag)
assert status.OK()
res = connect.add_vectors(table_name=table, records=vectors, partition_tag=tag)
time.sleep(add_time_interval)
status, res = connect.get_table_row_count(table)
assert res == nb
def test_table_rows_count_multi_partitions_A(self, connect, table, add_vectors_nb):
'''
target: test table rows_count is correct or not
method: create table, create partitions and add vectors in it,
assert the value returned by get_table_row_count method is equal to length of vectors
expected: the count is equal to the length of vectors
'''
new_tag = "new_tag"
nb = add_vectors_nb
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
vectors = gen_vectors(nb, dim)
status = connect.create_partition(table, partition_name, tag)
status = connect.create_partition(table, new_partition_name, new_tag)
assert status.OK()
res = connect.add_vectors(table_name=table, records=vectors)
time.sleep(add_time_interval)
status, res = connect.get_table_row_count(table)
assert res == nb
def test_table_rows_count_multi_partitions_B(self, connect, table, add_vectors_nb):
'''
target: test table rows_count is correct or not
method: create table, create partitions and add vectors in one of the partitions,
assert the value returned by get_table_row_count method is equal to length of vectors
expected: the count is equal to the length of vectors
'''
new_tag = "new_tag"
nb = add_vectors_nb
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
vectors = gen_vectors(nb, dim)
status = connect.create_partition(table, partition_name, tag)
status = connect.create_partition(table, new_partition_name, new_tag)
assert status.OK()
res = connect.add_vectors(table_name=table, records=vectors, partition_tag=tag)
time.sleep(add_time_interval)
status, res = connect.get_table_row_count(partition_name)
assert res == nb
status, res = connect.get_table_row_count(new_partition_name)
assert res == 0
def test_table_rows_count_multi_partitions_C(self, connect, table, add_vectors_nb):
'''
target: test table rows_count is correct or not
method: create table, create partitions and add vectors in one of the partitions,
assert the value returned by get_table_row_count method is equal to length of vectors
expected: the table count is equal to the length of vectors
'''
new_tag = "new_tag"
nb = add_vectors_nb
partition_name = gen_unique_str()
new_partition_name = gen_unique_str()
vectors = gen_vectors(nb, dim)
status = connect.create_partition(table, partition_name, tag)
status = connect.create_partition(table, new_partition_name, new_tag)
assert status.OK()
res = connect.add_vectors(table_name=table, records=vectors, partition_tag=tag)
res = connect.add_vectors(table_name=table, records=vectors, partition_tag=new_tag)
time.sleep(add_time_interval)
status, res = connect.get_table_row_count(partition_name)
assert res == nb
status, res = connect.get_table_row_count(new_partition_name)
assert res == nb
status, res = connect.get_table_row_count(table)
assert res == nb * 2
def test_table_rows_count_after_index_created(self, connect, table, get_simple_index_params):
'''
target: test get_table_row_count, after index have been created