mirror of https://github.com/milvus-io/milvus.git
Delete and WAL feature branch merge (#1436)
* add read/write lock
* change compact to ddl queue
* add api to get vector data
* add flush / merge / compact lock
* add data size for table info
* add db recovery test
* add data_size check
* change file name to uppercase
Signed-off-by: jinhai <hai.jin@zilliz.com>
* update wal flush_merge_compact_mutex_
* change requirement
* update requirement
* add logging
* delete part
* add all size checks
* fix bug
* update faiss get_vector_by_id
* add get_vector case
* update get vector by id
* update server
* fix DBImpl
* attempting to fix #1268
* lint
* update unit test
* fix #1259
* issue 1271 fix wal config
* update
* fix cases
Signed-off-by: del.zhenwu <zhenxiang.li@zilliz.com>
* update read / write error message
* [skip ci] get vectors by id from raw files instead of faiss
* [skip ci] update FilesByType meta
* update
* fix ci error
* update
* lint
* Hide partition_name parameter
* Remove douban pip source
Signed-off-by: zhenwu <zw@zilliz.com>
* Update epsilon value in test cases
Signed-off-by: zhenwu <zw@zilliz.com>
* Add default partition
* Caiyd crud (#1313)
* fix clang format
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* fix unittest build error
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* add faiss_bitset_test
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* avoid user directly operating partition table
* fix has table bug
* Caiyd crud (#1323)
* fix clang format
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* fix unittest build error
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* use compile option -O3
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* update faiss_bitset_test.cpp
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* change open flags
* change OngoingFileChecker to static instance
* mark ongoing files when applying deletes
* update clean up with ttl
* fix centos ci
* update
* lint
* update partition
Signed-off-by: zhenwu <zw@zilliz.com>
* update delete and flush to include partitions
* update
* Update cases
Signed-off-by: zhenwu <zw@zilliz.com>
* Fix test cases crud (#1350)
* fix order
* add wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix invalid operation issue
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix invalid operation issue
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix bug
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix bug
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* crud fix
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* crud fix
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* add table info test cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
Signed-off-by: JinHai-CN <hai.jin@zilliz.com>
* merge cases
Signed-off-by: zhenwu <zw@zilliz.com>
* Shengjun (#1349)
* Add GPU sharing solution on native Kubernetes (#1102)
* run hadolint with reviewdog
* add LICENSE in Dockerfile
* run hadolint with reviewdog
* Reporter of reviewdog command is "github-pr-check"
* format Dockerfile
* ignore DL3007 in hadolint
* clean up old docker images
* Add GPU sharing solution on native Kubernetes
* nightly test mailer
* Fix http server bug (#1096)
* refactoring(create_table done)
* refactoring
* refactor server delivery (insert done)
* refactoring server module (count_table done)
* server refactor done
* cmake pass
* refactor server module done.
* set grpc response status correctly
* format done.
* fix redefine ErrorMap()
* optimize insert reducing ids data copy
* optimize grpc request with reducing data copy
* clang format
* [skip ci] Refactor server module done. update changelog. prepare for PR
* remove explicit and change int32_t to int64_t
* add web server
* [skip ci] add license in web module
* modify header include & comment oatpp environment config
* add port configure & create table in handler
* modify web url
* simple url compilation done & add swagger
* make sure web url
* web functionality done. debugging
* add web unittest
* web test pass
* add web server port
* add web server port in template
* update unittest cmake file
* change web server default port to 19121
* rename method in web module & unittest pass
* add search case in unittest for web module
* rename some variables
* fix bug
* unittest pass
* web prepare
* fix cmd bug(check server status)
* update changelog
* add web port validate & default set
* clang-format pass
* add web port test in unittest
* add CORS & redirect root to swagger ui
* add web status
* web table method func cascade test pass
* add config url in web module
* modify thirdparty cmake to avoid building oatpp test
* clang format
* update changelog
* add constants in web module
* reserve Config.cpp
* fix constants reference bug
* replace web server with async module
* modify component to support async
* format
* developing controller & add test client into unittest
* add web port into demo/server_config
* modify thirdparty cmake to allow build test
* remove unnecessary comment
* add endpoint info in controller
* finish web test(bug here)
* clang format
* add web test cpp to lint exclusions
* check null field in GetConfig
* add macro RETURN STATUS DTo
* fix cmake conflict
* fix crash when exit server
* remove surplus comments & add http param check
* add uri /docs to direct swagger
* format
* change cmd to system
* add default value & unittest in web module
* add macros to judge if GPU supported
* add macros in unit & add default in index dto & print error message when bind http port fail
* format (fix #788)
* fix cors bug (not completed)
* comment cors
* change web framework to simple api
* comments optimize
* change to simple API
* remove comments in controller.hpp
* remove EP_COMMON_CMAKE_ARGS in oatpp and oatpp-swagger
* add ep cmake args to sqlite
* clang-format
* change a format
* test pass
* change name to
* fix compiler issue(oatpp-swagger depend on oatpp)
* add & in start_server.h
* specify lib location with oatpp and oatpp-swagger
* add comments
* add swagger definition
* [skip ci] change http method options status code
* remove oatpp swagger(fix #970)
* remove comments
* check Start web behavior
* add default to cpu_cache_capacity
* remove swagger component.hpp & /docs url
* remove /docs info
* remove /docs in unittest
* remove space in test rpc
* remove repeated info in CHANGELOG
* change cache_insert_data default value as a constant
* [skip ci] Fix some broken links (#960)
* [skip ci] Fix broken link
* [skip ci] Fix broken link
* [skip ci] Fix broken link
* [skip ci] Fix broken links
* fix issue 373 (#964)
* fix issue 373
* Adjustment format
* Adjustment format
* Adjustment format
* change readme
* #966 update NOTICE.md (#967)
* adjust web port config place
* rename web_port variable
* change gpu resources invoke way to cmd()
* set advanced config name add DEFAULT
* change config setting to cmd
* modify ..
* optimize code
* assign TableDto's count default value 0 (fix #995)
* check if table exists when show partitions (fix #1028)
* check table exists when drop partition (fix #1029)
* check if partition name is legal (fix #1022)
* modify status code when partition tag is illegal
* update changelog
* add info to /system url
* add binary index and add bin uri & handler method(not completed)
* optimize http insert and search time(fix #1066) | add binary vectors support(fix #1067)
* fix test partition bug
* fix test bug when check insert records
* add binary vectors test
* add default for offset and page_size
* fix unittest bug
* [skip ci] remove comments
* optimize web code for PR comments
* add new folder named utils
* check offset and page_size (fix #1082)
* improve error message if offset or page_size is not legal (fix #1075)
* add log into web module
* update changelog
* check gpu sources setting when assign repeated value (fix #990)
* update changelog
* clang-format pass
* add default handler in http handler
* [skip ci] improve error msg when check gpu resources
* change check offset way
* remove func IsIntStr
* add case
* change int32 to int64 when check number str
* add log in web module (doing)
* update test case
* add log in web controller
Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
* Filtering for specific paths in Jenkins CI (#1107)
* Filtering for specific paths in Jenkins CI
* Test filtering for specific paths in Jenkins CI
* Fix Filtering for specific paths in Jenkins CI bug (#1109)
* Fix Filtering for specific paths in Jenkins CI bug (#1110)
* Don't skip ci when triggered by a timer (#1113)
* Set default sending to Milvus Dev mail group (#1121)
* Support hnsw (#1131)
* add hnsw
* add config
* format...
* format..
* Remove test.template (#1129)
* Update framework
* remove files
* Remove files
* Remove ann-acc cases && Update java-sdk cases
* change cn to en
* [skip ci] remove doc test
* [skip ci] change cn to en
* Case stability
* Add mail notification when test failed
* Add mail notification
* gen milvus instance from utils
* Disable case with multiprocess
* Add mail notification when nightly test failed
* add milvus handler param
* add http handler
* Remove test.template
Co-authored-by: quicksilver <zhifeng.zhang@zilliz.com>
* Add doc for the RESTful API / Update contributor number in Milvus readme (#1100)
* [skip ci] Update contributor number.
* [skip ci] Add RESTful API doc.
* [skip ci] Some updates.
* [skip ci] Change port to 19121.
* [skip ci] Update README.md.
Update the descriptions for OPTIONS.
* Update README.md
Fix a typo.
* #1105 update error message when creating IVFSQ8H index without GPU resources (#1117)
* [skip ci] Update README (#1104)
* remove Nvidia owned files from faiss (#1136)
* #1135 remove Nvidia owned files from faiss
* Revert "#1135 remove Nvidia owned files from faiss"
This reverts commit 3bc007c28c.
* #1135 remove Nvidia API implementation
* #1135 remove Nvidia owned files from faiss
* Update CODE_OF_CONDUCT.md (#1163)
* Improve codecov (#1095)
* Optimize config test. Dir src/config 99% lines covered
* add unittest coverage
* optimize cache&config unittest
* code format
* format
* format code
* fix merge conflict
* cover src/utils unittest
* '#831 fix exe_path judge error'
* #831 fix exe_path judge error
* add some unittest coverage
* add some unittest coverage
* improve coverage of src/wrapper
* improve src/wrapper coverage
* *test optimize db/meta unittest
* fix bug
* *test optimize mysqlMetaImpl unittest
* *style: format code
* import server& scheduler unittest coverage
* handover next work
* *test: add some test_meta test case
* *format code
* *fix: fix typo
* feat(codecov): improve code coverage for src/db(#872)
* feat(codecov): improve code coverage for src/db/engine(#872)
* feat(codecov): improve code coverage(#872)
* fix config unittest bug
* feat(codecov): improve code coverage core/db/engine(#872)
* feat(codecov): improve code coverage core/knowhere
* feat(codecov): improve code coverage core/knowhere
* feat(codecov): improve code coverage
* feat(codecov): fix cpu test some error
* feat(codecov): improve code coverage
* feat(codecov): rename some fiu
* fix(db/meta): fix switch/case default action
* feat(codecov): improve code coverage(#872)
* fix error caused by merge code
* format code
* feat(codecov): improve code coverage & format code(#872)
* feat(codecov): fix test error(#872)
* feat(codecov): fix unittest test_mem(#872)
* feat(codecov): fix unittest(#872)
* feat(codecov): fix unittest for resource manager(#872)
* feat(codecov): code format (#872)
* feat(codecov): trigger ci(#872)
* fix(RequestScheduler): remove a wrong sleep statement
* test(test_rpc): fix rpc test
* Fix format issue
* Remove unused comments
* Fix unit test error
Co-authored-by: ABNER-1 <ABNER-1@users.noreply.github.com>
Co-authored-by: Jin Hai <hai.jin@zilliz.com>
* Support run dev test with http handler in python SDK (#1116)
* remove surplus dot
* add preload into /system/
* change get_milvus() to get_milvus(args['handler'])
* support load table into memory with http server (fix #1115)
* [skip ci] comment surplus dto in VectorDto
Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
* Fix #1140 (#1162)
* fix
Signed-off-by: Nicky <nicky.xj.lin@gmail.com>
* update...
Signed-off-by: Nicky <nicky.xj.lin@gmail.com>
* fix2
Signed-off-by: Nicky <nicky.xj.lin@gmail.com>
* fix3
Signed-off-by: Nicky <nicky.xj.lin@gmail.com>
* update changelog
Signed-off-by: Nicky <nicky.xj.lin@gmail.com>
* Update INSTALL.md (#1175)
* Update INSTALL.md
1. Change image tag and Milvus source code to latest.
2. Fix a typo
Signed-off-by: Lu Wang <yamasite@qq.com>
* Update INSTALL.md
Signed-off-by: lu.wang <yamasite@qq.com>
* add Tanimoto ground truth (#1138)
* add milvus ground truth
* add milvus groundtruth
* [skip ci] add milvus ground truth
* [skip ci] add tanimoto ground truth
* fix mix case bug (#1208)
* fix mix case bug
Signed-off-by: del.zhenwu <zhenxiang.li@zilliz.com>
* Remove case.md
Signed-off-by: del.zhenwu <zhenxiang.li@zilliz.com>
* Update README.md (#1206)
Add LFAI mailing lists.
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Add design.md to store links to design docs (#1219)
* Update README.md
Add link to Milvus design docs
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Create design.md
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Update design.md
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Add troubleshooting info about libmysqlpp.so.3 error (#1225)
* Update INSTALL.md
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Update INSTALL.md
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Update README.md (#1233)
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* #1240 Update license declaration of each file (#1241)
* #1240 Update license declaration of each files
Signed-off-by: jinhai <hai.jin@zilliz.com>
* #1240 Update CHANGELOG
Signed-off-by: jinhai <hai.jin@zilliz.com>
* Update README.md (#1258)
Add Jenkins master badge.
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Update INSTALL.md (#1265)
Fix indentation.
* support CPU profiling (#1251)
* #1250 support CPU profiling
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* #1250 fix code coverage
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* Fix HNSW crash (#1262)
* fix
Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>
* update.
Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>
* Add troubleshooting information for INSTALL.md and enhance readability (#1274)
* Update INSTALL.md
1. Add new troubleshooting message;
2. Enhance readability.
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Update INSTALL.md
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Update INSTALL.md
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Update INSTALL.md
Add CentOS link.
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* Create COMMUNITY.md (#1292)
Signed-off-by: Lutkin Wang <yamasite@qq.com>
* fix gtest
* add copyright
* fix gtest
* MERGE_NOT_YET
* fix lint
Co-authored-by: quicksilver <zhifeng.zhang@zilliz.com>
Co-authored-by: BossZou <40255591+BossZou@users.noreply.github.com>
Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: Tinkerrr <linxiaojun.cn@outlook.com>
Co-authored-by: del-zhenwu <56623710+del-zhenwu@users.noreply.github.com>
Co-authored-by: Lutkin Wang <yamasite@qq.com>
Co-authored-by: shengjh <46514371+shengjh@users.noreply.github.com>
Co-authored-by: ABNER-1 <ABNER-1@users.noreply.github.com>
Co-authored-by: Jin Hai <hai.jin@zilliz.com>
Co-authored-by: shiyu22 <cshiyu22@gmail.com>
* #1302 Get all record IDs in a segment given a segment id
* Remove query time ranges
Signed-off-by: zhenwu <zw@zilliz.com>
* #1295 let wal enable by default
* fix cases
Signed-off-by: zhenwu <zw@zilliz.com>
* fix partition cases
Signed-off-by: zhenwu <zw@zilliz.com>
* [skip ci] update test_db
* update
* fix case bug
Signed-off-by: zhenwu <zw@zilliz.com>
* lint
* fix test case failures
* remove some code
* Caiyd crud 1 (#1377)
* fix clang format
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* fix unittest build error
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* fix build issue when enable profiling
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* fix hashtable bug
* update bloom filter
* update
* benchmark
* update benchmark
* update
* remove wal record size
Signed-off-by: shengjun.li <shengjun.li@zilliz.com>
* remove wal record size config
Signed-off-by: shengjun.li <shengjun.li@zilliz.com>
* update apply deletes: switch to binary search
* update sdk_simple
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* update apply deletes: switch to binary search
* add test_search_by_id
Signed-off-by: zhenwu <zw@zilliz.com>
* add more log
* flush error with multi same ids
Signed-off-by: zhenwu <zw@zilliz.com>
* modify wal config
Signed-off-by: shengjun.li <shengjun.li@zilliz.com>
* update
* add binary search_by_id
* fix case bug
Signed-off-by: zhenwu <zw@zilliz.com>
* update cases
Signed-off-by: zhenwu <zw@zilliz.com>
* fix unit test #1395
* improve merge performance
* add uids_ for VectorIndex to improve search performance
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* fix error
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* update
* fix search
* fix record num
Signed-off-by: shengjun.li <shengjun.li@zilliz.com>
* refine code
* Add get_vector_ids test cases (#1407)
* fix order
* add wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix wal case
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix invalid operation issue
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix invalid operation issue
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix bug
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* fix bug
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* crud fix
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* crud fix
Signed-off-by: sahuang <xiaohaix@student.unimelb.edu.au>
* add table info test cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
Signed-off-by: JinHai-CN <hai.jin@zilliz.com>
* add to compact case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* add to compact case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* add to compact case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* add case and debug compact
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* test pdb
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* test pdb
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* test pdb
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix cases
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update table_info case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update table_info case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update table_info case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update get vector ids case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update get vector ids case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update get vector ids case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update get vector ids case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* update case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* pdb test
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* pdb test
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* add tests for get_vector_ids
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix case
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* add binary and ip
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix binary index
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* fix pdb
Signed-off-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
* #1408 fix search result incorrect after DeleteById
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* add one case
* delete failed segment
* update serialize
* update serialize
* fix case
Signed-off-by: zhenwu <zw@zilliz.com>
* update
* update case assertion
Signed-off-by: zhenwu <zw@zilliz.com>
* [skip ci] update config
* change bloom filter msync flag to async
* #1319 add more timing debug info
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* update
* add normalize
Signed-off-by: zhenwu <zw@zilliz.com>
* add normalize
Signed-off-by: zhenwu <zw@zilliz.com>
* add normalize
Signed-off-by: zhenwu <zw@zilliz.com>
* Fix compiling error
Signed-off-by: jinhai <hai.jin@zilliz.com>
* support ip (#1383)
* support ip
Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>
* IP result distance sort by descend
Signed-off-by: Nicky <nicky.xj.lin@gmail.com>
* update
Signed-off-by: Nicky <nicky.xj.lin@gmail.com>
* format
Signed-off-by: xiaojun.lin <xiaojun.lin@zilliz.com>
* get table lsn
* Remove unused third party
Signed-off-by: jinhai <hai.jin@zilliz.com>
* Refine code
Signed-off-by: jinhai <hai.jin@zilliz.com>
* #1319 fix clang format
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* fix wal applied lsn
Signed-off-by: shengjun.li <shengjun.li@zilliz.com>
* validate partition tag
* #1319 improve search performance
Signed-off-by: yudong.cai <yudong.cai@zilliz.com>
* build error
Co-authored-by: Zhiru Zhu <youny626@hotmail.com>
Co-authored-by: groot <yihua.mo@zilliz.com>
Co-authored-by: Xiaohai Xu <xiaohaix@student.unimelb.edu.au>
Co-authored-by: shengjh <46514371+shengjh@users.noreply.github.com>
Co-authored-by: del-zhenwu <56623710+del-zhenwu@users.noreply.github.com>
Co-authored-by: shengjun.li <49774184+shengjun1985@users.noreply.github.com>
Co-authored-by: Cai Yudong <yudong.cai@zilliz.com>
Co-authored-by: quicksilver <zhifeng.zhang@zilliz.com>
Co-authored-by: BossZou <40255591+BossZou@users.noreply.github.com>
Co-authored-by: jielinxu <52057195+jielinxu@users.noreply.github.com>
Co-authored-by: JackLCL <53512883+JackLCL@users.noreply.github.com>
Co-authored-by: Tinkerrr <linxiaojun.cn@outlook.com>
Co-authored-by: Lutkin Wang <yamasite@qq.com>
Co-authored-by: ABNER-1 <ABNER-1@users.noreply.github.com>
Co-authored-by: shiyu22 <cshiyu22@gmail.com>
commit dab74700b2 (parent 636f5c9cb6, pull/1195/head^2)
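Several commits above ("update apply deletes: switch to binary search", "add binary search_by_id") replace linear id scans with binary search over each segment's uid list. A minimal sketch of that lookup, assuming the uids are kept sorted (the helper below is illustrative, not the actual Milvus code):

#include <algorithm>
#include <cstdint>
#include <vector>

// Given a segment's uids sorted ascending, find the offset of `id`
// in O(log n) instead of a linear scan. Returns -1 when absent.
// Name and signature are assumptions for illustration only.
int64_t FindOffsetByUid(const std::vector<int64_t>& sorted_uids, int64_t id) {
    auto it = std::lower_bound(sorted_uids.begin(), sorted_uids.end(), id);
    if (it == sorted_uids.end() || *it != id) {
        return -1;
    }
    return it - sorted_uids.begin();
}

With n uids per segment this turns each apply-delete or search_by_id probe from O(n) into O(log n).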
@@ -22,9 +22,11 @@ Please mark all change in change log and use the issue from GitHub
- #1075 - improve error message when page size or offset is illegal
- #1082 - check page_size or offset value to avoid float
- #1115 - http server support load table into memory
- #1152 - Error log output continuously after server start
- #1211 - Server down caused by searching with index_type: HNSW
- #1240 - Update license declaration
- #1298 - Unittest failed when on CPU2GPU case
- #1359 - Negative distance value returned when searching with HNSW index type

## Feature
- #216 - Add CLI to get server info

@@ -39,6 +41,8 @@ Please mark all change in change log and use the issue from GitHub
- #823 - Support binary vector tanimoto/jaccard/hamming metric
- #853 - Support HNSW
- #910 - Change Milvus c++ standard to c++17
- #1204 - Add api to get table data information
- #1302 - Get all record IDs in a segment given a segment id

## Improvement
- #738 - Use Openblas / lapack from apt install

@@ -53,11 +57,14 @@ Please mark all change in change log and use the issue from GitHub
- #1002 - Rename minio to s3 in Storage Config section
- #1078 - Move 'insert_buffer_size' to Cache Config section
- #1105 - Error message is not clear when creating IVFSQ8H index without gpu resources
- #1297 - Hide partition_name parameter, avoid user directly accessing partition table
- #1310 - Add default partition tag for a table
- #740, #849, #878, #972, #1033, #1161, #1173, #1199, #1190, #1223, #1222, #1257, #1264, #1269, #1164, #1303, #1304, #1324, #1388 - Various fixes and improvements for Milvus documentation.
- #1234 - Do S3 server validation check when Milvus starts up
- #1263 - Allow system conf modifiable and some take effect directly
- #1320 - Remove debug logging from faiss

## Task
- #1327 - Exclude third-party code from codebeat
- #1331 - Exclude third-party code from codacy
@@ -6,7 +6,6 @@
| Name | License |
| ------------- | ------------------------------------------------------------ |
| Apache Arrow | [Apache License 2.0](https://github.com/apache/arrow/blob/master/LICENSE.txt) |
| Boost | [Boost Software License](https://github.com/boostorg/boost/blob/master/LICENSE_1_0.txt) |
| FAISS | [MIT](https://github.com/facebookresearch/faiss/blob/master/LICENSE) |
| Gtest | [BSD 3-Clause](https://github.com/google/googletest/blob/master/LICENSE) |
@@ -1,6 +1,7 @@
 timeout(time: 90, unit: 'MINUTES') {
     dir ("tests/milvus_python_test") {
-        sh 'python3 -m pip install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com'
+        // sh 'python3 -m pip install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com'
+        sh 'python3 -m pip install -r requirements.txt'
         sh "pytest . --alluredir=\"test_out/dev/single/sqlite\" --ip ${env.HELM_RELEASE_NAME}.milvus.svc.cluster.local"
     }
     // mysql database backend test
@@ -1,6 +1,7 @@
 timeout(time: 60, unit: 'MINUTES') {
     dir ("tests/milvus_python_test") {
-        sh 'python3 -m pip install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com'
+        // sh 'python3 -m pip install -r requirements.txt -i http://pypi.douban.com/simple --trusted-host pypi.douban.com'
+        sh 'python3 -m pip install -r requirements.txt'
         sh "pytest . --alluredir=\"test_out/dev/single/sqlite\" --level=1 --ip ${env.HELM_RELEASE_NAME}.milvus.svc.cluster.local"
     }
@ -90,7 +90,7 @@ if (MILVUS_VERSION_MAJOR STREQUAL ""
|
|||
OR MILVUS_VERSION_MINOR STREQUAL ""
|
||||
OR MILVUS_VERSION_PATCH STREQUAL "")
|
||||
message(WARNING "Failed to determine Milvus version from git branch name")
|
||||
set(MILVUS_VERSION "0.6.0")
|
||||
set(MILVUS_VERSION "0.7.0")
|
||||
endif ()
|
||||
|
||||
message(STATUS "Build version = ${MILVUS_VERSION}")
|
||||
|
|
|
@@ -106,9 +106,7 @@ metric_config:
 # | The sum of 'insert_buffer_size' and 'cpu_cache_capacity' | | |
 # | must be less than system memory size. | | |
 #----------------------+------------------------------------------------------------+------------+-----------------+
-# cache_insert_data | Whether to load inserted data into cache immediately for | Boolean | false |
-# | hot query. If want to simultaneously insert and query | | |
-# | vectors, it's recommended to enable this config. | | |
+# cache_insert_data | Whether to load data to cache for hot query | Boolean | false |
 #----------------------+------------------------------------------------------------+------------+-----------------+
 cache_config:
   cpu_cache_capacity: 4
@@ -46,9 +46,12 @@ server_config:
 # | loaded when Milvus server starts up. | | |
 # | '*' means preload all existing tables. | | |
 #----------------------+------------------------------------------------------------+------------+-----------------+
+# auto_flush_interval | Interval of auto flush. Unit is millisecond. | Integer | 1000 |
+#----------------------+------------------------------------------------------------+------------+-----------------+
 db_config:
   backend_url: sqlite://:@:/
   preload_table:
+  auto_flush_interval: 1000

 #----------------------+------------------------------------------------------------+------------+-----------------+
 # Storage Config | Description | Type | Default |
@@ -106,9 +109,7 @@ metric_config:
 # | The sum of 'insert_buffer_size' and 'cpu_cache_capacity' | | |
 # | must be less than system memory size. | | |
 #----------------------+------------------------------------------------------------+------------+-----------------+
-# cache_insert_data | Whether to load inserted data into cache immediately for | Boolean | false |
-# | hot query. If want to simultaneously insert and query | | |
-# | vectors, it's recommended to enable this config. | | |
+# cache_insert_data | Whether to load data to cache for hot query | Boolean | false |
 #----------------------+------------------------------------------------------------+------------+-----------------+
 cache_config:
   cpu_cache_capacity: 4
@@ -167,3 +168,24 @@ gpu_resource_config:
 #----------------------+------------------------------------------------------------+------------+-----------------+
 tracing_config:
   json_config_path:
+
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# Wal Config | Description | Type | Default |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# enable | Switch of function wal. | Boolean | false |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# recovery_error_ignore| Whether ignore the error which happens during wal recovery | Boolean | true |
+# | stage. | | |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# buffer_size | The size of the wal buffer. Unit is MB. | Integer | 256 |
+# | It should be in range [64, 4096]. If the value set out of | | |
+# | the range, the system will use the boundary value. | | |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# wal_path | The root path of wal relative files, include wal meta | String | NULL |
+# | files. | | |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+wal_config:
+  enable: true
+  recovery_error_ignore: true
+  buffer_size: 256 # MB
+  wal_path: /tmp/milvus/wal
@@ -106,9 +106,7 @@ metric_config:
 # | The sum of 'insert_buffer_size' and 'cpu_cache_capacity' | | |
 # | must be less than system memory size. | | |
 #----------------------+------------------------------------------------------------+------------+-----------------+
-# cache_insert_data | Whether to load inserted data into cache immediately for | Boolean | false |
-# | hot query. If want to simultaneously insert and query | | |
-# | vectors, it's recommended to enable this config. | | |
+# cache_insert_data | Whether to load data to cache for hot query | Boolean | false |
 #----------------------+------------------------------------------------------------+------------+-----------------+
 cache_config:
   cpu_cache_capacity: 4
@@ -167,3 +165,24 @@ gpu_resource_config:
 #----------------------+------------------------------------------------------------+------------+-----------------+
 tracing_config:
   json_config_path:
+
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# Wal Config | Description | Type | Default |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# enable | Switch of function wal. | Boolean | false |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# recovery_error_ignore| Whether ignore the error which happens during wal recovery | Boolean | true |
+# | stage. | | |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# buffer_size | The size of the wal buffer. Unit is MB. | Integer | 256 |
+# | It should be in range [64, 4096]. If the value set out of | | |
+# | the range, the system will use the boundary value. | | |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+# wal_path | The root path of wal relative files, include wal meta | String | NULL |
+# | files. | | |
+#----------------------+------------------------------------------------------------+------------+-----------------+
+wal_config:
+  enable: true
+  recovery_error_ignore: true
+  buffer_size: 256 # MB
+  wal_path: /tmp/milvus/wal
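The buffer_size rule documented above (values outside [64, 4096] MB fall back to the boundary) is a plain clamp. A minimal sketch of how a config loader could apply it; the helper name and types are illustrative assumptions, not the actual Milvus loader code:

#include <algorithm>
#include <cstdint>

// Clamp the configured WAL buffer size (in MB) into the documented
// [64, 4096] range; out-of-range values snap to the nearest boundary.
// Hypothetical helper for illustration only.
int64_t ClampWalBufferSizeMB(int64_t configured_mb) {
    constexpr int64_t kMinMB = 64;
    constexpr int64_t kMaxMB = 4096;
    return std::clamp(configured_mb, kMinMB, kMaxMB);
}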
@@ -36,6 +36,7 @@ aux_source_directory(${MILVUS_ENGINE_SRC}/db db_main_files)
 aux_source_directory(${MILVUS_ENGINE_SRC}/db/engine db_engine_files)
 aux_source_directory(${MILVUS_ENGINE_SRC}/db/insert db_insert_files)
 aux_source_directory(${MILVUS_ENGINE_SRC}/db/meta db_meta_files)
+aux_source_directory(${MILVUS_ENGINE_SRC}/db/wal db_wal_files)

 set(grpc_service_files
     ${MILVUS_ENGINE_SRC}/grpc/gen-milvus/milvus.grpc.pb.cc

@@ -65,9 +66,11 @@ set(scheduler_files
 aux_source_directory(${MILVUS_THIRDPARTY_SRC}/easyloggingpp thirdparty_easyloggingpp_files)
 aux_source_directory(${MILVUS_THIRDPARTY_SRC}/nlohmann thirdparty_nlohmann_files)
+aux_source_directory(${MILVUS_THIRDPARTY_SRC}/dablooms thirdparty_dablooms_files)
 set(thirdparty_files
     ${thirdparty_easyloggingpp_files}
     ${thirdparty_nlohmann_files}
+    ${thirdparty_dablooms_files}
 )

 aux_source_directory(${MILVUS_ENGINE_SRC}/server server_service_files)

@@ -113,10 +116,18 @@ set(storage_files
 )

 aux_source_directory(${MILVUS_ENGINE_SRC}/utils utils_files)

 aux_source_directory(${MILVUS_ENGINE_SRC}/wrapper wrapper_files)

 aux_source_directory(${MILVUS_ENGINE_SRC}/tracing tracing_files)
+
+aux_source_directory(${MILVUS_ENGINE_SRC}/codecs codecs_files)
+aux_source_directory(${MILVUS_ENGINE_SRC}/codecs/default codecs_default_files)
+
+aux_source_directory(${MILVUS_ENGINE_SRC}/segment segment_files)
+
+aux_source_directory(${MILVUS_ENGINE_SRC}/store store_files)

 set(engine_files
     ${CMAKE_CURRENT_SOURCE_DIR}/main.cpp
     ${cache_files}

@@ -124,11 +135,16 @@ set(engine_files
     ${db_engine_files}
     ${db_insert_files}
     ${db_meta_files}
+    ${db_wal_files}
     ${metrics_files}
     ${storage_files}
     ${thirdparty_files}
     ${utils_files}
     ${wrapper_files}
+    ${codecs_files}
+    ${codecs_default_files}
+    ${segment_files}
+    ${store_files}
 )

 if (MILVUS_WITH_PROMETHEUS)
@@ -0,0 +1,33 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#pragma once

namespace milvus {
namespace codec {

class AttrsFormat {
    // public:
    //  virtual Attrs
    //  read() = 0;
    //
    //  virtual void
    //  write(Attrs attrs) = 0;
};

}  // namespace codec
}  // namespace milvus
@@ -0,0 +1,33 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#pragma once

namespace milvus {
namespace codec {

class AttrsIndexFormat {
    // public:
    //  virtual AttrsIndex
    //  read() = 0;
    //
    //  virtual void
    //  write(AttrsIndex attrs_index) = 0;
};

}  // namespace codec
}  // namespace milvus
@@ -0,0 +1,60 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#pragma once

#include "AttrsFormat.h"
#include "AttrsIndexFormat.h"
#include "DeletedDocsFormat.h"
#include "IdBloomFilterFormat.h"
#include "IdIndexFormat.h"
#include "VectorsFormat.h"
#include "VectorsIndexFormat.h"

namespace milvus {
namespace codec {

class Codec {
 public:
    virtual VectorsFormatPtr
    GetVectorsFormat() = 0;

    virtual DeletedDocsFormatPtr
    GetDeletedDocsFormat() = 0;

    virtual IdBloomFilterFormatPtr
    GetIdBloomFilterFormat() = 0;

    // TODO(zhiru)
    /*
    virtual AttrsFormat
    GetAttrsFormat() = 0;

    virtual VectorsIndexFormat
    GetVectorsIndexFormat() = 0;

    virtual AttrsIndexFormat
    GetAttrsIndexFormat() = 0;

    virtual IdIndexFormat
    GetIdIndexFormat() = 0;
    */
};

}  // namespace codec
}  // namespace milvus
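The Codec class above is an abstract factory: a concrete codec hands out format objects that each know how to persist one piece of a segment. A minimal self-contained sketch of that pattern follows; the stand-in types, PlainVectorsFormat, and DefaultCodec are simplified assumptions for illustration, not the actual Milvus implementation:

#include <cstdint>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Simplified stand-in for the real milvus::segment vector payload (assumption).
struct Vectors {
    std::vector<uint8_t> data;
};
using VectorsPtr = std::shared_ptr<Vectors>;

// One "format" knows how to persist one piece of a segment.
class VectorsFormat {
 public:
    virtual ~VectorsFormat() = default;
    virtual void
    write(const std::string& directory, const VectorsPtr& vectors) = 0;
};
using VectorsFormatPtr = std::shared_ptr<VectorsFormat>;

// The codec is an abstract factory handing out format objects.
class Codec {
 public:
    virtual ~Codec() = default;
    virtual VectorsFormatPtr
    GetVectorsFormat() = 0;
};

// Hypothetical concrete format that would write vectors as a plain blob.
class PlainVectorsFormat : public VectorsFormat {
 public:
    void
    write(const std::string& directory, const VectorsPtr& vectors) override {
        std::cout << "write " << vectors->data.size() << " bytes under " << directory << "\n";
    }
};

class DefaultCodec : public Codec {
 public:
    VectorsFormatPtr
    GetVectorsFormat() override {
        return std::make_shared<PlainVectorsFormat>();
    }
};

int main() {
    DefaultCodec codec;
    auto vectors = std::make_shared<Vectors>(Vectors{{1, 2, 3, 4}});
    codec.GetVectorsFormat()->write("/tmp/segment_0", vectors);  // hypothetical path
    return 0;
}

Keeping the factory virtual lets the storage layer swap in a different on-disk layout without touching callers, which is presumably why the real interface returns format pointers rather than doing I/O itself.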
@@ -0,0 +1,40 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#pragma once

#include <memory>

#include "segment/DeletedDocs.h"
#include "store/Directory.h"

namespace milvus {
namespace codec {

class DeletedDocsFormat {
 public:
    virtual void
    read(const store::DirectoryPtr& directory_ptr, segment::DeletedDocsPtr& deleted_docs) = 0;

    virtual void
    write(const store::DirectoryPtr& directory_ptr, const segment::DeletedDocsPtr& deleted_docs) = 0;
};

using DeletedDocsFormatPtr = std::shared_ptr<DeletedDocsFormat>;

}  // namespace codec
}  // namespace milvus
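DeletedDocs is read back at query time so deleted offsets can be masked out of search results; the commit log above adds a faiss bitset test for exactly this path. A small sketch of turning a list of deleted offsets into a per-segment bitset (a simplified assumption, not the Milvus implementation):

#include <cstddef>
#include <cstdint>
#include <vector>

// Build a bitset where bit i == 1 means "doc at offset i is deleted".
// A search can then skip any candidate whose bit is set.
// Function name and types are illustrative assumptions.
std::vector<bool> BuildDeletedBitset(const std::vector<int32_t>& deleted_offsets, size_t segment_size) {
    std::vector<bool> bitset(segment_size, false);
    for (int32_t offset : deleted_offsets) {
        if (offset >= 0 && static_cast<size_t>(offset) < segment_size) {
            bitset[static_cast<size_t>(offset)] = true;
        }
    }
    return bitset;
}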
@@ -0,0 +1,43 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#pragma once

#include <memory>

#include "segment/IdBloomFilter.h"
#include "store/Directory.h"

namespace milvus {
namespace codec {

class IdBloomFilterFormat {
 public:
    virtual void
    read(const store::DirectoryPtr& directory_ptr, segment::IdBloomFilterPtr& id_bloom_filter_ptr) = 0;

    virtual void
    write(const store::DirectoryPtr& directory_ptr, const segment::IdBloomFilterPtr& id_bloom_filter_ptr) = 0;

    virtual void
    create(const store::DirectoryPtr& directory_ptr, segment::IdBloomFilterPtr& id_bloom_filter_ptr) = 0;
};

using IdBloomFilterFormatPtr = std::shared_ptr<IdBloomFilterFormat>;

}  // namespace codec
}  // namespace milvus
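The id bloom filter exists so that a DeleteById or get-by-id can cheaply skip segments that certainly do not contain an id, only falling back to an exact uid search on a "maybe". The real implementation is backed by dablooms (see the thirdparty CMake change above); the toy class below is an assumption-level sketch of the check pattern, not the Milvus code:

#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Toy id bloom filter: two hash probes into a bit array. False from
// MayContain means "id definitely absent" (skip the segment);
// true means "possibly present" (do the exact lookup).
class ToyIdBloomFilter {
 public:
    explicit ToyIdBloomFilter(size_t num_bits) : bits_(num_bits, false) {}

    void
    Add(int64_t id) {
        bits_[Probe1(id)] = true;
        bits_[Probe2(id)] = true;
    }

    bool
    MayContain(int64_t id) const {
        return bits_[Probe1(id)] && bits_[Probe2(id)];
    }

 private:
    size_t
    Probe1(int64_t id) const {
        return std::hash<uint64_t>{}(static_cast<uint64_t>(id)) % bits_.size();
    }
    size_t
    Probe2(int64_t id) const {
        return std::hash<uint64_t>{}(static_cast<uint64_t>(id) * 0x9E3779B97F4A7C15ULL) % bits_.size();
    }
    std::vector<bool> bits_;
};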
@ -0,0 +1,33 @@
|
|||
// Licensed to the Apache Software Foundation (ASF) under one
|
||||
// or more contributor license agreements. See the NOTICE file
|
||||
// distributed with this work for additional information
|
||||
// regarding copyright ownership. The ASF licenses this file
|
||||
// to you under the Apache License, Version 2.0 (the
|
||||
// "License"); you may not use this file except in compliance
|
||||
// with the License. You may obtain a copy of the License at
|
||||
//
|
||||
// http://www.apache.org/licenses/LICENSE-2.0
|
||||
//
|
||||
// Unless required by applicable law or agreed to in writing,
|
||||
// software distributed under the License is distributed on an
|
||||
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
// KIND, either express or implied. See the License for the
|
||||
// specific language governing permissions and limitations
|
||||
// under the License.
|
||||
|
||||
#pragma once
|
||||
|
||||
namespace milvus {
|
||||
namespace codec {
|
||||
|
||||
class IdIndexFormat {
|
||||
// public:
|
||||
// virtual IdIndex
|
||||
// read() = 0;
|
||||
//
|
||||
// virtual void
|
||||
// write(IdIndex id_index) = 0;
|
||||
};
|
||||
|
||||
} // namespace codec
|
||||
} // namespace milvus

@@ -0,0 +1,48 @@
// (Apache License 2.0 header, identical to the one above)

#pragma once

#include <memory>
#include <vector>

#include "segment/Vectors.h"
#include "store/Directory.h"

namespace milvus {
namespace codec {

class VectorsFormat {
 public:
    virtual void
    read(const store::DirectoryPtr& directory_ptr, segment::VectorsPtr& vectors_read) = 0;

    virtual void
    write(const store::DirectoryPtr& directory_ptr, const segment::VectorsPtr& vectors) = 0;

    virtual void
    read_uids(const store::DirectoryPtr& directory_ptr, std::vector<segment::doc_id_t>& uids) = 0;

    virtual void
    read_vectors(const store::DirectoryPtr& directory_ptr, off_t offset, size_t num_bytes,
                 std::vector<uint8_t>& raw_vectors) = 0;
};

using VectorsFormatPtr = std::shared_ptr<VectorsFormat>;

} // namespace codec
} // namespace milvus
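
read_vectors() exposes a byte-range read so a caller can fetch a single vector without loading the whole raw file. A hedged sketch of the offset arithmetic, assuming float32 vectors of dimension dim (the helper is illustrative, not part of the diff):

// Fetch the idx-th float32 vector of dimension dim without loading the whole file.
std::vector<uint8_t> FetchOne(const milvus::codec::VectorsFormatPtr& fmt,
                              const milvus::store::DirectoryPtr& dir, int64_t idx, int64_t dim) {
    const size_t vector_bytes = dim * sizeof(float);
    std::vector<uint8_t> raw;
    fmt->read_vectors(dir, /*offset=*/idx * vector_bytes, /*num_bytes=*/vector_bytes, raw);
    return raw;  // raw.size() == vector_bytes on success
}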

@@ -0,0 +1,33 @@
// (Apache License 2.0 header, identical to the one above)

#pragma once

namespace milvus {
namespace codec {

class VectorsIndexFormat {
    // public:
    //  virtual VectorsIndex
    //  read() = 0;
    //
    //  virtual void
    //  write(VectorsIndex vectors_index) = 0;
};

} // namespace codec
} // namespace milvus

@@ -0,0 +1,51 @@
// (Apache License 2.0 header, identical to the one above)

#include "codecs/default/DefaultCodec.h"

#include <memory>

#include "DefaultDeletedDocsFormat.h"
#include "DefaultIdBloomFilterFormat.h"
#include "DefaultVectorsFormat.h"

namespace milvus {
namespace codec {

DefaultCodec::DefaultCodec() {
    vectors_format_ptr_ = std::make_shared<DefaultVectorsFormat>();
    deleted_docs_format_ptr_ = std::make_shared<DefaultDeletedDocsFormat>();
    id_bloom_filter_format_ptr_ = std::make_shared<DefaultIdBloomFilterFormat>();
}

VectorsFormatPtr
DefaultCodec::GetVectorsFormat() {
    return vectors_format_ptr_;
}

DeletedDocsFormatPtr
DefaultCodec::GetDeletedDocsFormat() {
    return deleted_docs_format_ptr_;
}

IdBloomFilterFormatPtr
DefaultCodec::GetIdBloomFilterFormat() {
    return id_bloom_filter_format_ptr_;
}

} // namespace codec
} // namespace milvus
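
DefaultCodec is deliberately a trivial factory: it fixes one concrete format per artifact at construction time, so callers depend only on the abstract format interfaces. Usage in miniature:

milvus::codec::DefaultCodec codec;
auto vectors_format = codec.GetVectorsFormat();           // backed by DefaultVectorsFormat
auto deleted_docs_format = codec.GetDeletedDocsFormat();  // backed by DefaultDeletedDocsFormat
auto bloom_filter_format = codec.GetIdBloomFilterFormat();

Swapping a storage scheme then means swapping a single make_shared call in the constructor above.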

@@ -0,0 +1,45 @@
// (Apache License 2.0 header, identical to the one above)

#pragma once

#include "codecs/Codec.h"

namespace milvus {
namespace codec {

class DefaultCodec : public Codec {
 public:
    DefaultCodec();

    VectorsFormatPtr
    GetVectorsFormat() override;

    DeletedDocsFormatPtr
    GetDeletedDocsFormat() override;

    IdBloomFilterFormatPtr
    GetIdBloomFilterFormat() override;

 private:
    VectorsFormatPtr vectors_format_ptr_;
    DeletedDocsFormatPtr deleted_docs_format_ptr_;
    IdBloomFilterFormatPtr id_bloom_filter_format_ptr_;
};

} // namespace codec
} // namespace milvus

@@ -0,0 +1,73 @@
// (Apache License 2.0 header, identical to the one above)

#include "codecs/default/DefaultDeletedDocsFormat.h"

#include <boost/filesystem.hpp>
#include <memory>
#include <string>
#include <vector>

#include "segment/Types.h"
#include "utils/Exception.h"
#include "utils/Log.h"

namespace milvus {
namespace codec {

void
DefaultDeletedDocsFormat::read(const store::DirectoryPtr& directory_ptr, segment::DeletedDocsPtr& deleted_docs) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();
    const std::string del_file_path = dir_path + "/" + deleted_docs_filename_;
    FILE* del_file = fopen(del_file_path.c_str(), "rb");
    if (del_file == nullptr) {
        std::string err_msg = "Failed to open file: " + del_file_path;
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
    }

    auto file_size = boost::filesystem::file_size(boost::filesystem::path(del_file_path));
    auto deleted_docs_size = file_size / sizeof(segment::offset_t);
    std::vector<segment::offset_t> deleted_docs_list;
    deleted_docs_list.resize(deleted_docs_size);
    fread((void*)(deleted_docs_list.data()), sizeof(segment::offset_t), deleted_docs_size, del_file);
    deleted_docs = std::make_shared<segment::DeletedDocs>(deleted_docs_list);
    fclose(del_file);
}

void
DefaultDeletedDocsFormat::write(const store::DirectoryPtr& directory_ptr, const segment::DeletedDocsPtr& deleted_docs) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();
    const std::string del_file_path = dir_path + "/" + deleted_docs_filename_;
    FILE* del_file = fopen(del_file_path.c_str(), "ab");  // TODO(zhiru): append mode
    if (del_file == nullptr) {
        std::string err_msg = "Failed to open file: " + del_file_path;
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
    }

    auto deleted_docs_list = deleted_docs->GetDeletedDocs();
    fwrite((void*)(deleted_docs_list.data()), sizeof(segment::offset_t), deleted_docs->GetSize(), del_file);
    fclose(del_file);
}

} // namespace codec
} // namespace milvus
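
The deleted-docs file is a flat array of fixed-width segment::offset_t values with no header, which is why read() can recover the element count from the file size alone, and why write()'s append mode simply accumulates offsets across flushes. A hedged round-trip sketch (`directory` is an assumed store::DirectoryPtr obtained elsewhere):

milvus::codec::DefaultDeletedDocsFormat format;
auto docs = std::make_shared<milvus::segment::DeletedDocs>(
    std::vector<milvus::segment::offset_t>{3, 42});
format.write(directory, docs);   // "ab": appends to any offsets already on disk
milvus::segment::DeletedDocsPtr loaded;
format.read(directory, loaded);  // loaded holds every offset written so far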

@@ -0,0 +1,54 @@
// (Apache License 2.0 header, identical to the one above)

#pragma once

#include <mutex>
#include <string>

#include "codecs/DeletedDocsFormat.h"

namespace milvus {
namespace codec {

class DefaultDeletedDocsFormat : public DeletedDocsFormat {
 public:
    DefaultDeletedDocsFormat() = default;

    void
    read(const store::DirectoryPtr& directory_ptr, segment::DeletedDocsPtr& deleted_docs) override;

    void
    write(const store::DirectoryPtr& directory_ptr, const segment::DeletedDocsPtr& deleted_docs) override;

    // No copy and move
    DefaultDeletedDocsFormat(const DefaultDeletedDocsFormat&) = delete;
    DefaultDeletedDocsFormat(DefaultDeletedDocsFormat&&) = delete;

    DefaultDeletedDocsFormat&
    operator=(const DefaultDeletedDocsFormat&) = delete;
    DefaultDeletedDocsFormat&
    operator=(DefaultDeletedDocsFormat&&) = delete;

 private:
    std::mutex mutex_;

    const std::string deleted_docs_filename_ = "deleted_docs";
};

} // namespace codec
} // namespace milvus

@@ -0,0 +1,82 @@
// (Apache License 2.0 header, identical to the one above)

#include "codecs/default/DefaultIdBloomFilterFormat.h"

#include <cerrno>
#include <cstring>
#include <memory>
#include <string>

#include "utils/Exception.h"
#include "utils/Log.h"

namespace milvus {
namespace codec {

constexpr unsigned int bloom_filter_capacity = 500000;
constexpr double bloom_filter_error_rate = 0.01;

void
DefaultIdBloomFilterFormat::read(const store::DirectoryPtr& directory_ptr,
                                 segment::IdBloomFilterPtr& id_bloom_filter_ptr) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();
    const std::string bloom_filter_file_path = dir_path + "/" + bloom_filter_filename_;
    scaling_bloom_t* bloom_filter =
        new_scaling_bloom_from_file(bloom_filter_capacity, bloom_filter_error_rate, bloom_filter_file_path.c_str());
    if (bloom_filter == nullptr) {
        std::string err_msg =
            "Failed to read bloom filter from file: " + bloom_filter_file_path + ". " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_UNEXPECTED_ERROR, err_msg);
    }
    id_bloom_filter_ptr = std::make_shared<segment::IdBloomFilter>(bloom_filter);
}

void
DefaultIdBloomFilterFormat::write(const store::DirectoryPtr& directory_ptr,
                                  const segment::IdBloomFilterPtr& id_bloom_filter_ptr) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();
    const std::string bloom_filter_file_path = dir_path + "/" + bloom_filter_filename_;
    if (scaling_bloom_flush(id_bloom_filter_ptr->GetBloomFilter()) == -1) {
        std::string err_msg =
            "Failed to write bloom filter to file: " + bloom_filter_file_path + ". " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_UNEXPECTED_ERROR, err_msg);
    }
}

void
DefaultIdBloomFilterFormat::create(const store::DirectoryPtr& directory_ptr,
                                   segment::IdBloomFilterPtr& id_bloom_filter_ptr) {
    std::string dir_path = directory_ptr->GetDirPath();
    const std::string bloom_filter_file_path = dir_path + "/" + bloom_filter_filename_;
    scaling_bloom_t* bloom_filter =
        new_scaling_bloom(bloom_filter_capacity, bloom_filter_error_rate, bloom_filter_file_path.c_str());
    if (bloom_filter == nullptr) {
        std::string err_msg =
            "Failed to create bloom filter file: " + bloom_filter_file_path + ". " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_UNEXPECTED_ERROR, err_msg);
    }
    id_bloom_filter_ptr = std::make_shared<segment::IdBloomFilter>(bloom_filter);
}

} // namespace codec
} // namespace milvus
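
The three operations map onto the segment lifecycle: create() for a new segment, write() to flush after ids are added, read() on reopen. Note that write() flushes the filter's own backing file via scaling_bloom_flush(), so the directory argument only contributes to the error message. A hedged sketch, assuming IdBloomFilter exposes an Add(doc_id)-style method:

milvus::codec::DefaultIdBloomFilterFormat format;
milvus::segment::IdBloomFilterPtr bloom;
format.create(directory, bloom);  // new_scaling_bloom(): empty, file-backed filter
// ... bloom->Add(doc_id) for each inserted id (assumed API) ...
format.write(directory, bloom);   // scaling_bloom_flush(): persist to disk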

@@ -0,0 +1,59 @@
// (Apache License 2.0 header, identical to the one above)

#pragma once

#include <mutex>
#include <string>

#include "codecs/IdBloomFilterFormat.h"
#include "segment/IdBloomFilter.h"
#include "store/Directory.h"

namespace milvus {
namespace codec {

class DefaultIdBloomFilterFormat : public IdBloomFilterFormat {
 public:
    DefaultIdBloomFilterFormat() = default;

    void
    read(const store::DirectoryPtr& directory_ptr, segment::IdBloomFilterPtr& id_bloom_filter_ptr) override;

    void
    write(const store::DirectoryPtr& directory_ptr, const segment::IdBloomFilterPtr& id_bloom_filter_ptr) override;

    void
    create(const store::DirectoryPtr& directory_ptr, segment::IdBloomFilterPtr& id_bloom_filter_ptr) override;

    // No copy and move
    DefaultIdBloomFilterFormat(const DefaultIdBloomFilterFormat&) = delete;
    DefaultIdBloomFilterFormat(DefaultIdBloomFilterFormat&&) = delete;

    DefaultIdBloomFilterFormat&
    operator=(const DefaultIdBloomFilterFormat&) = delete;
    DefaultIdBloomFilterFormat&
    operator=(DefaultIdBloomFilterFormat&&) = delete;

 private:
    std::mutex mutex_;

    const std::string bloom_filter_filename_ = "bloom_filter";
};

} // namespace codec
} // namespace milvus

@@ -0,0 +1,263 @@
// (Apache License 2.0 header, identical to the one above)

#include "codecs/default/DefaultVectorsFormat.h"

#include <fcntl.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>

#include <boost/filesystem.hpp>

#include "utils/Exception.h"
#include "utils/Log.h"

namespace milvus {
namespace codec {

void
DefaultVectorsFormat::read(const store::DirectoryPtr& directory_ptr, segment::VectorsPtr& vectors_read) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();
    if (!boost::filesystem::is_directory(dir_path)) {
        std::string err_msg = "Directory: " + dir_path + " does not exist";
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_INVALID_ARGUMENT, err_msg);
    }

    boost::filesystem::path target_path(dir_path);
    typedef boost::filesystem::directory_iterator d_it;
    d_it it_end;
    d_it it(target_path);
    // for (auto& it : boost::filesystem::directory_iterator(dir_path)) {
    for (; it != it_end; ++it) {
        const auto& path = it->path();
        if (path.extension().string() == raw_vector_extension_) {
            int rv_fd = open(path.c_str(), O_RDONLY, 00664);
            if (rv_fd == -1) {
                std::string err_msg = "Failed to open file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
            }
            size_t num_bytes = boost::filesystem::file_size(path);
            std::vector<uint8_t> vector_list;
            vector_list.resize(num_bytes);
            if (::read(rv_fd, vector_list.data(), num_bytes) == -1) {
                std::string err_msg = "Failed to read from file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }

            vectors_read->AddData(vector_list);
            vectors_read->SetName(path.stem().string());

            if (::close(rv_fd) == -1) {
                std::string err_msg = "Failed to close file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }
        }
        if (path.extension().string() == user_id_extension_) {
            int uid_fd = open(path.c_str(), O_RDONLY, 00664);
            if (uid_fd == -1) {
                std::string err_msg = "Failed to open file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
            }
            auto file_size = boost::filesystem::file_size(path);
            auto count = file_size / sizeof(segment::doc_id_t);
            std::vector<segment::doc_id_t> uids;
            uids.resize(count);
            if (::read(uid_fd, uids.data(), file_size) == -1) {
                std::string err_msg = "Failed to read from file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }

            vectors_read->AddUids(uids);

            if (::close(uid_fd) == -1) {
                std::string err_msg = "Failed to close file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }
        }
    }
}

void
DefaultVectorsFormat::write(const store::DirectoryPtr& directory_ptr, const segment::VectorsPtr& vectors) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();

    const std::string rv_file_path = dir_path + "/" + vectors->GetName() + raw_vector_extension_;
    const std::string uid_file_path = dir_path + "/" + vectors->GetName() + user_id_extension_;

    /*
    FILE* rv_file = fopen(rv_file_path.c_str(), "wb");
    if (rv_file == nullptr) {
        std::string err_msg = "Failed to open file: " + rv_file_path;
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
    }

    fwrite((void*)(it.second->GetData()), sizeof(char), it.second->GetNumBytes(), rv_file);
    fclose(rv_file);

    FILE* uid_file = fopen(uid_file_path.c_str(), "wb");
    if (uid_file == nullptr) {
        std::string err_msg = "Failed to open file: " + uid_file_path;
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
    }

    fwrite((void*)(it.second->GetUids()), sizeof it.second->GetUids()[0], it.second->GetCount(), uid_file);
    fclose(uid_file);
    */

    int rv_fd = open(rv_file_path.c_str(), O_WRONLY | O_TRUNC | O_CREAT, 00664);
    if (rv_fd == -1) {
        std::string err_msg = "Failed to open file: " + rv_file_path + ", error: " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
    }
    int uid_fd = open(uid_file_path.c_str(), O_WRONLY | O_TRUNC | O_CREAT, 00664);
    if (uid_fd == -1) {
        std::string err_msg = "Failed to open file: " + uid_file_path + ", error: " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
    }

    if (::write(rv_fd, vectors->GetData().data(), vectors->GetData().size()) == -1) {
        std::string err_msg = "Failed to write to file: " + rv_file_path + ", error: " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_WRITE_ERROR, err_msg);
    }
    if (::close(rv_fd) == -1) {
        std::string err_msg = "Failed to close file: " + rv_file_path + ", error: " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_WRITE_ERROR, err_msg);
    }

    if (::write(uid_fd, vectors->GetUids().data(), sizeof(segment::doc_id_t) * vectors->GetCount()) == -1) {
        std::string err_msg = "Failed to write to file: " + uid_file_path + ", error: " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_WRITE_ERROR, err_msg);
    }
    if (::close(uid_fd) == -1) {
        std::string err_msg = "Failed to close file: " + uid_file_path + ", error: " + std::strerror(errno);
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_WRITE_ERROR, err_msg);
    }
}

void
DefaultVectorsFormat::read_uids(const store::DirectoryPtr& directory_ptr, std::vector<segment::doc_id_t>& uids) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();
    if (!boost::filesystem::is_directory(dir_path)) {
        std::string err_msg = "Directory: " + dir_path + " does not exist";
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_INVALID_ARGUMENT, err_msg);
    }

    boost::filesystem::path target_path(dir_path);
    typedef boost::filesystem::directory_iterator d_it;
    d_it it_end;
    d_it it(target_path);
    // for (auto& it : boost::filesystem::directory_iterator(dir_path)) {
    for (; it != it_end; ++it) {
        const auto& path = it->path();
        if (path.extension().string() == user_id_extension_) {
            int uid_fd = open(path.c_str(), O_RDONLY, 00664);
            if (uid_fd == -1) {
                std::string err_msg = "Failed to open file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
            }
            auto file_size = boost::filesystem::file_size(path);
            auto count = file_size / sizeof(segment::doc_id_t);
            uids.resize(count);
            if (::read(uid_fd, uids.data(), file_size) == -1) {
                std::string err_msg = "Failed to read from file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }
            if (::close(uid_fd) == -1) {
                std::string err_msg = "Failed to close file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }
        }
    }
}

void
DefaultVectorsFormat::read_vectors(const store::DirectoryPtr& directory_ptr, off_t offset, size_t num_bytes,
                                   std::vector<uint8_t>& raw_vectors) {
    const std::lock_guard<std::mutex> lock(mutex_);

    std::string dir_path = directory_ptr->GetDirPath();
    if (!boost::filesystem::is_directory(dir_path)) {
        std::string err_msg = "Directory: " + dir_path + " does not exist";
        ENGINE_LOG_ERROR << err_msg;
        throw Exception(SERVER_INVALID_ARGUMENT, err_msg);
    }

    boost::filesystem::path target_path(dir_path);
    typedef boost::filesystem::directory_iterator d_it;
    d_it it_end;
    d_it it(target_path);
    // for (auto& it : boost::filesystem::directory_iterator(dir_path)) {
    for (; it != it_end; ++it) {
        const auto& path = it->path();
        if (path.extension().string() == raw_vector_extension_) {
            int rv_fd = open(path.c_str(), O_RDONLY, 00664);
            if (rv_fd == -1) {
                std::string err_msg = "Failed to open file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_CANNOT_CREATE_FILE, err_msg);
            }
            off_t off = lseek(rv_fd, offset, SEEK_SET);
            if (off == -1) {
                std::string err_msg = "Failed to seek file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }

            raw_vectors.resize(num_bytes);

            if (::read(rv_fd, raw_vectors.data(), num_bytes) == -1) {
                std::string err_msg = "Failed to read from file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }

            if (::close(rv_fd) == -1) {
                std::string err_msg = "Failed to close file: " + path.string() + ", error: " + std::strerror(errno);
                ENGINE_LOG_ERROR << err_msg;
                throw Exception(SERVER_WRITE_ERROR, err_msg);
            }
        }
    }
}

} // namespace codec
} // namespace milvus
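
read() stitches a segment back together from paired files: every .rv file contributes raw vector bytes and its stem becomes the vectors' name, while the matching .uid file supplies one doc_id_t per vector in the same order. A hedged consistency check for such a pair (float32 vectors of dimension dim assumed; the helper is illustrative):

// A .uid/.rv pair is consistent when every uid owns exactly one vector.
bool PairIsConsistent(size_t rv_bytes, size_t uid_bytes, int64_t dim) {
    size_t count = uid_bytes / sizeof(milvus::segment::doc_id_t);
    return rv_bytes == count * dim * sizeof(float);
}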

@@ -0,0 +1,64 @@
// (Apache License 2.0 header, identical to the one above)

#pragma once

#include <mutex>
#include <string>
#include <vector>

#include "codecs/VectorsFormat.h"
#include "segment/Vectors.h"

namespace milvus {
namespace codec {

class DefaultVectorsFormat : public VectorsFormat {
 public:
    DefaultVectorsFormat() = default;

    void
    read(const store::DirectoryPtr& directory_ptr, segment::VectorsPtr& vectors_read) override;

    void
    write(const store::DirectoryPtr& directory_ptr, const segment::VectorsPtr& vectors) override;

    void
    read_uids(const store::DirectoryPtr& directory_ptr, std::vector<segment::doc_id_t>& uids) override;

    void
    read_vectors(const store::DirectoryPtr& directory_ptr, off_t offset, size_t num_bytes,
                 std::vector<uint8_t>& raw_vectors) override;

    // No copy and move
    DefaultVectorsFormat(const DefaultVectorsFormat&) = delete;
    DefaultVectorsFormat(DefaultVectorsFormat&&) = delete;

    DefaultVectorsFormat&
    operator=(const DefaultVectorsFormat&) = delete;
    DefaultVectorsFormat&
    operator=(DefaultVectorsFormat&&) = delete;

 private:
    std::mutex mutex_;

    const std::string raw_vector_extension_ = ".rv";
    const std::string user_id_extension_ = ".uid";
};

} // namespace codec
} // namespace milvus

@@ -44,7 +44,7 @@ class DB {
    CreateTable(meta::TableSchema& table_schema_) = 0;

    virtual Status
-   DropTable(const std::string& table_id, const meta::DatesT& dates) = 0;
+   DropTable(const std::string& table_id) = 0;

    virtual Status
    DescribeTable(meta::TableSchema& table_schema_) = 0;

@@ -52,9 +52,15 @@
    virtual Status
    HasTable(const std::string& table_id, bool& has_or_not_) = 0;

    virtual Status
    HasNativeTable(const std::string& table_id, bool& has_or_not_) = 0;

    virtual Status
    AllTables(std::vector<meta::TableSchema>& table_schema_array) = 0;

    virtual Status
    GetTableInfo(const std::string& table_id, TableInfo& table_info) = 0;

    virtual Status
    GetTableRowCount(const std::string& table_id, uint64_t& row_count) = 0;

@@ -80,20 +86,44 @@
    virtual Status
    InsertVectors(const std::string& table_id, const std::string& partition_tag, VectorsData& vectors) = 0;

    virtual Status
    DeleteVector(const std::string& table_id, IDNumber vector_id) = 0;

    virtual Status
    DeleteVectors(const std::string& table_id, IDNumbers vector_ids) = 0;

    virtual Status
    Flush(const std::string& table_id) = 0;

    virtual Status
    Flush() = 0;

    virtual Status
    Compact(const std::string& table_id) = 0;

    virtual Status
    GetVectorByID(const std::string& table_id, const IDNumber& vector_id, VectorsData& vector) = 0;

    virtual Status
    GetVectorIDs(const std::string& table_id, const std::string& segment_id, IDNumbers& vector_ids) = 0;

    //    virtual Status
    //    Merge(const std::set<std::string>& table_ids) = 0;

    virtual Status
    QueryByID(const std::shared_ptr<server::Context>& context, const std::string& table_id,
              const std::vector<std::string>& partition_tags, uint64_t k, uint64_t nprobe, IDNumber vector_id,
              ResultIds& result_ids, ResultDistances& result_distances) = 0;

    virtual Status
    Query(const std::shared_ptr<server::Context>& context, const std::string& table_id,
          const std::vector<std::string>& partition_tags, uint64_t k, uint64_t nprobe, const VectorsData& vectors,
          ResultIds& result_ids, ResultDistances& result_distances) = 0;

    virtual Status
    Query(const std::shared_ptr<server::Context>& context, const std::string& table_id,
          const std::vector<std::string>& partition_tags, uint64_t k, uint64_t nprobe, const VectorsData& vectors,
          const meta::DatesT& dates, ResultIds& result_ids, ResultDistances& result_distances) = 0;

    virtual Status
    QueryByFileID(const std::shared_ptr<server::Context>& context, const std::string& table_id,
                  const std::vector<std::string>& file_ids, uint64_t k, uint64_t nprobe, const VectorsData& vectors,
-                 const meta::DatesT& dates, ResultIds& result_ids, ResultDistances& result_distances) = 0;
+                 ResultIds& result_ids, ResultDistances& result_distances) = 0;

    virtual Status
    Size(uint64_t& result) = 0;
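
Taken together, the new pure virtuals give DB a delete-capable CRUD surface. A hedged sketch of the intended flow (db is an assumed DB pointer; real code must check each returned Status):

db->DeleteVector("my_table", some_id);           // recorded through the WAL
db->Flush("my_table");                           // make the delete durable and visible
db->Compact("my_table");                         // reclaim space held by deleted entries
milvus::engine::VectorsData vector;
db->GetVectorByID("my_table", some_id, vector);  // deleted ids no longer return data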

File diff suppressed because it is too large

@@ -28,6 +28,7 @@
#include "db/Types.h"
#include "db/insert/MemManager.h"
#include "utils/ThreadPool.h"
#include "wal/WalManager.h"

namespace milvus {
namespace engine {

@@ -52,7 +53,7 @@ class DBImpl : public DB {
    CreateTable(meta::TableSchema& table_schema) override;

    Status
-   DropTable(const std::string& table_id, const meta::DatesT& dates) override;
+   DropTable(const std::string& table_id) override;

    Status
    DescribeTable(meta::TableSchema& table_schema) override;

@@ -60,9 +61,15 @@ class DBImpl : public DB {
    Status
    HasTable(const std::string& table_id, bool& has_or_not) override;

    Status
    HasNativeTable(const std::string& table_id, bool& has_or_not_) override;

    Status
    AllTables(std::vector<meta::TableSchema>& table_schema_array) override;

    Status
    GetTableInfo(const std::string& table_id, TableInfo& table_info) override;

    Status
    PreloadTable(const std::string& table_id) override;

@@ -88,6 +95,30 @@ class DBImpl : public DB {
    Status
    InsertVectors(const std::string& table_id, const std::string& partition_tag, VectorsData& vectors) override;

    Status
    DeleteVector(const std::string& table_id, IDNumber vector_id) override;

    Status
    DeleteVectors(const std::string& table_id, IDNumbers vector_ids) override;

    Status
    Flush(const std::string& table_id) override;

    Status
    Flush() override;

    Status
    Compact(const std::string& table_id) override;

    Status
    GetVectorByID(const std::string& table_id, const IDNumber& vector_id, VectorsData& vector) override;

    Status
    GetVectorIDs(const std::string& table_id, const std::string& segment_id, IDNumbers& vector_ids) override;

    //    Status
    //    Merge(const std::set<std::string>& table_ids) override;

    Status
    CreateIndex(const std::string& table_id, const TableIndex& index) override;

@@ -97,20 +128,20 @@ class DBImpl : public DB {
    Status
    DropIndex(const std::string& table_id) override;

    Status
    QueryByID(const std::shared_ptr<server::Context>& context, const std::string& table_id,
              const std::vector<std::string>& partition_tags, uint64_t k, uint64_t nprobe, IDNumber vector_id,
              ResultIds& result_ids, ResultDistances& result_distances) override;

    Status
    Query(const std::shared_ptr<server::Context>& context, const std::string& table_id,
          const std::vector<std::string>& partition_tags, uint64_t k, uint64_t nprobe, const VectorsData& vectors,
          ResultIds& result_ids, ResultDistances& result_distances) override;

    Status
    Query(const std::shared_ptr<server::Context>& context, const std::string& table_id,
          const std::vector<std::string>& partition_tags, uint64_t k, uint64_t nprobe, const VectorsData& vectors,
          const meta::DatesT& dates, ResultIds& result_ids, ResultDistances& result_distances) override;

    Status
    QueryByFileID(const std::shared_ptr<server::Context>& context, const std::string& table_id,
                  const std::vector<std::string>& file_ids, uint64_t k, uint64_t nprobe, const VectorsData& vectors,
-                 const meta::DatesT& dates, ResultIds& result_ids, ResultDistances& result_distances) override;
+                 ResultIds& result_ids, ResultDistances& result_distances) override;

    Status
    Size(uint64_t& result) override;

@@ -121,6 +152,10 @@ class DBImpl : public DB {
                const meta::TableFilesSchema& files, uint64_t k, uint64_t nprobe, const VectorsData& vectors,
                ResultIds& result_ids, ResultDistances& result_distances);

    Status
    GetVectorByIdHelper(const std::string& table_id, IDNumber vector_id, VectorsData& vector,
                        const meta::TableFilesSchema& files);

    void
    BackgroundTimerTask();
    void

@@ -132,36 +167,44 @@ class DBImpl : public DB {
    StartMetricTask();

    void
-   StartCompactionTask();
+   StartMergeTask();

    Status
-   MergeFiles(const std::string& table_id, const meta::DateT& date, const meta::TableFilesSchema& files);
+   MergeFiles(const std::string& table_id, const meta::TableFilesSchema& files);
    Status
    BackgroundMergeFiles(const std::string& table_id);
    void
-   BackgroundCompaction(std::set<std::string> table_ids);
+   BackgroundMerge(std::set<std::string> table_ids);

    void
    StartBuildIndexTask(bool force = false);
    void
    BackgroundBuildIndex();

    Status
    CompactFile(const std::string& table_id, const milvus::engine::meta::TableFileSchema& file);

    /*
    Status
    SyncMemData(std::set<std::string>& sync_table_ids);
    */

    Status
    GetFilesToBuildIndex(const std::string& table_id, const std::vector<int>& file_types,
                         meta::TableFilesSchema& files);

    Status
-   GetFilesToSearch(const std::string& table_id, const std::vector<size_t>& file_ids, const meta::DatesT& dates,
-                    meta::TableFilesSchema& files);
+   GetFilesToSearch(const std::string& table_id, const std::vector<size_t>& file_ids, meta::TableFilesSchema& files);

    Status
    GetPartitionByTag(const std::string& table_id, const std::string& partition_tag, std::string& partition_name);

    Status
    GetPartitionsByTags(const std::string& table_id, const std::vector<std::string>& partition_tags,
                        std::set<std::string>& partition_name_array);

    Status
-   DropTableRecursively(const std::string& table_id, const meta::DatesT& dates);
+   DropTableRecursively(const std::string& table_id);

    Status
    UpdateTableIndexRecursively(const std::string& table_id, const TableIndex& index);

@@ -175,6 +218,12 @@ class DBImpl : public DB {
    Status
    GetTableRowCountRecursively(const std::string& table_id, uint64_t& row_count);

    Status
    ExecWalRecord(const wal::MXLogRecord& record);

    void
    BackgroundWalTask();

 private:
    const DBOptions options_;

@@ -184,12 +233,49 @@ class DBImpl : public DB {

    meta::MetaPtr meta_ptr_;
    MemManagerPtr mem_mgr_;
    std::mutex mem_serialize_mutex_;

    ThreadPool compact_thread_pool_;
    std::mutex compact_result_mutex_;
    std::list<std::future<void>> compact_thread_results_;
    std::set<std::string> compact_table_ids_;
    std::shared_ptr<wal::WalManager> wal_mgr_;
    std::thread bg_wal_thread_;

    struct SimpleWaitNotify {
        bool notified_ = false;
        std::mutex mutex_;
        std::condition_variable cv_;

        void
        Wait() {
            std::unique_lock<std::mutex> lck(mutex_);
            if (!notified_) {
                cv_.wait(lck);
            }
            notified_ = false;
        }

        void
        Wait_Until(const std::chrono::system_clock::time_point& tm_pint) {
            std::unique_lock<std::mutex> lck(mutex_);
            if (!notified_) {
                cv_.wait_until(lck, tm_pint);
            }
            notified_ = false;
        }

        void
        Notify() {
            std::unique_lock<std::mutex> lck(mutex_);
            notified_ = true;
            lck.unlock();
            cv_.notify_one();
        }
    };

    SimpleWaitNotify wal_task_swn_;
    SimpleWaitNotify flush_task_swn_;

    ThreadPool merge_thread_pool_;
    std::mutex merge_result_mutex_;
    std::list<std::future<void>> merge_thread_results_;
    std::set<std::string> merge_table_ids_;

    ThreadPool index_thread_pool_;
    std::mutex index_result_mutex_;

@@ -198,7 +284,8 @@ class DBImpl : public DB {
    std::mutex build_index_mutex_;

    IndexFailedChecker index_failed_checker_;
    OngoingFileChecker ongoing_files_checker_;

    std::mutex flush_merge_compact_mutex_;
}; // DBImpl

} // namespace engine
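
SimpleWaitNotify (the private struct inside DBImpl above) is a one-shot latch for the background WAL/flush threads: notified_ absorbs a Notify() that arrives before the wait, so a wakeup is never lost, though with a plain `if` it is sized for a single waiter. A usage sketch (hedged; shown standalone even though the struct is private to DBImpl, and `running` is an assumed stop flag):

SimpleWaitNotify swn;
std::thread worker([&] {
    while (running) {
        swn.Wait();        // returns immediately if a Notify() already arrived
        // ... drain pending WAL records ...
    }
});
swn.Notify();              // producer: wake the worker (or pre-arm the next Wait)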

@@ -17,6 +17,12 @@
namespace milvus {
namespace engine {

OngoingFileChecker&
OngoingFileChecker::GetInstance() {
    static OngoingFileChecker instance;
    return instance;
}

Status
OngoingFileChecker::MarkOngoingFile(const meta::TableFileSchema& table_file) {
    std::lock_guard<std::mutex> lck(mutex_);

@@ -23,8 +23,11 @@
namespace milvus {
namespace engine {

-class OngoingFileChecker : public meta::Meta::CleanUpFilter {
+class OngoingFileChecker {
 public:
    static OngoingFileChecker&
    GetInstance();

    Status
    MarkOngoingFile(const meta::TableFileSchema& table_file);

@@ -38,7 +41,7 @@ class OngoingFileChecker : public meta::Meta::CleanUpFilter {
    UnmarkOngoingFiles(const meta::TableFilesSchema& table_files);

    bool
-   IsIgnored(const meta::TableFileSchema& schema) override;
+   IsIgnored(const meta::TableFileSchema& schema);

 private:
    Status

@@ -70,6 +70,14 @@ struct DBOptions {

    size_t insert_buffer_size_ = 4 * ONE_GB;
    bool insert_cache_immediately_ = false;

    int auto_flush_interval_ = 1000;

    // WAL related configurations
    bool wal_enable_ = true;
    bool recovery_error_ignore_ = true;
    uint32_t buffer_size_ = 256;
    std::string mxlog_path_ = "/tmp/milvus/wal/";
}; // Options

} // namespace engine
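
A hedged example of wiring the new WAL knobs (field meanings inferred from the names; the buffer_size_ unit is an assumption):

milvus::engine::DBOptions options;
options.wal_enable_ = true;               // route inserts/deletes through the WAL
options.recovery_error_ignore_ = false;   // fail startup on unreadable WAL records
options.buffer_size_ = 128;               // WAL buffer size (unit assumed to be MB)
options.mxlog_path_ = "/var/lib/milvus/wal/";  // illustrative path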

@@ -11,20 +11,22 @@

#pragma once

#include "db/engine/ExecutionEngine.h"

#include <faiss/Index.h>
#include <stdint.h>

#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

#include "db/engine/ExecutionEngine.h"
#include "segment/Types.h"

namespace milvus {
namespace engine {

-typedef int64_t IDNumber;
+typedef segment::doc_id_t IDNumber;
typedef IDNumber* IDNumberPtr;
typedef std::vector<IDNumber> IDNumbers;

@@ -49,5 +51,23 @@ using Table2FileErr = std::map<std::string, File2ErrArray>;
using File2RefCount = std::map<std::string, int64_t>;
using Table2FileRef = std::map<std::string, File2RefCount>;

struct SegmentStat {
    std::string name_;
    int64_t row_count_ = 0;
    std::string index_name_;
    int64_t data_size_ = 0;
};

struct PartitionStat {
    std::string tag_;
    std::vector<SegmentStat> segments_stat_;
};

struct TableInfo {
    std::vector<PartitionStat> partitions_stat_;
};

static const char* DEFAULT_PARTITON_TAG = "_default";

} // namespace engine
} // namespace milvus
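
These structs are the payload of the new GetTableInfo() call: a table aggregates partitions, which aggregate per-segment stats. A construction sketch with illustrative numbers:

milvus::engine::SegmentStat seg;
seg.name_ = "segment_0";
seg.row_count_ = 10000;
seg.index_name_ = "IDMAP";
seg.data_size_ = 10000 * 128 * sizeof(float);  // rows * dim * element size

milvus::engine::PartitionStat part;
part.tag_ = milvus::engine::DEFAULT_PARTITON_TAG;  // "_default" (spelling as in source)
part.segments_stat_.push_back(seg);

milvus::engine::TableInfo info;
info.partitions_stat_.push_back(part);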

@@ -10,10 +10,6 @@
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/Utils.h"
#include "server/Config.h"
#include "storage/s3/S3ClientWrapper.h"
#include "utils/CommonUtil.h"
#include "utils/Log.h"

#include <fiu-local.h>
#include <boost/filesystem.hpp>

@@ -22,6 +18,11 @@
#include <regex>
#include <vector>

#include "server/Config.h"
#include "storage/s3/S3ClientWrapper.h"
#include "utils/CommonUtil.h"
#include "utils/Log.h"

namespace milvus {
namespace engine {
namespace utils {

@@ -36,7 +37,7 @@ std::mutex index_file_counter_mutex;
static std::string
ConstructParentFolder(const std::string& db_path, const meta::TableFileSchema& table_file) {
    std::string table_path = db_path + TABLES_FOLDER + table_file.table_id_;
-   std::string partition_path = table_path + "/" + std::to_string(table_file.date_);
+   std::string partition_path = table_path + "/" + table_file.segment_id_;
    return partition_path;
}
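
The effect of the one-line change above, illustrated (paths hypothetical):

// before: <db_path>/tables/<table_id>/<date>/<file_id>        e.g. /db/tables/t1/20200226/14258
// after:  <db_path>/tables/<table_id>/<segment_id>/<file_id>  one folder per segment, so a whole
//                                                             segment can be dropped with remove_all()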

@@ -163,7 +164,7 @@ GetTableFilePath(const DBMetaOptions& options, meta::TableFileSchema& table_file
        return Status::OK();
    }

-   if (boost::filesystem::exists(file_path)) {
+   if (boost::filesystem::exists(parent_path)) {
        table_file.location_ = file_path;
        return Status::OK();
    }

@@ -171,7 +172,7 @@ GetTableFilePath(const DBMetaOptions& options, meta::TableFileSchema& table_file
    for (auto& path : options.slave_paths_) {
        parent_path = ConstructParentFolder(path, table_file);
        file_path = parent_path + "/" + table_file.file_id_;
-       if (boost::filesystem::exists(file_path)) {
+       if (boost::filesystem::exists(parent_path)) {
            table_file.location_ = file_path;
            return Status::OK();
        }

@@ -192,6 +193,22 @@ DeleteTableFilePath(const DBMetaOptions& options, meta::TableFileSchema& table_f
    return Status::OK();
}

Status
DeleteSegment(const DBMetaOptions& options, meta::TableFileSchema& table_file) {
    utils::GetTableFilePath(options, table_file);
    std::string segment_dir;
    GetParentPath(table_file.location_, segment_dir);
    boost::filesystem::remove_all(segment_dir);
    return Status::OK();
}

Status
GetParentPath(const std::string& path, std::string& parent_path) {
    boost::filesystem::path p(path);
    parent_path = p.parent_path().string();
    return Status::OK();
}

bool
IsSameIndex(const TableIndex& index1, const TableIndex& index2) {
    return index1.engine_type_ == index2.engine_type_ && index1.nlist_ == index2.nlist_ &&

@@ -11,13 +11,13 @@

#pragma once

#include <ctime>
#include <string>

#include "Options.h"
#include "db/Types.h"
#include "db/meta/MetaTypes.h"

#include <ctime>
#include <string>

namespace milvus {
namespace engine {
namespace utils {

@@ -36,6 +36,11 @@ Status
GetTableFilePath(const DBMetaOptions& options, meta::TableFileSchema& table_file);
Status
DeleteTableFilePath(const DBMetaOptions& options, meta::TableFileSchema& table_file);
Status
DeleteSegment(const DBMetaOptions& options, meta::TableFileSchema& table_file);

Status
GetParentPath(const std::string& path, std::string& parent_path);

bool
IsSameIndex(const TableIndex& index1, const TableIndex& index2);

@@ -84,8 +84,14 @@ class ExecutionEngine {
    // virtual std::shared_ptr<ExecutionEngine>
    // Clone() = 0;

    // virtual Status
    // Merge(const std::string& location) = 0;

    virtual Status
-   Merge(const std::string& location) = 0;
+   GetVectorByID(const int64_t& id, float* vector, bool hybrid) = 0;

    virtual Status
    GetVectorByID(const int64_t& id, uint8_t* vector, bool hybrid) = 0;

    virtual Status
    Search(int64_t n, const float* data, int64_t k, int64_t nprobe, float* distances, int64_t* labels, bool hybrid) = 0;

@@ -94,6 +100,10 @@ class ExecutionEngine {
    Search(int64_t n, const uint8_t* data, int64_t k, int64_t nprobe, float* distances, int64_t* labels,
           bool hybrid) = 0;

    virtual Status
    Search(int64_t n, const std::vector<int64_t>& ids, int64_t k, int64_t nprobe, float* distances, int64_t* labels,
           bool hybrid) = 0;

    virtual std::shared_ptr<ExecutionEngine>
    BuildIndex(const std::string& location, EngineType engine_type) = 0;
@ -11,13 +11,16 @@
|
|||
|
||||
#include "db/engine/ExecutionEngineImpl.h"
|
||||
|
||||
#include <faiss/utils/ConcurrentBitset.h>
|
||||
#include <fiu-local.h>
|
||||
|
||||
#include <stdexcept>
|
||||
#include <utility>
|
||||
#include <vector>
|
||||
|
||||
#include "cache/CpuCacheMgr.h"
|
||||
#include "cache/GpuCacheMgr.h"
|
||||
#include "db/Utils.h"
|
||||
#include "knowhere/common/Config.h"
|
||||
#include "metrics/Metrics.h"
|
||||
#include "scheduler/Utils.h"
|
||||
|
@ -25,8 +28,8 @@
|
|||
#include "utils/CommonUtil.h"
|
||||
#include "utils/Exception.h"
|
||||
#include "utils/Log.h"
|
||||
#include "utils/TimeRecorder.h"
|
||||
#include "utils/ValidationUtil.h"
|
||||
|
||||
#include "wrapper/BinVecImpl.h"
|
||||
#include "wrapper/ConfAdapter.h"
|
||||
#include "wrapper/ConfAdapterMgr.h"
|
||||
|
@ -356,6 +359,7 @@ ExecutionEngineImpl::Serialize() {
|
|||
return status;
|
||||
}
|
||||
|
||||
/*
|
||||
Status
|
||||
ExecutionEngineImpl::Load(bool to_cache) {
|
||||
index_ = std::static_pointer_cast<VecIndex>(cache::CpuCacheMgr::GetInstance()->GetIndex(location_));
|
||||
|
@ -383,6 +387,134 @@ ExecutionEngineImpl::Load(bool to_cache) {
|
|||
}
|
||||
return Status::OK();
|
||||
}
|
||||
*/
|
||||
|
||||
Status
ExecutionEngineImpl::Load(bool to_cache) {
    // TODO(zhiru): refactor

    index_ = std::static_pointer_cast<VecIndex>(cache::CpuCacheMgr::GetInstance()->GetIndex(location_));
    bool already_in_cache = (index_ != nullptr);
    if (!already_in_cache) {
        std::string segment_dir;
        utils::GetParentPath(location_, segment_dir);
        auto segment_reader_ptr = std::make_shared<segment::SegmentReader>(segment_dir);

        if (index_type_ == EngineType::FAISS_IDMAP || index_type_ == EngineType::FAISS_BIN_IDMAP) {
            index_ = index_type_ == EngineType::FAISS_IDMAP ? GetVecIndexFactory(IndexType::FAISS_IDMAP)
                                                            : GetVecIndexFactory(IndexType::FAISS_BIN_IDMAP);

            TempMetaConf temp_conf;
            temp_conf.gpu_id = gpu_num_;
            temp_conf.dim = dim_;
            auto status = MappingMetricType(metric_type_, temp_conf.metric_type);
            if (!status.ok()) {
                return status;
            }

            auto adapter = AdapterMgr::GetInstance().GetAdapter(index_->GetType());
            auto conf = adapter->Match(temp_conf);

            status = segment_reader_ptr->Load();
            if (!status.ok()) {
                std::string msg = "Failed to load segment from " + location_;
                ENGINE_LOG_ERROR << msg;
                return Status(DB_ERROR, msg);
            }

            segment::SegmentPtr segment_ptr;
            segment_reader_ptr->GetSegment(segment_ptr);
            auto& vectors = segment_ptr->vectors_ptr_;
            auto& deleted_docs = segment_ptr->deleted_docs_ptr_->GetDeletedDocs();

            auto vectors_uids = vectors->GetUids();
            index_->SetUids(vectors_uids);

            auto vectors_data = vectors->GetData();

            faiss::ConcurrentBitsetPtr concurrent_bitset_ptr =
                std::make_shared<faiss::ConcurrentBitset>(vectors->GetCount());
            for (auto& offset : deleted_docs) {
                if (!concurrent_bitset_ptr->test(offset)) {
                    concurrent_bitset_ptr->set(offset);
                }
            }

            ErrorCode ec = KNOWHERE_UNEXPECTED_ERROR;
            if (index_type_ == EngineType::FAISS_IDMAP) {
                std::vector<float> float_vectors;
                float_vectors.resize(vectors_data.size() / sizeof(float));
                memcpy(float_vectors.data(), vectors_data.data(), vectors_data.size());
                ec = std::static_pointer_cast<BFIndex>(index_)->Build(conf);
                if (ec != KNOWHERE_SUCCESS) {
                    return Status(DB_ERROR, "Failed to build FAISS_IDMAP index");
                }
                status = std::static_pointer_cast<BFIndex>(index_)->AddWithoutIds(vectors->GetCount(),
                                                                                  float_vectors.data(), Config());
                status = std::static_pointer_cast<BFIndex>(index_)->SetBlacklist(concurrent_bitset_ptr);
            } else if (index_type_ == EngineType::FAISS_BIN_IDMAP) {
                ec = std::static_pointer_cast<BinBFIndex>(index_)->Build(conf);
                if (ec != KNOWHERE_SUCCESS) {
                    return Status(DB_ERROR, "Failed to build FAISS_BIN_IDMAP index");
                }
                status = std::static_pointer_cast<BinBFIndex>(index_)->AddWithoutIds(vectors->GetCount(),
                                                                                     vectors_data.data(), Config());
                status = std::static_pointer_cast<BinBFIndex>(index_)->SetBlacklist(concurrent_bitset_ptr);
            }
            if (!status.ok()) {
                return status;
            }

            ENGINE_LOG_DEBUG << "Finished loading raw data from segment " << segment_dir;
        } else {
            try {
                double physical_size = PhysicalSize();
                server::CollectExecutionEngineMetrics metrics(physical_size);
                index_ = read_index(location_);

                if (index_ == nullptr) {
                    std::string msg = "Failed to load index from " + location_;
                    ENGINE_LOG_ERROR << msg;
                    return Status(DB_ERROR, msg);
                } else {
                    segment::DeletedDocsPtr deleted_docs_ptr;
                    auto status = segment_reader_ptr->LoadDeletedDocs(deleted_docs_ptr);
                    if (!status.ok()) {
                        std::string msg = "Failed to load deleted docs from " + location_;
                        ENGINE_LOG_ERROR << msg;
                        return Status(DB_ERROR, msg);
                    }
                    auto& deleted_docs = deleted_docs_ptr->GetDeletedDocs();

                    faiss::ConcurrentBitsetPtr concurrent_bitset_ptr =
                        std::make_shared<faiss::ConcurrentBitset>(index_->Count());
                    for (auto& offset : deleted_docs) {
                        if (!concurrent_bitset_ptr->test(offset)) {
                            concurrent_bitset_ptr->set(offset);
                        }
                    }

                    index_->SetBlacklist(concurrent_bitset_ptr);

                    std::vector<segment::doc_id_t> uids;
                    segment_reader_ptr->LoadUids(uids);
                    index_->SetUids(uids);

                    ENGINE_LOG_DEBUG << "Finished loading index file from segment " << segment_dir;
                }
            } catch (std::exception& e) {
                ENGINE_LOG_ERROR << e.what();
                return Status(DB_ERROR, e.what());
            }
        }
    }

    if (!already_in_cache && to_cache) {
        Cache();
    }
    return Status::OK();
}
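
Both branches of Load() share one idea: deleted rows are never removed from the raw data, they are masked. Below is a minimal, self-contained sketch of that masking; SimpleBitset is a stand-in for faiss::ConcurrentBitset (only the test/set interface used above is assumed), not engine code.

// Illustrative sketch only -- not part of this patch.
#include <cstddef>
#include <cstdint>
#include <vector>

struct SimpleBitset {  // stand-in for faiss::ConcurrentBitset
    explicit SimpleBitset(size_t n) : bits_(n, false) {}
    bool test(size_t i) const { return bits_[i]; }
    void set(size_t i) { bits_[i] = true; }
    std::vector<bool> bits_;
};

// Build the blacklist the way Load() does: one bit per row offset, set for
// every deleted doc, so the index skips those offsets at search time.
SimpleBitset BuildBlacklist(size_t row_count, const std::vector<int32_t>& deleted_docs) {
    SimpleBitset blacklist(row_count);
    for (auto offset : deleted_docs) {
        if (!blacklist.test(offset)) {
            blacklist.set(offset);
        }
    }
    return blacklist;
}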

Status
ExecutionEngineImpl::CopyToGpu(uint64_t device_id, bool hybrid) {

@@ -520,6 +652,7 @@ ExecutionEngineImpl::CopyToCpu() {
//    return ret;
//}

/*
Status
ExecutionEngineImpl::Merge(const std::string& location) {
    if (location == location_) {

@@ -564,6 +697,7 @@ ExecutionEngineImpl::Merge(const std::string& location) {
        return Status(DB_ERROR, "file index type is not idmap");
    }
}
*/

ExecutionEnginePtr
ExecutionEngineImpl::BuildIndex(const std::string& location, EngineType engine_type) {

@@ -664,6 +798,7 @@ ExecutionEngineImpl::Search(int64_t n, const float* data, int64_t k, int64_t npr
        }
    }
#endif
    TimeRecorder rc("ExecutionEngineImpl::Search");

    if (index_ == nullptr) {
        ENGINE_LOG_ERROR << "ExecutionEngineImpl: index is null, failed to search";

@@ -684,7 +819,20 @@ ExecutionEngineImpl::Search(int64_t n, const float* data, int64_t k, int64_t npr
        HybridLoad();
    }

    rc.RecordSection("search prepare");
    auto status = index_->Search(n, data, distances, labels, conf);
    rc.RecordSection("search done");

    // map offsets to ids
    const std::vector<segment::doc_id_t>& uids = index_->GetUids();
    for (int64_t i = 0; i < n * k; i++) {
        int64_t offset = labels[i];
        if (offset != -1) {
            labels[i] = uids[offset];
        }
    }

    rc.RecordSection("map uids");

    if (hybrid) {
        HybridUnset();

@@ -699,6 +847,8 @@ ExecutionEngineImpl::Search(int64_t n, const float* data, int64_t k, int64_t npr
Status
ExecutionEngineImpl::Search(int64_t n, const uint8_t* data, int64_t k, int64_t nprobe, float* distances,
                            int64_t* labels, bool hybrid) {
    TimeRecorder rc("ExecutionEngineImpl::Search");

    if (index_ == nullptr) {
        ENGINE_LOG_ERROR << "ExecutionEngineImpl: index is null, failed to search";
        return Status(DB_ERROR, "index is null");

@@ -718,7 +868,174 @@ ExecutionEngineImpl::Search(int64_t n, const uint8_t* data, int64_t k, int64_t n
        HybridLoad();
    }

    rc.RecordSection("search prepare");
    auto status = index_->Search(n, data, distances, labels, conf);
    rc.RecordSection("search done");

    // map offsets to ids
    const std::vector<segment::doc_id_t>& uids = index_->GetUids();
    for (int64_t i = 0; i < n * k; i++) {
        int64_t offset = labels[i];
        if (offset != -1) {
            labels[i] = uids[offset];
        }
    }

    rc.RecordSection("map uids");

    if (hybrid) {
        HybridUnset();
    }

    if (!status.ok()) {
        ENGINE_LOG_ERROR << "Search error:" << status.message();
    }
    return status;
}

Status
ExecutionEngineImpl::Search(int64_t n, const std::vector<int64_t>& ids, int64_t k, int64_t nprobe, float* distances,
                            int64_t* labels, bool hybrid) {
    TimeRecorder rc("ExecutionEngineImpl::Search");

    if (index_ == nullptr) {
        ENGINE_LOG_ERROR << "ExecutionEngineImpl: index is null, failed to search";
        return Status(DB_ERROR, "index is null");
    }

    ENGINE_LOG_DEBUG << "Search by ids Params: [k] " << k << " [nprobe] " << nprobe;

    // TODO(linxj): remove here. Get conf from function
    TempMetaConf temp_conf;
    temp_conf.k = k;
    temp_conf.nprobe = nprobe;

    auto adapter = AdapterMgr::GetInstance().GetAdapter(index_->GetType());
    auto conf = adapter->MatchSearch(temp_conf, index_->GetType());

    if (hybrid) {
        HybridLoad();
    }

    rc.RecordSection("search prepare");

    //    std::string segment_dir;
    //    utils::GetParentPath(location_, segment_dir);
    //    segment::SegmentReader segment_reader(segment_dir);
    //    segment::IdBloomFilterPtr id_bloom_filter_ptr;
    //    segment_reader.LoadBloomFilter(id_bloom_filter_ptr);

    // Check if the id is present. If so, find its offset
    const std::vector<segment::doc_id_t>& uids = index_->GetUids();

    std::vector<int64_t> offsets;
    /*
    std::vector<segment::doc_id_t> uids;
    auto status = segment_reader.LoadUids(uids);
    if (!status.ok()) {
        return status;
    }
    */

    // There is only one id in ids
    for (auto& id : ids) {
        //        if (id_bloom_filter_ptr->Check(id)) {
        //            if (uids.empty()) {
        //                segment_reader.LoadUids(uids);
        //            }
        //            auto found = std::find(uids.begin(), uids.end(), id);
        //            if (found != uids.end()) {
        //                auto offset = std::distance(uids.begin(), found);
        //                offsets.emplace_back(offset);
        //            }
        //        }
        auto found = std::find(uids.begin(), uids.end(), id);
        if (found != uids.end()) {
            auto offset = std::distance(uids.begin(), found);
            offsets.emplace_back(offset);
        }
    }

    rc.RecordSection("get offset");

    auto status = Status::OK();
    if (!offsets.empty()) {
        status = index_->SearchById(offsets.size(), offsets.data(), distances, labels, conf);
        rc.RecordSection("search by id done");

        // map offsets to ids
        for (int64_t i = 0; i < offsets.size() * k; i++) {
            int64_t offset = labels[i];
            if (offset != -1) {
                labels[i] = uids[offset];
            }
        }
        rc.RecordSection("map uids");
    }

    if (hybrid) {
        HybridUnset();
    }

    if (!status.ok()) {
        ENGINE_LOG_ERROR << "Search error:" << status.message();
    }
    return status;
}
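
Note the lookup cost above: each requested id is resolved with a linear std::find over the segment's uids, so a batch of m ids costs O(m * n). A hedged alternative sketch (illustration only, not what this patch does): build an id-to-offset map once when the uids are loaded, making each lookup amortized O(1).

// Sketch: one-time index over uids. Assumes uids[offset] holds the user id
// stored at that row offset, exactly as in the function above.
#include <cstdint>
#include <unordered_map>
#include <vector>

std::unordered_map<int64_t, int64_t> BuildIdIndex(const std::vector<int64_t>& uids) {
    std::unordered_map<int64_t, int64_t> index;
    index.reserve(uids.size());
    for (int64_t offset = 0; offset < static_cast<int64_t>(uids.size()); ++offset) {
        index.emplace(uids[offset], offset);  // duplicate ids keep the first offset
    }
    return index;
}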

Status
ExecutionEngineImpl::GetVectorByID(const int64_t& id, float* vector, bool hybrid) {
    if (index_ == nullptr) {
        ENGINE_LOG_ERROR << "ExecutionEngineImpl: index is null, failed to search";
        return Status(DB_ERROR, "index is null");
    }

    // TODO(linxj): remove here. Get conf from function
    TempMetaConf temp_conf;

    auto adapter = AdapterMgr::GetInstance().GetAdapter(index_->GetType());
    auto conf = adapter->MatchSearch(temp_conf, index_->GetType());

    if (hybrid) {
        HybridLoad();
    }

    // Only one id for now
    std::vector<int64_t> ids{id};
    auto status = index_->GetVectorById(1, ids.data(), vector, conf);

    if (hybrid) {
        HybridUnset();
    }

    if (!status.ok()) {
        ENGINE_LOG_ERROR << "Search error:" << status.message();
    }
    return status;
}
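
A hypothetical caller of the float overload above (the helper name and error handling are assumptions for illustration; only the GetVectorByID signature comes from this patch):

// Sketch: fetch a single vector; the caller must size the buffer to the
// table dimension before calling in.
#include <cstdint>
#include <vector>

std::vector<float> FetchOne(ExecutionEngineImpl& engine, int64_t id, int64_t dim) {
    std::vector<float> buf(dim);
    auto status = engine.GetVectorByID(id, buf.data(), /*hybrid=*/false);
    if (!status.ok()) {
        buf.clear();  // empty result signals "not found or error" in this sketch
    }
    return buf;
}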

Status
ExecutionEngineImpl::GetVectorByID(const int64_t& id, uint8_t* vector, bool hybrid) {
    if (index_ == nullptr) {
        ENGINE_LOG_ERROR << "ExecutionEngineImpl: index is null, failed to search";
        return Status(DB_ERROR, "index is null");
    }

    ENGINE_LOG_DEBUG << "Get binary vector by id: " << id;

    // TODO(linxj): remove here. Get conf from function
    TempMetaConf temp_conf;

    auto adapter = AdapterMgr::GetInstance().GetAdapter(index_->GetType());
    auto conf = adapter->MatchSearch(temp_conf, index_->GetType());

    if (hybrid) {
        HybridLoad();
    }

    // Only one id for now
    std::vector<int64_t> ids{id};
    auto status = index_->GetVectorById(1, ids.data(), vector, conf);

    if (hybrid) {
        HybridUnset();

@@ -11,11 +11,14 @@

#pragma once

#include "ExecutionEngine.h"
#include "wrapper/VecIndex.h"
#include <src/segment/SegmentReader.h>

#include <memory>
#include <string>
#include <vector>

#include "ExecutionEngine.h"
#include "wrapper/VecIndex.h"

namespace milvus {
namespace engine {

@@ -64,8 +67,14 @@ class ExecutionEngineImpl : public ExecutionEngine {
    //    ExecutionEnginePtr
    //    Clone() override;

    //    Status
    //    Merge(const std::string& location) override;

    Status
    Merge(const std::string& location) override;
    GetVectorByID(const int64_t& id, float* vector, bool hybrid) override;

    Status
    GetVectorByID(const int64_t& id, uint8_t* vector, bool hybrid) override;

    Status
    Search(int64_t n, const float* data, int64_t k, int64_t nprobe, float* distances, int64_t* labels,

@@ -75,6 +84,10 @@ class ExecutionEngineImpl : public ExecutionEngine {
    Search(int64_t n, const uint8_t* data, int64_t k, int64_t nprobe, float* distances, int64_t* labels,
           bool hybrid = false) override;

    Status
    Search(int64_t n, const std::vector<int64_t>& ids, int64_t k, int64_t nprobe, float* distances, int64_t* labels,
           bool hybrid) override;

    ExecutionEnginePtr
    BuildIndex(const std::string& location, EngineType engine_type) override;

@@ -11,23 +11,40 @@

#pragma once

#include "db/Types.h"
#include "utils/Status.h"

#include <memory>
#include <set>
#include <string>

#include "db/Types.h"
#include "utils/Status.h"

namespace milvus {
namespace engine {

class MemManager {
 public:
    virtual Status
    InsertVectors(const std::string& table_id, VectorsData& vectors) = 0;
    InsertVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, int64_t dim,
                  const float* vectors, uint64_t lsn, std::set<std::string>& flushed_tables) = 0;

    virtual Status
    Serialize(std::set<std::string>& table_ids) = 0;
    InsertVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, int64_t dim,
                  const uint8_t* vectors, uint64_t lsn, std::set<std::string>& flushed_tables) = 0;

    virtual Status
    DeleteVector(const std::string& table_id, IDNumber vector_id, uint64_t lsn) = 0;

    virtual Status
    DeleteVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, uint64_t lsn) = 0;

    virtual Status
    Flush(const std::string& table_id) = 0;

    virtual Status
    Flush(std::set<std::string>& table_ids) = 0;

    //    virtual Status
    //    Serialize(std::set<std::string>& table_ids) = 0;

    virtual Status
    EraseMemVector(const std::string& table_id) = 0;

@@ -10,12 +10,13 @@
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/insert/MemManagerImpl.h"

#include <thread>

#include "VectorSource.h"
#include "db/Constants.h"
#include "utils/Log.h"

#include <thread>

namespace milvus {
namespace engine {

@@ -31,37 +32,177 @@ MemManagerImpl::GetMemByTable(const std::string& table_id) {
}

Status
MemManagerImpl::InsertVectors(const std::string& table_id, VectorsData& vectors) {
    while (GetCurrentMem() > options_.insert_buffer_size_) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
MemManagerImpl::InsertVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, int64_t dim,
                              const float* vectors, uint64_t lsn, std::set<std::string>& flushed_tables) {
    flushed_tables.clear();
    if (GetCurrentMem() > options_.insert_buffer_size_) {
        ENGINE_LOG_DEBUG << "Insert buffer size exceeds limit. Performing force flush";
        auto status = Flush(flushed_tables);
        if (!status.ok()) {
            return status;
        }
    }

    VectorsData vectors_data;
    vectors_data.vector_count_ = length;
    vectors_data.float_data_.resize(length * dim);
    memcpy(vectors_data.float_data_.data(), vectors, length * dim * sizeof(float));
    vectors_data.id_array_.resize(length);
    memcpy(vectors_data.id_array_.data(), vector_ids, length * sizeof(IDNumber));
    VectorSourcePtr source = std::make_shared<VectorSource>(vectors_data);

    std::unique_lock<std::mutex> lock(mutex_);

    return InsertVectorsNoLock(table_id, vectors);
    return InsertVectorsNoLock(table_id, source, lsn);
}
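
The buffer check above replaces the old busy-wait (sleep until memory frees) with explicit back-pressure: flush synchronously, then accept the insert. A generic sketch of that pattern, with an assumed Buffer interface (CurrentMem/Flush/Insert are illustrative names, not the engine's API):

// Sketch only: back-pressure by force-flush instead of sleeping.
#include <set>
#include <string>

template <typename Buffer>
bool InsertWithBackPressure(Buffer& buffer, size_t limit, const std::string& table_id) {
    std::set<std::string> flushed;
    if (buffer.CurrentMem() > limit) {
        if (!buffer.Flush(flushed)) {  // frees memory before taking new data
            return false;
        }
    }
    return buffer.Insert(table_id);
}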

Status
MemManagerImpl::InsertVectorsNoLock(const std::string& table_id, VectorsData& vectors) {
    MemTablePtr mem = GetMemByTable(table_id);
    VectorSourcePtr source = std::make_shared<VectorSource>(vectors);

    auto status = mem->Add(source);
    if (status.ok()) {
        if (vectors.id_array_.empty()) {
            vectors.id_array_ = source->GetVectorIds();
MemManagerImpl::InsertVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, int64_t dim,
                              const uint8_t* vectors, uint64_t lsn, std::set<std::string>& flushed_tables) {
    flushed_tables.clear();
    if (GetCurrentMem() > options_.insert_buffer_size_) {
        ENGINE_LOG_DEBUG << "Insert buffer size exceeds limit. Performing force flush";
        auto status = Flush(flushed_tables);
        if (!status.ok()) {
            return status;
        }
    }

    VectorsData vectors_data;
    vectors_data.vector_count_ = length;
    vectors_data.binary_data_.resize(length * dim);
    memcpy(vectors_data.binary_data_.data(), vectors, length * dim * sizeof(uint8_t));
    vectors_data.id_array_.resize(length);
    memcpy(vectors_data.id_array_.data(), vector_ids, length * sizeof(IDNumber));
    VectorSourcePtr source = std::make_shared<VectorSource>(vectors_data);

    std::unique_lock<std::mutex> lock(mutex_);

    return InsertVectorsNoLock(table_id, source, lsn);
}

Status
MemManagerImpl::InsertVectorsNoLock(const std::string& table_id, const VectorSourcePtr& source, uint64_t lsn) {
    MemTablePtr mem = GetMemByTable(table_id);
    mem->SetLSN(lsn);

    auto status = mem->Add(source);
    return status;
}

Status
MemManagerImpl::DeleteVector(const std::string& table_id, IDNumber vector_id, uint64_t lsn) {
    std::unique_lock<std::mutex> lock(mutex_);
    MemTablePtr mem = GetMemByTable(table_id);
    mem->SetLSN(lsn);
    auto status = mem->Delete(vector_id);
    return status;
}

Status
MemManagerImpl::DeleteVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, uint64_t lsn) {
    std::unique_lock<std::mutex> lock(mutex_);
    MemTablePtr mem = GetMemByTable(table_id);
    mem->SetLSN(lsn);

    IDNumbers ids;
    ids.resize(length);
    memcpy(ids.data(), vector_ids, length * sizeof(IDNumber));

    auto status = mem->Delete(ids);
    if (!status.ok()) {
        return status;
    }

    //    // TODO(zhiru): loop for now
    //    for (auto& id : ids) {
    //        auto status = mem->Delete(id);
    //        if (!status.ok()) {
    //            return status;
    //        }
    //    }

    return Status::OK();
}

Status
MemManagerImpl::Flush(const std::string& table_id) {
    ToImmutable(table_id);
    // TODO: There is actually only one memTable in the immutable list
    MemList temp_immutable_list;
    {
        std::unique_lock<std::mutex> lock(mutex_);
        immu_mem_list_.swap(temp_immutable_list);
    }

    std::unique_lock<std::mutex> lock(serialization_mtx_);
    auto max_lsn = GetMaxLSN(temp_immutable_list);
    for (auto& mem : temp_immutable_list) {
        ENGINE_LOG_DEBUG << "Flushing table: " << mem->GetTableId();
        auto status = mem->Serialize(max_lsn);
        if (!status.ok()) {
            ENGINE_LOG_ERROR << "Flush table " << mem->GetTableId() << " failed";
            return status;
        }
        ENGINE_LOG_DEBUG << "Flushed table: " << mem->GetTableId();
    }

    return Status::OK();
}

Status
MemManagerImpl::Flush(std::set<std::string>& table_ids) {
    ToImmutable();

    MemList temp_immutable_list;
    {
        std::unique_lock<std::mutex> lock(mutex_);
        immu_mem_list_.swap(temp_immutable_list);
    }

    std::unique_lock<std::mutex> lock(serialization_mtx_);
    table_ids.clear();
    auto max_lsn = GetMaxLSN(temp_immutable_list);
    for (auto& mem : temp_immutable_list) {
        ENGINE_LOG_DEBUG << "Flushing table: " << mem->GetTableId();
        auto status = mem->Serialize(max_lsn);
        if (!status.ok()) {
            ENGINE_LOG_ERROR << "Flush table " << mem->GetTableId() << " failed";
            return status;
        }
        table_ids.insert(mem->GetTableId());
        ENGINE_LOG_DEBUG << "Flushed table: " << mem->GetTableId();
    }

    meta_->SetGlobalLastLSN(max_lsn);

    return Status::OK();
}
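
Both Flush overloads use the same concurrency trick: swap the shared immutable list out under the mutex, then serialize the private copy with the mutex released, so inserts keep flowing while files are written. A minimal sketch of that swap-under-lock pattern (generic names, illustrative only):

// Sketch: O(1) snapshot of a shared list for lock-free draining.
#include <mutex>
#include <vector>

template <typename T>
std::vector<T> TakeSnapshot(std::vector<T>& shared_list, std::mutex& mtx) {
    std::vector<T> local;
    {
        std::lock_guard<std::mutex> lock(mtx);
        shared_list.swap(local);  // shared list is left empty for new entries
    }
    return local;                 // serialize this copy without holding the lock
}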

Status
MemManagerImpl::ToImmutable(const std::string& table_id) {
    std::unique_lock<std::mutex> lock(mutex_);
    auto memIt = mem_id_map_.find(table_id);
    if (memIt != mem_id_map_.end()) {
        if (!memIt->second->Empty()) {
            immu_mem_list_.push_back(memIt->second);
            mem_id_map_.erase(memIt);
        }
        //        std::string err_msg = "Could not find table = " + table_id + " to flush";
        //        ENGINE_LOG_ERROR << err_msg;
        //        return Status(DB_NOT_FOUND, err_msg);
    }

    return Status::OK();
}

Status
MemManagerImpl::ToImmutable() {
    std::unique_lock<std::mutex> lock(mutex_);
    MemIdMap temp_map;
    for (auto& kv : mem_id_map_) {
        if (kv.second->Empty()) {
            // empty table, no need to serialize
            // empty table without any deletes, no need to serialize
            temp_map.insert(kv);
        } else {
            immu_mem_list_.push_back(kv.second);

@@ -72,19 +213,6 @@ MemManagerImpl::ToImmutable() {
    return Status::OK();
}

Status
MemManagerImpl::Serialize(std::set<std::string>& table_ids) {
    ToImmutable();
    std::unique_lock<std::mutex> lock(serialization_mtx_);
    table_ids.clear();
    for (auto& mem : immu_mem_list_) {
        mem->Serialize();
        table_ids.insert(mem->GetTableId());
    }
    immu_mem_list_.clear();
    return Status::OK();
}

Status
MemManagerImpl::EraseMemVector(const std::string& table_id) {
    {  // erase MemVector from rapid-insert cache

@@ -132,5 +260,17 @@ MemManagerImpl::GetCurrentMem() {
    return GetCurrentMutableMem() + GetCurrentImmutableMem();
}

uint64_t
MemManagerImpl::GetMaxLSN(const MemList& tables) {
    uint64_t max_lsn = 0;
    for (auto& table : tables) {
        auto cur_lsn = table->GetLSN();
        if (cur_lsn > max_lsn) {
            max_lsn = cur_lsn;
        }
    }
    return max_lsn;
}

}  // namespace engine
}  // namespace milvus

@@ -11,12 +11,6 @@

#pragma once

#include "MemManager.h"
#include "MemTable.h"
#include "db/meta/Meta.h"
#include "server/Config.h"
#include "utils/Status.h"

#include <ctime>
#include <map>
#include <memory>

@@ -25,12 +19,20 @@
#include <string>
#include <vector>

#include "MemManager.h"
#include "MemTable.h"
#include "db/meta/Meta.h"
#include "server/Config.h"
#include "utils/Status.h"

namespace milvus {
namespace engine {

class MemManagerImpl : public MemManager {
 public:
    using Ptr = std::shared_ptr<MemManagerImpl>;
    using MemIdMap = std::map<std::string, MemTablePtr>;
    using MemList = std::vector<MemTablePtr>;

    MemManagerImpl(const meta::MetaPtr& meta, const DBOptions& options) : meta_(meta), options_(options) {
        server::Config& config = server::Config::GetInstance();

@@ -56,10 +58,27 @@ class MemManagerImpl : public MemManager {
    }

    Status
    InsertVectors(const std::string& table_id, VectorsData& vectors) override;
    InsertVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, int64_t dim,
                  const float* vectors, uint64_t lsn, std::set<std::string>& flushed_tables) override;

    Status
    Serialize(std::set<std::string>& table_ids) override;
    InsertVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, int64_t dim,
                  const uint8_t* vectors, uint64_t lsn, std::set<std::string>& flushed_tables) override;

    Status
    DeleteVector(const std::string& table_id, IDNumber vector_id, uint64_t lsn) override;

    Status
    DeleteVectors(const std::string& table_id, int64_t length, const IDNumber* vector_ids, uint64_t lsn) override;

    Status
    Flush(const std::string& table_id) override;

    Status
    Flush(std::set<std::string>& table_ids) override;

    //    Status
    //    Serialize(std::set<std::string>& table_ids) override;

    Status
    EraseMemVector(const std::string& table_id) override;

@@ -78,12 +97,17 @@ class MemManagerImpl : public MemManager {
    GetMemByTable(const std::string& table_id);

    Status
    InsertVectorsNoLock(const std::string& table_id, VectorsData& vectors);
    InsertVectorsNoLock(const std::string& table_id, const VectorSourcePtr& source, uint64_t lsn);

    Status
    ToImmutable();

    using MemIdMap = std::map<std::string, MemTablePtr>;
    using MemList = std::vector<MemTablePtr>;
    Status
    ToImmutable(const std::string& table_id);

    uint64_t
    GetMaxLSN(const MemList& tables);

    std::string identity_;
    MemIdMap mem_id_map_;
    MemList immu_mem_list_;

@@ -10,10 +10,20 @@
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/insert/MemTable.h"
#include "utils/Log.h"

#include <cache/CpuCacheMgr.h>
#include <segment/SegmentReader.h>
#include <wrapper/VecIndex.h>

#include <algorithm>
#include <chrono>
#include <memory>
#include <string>
#include <unordered_map>

#include "db/OngoingFileChecker.h"
#include "db/Utils.h"
#include "utils/Log.h"

namespace milvus {
namespace engine {

@@ -23,7 +33,7 @@ MemTable::MemTable(const std::string& table_id, const meta::MetaPtr& meta, const
}

Status
MemTable::Add(VectorSourcePtr& source) {
MemTable::Add(const VectorSourcePtr& source) {
    while (!source->AllAdded()) {
        MemTableFilePtr current_mem_table_file;
        if (!mem_table_file_list_.empty()) {

@@ -50,6 +60,32 @@ MemTable::Add(VectorSourcePtr& source) {
    return Status::OK();
}

Status
MemTable::Delete(segment::doc_id_t doc_id) {
    // Locate which table file the doc id lands in
    for (auto& table_file : mem_table_file_list_) {
        table_file->Delete(doc_id);
    }
    // Add the id to the delete list so it can be applied to other segments on disk during the next flush
    doc_ids_to_delete_.insert(doc_id);

    return Status::OK();
}

Status
MemTable::Delete(const std::vector<segment::doc_id_t>& doc_ids) {
    // Locate which table file the doc ids land in
    for (auto& table_file : mem_table_file_list_) {
        table_file->Delete(doc_ids);
    }
    // Add the ids to the delete list so they can be applied to other segments on disk during the next flush
    for (auto& id : doc_ids) {
        doc_ids_to_delete_.insert(id);
    }

    return Status::OK();
}
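
The two Delete overloads above act on both sides of the write path: ids still sitting in the mutable buffer are erased immediately, and every id is also recorded so the next flush can apply it to segments already on disk. A tiny sketch of that bookkeeping (types are illustrative stand-ins):

// Sketch: deferred-delete bookkeeping; std::set deduplicates repeated ids.
#include <cstdint>
#include <set>
#include <vector>

struct DeleteState {
    std::set<int64_t> pending;  // applied to on-disk segments at next flush
};

void RecordDeletes(DeleteState& state, const std::vector<int64_t>& ids) {
    for (auto id : ids) {
        state.pending.insert(id);
    }
}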

void
MemTable::GetCurrentMemTableFile(MemTableFilePtr& mem_table_file) {
    mem_table_file = mem_table_file_list_.back();

@@ -61,23 +97,48 @@ MemTable::GetTableFileCount() {
}

Status
MemTable::Serialize() {
    for (auto mem_table_file = mem_table_file_list_.begin(); mem_table_file != mem_table_file_list_.end();) {
        auto status = (*mem_table_file)->Serialize();
MemTable::Serialize(uint64_t wal_lsn) {
    auto start = std::chrono::high_resolution_clock::now();

    if (!doc_ids_to_delete_.empty()) {
        auto status = ApplyDeletes();
        if (!status.ok()) {
            std::string err_msg = "Insert data serialize failed: " + status.ToString();
            ENGINE_LOG_ERROR << err_msg;
            return Status(DB_ERROR, err_msg);
            return Status(DB_ERROR, status.message());
        }
        std::lock_guard<std::mutex> lock(mutex_);
        mem_table_file = mem_table_file_list_.erase(mem_table_file);
    }

    for (auto mem_table_file = mem_table_file_list_.begin(); mem_table_file != mem_table_file_list_.end();) {
        auto status = (*mem_table_file)->Serialize(wal_lsn);
        if (!status.ok()) {
            return status;
        }

        ENGINE_LOG_DEBUG << "Flushed segment " << (*mem_table_file)->GetSegmentId();

        {
            std::lock_guard<std::mutex> lock(mutex_);
            mem_table_file = mem_table_file_list_.erase(mem_table_file);
        }
    }

    // Update flush lsn
    auto status = meta_->UpdateTableFlushLSN(table_id_, wal_lsn);
    if (!status.ok()) {
        std::string err_msg = "Failed to write flush lsn to meta: " + status.ToString();
        ENGINE_LOG_ERROR << err_msg;
        return Status(DB_ERROR, err_msg);
    }

    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> diff = end - start;
    ENGINE_LOG_DEBUG << "Finished flushing for table " << table_id_ << " in " << diff.count() << " s";

    return Status::OK();
}
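
The flush loop above relies on the erase-while-iterating idiom: erase() returns the iterator to the next element, so the loop must advance only through erase(), never with an extra ++ after an erase. A compact sketch of the idiom in isolation (flushed flag is a stand-in for the Serialize(wal_lsn) result):

// Sketch: safe removal during iteration.
#include <vector>

void DrainIfFlushed(std::vector<int>& files) {
    for (auto it = files.begin(); it != files.end();) {
        bool flushed = true;       // stand-in for a successful Serialize()
        if (flushed) {
            it = files.erase(it);  // erase yields the next valid iterator
        } else {
            ++it;
        }
    }
}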

bool
MemTable::Empty() {
    return mem_table_file_list_.empty();
    return mem_table_file_list_.empty() && doc_ids_to_delete_.empty();
}

const std::string&

@@ -95,5 +156,236 @@ MemTable::GetCurrentMem() {
    return total_mem;
}

Status
MemTable::ApplyDeletes() {
    // Applying deletes to other segments on disk and their corresponding cache:
    // For each segment in table:
    //     Load its bloom filter
    //     For each id in delete list:
    //         If present, add the uid to segment's uid list
    // For each segment
    //     Get its cache if exists
    //     Load its uids file.
    //     Scan the uids, if any uid in segment's uid list exists:
    //         add its offset to deletedDoc
    //         remove the id from bloom filter
    //         set black list in cache
    //     Serialize segment's deletedDoc TODO(zhiru): append directly to previous file for now, may have duplicates
    //     Serialize bloom filter

    ENGINE_LOG_DEBUG << "Applying " << doc_ids_to_delete_.size() << " deletes in table: " << table_id_;

    auto start_total = std::chrono::high_resolution_clock::now();

    auto start = std::chrono::high_resolution_clock::now();

    std::vector<int> file_types{meta::TableFileSchema::FILE_TYPE::RAW, meta::TableFileSchema::FILE_TYPE::TO_INDEX,
                                meta::TableFileSchema::FILE_TYPE::BACKUP};
    meta::TableFilesSchema table_files;
    auto status = meta_->FilesByType(table_id_, file_types, table_files);
    if (!status.ok()) {
        std::string err_msg = "Failed to apply deletes: " + status.ToString();
        ENGINE_LOG_ERROR << err_msg;
        return Status(DB_ERROR, err_msg);
    }

    OngoingFileChecker::GetInstance().MarkOngoingFiles(table_files);

    std::unordered_map<size_t, std::vector<segment::doc_id_t>> ids_to_check_map;

    for (size_t i = 0; i < table_files.size(); ++i) {
        auto& table_file = table_files[i];
        std::string segment_dir;
        utils::GetParentPath(table_file.location_, segment_dir);

        segment::SegmentReader segment_reader(segment_dir);
        segment::IdBloomFilterPtr id_bloom_filter_ptr;
        segment_reader.LoadBloomFilter(id_bloom_filter_ptr);

        for (auto& id : doc_ids_to_delete_) {
            if (id_bloom_filter_ptr->Check(id)) {
                ids_to_check_map[i].emplace_back(id);
            }
        }
    }

    meta::TableFilesSchema files_to_check;
    for (auto& kv : ids_to_check_map) {
        files_to_check.emplace_back(table_files[kv.first]);
    }

    OngoingFileChecker::GetInstance().UnmarkOngoingFiles(table_files);

    OngoingFileChecker::GetInstance().MarkOngoingFiles(files_to_check);

    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> diff = end - start;
    ENGINE_LOG_DEBUG << "Found " << ids_to_check_map.size() << " segments to apply deletes in " << diff.count() << " s";

    meta::TableFilesSchema table_files_to_update;

    for (auto& kv : ids_to_check_map) {
        auto& table_file = table_files[kv.first];
        ENGINE_LOG_DEBUG << "Applying deletes in segment: " << table_file.segment_id_;

        start = std::chrono::high_resolution_clock::now();

        std::string segment_dir;
        utils::GetParentPath(table_file.location_, segment_dir);
        segment::SegmentReader segment_reader(segment_dir);

        auto index =
            std::static_pointer_cast<VecIndex>(cache::CpuCacheMgr::GetInstance()->GetIndex(table_file.location_));
        faiss::ConcurrentBitsetPtr blacklist = nullptr;
        if (index != nullptr) {
            status = index->GetBlacklist(blacklist);
        }

        std::vector<segment::doc_id_t> uids;
        status = segment_reader.LoadUids(uids);
        if (!status.ok()) {
            break;
        }
        segment::IdBloomFilterPtr id_bloom_filter_ptr;
        status = segment_reader.LoadBloomFilter(id_bloom_filter_ptr);
        if (!status.ok()) {
            break;
        }

        auto& ids_to_check = kv.second;

        segment::DeletedDocsPtr deleted_docs = std::make_shared<segment::DeletedDocs>();

        end = std::chrono::high_resolution_clock::now();
        diff = end - start;
        ENGINE_LOG_DEBUG << "Loading uids and deleted docs took " << diff.count() << " s";

        start = std::chrono::high_resolution_clock::now();

        std::sort(ids_to_check.begin(), ids_to_check.end());

        end = std::chrono::high_resolution_clock::now();
        diff = end - start;
        ENGINE_LOG_DEBUG << "Sorting " << ids_to_check.size() << " ids took " << diff.count() << " s";

        size_t delete_count = 0;
        auto find_diff = std::chrono::duration<double>::zero();
        auto set_diff = std::chrono::duration<double>::zero();

        for (size_t i = 0; i < uids.size(); ++i) {
            auto find_start = std::chrono::high_resolution_clock::now();

            auto found = std::binary_search(ids_to_check.begin(), ids_to_check.end(), uids[i]);

            auto find_end = std::chrono::high_resolution_clock::now();
            find_diff += (find_end - find_start);

            if (found) {
                auto set_start = std::chrono::high_resolution_clock::now();

                delete_count++;

                deleted_docs->AddDeletedDoc(i);

                if (id_bloom_filter_ptr->Check(uids[i])) {
                    id_bloom_filter_ptr->Remove(uids[i]);
                }

                if (blacklist != nullptr) {
                    if (!blacklist->test(i)) {
                        blacklist->set(i);
                    }
                }

                auto set_end = std::chrono::high_resolution_clock::now();
                set_diff += (set_end - set_start);
            }
        }

        ENGINE_LOG_DEBUG << "Finding " << ids_to_check.size() << " uids in " << uids.size() << " uids took "
                         << find_diff.count() << " s in total";
        ENGINE_LOG_DEBUG << "Setting deleted docs and bloom filter took " << set_diff.count() << " s in total";

        if (index != nullptr) {
            index->SetBlacklist(blacklist);
        }

        start = std::chrono::high_resolution_clock::now();

        segment::Segment tmp_segment;
        segment::SegmentWriter segment_writer(segment_dir);
        status = segment_writer.WriteDeletedDocs(deleted_docs);
        if (!status.ok()) {
            break;
        }

        end = std::chrono::high_resolution_clock::now();
        diff = end - start;
        ENGINE_LOG_DEBUG << "Appended " << deleted_docs->GetSize()
                         << " offsets to deleted docs in segment: " << table_file.segment_id_ << " in " << diff.count()
                         << " s";

        start = std::chrono::high_resolution_clock::now();

        status = segment_writer.WriteBloomFilter(id_bloom_filter_ptr);
        if (!status.ok()) {
            break;
        }
        end = std::chrono::high_resolution_clock::now();
        diff = end - start;
        ENGINE_LOG_DEBUG << "Updated bloom filter in segment: " << table_file.segment_id_ << " in " << diff.count()
                         << " s";

        // Update table file row count
        start = std::chrono::high_resolution_clock::now();

        auto& segment_id = table_file.segment_id_;
        meta::TableFilesSchema segment_files;
        status = meta_->GetTableFilesBySegmentId(segment_id, segment_files);
        if (!status.ok()) {
            break;
        }
        for (auto& file : segment_files) {
            if (file.file_type_ == meta::TableFileSchema::RAW || file.file_type_ == meta::TableFileSchema::TO_INDEX ||
                file.file_type_ == meta::TableFileSchema::INDEX || file.file_type_ == meta::TableFileSchema::BACKUP) {
                file.row_count_ -= delete_count;
                table_files_to_update.emplace_back(file);
            }
        }
    }

    end = std::chrono::high_resolution_clock::now();
    diff = end - start;

    status = meta_->UpdateTableFiles(table_files_to_update);
    ENGINE_LOG_DEBUG << "Updated meta in table: " << table_id_ << " in " << diff.count() << " s";

    if (!status.ok()) {
        std::string err_msg = "Failed to apply deletes: " + status.ToString();
        ENGINE_LOG_ERROR << err_msg;
        return Status(DB_ERROR, err_msg);
    }

    doc_ids_to_delete_.clear();

    auto end_total = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> diff_total = end_total - start_total;
    ENGINE_LOG_DEBUG << "Finished applying deletes in table " << table_id_ << " in " << diff_total.count() << " s";

    OngoingFileChecker::GetInstance().UnmarkOngoingFiles(files_to_check);

    return Status::OK();
}
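
The core of ApplyDeletes() is a two-phase filter: a per-segment bloom check prunes ids that cannot be in the segment (false positives only, never false negatives), then one sort plus a binary search per row marks the offsets of the survivors. A condensed, self-contained sketch of just that filter; BloomFilter here is an assumed interface with a Check(id) method, not the segment class:

// Sketch only: bloom pre-filter + sorted binary search, as in the loop above.
#include <algorithm>
#include <cstdint>
#include <vector>

template <typename BloomFilter>
std::vector<size_t> FindDeletedOffsets(const BloomFilter& bloom,
                                       const std::vector<int64_t>& uids,
                                       const std::vector<int64_t>& ids_to_delete) {
    std::vector<int64_t> candidates;
    for (auto id : ids_to_delete) {
        if (bloom.Check(id)) {  // cheap prune; may keep a few false positives
            candidates.push_back(id);
        }
    }
    std::sort(candidates.begin(), candidates.end());

    std::vector<size_t> offsets;
    for (size_t i = 0; i < uids.size(); ++i) {
        if (std::binary_search(candidates.begin(), candidates.end(), uids[i])) {
            offsets.push_back(i);  // row offset i is marked deleted
        }
    }
    return offsets;
}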

uint64_t
MemTable::GetLSN() {
    return lsn_;
}

void
MemTable::SetLSN(uint64_t lsn) {
    lsn_ = lsn;
}

}  // namespace engine
}  // namespace milvus

@@ -11,15 +11,17 @@

#pragma once

#include <atomic>
#include <memory>
#include <mutex>
#include <set>
#include <string>
#include <vector>

#include "MemTableFile.h"
#include "VectorSource.h"
#include "utils/Status.h"

#include <memory>
#include <mutex>
#include <string>
#include <vector>

namespace milvus {
namespace engine {

@@ -30,7 +32,13 @@ class MemTable {
    MemTable(const std::string& table_id, const meta::MetaPtr& meta, const DBOptions& options);

    Status
    Add(VectorSourcePtr& source);
    Add(const VectorSourcePtr& source);

    Status
    Delete(segment::doc_id_t doc_id);

    Status
    Delete(const std::vector<segment::doc_id_t>& doc_ids);

    void
    GetCurrentMemTableFile(MemTableFilePtr& mem_table_file);

@@ -39,7 +47,7 @@ class MemTable {
    GetTableFileCount();

    Status
    Serialize();
    Serialize(uint64_t wal_lsn);

    bool
    Empty();

@@ -50,6 +58,16 @@ class MemTable {
    size_t
    GetCurrentMem();

    uint64_t
    GetLSN();

    void
    SetLSN(uint64_t lsn);

 private:
    Status
    ApplyDeletes();

 private:
    const std::string table_id_;

@@ -60,6 +78,10 @@ class MemTable {
    DBOptions options_;

    std::mutex mutex_;

    std::set<segment::doc_id_t> doc_ids_to_delete_;

    std::atomic<uint64_t> lsn_;
};  // MemTable

using MemTablePtr = std::shared_ptr<MemTable>;

@@ -10,13 +10,20 @@
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/insert/MemTableFile.h"

#include <algorithm>
#include <cmath>
#include <iterator>
#include <string>
#include <vector>

#include "db/Constants.h"
#include "db/Utils.h"
#include "db/engine/EngineFactory.h"
#include "metrics/Metrics.h"
#include "segment/SegmentReader.h"
#include "utils/Log.h"

#include <cmath>
#include <string>
#include "utils/ValidationUtil.h"

namespace milvus {
namespace engine {

@@ -26,9 +33,12 @@ MemTableFile::MemTableFile(const std::string& table_id, const meta::MetaPtr& met
    current_mem_ = 0;
    auto status = CreateTableFile();
    if (status.ok()) {
        execution_engine_ = EngineFactory::Build(
        /*execution_engine_ = EngineFactory::Build(
            table_file_schema_.dimension_, table_file_schema_.location_, (EngineType)table_file_schema_.engine_type_,
            (MetricType)table_file_schema_.metric_type_, table_file_schema_.nlist_);
            (MetricType)table_file_schema_.metric_type_, table_file_schema_.nlist_);*/
        std::string directory;
        utils::GetParentPath(table_file_schema_.location_, directory);
        segment_writer_ptr_ = std::make_shared<segment::SegmentWriter>(directory);
    }
}

@@ -47,7 +57,7 @@ MemTableFile::CreateTableFile() {
}

Status
MemTableFile::Add(VectorSourcePtr& source) {
MemTableFile::Add(const VectorSourcePtr& source) {
    if (table_file_schema_.dimension_ <= 0) {
        std::string err_msg =
            "MemTableFile::Add: table_file_schema dimension = " + std::to_string(table_file_schema_.dimension_) +

@@ -61,7 +71,9 @@ MemTableFile::Add(VectorSourcePtr& source) {
    if (mem_left >= single_vector_mem_size) {
        size_t num_vectors_to_add = std::ceil(mem_left / single_vector_mem_size);
        size_t num_vectors_added;
        auto status = source->Add(execution_engine_, table_file_schema_, num_vectors_to_add, num_vectors_added);

        auto status = source->Add(/*execution_engine_,*/ segment_writer_ptr_, table_file_schema_, num_vectors_to_add,
                                  num_vectors_added);
        if (status.ok()) {
            current_mem_ += (num_vectors_added * single_vector_mem_size);
        }

@@ -70,6 +82,39 @@ MemTableFile::Add(VectorSourcePtr& source) {
    return Status::OK();
}

Status
MemTableFile::Delete(segment::doc_id_t doc_id) {
    segment::SegmentPtr segment_ptr;
    segment_writer_ptr_->GetSegment(segment_ptr);
    // Check whether the doc_id is present; if yes, delete its corresponding buffer
    auto uids = segment_ptr->vectors_ptr_->GetUids();
    auto found = std::find(uids.begin(), uids.end(), doc_id);
    if (found != uids.end()) {
        auto offset = std::distance(uids.begin(), found);
        segment_ptr->vectors_ptr_->Erase(offset);
    }

    return Status::OK();
}

Status
MemTableFile::Delete(const std::vector<segment::doc_id_t>& doc_ids) {
    segment::SegmentPtr segment_ptr;
    segment_writer_ptr_->GetSegment(segment_ptr);
    // Check whether each doc_id is present; if yes, delete its corresponding buffer
    auto uids = segment_ptr->vectors_ptr_->GetUids();
    for (auto& doc_id : doc_ids) {
        auto found = std::find(uids.begin(), uids.end(), doc_id);
        if (found != uids.end()) {
            auto offset = std::distance(uids.begin(), found);
            segment_ptr->vectors_ptr_->Erase(offset);
            uids = segment_ptr->vectors_ptr_->GetUids();
        }
    }

    return Status::OK();
}

size_t
MemTableFile::GetCurrentMem() {
    return current_mem_;

@@ -87,15 +132,35 @@ MemTableFile::IsFull() {
}

Status
MemTableFile::Serialize() {
MemTableFile::Serialize(uint64_t wal_lsn) {
    size_t size = GetCurrentMem();
    server::CollectSerializeMetrics metrics(size);

    execution_engine_->Serialize();
    table_file_schema_.file_size_ = execution_engine_->PhysicalSize();
    table_file_schema_.row_count_ = execution_engine_->Count();
    auto status = segment_writer_ptr_->Serialize();
    if (!status.ok()) {
        ENGINE_LOG_ERROR << "Failed to serialize segment: " << table_file_schema_.segment_id_;

        // if index type isn't IDMAP, set file type to TO_INDEX if file size exceeds index_file_size
        /* Can't mark it as to_delete because data is stored in this mem table file. Any further flush
         * will try to serialize the same mem table file and it won't be able to find the directory
         * to write to or update the associated table file in meta.
         *
        table_file_schema_.file_type_ = meta::TableFileSchema::TO_DELETE;
        meta_->UpdateTableFile(table_file_schema_);
        ENGINE_LOG_DEBUG << "Failed to serialize segment, mark file: " << table_file_schema_.file_id_
                         << " to to_delete";
        */
        return status;
    }

    //    execution_engine_->Serialize();

    // TODO(zhiru):
    //    table_file_schema_.file_size_ = execution_engine_->PhysicalSize();
    //    table_file_schema_.row_count_ = execution_engine_->Count();
    table_file_schema_.file_size_ = segment_writer_ptr_->Size();
    table_file_schema_.row_count_ = segment_writer_ptr_->VectorCount();

    // if index type isn't IDMAP, set file type to TO_INDEX if file size exceeds index_file_size
    // else set file type to RAW, no need to build index
    if (table_file_schema_.engine_type_ != (int)EngineType::FAISS_IDMAP &&
        table_file_schema_.engine_type_ != (int)EngineType::FAISS_BIN_IDMAP) {

@@ -105,17 +170,32 @@ MemTableFile::Serialize() {
        table_file_schema_.file_type_ = meta::TableFileSchema::RAW;
    }

    auto status = meta_->UpdateTableFile(table_file_schema_);
    // Set table file's flush_lsn so WAL can roll back and delete garbage files which can be obtained from
    // GetTableFilesByFlushLSN() in meta.
    table_file_schema_.flush_lsn_ = wal_lsn;

    status = meta_->UpdateTableFile(table_file_schema_);

    ENGINE_LOG_DEBUG << "New " << ((table_file_schema_.file_type_ == meta::TableFileSchema::RAW) ? "raw" : "to_index")
                     << " file " << table_file_schema_.file_id_ << " of size " << size << " bytes";
                     << " file " << table_file_schema_.file_id_ << " of size " << size << " bytes, lsn = " << wal_lsn;

    // TODO(zhiru): cache
    /*
    if (options_.insert_cache_immediately_) {
        execution_engine_->Cache();
    }
    */
    if (options_.insert_cache_immediately_) {
        execution_engine_->Cache();
        segment_writer_ptr_->Cache();
    }

    return status;
}
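
The file-type decision in Serialize() is worth isolating: brute-force (IDMAP) engines keep their files RAW forever, while every other engine type queues a file for index building once the raw data reaches the configured index_file_size. A hedged sketch of just that rule (enum and names are illustrative, not meta's types):

// Sketch only: RAW vs TO_INDEX decision.
#include <cstdint>

enum class FileType { RAW, TO_INDEX };

FileType DecideFileType(bool is_idmap_engine, uint64_t file_size, uint64_t index_file_size) {
    if (!is_idmap_engine && file_size >= index_file_size) {
        return FileType::TO_INDEX;  // a background builder picks this file up
    }
    return FileType::RAW;           // searched by brute force until indexed
}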

const std::string&
MemTableFile::GetSegmentId() const {
    return table_file_schema_.segment_id_;
}

}  // namespace engine
}  // namespace milvus

@@ -11,14 +11,17 @@

#pragma once

#include <segment/SegmentWriter.h>

#include <memory>
#include <string>
#include <vector>

#include "VectorSource.h"
#include "db/engine/ExecutionEngine.h"
#include "db/meta/Meta.h"
#include "utils/Status.h"

#include <memory>
#include <string>

namespace milvus {
namespace engine {

@@ -27,7 +30,13 @@ class MemTableFile {
    MemTableFile(const std::string& table_id, const meta::MetaPtr& meta, const DBOptions& options);

    Status
    Add(VectorSourcePtr& source);
    Add(const VectorSourcePtr& source);

    Status
    Delete(segment::doc_id_t doc_id);

    Status
    Delete(const std::vector<segment::doc_id_t>& doc_ids);

    size_t
    GetCurrentMem();

@@ -39,7 +48,10 @@ class MemTableFile {
    IsFull();

    Status
    Serialize();
    Serialize(uint64_t wal_lsn);

    const std::string&
    GetSegmentId() const;

 private:
    Status

@@ -52,7 +64,8 @@ class MemTableFile {
    DBOptions options_;
    size_t current_mem_;

    ExecutionEnginePtr execution_engine_;
    //    ExecutionEnginePtr execution_engine_;
    segment::SegmentWriterPtr segment_writer_ptr_;
};  // MemTableFile

using MemTableFilePtr = std::shared_ptr<MemTableFile>;


@@ -10,6 +10,10 @@
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/insert/VectorSource.h"

#include <utility>
#include <vector>

#include "db/engine/EngineFactory.h"
#include "db/engine/ExecutionEngine.h"
#include "metrics/Metrics.h"

@@ -18,14 +22,15 @@
namespace milvus {
namespace engine {

VectorSource::VectorSource(VectorsData& vectors)
    : vectors_(vectors), id_generator_(std::make_shared<SimpleIDGenerator>()) {
VectorSource::VectorSource(VectorsData vectors)
    : vectors_(std::move(vectors)), id_generator_(std::make_shared<SimpleIDGenerator>()) {
    current_num_vectors_added = 0;
}

Status
VectorSource::Add(const ExecutionEnginePtr& execution_engine, const meta::TableFileSchema& table_file_schema,
                  const size_t& num_vectors_to_add, size_t& num_vectors_added) {
VectorSource::Add(/*const ExecutionEnginePtr& execution_engine,*/ const segment::SegmentWriterPtr& segment_writer_ptr,
                  const meta::TableFileSchema& table_file_schema, const size_t& num_vectors_to_add,
                  size_t& num_vectors_added) {
    uint64_t n = vectors_.vector_count_;
    server::CollectAddMetrics metrics(n, table_file_schema.dimension_);

@@ -36,25 +41,46 @@ VectorSource::Add(const ExecutionEnginePtr& execution_engine, const meta::TableF
        id_generator_->GetNextIDNumbers(num_vectors_added, vector_ids_to_add);
    } else {
        vector_ids_to_add.resize(num_vectors_added);
        for (int pos = current_num_vectors_added; pos < current_num_vectors_added + num_vectors_added; pos++) {
        for (size_t pos = current_num_vectors_added; pos < current_num_vectors_added + num_vectors_added; pos++) {
            vector_ids_to_add[pos - current_num_vectors_added] = vectors_.id_array_[pos];
        }
    }

    Status status;
    if (!vectors_.float_data_.empty()) {
        /*
        status = execution_engine->AddWithIds(
            num_vectors_added, vectors_.float_data_.data() + current_num_vectors_added * table_file_schema.dimension_,
            vector_ids_to_add.data());
        */
        std::vector<uint8_t> vectors;
        auto size = num_vectors_added * table_file_schema.dimension_ * sizeof(float);
        vectors.resize(size);
        memcpy(vectors.data(), vectors_.float_data_.data() + current_num_vectors_added * table_file_schema.dimension_,
               size);
        status = segment_writer_ptr->AddVectors(table_file_schema.file_id_, vectors, vector_ids_to_add);
    } else if (!vectors_.binary_data_.empty()) {
        /*
        status = execution_engine->AddWithIds(
            num_vectors_added,
            vectors_.binary_data_.data() + current_num_vectors_added * SingleVectorSize(table_file_schema.dimension_),
            vector_ids_to_add.data());
        */
        std::vector<uint8_t> vectors;
        auto size = num_vectors_added * SingleVectorSize(table_file_schema.dimension_) * sizeof(uint8_t);
        vectors.resize(size);
        memcpy(
            vectors.data(),
            vectors_.binary_data_.data() + current_num_vectors_added * SingleVectorSize(table_file_schema.dimension_),
            size);
        status = segment_writer_ptr->AddVectors(table_file_schema.file_id_, vectors, vector_ids_to_add);
    }

    // Clear vector data
    if (status.ok()) {
        current_num_vectors_added += num_vectors_added;
        // TODO(zhiru): remove
        vector_ids_.insert(vector_ids_.end(), std::make_move_iterator(vector_ids_to_add.begin()),
                           std::make_move_iterator(vector_ids_to_add.end()));
    } else {

@@ -11,23 +11,26 @@

#pragma once

#include <memory>

#include "db/IDGenerator.h"
#include "db/engine/ExecutionEngine.h"
#include "db/meta/Meta.h"
#include "segment/SegmentWriter.h"
#include "utils/Status.h"

#include <memory>

namespace milvus {
namespace engine {

// TODO(zhiru): this class needs to be refactored once attributes are added

class VectorSource {
 public:
    explicit VectorSource(VectorsData& vectors);
    explicit VectorSource(VectorsData vectors);

    Status
    Add(const ExecutionEnginePtr& execution_engine, const meta::TableFileSchema& table_file_schema,
        const size_t& num_vectors_to_add, size_t& num_vectors_added);
    Add(/*const ExecutionEnginePtr& execution_engine,*/ const segment::SegmentWriterPtr& segment_writer_ptr,
        const meta::TableFileSchema& table_file_schema, const size_t& num_vectors_to_add, size_t& num_vectors_added);

    size_t
    GetNumVectorsAdded();

@@ -42,7 +45,7 @@ class VectorSource {
    GetVectorIds();

 private:
    VectorsData& vectors_;
    VectorsData vectors_;
    IDNumbers vector_ids_;

    size_t current_num_vectors_added;
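
A VectorSource is consumed in chunks: each pass adds as many vectors as the current file has room for, and the caller rolls to a new file until AllAdded() reports true (this mirrors the while loop in MemTable::Add earlier in this patch). A hedged sketch of that consumption loop, with assumed Source/FileList interfaces:

// Sketch only: drain a chunked source across rolling files.
#include <cstddef>

template <typename Source, typename FileList>
void DrainSource(Source& source, FileList& files) {
    while (!source.AllAdded()) {
        auto& file = files.CurrentOrNew();  // roll over when the file is full
        size_t added = 0;
        file.Add(source, added);            // consumes up to remaining capacity
    }
}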
@@ -11,30 +11,33 @@

#pragma once

#include "MetaTypes.h"
#include "db/Options.h"
#include "db/Types.h"
#include "utils/Status.h"

#include <cstddef>
#include <memory>
#include <string>
#include <vector>

#include "MetaTypes.h"
#include "db/Options.h"
#include "db/Types.h"
#include "utils/Status.h"

namespace milvus {
namespace engine {
namespace meta {

static const char* META_ENVIRONMENT = "Environment";
static const char* META_TABLES = "Tables";
static const char* META_TABLEFILES = "TableFiles";

class Meta {
/*
 public:
    class CleanUpFilter {
     public:
        virtual bool
        IsIgnored(const TableFileSchema& schema) = 0;
    };
*/

 public:
    virtual ~Meta() = default;

@@ -54,6 +57,15 @@ class Meta {
    virtual Status
    UpdateTableFlag(const std::string& table_id, int64_t flag) = 0;

    virtual Status
    UpdateTableFlushLSN(const std::string& table_id, uint64_t flush_lsn) = 0;

    virtual Status
    GetTableFlushLSN(const std::string& table_id, uint64_t& flush_lsn) = 0;

    virtual Status
    GetTableFilesByFlushLSN(uint64_t flush_lsn, TableFilesSchema& table_files) = 0;

    virtual Status
    DropTable(const std::string& table_id) = 0;

@@ -64,10 +76,10 @@ class Meta {
    CreateTableFile(TableFileSchema& file_schema) = 0;

    virtual Status
    DropDataByDate(const std::string& table_id, const DatesT& dates) = 0;
    GetTableFiles(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& table_files) = 0;

    virtual Status
    GetTableFiles(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& table_files) = 0;
    GetTableFilesBySegmentId(const std::string& segment_id, TableFilesSchema& table_files) = 0;

    virtual Status
    UpdateTableFile(TableFileSchema& file_schema) = 0;

@@ -88,7 +100,8 @@ class Meta {
    DropTableIndex(const std::string& table_id) = 0;

    virtual Status
    CreatePartition(const std::string& table_name, const std::string& partition_name, const std::string& tag) = 0;
    CreatePartition(const std::string& table_name, const std::string& partition_name, const std::string& tag,
                    uint64_t lsn) = 0;

    virtual Status
    DropPartition(const std::string& partition_name) = 0;

@@ -100,11 +113,10 @@ class Meta {
    GetPartitionName(const std::string& table_name, const std::string& tag, std::string& partition_name) = 0;

    virtual Status
    FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, const DatesT& dates,
                  DatePartionedTableFilesSchema& files) = 0;
    FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& files) = 0;

    virtual Status
    FilesToMerge(const std::string& table_id, DatePartionedTableFilesSchema& files) = 0;
    FilesToMerge(const std::string& table_id, TableFilesSchema& files) = 0;

    virtual Status
    FilesToIndex(TableFilesSchema&) = 0;

@@ -122,13 +134,19 @@ class Meta {
    CleanUpShadowFiles() = 0;

    virtual Status
    CleanUpFilesWithTTL(uint64_t seconds, CleanUpFilter* filter = nullptr) = 0;
    CleanUpFilesWithTTL(uint64_t seconds /*, CleanUpFilter* filter = nullptr*/) = 0;

    virtual Status
    DropAll() = 0;

    virtual Status
    Count(const std::string& table_id, uint64_t& result) = 0;

    virtual Status
    SetGlobalLastLSN(uint64_t lsn) = 0;

    virtual Status
    GetGlobalLastLSN(uint64_t& lsn) = 0;
};  // MetaData

using MetaPtr = std::shared_ptr<Meta>;
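
The new LSN hooks exist so recovery can tell what is already durable: each flush records the WAL sequence number it covers (UpdateTableFlushLSN above), and after a restart only records newer than that point need replaying. A minimal sketch of that decision (illustrative, not the meta API):

// Sketch only: WAL replay gate based on the persisted flush LSN.
#include <cstdint>

bool ShouldReplay(uint64_t record_lsn, uint64_t table_flush_lsn) {
    // Records at or below the flush LSN were already persisted before the crash.
    return record_lsn > table_flush_lsn;
}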
@@ -11,15 +11,15 @@

#pragma once

-#include "db/Constants.h"
-#include "db/engine/ExecutionEngine.h"
-#include "src/version.h"
-
#include <map>
#include <memory>
#include <string>
#include <vector>

+#include "db/Constants.h"
+#include "db/engine/ExecutionEngine.h"
+#include "src/version.h"
+
namespace milvus {
namespace engine {
namespace meta {

@@ -35,7 +35,10 @@ constexpr int64_t FLAG_MASK_HAS_USERID = 0x1 << 1;

using DateT = int;
const DateT EmptyDate = -1;
using DatesT = std::vector<DateT>;

+struct EnvironmentSchema {
+    uint64_t global_lsn_ = 0;
+};  // EnvironmentSchema
+
struct TableSchema {
    typedef enum {

@@ -56,6 +59,7 @@ struct TableSchema {
    std::string owner_table_;
    std::string partition_tag_;
    std::string version_ = CURRENT_VERSION;
+    uint64_t flush_lsn_ = 0;
};  // TableSchema

struct TableFileSchema {

@@ -72,12 +76,14 @@ struct TableFileSchema {

    size_t id_ = 0;
    std::string table_id_;
+    std::string segment_id_;
    std::string file_id_;
    int32_t file_type_ = NEW;
    size_t file_size_ = 0;
    size_t row_count_ = 0;
    DateT date_ = EmptyDate;
    uint16_t dimension_ = 0;
    // TODO(zhiru)
    std::string location_;
    int64_t updated_time_ = 0;
    int64_t created_on_ = 0;

@@ -85,11 +91,11 @@ struct TableFileSchema {
    int32_t engine_type_ = DEFAULT_ENGINE_TYPE;
    int32_t nlist_ = DEFAULT_NLIST;              // not persist to meta
    int32_t metric_type_ = DEFAULT_METRIC_TYPE;  // not persist to meta
-};  // TableFileSchema
+    uint64_t flush_lsn_ = 0;
+};  // TableFileSchema

using TableFileSchemaPtr = std::shared_ptr<meta::TableFileSchema>;
using TableFilesSchema = std::vector<TableFileSchema>;
using DatePartionedTableFilesSchema = std::map<DateT, TableFilesSchema>;

}  // namespace meta
}  // namespace engine
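The new flush_lsn_ fields tie the meta records to the write-ahead log: a table (and each table file) remembers the last LSN whose data has been flushed, so recovery can skip WAL records at or below that point. A minimal sketch of the check, assuming a replay loop with a record.lsn field (illustrative, not code from this patch):

    // Hypothetical recovery check: replay only records newer than the flush point.
    if (record.lsn > table_schema.flush_lsn_) {
        // apply this WAL record to the table
    }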
@@ -10,19 +10,12 @@
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/meta/MySQLMetaImpl.h"
-#include "MetaConsts.h"
-#include "db/IDGenerator.h"
-#include "db/Utils.h"
-#include "metrics/Metrics.h"
-#include "utils/CommonUtil.h"
-#include "utils/Exception.h"
-#include "utils/Log.h"
-#include "utils/StringHelpFunctions.h"

#include <fiu-local.h>
#include <mysql++/mysql++.h>
#include <string.h>
#include <unistd.h>

#include <boost/filesystem.hpp>
#include <chrono>
#include <fstream>

@@ -35,6 +28,16 @@
#include <string>
#include <thread>

+#include "MetaConsts.h"
+#include "db/IDGenerator.h"
+#include "db/OngoingFileChecker.h"
+#include "db/Utils.h"
+#include "metrics/Metrics.h"
+#include "utils/CommonUtil.h"
+#include "utils/Exception.h"
+#include "utils/Log.h"
+#include "utils/StringHelpFunctions.h"
+
namespace milvus {
namespace engine {
namespace meta {

@@ -146,12 +149,14 @@ static const MetaSchema TABLES_SCHEMA(META_TABLES, {
    MetaField("partition_tag", "VARCHAR(255)", "NOT NULL"),
    MetaField("version", "VARCHAR(64)",
              std::string("DEFAULT '") + CURRENT_VERSION + "'"),
+    MetaField("flush_lsn", "BIGINT", "DEFAULT 0 NOT NULL"),
});

// TableFiles schema
static const MetaSchema TABLEFILES_SCHEMA(META_TABLEFILES, {
    MetaField("id", "BIGINT", "PRIMARY KEY AUTO_INCREMENT"),
    MetaField("table_id", "VARCHAR(255)", "NOT NULL"),
+    MetaField("segment_id", "VARCHAR(255)", "NOT NULL"),
    MetaField("engine_type", "INT", "DEFAULT 1 NOT NULL"),
    MetaField("file_id", "VARCHAR(255)", "NOT NULL"),
    MetaField("file_type", "INT", "DEFAULT 0 NOT NULL"),

@@ -160,6 +165,7 @@ static const MetaSchema TABLEFILES_SCHEMA(META_TABLEFILES, {
    MetaField("updated_time", "BIGINT", "NOT NULL"),
    MetaField("created_on", "BIGINT", "NOT NULL"),
    MetaField("date", "INT", "DEFAULT -1 NOT NULL"),
+    MetaField("flush_lsn", "BIGINT", "DEFAULT 0 NOT NULL"),
});

}  // namespace
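With the two MetaField additions above, every TableFiles row now records the segment it belongs to and the WAL flush point it was produced under. For orientation, the effective column order, which the positional INSERT in CreateTableFile below must match one-for-one, becomes:

    // META_TABLEFILES columns, in declaration order:
    //   id, table_id, segment_id, engine_type, file_id, file_type,
    //   file_size, row_count, updated_time, created_on, date, flush_lsn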
@@ -395,12 +401,13 @@ MySQLMetaImpl::CreateTable(TableSchema& table_schema) {
        std::string& owner_table = table_schema.owner_table_;
        std::string& partition_tag = table_schema.partition_tag_;
        std::string& version = table_schema.version_;
+        std::string flush_lsn = std::to_string(table_schema.flush_lsn_);

        createTableQuery << "INSERT INTO " << META_TABLES << " VALUES(" << id << ", " << mysqlpp::quote << table_id
                         << ", " << state << ", " << dimension << ", " << created_on << ", " << flag << ", "
                         << index_file_size << ", " << engine_type << ", " << nlist << ", " << metric_type << ", "
                         << mysqlpp::quote << owner_table << ", " << mysqlpp::quote << partition_tag << ", "
-                        << mysqlpp::quote << version << ");";
+                        << mysqlpp::quote << version << ", " << flush_lsn << ");";

        ENGINE_LOG_DEBUG << "MySQLMetaImpl::CreateTable: " << createTableQuery.str();

@@ -438,7 +445,7 @@ MySQLMetaImpl::DescribeTable(TableSchema& table_schema) {
            mysqlpp::Query describeTableQuery = connectionPtr->query();
            describeTableQuery
                << "SELECT id, state, dimension, created_on, flag, index_file_size, engine_type, nlist, metric_type"
-                << " ,owner_table, partition_tag, version"
+                << " ,owner_table, partition_tag, version, flush_lsn"
                << " FROM " << META_TABLES << " WHERE table_id = " << mysqlpp::quote << table_schema.table_id_
                << " AND state <> " << std::to_string(TableSchema::TO_DELETE) << ";";

@@ -461,6 +468,7 @@ MySQLMetaImpl::DescribeTable(TableSchema& table_schema) {
            resRow["owner_table"].to_string(table_schema.owner_table_);
            resRow["partition_tag"].to_string(table_schema.partition_tag_);
            resRow["version"].to_string(table_schema.version_);
+            table_schema.flush_lsn_ = resRow["flush_lsn"];
        } else {
            return Status(DB_NOT_FOUND, "Table " + table_schema.table_id_ + " not found");
        }

@@ -525,9 +533,9 @@ MySQLMetaImpl::AllTables(std::vector<TableSchema>& table_schema_array) {
            mysqlpp::Query allTablesQuery = connectionPtr->query();
            allTablesQuery << "SELECT id, table_id, dimension, engine_type, nlist, index_file_size, metric_type"
-                           << " ,owner_table, partition_tag, version"
+                           << " ,owner_table, partition_tag, version, flush_lsn"
                           << " FROM " << META_TABLES << " WHERE state <> " << std::to_string(TableSchema::TO_DELETE)
-                           << ";";
+                           << " AND owner_table = \"\";";

            ENGINE_LOG_DEBUG << "MySQLMetaImpl::AllTables: " << allTablesQuery.str();

@@ -546,6 +554,7 @@ MySQLMetaImpl::AllTables(std::vector<TableSchema>& table_schema_array) {
            resRow["owner_table"].to_string(table_schema.owner_table_);
            resRow["partition_tag"].to_string(table_schema.partition_tag_);
            resRow["version"].to_string(table_schema.version_);
+            table_schema.flush_lsn_ = resRow["flush_lsn"];

            table_schema_array.emplace_back(table_schema);
        }

@@ -653,6 +662,9 @@ MySQLMetaImpl::CreateTableFile(TableFileSchema& file_schema) {
        server::MetricCollector metric;

        NextFileId(file_schema.file_id_);
+        if (file_schema.segment_id_.empty()) {
+            file_schema.segment_id_ = file_schema.file_id_;
+        }
        file_schema.dimension_ = table_schema.dimension_;
        file_schema.file_size_ = 0;
        file_schema.row_count_ = 0;

@@ -665,6 +677,7 @@ MySQLMetaImpl::CreateTableFile(TableFileSchema& file_schema) {
        std::string id = "NULL";  // auto-increment
        std::string table_id = file_schema.table_id_;
+        std::string segment_id = file_schema.segment_id_;
        std::string engine_type = std::to_string(file_schema.engine_type_);
        std::string file_id = file_schema.file_id_;
        std::string file_type = std::to_string(file_schema.file_type_);

@@ -673,6 +686,7 @@ MySQLMetaImpl::CreateTableFile(TableFileSchema& file_schema) {
        std::string updated_time = std::to_string(file_schema.updated_time_);
        std::string created_on = std::to_string(file_schema.created_on_);
        std::string date = std::to_string(file_schema.date_);
+        std::string flush_lsn = std::to_string(file_schema.flush_lsn_);

        {
            mysqlpp::ScopedConnection connectionPtr(*mysql_connection_pool_, safe_grab_);

@@ -687,9 +701,10 @@ MySQLMetaImpl::CreateTableFile(TableFileSchema& file_schema) {
            mysqlpp::Query createTableFileQuery = connectionPtr->query();

            createTableFileQuery << "INSERT INTO " << META_TABLEFILES << " VALUES(" << id << ", " << mysqlpp::quote
-                                 << table_id << ", " << engine_type << ", " << mysqlpp::quote << file_id << ", "
-                                 << file_type << ", " << file_size << ", " << row_count << ", " << updated_time << ", "
-                                 << created_on << ", " << date << ");";
+                                 << table_id << ", " << mysqlpp::quote << segment_id << ", " << engine_type << ", "
+                                 << mysqlpp::quote << file_id << ", " << file_type << ", " << file_size << ", "
+                                 << row_count << ", " << updated_time << ", " << created_on << ", " << date << ", "
+                                 << flush_lsn << ");";

            ENGINE_LOG_DEBUG << "MySQLMetaImpl::CreateTableFile: " << createTableFileQuery.str();
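The rewritten INSERT keeps that positional alignment: segment_id is spliced in right after table_id and flush_lsn is appended last. The resulting statement has roughly this shape (values elided, illustration only):

    // INSERT INTO <META_TABLEFILES> VALUES(NULL, 'table', 'segment', 1, 'file',
    //                                      0, 0, 0, <updated>, <created>, -1, <flush_lsn>);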
@@ -709,61 +724,6 @@ MySQLMetaImpl::CreateTableFile(TableFileSchema& file_schema) {
    }
}

-// TODO(myh): Delete single vecotor by id
-Status
-MySQLMetaImpl::DropDataByDate(const std::string& table_id, const DatesT& dates) {
-    if (dates.empty()) {
-        return Status::OK();
-    }
-
-    TableSchema table_schema;
-    table_schema.table_id_ = table_id;
-    auto status = DescribeTable(table_schema);
-    if (!status.ok()) {
-        return status;
-    }
-
-    try {
-        std::stringstream dateListSS;
-        for (auto& date : dates) {
-            dateListSS << std::to_string(date) << ", ";
-        }
-        std::string dateListStr = dateListSS.str();
-        dateListStr = dateListStr.substr(0, dateListStr.size() - 2);  // remove the last ", "
-
-        {
-            mysqlpp::ScopedConnection connectionPtr(*mysql_connection_pool_, safe_grab_);
-
-            bool is_null_connection = (connectionPtr == nullptr);
-            fiu_do_on("MySQLMetaImpl.DropDataByDate.null_connection", is_null_connection = true);
-            fiu_do_on("MySQLMetaImpl.DropDataByDate.throw_exception", throw std::exception(););
-            if (is_null_connection) {
-                return Status(DB_ERROR, "Failed to connect to meta server(mysql)");
-            }
-
-            mysqlpp::Query dropPartitionsByDatesQuery = connectionPtr->query();
-
-            dropPartitionsByDatesQuery << "UPDATE " << META_TABLEFILES
-                                       << " SET file_type = " << std::to_string(TableFileSchema::TO_DELETE)
-                                       << " ,updated_time = " << utils::GetMicroSecTimeStamp()
-                                       << " WHERE table_id = " << mysqlpp::quote << table_id << " AND date in ("
-                                       << dateListStr << ");";
-
-            ENGINE_LOG_DEBUG << "MySQLMetaImpl::DropDataByDate: " << dropPartitionsByDatesQuery.str();
-
-            if (!dropPartitionsByDatesQuery.exec()) {
-                return HandleException("QUERY ERROR WHEN DROPPING PARTITIONS BY DATES",
-                                       dropPartitionsByDatesQuery.error());
-            }
-        }  // Scoped Connection
-
-        ENGINE_LOG_DEBUG << "Successfully drop data by date, table id = " << table_schema.table_id_;
-    } catch (std::exception& e) {
-        return HandleException("GENERAL ERROR WHEN DROPPING PARTITIONS BY DATES", e.what());
-    }
-    return Status::OK();
-}
-
Status
MySQLMetaImpl::GetTableFiles(const std::string& table_id, const std::vector<size_t>& ids,
                             TableFilesSchema& table_files) {

@@ -791,10 +751,11 @@ MySQLMetaImpl::GetTableFiles(const std::string& table_id, const std::vector<size
            }

            mysqlpp::Query getTableFileQuery = connectionPtr->query();
-            getTableFileQuery << "SELECT id, engine_type, file_id, file_type, file_size, row_count, date, created_on"
-                              << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id
-                              << " AND (" << idStr << ")"
-                              << " AND file_type <> " << std::to_string(TableFileSchema::TO_DELETE) << ";";
+            getTableFileQuery
+                << "SELECT id, segment_id, engine_type, file_id, file_type, file_size, row_count, date, created_on"
+                << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id << " AND ("
+                << idStr << ")"
+                << " AND file_type <> " << std::to_string(TableFileSchema::TO_DELETE) << ";";

            ENGINE_LOG_DEBUG << "MySQLMetaImpl::GetTableFiles: " << getTableFileQuery.str();

@@ -810,6 +771,7 @@ MySQLMetaImpl::GetTableFiles(const std::string& table_id, const std::vector<size
            TableFileSchema file_schema;
            file_schema.id_ = resRow["id"];
            file_schema.table_id_ = table_id;
+            resRow["segment_id"].to_string(file_schema.segment_id_);
            file_schema.index_file_size_ = table_schema.index_file_size_;
            file_schema.engine_type_ = resRow["engine_type"];
            file_schema.nlist_ = table_schema.nlist_;

@@ -833,6 +795,66 @@ MySQLMetaImpl::GetTableFiles(const std::string& table_id, const std::vector<size
    }
}

+Status
+MySQLMetaImpl::GetTableFilesBySegmentId(const std::string& segment_id,
+                                        milvus::engine::meta::TableFilesSchema& table_files) {
+    try {
+        mysqlpp::StoreQueryResult res;
+        {
+            mysqlpp::ScopedConnection connectionPtr(*mysql_connection_pool_, safe_grab_);
+
+            if (connectionPtr == nullptr) {
+                return Status(DB_ERROR, "Failed to connect to meta server(mysql)");
+            }
+
+            mysqlpp::Query getTableFileQuery = connectionPtr->query();
+            getTableFileQuery << "SELECT id, table_id, segment_id, engine_type, file_id, file_type, file_size, "
+                              << "row_count, date, created_on"
+                              << " FROM " << META_TABLEFILES << " WHERE segment_id = " << mysqlpp::quote << segment_id
+                              << " AND file_type <> " << std::to_string(TableFileSchema::TO_DELETE) << ";";
+
+            ENGINE_LOG_DEBUG << "MySQLMetaImpl::GetTableFilesBySegmentId: " << getTableFileQuery.str();
+
+            res = getTableFileQuery.store();
+        }  // Scoped Connection
+
+        if (!res.empty()) {
+            TableSchema table_schema;
+            res[0]["table_id"].to_string(table_schema.table_id_);
+            auto status = DescribeTable(table_schema);
+            if (!status.ok()) {
+                return status;
+            }
+
+            for (auto& resRow : res) {
+                TableFileSchema file_schema;
+                file_schema.id_ = resRow["id"];
+                file_schema.table_id_ = table_schema.table_id_;
+                resRow["segment_id"].to_string(file_schema.segment_id_);
+                file_schema.index_file_size_ = table_schema.index_file_size_;
+                file_schema.engine_type_ = resRow["engine_type"];
+                file_schema.nlist_ = table_schema.nlist_;
+                file_schema.metric_type_ = table_schema.metric_type_;
+                resRow["file_id"].to_string(file_schema.file_id_);
+                file_schema.file_type_ = resRow["file_type"];
+                file_schema.file_size_ = resRow["file_size"];
+                file_schema.row_count_ = resRow["row_count"];
+                file_schema.date_ = resRow["date"];
+                file_schema.created_on_ = resRow["created_on"];
+                file_schema.dimension_ = table_schema.dimension_;
+
+                utils::GetTableFilePath(options_, file_schema);
+                table_files.emplace_back(file_schema);
+            }
+        }
+
+        ENGINE_LOG_DEBUG << "Get table files by segment id";
+        return Status::OK();
+    } catch (std::exception& e) {
+        return HandleException("GENERAL ERROR WHEN RETRIEVING TABLE FILES BY SEGMENT ID", e.what());
+    }
+}
+
Status
MySQLMetaImpl::UpdateTableIndex(const std::string& table_id, const TableIndex& index) {
    try {

@@ -924,6 +946,113 @@ MySQLMetaImpl::UpdateTableFlag(const std::string& table_id, int64_t flag) {
    return Status::OK();
}

+Status
+MySQLMetaImpl::UpdateTableFlushLSN(const std::string& table_id, uint64_t flush_lsn) {
+    try {
+        server::MetricCollector metric;
+
+        {
+            mysqlpp::ScopedConnection connectionPtr(*mysql_connection_pool_, safe_grab_);
+
+            if (connectionPtr == nullptr) {
+                return Status(DB_ERROR, "Failed to connect to meta server(mysql)");
+            }
+
+            mysqlpp::Query updateTableFlagQuery = connectionPtr->query();
+            updateTableFlagQuery << "UPDATE " << META_TABLES << " SET flush_lsn = " << flush_lsn
+                                 << " WHERE table_id = " << mysqlpp::quote << table_id << ";";
+
+            ENGINE_LOG_DEBUG << "MySQLMetaImpl::UpdateTableFlushLSN: " << updateTableFlagQuery.str();
+
+            if (!updateTableFlagQuery.exec()) {
+                return HandleException("QUERY ERROR WHEN UPDATING TABLE FLUSH_LSN", updateTableFlagQuery.error());
+            }
+        }  // Scoped Connection
+
+        ENGINE_LOG_DEBUG << "Successfully update table flush_lsn, table id = " << table_id;
+    } catch (std::exception& e) {
+        return HandleException("GENERAL ERROR WHEN UPDATING TABLE FLUSH_LSN", e.what());
+    }
+
+    return Status::OK();
+}
+
+Status
+MySQLMetaImpl::GetTableFlushLSN(const std::string& table_id, uint64_t& flush_lsn) {
+    return Status::OK();
+}
+
+Status
+MySQLMetaImpl::GetTableFilesByFlushLSN(uint64_t flush_lsn, TableFilesSchema& table_files) {
+    table_files.clear();
+
+    try {
+        server::MetricCollector metric;
+        mysqlpp::StoreQueryResult res;
+        {
+            mysqlpp::ScopedConnection connectionPtr(*mysql_connection_pool_, safe_grab_);
+
+            if (connectionPtr == nullptr) {
+                return Status(DB_ERROR, "Failed to connect to meta server(mysql)");
+            }
+
+            mysqlpp::Query filesToIndexQuery = connectionPtr->query();
+            filesToIndexQuery << "SELECT id, table_id, segment_id, engine_type, file_id, file_type, file_size, "
+                                 "row_count, date, created_on"
+                              << " FROM " << META_TABLEFILES << " WHERE flush_lsn = " << flush_lsn << ";";
+
+            ENGINE_LOG_DEBUG << "MySQLMetaImpl::GetTableFilesByFlushLSN: " << filesToIndexQuery.str();
+
+            res = filesToIndexQuery.store();
+        }  // Scoped Connection
+
+        Status ret;
+        std::map<std::string, TableSchema> groups;
+        TableFileSchema table_file;
+        for (auto& resRow : res) {
+            table_file.id_ = resRow["id"];  // implicit conversion
+            resRow["table_id"].to_string(table_file.table_id_);
+            resRow["segment_id"].to_string(table_file.segment_id_);
+            table_file.engine_type_ = resRow["engine_type"];
+            resRow["file_id"].to_string(table_file.file_id_);
+            table_file.file_type_ = resRow["file_type"];
+            table_file.file_size_ = resRow["file_size"];
+            table_file.row_count_ = resRow["row_count"];
+            table_file.date_ = resRow["date"];
+            table_file.created_on_ = resRow["created_on"];
+
+            auto groupItr = groups.find(table_file.table_id_);
+            if (groupItr == groups.end()) {
+                TableSchema table_schema;
+                table_schema.table_id_ = table_file.table_id_;
+                auto status = DescribeTable(table_schema);
+                if (!status.ok()) {
+                    return status;
+                }
+                groups[table_file.table_id_] = table_schema;
+            }
+            table_file.dimension_ = groups[table_file.table_id_].dimension_;
+            table_file.index_file_size_ = groups[table_file.table_id_].index_file_size_;
+            table_file.nlist_ = groups[table_file.table_id_].nlist_;
+            table_file.metric_type_ = groups[table_file.table_id_].metric_type_;
+
+            auto status = utils::GetTableFilePath(options_, table_file);
+            if (!status.ok()) {
+                ret = status;
+            }
+
+            table_files.push_back(table_file);
+        }
+
+        if (res.size() > 0) {
+            ENGINE_LOG_DEBUG << "Collect " << res.size() << " files with flush_lsn = " << flush_lsn;
+        }
+        return ret;
+    } catch (std::exception& e) {
+        return HandleException("GENERAL ERROR WHEN FINDING TABLE FILES BY LSN", e.what());
+    }
+}
+
// ZR: this function assumes all fields in file_schema have value
Status
MySQLMetaImpl::UpdateTableFile(TableFileSchema& file_schema) {

@@ -1213,7 +1342,8 @@ MySQLMetaImpl::DropTableIndex(const std::string& table_id) {
}

Status
-MySQLMetaImpl::CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag) {
+MySQLMetaImpl::CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag,
+                               uint64_t lsn) {
    server::MetricCollector metric;

    TableSchema table_schema;

@@ -1252,6 +1382,7 @@ MySQLMetaImpl::CreatePartition(const std::string& table_id, const std::string& p
    table_schema.created_on_ = utils::GetMicroSecTimeStamp();
    table_schema.owner_table_ = table_id;
    table_schema.partition_tag_ = valid_tag;
+    table_schema.flush_lsn_ = lsn;

    status = CreateTable(table_schema);
    fiu_do_on("MySQLMetaImpl.CreatePartition.aleady_exist", status = Status(DB_ALREADY_EXIST, ""));

@@ -1283,8 +1414,10 @@ MySQLMetaImpl::ShowPartitions(const std::string& table_id, std::vector<meta::Tab
        }

        mysqlpp::Query allPartitionsQuery = connectionPtr->query();
-        allPartitionsQuery << "SELECT table_id FROM " << META_TABLES << " WHERE owner_table = " << mysqlpp::quote
-                           << table_id << " AND state <> " << std::to_string(TableSchema::TO_DELETE) << ";";
+        allPartitionsQuery << "SELECT table_id, id, state, dimension, created_on, flag, index_file_size,"
+                           << " engine_type, nlist, metric_type, partition_tag, version FROM " << META_TABLES
+                           << " WHERE owner_table = " << mysqlpp::quote << table_id << " AND state <> "
+                           << std::to_string(TableSchema::TO_DELETE) << ";";

        ENGINE_LOG_DEBUG << "MySQLMetaImpl::AllTables: " << allPartitionsQuery.str();

@@ -1294,7 +1427,19 @@ MySQLMetaImpl::ShowPartitions(const std::string& table_id, std::vector<meta::Tab
        for (auto& resRow : res) {
            meta::TableSchema partition_schema;
            resRow["table_id"].to_string(partition_schema.table_id_);
-            DescribeTable(partition_schema);
+            partition_schema.id_ = resRow["id"];  // implicit conversion
+            partition_schema.state_ = resRow["state"];
+            partition_schema.dimension_ = resRow["dimension"];
+            partition_schema.created_on_ = resRow["created_on"];
+            partition_schema.flag_ = resRow["flag"];
+            partition_schema.index_file_size_ = resRow["index_file_size"];
+            partition_schema.engine_type_ = resRow["engine_type"];
+            partition_schema.nlist_ = resRow["nlist"];
+            partition_schema.metric_type_ = resRow["metric_type"];
+            partition_schema.owner_table_ = table_id;
+            resRow["partition_tag"].to_string(partition_schema.partition_tag_);
+            resRow["version"].to_string(partition_schema.version_);

            partition_schema_array.emplace_back(partition_schema);
        }
    } catch (std::exception& e) {

@@ -1349,8 +1494,7 @@ MySQLMetaImpl::GetPartitionName(const std::string& table_id, const std::string&
}

Status
-MySQLMetaImpl::FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, const DatesT& dates,
-                             DatePartionedTableFilesSchema& files) {
+MySQLMetaImpl::FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& files) {
    files.clear();

    try {

@@ -1367,19 +1511,9 @@ MySQLMetaImpl::FilesToSearch(const std::string& table_id, const std::vector<size
            }

            mysqlpp::Query filesToSearchQuery = connectionPtr->query();
-            filesToSearchQuery << "SELECT id, table_id, engine_type, file_id, file_type, file_size, row_count, date"
-                               << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id;
-
-            if (!dates.empty()) {
-                std::stringstream partitionListSS;
-                for (auto& date : dates) {
-                    partitionListSS << std::to_string(date) << ", ";
-                }
-                std::string partitionListStr = partitionListSS.str();
-
-                partitionListStr = partitionListStr.substr(0, partitionListStr.size() - 2);  // remove the last ", "
-                filesToSearchQuery << " AND date IN (" << partitionListStr << ")";
-            }
+            filesToSearchQuery
+                << "SELECT id, table_id, segment_id, engine_type, file_id, file_type, file_size, row_count, date"
+                << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id;

            if (!ids.empty()) {
                std::stringstream idSS;

@@ -1410,10 +1544,11 @@ MySQLMetaImpl::FilesToSearch(const std::string& table_id, const std::vector<size
        }

        Status ret;
-        TableFileSchema table_file;
        for (auto& resRow : res) {
+            TableFileSchema table_file;
            table_file.id_ = resRow["id"];  // implicit conversion
            resRow["table_id"].to_string(table_file.table_id_);
+            resRow["segment_id"].to_string(table_file.segment_id_);
            table_file.index_file_size_ = table_schema.index_file_size_;
            table_file.engine_type_ = resRow["engine_type"];
            table_file.nlist_ = table_schema.nlist_;

@@ -1430,12 +1565,7 @@ MySQLMetaImpl::FilesToSearch(const std::string& table_id, const std::vector<size
                ret = status;
            }

-            auto dateItr = files.find(table_file.date_);
-            if (dateItr == files.end()) {
-                files[table_file.date_] = TableFilesSchema();
-            }
-
-            files[table_file.date_].push_back(table_file);
+            files.emplace_back(table_file);
        }

        if (res.size() > 0) {

@@ -1448,7 +1578,7 @@ MySQLMetaImpl::FilesToSearch(const std::string& table_id, const std::vector<size
}

Status
-MySQLMetaImpl::FilesToMerge(const std::string& table_id, DatePartionedTableFilesSchema& files) {
+MySQLMetaImpl::FilesToMerge(const std::string& table_id, TableFilesSchema& files) {
    files.clear();

    try {

@@ -1474,10 +1604,11 @@ MySQLMetaImpl::FilesToMerge(const std::string& table_id, DatePartionedTableFiles
        }

        mysqlpp::Query filesToMergeQuery = connectionPtr->query();
-        filesToMergeQuery
-            << "SELECT id, table_id, file_id, file_type, file_size, row_count, date, engine_type, created_on"
-            << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id
-            << " AND file_type = " << std::to_string(TableFileSchema::RAW) << " ORDER BY row_count DESC;";
+        filesToMergeQuery << "SELECT id, table_id, segment_id, file_id, file_type, file_size, row_count, date, "
+                             "engine_type, created_on"
+                          << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id
+                          << " AND file_type = " << std::to_string(TableFileSchema::RAW)
+                          << " ORDER BY row_count DESC;";

        ENGINE_LOG_DEBUG << "MySQLMetaImpl::FilesToMerge: " << filesToMergeQuery.str();

@@ -1495,6 +1626,7 @@ MySQLMetaImpl::FilesToMerge(const std::string& table_id, DatePartionedTableFiles
            table_file.id_ = resRow["id"];  // implicit conversion
            resRow["table_id"].to_string(table_file.table_id_);
+            resRow["segment_id"].to_string(table_file.segment_id_);
            resRow["file_id"].to_string(table_file.file_id_);
            table_file.file_type_ = resRow["file_type"];
            table_file.row_count_ = resRow["row_count"];

@@ -1511,13 +1643,8 @@ MySQLMetaImpl::FilesToMerge(const std::string& table_id, DatePartionedTableFiles
                ret = status;
            }

-            auto dateItr = files.find(table_file.date_);
-            if (dateItr == files.end()) {
-                files[table_file.date_] = TableFilesSchema();
-                ++to_merge_files;
-            }
-
-            files[table_file.date_].push_back(table_file);
+            files.emplace_back(table_file);
+            ++to_merge_files;
        }

        if (to_merge_files > 0) {

@@ -1547,10 +1674,10 @@ MySQLMetaImpl::FilesToIndex(TableFilesSchema& files) {
            }

            mysqlpp::Query filesToIndexQuery = connectionPtr->query();
-            filesToIndexQuery
-                << "SELECT id, table_id, engine_type, file_id, file_type, file_size, row_count, date, created_on"
-                << " FROM " << META_TABLEFILES << " WHERE file_type = " << std::to_string(TableFileSchema::TO_INDEX)
-                << ";";
+            filesToIndexQuery << "SELECT id, table_id, segment_id, engine_type, file_id, file_type, file_size, "
+                                 "row_count, date, created_on"
+                              << " FROM " << META_TABLEFILES
+                              << " WHERE file_type = " << std::to_string(TableFileSchema::TO_INDEX) << ";";

            ENGINE_LOG_DEBUG << "MySQLMetaImpl::FilesToIndex: " << filesToIndexQuery.str();

@@ -1563,6 +1690,7 @@ MySQLMetaImpl::FilesToIndex(TableFilesSchema& files) {
        for (auto& resRow : res) {
            table_file.id_ = resRow["id"];  // implicit conversion
            resRow["table_id"].to_string(table_file.table_id_);
+            resRow["segment_id"].to_string(table_file.segment_id_);
            table_file.engine_type_ = resRow["engine_type"];
            resRow["file_id"].to_string(table_file.file_id_);
            table_file.file_type_ = resRow["file_type"];

@@ -1610,6 +1738,8 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
        return Status(DB_ERROR, "file types array is empty");
    }

+    Status ret = Status::OK();
+
    try {
        table_files.clear();

@@ -1635,7 +1765,7 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
            mysqlpp::Query hasNonIndexFilesQuery = connectionPtr->query();
            // since table_id is a unique column we just need to check whether it exists or not
            hasNonIndexFilesQuery
-                << "SELECT id, engine_type, file_id, file_type, file_size, row_count, date, created_on"
+                << "SELECT id, segment_id, engine_type, file_id, file_type, file_size, row_count, date, created_on"
                << " FROM " << META_TABLEFILES << " WHERE table_id = " << mysqlpp::quote << table_id
                << " AND file_type in (" << types << ");";

@@ -1644,6 +1774,13 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
            res = hasNonIndexFilesQuery.store();
        }  // Scoped Connection

+        TableSchema table_schema;
+        table_schema.table_id_ = table_id;
+        auto status = DescribeTable(table_schema);
+        if (!status.ok()) {
+            return status;
+        }
+
        if (res.num_rows() > 0) {
            int raw_count = 0, new_count = 0, new_merge_count = 0, new_index_count = 0;
            int to_index_count = 0, index_count = 0, backup_count = 0;

@@ -1651,6 +1788,7 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
                TableFileSchema file_schema;
                file_schema.id_ = resRow["id"];
                file_schema.table_id_ = table_id;
+                resRow["segment_id"].to_string(file_schema.segment_id_);
                file_schema.engine_type_ = resRow["engine_type"];
                resRow["file_id"].to_string(file_schema.file_id_);
                file_schema.file_type_ = resRow["file_type"];

@@ -1659,6 +1797,16 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
                file_schema.date_ = resRow["date"];
                file_schema.created_on_ = resRow["created_on"];

+                file_schema.index_file_size_ = table_schema.index_file_size_;
+                file_schema.nlist_ = table_schema.nlist_;
+                file_schema.metric_type_ = table_schema.metric_type_;
+                file_schema.dimension_ = table_schema.dimension_;
+
+                auto status = utils::GetTableFilePath(options_, file_schema);
+                if (!status.ok()) {
+                    ret = status;
+                }
+
                table_files.emplace_back(file_schema);

                int32_t file_type = resRow["file_type"];

@@ -1723,7 +1871,7 @@ MySQLMetaImpl::FilesByType(const std::string& table_id, const std::vector<int>&
        return HandleException("GENERAL ERROR WHEN GET FILE BY TYPE", e.what());
    }

-    return Status::OK();
+    return ret;
}

// TODO(myh): Support swap to cloud storage

@@ -1866,7 +2014,7 @@ MySQLMetaImpl::CleanUpShadowFiles() {
}

Status
-MySQLMetaImpl::CleanUpFilesWithTTL(uint64_t seconds, CleanUpFilter* filter) {
+MySQLMetaImpl::CleanUpFilesWithTTL(uint64_t seconds /*, CleanUpFilter* filter*/) {
    auto now = utils::GetMicroSecTimeStamp();
    std::set<std::string> table_ids;

@@ -1886,7 +2034,7 @@ MySQLMetaImpl::CleanUpFilesWithTTL(uint64_t seconds, CleanUpFilter* filter) {
            }

            mysqlpp::Query query = connectionPtr->query();
-            query << "SELECT id, table_id, file_id, file_type, date"
+            query << "SELECT id, table_id, segment_id, engine_type, file_id, file_type, date"
                  << " FROM " << META_TABLEFILES << " WHERE file_type IN ("
                  << std::to_string(TableFileSchema::TO_DELETE) << "," << std::to_string(TableFileSchema::BACKUP) << ")"
                  << " AND updated_time < " << std::to_string(now - seconds * US_PS) << ";";

@@ -1902,12 +2050,14 @@ MySQLMetaImpl::CleanUpFilesWithTTL(uint64_t seconds, CleanUpFilter* filter) {
            for (auto& resRow : res) {
                table_file.id_ = resRow["id"];  // implicit conversion
                resRow["table_id"].to_string(table_file.table_id_);
+                resRow["segment_id"].to_string(table_file.segment_id_);
+                table_file.engine_type_ = resRow["engine_type"];
                resRow["file_id"].to_string(table_file.file_id_);
                table_file.date_ = resRow["date"];
                table_file.file_type_ = resRow["file_type"];

                // check if the file can be deleted
-                if (filter && filter->IsIgnored(table_file)) {
+                if (OngoingFileChecker::GetInstance().IsIgnored(table_file)) {
                    ENGINE_LOG_DEBUG << "File:" << table_file.file_id_
                                     << " currently is in use, not able to delete now";
                    continue;  // ignore this file, don't delete it
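The injected CleanUpFilter is gone: cleanup now asks the process-wide OngoingFileChecker singleton whether a file is still in use (see the db/OngoingFileChecker.h include added at the top of this file). Conceptually the checker is a mark/unmark registry around readers and merges; a rough sketch of the intended usage, with hypothetical method names, not the actual class definition:

    // Sketch only: a file marked as ongoing is skipped by CleanUpFilesWithTTL.
    // OngoingFileChecker::GetInstance().MarkOngoingFile(file);    // before reading/merging
    // ... use the file ...
    // OngoingFileChecker::GetInstance().UnmarkOngoingFile(file);  // when done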
@@ -1919,9 +2069,19 @@ MySQLMetaImpl::CleanUpFilesWithTTL(uint64_t seconds, CleanUpFilter* filter) {
                server::CommonUtil::EraseFromCache(table_file.location_);

                if (table_file.file_type_ == (int)TableFileSchema::TO_DELETE) {
-                    // delete file from disk storage
-                    utils::DeleteTableFilePath(options_, table_file);
-                    ENGINE_LOG_DEBUG << "Remove file id:" << table_file.id_ << " location:" << table_file.location_;
+                    // If we are deleting a raw table file, it means it's okay to delete the entire segment directory.
+                    // Else, we can only delete the single file
+                    // TODO(zhiru): We determine whether a table file is raw by its engine type. This is a bit hacky
+                    if (table_file.engine_type_ == (int32_t)EngineType::FAISS_IDMAP ||
+                        table_file.engine_type_ == (int32_t)EngineType::FAISS_BIN_IDMAP) {
+                        utils::DeleteSegment(options_, table_file);
+                        std::string segment_dir;
+                        utils::GetParentPath(table_file.location_, segment_dir);
+                        ENGINE_LOG_DEBUG << "Remove segment directory: " << segment_dir;
+                    } else {
+                        utils::DeleteTableFilePath(options_, table_file);
+                        ENGINE_LOG_DEBUG << "Remove table file: " << table_file.location_;
+                    }

                    idsToDelete.emplace_back(std::to_string(table_file.id_));
                    table_ids.insert(table_file.table_id_);

@@ -2196,6 +2356,16 @@ MySQLMetaImpl::DiscardFiles(int64_t to_discard_size) {
    }
}

+Status
+MySQLMetaImpl::SetGlobalLastLSN(uint64_t lsn) {
+    return Status::OK();
+}
+
+Status
+MySQLMetaImpl::GetGlobalLastLSN(uint64_t& lsn) {
+    return Status::OK();
+}
+
}  // namespace meta
}  // namespace engine
}  // namespace milvus
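Taken together, the new meta calls give flush a simple protocol. A condensed sketch of how a flush might drive them, assuming a meta::MetaPtr meta and the LSN of the last WAL record applied to the table (illustrative only, not this patch's DBImpl code):

    uint64_t applied_lsn = 0;  // assumed: obtained from the WAL applier
    meta->UpdateTableFlushLSN(table_id, applied_lsn);   // persist the flush point

    meta::TableFilesSchema files;
    meta->GetTableFilesByFlushLSN(applied_lsn, files);  // files tagged with this LSN
    // once these files are durable, wal files up to applied_lsn can be removed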
@@ -11,16 +11,17 @@

#pragma once

-#include "Meta.h"
-#include "MySQLConnectionPool.h"
-#include "db/Options.h"
-
#include <mysql++/mysql++.h>

#include <memory>
#include <mutex>
#include <string>
#include <vector>

+#include "Meta.h"
+#include "MySQLConnectionPool.h"
+#include "db/Options.h"
+
namespace milvus {
namespace engine {
namespace meta {

@@ -52,10 +53,10 @@ class MySQLMetaImpl : public Meta {
    CreateTableFile(TableFileSchema& file_schema) override;

    Status
-    DropDataByDate(const std::string& table_id, const DatesT& dates) override;
+    GetTableFiles(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& table_files) override;

    Status
-    GetTableFiles(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& table_files) override;
+    GetTableFilesBySegmentId(const std::string& segment_id, TableFilesSchema& table_files) override;

    Status
    UpdateTableIndex(const std::string& table_id, const TableIndex& index) override;

@@ -63,6 +64,15 @@ class MySQLMetaImpl : public Meta {
    Status
    UpdateTableFlag(const std::string& table_id, int64_t flag) override;

+    Status
+    UpdateTableFlushLSN(const std::string& table_id, uint64_t flush_lsn) override;
+
+    Status
+    GetTableFlushLSN(const std::string& table_id, uint64_t& flush_lsn) override;
+
+    Status
+    GetTableFilesByFlushLSN(uint64_t flush_lsn, TableFilesSchema& table_files) override;
+
    Status
    UpdateTableFile(TableFileSchema& file_schema) override;

@@ -79,7 +89,8 @@ class MySQLMetaImpl : public Meta {
    DropTableIndex(const std::string& table_id) override;

    Status
-    CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag) override;
+    CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag,
+                    uint64_t lsn) override;

    Status
    DropPartition(const std::string& partition_name) override;

@@ -91,11 +102,10 @@ class MySQLMetaImpl : public Meta {
    GetPartitionName(const std::string& table_id, const std::string& tag, std::string& partition_name) override;

    Status
-    FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, const DatesT& dates,
-                  DatePartionedTableFilesSchema& files) override;
+    FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& files) override;

    Status
-    FilesToMerge(const std::string& table_id, DatePartionedTableFilesSchema& files) override;
+    FilesToMerge(const std::string& table_id, TableFilesSchema& files) override;

    Status
    FilesToIndex(TableFilesSchema&) override;

@@ -114,7 +124,7 @@ class MySQLMetaImpl : public Meta {
    CleanUpShadowFiles() override;

    Status
-    CleanUpFilesWithTTL(uint64_t seconds, CleanUpFilter* filter = nullptr) override;
+    CleanUpFilesWithTTL(uint64_t seconds /*, CleanUpFilter* filter = nullptr*/) override;

    Status
    DropAll() override;

@@ -122,6 +132,12 @@ class MySQLMetaImpl : public Meta {
    Status
    Count(const std::string& table_id, uint64_t& result) override;

+    Status
+    SetGlobalLastLSN(uint64_t lsn) override;
+
+    Status
+    GetGlobalLastLSN(uint64_t& lsn) override;
+
 private:
    Status
    NextFileId(std::string& file_id);
(File diff suppressed because it is too large.)
@@ -11,13 +11,13 @@

#pragma once

-#include "Meta.h"
-#include "db/Options.h"
-
#include <mutex>
#include <string>
#include <vector>

+#include "Meta.h"
+#include "db/Options.h"
+
namespace milvus {
namespace engine {
namespace meta {

@@ -52,10 +52,10 @@ class SqliteMetaImpl : public Meta {
    CreateTableFile(TableFileSchema& file_schema) override;

    Status
-    DropDataByDate(const std::string& table_id, const DatesT& dates) override;
+    GetTableFiles(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& table_files) override;

    Status
-    GetTableFiles(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& table_files) override;
+    GetTableFilesBySegmentId(const std::string& segment_id, TableFilesSchema& table_files) override;

    Status
    UpdateTableIndex(const std::string& table_id, const TableIndex& index) override;

@@ -63,6 +63,15 @@ class SqliteMetaImpl : public Meta {
    Status
    UpdateTableFlag(const std::string& table_id, int64_t flag) override;

+    Status
+    UpdateTableFlushLSN(const std::string& table_id, uint64_t flush_lsn) override;
+
+    Status
+    GetTableFlushLSN(const std::string& table_id, uint64_t& flush_lsn) override;
+
+    Status
+    GetTableFilesByFlushLSN(uint64_t flush_lsn, TableFilesSchema& table_files) override;
+
    Status
    UpdateTableFile(TableFileSchema& file_schema) override;

@@ -79,7 +88,8 @@ class SqliteMetaImpl : public Meta {
    DropTableIndex(const std::string& table_id) override;

    Status
-    CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag) override;
+    CreatePartition(const std::string& table_id, const std::string& partition_name, const std::string& tag,
+                    uint64_t lsn) override;

    Status
    DropPartition(const std::string& partition_name) override;

@@ -91,11 +101,10 @@ class SqliteMetaImpl : public Meta {
    GetPartitionName(const std::string& table_id, const std::string& tag, std::string& partition_name) override;

    Status
-    FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, const DatesT& dates,
-                  DatePartionedTableFilesSchema& files) override;
+    FilesToSearch(const std::string& table_id, const std::vector<size_t>& ids, TableFilesSchema& files) override;

    Status
-    FilesToMerge(const std::string& table_id, DatePartionedTableFilesSchema& files) override;
+    FilesToMerge(const std::string& table_id, TableFilesSchema& files) override;

    Status
    FilesToIndex(TableFilesSchema&) override;

@@ -114,7 +123,7 @@ class SqliteMetaImpl : public Meta {
    CleanUpShadowFiles() override;

    Status
-    CleanUpFilesWithTTL(uint64_t seconds, CleanUpFilter* filter = nullptr) override;
+    CleanUpFilesWithTTL(uint64_t seconds /*, CleanUpFilter* filter = nullptr*/) override;

    Status
    DropAll() override;

@@ -122,6 +131,12 @@ class SqliteMetaImpl : public Meta {
    Status
    Count(const std::string& table_id, uint64_t& result) override;

+    Status
+    SetGlobalLastLSN(uint64_t lsn) override;
+
+    Status
+    GetGlobalLastLSN(uint64_t& lsn) override;
+
 private:
    Status
    NextFileId(std::string& file_id);
@@ -0,0 +1,420 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/wal/WalBuffer.h"

#include <cstring>

#include "db/wal/WalDefinations.h"
#include "utils/Log.h"

namespace milvus {
namespace engine {
namespace wal {

inline std::string
ToFileName(int32_t file_no) {
    return std::to_string(file_no) + ".wal";
}

inline void
BuildLsn(uint32_t file_no, uint32_t offset, uint64_t& lsn) {
    lsn = (uint64_t)file_no << 32 | offset;
}

inline void
ParserLsn(uint64_t lsn, uint32_t& file_no, uint32_t& offset) {
    file_no = uint32_t(lsn >> 32);
    offset = uint32_t(lsn & LSN_OFFSET_MASK);
}
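// Worked example of the LSN layout used by BuildLsn/ParserLsn above
// (illustration only, not part of the original file):
//   BuildLsn(3, 0x20, lsn)  ->  lsn = 0x0000000300000020
//   ParserLsn(lsn, f, o)    ->  f = 3, o = 0x20
// The wal file number lives in the high 32 bits and the byte offset in the
// low 32 bits (LSN_OFFSET_MASK keeps the low half), so LSNs grow
// monotonically across file switches and one .wal file addresses at most 4 GB.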
MXLogBuffer::MXLogBuffer(const std::string& mxlog_path, const uint32_t buffer_size)
    : mxlog_buffer_size_(buffer_size), mxlog_writer_(mxlog_path) {
    if (mxlog_buffer_size_ < (uint32_t)WAL_BUFFER_MIN_SIZE) {
        WAL_LOG_INFO << "config wal buffer size is too small " << mxlog_buffer_size_;
        mxlog_buffer_size_ = (uint32_t)WAL_BUFFER_MIN_SIZE;
    } else if (mxlog_buffer_size_ > (uint32_t)WAL_BUFFER_MAX_SIZE) {
        WAL_LOG_INFO << "config wal buffer size is too large " << mxlog_buffer_size_;
        mxlog_buffer_size_ = (uint32_t)WAL_BUFFER_MAX_SIZE;
    }
}

MXLogBuffer::~MXLogBuffer() {
}

/**
 * alloc space for buffers and load any data that needs recovery
 * @param start_lsn
 * @param end_lsn
 * @return
 */
bool
MXLogBuffer::Init(uint64_t start_lsn, uint64_t end_lsn) {
    WAL_LOG_DEBUG << "start_lsn " << start_lsn << " end_lsn " << end_lsn;

    ParserLsn(start_lsn, mxlog_buffer_reader_.file_no, mxlog_buffer_reader_.buf_offset);
    ParserLsn(end_lsn, mxlog_buffer_writer_.file_no, mxlog_buffer_writer_.buf_offset);

    if (start_lsn == end_lsn) {
        // no data need recovery, start a new file_no
        if (mxlog_buffer_writer_.buf_offset != 0) {
            mxlog_buffer_writer_.file_no++;
            mxlog_buffer_writer_.buf_offset = 0;
            mxlog_buffer_reader_.file_no++;
            mxlog_buffer_reader_.buf_offset = 0;
        }
    } else {
        // to check whether buffer_size is enough
        MXLogFileHandler file_handler(mxlog_writer_.GetFilePath());

        uint32_t buffer_size_need = 0;
        for (auto i = mxlog_buffer_reader_.file_no; i < mxlog_buffer_writer_.file_no; i++) {
            file_handler.SetFileName(ToFileName(i));
            auto file_size = file_handler.GetFileSize();
            if (file_size == 0) {
                WAL_LOG_ERROR << "bad wal file " << i;
                return false;
            }
            if (file_size > buffer_size_need) {
                buffer_size_need = file_size;
            }
        }
        if (mxlog_buffer_writer_.buf_offset > buffer_size_need) {
            buffer_size_need = mxlog_buffer_writer_.buf_offset;
        }

        if (buffer_size_need > mxlog_buffer_size_) {
            mxlog_buffer_size_ = buffer_size_need;
            WAL_LOG_INFO << "recovery will need more buffer, buffer size changed " << mxlog_buffer_size_;
        }
    }

    buf_[0] = BufferPtr(new char[mxlog_buffer_size_]);
    buf_[1] = BufferPtr(new char[mxlog_buffer_size_]);

    if (mxlog_buffer_reader_.file_no == mxlog_buffer_writer_.file_no) {
        // read-write buffer
        mxlog_buffer_reader_.buf_idx = 0;
        mxlog_buffer_writer_.buf_idx = 0;

        mxlog_writer_.SetFileName(ToFileName(mxlog_buffer_writer_.file_no));
        if (mxlog_buffer_writer_.buf_offset == 0) {
            mxlog_writer_.SetFileOpenMode("w");
        } else {
            mxlog_writer_.SetFileOpenMode("r+");
            if (!mxlog_writer_.FileExists()) {
                WAL_LOG_ERROR << "wal file not exist " << mxlog_buffer_writer_.file_no;
                return false;
            }

            auto read_offset = mxlog_buffer_reader_.buf_offset;
            auto read_size = mxlog_buffer_writer_.buf_offset - mxlog_buffer_reader_.buf_offset;
            if (!mxlog_writer_.Load(buf_[0].get() + read_offset, read_offset, read_size)) {
                WAL_LOG_ERROR << "load wal file error " << read_offset << " " << read_size;
                return false;
            }
        }
    } else {
        // read buffer
        mxlog_buffer_reader_.buf_idx = 0;

        MXLogFileHandler file_handler(mxlog_writer_.GetFilePath());
        file_handler.SetFileName(ToFileName(mxlog_buffer_reader_.file_no));
        file_handler.SetFileOpenMode("r");

        auto read_offset = mxlog_buffer_reader_.buf_offset;
        auto read_size = file_handler.Load(buf_[0].get() + read_offset, read_offset);
        mxlog_buffer_reader_.max_offset = read_size + read_offset;
        file_handler.CloseFile();

        // write buffer
        mxlog_buffer_writer_.buf_idx = 1;

        mxlog_writer_.SetFileName(ToFileName(mxlog_buffer_writer_.file_no));
        mxlog_writer_.SetFileOpenMode("r+");
        if (!mxlog_writer_.FileExists()) {
            WAL_LOG_ERROR << "wal file not exist " << mxlog_buffer_writer_.file_no;
            return false;
        }
        if (!mxlog_writer_.Load(buf_[1].get(), 0, mxlog_buffer_writer_.buf_offset)) {
            WAL_LOG_ERROR << "load wal file error " << mxlog_buffer_writer_.file_no;
            return false;
        }
    }

    SetFileNoFrom(mxlog_buffer_reader_.file_no);

    return true;
}

void
MXLogBuffer::Reset(uint64_t lsn) {
    WAL_LOG_DEBUG << "reset lsn " << lsn;

    buf_[0] = BufferPtr(new char[mxlog_buffer_size_]);
    buf_[1] = BufferPtr(new char[mxlog_buffer_size_]);

    ParserLsn(lsn, mxlog_buffer_writer_.file_no, mxlog_buffer_writer_.buf_offset);
    if (mxlog_buffer_writer_.buf_offset != 0) {
        mxlog_buffer_writer_.file_no++;
        mxlog_buffer_writer_.buf_offset = 0;
    }
    mxlog_buffer_writer_.buf_idx = 0;

    memcpy(&mxlog_buffer_reader_, &mxlog_buffer_writer_, sizeof(MXLogBufferHandler));

    mxlog_writer_.CloseFile();
    mxlog_writer_.SetFileName(ToFileName(mxlog_buffer_writer_.file_no));
    mxlog_writer_.SetFileOpenMode("w");

    SetFileNoFrom(mxlog_buffer_reader_.file_no);
}

uint32_t
MXLogBuffer::GetBufferSize() {
    return mxlog_buffer_size_;
}

// buffer writer cares about surplus space of buffer
uint32_t
MXLogBuffer::SurplusSpace() {
    return mxlog_buffer_size_ - mxlog_buffer_writer_.buf_offset;
}

uint32_t
MXLogBuffer::RecordSize(const MXLogRecord& record) {
    return SizeOfMXLogRecordHeader + (uint32_t)record.table_id.size() + (uint32_t)record.partition_tag.size() +
           record.length * (uint32_t)sizeof(IDNumber) + record.data_size;
}

ErrorCode
MXLogBuffer::Append(MXLogRecord& record) {
    uint32_t record_size = RecordSize(record);
    if (SurplusSpace() < record_size) {
        // writer buffer has no space, switch wal file and write to a new buffer
        std::unique_lock<std::mutex> lck(mutex_);
        if (mxlog_buffer_writer_.buf_idx == mxlog_buffer_reader_.buf_idx) {
            // switch writer buffer
            mxlog_buffer_reader_.max_offset = mxlog_buffer_writer_.buf_offset;
            mxlog_buffer_writer_.buf_idx ^= 1;
        }
        mxlog_buffer_writer_.file_no++;
        mxlog_buffer_writer_.buf_offset = 0;
        lck.unlock();

        // ReBorn means close the old wal file and open a new wal file
        if (!mxlog_writer_.ReBorn(ToFileName(mxlog_buffer_writer_.file_no), "w")) {
            WAL_LOG_ERROR << "ReBorn wal file error " << mxlog_buffer_writer_.file_no;
            return WAL_FILE_ERROR;
        }
    }

    // point to the offset of current record in wal file
    char* current_write_buf = buf_[mxlog_buffer_writer_.buf_idx].get();
    uint32_t current_write_offset = mxlog_buffer_writer_.buf_offset;

    MXLogRecordHeader head;
    BuildLsn(mxlog_buffer_writer_.file_no, mxlog_buffer_writer_.buf_offset + (uint32_t)record_size, head.mxl_lsn);
    head.mxl_type = (uint8_t)record.type;
    head.table_id_size = (uint16_t)record.table_id.size();
    head.partition_tag_size = (uint16_t)record.partition_tag.size();
    head.vector_num = record.length;
    head.data_size = record.data_size;

    memcpy(current_write_buf + current_write_offset, &head, SizeOfMXLogRecordHeader);
    current_write_offset += SizeOfMXLogRecordHeader;

    if (!record.table_id.empty()) {
        memcpy(current_write_buf + current_write_offset, record.table_id.data(), record.table_id.size());
        current_write_offset += record.table_id.size();
    }

    if (!record.partition_tag.empty()) {
        memcpy(current_write_buf + current_write_offset, record.partition_tag.data(), record.partition_tag.size());
        current_write_offset += record.partition_tag.size();
    }

    if (record.ids != nullptr && record.length > 0) {
        memcpy(current_write_buf + current_write_offset, record.ids, record.length * sizeof(IDNumber));
        current_write_offset += record.length * sizeof(IDNumber);
    }

    if (record.data != nullptr && record.data_size > 0) {
        memcpy(current_write_buf + current_write_offset, record.data, record.data_size);
        current_write_offset += record.data_size;
    }

    bool write_rst = mxlog_writer_.Write(current_write_buf + mxlog_buffer_writer_.buf_offset, record_size);
    if (!write_rst) {
        WAL_LOG_ERROR << "write wal file error";
        return WAL_FILE_ERROR;
    }

    mxlog_buffer_writer_.buf_offset = current_write_offset;

    record.lsn = head.mxl_lsn;
    return WAL_SUCCESS;
}
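// Record layout produced by Append, as implied by the memcpy sequence above:
//   [ MXLogRecordHeader | table_id bytes | partition_tag bytes |
//     ids (vector_num * sizeof(IDNumber)) | data (data_size bytes) ]
// head.mxl_lsn encodes the offset of the record's END, so the reader in
// Next() can jump straight to the next record via lsn & LSN_OFFSET_MASK.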
ErrorCode
|
||||
MXLogBuffer::Next(const uint64_t last_applied_lsn, MXLogRecord& record) {
|
||||
// init output
|
||||
record.type = MXLogType::None;
|
||||
|
||||
// reader catch up to writer, no next record, read fail
|
||||
if (GetReadLsn() >= last_applied_lsn) {
|
||||
return WAL_SUCCESS;
|
||||
}
|
||||
|
||||
// otherwise, it means there must exists next record, in buffer or wal log
|
||||
bool need_load_new = false;
|
||||
std::unique_lock<std::mutex> lck(mutex_);
|
||||
if (mxlog_buffer_reader_.file_no != mxlog_buffer_writer_.file_no) {
|
||||
if (mxlog_buffer_reader_.buf_offset == mxlog_buffer_reader_.max_offset) { // last record
|
||||
mxlog_buffer_reader_.file_no++;
|
||||
mxlog_buffer_reader_.buf_offset = 0;
|
||||
need_load_new = (mxlog_buffer_reader_.file_no != mxlog_buffer_writer_.file_no);
|
||||
if (!need_load_new) {
|
||||
// read reach write buffer
|
||||
mxlog_buffer_reader_.buf_idx = mxlog_buffer_writer_.buf_idx;
|
||||
}
|
||||
}
|
||||
}
|
||||
lck.unlock();
|
||||
|
||||
if (need_load_new) {
|
||||
MXLogFileHandler mxlog_reader(mxlog_writer_.GetFilePath());
|
||||
mxlog_reader.SetFileName(ToFileName(mxlog_buffer_reader_.file_no));
|
||||
mxlog_reader.SetFileOpenMode("r");
|
||||
uint32_t file_size = mxlog_reader.Load(buf_[mxlog_buffer_reader_.buf_idx].get(), 0);
|
||||
if (file_size == 0) {
|
||||
WAL_LOG_ERROR << "load wal file error " << mxlog_buffer_reader_.file_no;
|
||||
return WAL_FILE_ERROR;
|
||||
}
|
||||
mxlog_buffer_reader_.max_offset = file_size;
|
||||
}
|
||||
|
||||
char* current_read_buf = buf_[mxlog_buffer_reader_.buf_idx].get();
|
||||
uint64_t current_read_offset = mxlog_buffer_reader_.buf_offset;
|
||||
|
||||
MXLogRecordHeader* head = (MXLogRecordHeader*)(current_read_buf + current_read_offset);
|
||||
record.type = (MXLogType)head->mxl_type;
|
||||
record.lsn = head->mxl_lsn;
|
||||
record.length = head->vector_num;
|
||||
record.data_size = head->data_size;
|
||||
|
||||
current_read_offset += SizeOfMXLogRecordHeader;
|
||||
|
||||
if (head->table_id_size != 0) {
|
||||
record.table_id.assign(current_read_buf + current_read_offset, head->table_id_size);
|
||||
current_read_offset += head->table_id_size;
|
||||
} else {
|
||||
record.table_id = "";
|
||||
}
|
||||
|
||||
if (head->partition_tag_size != 0) {
|
||||
record.partition_tag.assign(current_read_buf + current_read_offset, head->partition_tag_size);
|
||||
current_read_offset += head->partition_tag_size;
|
||||
} else {
|
||||
record.partition_tag = "";
|
||||
}
|
||||
|
||||
if (head->vector_num != 0) {
|
||||
record.ids = (IDNumber*)(current_read_buf + current_read_offset);
|
||||
current_read_offset += head->vector_num * sizeof(IDNumber);
|
||||
} else {
|
||||
record.ids = nullptr;
|
||||
}
|
||||
|
||||
if (record.data_size != 0) {
|
||||
record.data = current_read_buf + current_read_offset;
|
||||
} else {
|
||||
record.data = nullptr;
|
||||
}
|
||||
|
||||
mxlog_buffer_reader_.buf_offset = uint32_t(head->mxl_lsn & LSN_OFFSET_MASK);
|
||||
return WAL_SUCCESS;
|
||||
}
|
||||
|
||||
uint64_t
|
||||
MXLogBuffer::GetReadLsn() {
|
||||
uint64_t read_lsn;
|
||||
BuildLsn(mxlog_buffer_reader_.file_no, mxlog_buffer_reader_.buf_offset, read_lsn);
|
||||
return read_lsn;
|
||||
}
|
||||
|
||||
bool
|
||||
MXLogBuffer::ResetWriteLsn(uint64_t lsn) {
|
||||
WAL_LOG_INFO << "reset write lsn " << lsn;
|
||||
|
||||
int32_t old_file_no = mxlog_buffer_writer_.file_no;
|
||||
ParserLsn(lsn, mxlog_buffer_writer_.file_no, mxlog_buffer_writer_.buf_offset);
|
||||
if (old_file_no == mxlog_buffer_writer_.file_no) {
|
||||
WAL_LOG_DEBUG << "file No. is not changed";
|
||||
return true;
|
||||
}
|
||||
|
||||
std::unique_lock<std::mutex> lck(mutex_);
|
||||
if (mxlog_buffer_writer_.file_no == mxlog_buffer_reader_.file_no) {
|
||||
mxlog_buffer_writer_.buf_idx = mxlog_buffer_reader_.buf_idx;
|
||||
WAL_LOG_DEBUG << "file No. is the same as reader";
|
||||
return true;
|
||||
}
|
||||
lck.unlock();
|
||||
|
||||
if (!mxlog_writer_.ReBorn(ToFileName(mxlog_buffer_writer_.file_no), "r+")) {
|
||||
WAL_LOG_ERROR << "reborn file error " << mxlog_buffer_writer_.file_no;
|
||||
return false;
|
||||
}
|
||||
if (!mxlog_writer_.Load(buf_[mxlog_buffer_writer_.buf_idx].get(), 0, mxlog_buffer_writer_.buf_offset)) {
|
||||
WAL_LOG_ERROR << "load file error";
|
||||
return false;
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
void
|
||||
MXLogBuffer::SetFileNoFrom(uint32_t file_no) {
|
||||
file_no_from_ = file_no;
|
||||
|
||||
if (file_no > 0) {
|
||||
// remove the files whose No. are less than file_no
|
||||
MXLogFileHandler file_handler(mxlog_writer_.GetFilePath());
|
||||
do {
|
||||
file_handler.SetFileName(ToFileName(--file_no));
|
||||
if (!file_handler.FileExists()) {
|
||||
break;
|
||||
}
|
||||
WAL_LOG_INFO << "Delete wal file " << file_no;
|
||||
file_handler.DeleteFile();
|
||||
} while (file_no > 0);
|
||||
}
|
||||
}
|
||||
|
||||
void
|
||||
MXLogBuffer::RemoveOldFiles(uint64_t flushed_lsn) {
|
||||
uint32_t file_no;
|
||||
uint32_t offset;
|
||||
ParserLsn(flushed_lsn, file_no, offset);
|
||||
if (file_no_from_ < file_no) {
|
||||
MXLogFileHandler file_handler(mxlog_writer_.GetFilePath());
|
||||
do {
|
||||
file_handler.SetFileName(ToFileName(file_no_from_));
|
||||
WAL_LOG_INFO << "Delete wal file " << file_no_from_;
|
||||
file_handler.DeleteFile();
|
||||
} while (++file_no_from_ < file_no);
|
||||
}
|
||||
}
|
||||
|
||||
} // namespace wal
|
||||
} // namespace engine
|
||||
} // namespace milvus
|
|
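The code above calls BuildLsn, ParserLsn and ToFileName, which fall outside this hunk. A minimal sketch of the two LSN helpers, assuming the layout documented in the MXLogRecordHeader comment below (high 32 bits: file No., low 32 bits: offset in file) and the LSN_OFFSET_MASK constant from WalDefinations.h; signatures are inferred from the call sites in GetReadLsn() and ResetWriteLsn(), so treat this as an assumption rather than the patch's code:

inline void
BuildLsn(uint32_t file_no, uint32_t buf_offset, uint64_t& lsn) {
    // high 32 bits: wal file No., low 32 bits: offset inside that file
    lsn = (uint64_t(file_no) << 32) | uint64_t(buf_offset);
}

inline void
ParserLsn(uint64_t lsn, uint32_t& file_no, uint32_t& buf_offset) {
    file_no = uint32_t(lsn >> 32);
    buf_offset = uint32_t(lsn & LSN_OFFSET_MASK);
}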
@ -0,0 +1,108 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#pragma once

#include <atomic>
#include <memory>
#include <mutex>
#include <string>

#include "WalDefinations.h"
#include "WalFileHandler.h"
#include "WalMetaHandler.h"
#include "utils/Error.h"

namespace milvus {
namespace engine {
namespace wal {

#pragma pack(push)
#pragma pack(1)

struct MXLogRecordHeader {
    uint64_t mxl_lsn;  // log sequence number (high 32 bits: file No. inc by 1, low 32 bits: offset in file, max 4GB)
    uint8_t mxl_type;  // record type, insert/delete/update/flush...
    uint16_t table_id_size;
    uint16_t partition_tag_size;
    uint32_t vector_num;
    uint32_t data_size;
};

const uint32_t SizeOfMXLogRecordHeader = sizeof(MXLogRecordHeader);

#pragma pack(pop)

struct MXLogBufferHandler {
    uint32_t max_offset;
    uint32_t file_no;
    uint32_t buf_offset;
    uint8_t buf_idx;
};

using BufferPtr = std::shared_ptr<char>;

class MXLogBuffer {
 public:
    MXLogBuffer(const std::string& mxlog_path, const uint32_t buffer_size);
    ~MXLogBuffer();

    bool
    Init(uint64_t read_lsn, uint64_t write_lsn);

    // ignore all old wal files
    void
    Reset(uint64_t lsn);

    // Note: record.lsn will be set internally
    ErrorCode
    Append(MXLogRecord& record);

    ErrorCode
    Next(const uint64_t last_applied_lsn, MXLogRecord& record);

    uint64_t
    GetReadLsn();

    bool
    ResetWriteLsn(uint64_t lsn);

    void
    SetFileNoFrom(uint32_t file_no);

    void
    RemoveOldFiles(uint64_t flushed_lsn);

    uint32_t
    GetBufferSize();

    uint32_t
    SurplusSpace();

 private:
    uint32_t
    RecordSize(const MXLogRecord& record);

 private:
    uint32_t mxlog_buffer_size_;  // from config
    BufferPtr buf_[2];
    std::mutex mutex_;
    uint32_t file_no_from_;
    MXLogBufferHandler mxlog_buffer_reader_;
    MXLogBufferHandler mxlog_buffer_writer_;
    MXLogFileHandler mxlog_writer_;
};

using MXLogBufferPtr = std::shared_ptr<MXLogBuffer>;

}  // namespace wal
}  // namespace engine
}  // namespace milvus
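MXLogBuffer keeps two fixed-size buffers: the writer fills one while the reader drains the other, and they share a buffer index only when both point into the same wal file. A hedged round-trip sketch of the public API (the path, size and ids here are made up, and Init's behavior on a fresh directory is assumed):

milvus::engine::wal::MXLogBuffer buffer("/tmp/milvus/wal/", 64 * 1024 * 1024);
if (buffer.Init(/*read_lsn=*/0, /*write_lsn=*/0)) {
    milvus::engine::wal::MXLogRecord record;
    record.type = milvus::engine::wal::MXLogType::Delete;
    record.table_id = "tbl";
    record.partition_tag = "";
    milvus::engine::IDNumber ids[2] = {1, 2};
    record.length = 2;
    record.ids = ids;
    record.data_size = 0;
    record.data = nullptr;

    if (buffer.Append(record) == WAL_SUCCESS) {   // Append fills in record.lsn
        milvus::engine::wal::MXLogRecord replayed;
        buffer.Next(record.lsn, replayed);        // reads the same record back
    }
}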
@ -0,0 +1,53 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#pragma once

#include <memory>
#include <string>
#include <unordered_map>

#include "db/Types.h"
#include "db/meta/MetaTypes.h"

namespace milvus {
namespace engine {
namespace wal {

using TableSchemaPtr = std::shared_ptr<milvus::engine::meta::TableSchema>;
using TableMetaPtr = std::shared_ptr<std::unordered_map<std::string, TableSchemaPtr> >;

#define WAL_BUFFER_MAX_SIZE ((uint32_t)2 * 1024 * 1024 * 1024)
#define WAL_BUFFER_MIN_SIZE ((uint32_t)32 * 1024 * 1024)
#define LSN_OFFSET_MASK 0x00000000ffffffff

enum class MXLogType { InsertBinary, InsertVector, Delete, Update, Flush, None };

struct MXLogRecord {
    uint64_t lsn;
    MXLogType type;
    std::string table_id;
    std::string partition_tag;
    uint32_t length;
    const IDNumber* ids;
    uint32_t data_size;
    const void* data;
};

struct MXLogConfiguration {
    bool recovery_error_ignore;
    uint32_t buffer_size;
    std::string mxlog_path;
};

}  // namespace wal
}  // namespace engine
}  // namespace milvus
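A record is serialized as the packed header followed by the table id, the partition tag, the id array and the raw vector data, in that order; this is exactly the layout MXLogBuffer::Next() parses above. A sketch of the resulting size computation, mirroring what the private MXLogBuffer::RecordSize() must return (the function body itself is not in this patch excerpt):

uint32_t
SerializedSizeSketch(const milvus::engine::wal::MXLogRecord& record) {
    return SizeOfMXLogRecordHeader                                         // 21-byte packed header
           + (uint32_t)record.table_id.size()                              // table id bytes
           + (uint32_t)record.partition_tag.size()                         // partition tag bytes
           + record.length * (uint32_t)sizeof(milvus::engine::IDNumber)    // vector ids
           + record.data_size;                                             // raw vector data
}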
@ -0,0 +1,139 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/wal/WalFileHandler.h"

#include <sys/stat.h>
#include <unistd.h>

namespace milvus {
namespace engine {
namespace wal {

MXLogFileHandler::MXLogFileHandler(const std::string& mxlog_path) : file_path_(mxlog_path), p_file_(nullptr) {
}

MXLogFileHandler::~MXLogFileHandler() {
    CloseFile();
}

bool
MXLogFileHandler::OpenFile() {
    if (p_file_ == nullptr) {
        p_file_ = fopen((file_path_ + file_name_).c_str(), file_mode_.c_str());
    }
    return (p_file_ != nullptr);
}

uint32_t
MXLogFileHandler::Load(char* buf, uint32_t data_offset) {
    uint32_t read_size = 0;
    if (OpenFile()) {
        uint32_t file_size = GetFileSize();
        if (file_size > data_offset) {
            read_size = file_size - data_offset;
            fseek(p_file_, data_offset, SEEK_SET);
            fread(buf, 1, read_size, p_file_);
        }
    }
    return read_size;
}

bool
MXLogFileHandler::Load(char* buf, uint32_t data_offset, uint32_t data_size) {
    if (OpenFile() && data_size != 0) {
        auto file_size = GetFileSize();
        if ((file_size < data_offset) || (file_size - data_offset < data_size)) {
            return false;
        }

        fseek(p_file_, data_offset, SEEK_SET);
        fread(buf, 1, data_size, p_file_);
    }
    return true;
}

bool
MXLogFileHandler::Write(char* buf, uint32_t data_size, bool is_sync) {
    uint32_t written_size = 0;
    if (OpenFile() && data_size != 0) {
        written_size = fwrite(buf, 1, data_size, p_file_);
        fflush(p_file_);
    }
    return (written_size == data_size);
}

bool
MXLogFileHandler::ReBorn(const std::string& file_name, const std::string& open_mode) {
    CloseFile();
    SetFileName(file_name);
    SetFileOpenMode(open_mode);
    return OpenFile();
}

bool
MXLogFileHandler::CloseFile() {
    if (p_file_ != nullptr) {
        fclose(p_file_);
        p_file_ = nullptr;
    }
    return true;
}

std::string
MXLogFileHandler::GetFilePath() {
    return file_path_;
}

std::string
MXLogFileHandler::GetFileName() {
    return file_name_;
}

uint32_t
MXLogFileHandler::GetFileSize() {
    struct stat statbuf;
    if (0 == stat((file_path_ + file_name_).c_str(), &statbuf)) {
        return (uint32_t)statbuf.st_size;
    }

    return 0;
}

void
MXLogFileHandler::DeleteFile() {
    remove((file_path_ + file_name_).c_str());
    file_name_ = "";
}

bool
MXLogFileHandler::FileExists() {
    return access((file_path_ + file_name_).c_str(), 0) != -1;
}

void
MXLogFileHandler::SetFileOpenMode(const std::string& open_mode) {
    file_mode_ = open_mode;
}

void
MXLogFileHandler::SetFileName(const std::string& file_name) {
    file_name_ = file_name;
}

void
MXLogFileHandler::SetFilePath(const std::string& file_path) {
    file_path_ = file_path;
}

}  // namespace wal
}  // namespace engine
}  // namespace milvus
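Note that Write() above accepts an is_sync flag but never reads it, and fflush() only pushes stdio buffers into the kernel page cache. A sketch of what honoring the flag could look like on POSIX; this is an assumption for illustration, not part of the patch:

bool
WriteWithOptionalSync(FILE* fp, const char* buf, uint32_t data_size, bool is_sync) {
    if (fwrite(buf, 1, data_size, fp) != data_size) {
        return false;
    }
    fflush(fp);               // stdio buffer -> kernel page cache
    if (is_sync) {
        fsync(fileno(fp));    // page cache -> device; data survives power loss
    }
    return true;
}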
@ -0,0 +1,65 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#pragma once

#include <cstdio>
#include <string>

#include "WalDefinations.h"

namespace milvus {
namespace engine {
namespace wal {

class MXLogFileHandler {
 public:
    explicit MXLogFileHandler(const std::string& mxlog_path);
    ~MXLogFileHandler();

    std::string
    GetFilePath();
    std::string
    GetFileName();
    bool
    OpenFile();
    bool
    CloseFile();
    uint32_t
    Load(char* buf, uint32_t data_offset);
    bool
    Load(char* buf, uint32_t data_offset, uint32_t data_size);
    bool
    Write(char* buf, uint32_t data_size, bool is_sync = false);
    bool
    ReBorn(const std::string& file_name, const std::string& open_mode);
    uint32_t
    GetFileSize();
    void
    SetFileOpenMode(const std::string& open_mode);
    void
    SetFilePath(const std::string& file_path);
    void
    SetFileName(const std::string& file_name);
    void
    DeleteFile();
    bool
    FileExists();

 private:
    std::string file_path_;
    std::string file_name_;
    std::string file_mode_;
    FILE* p_file_;
};

}  // namespace wal
}  // namespace engine
}  // namespace milvus
@ -0,0 +1,397 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/wal/WalManager.h"

#include <unistd.h>

#include <algorithm>
#include <memory>

#include "utils/CommonUtil.h"
#include "utils/Exception.h"
#include "utils/Log.h"

namespace milvus {
namespace engine {
namespace wal {

WalManager::WalManager(const MXLogConfiguration& config) {
    mxlog_config_.recovery_error_ignore = config.recovery_error_ignore;
    mxlog_config_.buffer_size = config.buffer_size * 1024 * 1024;
    mxlog_config_.mxlog_path = config.mxlog_path;

    // make sure the path ends with '/'
    if (mxlog_config_.mxlog_path.back() != '/') {
        mxlog_config_.mxlog_path += '/';
    }
    // make sure the path exists
    auto status = server::CommonUtil::CreateDirectory(mxlog_config_.mxlog_path);
    if (!status.ok()) {
        std::string msg = "failed to create wal directory " + mxlog_config_.mxlog_path;
        ENGINE_LOG_ERROR << msg;
        throw Exception(WAL_PATH_ERROR, msg);
    }
}

WalManager::~WalManager() {
}

ErrorCode
WalManager::Init(const meta::MetaPtr& meta) {
    uint64_t applied_lsn = 0;
    p_meta_handler_ = std::make_shared<MXLogMetaHandler>(mxlog_config_.mxlog_path);
    if (p_meta_handler_ != nullptr) {
        p_meta_handler_->GetMXLogInternalMeta(applied_lsn);
    }

    uint64_t recovery_start = 0;
    if (meta != nullptr) {
        meta->GetGlobalLastLSN(recovery_start);

        std::vector<meta::TableSchema> table_schema_array;
        auto status = meta->AllTables(table_schema_array);
        if (!status.ok()) {
            return WAL_META_ERROR;
        }

        if (!table_schema_array.empty()) {
            // get min and max flushed lsn
            uint64_t min_flushed_lsn = table_schema_array[0].flush_lsn_;
            uint64_t max_flushed_lsn = table_schema_array[0].flush_lsn_;
            for (size_t i = 1; i < table_schema_array.size(); i++) {
                if (min_flushed_lsn > table_schema_array[i].flush_lsn_) {
                    min_flushed_lsn = table_schema_array[i].flush_lsn_;
                } else if (max_flushed_lsn < table_schema_array[i].flush_lsn_) {
                    max_flushed_lsn = table_schema_array[i].flush_lsn_;
                }
            }
            if (applied_lsn < max_flushed_lsn) {
                // a new WAL folder?
                applied_lsn = max_flushed_lsn;
            }
            if (recovery_start < min_flushed_lsn) {
                // not everything is flushed yet
                recovery_start = min_flushed_lsn;
            }

            for (auto& schema : table_schema_array) {
                TableLsn tb_lsn = {schema.flush_lsn_, applied_lsn};
                tables_[schema.table_id_] = tb_lsn;
            }
        }
    }

    // all tables are dropped and this is a new wal path?
    if (applied_lsn < recovery_start) {
        applied_lsn = recovery_start;
    }

    ErrorCode error_code = WAL_ERROR;
    p_buffer_ = std::make_shared<MXLogBuffer>(mxlog_config_.mxlog_path, mxlog_config_.buffer_size);
    if (p_buffer_ != nullptr) {
        if (p_buffer_->Init(recovery_start, applied_lsn)) {
            error_code = WAL_SUCCESS;
        } else if (mxlog_config_.recovery_error_ignore) {
            p_buffer_->Reset(applied_lsn);
            error_code = WAL_SUCCESS;
        } else {
            error_code = WAL_FILE_ERROR;
        }
    }

    // the buffer size may have changed
    mxlog_config_.buffer_size = p_buffer_->GetBufferSize();

    last_applied_lsn_ = applied_lsn;
    return error_code;
}

ErrorCode
WalManager::GetNextRecovery(MXLogRecord& record) {
    ErrorCode error_code = WAL_SUCCESS;
    while (true) {
        error_code = p_buffer_->Next(last_applied_lsn_, record);
        if (error_code != WAL_SUCCESS) {
            if (mxlog_config_.recovery_error_ignore) {
                // reset and stop recovery
                p_buffer_->Reset(last_applied_lsn_);

                record.type = MXLogType::None;
                error_code = WAL_SUCCESS;
            }
            break;
        }
        if (record.type == MXLogType::None) {
            break;
        }

        // the background thread has not started yet,
        // so no lock is needed here.
        auto it = tables_.find(record.table_id);
        if (it != tables_.end()) {
            if (it->second.flush_lsn < record.lsn) {
                break;
            }
        }
    }

    WAL_LOG_INFO << "record type " << (int32_t)record.type << " record lsn " << record.lsn << " error code "
                 << error_code;

    return error_code;
}

ErrorCode
WalManager::GetNextRecord(MXLogRecord& record) {
    auto check_flush = [&]() -> bool {
        std::lock_guard<std::mutex> lck(mutex_);
        if (flush_info_.IsValid()) {
            if (p_buffer_->GetReadLsn() >= flush_info_.lsn_) {
                // the pending flush request can be executed now
                record.type = MXLogType::Flush;
                record.table_id = flush_info_.table_id_;
                record.lsn = flush_info_.lsn_;
                flush_info_.Clear();

                WAL_LOG_INFO << "record flush table " << record.table_id << " lsn " << record.lsn;
                return true;
            }
        }
        return false;
    };

    if (check_flush()) {
        return WAL_SUCCESS;
    }

    ErrorCode error_code = WAL_SUCCESS;
    while (WAL_SUCCESS == p_buffer_->Next(last_applied_lsn_, record)) {
        if (record.type == MXLogType::None) {
            if (check_flush()) {
                return WAL_SUCCESS;
            }
            break;
        }

        std::lock_guard<std::mutex> lck(mutex_);
        auto it = tables_.find(record.table_id);
        if (it != tables_.end()) {
            break;
        }
    }

    WAL_LOG_INFO << "record type " << (int32_t)record.type << " table " << record.table_id << " lsn " << record.lsn;
    return error_code;
}

uint64_t
WalManager::CreateTable(const std::string& table_id) {
    WAL_LOG_INFO << "create table " << table_id << " " << last_applied_lsn_;
    std::lock_guard<std::mutex> lck(mutex_);
    uint64_t applied_lsn = last_applied_lsn_;
    tables_[table_id] = {applied_lsn, applied_lsn};
    return applied_lsn;
}

void
WalManager::DropTable(const std::string& table_id) {
    WAL_LOG_INFO << "drop table " << table_id;
    std::lock_guard<std::mutex> lck(mutex_);
    tables_.erase(table_id);
}

void
WalManager::TableFlushed(const std::string& table_id, uint64_t lsn) {
    std::unique_lock<std::mutex> lck(mutex_);
    auto it = tables_.find(table_id);
    if (it != tables_.end()) {
        it->second.flush_lsn = lsn;
    }
    lck.unlock();

    WAL_LOG_INFO << table_id << " is flushed by lsn " << lsn;
}

template <typename T>
bool
WalManager::Insert(const std::string& table_id, const std::string& partition_tag, const IDNumbers& vector_ids,
                   const std::vector<T>& vectors) {
    MXLogType log_type;
    if (std::is_same<T, float>::value) {
        log_type = MXLogType::InsertVector;
    } else if (std::is_same<T, uint8_t>::value) {
        log_type = MXLogType::InsertBinary;
    } else {
        return false;
    }

    size_t vector_num = vector_ids.size();
    if (vector_num == 0) {
        WAL_LOG_ERROR << "The id array is empty.";
        return false;
    }
    size_t dim = vectors.size() / vector_num;
    size_t unit_size = dim * sizeof(T) + sizeof(IDNumber);
    size_t head_size = SizeOfMXLogRecordHeader + table_id.length() + partition_tag.length();

    MXLogRecord record;
    record.type = log_type;
    record.table_id = table_id;
    record.partition_tag = partition_tag;

    uint64_t new_lsn = 0;
    for (size_t i = 0; i < vector_num; i += record.length) {
        size_t surplus_space = p_buffer_->SurplusSpace();
        size_t max_rcd_num = 0;
        if (surplus_space >= head_size + unit_size) {
            max_rcd_num = (surplus_space - head_size) / unit_size;
        } else {
            max_rcd_num = (mxlog_config_.buffer_size - head_size) / unit_size;
        }
        if (max_rcd_num == 0) {
            WAL_LOG_ERROR << "Wal buffer size is too small " << mxlog_config_.buffer_size << " unit " << unit_size;
            return false;
        }

        record.length = std::min(vector_num - i, max_rcd_num);
        record.ids = vector_ids.data() + i;
        record.data_size = record.length * dim * sizeof(T);
        record.data = vectors.data() + i * dim;

        auto error_code = p_buffer_->Append(record);
        if (error_code != WAL_SUCCESS) {
            p_buffer_->ResetWriteLsn(last_applied_lsn_);
            return false;
        }
        new_lsn = record.lsn;
    }

    std::unique_lock<std::mutex> lck(mutex_);
    last_applied_lsn_ = new_lsn;
    auto it = tables_.find(table_id);
    if (it != tables_.end()) {
        it->second.wal_lsn = new_lsn;
    }
    lck.unlock();

    WAL_LOG_INFO << table_id << " insert in part " << partition_tag << " with lsn " << new_lsn;

    return p_meta_handler_->SetMXLogInternalMeta(new_lsn);
}

bool
WalManager::DeleteById(const std::string& table_id, const IDNumbers& vector_ids) {
    size_t vector_num = vector_ids.size();
    if (vector_num == 0) {
        WAL_LOG_ERROR << "The id array is empty.";
        return false;
    }

    size_t unit_size = sizeof(IDNumber);
    size_t head_size = SizeOfMXLogRecordHeader + table_id.length();

    MXLogRecord record;
    record.type = MXLogType::Delete;
    record.table_id = table_id;
    record.partition_tag = "";

    uint64_t new_lsn = 0;
    for (size_t i = 0; i < vector_num; i += record.length) {
        size_t surplus_space = p_buffer_->SurplusSpace();
        size_t max_rcd_num = 0;
        if (surplus_space >= head_size + unit_size) {
            max_rcd_num = (surplus_space - head_size) / unit_size;
        } else {
            max_rcd_num = (mxlog_config_.buffer_size - head_size) / unit_size;
        }

        record.length = std::min(vector_num - i, max_rcd_num);
        record.ids = vector_ids.data() + i;
        record.data_size = 0;
        record.data = nullptr;

        auto error_code = p_buffer_->Append(record);
        if (error_code != WAL_SUCCESS) {
            p_buffer_->ResetWriteLsn(last_applied_lsn_);
            return false;
        }
        new_lsn = record.lsn;
    }

    std::unique_lock<std::mutex> lck(mutex_);
    last_applied_lsn_ = new_lsn;
    auto it = tables_.find(table_id);
    if (it != tables_.end()) {
        it->second.wal_lsn = new_lsn;
    }
    lck.unlock();

    WAL_LOG_INFO << table_id << " delete rows by id, lsn " << new_lsn;

    return p_meta_handler_->SetMXLogInternalMeta(new_lsn);
}

uint64_t
WalManager::Flush(const std::string table_id) {
    std::lock_guard<std::mutex> lck(mutex_);
    // At most one flush request is pending at any time;
    // otherwise flush_info_ would have to become a list.
    __glibcxx_assert(!flush_info_.IsValid());

    uint64_t lsn = 0;
    if (table_id.empty()) {
        // flush all tables
        for (auto& it : tables_) {
            if (it.second.wal_lsn > it.second.flush_lsn) {
                lsn = last_applied_lsn_;
                break;
            }
        }

    } else {
        // flush one table
        auto it = tables_.find(table_id);
        if (it != tables_.end()) {
            if (it->second.wal_lsn > it->second.flush_lsn) {
                lsn = it->second.wal_lsn;
            }
        }
    }

    if (lsn != 0) {
        flush_info_.table_id_ = table_id;
        flush_info_.lsn_ = lsn;
    }

    WAL_LOG_INFO << table_id << " requested flush, lsn " << lsn;

    return lsn;
}

void
WalManager::RemoveOldFiles(uint64_t flushed_lsn) {
    if (p_buffer_ != nullptr) {
        p_buffer_->RemoveOldFiles(flushed_lsn);
    }
}

template bool
WalManager::Insert<float>(const std::string& table_id, const std::string& partition_tag, const IDNumbers& vector_ids,
                          const std::vector<float>& vectors);

template bool
WalManager::Insert<uint8_t>(const std::string& table_id, const std::string& partition_tag, const IDNumbers& vector_ids,
                            const std::vector<uint8_t>& vectors);

}  // namespace wal
}  // namespace engine
}  // namespace milvus
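The chunking loop in Insert() is worth a worked example. Assuming IDNumber is a 64-bit integer, with the 21-byte packed record header, a 10-byte table id, 512-dimensional float vectors and 1 MB of surplus buffer space, each WAL record can carry about 509 vectors, so a larger insert is split across several records (the numbers below are illustrative, not from the patch):

constexpr size_t dim = 512;
constexpr size_t unit_size = dim * sizeof(float) + sizeof(int64_t);      // vector + its id = 2056 B
constexpr size_t head_size = 21 + 10 + 0;                                // header + table id + empty tag
constexpr size_t surplus_space = 1 << 20;                                // 1 MB left in the buffer
constexpr size_t max_rcd_num = (surplus_space - head_size) / unit_size;  // vectors per record
static_assert(max_rcd_num == 509, "per-record capacity with these numbers");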
@ -0,0 +1,159 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#pragma once

#include <atomic>
#include <map>
#include <string>
#include <utility>
#include <vector>

#include "WalBuffer.h"
#include "WalDefinations.h"
#include "WalFileHandler.h"
#include "WalMetaHandler.h"
#include "utils/Error.h"

namespace milvus {
namespace engine {
namespace wal {

class WalManager {
 public:
    explicit WalManager(const MXLogConfiguration& config);
    ~WalManager();

    /*
     * Init
     * @param meta
     * @retval error_code
     */
    ErrorCode
    Init(const meta::MetaPtr& meta);

    /*
     * Get the next record during recovery
     * @param record[out]: record
     * @retval error_code
     */
    ErrorCode
    GetNextRecovery(MXLogRecord& record);

    /*
     * Get the next record
     * @param record[out]: record
     * @retval error_code
     */
    ErrorCode
    GetNextRecord(MXLogRecord& record);

    /*
     * Create table
     * @param table_id: table id
     * @retval lsn
     */
    uint64_t
    CreateTable(const std::string& table_id);

    /*
     * Drop table
     * @param table_id: table id
     * @retval none
     */
    void
    DropTable(const std::string& table_id);

    /*
     * Table is flushed
     * @param table_id: table id
     * @param lsn: flushed lsn
     */
    void
    TableFlushed(const std::string& table_id, uint64_t lsn);

    /*
     * Insert
     * @param table_id: table id
     * @param partition_tag: partition tag
     * @param vector_ids: vector ids
     * @param vectors: vectors
     */
    template <typename T>
    bool
    Insert(const std::string& table_id, const std::string& partition_tag, const IDNumbers& vector_ids,
           const std::vector<T>& vectors);

    /*
     * Delete vectors by id
     * @param table_id: table id
     * @param vector_ids: vector ids
     */
    bool
    DeleteById(const std::string& table_id, const IDNumbers& vector_ids);

    /*
     * Get flush lsn
     * @param table_id: table id (empty means all tables)
     * @retval if there is something not flushed, return lsn;
     *         else, return 0
     */
    uint64_t
    Flush(const std::string table_id = "");

    void
    RemoveOldFiles(uint64_t flushed_lsn);

 private:
    WalManager
    operator=(WalManager&);

    MXLogConfiguration mxlog_config_;

    MXLogBufferPtr p_buffer_;
    MXLogMetaHandlerPtr p_meta_handler_;

    struct TableLsn {
        uint64_t flush_lsn;
        uint64_t wal_lsn;
    };
    std::mutex mutex_;
    std::map<std::string, TableLsn> tables_;
    std::atomic<uint64_t> last_applied_lsn_;

    // if multiple threads may call Flush(), change this to a list
    struct FlushInfo {
        std::string table_id_;
        uint64_t lsn_ = 0;

        bool
        IsValid() {
            return (lsn_ != 0);
        }
        void
        Clear() {
            lsn_ = 0;
        }
    };
    FlushInfo flush_info_;
};

extern template bool
WalManager::Insert<float>(const std::string& table_id, const std::string& partition_tag, const IDNumbers& vector_ids,
                          const std::vector<float>& vectors);

extern template bool
WalManager::Insert<uint8_t>(const std::string& table_id, const std::string& partition_tag, const IDNumbers& vector_ids,
                            const std::vector<uint8_t>& vectors);

}  // namespace wal
}  // namespace engine
}  // namespace milvus
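A hedged sketch of how a caller drives this interface after a restart: replay everything the buffer still holds via GetNextRecovery() until it reports a None record, then switch to GetNextRecord() for normal dispatching. The config values are made up, and meta_ptr stands for a meta::MetaPtr supplied by the caller:

milvus::engine::wal::MXLogConfiguration cfg;
cfg.recovery_error_ignore = true;
cfg.buffer_size = 64;                  // in MB; the WalManager constructor scales it
cfg.mxlog_path = "/tmp/milvus/wal";

milvus::engine::wal::WalManager manager(cfg);
if (manager.Init(meta_ptr) == WAL_SUCCESS) {
    milvus::engine::wal::MXLogRecord record;
    while (manager.GetNextRecovery(record) == WAL_SUCCESS &&
           record.type != milvus::engine::wal::MXLogType::None) {
        // re-apply the record to the engine here
    }
}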
@ -0,0 +1,70 @@
// Copyright (C) 2019-2020 Zilliz. All rights reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software distributed under the License
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "db/wal/WalMetaHandler.h"

#include <cstring>

namespace milvus {
namespace engine {
namespace wal {

MXLogMetaHandler::MXLogMetaHandler(const std::string& internal_meta_file_path) {
    std::string file_full_path = internal_meta_file_path + WAL_META_FILE_NAME;

    wal_meta_fp_ = fopen(file_full_path.c_str(), "r+");
    if (wal_meta_fp_ == nullptr) {
        wal_meta_fp_ = fopen(file_full_path.c_str(), "w");

    } else {
        uint64_t all_wal_lsn[3] = {0, 0, 0};
        auto rt_val = fread(&all_wal_lsn, sizeof(all_wal_lsn), 1, wal_meta_fp_);
        if (rt_val == 1) {
            if (all_wal_lsn[2] == all_wal_lsn[1]) {
                latest_wal_lsn_ = all_wal_lsn[2];
            } else {
                latest_wal_lsn_ = all_wal_lsn[0];
            }
        }
    }
}

MXLogMetaHandler::~MXLogMetaHandler() {
    if (wal_meta_fp_ != nullptr) {
        fclose(wal_meta_fp_);
        wal_meta_fp_ = nullptr;
    }
}

bool
MXLogMetaHandler::GetMXLogInternalMeta(uint64_t& wal_lsn) {
    wal_lsn = latest_wal_lsn_;
    return true;
}

bool
MXLogMetaHandler::SetMXLogInternalMeta(uint64_t wal_lsn) {
    if (wal_meta_fp_ != nullptr) {
        uint64_t all_wal_lsn[3] = {latest_wal_lsn_, wal_lsn, wal_lsn};
        fseek(wal_meta_fp_, 0, SEEK_SET);
        auto rt_val = fwrite(&all_wal_lsn, sizeof(all_wal_lsn), 1, wal_meta_fp_);
        if (rt_val == 1) {
            fflush(wal_meta_fp_);
            latest_wal_lsn_ = wal_lsn;
            return true;
        }
    }
    return false;
}

}  // namespace wal
}  // namespace engine
}  // namespace milvus
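The triple-slot layout above is a torn-write guard: SetMXLogInternalMeta() writes {previous lsn, new lsn, new lsn} in a single fwrite, and the constructor trusts the new value only when the last two slots agree, falling back to the previous value otherwise. The recovery rule, extracted as a sketch:

// If the process dies mid-write, slots 1 and 2 disagree and slot 0 still
// holds the last value that was fully committed.
uint64_t
PickLatestLsn(const uint64_t all_wal_lsn[3]) {
    return (all_wal_lsn[2] == all_wal_lsn[1]) ? all_wal_lsn[2] : all_wal_lsn[0];
}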
@ -13,27 +13,38 @@

 #include <memory>
 #include <string>
+#include <unordered_map>

-#include "server/delivery/request/BaseRequest.h"
+#include "db/meta/Meta.h"
+#include "db/meta/MetaFactory.h"
+#include "db/meta/MetaTypes.h"
+#include "db/wal/WalDefinations.h"
+#include "db/wal/WalFileHandler.h"

 namespace milvus {
-namespace server {
+namespace engine {
+namespace wal {

-class DeleteByDateRequest : public BaseRequest {
+static const char* WAL_META_FILE_NAME = "mxlog.meta";
+
+class MXLogMetaHandler {
 public:
-    static BaseRequestPtr
-    Create(const std::shared_ptr<Context>& context, const std::string& table_name, const Range& range);
+    explicit MXLogMetaHandler(const std::string& internal_meta_file_path);
+    ~MXLogMetaHandler();

-protected:
-    DeleteByDateRequest(const std::shared_ptr<Context>& context, const std::string& table_name, const Range& range);
+    bool
+    GetMXLogInternalMeta(uint64_t& wal_lsn);

-    Status
-    OnExecute() override;
+    bool
+    SetMXLogInternalMeta(uint64_t wal_lsn);

 private:
-    const std::string table_name_;
-    const Range& range_;
+    FILE* wal_meta_fp_;
+    uint64_t latest_wal_lsn_ = 0;
 };

-}  // namespace server
+using MXLogMetaHandlerPtr = std::shared_ptr<MXLogMetaHandler>;
+
+}  // namespace wal
+}  // namespace engine
 }  // namespace milvus
@ -25,6 +25,7 @@ static const char* MilvusService_method_names[] = {
   "/milvus.grpc.MilvusService/DescribeTable",
   "/milvus.grpc.MilvusService/CountTable",
   "/milvus.grpc.MilvusService/ShowTables",
+  "/milvus.grpc.MilvusService/ShowTableInfo",
   "/milvus.grpc.MilvusService/DropTable",
   "/milvus.grpc.MilvusService/CreateIndex",
   "/milvus.grpc.MilvusService/DescribeIndex",

@ -33,11 +34,16 @@ static const char* MilvusService_method_names[] = {
   "/milvus.grpc.MilvusService/ShowPartitions",
   "/milvus.grpc.MilvusService/DropPartition",
   "/milvus.grpc.MilvusService/Insert",
+  "/milvus.grpc.MilvusService/GetVectorByID",
+  "/milvus.grpc.MilvusService/GetVectorIDs",
   "/milvus.grpc.MilvusService/Search",
+  "/milvus.grpc.MilvusService/SearchByID",
   "/milvus.grpc.MilvusService/SearchInFiles",
   "/milvus.grpc.MilvusService/Cmd",
-  "/milvus.grpc.MilvusService/DeleteByDate",
+  "/milvus.grpc.MilvusService/DeleteByID",
   "/milvus.grpc.MilvusService/PreloadTable",
+  "/milvus.grpc.MilvusService/Flush",
+  "/milvus.grpc.MilvusService/Compact",
 };

 std::unique_ptr< MilvusService::Stub> MilvusService::NewStub(const std::shared_ptr< ::grpc::ChannelInterface>& channel, const ::grpc::StubOptions& options) {

@ -52,19 +58,25 @@ MilvusService::Stub::Stub(const std::shared_ptr< ::grpc::ChannelInterface>& chan
   , rpcmethod_DescribeTable_(MilvusService_method_names[2], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
   , rpcmethod_CountTable_(MilvusService_method_names[3], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
   , rpcmethod_ShowTables_(MilvusService_method_names[4], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_DropTable_(MilvusService_method_names[5], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_CreateIndex_(MilvusService_method_names[6], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_DescribeIndex_(MilvusService_method_names[7], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_DropIndex_(MilvusService_method_names[8], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_CreatePartition_(MilvusService_method_names[9], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_ShowPartitions_(MilvusService_method_names[10], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_DropPartition_(MilvusService_method_names[11], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_Insert_(MilvusService_method_names[12], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_Search_(MilvusService_method_names[13], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_SearchInFiles_(MilvusService_method_names[14], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_Cmd_(MilvusService_method_names[15], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_DeleteByDate_(MilvusService_method_names[16], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
-  , rpcmethod_PreloadTable_(MilvusService_method_names[17], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_ShowTableInfo_(MilvusService_method_names[5], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_DropTable_(MilvusService_method_names[6], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_CreateIndex_(MilvusService_method_names[7], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_DescribeIndex_(MilvusService_method_names[8], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_DropIndex_(MilvusService_method_names[9], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_CreatePartition_(MilvusService_method_names[10], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_ShowPartitions_(MilvusService_method_names[11], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_DropPartition_(MilvusService_method_names[12], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_Insert_(MilvusService_method_names[13], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_GetVectorByID_(MilvusService_method_names[14], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_GetVectorIDs_(MilvusService_method_names[15], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_Search_(MilvusService_method_names[16], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_SearchByID_(MilvusService_method_names[17], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_SearchInFiles_(MilvusService_method_names[18], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_Cmd_(MilvusService_method_names[19], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_DeleteByID_(MilvusService_method_names[20], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_PreloadTable_(MilvusService_method_names[21], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_Flush_(MilvusService_method_names[22], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
+  , rpcmethod_Compact_(MilvusService_method_names[23], ::grpc::internal::RpcMethod::NORMAL_RPC, channel)
   {}

 ::grpc::Status MilvusService::Stub::CreateTable(::grpc::ClientContext* context, const ::milvus::grpc::TableSchema& request, ::milvus::grpc::Status* response) {

@ -207,6 +219,34 @@ void MilvusService::Stub::experimental_async::ShowTables(::grpc::ClientContext*
   return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::TableNameList>::Create(channel_.get(), cq, rpcmethod_ShowTables_, context, request, false);
 }

+::grpc::Status MilvusService::Stub::ShowTableInfo(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::milvus::grpc::TableInfo* response) {
+  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_ShowTableInfo_, context, request, response);
+}
+
+void MilvusService::Stub::experimental_async::ShowTableInfo(::grpc::ClientContext* context, const ::milvus::grpc::TableName* request, ::milvus::grpc::TableInfo* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_ShowTableInfo_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::ShowTableInfo(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::TableInfo* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_ShowTableInfo_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::ShowTableInfo(::grpc::ClientContext* context, const ::milvus::grpc::TableName* request, ::milvus::grpc::TableInfo* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_ShowTableInfo_, context, request, response, reactor);
+}
+
+void MilvusService::Stub::experimental_async::ShowTableInfo(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::TableInfo* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_ShowTableInfo_, context, request, response, reactor);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::TableInfo>* MilvusService::Stub::AsyncShowTableInfoRaw(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::TableInfo>::Create(channel_.get(), cq, rpcmethod_ShowTableInfo_, context, request, true);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::TableInfo>* MilvusService::Stub::PrepareAsyncShowTableInfoRaw(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::TableInfo>::Create(channel_.get(), cq, rpcmethod_ShowTableInfo_, context, request, false);
+}
+
 ::grpc::Status MilvusService::Stub::DropTable(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::milvus::grpc::Status* response) {
   return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_DropTable_, context, request, response);
 }

@ -431,6 +471,62 @@ void MilvusService::Stub::experimental_async::Insert(::grpc::ClientContext* cont
   return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::VectorIds>::Create(channel_.get(), cq, rpcmethod_Insert_, context, request, false);
 }

+::grpc::Status MilvusService::Stub::GetVectorByID(::grpc::ClientContext* context, const ::milvus::grpc::VectorIdentity& request, ::milvus::grpc::VectorData* response) {
+  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_GetVectorByID_, context, request, response);
+}
+
+void MilvusService::Stub::experimental_async::GetVectorByID(::grpc::ClientContext* context, const ::milvus::grpc::VectorIdentity* request, ::milvus::grpc::VectorData* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_GetVectorByID_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::GetVectorByID(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::VectorData* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_GetVectorByID_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::GetVectorByID(::grpc::ClientContext* context, const ::milvus::grpc::VectorIdentity* request, ::milvus::grpc::VectorData* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_GetVectorByID_, context, request, response, reactor);
+}
+
+void MilvusService::Stub::experimental_async::GetVectorByID(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::VectorData* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_GetVectorByID_, context, request, response, reactor);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::VectorData>* MilvusService::Stub::AsyncGetVectorByIDRaw(::grpc::ClientContext* context, const ::milvus::grpc::VectorIdentity& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::VectorData>::Create(channel_.get(), cq, rpcmethod_GetVectorByID_, context, request, true);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::VectorData>* MilvusService::Stub::PrepareAsyncGetVectorByIDRaw(::grpc::ClientContext* context, const ::milvus::grpc::VectorIdentity& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::VectorData>::Create(channel_.get(), cq, rpcmethod_GetVectorByID_, context, request, false);
+}
+
+::grpc::Status MilvusService::Stub::GetVectorIDs(::grpc::ClientContext* context, const ::milvus::grpc::GetVectorIDsParam& request, ::milvus::grpc::VectorIds* response) {
+  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_GetVectorIDs_, context, request, response);
+}
+
+void MilvusService::Stub::experimental_async::GetVectorIDs(::grpc::ClientContext* context, const ::milvus::grpc::GetVectorIDsParam* request, ::milvus::grpc::VectorIds* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_GetVectorIDs_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::GetVectorIDs(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::VectorIds* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_GetVectorIDs_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::GetVectorIDs(::grpc::ClientContext* context, const ::milvus::grpc::GetVectorIDsParam* request, ::milvus::grpc::VectorIds* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_GetVectorIDs_, context, request, response, reactor);
+}
+
+void MilvusService::Stub::experimental_async::GetVectorIDs(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::VectorIds* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_GetVectorIDs_, context, request, response, reactor);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::VectorIds>* MilvusService::Stub::AsyncGetVectorIDsRaw(::grpc::ClientContext* context, const ::milvus::grpc::GetVectorIDsParam& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::VectorIds>::Create(channel_.get(), cq, rpcmethod_GetVectorIDs_, context, request, true);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::VectorIds>* MilvusService::Stub::PrepareAsyncGetVectorIDsRaw(::grpc::ClientContext* context, const ::milvus::grpc::GetVectorIDsParam& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::VectorIds>::Create(channel_.get(), cq, rpcmethod_GetVectorIDs_, context, request, false);
+}
+
 ::grpc::Status MilvusService::Stub::Search(::grpc::ClientContext* context, const ::milvus::grpc::SearchParam& request, ::milvus::grpc::TopKQueryResult* response) {
   return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_Search_, context, request, response);
 }

@ -459,6 +555,34 @@ void MilvusService::Stub::experimental_async::Search(::grpc::ClientContext* cont
   return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::TopKQueryResult>::Create(channel_.get(), cq, rpcmethod_Search_, context, request, false);
 }

+::grpc::Status MilvusService::Stub::SearchByID(::grpc::ClientContext* context, const ::milvus::grpc::SearchByIDParam& request, ::milvus::grpc::TopKQueryResult* response) {
+  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_SearchByID_, context, request, response);
+}
+
+void MilvusService::Stub::experimental_async::SearchByID(::grpc::ClientContext* context, const ::milvus::grpc::SearchByIDParam* request, ::milvus::grpc::TopKQueryResult* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_SearchByID_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::SearchByID(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::TopKQueryResult* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_SearchByID_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::SearchByID(::grpc::ClientContext* context, const ::milvus::grpc::SearchByIDParam* request, ::milvus::grpc::TopKQueryResult* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_SearchByID_, context, request, response, reactor);
+}
+
+void MilvusService::Stub::experimental_async::SearchByID(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::TopKQueryResult* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_SearchByID_, context, request, response, reactor);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::TopKQueryResult>* MilvusService::Stub::AsyncSearchByIDRaw(::grpc::ClientContext* context, const ::milvus::grpc::SearchByIDParam& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::TopKQueryResult>::Create(channel_.get(), cq, rpcmethod_SearchByID_, context, request, true);
+}
+
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::TopKQueryResult>* MilvusService::Stub::PrepareAsyncSearchByIDRaw(::grpc::ClientContext* context, const ::milvus::grpc::SearchByIDParam& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::TopKQueryResult>::Create(channel_.get(), cq, rpcmethod_SearchByID_, context, request, false);
+}
+
 ::grpc::Status MilvusService::Stub::SearchInFiles(::grpc::ClientContext* context, const ::milvus::grpc::SearchInFilesParam& request, ::milvus::grpc::TopKQueryResult* response) {
   return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_SearchInFiles_, context, request, response);
 }

@ -515,32 +639,32 @@ void MilvusService::Stub::experimental_async::Cmd(::grpc::ClientContext* context
   return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::StringReply>::Create(channel_.get(), cq, rpcmethod_Cmd_, context, request, false);
 }

-::grpc::Status MilvusService::Stub::DeleteByDate(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByDateParam& request, ::milvus::grpc::Status* response) {
-  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_DeleteByDate_, context, request, response);
+::grpc::Status MilvusService::Stub::DeleteByID(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByIDParam& request, ::milvus::grpc::Status* response) {
+  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_DeleteByID_, context, request, response);
 }

-void MilvusService::Stub::experimental_async::DeleteByDate(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByDateParam* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
-  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_DeleteByDate_, context, request, response, std::move(f));
+void MilvusService::Stub::experimental_async::DeleteByID(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByIDParam* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_DeleteByID_, context, request, response, std::move(f));
 }

-void MilvusService::Stub::experimental_async::DeleteByDate(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
-  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_DeleteByDate_, context, request, response, std::move(f));
+void MilvusService::Stub::experimental_async::DeleteByID(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_DeleteByID_, context, request, response, std::move(f));
 }

-void MilvusService::Stub::experimental_async::DeleteByDate(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByDateParam* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
-  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_DeleteByDate_, context, request, response, reactor);
+void MilvusService::Stub::experimental_async::DeleteByID(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByIDParam* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_DeleteByID_, context, request, response, reactor);
 }

-void MilvusService::Stub::experimental_async::DeleteByDate(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
-  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_DeleteByDate_, context, request, response, reactor);
+void MilvusService::Stub::experimental_async::DeleteByID(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
+  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_DeleteByID_, context, request, response, reactor);
 }

-::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::AsyncDeleteByDateRaw(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByDateParam& request, ::grpc::CompletionQueue* cq) {
-  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_DeleteByDate_, context, request, true);
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::AsyncDeleteByIDRaw(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByIDParam& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_DeleteByID_, context, request, true);
 }

-::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::PrepareAsyncDeleteByDateRaw(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByDateParam& request, ::grpc::CompletionQueue* cq) {
-  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_DeleteByDate_, context, request, false);
+::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::PrepareAsyncDeleteByIDRaw(::grpc::ClientContext* context, const ::milvus::grpc::DeleteByIDParam& request, ::grpc::CompletionQueue* cq) {
+  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_DeleteByID_, context, request, false);
 }

 ::grpc::Status MilvusService::Stub::PreloadTable(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::milvus::grpc::Status* response) {

@ -571,6 +695,62 @@ void MilvusService::Stub::experimental_async::PreloadTable(::grpc::ClientContext
   return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_PreloadTable_, context, request, false);
 }

+::grpc::Status MilvusService::Stub::Flush(::grpc::ClientContext* context, const ::milvus::grpc::FlushParam& request, ::milvus::grpc::Status* response) {
+  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_Flush_, context, request, response);
+}
+
+void MilvusService::Stub::experimental_async::Flush(::grpc::ClientContext* context, const ::milvus::grpc::FlushParam* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_Flush_, context, request, response, std::move(f));
+}
+
+void MilvusService::Stub::experimental_async::Flush(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
+  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_Flush_, context, request, response, std::move(f));
|
||||
}
|
||||
|
||||
void MilvusService::Stub::experimental_async::Flush(::grpc::ClientContext* context, const ::milvus::grpc::FlushParam* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
|
||||
::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_Flush_, context, request, response, reactor);
|
||||
}
|
||||
|
||||
void MilvusService::Stub::experimental_async::Flush(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
|
||||
::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_Flush_, context, request, response, reactor);
|
||||
}
|
||||
|
||||
::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::AsyncFlushRaw(::grpc::ClientContext* context, const ::milvus::grpc::FlushParam& request, ::grpc::CompletionQueue* cq) {
|
||||
return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_Flush_, context, request, true);
|
||||
}
|
||||
|
||||
::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::PrepareAsyncFlushRaw(::grpc::ClientContext* context, const ::milvus::grpc::FlushParam& request, ::grpc::CompletionQueue* cq) {
|
||||
return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_Flush_, context, request, false);
|
||||
}
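Every synchronous stub method above is the same one-line pattern: marshal the request message through ::grpc::internal::BlockingUnaryCall against the pre-registered rpcmethod_ handle. A minimal client-side sketch of calling one of them, assuming the default Milvus endpoint and generated header name (both placeholders, not taken from this diff):

#include <grpcpp/grpcpp.h>
#include "milvus.grpc.pb.h"  // assumed name of the generated header

// Sketch: flush one table through the blocking stub call.
::milvus::grpc::Status CallFlush(const std::string& table_name) {
  auto channel = ::grpc::CreateChannel("localhost:19530",  // placeholder endpoint
                                       ::grpc::InsecureChannelCredentials());
  auto stub = ::milvus::grpc::MilvusService::NewStub(channel);

  ::milvus::grpc::FlushParam param;
  param.add_table_name_array(table_name);  // FlushParam carries a repeated name list

  ::milvus::grpc::Status response;
  ::grpc::ClientContext context;           // one fresh context per RPC
  ::grpc::Status rpc_status = stub->Flush(&context, param, &response);
  // rpc_status reports transport errors; response carries the server-side error code.
  return response;
}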
::grpc::Status MilvusService::Stub::Compact(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::milvus::grpc::Status* response) {
  return ::grpc::internal::BlockingUnaryCall(channel_.get(), rpcmethod_Compact_, context, request, response);
}

void MilvusService::Stub::experimental_async::Compact(::grpc::ClientContext* context, const ::milvus::grpc::TableName* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_Compact_, context, request, response, std::move(f));
}

void MilvusService::Stub::experimental_async::Compact(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, std::function<void(::grpc::Status)> f) {
  ::grpc_impl::internal::CallbackUnaryCall(stub_->channel_.get(), stub_->rpcmethod_Compact_, context, request, response, std::move(f));
}

void MilvusService::Stub::experimental_async::Compact(::grpc::ClientContext* context, const ::milvus::grpc::TableName* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_Compact_, context, request, response, reactor);
}

void MilvusService::Stub::experimental_async::Compact(::grpc::ClientContext* context, const ::grpc::ByteBuffer* request, ::milvus::grpc::Status* response, ::grpc::experimental::ClientUnaryReactor* reactor) {
  ::grpc_impl::internal::ClientCallbackUnaryFactory::Create(stub_->channel_.get(), stub_->rpcmethod_Compact_, context, request, response, reactor);
}

::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::AsyncCompactRaw(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::grpc::CompletionQueue* cq) {
  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_Compact_, context, request, true);
}

::grpc::ClientAsyncResponseReader< ::milvus::grpc::Status>* MilvusService::Stub::PrepareAsyncCompactRaw(::grpc::ClientContext* context, const ::milvus::grpc::TableName& request, ::grpc::CompletionQueue* cq) {
  return ::grpc_impl::internal::ClientAsyncResponseReaderFactory< ::milvus::grpc::Status>::Create(channel_.get(), cq, rpcmethod_Compact_, context, request, false);
}
MilvusService::Service::Service() {
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[0],

@@ -600,68 +780,98 @@ MilvusService::Service::Service() {
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[5],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::TableName, ::milvus::grpc::TableInfo>(
          std::mem_fn(&MilvusService::Service::ShowTableInfo), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[6],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::TableName, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::DropTable), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[6],
      MilvusService_method_names[7],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::IndexParam, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::CreateIndex), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[7],
      MilvusService_method_names[8],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::TableName, ::milvus::grpc::IndexParam>(
          std::mem_fn(&MilvusService::Service::DescribeIndex), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[8],
      MilvusService_method_names[9],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::TableName, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::DropIndex), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[9],
      MilvusService_method_names[10],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::PartitionParam, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::CreatePartition), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[10],
      MilvusService_method_names[11],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::TableName, ::milvus::grpc::PartitionList>(
          std::mem_fn(&MilvusService::Service::ShowPartitions), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[11],
      MilvusService_method_names[12],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::PartitionParam, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::DropPartition), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[12],
      MilvusService_method_names[13],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::InsertParam, ::milvus::grpc::VectorIds>(
          std::mem_fn(&MilvusService::Service::Insert), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[13],
      MilvusService_method_names[14],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::VectorIdentity, ::milvus::grpc::VectorData>(
          std::mem_fn(&MilvusService::Service::GetVectorByID), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[15],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::GetVectorIDsParam, ::milvus::grpc::VectorIds>(
          std::mem_fn(&MilvusService::Service::GetVectorIDs), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[16],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::SearchParam, ::milvus::grpc::TopKQueryResult>(
          std::mem_fn(&MilvusService::Service::Search), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[14],
      MilvusService_method_names[17],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::SearchByIDParam, ::milvus::grpc::TopKQueryResult>(
          std::mem_fn(&MilvusService::Service::SearchByID), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[18],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::SearchInFilesParam, ::milvus::grpc::TopKQueryResult>(
          std::mem_fn(&MilvusService::Service::SearchInFiles), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[15],
      MilvusService_method_names[19],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::Command, ::milvus::grpc::StringReply>(
          std::mem_fn(&MilvusService::Service::Cmd), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[16],
      MilvusService_method_names[20],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::DeleteByDateParam, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::DeleteByDate), this)));
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::DeleteByIDParam, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::DeleteByID), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[17],
      MilvusService_method_names[21],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::TableName, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::PreloadTable), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[22],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::FlushParam, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::Flush), this)));
  AddMethod(new ::grpc::internal::RpcServiceMethod(
      MilvusService_method_names[23],
      ::grpc::internal::RpcMethod::NORMAL_RPC,
      new ::grpc::internal::RpcMethodHandler< MilvusService::Service, ::milvus::grpc::TableName, ::milvus::grpc::Status>(
          std::mem_fn(&MilvusService::Service::Compact), this)));
}

MilvusService::Service::~Service() {

@@ -702,6 +912,13 @@ MilvusService::Service::~Service() {
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::ShowTableInfo(::grpc::ServerContext* context, const ::milvus::grpc::TableName* request, ::milvus::grpc::TableInfo* response) {
  (void) context;
  (void) request;
  (void) response;
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::DropTable(::grpc::ServerContext* context, const ::milvus::grpc::TableName* request, ::milvus::grpc::Status* response) {
  (void) context;
  (void) request;

@@ -758,6 +975,20 @@ MilvusService::Service::~Service() {
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::GetVectorByID(::grpc::ServerContext* context, const ::milvus::grpc::VectorIdentity* request, ::milvus::grpc::VectorData* response) {
  (void) context;
  (void) request;
  (void) response;
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::GetVectorIDs(::grpc::ServerContext* context, const ::milvus::grpc::GetVectorIDsParam* request, ::milvus::grpc::VectorIds* response) {
  (void) context;
  (void) request;
  (void) response;
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::Search(::grpc::ServerContext* context, const ::milvus::grpc::SearchParam* request, ::milvus::grpc::TopKQueryResult* response) {
  (void) context;
  (void) request;

@@ -765,6 +996,13 @@ MilvusService::Service::~Service() {
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::SearchByID(::grpc::ServerContext* context, const ::milvus::grpc::SearchByIDParam* request, ::milvus::grpc::TopKQueryResult* response) {
  (void) context;
  (void) request;
  (void) response;
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::SearchInFiles(::grpc::ServerContext* context, const ::milvus::grpc::SearchInFilesParam* request, ::milvus::grpc::TopKQueryResult* response) {
  (void) context;
  (void) request;

@@ -779,7 +1017,7 @@ MilvusService::Service::~Service() {
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::DeleteByDate(::grpc::ServerContext* context, const ::milvus::grpc::DeleteByDateParam* request, ::milvus::grpc::Status* response) {
::grpc::Status MilvusService::Service::DeleteByID(::grpc::ServerContext* context, const ::milvus::grpc::DeleteByIDParam* request, ::milvus::grpc::Status* response) {
  (void) context;
  (void) request;
  (void) response;

@@ -793,6 +1031,20 @@ MilvusService::Service::~Service() {
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::Flush(::grpc::ServerContext* context, const ::milvus::grpc::FlushParam* request, ::milvus::grpc::Status* response) {
  (void) context;
  (void) request;
  (void) response;
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}

::grpc::Status MilvusService::Service::Compact(::grpc::ServerContext* context, const ::milvus::grpc::TableName* request, ::milvus::grpc::Status* response) {
  (void) context;
  (void) request;
  (void) response;
  return ::grpc::Status(::grpc::StatusCode::UNIMPLEMENTED, "");
}
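Every default handler above returns UNIMPLEMENTED, so a concrete server only overrides the RPCs it actually serves. A minimal sketch of such an override; the handler body is illustrative only, not the Milvus server implementation:

// Sketch: subclass the generated Service and override one handler.
class MilvusServiceImpl final : public ::milvus::grpc::MilvusService::Service {
  ::grpc::Status
  Flush(::grpc::ServerContext* context, const ::milvus::grpc::FlushParam* request,
        ::milvus::grpc::Status* response) override {
    for (int i = 0; i < request->table_name_array_size(); ++i) {
      // flush request->table_name_array(i) here (application logic elided)
    }
    response->set_error_code(::milvus::grpc::SUCCESS);
    return ::grpc::Status::OK;  // transport-level OK; app status travels in response
  }
};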
} // namespace grpc
} // namespace milvus
[Three file diffs suppressed because they are too large to display.]
@@ -61,21 +61,21 @@ static ::PROTOBUF_NAMESPACE_ID::Message const * const file_default_instances[] =
const char descriptor_table_protodef_status_2eproto[] PROTOBUF_SECTION_VARIABLE(protodesc_cold) =
  "\n\014status.proto\022\013milvus.grpc\"D\n\006Status\022*\n"
  "\nerror_code\030\001 \001(\0162\026.milvus.grpc.ErrorCod"
  "e\022\016\n\006reason\030\002 \001(\t*\253\004\n\tErrorCode\022\013\n\007SUCCE"
  "e\022\016\n\006reason\030\002 \001(\t*\230\004\n\tErrorCode\022\013\n\007SUCCE"
  "SS\020\000\022\024\n\020UNEXPECTED_ERROR\020\001\022\022\n\016CONNECT_FA"
  "ILED\020\002\022\025\n\021PERMISSION_DENIED\020\003\022\024\n\020TABLE_N"
  "OT_EXISTS\020\004\022\024\n\020ILLEGAL_ARGUMENT\020\005\022\021\n\rILL"
  "EGAL_RANGE\020\006\022\025\n\021ILLEGAL_DIMENSION\020\007\022\026\n\022I"
  "LLEGAL_INDEX_TYPE\020\010\022\026\n\022ILLEGAL_TABLE_NAM"
  "E\020\t\022\020\n\014ILLEGAL_TOPK\020\n\022\025\n\021ILLEGAL_ROWRECO"
  "RD\020\013\022\025\n\021ILLEGAL_VECTOR_ID\020\014\022\031\n\025ILLEGAL_S"
  "EARCH_RESULT\020\r\022\022\n\016FILE_NOT_FOUND\020\016\022\017\n\013ME"
  "TA_FAILED\020\017\022\020\n\014CACHE_FAILED\020\020\022\030\n\024CANNOT_"
  "CREATE_FOLDER\020\021\022\026\n\022CANNOT_CREATE_FILE\020\022\022"
  "\030\n\024CANNOT_DELETE_FOLDER\020\023\022\026\n\022CANNOT_DELE"
  "TE_FILE\020\024\022\025\n\021BUILD_INDEX_ERROR\020\025\022\021\n\rILLE"
  "GAL_NLIST\020\026\022\027\n\023ILLEGAL_METRIC_TYPE\020\027\022\021\n\r"
  "OUT_OF_MEMORY\020\030b\006proto3"
  "OT_EXISTS\020\004\022\024\n\020ILLEGAL_ARGUMENT\020\005\022\025\n\021ILL"
  "EGAL_DIMENSION\020\007\022\026\n\022ILLEGAL_INDEX_TYPE\020\010"
  "\022\026\n\022ILLEGAL_TABLE_NAME\020\t\022\020\n\014ILLEGAL_TOPK"
  "\020\n\022\025\n\021ILLEGAL_ROWRECORD\020\013\022\025\n\021ILLEGAL_VEC"
  "TOR_ID\020\014\022\031\n\025ILLEGAL_SEARCH_RESULT\020\r\022\022\n\016F"
  "ILE_NOT_FOUND\020\016\022\017\n\013META_FAILED\020\017\022\020\n\014CACH"
  "E_FAILED\020\020\022\030\n\024CANNOT_CREATE_FOLDER\020\021\022\026\n\022"
  "CANNOT_CREATE_FILE\020\022\022\030\n\024CANNOT_DELETE_FO"
  "LDER\020\023\022\026\n\022CANNOT_DELETE_FILE\020\024\022\025\n\021BUILD_"
  "INDEX_ERROR\020\025\022\021\n\rILLEGAL_NLIST\020\026\022\027\n\023ILLE"
  "GAL_METRIC_TYPE\020\027\022\021\n\rOUT_OF_MEMORY\020\030b\006pr"
  "oto3"
  ;
static const ::PROTOBUF_NAMESPACE_ID::internal::DescriptorTable*const descriptor_table_status_2eproto_deps[1] = {
};

@@ -85,7 +85,7 @@ static ::PROTOBUF_NAMESPACE_ID::internal::SCCInfoBase*const descriptor_table_sta
static ::PROTOBUF_NAMESPACE_ID::internal::once_flag descriptor_table_status_2eproto_once;
static bool descriptor_table_status_2eproto_initialized = false;
const ::PROTOBUF_NAMESPACE_ID::internal::DescriptorTable descriptor_table_status_2eproto = {
  &descriptor_table_status_2eproto_initialized, descriptor_table_protodef_status_2eproto, "status.proto", 663,
  &descriptor_table_status_2eproto_initialized, descriptor_table_protodef_status_2eproto, "status.proto", 644,
  &descriptor_table_status_2eproto_once, descriptor_table_status_2eproto_sccs, descriptor_table_status_2eproto_deps, 1, 0,
  schemas, file_default_instances, TableStruct_status_2eproto::offsets,
  file_level_metadata_status_2eproto, 1, file_level_enum_descriptors_status_2eproto, file_level_service_descriptors_status_2eproto,

@@ -107,7 +107,6 @@ bool ErrorCode_IsValid(int value) {
    case 3:
    case 4:
    case 5:
    case 6:
    case 7:
    case 8:
    case 9:
@@ -75,7 +75,6 @@ enum ErrorCode : int {
  PERMISSION_DENIED = 3,
  TABLE_NOT_EXISTS = 4,
  ILLEGAL_ARGUMENT = 5,
  ILLEGAL_RANGE = 6,
  ILLEGAL_DIMENSION = 7,
  ILLEGAL_INDEX_TYPE = 8,
  ILLEGAL_TABLE_NAME = 9,
@@ -11,13 +11,6 @@ message TableName {
    string table_name = 1;
}

/**
 * @brief Partition name
 */
message PartitionName {
    string partition_name = 1;
}

/**
 * @brief Table name list
 */

@@ -42,8 +35,7 @@ message TableSchema {
 */
message PartitionParam {
    string table_name = 1;
    string partition_name = 2;
    string tag = 3;
    string tag = 2;
}

/**

@@ -51,15 +43,7 @@ message PartitionParam {
 */
message PartitionList {
    Status status = 1;
    repeated PartitionParam partition_array = 2;
}

/**
 * @brief Range schema
 */
message Range {
    string start_value = 1;
    string end_value = 2;
    repeated string partition_tag_array = 2;
}

/**

@@ -94,10 +78,9 @@ message VectorIds {
message SearchParam {
    string table_name = 1;
    repeated RowRecord query_record_array = 2;
    repeated Range query_range_array = 3;
    int64 topk = 4;
    int64 nprobe = 5;
    repeated string partition_tag_array = 6;
    int64 topk = 3;
    int64 nprobe = 4;
    repeated string partition_tag_array = 5;
}

/**

@@ -108,6 +91,17 @@ message SearchInFilesParam {
    SearchParam search_param = 2;
}

/**
 * @brief Params for searching vector by ID
 */
message SearchByIDParam {
    string table_name = 1;
    int64 id = 2;
    int64 topk = 3;
    int64 nprobe = 4;
    repeated string partition_tag_array = 5;
}

/**
 * @brief Query result params
 */

@@ -169,11 +163,70 @@ message IndexParam {
}

/**
 * @brief table name and range for DeleteByDate
 * @brief Flush params
 */
message DeleteByDateParam {
    Range range = 1;
    string table_name = 2;
message FlushParam {
    repeated string table_name_array = 1;
}

/**
 * @brief Delete by id params
 */
message DeleteByIDParam {
    string table_name = 1;
    repeated int64 id_array = 2;
}
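The C++ message API follows mechanically from these fields; a sketch of building a delete request (table name and ids are placeholders):

// Sketch: protobuf accessors generated from DeleteByIDParam above.
::milvus::grpc::DeleteByIDParam param;
param.set_table_name("my_table");        // placeholder table name
for (int64_t id : {101, 102, 103}) {     // placeholder vector ids
    param.add_id_array(id);
}
// param is then passed to MilvusService::Stub::DeleteByID(...).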
/**
 * @brief segment statistics
 */
message SegmentStat {
    string segment_name = 1;
    int64 row_count = 2;
    string index_name = 3;
    int64 data_size = 4;
}

/**
 * @brief partition statistics
 */
message PartitionStat {
    string tag = 1;
    int64 total_row_count = 2;
    repeated SegmentStat segments_stat = 3;
}

/**
 * @brief table information
 */
message TableInfo {
    Status status = 1;
    int64 total_row_count = 2;
    repeated PartitionStat partitions_stat = 3;
}
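ShowTableInfo returns this structure nested three deep: table totals, per-partition rows, per-segment files. A sketch of walking it with the standard generated accessors (formatting only; the message and field names come from the definitions above):

#include <cstdio>

// Sketch: print the statistics tree carried by a TableInfo response.
void PrintTableInfo(const ::milvus::grpc::TableInfo& info) {
    std::printf("total rows: %lld\n", (long long)info.total_row_count());
    for (const auto& part : info.partitions_stat()) {
        std::printf("partition '%s': %lld rows\n", part.tag().c_str(),
                    (long long)part.total_row_count());
        for (const auto& seg : part.segments_stat()) {
            std::printf("  segment %s (%s): %lld rows, %lld bytes\n",
                        seg.segment_name().c_str(), seg.index_name().c_str(),
                        (long long)seg.row_count(), (long long)seg.data_size());
        }
    }
}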
/**
 * @brief vector identity
 */
message VectorIdentity {
    string table_name = 1;
    int64 id = 2;
}

/**
 * @brief vector data
 */
message VectorData {
    Status status = 1;
    RowRecord vector_data = 2;
}

/**
 * @brief get vector ids from a segment parameters
 */
message GetVectorIDsParam {
    string table_name = 1;
    string segment_name = 2;
}

service MilvusService {

@@ -222,6 +275,15 @@ service MilvusService {
     */
    rpc ShowTables(Command) returns (TableNameList) {}

    /**
     * @brief This method is used to get table detail information.
     *
     * @param TableName, target table name.
     *
     * @return TableInfo
     */
    rpc ShowTableInfo(TableName) returns (TableInfo) {}

    /**
     * @brief This method is used to delete table.
     *

@@ -294,6 +356,24 @@ service MilvusService {
     */
    rpc Insert(InsertParam) returns (VectorIds) {}

    /**
     * @brief This method is used to get vector data by id.
     *
     * @param VectorIdentity, target vector id.
     *
     * @return VectorData
     */
    rpc GetVectorByID(VectorIdentity) returns (VectorData) {}

    /**
     * @brief This method is used to get vector ids from a segment
     *
     * @param GetVectorIDsParam, target table and segment
     *
     * @return VectorIds
     */
    rpc GetVectorIDs(GetVectorIDsParam) returns (VectorIds) {}

    /**
     * @brief This method is used to query vector in table.
     *

@@ -303,6 +383,15 @@ service MilvusService {
     */
    rpc Search(SearchParam) returns (TopKQueryResult) {}

    /**
     * @brief This method is used to query vector by id.
     *
     * @param SearchByIDParam, search parameters.
     *
     * @return TopKQueryResult
     */
    rpc SearchByID(SearchByIDParam) returns (TopKQueryResult) {}

    /**
     * @brief This method is used to query vector in specified files.
     *

@@ -321,21 +410,39 @@ service MilvusService {
     */
    rpc Cmd(Command) returns (StringReply) {}

    /**
     * @brief This method is used to delete vector by date range
    /**
     * @brief This method is used to delete vector by id
     *
     * @param DeleteByDateParam, delete parameters.
     * @param DeleteByIDParam, delete parameters.
     *
     * @return status
     */
    rpc DeleteByDate(DeleteByDateParam) returns (Status) {}
    rpc DeleteByID(DeleteByIDParam) returns (Status) {}

    /**
     * @brief This method is used to preload table
     *
     * @param TableName, target table name.
     *
     * @return Status
     */
    rpc PreloadTable(TableName) returns (Status) {}
    /**
     * @brief This method is used to preload table
     *
     * @param TableName, target table name.
     *
     * @return Status
     */
    rpc PreloadTable(TableName) returns (Status) {}

    /**
     * @brief This method is used to flush buffer into storage.
     *
     * @param FlushParam, flush parameters
     *
     * @return Status
     */
    rpc Flush(FlushParam) returns (Status) {}

    /**
     * @brief This method is used to compact table
     *
     * @param TableName, target table name.
     *
     * @return Status
     */
    rpc Compact(TableName) returns (Status) {}
}
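Taken together, the new RPCs cover the delete-by-id lifecycle end to end. A hedged client sketch (stub is a MilvusService::Stub as generated above; the table name, ids, and search settings are placeholders, and whether a Flush is required before deletes become visible is an assumption about server behavior):

// Sketch: delete by id, flush, compact, then search by a surviving id.
::milvus::grpc::Status status;

::milvus::grpc::DeleteByIDParam del;
del.set_table_name("my_table");
del.add_id_array(42);
{ ::grpc::ClientContext ctx; stub->DeleteByID(&ctx, del, &status); }

::milvus::grpc::FlushParam flush;        // presumably persists the delete
flush.add_table_name_array("my_table");
{ ::grpc::ClientContext ctx; stub->Flush(&ctx, flush, &status); }

::milvus::grpc::TableName table;         // reclaims space left by deleted rows
table.set_table_name("my_table");
{ ::grpc::ClientContext ctx; stub->Compact(&ctx, table, &status); }

::milvus::grpc::SearchByIDParam query;   // query by a stored vector's id
query.set_table_name("my_table");
query.set_id(7);
query.set_topk(10);
query.set_nprobe(16);
::milvus::grpc::TopKQueryResult result;
{ ::grpc::ClientContext ctx; stub->SearchByID(&ctx, query, &result); }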
@@ -9,7 +9,6 @@ enum ErrorCode {
    PERMISSION_DENIED = 3;
    TABLE_NOT_EXISTS = 4;
    ILLEGAL_ARGUMENT = 5;
    ILLEGAL_RANGE = 6;
    ILLEGAL_DIMENSION = 7;
    ILLEGAL_INDEX_TYPE = 8;
    ILLEGAL_TABLE_NAME = 9;
@@ -9,14 +9,14 @@
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "knowhere/index/vector_index/IndexBinaryIDMAP.h"

#include <faiss/IndexBinaryFlat.h>
#include <faiss/MetaIndexes.h>

#include <faiss/index_factory.h>

#include "knowhere/adapter/VectorAdapter.h"
#include "knowhere/common/Exception.h"
#include "knowhere/index/vector_index/IndexBinaryIDMAP.h"

namespace knowhere {

@@ -72,7 +72,7 @@ void
BinaryIDMAP::search_impl(int64_t n, const uint8_t* data, int64_t k, float* distances, int64_t* labels,
                         const Config& cfg) {
    int32_t* pdistances = (int32_t*)distances;
    index_->search(n, (uint8_t*)data, k, pdistances, labels);
    index_->search(n, (uint8_t*)data, k, pdistances, labels, bitset_);
}

void

@@ -137,4 +137,97 @@ BinaryIDMAP::Seal() {
    // do nothing
}

void
BinaryIDMAP::AddWithoutId(const DatasetPtr& dataset, const Config& config) {
    if (!index_) {
        KNOWHERE_THROW_MSG("index not initialized");
    }

    std::lock_guard<std::mutex> lk(mutex_);
    GETBINARYTENSOR(dataset)

    std::vector<int64_t> new_ids(rows);
    for (int i = 0; i < rows; ++i) {
        new_ids[i] = i;
    }

    index_->add_with_ids(rows, (uint8_t*)p_data, new_ids.data());
}

DatasetPtr
BinaryIDMAP::GetVectorById(const DatasetPtr& dataset, const Config& config) {
    if (!index_) {
        KNOWHERE_THROW_MSG("index not initialized");
    }

    // GETBINARYTENSOR(dataset)
    // auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);
    auto elems = dataset->Get<int64_t>(meta::DIM);

    size_t p_x_size = sizeof(uint8_t) * elems;
    auto p_x = (uint8_t*)malloc(p_x_size);

    index_->get_vector_by_id(1, p_data, p_x, bitset_);

    auto ret_ds = std::make_shared<Dataset>();
    ret_ds->Set(meta::TENSOR, p_x);
    return ret_ds;
}

DatasetPtr
BinaryIDMAP::SearchById(const DatasetPtr& dataset, const Config& config) {
    if (!index_) {
        KNOWHERE_THROW_MSG("index not initialized");
    }

    // auto search_cfg = std::dynamic_pointer_cast<BinIDMAPCfg>(config);
    // if (search_cfg == nullptr) {
    //     KNOWHERE_THROW_MSG("not support this kind of config");
    // }

    // GETBINARYTENSOR(dataset)
    auto dim = dataset->Get<int64_t>(meta::DIM);
    auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);

    auto elems = rows * config->k;
    size_t p_id_size = sizeof(int64_t) * elems;
    size_t p_dist_size = sizeof(float) * elems;
    auto p_id = (int64_t*)malloc(p_id_size);
    auto p_dist = (float*)malloc(p_dist_size);

    auto* pdistances = (int32_t*)p_dist;
    // index_->searchById(rows, (uint8_t*)p_data, config->k, pdistances, p_id, bitset_);
    // auto blacklist = dataset->Get<faiss::ConcurrentBitsetPtr>("bitset");
    index_->search_by_id(rows, p_data, config->k, pdistances, p_id, bitset_);

    auto ret_ds = std::make_shared<Dataset>();
    if (index_->metric_type == faiss::METRIC_Hamming) {
        auto pf_dist = (float*)malloc(p_dist_size);
        int32_t* pi_dist = (int32_t*)p_dist;
        for (int i = 0; i < elems; i++) {
            *(pf_dist + i) = (float)(*(pi_dist + i));
        }
        ret_ds->Set(meta::IDS, p_id);
        ret_ds->Set(meta::DISTANCE, pf_dist);
        free(p_dist);
    } else {
        ret_ds->Set(meta::IDS, p_id);
        ret_ds->Set(meta::DISTANCE, p_dist);
    }

    return ret_ds;
}

void
BinaryIDMAP::SetBlacklist(faiss::ConcurrentBitsetPtr list) {
    bitset_ = std::move(list);
}

void
BinaryIDMAP::GetBlacklist(faiss::ConcurrentBitsetPtr& list) {
    list = bitset_;
}
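SetBlacklist/GetBlacklist wire the deletion filter into the index: every search and get_vector_by_id call above threads bitset_ down to faiss, which skips any row whose bit is set. A sketch, assuming the ConcurrentBitset(capacity)/set(offset) interface of the Milvus faiss fork (the variable names are placeholders):

// Sketch: mark two row offsets as deleted, then attach the filter.
auto bitset = std::make_shared<faiss::ConcurrentBitset>(row_count);  // row_count: placeholder
bitset->set(5);   // row offsets within the segment, not user ids
bitset->set(9);
binary_idmap->SetBlacklist(bitset);  // later searches skip rows 5 and 9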
} // namespace knowhere

@@ -11,6 +11,7 @@

#pragma once

#include <faiss/utils/ConcurrentBitset.h>
#include <memory>
#include <mutex>
#include <utility>

@@ -41,6 +42,9 @@ class BinaryIDMAP : public VectorIndex, public FaissBaseBinaryIndex {
    void
    Add(const DatasetPtr& dataset, const Config& config) override;

    void
    AddWithoutId(const DatasetPtr& dataset, const Config& config);

    void
    Train(const Config& config);

@@ -59,12 +63,27 @@ class BinaryIDMAP : public VectorIndex, public FaissBaseBinaryIndex {
    const int64_t*
    GetRawIds();

    DatasetPtr
    GetVectorById(const DatasetPtr& dataset, const Config& config) override;

    DatasetPtr
    SearchById(const DatasetPtr& dataset, const Config& config) override;

    void
    SetBlacklist(faiss::ConcurrentBitsetPtr list);

    void
    GetBlacklist(faiss::ConcurrentBitsetPtr& list);

 protected:
    virtual void
    search_impl(int64_t n, const uint8_t* data, int64_t k, float* distances, int64_t* labels, const Config& cfg);

 protected:
    std::mutex mutex_;

 private:
    faiss::ConcurrentBitsetPtr bitset_ = nullptr;
};

using BinaryIDMAPPtr = std::shared_ptr<BinaryIDMAP>;
@@ -9,14 +9,15 @@
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include "knowhere/index/vector_index/IndexBinaryIVF.h"

#include <faiss/IndexBinaryFlat.h>
#include <faiss/IndexBinaryIVF.h>

#include <chrono>

#include "knowhere/adapter/VectorAdapter.h"
#include "knowhere/common/Exception.h"
#include "knowhere/index/vector_index/IndexBinaryIVF.h"

#include <chrono>

namespace knowhere {

@@ -91,7 +92,10 @@ BinaryIVF::search_impl(int64_t n, const uint8_t* data, int64_t k, float* distanc
    ivf_index->nprobe = params->nprobe;
    int32_t* pdistances = (int32_t*)distances;
    stdclock::time_point before = stdclock::now();
    ivf_index->search(n, (uint8_t*)data, k, pdistances, labels);

    // todo: remove static cast (zhiru)
    static_cast<faiss::IndexBinary*>(index_.get())->search(n, (uint8_t*)data, k, pdistances, labels, bitset_);

    stdclock::time_point after = stdclock::now();
    double search_cost = (std::chrono::duration<double, std::micro>(after - before)).count();
    KNOWHERE_LOG_DEBUG << "IVF search cost: " << search_cost

@@ -153,4 +157,92 @@ BinaryIVF::Seal() {
    // do nothing
}

DatasetPtr
BinaryIVF::GetVectorById(const DatasetPtr& dataset, const Config& config) {
    if (!index_ || !index_->is_trained) {
        KNOWHERE_THROW_MSG("index not initialized or trained");
    }

    // GETBINARYTENSOR(dataset)
    // auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);
    auto elems = dataset->Get<int64_t>(meta::DIM);

    try {
        size_t p_x_size = sizeof(uint8_t) * elems;
        auto p_x = (uint8_t*)malloc(p_x_size);

        index_->get_vector_by_id(1, p_data, p_x, bitset_);

        auto ret_ds = std::make_shared<Dataset>();
        ret_ds->Set(meta::TENSOR, p_x);
        return ret_ds;
    } catch (faiss::FaissException& e) {
        KNOWHERE_THROW_MSG(e.what());
    } catch (std::exception& e) {
        KNOWHERE_THROW_MSG(e.what());
    }
}

DatasetPtr
BinaryIVF::SearchById(const DatasetPtr& dataset, const Config& config) {
    if (!index_ || !index_->is_trained) {
        KNOWHERE_THROW_MSG("index not initialized or trained");
    }

    // auto search_cfg = std::dynamic_pointer_cast<IVFBinCfg>(config);
    // if (search_cfg == nullptr) {
    //     KNOWHERE_THROW_MSG("not support this kind of config");
    // }

    // GETBINARYTENSOR(dataset)
    auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);

    try {
        auto elems = rows * config->k;

        size_t p_id_size = sizeof(int64_t) * elems;
        size_t p_dist_size = sizeof(float) * elems;
        auto p_id = (int64_t*)malloc(p_id_size);
        auto p_dist = (float*)malloc(p_dist_size);

        int32_t* pdistances = (int32_t*)p_dist;
        // auto blacklist = dataset->Get<faiss::ConcurrentBitsetPtr>("bitset");
        // index_->searchById(rows, (uint8_t*)p_data, config->k, pdistances, p_id, blacklist);
        index_->search_by_id(rows, p_data, config->k, pdistances, p_id, bitset_);

        auto ret_ds = std::make_shared<Dataset>();
        if (index_->metric_type == faiss::METRIC_Hamming) {
            auto pf_dist = (float*)malloc(p_dist_size);
            int32_t* pi_dist = (int32_t*)p_dist;
            for (int i = 0; i < elems; i++) {
                *(pf_dist + i) = (float)(*(pi_dist + i));
            }
            ret_ds->Set(meta::IDS, p_id);
            ret_ds->Set(meta::DISTANCE, pf_dist);
            free(p_dist);
        } else {
            ret_ds->Set(meta::IDS, p_id);
            ret_ds->Set(meta::DISTANCE, p_dist);
        }

        return ret_ds;
    } catch (faiss::FaissException& e) {
        KNOWHERE_THROW_MSG(e.what());
    } catch (std::exception& e) {
        KNOWHERE_THROW_MSG(e.what());
    }
}

void
BinaryIVF::SetBlacklist(faiss::ConcurrentBitsetPtr list) {
    bitset_ = std::move(list);
}

void
BinaryIVF::GetBlacklist(faiss::ConcurrentBitsetPtr& list) {
    list = bitset_;
}

} // namespace knowhere
@@ -16,6 +16,7 @@
#include <utility>
#include <vector>

#include <faiss/utils/ConcurrentBitset.h>
#include "FaissBaseBinaryIndex.h"
#include "VectorIndex.h"
#include "faiss/IndexIVF.h"

@@ -54,6 +55,18 @@ class BinaryIVF : public VectorIndex, public FaissBaseBinaryIndex {
    int64_t
    Dimension() override;

    DatasetPtr
    GetVectorById(const DatasetPtr& dataset, const Config& config);

    DatasetPtr
    SearchById(const DatasetPtr& dataset, const Config& config);

    void
    SetBlacklist(faiss::ConcurrentBitsetPtr list);

    void
    GetBlacklist(faiss::ConcurrentBitsetPtr& list);

 protected:
    virtual std::shared_ptr<faiss::IVFSearchParameters>
    GenParams(const Config& config);

@@ -63,6 +76,9 @@ class BinaryIVF : public VectorIndex, public FaissBaseBinaryIndex {

 protected:
    std::mutex mutex_;

 private:
    faiss::ConcurrentBitsetPtr bitset_ = nullptr;
};

using BinaryIVFIndexPtr = std::shared_ptr<BinaryIVF>;
@@ -27,6 +27,14 @@

namespace knowhere {

void
normalize_vector(float* data, float* norm_array, size_t dim) {
    float norm = 0.0f;
    for (int i = 0; i < dim; i++) norm += data[i] * data[i];
    norm = 1.0f / (sqrtf(norm) + 1e-30f);
    for (int i = 0; i < dim; i++) norm_array[i] = data[i] * norm;
}
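normalize_vector rescales a vector to (approximately) unit length, with a tiny epsilon guarding against division by zero; this is what lets an inner-product HNSW space stand in for cosine similarity. In math form, matching the two loops above:

\[
\hat{x}_j = \frac{x_j}{\sqrt{\sum_{i=1}^{d} x_i^2} + 10^{-30}}, \qquad
\langle \hat{x}, \hat{y} \rangle \approx \cos(x, y).
\]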
BinarySet
IndexHNSW::Serialize() {
    if (!index_) {

@@ -59,6 +67,8 @@ IndexHNSW::Load(const BinarySet& index_binary) {
        hnswlib::SpaceInterface<float>* space;
        index_ = std::make_shared<hnswlib::HierarchicalNSW<float>>(space);
        index_->loadIndex(reader);

        normalize = index_->metric_type_ == 1 ? true : false;  // 1 == InnerProduct
    } catch (std::exception& e) {
        KNOWHERE_THROW_MSG(e.what());
    }

@@ -69,6 +79,13 @@ IndexHNSW::Search(const DatasetPtr& dataset, const Config& config) {
    if (!index_) {
        KNOWHERE_THROW_MSG("index not initialized or trained");
    }

    auto search_cfg = std::dynamic_pointer_cast<HNSWCfg>(config);
    if (search_cfg == nullptr) {
        KNOWHERE_THROW_MSG("search conf is null");
    }
    index_->setEf(search_cfg->ef);

    GETTENSOR(dataset)

    size_t id_size = sizeof(int64_t) * config->k;

@@ -77,18 +94,34 @@ IndexHNSW::Search(const DatasetPtr& dataset, const Config& config) {
    auto p_dist = (float*)malloc(dist_size * rows);

    using P = std::pair<float, int64_t>;
    auto compare = [](P& v1, P& v2) { return v1.first < v2.first; };
    auto compare = [](const P& v1, const P& v2) { return v1.first < v2.first; };
#pragma omp parallel for
    for (unsigned int i = 0; i < rows; ++i) {
        const float* single_query = p_data + i * dim;
        std::vector<std::pair<float, int64_t>> ret = index_->searchKnn(single_query, config->k, compare);
        std::vector<P> ret;
        const float* single_query = p_data + i * Dimension();

        // if (normalize) {
        //     std::vector<float> norm_vector(Dimension());
        //     normalize_vector((float*)(single_query), norm_vector.data(), Dimension());
        //     ret = index_->searchKnn((float*)(norm_vector.data()), config->k, compare);
        // } else {
        //     ret = index_->searchKnn((float*)single_query, config->k, compare);
        // }
        ret = index_->searchKnn((float*)single_query, config->k, compare);

        while (ret.size() < config->k) {
            ret.push_back(std::make_pair(-1, -1));
        }
        std::vector<float> dist;
        std::vector<int64_t> ids;
        std::transform(ret.begin(), ret.end(), std::back_inserter(dist),
                       [](const std::pair<float, int64_t>& e) { return e.first; });

        if (normalize) {
            std::transform(ret.begin(), ret.end(), std::back_inserter(dist),
                           [](const std::pair<float, int64_t>& e) { return float(1 - e.first); });
        } else {
            std::transform(ret.begin(), ret.end(), std::back_inserter(dist),
                           [](const std::pair<float, int64_t>& e) { return e.first; });
        }
        std::transform(ret.begin(), ret.end(), std::back_inserter(ids),
                       [](const std::pair<float, int64_t>& e) { return e.second; });

@@ -105,8 +138,8 @@ IndexHNSW::Search(const DatasetPtr& dataset, const Config& config) {
IndexModelPtr
IndexHNSW::Train(const DatasetPtr& dataset, const Config& config) {
    auto build_cfg = std::dynamic_pointer_cast<HNSWCfg>(config);
    if (build_cfg != nullptr) {
        build_cfg->CheckValid();  // throw exception
    if (build_cfg == nullptr) {
        KNOWHERE_THROW_MSG("build conf is null");
    }

    GETTENSOR(dataset)

@@ -116,6 +149,7 @@ IndexHNSW::Train(const DatasetPtr& dataset, const Config& config) {
        space = new hnswlib::L2Space(dim);
    } else if (config->metric_type == METRICTYPE::IP) {
        space = new hnswlib::InnerProductSpace(dim);
        normalize = true;
    }
    index_ = std::make_shared<hnswlib::HierarchicalNSW<float>>(space, rows, build_cfg->M, build_cfg->ef);

@@ -133,12 +167,28 @@ IndexHNSW::Add(const DatasetPtr& dataset, const Config& config) {
    GETTENSOR(dataset)
    auto p_ids = dataset->Get<const int64_t*>(meta::IDS);

    for (int i = 0; i < 1; i++) {
        index_->addPoint((void*)(p_data + dim * i), p_ids[i]);
    }
    // if (normalize) {
    //     std::vector<float> ep_norm_vector(Dimension());
    //     normalize_vector((float*)(p_data), ep_norm_vector.data(), Dimension());
    //     index_->addPoint((void*)(ep_norm_vector.data()), p_ids[0]);
    // #pragma omp parallel for
    //     for (int i = 1; i < rows; ++i) {
    //         std::vector<float> norm_vector(Dimension());
    //         normalize_vector((float*)(p_data + Dimension() * i), norm_vector.data(), Dimension());
    //         index_->addPoint((void*)(norm_vector.data()), p_ids[i]);
    //     }
    // } else {
    //     index_->addPoint((void*)(p_data), p_ids[0]);
    // #pragma omp parallel for
    //     for (int i = 1; i < rows; ++i) {
    //         index_->addPoint((void*)(p_data + Dimension() * i), p_ids[i]);
    //     }
    // }

    index_->addPoint((void*)(p_data), p_ids[0]);
#pragma omp parallel for
    for (int i = 1; i < rows; i++) {
        index_->addPoint((void*)(p_data + dim * i), p_ids[i]);
    for (int i = 1; i < rows; ++i) {
        index_->addPoint((void*)(p_data + Dimension() * i), p_ids[i]);
    }
}
@@ -56,6 +56,7 @@ class IndexHNSW : public VectorIndex {
    Dimension() override;

 private:
    bool normalize = false;
    std::mutex mutex_;
    std::shared_ptr<hnswlib::HierarchicalNSW<float>> index_;
};
@@ -9,10 +9,9 @@
// is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
// or implied. See the License for the specific language governing permissions and limitations under the License.

#include <faiss/AutoTune.h>
#include <faiss/IndexFlat.h>
#include <faiss/MetaIndexes.h>

#include <faiss/AutoTune.h>
#include <faiss/clone_index.h>
#include <faiss/index_factory.h>
#include <faiss/index_io.h>

@@ -78,7 +77,7 @@ IDMAP::Search(const DatasetPtr& dataset, const Config& config) {

void
IDMAP::search_impl(int64_t n, const float* data, int64_t k, float* distances, int64_t* labels, const Config& cfg) {
    index_->search(n, (float*)data, k, distances, labels);
    index_->search(n, (float*)data, k, distances, labels, bitset_);
}

void

@@ -101,7 +100,8 @@ IDMAP::AddWithoutId(const DatasetPtr& dataset, const Config& config) {
    }

    std::lock_guard<std::mutex> lk(mutex_);
    GETTENSOR(dataset)
    auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const float*>(meta::TENSOR);

    std::vector<int64_t> new_ids(rows);
    for (int i = 0; i < rows; ++i) {

@@ -185,4 +185,60 @@ IDMAP::Seal() {
    // do nothing
}

DatasetPtr
IDMAP::GetVectorById(const DatasetPtr& dataset, const Config& config) {
    if (!index_) {
        KNOWHERE_THROW_MSG("index not initialized");
    }
    // GETTENSOR(dataset)
    // auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);
    auto elems = dataset->Get<int64_t>(meta::DIM);

    size_t p_x_size = sizeof(float) * elems;
    auto p_x = (float*)malloc(p_x_size);

    index_->get_vector_by_id(1, p_data, p_x, bitset_);

    auto ret_ds = std::make_shared<Dataset>();
    ret_ds->Set(meta::TENSOR, p_x);
    return ret_ds;
}

DatasetPtr
IDMAP::SearchById(const DatasetPtr& dataset, const Config& config) {
    if (!index_) {
        KNOWHERE_THROW_MSG("index not initialized");
    }
    // GETTENSOR(dataset)
    auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);

    auto elems = rows * config->k;
    size_t p_id_size = sizeof(int64_t) * elems;
    size_t p_dist_size = sizeof(float) * elems;
    auto p_id = (int64_t*)malloc(p_id_size);
    auto p_dist = (float*)malloc(p_dist_size);

    // todo: enable search by id (zhiru)
    // auto blacklist = dataset->Get<faiss::ConcurrentBitsetPtr>("bitset");
    // index_->searchById(rows, (float*)p_data, config->k, p_dist, p_id, blacklist);
    index_->search_by_id(rows, p_data, config->k, p_dist, p_id, bitset_);

    auto ret_ds = std::make_shared<Dataset>();
    ret_ds->Set(meta::IDS, p_id);
    ret_ds->Set(meta::DISTANCE, p_dist);
    return ret_ds;
}
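The Get calls above pin down the Dataset layout SearchById expects: query ids under meta::IDS and the row count under meta::ROWS. A caller sketch (the meta keys and Set/Get helpers are taken from the code above; the index, config, and id values are placeholders):

// Sketch: query an IDMAP index by stored id instead of by raw vector.
auto ids = new int64_t[1]{42};  // placeholder id; ownership stays with the caller
auto query = std::make_shared<knowhere::Dataset>();
query->Set(knowhere::meta::ROWS, (int64_t)1);
query->Set(knowhere::meta::IDS, (const int64_t*)ids);

auto result = idmap->SearchById(query, config);  // config->k chooses top-k
auto res_ids = result->Get<int64_t*>(knowhere::meta::IDS);      // malloc'd by SearchById
auto res_dist = result->Get<float*>(knowhere::meta::DISTANCE);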
void
IDMAP::SetBlacklist(faiss::ConcurrentBitsetPtr list) {
    bitset_ = std::move(list);
}

void
IDMAP::GetBlacklist(faiss::ConcurrentBitsetPtr& list) {
    list = bitset_;
}

} // namespace knowhere
@@ -13,6 +13,7 @@

#include "IndexIVF.h"

#include <faiss/utils/ConcurrentBitset.h>
#include <memory>
#include <utility>

@@ -55,6 +56,7 @@ class IDMAP : public VectorIndex, public FaissBaseIndex {

    VectorIndexPtr
    CopyCpuToGpu(const int64_t& device_id, const Config& config);

    void
    Seal() override;

@@ -64,12 +66,27 @@ class IDMAP : public VectorIndex, public FaissBaseIndex {
    virtual const int64_t*
    GetRawIds();

    DatasetPtr
    GetVectorById(const DatasetPtr& dataset, const Config& config);

    DatasetPtr
    SearchById(const DatasetPtr& dataset, const Config& config);

    void
    SetBlacklist(faiss::ConcurrentBitsetPtr list);

    void
    GetBlacklist(faiss::ConcurrentBitsetPtr& list);

 protected:
    virtual void
    search_impl(int64_t n, const float* data, int64_t k, float* distances, int64_t* labels, const Config& cfg);

 protected:
    std::mutex mutex_;

 private:
    faiss::ConcurrentBitsetPtr bitset_ = nullptr;
};

using IDMAPPtr = std::shared_ptr<IDMAP>;
@@ -217,8 +217,10 @@ IVF::GenGraph(const float* data, const int64_t& k, Graph& graph, const Config& c
void
IVF::search_impl(int64_t n, const float* data, int64_t k, float* distances, int64_t* labels, const Config& cfg) {
    auto params = GenParams(cfg);
    auto ivf_index = dynamic_cast<faiss::IndexIVF*>(index_.get());
    ivf_index->nprobe = params->nprobe;
    stdclock::time_point before = stdclock::now();
    faiss::ivflib::search_with_parameters(index_.get(), n, (float*)data, k, distances, labels, params.get());
    ivf_index->search(n, (float*)data, k, distances, labels, bitset_);
    stdclock::time_point after = stdclock::now();
    double search_cost = (std::chrono::duration<double, std::micro>(after - before)).count();
    KNOWHERE_LOG_DEBUG << "IVF search cost: " << search_cost

@@ -271,6 +273,99 @@ IVF::Seal() {
    SealImpl();
}

DatasetPtr
IVF::GetVectorById(const DatasetPtr& dataset, const Config& config) {
    if (!index_ || !index_->is_trained) {
        KNOWHERE_THROW_MSG("index not initialized or trained");
    }

    auto search_cfg = std::dynamic_pointer_cast<IVFCfg>(config);
    if (search_cfg == nullptr) {
        KNOWHERE_THROW_MSG("not support this kind of config");
    }

    // auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);
    auto elems = dataset->Get<int64_t>(meta::DIM);

    try {
        size_t p_x_size = sizeof(float) * elems;
        auto p_x = (float*)malloc(p_x_size);

        auto index_ivf = std::static_pointer_cast<faiss::IndexIVF>(index_);
        index_ivf->get_vector_by_id(1, p_data, p_x, bitset_);

        auto ret_ds = std::make_shared<Dataset>();
        ret_ds->Set(meta::TENSOR, p_x);
        return ret_ds;
    } catch (faiss::FaissException& e) {
        KNOWHERE_THROW_MSG(e.what());
    } catch (std::exception& e) {
        KNOWHERE_THROW_MSG(e.what());
    }
}

DatasetPtr
IVF::SearchById(const DatasetPtr& dataset, const Config& config) {
    if (!index_ || !index_->is_trained) {
        KNOWHERE_THROW_MSG("index not initialized or trained");
    }

    auto search_cfg = std::dynamic_pointer_cast<IVFCfg>(config);
    if (search_cfg == nullptr) {
        KNOWHERE_THROW_MSG("not support this kind of config");
    }

    auto rows = dataset->Get<int64_t>(meta::ROWS);
    auto p_data = dataset->Get<const int64_t*>(meta::IDS);

    try {
        auto elems = rows * search_cfg->k;

        size_t p_id_size = sizeof(int64_t) * elems;
        size_t p_dist_size = sizeof(float) * elems;
        auto p_id = (int64_t*)malloc(p_id_size);
        auto p_dist = (float*)malloc(p_dist_size);

        // todo: enable search by id (zhiru)
        // auto blacklist = dataset->Get<faiss::ConcurrentBitsetPtr>("bitset");
        auto index_ivf = std::static_pointer_cast<faiss::IndexIVF>(index_);
        index_ivf->search_by_id(rows, p_data, search_cfg->k, p_dist, p_id, bitset_);

        // std::stringstream ss_res_id, ss_res_dist;
        // for (int i = 0; i < 10; ++i) {
        //     printf("%llu", res_ids[i]);
        //     printf("\n");
        //     printf("%.6f", res_dis[i]);
        //     printf("\n");
        //     ss_res_id << res_ids[i] << " ";
        //     ss_res_dist << res_dis[i] << " ";
        // }
        // std::cout << std::endl << "after search: " << std::endl;
        // std::cout << ss_res_id.str() << std::endl;
        // std::cout << ss_res_dist.str() << std::endl << std::endl;

        auto ret_ds = std::make_shared<Dataset>();
        ret_ds->Set(meta::IDS, p_id);
        ret_ds->Set(meta::DISTANCE, p_dist);
        return ret_ds;
    } catch (faiss::FaissException& e) {
        KNOWHERE_THROW_MSG(e.what());
    } catch (std::exception& e) {
        KNOWHERE_THROW_MSG(e.what());
    }
}

void
IVF::SetBlacklist(faiss::ConcurrentBitsetPtr list) {
    bitset_ = std::move(list);
}

void
IVF::GetBlacklist(faiss::ConcurrentBitsetPtr& list) {
    list = bitset_;
}

IVFIndexModel::IVFIndexModel(std::shared_ptr<faiss::Index> index) : FaissBaseIndex(std::move(index)) {
}
@@ -19,6 +19,7 @@
 #include "FaissBaseIndex.h"
 #include "VectorIndex.h"
 #include "faiss/IndexIVF.h"
+#include "faiss/utils/ConcurrentBitset.h"

 namespace knowhere {

@@ -71,6 +72,18 @@ class IVF : public VectorIndex, public FaissBaseIndex {
     virtual VectorIndexPtr
     CopyCpuToGpu(const int64_t& device_id, const Config& config);

+    DatasetPtr
+    GetVectorById(const DatasetPtr& dataset, const Config& config) override;
+
+    DatasetPtr
+    SearchById(const DatasetPtr& dataset, const Config& config) override;
+
+    void
+    SetBlacklist(faiss::ConcurrentBitsetPtr list);
+
+    void
+    GetBlacklist(faiss::ConcurrentBitsetPtr& list);
+
 protected:
     virtual std::shared_ptr<faiss::IVFSearchParameters>
     GenParams(const Config& config);

@@ -83,6 +96,9 @@ class IVF : public VectorIndex, public FaissBaseIndex {

 protected:
     std::mutex mutex_;
+
+ private:
+    faiss::ConcurrentBitsetPtr bitset_ = nullptr;
 };

 using IVFIndexPtr = std::shared_ptr<IVF>;
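For context, a sketch of the blacklist lifecycle this header introduces. It assumes this fork's faiss::ConcurrentBitset takes a capacity in its constructor and exposes set() alongside the test() the scanners call; both are assumptions, only test() appears in this diff.

#include <memory>
#include "faiss/utils/ConcurrentBitset.h"
#include "knowhere/index/vector_index/IVF.h"

// Sketch: mark ids 3 and 7 as deleted for an index holding `ntotal` vectors.
void InstallBlacklist(knowhere::IVF& index, size_t ntotal) {
    auto bitset = std::make_shared<faiss::ConcurrentBitset>(ntotal);
    bitset->set(3);   // a set bit means "treat this id as deleted"
    bitset->set(7);
    index.SetBlacklist(bitset);   // searches through bitset_ now skip ids 3 and 7
}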
@@ -12,12 +12,14 @@
 #pragma once

 #include <memory>
+#include <vector>

 #include "knowhere/common/Config.h"
 #include "knowhere/common/Dataset.h"
 #include "knowhere/index/Index.h"
 #include "knowhere/index/preprocessor/Preprocessor.h"
 #include "knowhere/index/vector_index/helpers/IndexParameter.h"
+#include "segment/Types.h"

 namespace knowhere {

@@ -36,6 +38,16 @@ class VectorIndex : public Index {
         return nullptr;
     }

+    virtual DatasetPtr
+    GetVectorById(const DatasetPtr& dataset, const Config& config) {
+        return nullptr;
+    }
+
+    virtual DatasetPtr
+    SearchById(const DatasetPtr& dataset, const Config& config) {
+        return nullptr;
+    }
+
     virtual void
     Add(const DatasetPtr& dataset, const Config& config) = 0;

@@ -51,6 +63,20 @@ class VectorIndex : public Index {

     virtual int64_t
     Dimension() = 0;

+    virtual const std::vector<milvus::segment::doc_id_t>&
+    GetUids() const {
+        return uids_;
+    }
+
+    virtual void
+    SetUids(std::vector<milvus::segment::doc_id_t>& uids) {
+        uids_.clear();
+        uids_.swap(uids);
+    }
+
+ private:
+    std::vector<milvus::segment::doc_id_t> uids_;
 };

 } // namespace knowhere
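A note on SetUids above: it deliberately takes the caller's vector by non-const reference and swaps it in, so the id mapping is adopted without a copy and the caller's vector is left empty. A tiny standalone illustration of that contract, using plain std::vector rather than the knowhere types:

#include <cassert>
#include <cstdint>
#include <vector>

int main() {
    std::vector<int64_t> uids = {10, 11, 12};

    // Same pattern as VectorIndex::SetUids: clear + swap adopts the buffer.
    std::vector<int64_t> member;
    member.clear();
    member.swap(uids);

    assert(member.size() == 3);   // index now owns the id mapping
    assert(uids.empty());         // caller's vector was emptied, not copied
    return 0;
}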
@@ -9,8 +9,6 @@
 // is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
 // or implied. See the License for the specific language governing permissions and limitations under the License.

 #pragma once

 #include <immintrin.h>

 #include "knowhere/index/vector_index/nsg/Distance.h"
@@ -41,13 +41,19 @@ void Index::assign (idx_t n, const float * x, idx_t * labels, idx_t k)
   search (n, x, k, distances, labels);
 }

-void Index::add_with_ids(
-    idx_t /*n*/,
-    const float* /*x*/,
-    const idx_t* /*xids*/) {
+void Index::add_with_ids(idx_t n, const float* x, const idx_t* xids) {
   FAISS_THROW_MSG ("add_with_ids not implemented for this type of index");
 }

+void Index::get_vector_by_id (idx_t n, const idx_t *xid, float *x, ConcurrentBitsetPtr bitset) {
+  FAISS_THROW_MSG ("get_vector_by_id not implemented for this type of index");
+}
+
+void Index::search_by_id (idx_t n, const idx_t *xid, idx_t k, float *distances, idx_t *labels,
+                          ConcurrentBitsetPtr bitset) {
+  FAISS_THROW_MSG ("search_by_id not implemented for this type of index");
+}
+
 size_t Index::remove_ids(const IDSelector& /*sel*/) {
   FAISS_THROW_MSG ("remove_ids not implemented for this type of index");
   return -1;
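Since the base-class defaults above simply throw, only index types that override get_vector_by_id / search_by_id (the IVF variants in this change) support id-based access; code probing an arbitrary faiss::Index has to be ready for FaissException. A hedged sketch, with header paths following the faiss 1.6 layout:

#include <faiss/Index.h>
#include <faiss/impl/FaissException.h>
#include <vector>

// Sketch: returns false if this index type has no id-based reconstruction.
bool TryGetVector(faiss::Index* index, faiss::Index::idx_t id, std::vector<float>& out) {
    out.resize(index->d);
    try {
        index->get_vector_by_id(1, &id, out.data(), nullptr);  // nullptr: no blacklist
        return true;
    } catch (const faiss::FaissException&) {
        return false;  // base default: "not implemented for this type of index"
    }
}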
@@ -16,6 +16,8 @@
 #include <string>
 #include <sstream>

+#include <faiss/utils/ConcurrentBitset.h>
+
 #define FAISS_VERSION_MAJOR 1
 #define FAISS_VERSION_MINOR 6
 #define FAISS_VERSION_PATCH 0

@@ -132,9 +134,34 @@ struct Index {
      * @param x           input vectors to search, size n * d
      * @param labels      output labels of the NNs, size n*k
      * @param distances   output pairwise distances, size n*k
+     * @param bitset      flags to check the validity of vectors
      */
-    virtual void search (idx_t n, const float *x, idx_t k,
-                         float *distances, idx_t *labels) const = 0;
+    virtual void search (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels,
+                         ConcurrentBitsetPtr bitset = nullptr) const = 0;

+    /** query n raw vectors from the index by ids.
+     *
+     * return n raw vectors.
+     *
+     * @param n           input num of xid
+     * @param xid         input labels of the NNs, size n
+     * @param x           output raw vectors, size n * d
+     * @param bitset      flags to check the validity of vectors
+     */
+    virtual void get_vector_by_id (idx_t n, const idx_t *xid, float *x, ConcurrentBitsetPtr bitset = nullptr);
+
+    /** query n vectors of dimension d to the index by ids.
+     *
+     * return at most k vectors. If there are not enough results for a
+     * query, the result array is padded with -1s.
+     *
+     * @param xid         input ids to search, size n
+     * @param labels      output labels of the NNs, size n*k
+     * @param distances   output pairwise distances, size n*k
+     * @param bitset      flags to check the validity of vectors
+     */
+    virtual void search_by_id (idx_t n, const idx_t *xid, idx_t k, float *distances, idx_t *labels,
+                               ConcurrentBitsetPtr bitset = nullptr);

     /** query n vectors of dimension d to the index.
      *
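Because the new bitset parameter defaults to nullptr, every pre-existing call site keeps compiling and behaving as before; filtering is strictly opt-in. A minimal sketch of both call shapes (faiss::IndexFlatL2 stands in for any concrete index; the bitset overload exists only in this fork):

#include <faiss/IndexFlat.h>
#include <faiss/utils/ConcurrentBitset.h>
#include <vector>

void SearchBothWays(faiss::IndexFlatL2& index, const float* queries, int64_t nq, int64_t k,
                    faiss::ConcurrentBitsetPtr deleted /* may be nullptr */) {
    std::vector<float> dist(nq * k);
    std::vector<faiss::Index::idx_t> labels(nq * k);

    // Pre-existing call sites: unchanged, bitset defaults to nullptr.
    index.search(nq, queries, k, dist.data(), labels.data());

    // Opt-in filtering: ids whose bit is set are skipped as deleted.
    index.search(nq, queries, k, dist.data(), labels.data(), deleted);
}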
@@ -166,7 +166,8 @@ void Index2Layer::search(
     const float* /*x*/,
     idx_t /*k*/,
     float* /*distances*/,
-    idx_t* /*labels*/) const {
+    idx_t* /*labels*/,
+    ConcurrentBitsetPtr bitset) const {
   FAISS_THROW_MSG("not implemented");
 }
@@ -60,7 +60,8 @@ struct Index2Layer: Index {
     const float* x,
     idx_t k,
     float* distances,
-    idx_t* labels) const override;
+    idx_t* labels,
+    ConcurrentBitsetPtr bitset = nullptr) const override;

   void reconstruct_n(idx_t i0, idx_t ni, float* recons) const override;
@@ -35,6 +35,15 @@ void IndexBinary::add_with_ids(idx_t, const uint8_t *, const idx_t *) {
   FAISS_THROW_MSG("add_with_ids not implemented for this type of index");
 }

+void IndexBinary::get_vector_by_id (idx_t n, const idx_t *xid, uint8_t *x, ConcurrentBitsetPtr bitset) {
+  FAISS_THROW_MSG("get_vector_by_id not implemented for this type of index");
+}
+
+void IndexBinary::search_by_id (idx_t n, const idx_t *xid, idx_t k, int32_t *distances, idx_t *labels,
+                                ConcurrentBitsetPtr bitset) {
+  FAISS_THROW_MSG("search_by_id not implemented for this type of index");
+}
+
 size_t IndexBinary::remove_ids(const IDSelector&) {
   FAISS_THROW_MSG("remove_ids not implemented for this type of index");
   return 0;
@@ -93,11 +93,37 @@ struct IndexBinary {
      * @param x           input vectors to search, size n * d / 8
      * @param labels      output labels of the NNs, size n*k
      * @param distances   output pairwise distances, size n*k
+     * @param bitset      flags to check the validity of vectors
      */
-    virtual void search(idx_t n, const uint8_t *x, idx_t k,
-                        int32_t *distances, idx_t *labels) const = 0;
+    virtual void search (idx_t n, const uint8_t *x, idx_t k, int32_t *distances, idx_t *labels,
+                         ConcurrentBitsetPtr bitset = nullptr) const = 0;

+    /** query n raw vectors from the index by ids.
+     *
+     * return n raw vectors.
+     *
+     * @param n           input num of xid
+     * @param xid         input labels of the NNs, size n
+     * @param x           output raw vectors, size n * d
+     * @param bitset      flags to check the validity of vectors
+     */
+    virtual void get_vector_by_id (idx_t n, const idx_t *xid, uint8_t *x, ConcurrentBitsetPtr bitset = nullptr);
+
+    /** query n vectors of dimension d to the index by ids.
+     *
+     * return at most k vectors. If there are not enough results for a
+     * query, the result array is padded with -1s.
+     *
+     * @param xid         input ids to search, size n
+     * @param labels      output labels of the NNs, size n*k
+     * @param distances   output pairwise distances, size n*k
+     * @param bitset      flags to check the validity of vectors
+     */
+    virtual void search_by_id (idx_t n, const idx_t *xid, idx_t k, int32_t *distances, idx_t *labels,
+                               ConcurrentBitsetPtr bitset = nullptr);
+
     /** Query n vectors of dimension d to the index.
      *
      * return all vectors with distance < radius. Note that many
      * indexes do not implement the range_search (only the k-NN search
@@ -39,57 +39,57 @@ void IndexBinaryFlat::reset() {
 }

 void IndexBinaryFlat::search(idx_t n, const uint8_t *x, idx_t k,
-                             int32_t *distances, idx_t *labels) const {
+                             int32_t *distances, idx_t *labels, ConcurrentBitsetPtr bitset) const {
   const idx_t block_size = query_batch_size;
   if (metric_type == METRIC_Jaccard || metric_type == METRIC_Tanimoto) {
     float *D = new float[k * n];
     for (idx_t s = 0; s < n; s += block_size) {
       idx_t nn = block_size;
       if (s + block_size > n) {
         nn = n - s;
       }

       if (use_heap) {
         // We see the distances and labels as heaps.

         float_maxheap_array_t res = {
             size_t(nn), size_t(k), labels + s * k, D + s * k
         };

         jaccard_knn_hc(&res, x + s * code_size, xb.data(), ntotal, code_size,
-                       /* ordered = */ true);
+                       /* ordered = */ true, bitset);
       } else {
         FAISS_THROW_MSG("tanimoto_knn_mc not implemented");
       }
     }
     if (metric_type == METRIC_Tanimoto) {
       for (int i = 0; i < k * n; i++) {
         D[i] = -log2(1-D[i]);
       }
     }
     memcpy(distances, D, sizeof(float) * n * k);
     delete [] D;
   } else {
     for (idx_t s = 0; s < n; s += block_size) {
       idx_t nn = block_size;
       if (s + block_size > n) {
         nn = n - s;
       }
       if (use_heap) {
         // We see the distances and labels as heaps.
         int_maxheap_array_t res = {
             size_t(nn), size_t(k), labels + s * k, distances + s * k
         };

         hammings_knn_hc(&res, x + s * code_size, xb.data(), ntotal, code_size,
-                        /* ordered = */ true);
+                        /* ordered = */ true, bitset);
       } else {
         hammings_knn_mc(x + s * code_size, xb.data(), nn, ntotal, k, code_size,
-                        distances + s * k, labels + s * k);
+                        distances + s * k, labels + s * k, bitset);
       }
     }
   }
 }

 size_t IndexBinaryFlat::remove_ids(const IDSelector& sel) {
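The post-processing loop above is the standard conversion between the two binary metrics: the kernels compute Jaccard distance D_J first, and Tanimoto distance is then derived as D_T = -log2(1 - D_J), exactly the D[i] = -log2(1-D[i]) line. A tiny self-contained check of that identity:

#include <cassert>
#include <cmath>
#include <cstdio>

int main() {
    // Two bit patterns with |a & b| = 1 and |a | b| = 3:
    // Jaccard similarity 1/3, so Jaccard distance D_J = 2/3.
    double dj = 1.0 - 1.0 / 3.0;
    double dt = -log2(1.0 - dj);   // same formula as D[i] = -log2(1-D[i]) above
    printf("Jaccard distance %.4f -> Tanimoto distance %.4f\n", dj, dt);
    assert(dt > dj);               // Tanimoto stretches distances toward infinity as D_J -> 1
    return 0;
}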
@@ -37,8 +37,8 @@ struct IndexBinaryFlat : IndexBinary {

   void reset() override;

-  void search(idx_t n, const uint8_t *x, idx_t k,
-              int32_t *distances, idx_t *labels) const override;
+  void search (idx_t n, const uint8_t *x, idx_t k,
+               int32_t *distances, idx_t *labels, ConcurrentBitsetPtr bitset = nullptr) const override;

   void reconstruct(idx_t key, uint8_t *recons) const override;
@@ -50,7 +50,7 @@ void IndexBinaryFromFloat::reset() {
 }

 void IndexBinaryFromFloat::search(idx_t n, const uint8_t *x, idx_t k,
-                                  int32_t *distances, idx_t *labels) const {
+                                  int32_t *distances, idx_t *labels, ConcurrentBitsetPtr bitset) const {
   constexpr idx_t bs = 32768;
   std::unique_ptr<float[]> xf(new float[bs * d]);
   std::unique_ptr<float[]> df(new float[bs * k]);
@@ -41,7 +41,7 @@ struct IndexBinaryFromFloat : IndexBinary {
   void reset() override;

   void search(idx_t n, const uint8_t *x, idx_t k,
-              int32_t *distances, idx_t *labels) const override;
+              int32_t *distances, idx_t *labels, ConcurrentBitsetPtr bitset = nullptr) const override;

   void train(idx_t n, const uint8_t *x) override;
 };
@@ -196,7 +196,7 @@ void IndexBinaryHNSW::train(idx_t n, const uint8_t *x)
 }

 void IndexBinaryHNSW::search(idx_t n, const uint8_t *x, idx_t k,
-                             int32_t *distances, idx_t *labels) const
+                             int32_t *distances, idx_t *labels, ConcurrentBitsetPtr bitset) const
 {
 #pragma omp parallel
   {
@@ -45,7 +45,7 @@ struct IndexBinaryHNSW : IndexBinary {

   /// entry point for search
   void search(idx_t n, const uint8_t *x, idx_t k,
-              int32_t *distances, idx_t *labels) const override;
+              int32_t *distances, idx_t *labels, ConcurrentBitsetPtr bitset = nullptr) const override;

   void reconstruct(idx_t key, uint8_t* recons) const override;
@@ -146,8 +146,8 @@ void IndexBinaryIVF::make_direct_map(bool new_maintain_direct_map) {
   maintain_direct_map = new_maintain_direct_map;
 }

-void IndexBinaryIVF::search(idx_t n, const uint8_t *x, idx_t k,
-                            int32_t *distances, idx_t *labels) const {
+void IndexBinaryIVF::search(idx_t n, const uint8_t *x, idx_t k, int32_t *distances, idx_t *labels,
+                            ConcurrentBitsetPtr bitset) const {
   std::unique_ptr<idx_t[]> idx(new idx_t[n * nprobe]);
   std::unique_ptr<int32_t[]> coarse_dis(new int32_t[n * nprobe]);

@@ -159,10 +159,40 @@ void IndexBinaryIVF::search(idx_t n, const uint8_t *x, idx_t k,
   invlists->prefetch_lists(idx.get(), n * nprobe);

   search_preassigned(n, x, k, idx.get(), coarse_dis.get(),
-                     distances, labels, false);
+                     distances, labels, false, nullptr, bitset);
   indexIVF_stats.search_time += getmillisecs() - t0;
 }

+void IndexBinaryIVF::get_vector_by_id(idx_t n, const idx_t *xid, uint8_t *x, ConcurrentBitsetPtr bitset) {
+  if (!maintain_direct_map) {
+    make_direct_map(true);
+  }
+
+  /* only get vector by 1 id */
+  FAISS_ASSERT(n == 1);
+  if (!bitset || !bitset->test(xid[0])) {
+    reconstruct(xid[0], x + 0 * d);
+  } else {
+    memset(x, UINT8_MAX, d * sizeof(uint8_t));
+  }
+}
+
+void IndexBinaryIVF::search_by_id (idx_t n, const idx_t *xid, idx_t k, int32_t *distances, idx_t *labels,
+                                   ConcurrentBitsetPtr bitset) {
+  if (!maintain_direct_map) {
+    make_direct_map(true);
+  }
+
+  auto x = new uint8_t[n * d];
+  for (idx_t i = 0; i < n; ++i) {
+    reconstruct(xid[i], x + i * d);
+  }
+
+  search(n, x, k, distances, labels, bitset);
+  delete []x;
+}
+
 void IndexBinaryIVF::reconstruct(idx_t key, uint8_t *recons) const {
   FAISS_THROW_IF_NOT_MSG(direct_map.size() == ntotal,
                          "direct map is not initialized");
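Two caveats of get_vector_by_id above are worth noting: it asserts n == 1, so ids must be fetched one at a time, and a blacklisted id yields an output buffer memset to UINT8_MAX rather than an error. A hedged caller sketch honoring that contract; the buffer is sized d bytes to match the memset above, even though a reconstructed binary code itself occupies d / 8 bytes:

#include <cstdint>
#include <vector>
#include <faiss/IndexBinaryIVF.h>
#include <faiss/utils/ConcurrentBitset.h>

// Sketch: fetch one raw binary vector by id; an all-0xFF result signals
// that the id was blacklisted (sentinel written by the index).
std::vector<uint8_t> FetchBinaryVector(faiss::IndexBinaryIVF& index,
                                       faiss::Index::idx_t id,
                                       faiss::ConcurrentBitsetPtr blacklist) {
    std::vector<uint8_t> buf(index.d);
    index.get_vector_by_id(1, &id, buf.data(), blacklist);   // n must be 1
    return buf;
}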
@@ -376,18 +406,22 @@ struct IVFBinaryScannerL2: BinaryInvertedListScanner {
                        const uint8_t *codes,
                        const idx_t *ids,
                        int32_t *simi, idx_t *idxi,
-                       size_t k) const override
+                       size_t k,
+                       ConcurrentBitsetPtr bitset) const override
     {
         using C = CMax<int32_t, idx_t>;

         size_t nup = 0;
         for (size_t j = 0; j < n; j++) {
-            uint32_t dis = hc.hamming (codes);
-            if (dis < simi[0]) {
-                heap_pop<C> (k, simi, idxi);
-                idx_t id = store_pairs ? (list_no << 32 | j) : ids[j];
-                heap_push<C> (k, simi, idxi, dis, id);
-                nup++;
+            if (!bitset || !bitset->test(ids[j])) {
+                uint32_t dis = hc.hamming (codes);
+
+                if (dis < simi[0]) {
+                    heap_pop<C> (k, simi, idxi);
+                    idx_t id = store_pairs ? (list_no << 32 | j) : ids[j];
+                    heap_push<C> (k, simi, idxi, dis, id);
+                    nup++;
+                }
             }
             codes += code_size;
         }

@@ -422,18 +456,22 @@ struct IVFBinaryScannerJaccard: BinaryInvertedListScanner {
                        const uint8_t *codes,
                        const idx_t *ids,
                        int32_t *simi, idx_t *idxi,
-                       size_t k) const override
+                       size_t k,
+                       ConcurrentBitsetPtr bitset = nullptr) const override
     {
         using C = CMax<float, idx_t>;
         float* psimi = (float*)simi;
         size_t nup = 0;
         for (size_t j = 0; j < n; j++) {
-            float dis = hc.jaccard (codes);
-            if (dis < psimi[0]) {
-                heap_pop<C> (k, psimi, idxi);
-                idx_t id = store_pairs ? (list_no << 32 | j) : ids[j];
-                heap_push<C> (k, psimi, idxi, dis, id);
-                nup++;
+            if(!bitset || !bitset->test(ids[j])){
+                float dis = hc.jaccard (codes);
+
+                if (dis < psimi[0]) {
+                    heap_pop<C> (k, psimi, idxi);
+                    idx_t id = store_pairs ? (list_no << 32 | j) : ids[j];
+                    heap_push<C> (k, psimi, idxi, dis, id);
+                    nup++;
+                }
             }
             codes += code_size;
         }
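Both scanners share the same guard: a null bitset means no filtering at all, and a set bit means the id is excluded before its distance is even computed. The idiom factors out naturally; a minimal standalone illustration, with a std::function standing in for ConcurrentBitset::test:

#include <cstdint>
#include <functional>
#include <vector>

// Stand-in for the scanners' `if (!bitset || !bitset->test(ids[j]))` guard.
std::vector<int64_t> FilterIds(const std::vector<int64_t>& ids,
                               const std::function<bool(int64_t)>& test) {
    std::vector<int64_t> kept;
    for (int64_t id : ids) {
        if (!test || !test(id)) {   // null predicate: keep everything
            kept.push_back(id);     // unset bit: id is live, scan it
        }
    }
    return kept;
}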
@@ -496,7 +534,8 @@ void search_knn_hamming_heap(const IndexBinaryIVF& ivf,
                              const int32_t * coarse_dis,
                              int32_t *distances, idx_t *labels,
                              bool store_pairs,
-                             const IVFSearchParameters *params)
+                             const IVFSearchParameters *params,
+                             ConcurrentBitsetPtr bitset = nullptr)
 {
   long nprobe = params ? params->nprobe : ivf.nprobe;
   long max_codes = params ? params->max_codes : ivf.max_codes;

@@ -556,7 +595,7 @@ void search_knn_hamming_heap(const IndexBinaryIVF& ivf,
       }

       nheap += scanner->scan_codes (list_size, scodes.get(),
-                                    ids, simi, idxi, k);
+                                    ids, simi, idxi, k, bitset);

       nscan += list_size;
       if (max_codes && nscan >= max_codes)

@@ -588,7 +627,8 @@ void search_knn_jaccard_heap(const IndexBinaryIVF& ivf,
                             const float * coarse_dis,
                             float *distances, idx_t *labels,
                             bool store_pairs,
-                            const IVFSearchParameters *params)
+                            const IVFSearchParameters *params,
+                            ConcurrentBitsetPtr bitset = nullptr)
 {
   long nprobe = params ? params->nprobe : ivf.nprobe;
   long max_codes = params ? params->max_codes : ivf.max_codes;

@@ -643,7 +683,7 @@ void search_knn_jaccard_heap(const IndexBinaryIVF& ivf,
       }

       nheap += scanner->scan_codes (list_size, scodes.get(),
-                                    ids, (int32_t*)simi, idxi, k);
+                                    ids, (int32_t*)simi, idxi, k, bitset);

       nscan += list_size;
       if (max_codes && nscan >= max_codes)

@@ -671,7 +711,8 @@ void search_knn_hamming_count(const IndexBinaryIVF& ivf,
                               int k,
                               int32_t *distances,
                               idx_t *labels,
-                              const IVFSearchParameters *params) {
+                              const IVFSearchParameters *params,
+                              ConcurrentBitsetPtr bitset = nullptr) {
   const int nBuckets = ivf.d + 1;
   std::vector<int> all_counters(nx * nBuckets, 0);
   std::unique_ptr<idx_t[]> all_ids_per_dis(new idx_t[nx * nBuckets * k]);

@@ -719,10 +760,12 @@
                        : ivf.invlists->get_ids(key);

     for (size_t j = 0; j < list_size; j++) {
-      const uint8_t * yj = list_vecs + ivf.code_size * j;
+      if(!bitset || !bitset->test(ids[j])){
+        const uint8_t * yj = list_vecs + ivf.code_size * j;

-      idx_t id = store_pairs ? (key << 32 | j) : ids[j];
-      csi.update_counter(yj, id);
+        idx_t id = store_pairs ? (key << 32 | j) : ids[j];
+        csi.update_counter(yj, id);
+      }
     }
     if (ids)
       ivf.invlists->release_ids (key, ids);

@@ -764,12 +807,13 @@ void search_knn_hamming_count_1 (
     int k,
     int32_t *distances,
     idx_t *labels,
-    const IVFSearchParameters *params) {
+    const IVFSearchParameters *params,
+    ConcurrentBitsetPtr bitset = nullptr) {
   switch (ivf.code_size) {
 #define HANDLE_CS(cs)                                              \
   case cs:                                                         \
     search_knn_hamming_count<HammingComputer ## cs, store_pairs>(  \
-        ivf, nx, x, keys, k, distances, labels, params);           \
+        ivf, nx, x, keys, k, distances, labels, params, bitset);   \
     break;
     HANDLE_CS(4);
     HANDLE_CS(8);

@@ -781,13 +825,13 @@ void search_knn_hamming_count_1 (
     default:
       if (ivf.code_size % 8 == 0) {
         search_knn_hamming_count<HammingComputerM8, store_pairs>
-          (ivf, nx, x, keys, k, distances, labels, params);
+          (ivf, nx, x, keys, k, distances, labels, params, bitset);
       } else if (ivf.code_size % 4 == 0) {
         search_knn_hamming_count<HammingComputerM4, store_pairs>
-          (ivf, nx, x, keys, k, distances, labels, params);
+          (ivf, nx, x, keys, k, distances, labels, params, bitset);
       } else {
         search_knn_hamming_count<HammingComputerDefault, store_pairs>
-          (ivf, nx, x, keys, k, distances, labels, params);
+          (ivf, nx, x, keys, k, distances, labels, params, bitset);
       }
       break;
   }
@@ -821,7 +865,8 @@ void IndexBinaryIVF::search_preassigned(idx_t n, const uint8_t *x, idx_t k,
                                         const int32_t * coarse_dis,
                                         int32_t *distances, idx_t *labels,
                                         bool store_pairs,
-                                        const IVFSearchParameters *params
+                                        const IVFSearchParameters *params,
+                                        ConcurrentBitsetPtr bitset
                                         ) const {

   if (metric_type == METRIC_Jaccard || metric_type == METRIC_Tanimoto) {

@@ -831,7 +876,7 @@
     memcpy(c_dis, coarse_dis, sizeof(float) * n * nprobe);
     search_knn_jaccard_heap (*this, n, x, k, idx, c_dis ,
                              D, labels, store_pairs,
-                             params);
+                             params, bitset);
     if (metric_type == METRIC_Tanimoto) {
       for (int i = 0; i < k * n; i++) {
         D[i] = -log2(1-D[i]);

@@ -847,14 +892,14 @@
   if (use_heap) {
     search_knn_hamming_heap (*this, n, x, k, idx, coarse_dis,
                              distances, labels, store_pairs,
-                             params);
+                             params, bitset);
   } else {
     if (store_pairs) {
       search_knn_hamming_count_1<true>
-        (*this, n, x, idx, k, distances, labels, params);
+        (*this, n, x, idx, k, distances, labels, params, bitset);
     } else {
       search_knn_hamming_count_1<false>
-        (*this, n, x, idx, k, distances, labels, params);
+        (*this, n, x, idx, k, distances, labels, params, bitset);
     }
   }
 }
@@ -105,7 +105,8 @@ struct IndexBinaryIVF : IndexBinary {
                           const int32_t *centroid_dis,
                           int32_t *distances, idx_t *labels,
                           bool store_pairs,
-                          const IVFSearchParameters *params=nullptr
+                          const IVFSearchParameters *params=nullptr,
+                          ConcurrentBitsetPtr bitset = nullptr
                           ) const;

   virtual BinaryInvertedListScanner *get_InvertedListScanner (

@@ -115,8 +116,14 @@
       bool store_pairs=false) const;

   /** assign the vectors, then call search_preassign */
-  virtual void search(idx_t n, const uint8_t *x, idx_t k,
-                      int32_t *distances, idx_t *labels) const override;
+  void search(idx_t n, const uint8_t *x, idx_t k, int32_t *distances, idx_t *labels,
+              ConcurrentBitsetPtr bitset = nullptr) const override;
+
+  /** get raw vectors by ids */
+  void get_vector_by_id(idx_t n, const idx_t *xid, uint8_t *x, ConcurrentBitsetPtr bitset = nullptr) override;
+
+  void search_by_id (idx_t n, const idx_t *xid, idx_t k, int32_t *distances, idx_t *labels,
+                     ConcurrentBitsetPtr bitset = nullptr) override;

   void reconstruct(idx_t key, uint8_t *recons) const override;

@@ -204,7 +211,8 @@ struct BinaryInvertedListScanner {
                              const uint8_t *codes,
                              const idx_t *ids,
                              int32_t *distances, idx_t *labels,
-                             size_t k) const = 0;
+                             size_t k,
+                             ConcurrentBitsetPtr bitset = nullptr) const = 0;

   virtual ~BinaryInvertedListScanner () {}
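Note that search_by_id is implemented above as reconstruct-then-search: it forces a direct map and fills an n * d scratch buffer of query codes before running the normal filtered search. A hedged usage sketch (the search_by_id overload and ConcurrentBitsetPtr exist only in this fork):

#include <cstdint>
#include <vector>
#include <faiss/IndexBinaryIVF.h>
#include <faiss/utils/ConcurrentBitset.h>

// Sketch: id-based kNN on a binary IVF index. The first call may pay the
// one-time cost of make_direct_map(true).
void KnnOfExistingVectors(faiss::IndexBinaryIVF& index,
                          std::vector<faiss::Index::idx_t>& query_ids,
                          int64_t k, faiss::ConcurrentBitsetPtr blacklist) {
    const int64_t n = (int64_t)query_ids.size();
    std::vector<int32_t> distances(n * k);
    std::vector<faiss::Index::idx_t> labels(n * k);

    // Blacklisted ids are skipped by the inverted-list scanners above.
    index.search_by_id(n, query_ids.data(), k, distances.data(), labels.data(), blacklist);
}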
@@ -38,30 +38,29 @@ void IndexFlat::reset() {
   ntotal = 0;
 }

-void IndexFlat::search (idx_t n, const float *x, idx_t k,
-                        float *distances, idx_t *labels) const
+void IndexFlat::search(idx_t n, const float* x, idx_t k, float* distances, idx_t* labels,
+                       ConcurrentBitsetPtr bitset) const
 {
   // we see the distances and labels as heaps

   if (metric_type == METRIC_INNER_PRODUCT) {
     float_minheap_array_t res = {
         size_t(n), size_t(k), labels, distances};
-    knn_inner_product (x, xb.data(), d, n, ntotal, &res);
+    knn_inner_product (x, xb.data(), d, n, ntotal, &res, bitset);
   } else if (metric_type == METRIC_L2) {
     float_maxheap_array_t res = {
         size_t(n), size_t(k), labels, distances};
-    knn_L2sqr (x, xb.data(), d, n, ntotal, &res);
+    knn_L2sqr (x, xb.data(), d, n, ntotal, &res, bitset);
   } else if (metric_type == METRIC_Jaccard) {
     float_maxheap_array_t res = {
         size_t(n), size_t(k), labels, distances};
-    knn_jaccard (x, xb.data(), d, n, ntotal, &res);
+    knn_jaccard (x, xb.data(), d, n, ntotal, &res, bitset);
   } else {
     float_maxheap_array_t res = {
         size_t(n), size_t(k), labels, distances};
     knn_extra_metrics (x, xb.data(), d, n, ntotal,
                        metric_type, metric_arg,
-                       &res);
+                       &res, bitset);
   }
 }

@@ -245,7 +244,8 @@ void IndexFlatL2BaseShift::search (
     const float *x,
     idx_t k,
     float *distances,
-    idx_t *labels) const
+    idx_t *labels,
+    ConcurrentBitsetPtr bitset) const
 {
   FAISS_THROW_IF_NOT (shift.size() == ntotal);

@@ -328,7 +328,8 @@ static void reorder_2_heaps (

 void IndexRefineFlat::search (
     idx_t n, const float *x, idx_t k,
-    float *distances, idx_t *labels) const
+    float *distances, idx_t *labels,
+    ConcurrentBitsetPtr bitset) const
 {
   FAISS_THROW_IF_NOT (is_trained);
   idx_t k_base = idx_t (k * k_factor);

@@ -421,7 +422,8 @@ void IndexFlat1D::search (
     const float *x,
     idx_t k,
     float *distances,
-    idx_t *labels) const
+    idx_t *labels,
+    ConcurrentBitsetPtr bitset) const
 {
   FAISS_THROW_IF_NOT_MSG (perm.size() == ntotal,
                           "Call update_permutation before search");
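The brute-force path above is the easiest place to see the feature end to end: a flat index, a blacklist bitset, and one filtered search. A hedged sketch against this fork's API (upstream faiss's IndexFlat::search has no bitset parameter, and the ConcurrentBitset capacity constructor and set() are assumptions, as before):

#include <faiss/IndexFlat.h>
#include <faiss/utils/ConcurrentBitset.h>
#include <memory>
#include <vector>

int main() {
    const int d = 8, nb = 100, k = 5;
    faiss::IndexFlatL2 index(d);

    std::vector<float> xb(nb * d, 0.0f);
    for (int i = 0; i < nb; i++) xb[i * d] = (float)i;   // simple ramp data
    index.add(nb, xb.data());

    // Blacklist vector 0, then query with vector 0 itself: the filtered
    // search can no longer return id 0 as its own nearest neighbor.
    auto deleted = std::make_shared<faiss::ConcurrentBitset>(nb);
    deleted->set(0);

    std::vector<float> dist(k);
    std::vector<faiss::Index::idx_t> labels(k);
    index.search(1, xb.data(), k, dist.data(), labels.data(), deleted);
    return 0;
}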
@@ -33,7 +33,8 @@ struct IndexFlat: Index {
     const float* x,
     idx_t k,
     float* distances,
-    idx_t* labels) const override;
+    idx_t* labels,
+    ConcurrentBitsetPtr bitset = nullptr) const override;

   void range_search(
     idx_t n,

@@ -103,7 +104,8 @@ struct IndexFlatL2BaseShift: IndexFlatL2 {
     const float* x,
     idx_t k,
     float* distances,
-    idx_t* labels) const override;
+    idx_t* labels,
+    ConcurrentBitsetPtr bitset = nullptr) const override;
 };

@@ -138,7 +140,8 @@ struct IndexRefineFlat: Index {
     const float* x,
     idx_t k,
     float* distances,
-    idx_t* labels) const override;
+    idx_t* labels,
+    ConcurrentBitsetPtr bitset = nullptr) const override;

   ~IndexRefineFlat() override;
 };

@@ -166,7 +169,8 @@ struct IndexFlat1D:IndexFlatL2 {
     const float* x,
     idx_t k,
     float* distances,
-    idx_t* labels) const override;
+    idx_t* labels,
+    ConcurrentBitsetPtr bitset = nullptr) const override;
 };
@@ -242,7 +242,7 @@ void IndexHNSW::train(idx_t n, const float* x)
 }

 void IndexHNSW::search (idx_t n, const float *x, idx_t k,
-                        float *distances, idx_t *labels) const
+                        float *distances, idx_t *labels, ConcurrentBitsetPtr bitset) const
 {
   FAISS_THROW_IF_NOT_MSG(storage,

@@ -961,7 +961,7 @@ int search_from_candidates_2(const HNSW & hnsw,
 } // namespace

 void IndexHNSW2Level::search (idx_t n, const float *x, idx_t k,
-                              float *distances, idx_t *labels) const
+                              float *distances, idx_t *labels, ConcurrentBitsetPtr bitset) const
 {
   if (dynamic_cast<const Index2Layer*>(storage)) {
     IndexHNSW::search (n, x, k, distances, labels);
@@ -91,7 +91,7 @@ struct IndexHNSW : Index {

   /// entry point for search
   void search (idx_t n, const float *x, idx_t k,
-               float *distances, idx_t *labels) const override;
+               float *distances, idx_t *labels, ConcurrentBitsetPtr bitset = nullptr) const override;

   void reconstruct(idx_t key, float* recons) const override;

@@ -162,7 +162,7 @@ struct IndexHNSW2Level : IndexHNSW {

   /// entry point for search
   void search (idx_t n, const float *x, idx_t k,
-               float *distances, idx_t *labels) const override;
+               float *distances, idx_t *labels, ConcurrentBitsetPtr bitset = nullptr) const override;
 };
@@ -297,10 +297,8 @@ void IndexIVF::make_direct_map (bool new_maintain_direct_map)
   maintain_direct_map = new_maintain_direct_map;
 }

-
-void IndexIVF::search (idx_t n, const float *x, idx_t k,
-                       float *distances, idx_t *labels) const
-{
+void IndexIVF::search (idx_t n, const float *x, idx_t k, float *distances, idx_t *labels,
+                       ConcurrentBitsetPtr bitset) const {
   std::unique_ptr<idx_t[]> idx(new idx_t[n * nprobe]);
   std::unique_ptr<float[]> coarse_dis(new float[n * nprobe]);

@@ -312,18 +310,47 @@ void IndexIVF::search (idx_t n, const float *x, idx_t k,
   invlists->prefetch_lists (idx.get(), n * nprobe);

   search_preassigned (n, x, k, idx.get(), coarse_dis.get(),
-                      distances, labels, false);
+                      distances, labels, false, nullptr, bitset);
   indexIVF_stats.search_time += getmillisecs() - t0;
 }

+void IndexIVF::get_vector_by_id (idx_t n, const idx_t *xid, float *x, ConcurrentBitsetPtr bitset) {
+  if (!maintain_direct_map) {
+    make_direct_map(true);
+  }
+
+  /* only get vector by 1 id */
+  FAISS_ASSERT(n == 1);
+  if (!bitset || !bitset->test(xid[0])) {
+    reconstruct(xid[0], x + 0 * d);
+  } else {
+    memset(x, UINT8_MAX, d * sizeof(float));
+  }
+}
+
+void IndexIVF::search_by_id (idx_t n, const idx_t *xid, idx_t k, float *distances, idx_t *labels,
+                             ConcurrentBitsetPtr bitset) {
+  if (!maintain_direct_map) {
+    make_direct_map(true);
+  }
+
+  auto x = new float[n * d];
+  for (idx_t i = 0; i < n; ++i) {
+    reconstruct(xid[i], x + i * d);
+  }
+
+  search(n, x, k, distances, labels, bitset);
+  delete []x;
+}
+
 void IndexIVF::search_preassigned (idx_t n, const float *x, idx_t k,
                                    const idx_t *keys,
                                    const float *coarse_dis ,
                                    float *distances, idx_t *labels,
                                    bool store_pairs,
-                                   const IVFSearchParameters *params) const
+                                   const IVFSearchParameters *params,
+                                   ConcurrentBitsetPtr bitset) const
 {
   long nprobe = params ? params->nprobe : this->nprobe;
   long max_codes = params ? params->max_codes : this->max_codes;

@@ -373,7 +400,7 @@ void IndexIVF::search_preassigned (idx_t n, const float *x, idx_t k,
   // single list scan using the current scanner (with query
   // set porperly) and storing results in simi and idxi
   auto scan_one_list = [&] (idx_t key, float coarse_dis_i,
-                            float *simi, idx_t *idxi) {
+                            float *simi, idx_t *idxi, ConcurrentBitsetPtr bitset) {

     if (key < 0) {
       // not enough centroids for multiprobe

@@ -405,7 +432,7 @@
     }

     nheap += scanner->scan_codes (list_size, scodes.get(),
-                                  ids, simi, idxi, k);
+                                  ids, simi, idxi, k, bitset);

     return list_size;
   };

@@ -438,7 +465,7 @@
         nscan += scan_one_list (
           keys [i * nprobe + ik],
           coarse_dis[i * nprobe + ik],
-          simi, idxi
+          simi, idxi, bitset
         );

         if (max_codes && nscan >= max_codes) {

@@ -467,7 +494,7 @@
         ndis += scan_one_list
           (keys [i * nprobe + ik],
            coarse_dis[i * nprobe + ik],
-           local_dis.data(), local_idx.data());
+           local_dis.data(), local_idx.data(), bitset);

         // can't do the test on max_codes
       }
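Taken together, these hooks give the delete path its runtime shape: the WAL applies a delete by setting a bit, and every subsequent search threads the bitset down to scan_codes so the id simply stops appearing. A hedged end-to-end sketch with the float IVF index; search_by_id and the ConcurrentBitset capacity/set() calls are this fork's API, not upstream faiss:

#include <faiss/IndexFlat.h>
#include <faiss/IndexIVFFlat.h>
#include <faiss/utils/ConcurrentBitset.h>
#include <memory>
#include <vector>

int main() {
    const int d = 16, nb = 1000, nlist = 8, k = 4;
    faiss::IndexFlatL2 quantizer(d);
    faiss::IndexIVFFlat index(&quantizer, d, nlist);

    std::vector<float> xb(nb * d);
    for (int i = 0; i < nb * d; i++) xb[i] = (float)(i % 97) / 97.0f;
    index.train(nb, xb.data());
    index.add(nb, xb.data());

    // "Delete" id 42 the way this branch does: flip its bit.
    auto deleted = std::make_shared<faiss::ConcurrentBitset>(nb);
    deleted->set(42);

    // Id-based search for the neighbors of stored vector 7; id 42 can no
    // longer appear in the results, and get_vector_by_id on id 42 would
    // return the UINT8_MAX-filled sentinel instead of data.
    faiss::Index::idx_t qid = 7;
    std::vector<float> dist(k);
    std::vector<faiss::Index::idx_t> labels(k);
    index.search_by_id(1, &qid, k, dist.data(), labels.data(), deleted);
    return 0;
}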