AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

ai artificial-intelligence autonomous-agents gpt-4 openai python


merwanehamadi 44436fe1a3 Fix Chart generation (#346) Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>	2023-09-01 08:31:17 -07:00
.github	Tic tac toe challenge (#345)	2023-08-31 20:45:31 -07:00
.vscode	Adding Auto-GPT-Turbo (#322)	2023-08-19 11:32:38 -07:00
agbenchmark	Tic tac toe challenge (#345)	2023-08-31 20:45:31 -07:00
agent	Updating Turbo (#343)	2023-08-31 07:09:41 -04:00
backend	adding backend and a basic ui (#309)	2023-08-27 03:18:30 -04:00
frontend@c6a9572bed	update frontend hash, run.sh	2023-08-28 19:55:44 -07:00
notebooks	working bar and radar charts (#221)	2023-07-31 12:22:38 +01:00
reports	Fix Chart generation (#346)	2023-09-01 08:31:17 -07:00
.env.example	Update .env.example (#298)	2023-08-12 19:52:15 -07:00
.flake8	Cleanup skill tree (#287)	2023-08-10 16:29:58 -07:00
.gitignore	Updated ignore	2023-08-29 15:47:35 +02:00
.gitmodules	Update Turbo (#324)	2023-08-23 14:39:20 -07:00
.pre-commit-config.yaml	AUTO-25: Add the ability to run multiple categories and to skip categories (#270)	2023-08-07 12:29:00 +01:00
.python-version	Add static linters ci (#45)	2023-07-02 16:14:49 -04:00
LICENSE	init agbenchmark	2023-06-18 11:14:54 -04:00
README.md	adding backend and a basic ui (#309)	2023-08-27 03:18:30 -04:00
json_to_base_64.py	Push reports to google drive (#167)	2023-07-18 09:17:45 -07:00
mypy.ini	Add all agent protocol tests (#260)	2023-08-06 09:52:46 -07:00
poetry.lock	Support agent protocol (#337)	2023-08-30 19:44:39 -07:00
pyproject.toml	Support agent protocol (#337)	2023-08-30 19:44:39 -07:00
run.sh	update frontend hash, run.sh	2023-08-28 19:55:44 -07:00
send_to_googledrive.py	Fix linter 2 (#319)	2023-08-16 16:56:02 -07:00
server.py	Tic tac toe challenge (#345)	2023-08-31 20:45:31 -07:00

README.md

Auto-GPT Benchmarks

Built to benchmark the performance of agents, regardless of how they work.

Objectively know how well your agent is performing in categories like code, retrieval, memory, and safety.

Save time and money while doing it through smart dependencies. The best part? It's all automated.
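
The README does not spell out what "smart dependencies" means. One plausible reading, sketched below, is that a challenge is skipped when a challenge it depends on has not passed, so no agent calls are spent on tests that are unlikely to succeed. The function `run_with_dependencies`, the challenge names, and the result labels are all hypothetical and are not taken from the benchmark's code.

```python
from typing import Callable, Dict, List

# A challenge is modelled as a zero-argument function returning True on success.
# This shape is an assumption made for illustration only.
Challenge = Callable[[], bool]


def run_with_dependencies(
    challenges: Dict[str, Challenge],
    depends_on: Dict[str, List[str]],
) -> Dict[str, str]:
    """Run challenges in insertion order, skipping any whose prerequisites did not pass.

    Prerequisites are expected to be listed before the challenges that need them;
    a prerequisite that has not run yet is treated as not passed.
    """
    results: Dict[str, str] = {}
    for name, check in challenges.items():
        prerequisites = depends_on.get(name, [])
        if any(results.get(dep) != "passed" for dep in prerequisites):
            # Skipping avoids paying for a run that depends on a failed step.
            results[name] = "skipped"
            continue
        results[name] = "passed" if check() else "failed"
    return results


# Hypothetical challenges: reading depends on writing, summarizing on reading.
example_challenges: Dict[str, Challenge] = {
    "write_file": lambda: True,
    "read_file": lambda: False,
    "summarize_file": lambda: True,
}
example_dependencies = {
    "read_file": ["write_file"],
    "summarize_file": ["read_file"],
}

example_results = run_with_dependencies(example_challenges, example_dependencies)
print(example_results)
```

In this sketch `summarize_file` is never executed because `read_file` failed, which is where the time and cost savings would come from under this reading.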

Scores:

Ranking overall:

Detailed results:

Click here to see the results and the raw data!

More agents coming soon!