AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

ai artificial-intelligence autonomous-agents gpt-4 openai python


merwanehamadi 6407e258b5 Update publish_package.yml (#180)		2023-07-23 09:19:39 -07:00
.github	Update publish_package.yml (#180)	2023-07-23 09:19:39 -07:00
.vscode	init agbenchmark	2023-06-18 11:14:54 -04:00
agbenchmark	Integrate baby-agi (#168)	2023-07-21 11:15:42 -07:00
agent	Integrate baby-agi (#168)	2023-07-21 11:15:42 -07:00
benchmark_runs	gpt-engineer-20230716225908	2023-07-16 22:59:08 +00:00
reports	beebot-20230723082956	2023-07-23 08:29:57 +00:00
.env.example	Dynamic home path for runs (#119)	2023-07-16 18:24:06 -07:00
.flake8	Add static linters ci (#45)	2023-07-02 16:14:49 -04:00
.gitignore	Push reports to google drive (#167)	2023-07-18 09:17:45 -07:00
.gitmodules	Integrate baby-agi (#168)	2023-07-21 11:15:42 -07:00
.python-version	Add static linters ci (#45)	2023-07-02 16:14:49 -04:00
LICENSE	init agbenchmark	2023-06-18 11:14:54 -04:00
README.md	Update Auto-GPT score (#106)	2023-07-15 09:53:56 -07:00
json_to_base_64.py	Push reports to google drive (#167)	2023-07-18 09:17:45 -07:00
mypy.ini	Added --test, consolidate files, reports working (#83)	2023-07-10 19:25:19 -07:00
poetry.lock	Kill subprocesses when test ends (#172)	2023-07-20 15:41:59 -07:00
pyproject.toml	Kill subprocesses when test ends (#172)	2023-07-20 15:41:59 -07:00
send_to_googledrive.py	Push reports to google drive (#167)	2023-07-18 09:17:45 -07:00

README.md

Auto-GPT Benchmark

A repo for benchmarking the performance of agents, regardless of how they are set up and how they work.

Scores:

Radar chart for each agent coming soon!

Detailed results

⚠️ These results are still changing. We will publish an official benchmark result very soon.

Interface

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Write File	❌	✅	tbd	✅
Read File	❌	❌	tbd	❌
Search File	❌	❌	tbd	❌

Code

Task	Auto-GPT	gpt-engineer	mini-agi	smol-developer
Debug Simple Typo With Guidance	❌	❌	tbd	❌
Debug Simple Typo Without Guidance	❌	❌	tbd	❌
Basic Code Generation	❌	✅	tbd	✅
Create Simple Web Server	❌	❌	tbd	❌
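
The Code table above can be summarized per agent with a short tally. The following is a minimal sketch, not part of agbenchmark: the `CODE_RESULTS` dictionary and the `tally` helper are hypothetical names introduced here for illustration, and the data is copied by hand from the table. Cells marked `tbd` are treated as not yet evaluated and are excluded from the totals.

```python
from collections import Counter

# Hand-copied from the Code table above (illustrative, not an agbenchmark data format).
# "✅" = passed, "❌" = failed, "tbd" = not yet evaluated.
CODE_RESULTS = {
    "Debug Simple Typo With Guidance": {
        "Auto-GPT": "❌", "gpt-engineer": "❌", "mini-agi": "tbd", "smol-developer": "❌",
    },
    "Debug Simple Typo Without Guidance": {
        "Auto-GPT": "❌", "gpt-engineer": "❌", "mini-agi": "tbd", "smol-developer": "❌",
    },
    "Basic Code Generation": {
        "Auto-GPT": "❌", "gpt-engineer": "✅", "mini-agi": "tbd", "smol-developer": "✅",
    },
    "Create Simple Web Server": {
        "Auto-GPT": "❌", "gpt-engineer": "❌", "mini-agi": "tbd", "smol-developer": "❌",
    },
}


def tally(results):
    """Return {agent: (passed, evaluated)}, skipping cells marked 'tbd'."""
    passed = Counter()
    evaluated = Counter()
    for row in results.values():
        for agent, mark in row.items():
            if mark == "tbd":
                evaluated[agent] += 0  # keep the agent in the output with 0 evaluated
                continue
            evaluated[agent] += 1
            if mark == "✅":
                passed[agent] += 1
    return {agent: (passed[agent], evaluated[agent]) for agent in evaluated}


if __name__ == "__main__":
    for agent, (ok, total) in tally(CODE_RESULTS).items():
        print(f"{agent}: {ok}/{total} passed")
```

Keeping `tbd` separate from a failure avoids reporting an agent that has not been run as scoring zero.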

Memory

Task	Auto-GPT
Basic Memory	❌
Remember Multiple Ids	❌
Remember Multiple Ids With Noise	❌
Remember Multiple Phrases With Noise	❌