AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Silen Naihin ecc386ec7b
returning scores (#210)
Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
2023-07-29 11:43:22 +01:00
.github Advanced LLM Evaluation Implementation (#205) 2023-07-29 10:26:19 +01:00
.vscode init agbenchmark 2023-06-18 11:14:54 -04:00
agbenchmark returning scores (#210) 2023-07-29 11:43:22 +01:00
agent Fix tests not being run (#207) 2023-07-27 20:50:53 -07:00
notebooks Advanced LLM Evaluation Implementation (#205) 2023-07-29 10:26:19 +01:00
reports returning scores (#210) 2023-07-29 11:43:22 +01:00
.env.example Advanced LLM Evaluation Implementation (#205) 2023-07-29 10:26:19 +01:00
.flake8 Use beebot autopackai (#203) 2023-07-27 12:21:43 -07:00
.gitignore Push reports to google drive (#167) 2023-07-18 09:17:45 -07:00
.gitmodules Use beebot autopackai (#203) 2023-07-27 12:21:43 -07:00
.python-version Add static linters ci (#45) 2023-07-02 16:14:49 -04:00
LICENSE init agbenchmark 2023-06-18 11:14:54 -04:00
README.md Update Scores Benchmark (#192) 2023-07-25 11:09:49 -07:00
get_data_from_helicone.py Delete reports (#201) 2023-07-27 11:42:24 -07:00
json_to_base_64.py Push reports to google drive (#167) 2023-07-18 09:17:45 -07:00
mypy.ini report # bug, adding submodule challenges (#193) 2023-07-26 13:53:10 +01:00
poetry.lock Add dynamic headers using environment variables (#200) 2023-07-26 21:26:03 -07:00
pyproject.toml Advanced LLM Evaluation Implementation (#205) 2023-07-29 10:26:19 +01:00
send_to_googledrive.py Add helicone dynamic headers (#199) 2023-07-26 16:03:13 -07:00

README.md

Auto-GPT Benchmarks

A repo for benchmarking the performance of agents, regardless of how they are set up or how they work.

Scores:

[Image: Screenshot 2023-07-25 at 10 35 01 AM]

Ranking overall:

Detailed results:

[Image: Screenshot 2023-07-25 at 10 42 15 AM]

Click here to see the results and the raw data!!

More agents coming soon!