46 Commits (4943022ff32c361f9a8bdccef4a0acc2a70e821e)

Author SHA1 Message Date
merwanehamadi 3ce10dfa25
Update pyproject.toml (#320) 2023-08-16 17:11:22 -07:00
merwanehamadi 82ed4a136a
Remove submodule (#314)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-16 14:57:52 -07:00
merwanehamadi c8c55c1297
0.0.8 (#299) 2023-08-13 07:58:19 -07:00
merwanehamadi 0a73e391d9
Release 0.0.7 (#295) 2023-08-12 10:46:52 -07:00
Erik Peterson 3ec09a3b69 Move pytest-asyncio to main dependency group 2023-08-11 12:52:35 -07:00
Silen Naihin a513b449f7 updating version 2023-08-11 13:59:42 +01:00
Silen Naihin 1a61c66898 mock flag, workspace io fixes, mark fixes 2023-08-11 13:22:21 +01:00
Jakub Novák c2269397f1
Use agent protocol (#278)
Signed-off-by: Jakub Novak <jakub@e2b.dev>
2023-08-11 09:04:08 +02:00
merwanehamadi 47c6062092
Cleanup skill tree (#287) 2023-08-10 16:29:58 -07:00
merwanehamadi 6afd962270
Remove baserun because api key issue (#282)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-09 11:24:54 -07:00
merwanehamadi e3f1e2184f
Release 0.0.4 (#280)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-09 10:04:57 -07:00
merwanehamadi 14e6d4968e
Integrate with baserun (#274) 2023-08-08 14:04:43 -07:00
merwanehamadi 305f3a6138
Add web app creation challenge (#272) 2023-08-08 13:08:51 -07:00
Swifty e0a72b86c1
AUTO-25: Add the ability to run multiple categories and to skip categories (#270) 2023-08-07 12:29:00 +01:00
Silen Naihin 19848f362d
remove pytest-depends, rerouting functions (#250) 2023-08-06 22:35:22 +01:00
merwanehamadi 13d2dcbf5e
Add agent protocol (#258)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-05 10:43:18 -07:00
merwanehamadi 02dd294ea7
Release 0.0.3 (#249) 2023-08-03 16:46:43 -07:00
Erik Peterson 819bd5059e
Update python-dotenv (#240) 2023-08-02 10:17:57 -07:00
merwanehamadi f41533ce62
Fix reports and add commit sha (#233)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-01 17:54:23 -07:00
Silen Naihin 4011cb228f
working bar and radar charts (#221) 2023-07-31 12:22:38 +01:00
Silen Naihin 19db3151dd
Feature: Visualize Test Results (#211) 2023-07-30 23:51:17 +01:00
Silen Naihin f07e7b60d4
Advanced LLM Evaluation Implementation (#205)
Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
2023-07-29 10:26:19 +01:00
merwanehamadi 5df710fd35
Add helicone dynamic headers (#199)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-26 16:03:13 -07:00
Silen Naihin 66d1fec07e attempting more logs 2023-07-26 23:36:45 +01:00
Silen Naihin d9b3d7da37
Safety challenges, adaptability challenges, suite same_task (#177) 2023-07-24 13:57:44 -07:00
merwanehamadi 7288d4ccc0
Release 0.0.2 (#186) 2023-07-23 14:03:21 -07:00
merwanehamadi 68445ae577
Change package version (#184) 2023-07-23 12:51:12 -07:00
Erik Peterson 5a3b4f3d1d
Kill subprocesses when test ends (#172)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-20 15:41:59 -07:00
merwanehamadi d46124a9d8
Push reports to google drive (#167)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-18 09:17:45 -07:00
merwanehamadi a9702e4629
Add basic code generation challenge (#98) 2023-07-14 13:27:48 -04:00
merwanehamadi 0799be7e28
Fix tests ci (#82) 2023-07-10 21:54:25 -07:00
merwanehamadi 437e066a66
Add "Simple web server" challenge (#74)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
2023-07-10 20:46:03 -04:00
Silen Naihin 69bd41f741
Quality of life improvements & fixes (#75) 2023-07-08 18:43:38 -07:00
merwanehamadi 9ede17891b
Add 'Debug simple typo with guidance' challenge (#65)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-07 13:50:53 -07:00
Silen Naihin bfd0d5c826
Fix home_path, local mini-agi run works (#64)
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-07-06 18:00:45 -07:00
merwanehamadi 101ffdbce0
Integrate with gpt engineer (#47) 2023-07-03 14:53:28 -04:00
merwanehamadi 838f72097c
Add static linters ci (#45) 2023-07-02 16:14:49 -04:00
Silen Naihin f933717d8b mini-agi, simple challenge creation, --mock flag 2023-06-27 18:17:54 -04:00
Silen Naihin a2f79760ce other was non solution, solution is pytest-depends 2023-06-27 13:26:28 -04:00
Silen Naihin 06a6f08054 finally figured out right way to do dependencies 2023-06-27 13:26:28 -04:00
Silen Naihin 2f28a66591 more elegant marking & dependency solution 2023-06-27 13:26:28 -04:00
Silen Naihin 60a7ac2343 adding dependencies on other challenges 2023-06-27 13:26:28 -04:00
Silen Naihin 8c44b9eddf basic challenges, more ChallengeData structure 2023-06-27 13:26:28 -04:00
Silen Naihin 15c5469bb1
Add automatic regression markers (#38) 2023-06-22 08:18:22 -04:00
Silen Naihin b7deb984f7
start click, fixtures, types, challenge creation, mock run -stable (#37) 2023-06-21 11:43:18 -04:00
Silen Naihin 51f2295971 init agbenchmark 2023-06-18 11:14:54 -04:00