Commit Graph

37 Commits (2bb57b9800e27a6c3752cafa347fa0605f714d4a)

Author SHA1 Message Date
merwanehamadi 6afd962270
Remove baserun because api key issue (#282)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-09 11:24:54 -07:00
merwanehamadi e3f1e2184f
Release 0.0.4 (#280)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-09 10:04:57 -07:00
merwanehamadi 14e6d4968e
Integrate with baserun (#274) 2023-08-08 14:04:43 -07:00
merwanehamadi 305f3a6138
Add web app creation challenge (#272) 2023-08-08 13:08:51 -07:00
Swifty e0a72b86c1
AUTO-25: Add the ability to run multiple categories and to skip categories (#270) 2023-08-07 12:29:00 +01:00
Silen Naihin 19848f362d
remove pytest-depends, rerouting functions (#250) 2023-08-06 22:35:22 +01:00
merwanehamadi 13d2dcbf5e
Add agent protocol (#258)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-05 10:43:18 -07:00
merwanehamadi 02dd294ea7
Release 0.0.3 (#249) 2023-08-03 16:46:43 -07:00
Erik Peterson 819bd5059e
Update python-dotenv (#240) 2023-08-02 10:17:57 -07:00
merwanehamadi f41533ce62
Fix reports and add commit sha (#233)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-08-01 17:54:23 -07:00
Silen Naihin 4011cb228f
working bar and radar charts (#221) 2023-07-31 12:22:38 +01:00
Silen Naihin 19db3151dd
Feature: Visualize Test Results (#211) 2023-07-30 23:51:17 +01:00
Silen Naihin f07e7b60d4
Advanced LLM Evaluation Implementation (#205)
Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
2023-07-29 10:26:19 +01:00
merwanehamadi 5df710fd35
Add helicone dynamic headers (#199)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-26 16:03:13 -07:00
Silen Naihin 66d1fec07e attempting more logs 2023-07-26 23:36:45 +01:00
Silen Naihin d9b3d7da37
Safety challenges, adaptability challenges, suite same_task (#177) 2023-07-24 13:57:44 -07:00
merwanehamadi 7288d4ccc0
Release 0.0.2 (#186) 2023-07-23 14:03:21 -07:00
merwanehamadi 68445ae577
Change package version (#184) 2023-07-23 12:51:12 -07:00
Erik Peterson 5a3b4f3d1d
Kill subprocesses when test ends (#172)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-20 15:41:59 -07:00
merwanehamadi d46124a9d8
Push reports to google drive (#167)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-18 09:17:45 -07:00
merwanehamadi a9702e4629
Add basic code generation challenge (#98) 2023-07-14 13:27:48 -04:00
merwanehamadi 0799be7e28
Fix tests ci (#82) 2023-07-10 21:54:25 -07:00
merwanehamadi 437e066a66
Add "Simple web server" challenge (#74)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
2023-07-10 20:46:03 -04:00
Silen Naihin 69bd41f741
Quality of life improvements & fixes (#75) 2023-07-08 18:43:38 -07:00
merwanehamadi 9ede17891b
Add 'Debug simple typo with guidance' challenge (#65)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-07-07 13:50:53 -07:00
Silen Naihin bfd0d5c826
Fix home_path, local mini-agi run works (#64)
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-07-06 18:00:45 -07:00
merwanehamadi 101ffdbce0
Integrate with gpt engineer (#47) 2023-07-03 14:53:28 -04:00
merwanehamadi 838f72097c
Add static linters ci (#45) 2023-07-02 16:14:49 -04:00
Silen Naihin f933717d8b mini-agi, simple challenge creation, --mock flag 2023-06-27 18:17:54 -04:00
Silen Naihin a2f79760ce other was non solution, solution is pytest-depends 2023-06-27 13:26:28 -04:00
Silen Naihin 06a6f08054 finally figured out right way to do dependencies 2023-06-27 13:26:28 -04:00
Silen Naihin 2f28a66591 more elegant marking & dependency solution 2023-06-27 13:26:28 -04:00
Silen Naihin 60a7ac2343 adding dependencies on other challenges 2023-06-27 13:26:28 -04:00
Silen Naihin 8c44b9eddf basic challenges, more ChallengeData structure 2023-06-27 13:26:28 -04:00
Silen Naihin 15c5469bb1
Add automatic regression markers (#38) 2023-06-22 08:18:22 -04:00
Silen Naihin b7deb984f7
start click, fixtures, types, challenge creation, mock run -stable (#37) 2023-06-21 11:43:18 -04:00
Silen Naihin 51f2295971 init agbenchmark 2023-06-18 11:14:54 -04:00