merwanehamadi
|
c4554225bd
|
Update submodules (#212)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-29 10:18:35 -07:00 |
Silen Naihin
|
ecc386ec7b
|
returning scores (#210)
Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
|
2023-07-29 11:43:22 +01:00 |
Silen Naihin
|
f07e7b60d4
|
Advanced LLM Evaluation Implementation (#205)
Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
|
2023-07-29 10:26:19 +01:00 |
merwanehamadi
|
80bd0c4260
|
Fix tests not being run (#207)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-27 20:50:53 -07:00 |
merwanehamadi
|
6098b70408
|
Use beebot autopackai (#203)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-27 12:21:43 -07:00 |
merwanehamadi
|
31897e7892
|
Delete reports (#201)
|
2023-07-27 11:42:24 -07:00 |
Silen Naihin
|
71e0c598d6
|
forcing AGENT_NAME to be defined from repo
|
2023-07-27 14:28:11 +01:00 |
Silen Naihin
|
0e6be16d07
|
helicone and llm eval fixes
|
2023-07-27 14:07:46 +01:00 |
merwanehamadi
|
eb57b15380
|
Add dynamic headers using environment variables (#200)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-26 21:26:03 -07:00 |
merwanehamadi
|
5df710fd35
|
Add helicone dynamic headers (#199)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-26 16:03:13 -07:00 |
Silen Naihin
|
66d1fec07e
|
attempting more logs
|
2023-07-26 23:36:45 +01:00 |
merwanehamadi
|
01b118e590
|
Add llm eval (#197)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-26 14:00:24 -07:00 |
Silen Naihin
|
80506e9a3b
|
report # bug, adding submodule challenges (#193)
|
2023-07-26 13:53:10 +01:00 |
merwanehamadi
|
a1e02f243c
|
Add safety suite (#196)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-25 20:13:01 -07:00 |
Silen Naihin
|
5e3bbb946f
|
fix suite dependencies (#194)
|
2023-07-26 01:50:53 +01:00 |
Silen Naihin
|
b82277515f
|
hotfix reports (#191)
|
2023-07-25 19:07:24 +01:00 |
Silen Naihin
|
d9b3d7da37
|
Safety challenges, adaptability challenges, suite same_task (#177)
|
2023-07-24 13:57:44 -07:00 |
Silen Naihin
|
2b3abeff4e
|
Integrate baby-agi (#168)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
|
2023-07-21 11:15:42 -07:00 |
Erik Peterson
|
5a3b4f3d1d
|
Kill subprocesses when test ends (#172)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-20 15:41:59 -07:00 |
Silen Naihin
|
12c5d54583
|
Fixing memory challenges, naming, testing mini-agi, smooth retrieval scaling (#166)
|
2023-07-17 19:41:58 -07:00 |
merwanehamadi
|
2d8fa5ca6f
|
Use report location (#165)
|
2023-07-17 20:15:10 -04:00 |
Silen Naihin
|
8aa6452cc4
|
file naming when --test (#164)
|
2023-07-17 11:24:16 -04:00 |
Silen Naihin
|
dffc1dfd51
|
internal_info.json dynamic changes (#163)
|
2023-07-17 09:39:24 -04:00 |
Silen Naihin
|
ce4cefe7e7
|
Dynamic home path for runs (#119)
|
2023-07-16 18:24:06 -07:00 |
merwanehamadi
|
2704bcee5e
|
Allow change location of reports (#115)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-16 07:26:36 -07:00 |
Silen Naihin
|
9f3a2d4f05
|
Dynamic cutoff and other quality of life (#101)
|
2023-07-15 22:10:20 -04:00 |
merwanehamadi
|
5886d75059
|
Add three sum challenge (#108)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
|
2023-07-15 19:52:42 -04:00 |
Erik Peterson
|
cbd2e49d97
|
Clean up workspace between each test (#109)
|
2023-07-15 16:23:49 -07:00 |
merwanehamadi
|
7bc7d9213d
|
Replace hidden files with custom python (#99)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-14 14:39:47 -07:00 |
merwanehamadi
|
a9702e4629
|
Add basic code generation challenge (#98)
|
2023-07-14 13:27:48 -04:00 |
merwanehamadi
|
78df4915cf
|
Remove dependencies if a specific test is asked by the user (#95)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-12 14:35:12 -07:00 |
Silen Naihin
|
8d0c5179ed
|
fixing backslashes, adding basic metrics (#89)
|
2023-07-12 01:37:59 -04:00 |
merwanehamadi
|
b3c506cd94
|
Fix Auto-GPT looping forever (#87)
|
2023-07-11 20:02:29 -04:00 |
merwanehamadi
|
4ecb70c5e3
|
Fix Auto-GPT integration by adding python module as entrypoint (#86)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
|
2023-07-11 15:11:24 -04:00 |
merwanehamadi
|
0799be7e28
|
Fix tests ci (#82)
|
2023-07-10 21:54:25 -07:00 |
Silen Naihin
|
8df82909b2
|
Added --test, consolidate files, reports working (#83)
|
2023-07-10 19:25:19 -07:00 |
merwanehamadi
|
437e066a66
|
Add "Simple web server" challenge (#74)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
|
2023-07-10 20:46:03 -04:00 |
merwanehamadi
|
30ba51593f
|
Add Helicone (#81)
|
2023-07-10 12:19:12 -04:00 |
Silen Naihin
|
b8830f8625
|
Adding search interface challenge and cleaning repo (#80)
|
2023-07-09 18:33:08 -07:00 |
Silen Naihin
|
3d43117554
|
Just json, no test files (#77)
|
2023-07-09 17:27:21 -07:00 |
merwanehamadi
|
573130549f
|
Add gpt engineer to ci (#78)
|
2023-07-09 13:31:31 -07:00 |
merwanehamadi
|
d89264998d
|
Fix debug code challenge (#76)
Co-authored-by: Silen Naihin <silen.naihin@gmail.com>
|
2023-07-08 21:46:37 -04:00 |
Silen Naihin
|
69bd41f741
|
Quality of life improvements & fixes (#75)
|
2023-07-08 18:43:38 -07:00 |
Silen Naihin
|
e56b112aab
|
i/o workspace, adding superagi (#60)
|
2023-07-08 03:27:31 -04:00 |
merwanehamadi
|
487f99f8f2
|
Use artifacts out instead of python code (#72)
|
2023-07-07 15:49:37 -07:00 |
merwanehamadi
|
f0f7d2be90
|
Fix memory challenge 2 (#71)
|
2023-07-07 15:38:50 -07:00 |
merwanehamadi
|
e34c83ca1c
|
Add .txt to memory challenges (#70)
|
2023-07-07 15:34:57 -07:00 |
Erik Peterson
|
3defe044bd
|
Print out all of stdout on each process poll. (#69)
|
2023-07-07 15:02:08 -07:00 |
Silen Naihin
|
4562bc6caf
|
Update data.json remove text
|
2023-07-07 17:54:09 -04:00 |
merwanehamadi
|
e61523e59e
|
Get rid of get file path by using the data.json convention to store the challenge information (#67)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
|
2023-07-07 13:58:17 -07:00 |