merwanehamadi | 20c87fbc26 | Fix typing (#247) | 2023-08-02 15:08:07 -07:00 |
merwanehamadi | 59f015ab93 | fix-linter (#246) | 2023-08-02 14:49:03 -07:00 |
merwanehamadi | 8fa67ea466 | Correct agent and benchmark commit sha (#245); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-08-02 14:44:14 -07:00 |
merwanehamadi | e3562a4b66 | Add attempted metrics (#244); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-08-02 13:27:57 -07:00 |
merwanehamadi | f41533ce62 | Fix reports and add commit sha (#233); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-08-01 17:54:23 -07:00 |
merwanehamadi | eeb68858d7 | Only run mini-agi on tests (#232) | 2023-08-01 16:50:41 -07:00 |
Silen Naihin | 3992f0865b | comitting changes | 2023-08-01 20:49:20 +01:00 |
Silen Naihin | f4225f63bf | linter and handling errs | 2023-08-01 17:55:00 +01:00 |
Silen Naihin | f8a01ef70a | fixing combined charts issue | 2023-08-01 17:15:15 +01:00 |
Silen Naihin | f195840d35 | fixing combined_graph | 2023-08-01 14:35:14 +01:00 |
Silen Naihin | 6f3fd2a578 | fix graphs, processing, workflow | 2023-08-01 13:44:32 +01:00 |
merwanehamadi | ce24857a74 | Return none as fallback Helicone (#228); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-31 20:18:15 -07:00 |
merwanehamadi | 46dce97c4e | Fix reports (#227); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-31 19:39:49 -07:00 |
merwanehamadi | a2dc4693a3 | Fix costs helicone (#226); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-31 16:13:06 -07:00 |
Silen Naihin | f9fea473f5 | Refactoring for TDD (#222) | 2023-07-31 21:59:47 +01:00 |
merwanehamadi | 719f894520 | Fix send to gdrive and tracking the wrong challenge name (#225); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-31 12:35:37 -07:00 |
Justin Torre | 3a32adbce5 | Fix f-string get_data_from_helicone.py (#223) | 2023-07-31 09:06:04 -07:00 |
Silen Naihin | 9d75712bae | ci ofr auth | 2023-07-31 14:02:46 +01:00 |
Silen Naihin | f8de706a15 | removing data that didnt work | 2023-07-31 13:41:45 +01:00 |
Silen Naihin | 2ec306e850 | linter fixes | 2023-07-31 13:28:01 +01:00 |
Silen Naihin | db49e8de15 | helicone push 2 | 2023-07-31 13:26:49 +01:00 |
Silen Naihin | 14c49fa7ea | handling helicone errors | 2023-07-31 12:54:27 +01:00 |
Silen Naihin | 4011cb228f | working bar and radar charts (#221) | 2023-07-31 12:22:38 +01:00 |
merwanehamadi | ad00a0634e | Get helicone costs (#220); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-30 21:33:09 -07:00 |
merwanehamadi | 6309bc9c3d | Update submodule (#219) | 2023-07-30 20:03:53 -07:00 |
merwanehamadi | d93950e6d9 | Fix timeout not working (#218) | 2023-07-30 19:05:09 -07:00 |
Silen Naihin | 19db3151dd | Feature: Visualize Test Results (#211) | 2023-07-30 23:51:17 +01:00 |
merwanehamadi | a6c3730ac8 | Add timeout that allows teardown (#216); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-29 20:02:41 -07:00 |
merwanehamadi | c4554225bd | Update submodules (#212); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-29 10:18:35 -07:00 |
Silen Naihin | ecc386ec7b | returning scores (#210); Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co> | 2023-07-29 11:43:22 +01:00 |
Silen Naihin | f07e7b60d4 | Advanced LLM Evaluation Implementation (#205); Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co> | 2023-07-29 10:26:19 +01:00 |
merwanehamadi | 80bd0c4260 | Fix tests not being run (#207); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-27 20:50:53 -07:00 |
merwanehamadi | 6098b70408 | Use beebot autopackai (#203); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-27 12:21:43 -07:00 |
merwanehamadi | 31897e7892 | Delete reports (#201) | 2023-07-27 11:42:24 -07:00 |
Silen Naihin | 71e0c598d6 | forcing AGENT_NAME to be defined from repo | 2023-07-27 14:28:11 +01:00 |
Silen Naihin | 0e6be16d07 | helicone and llm eval fixes | 2023-07-27 14:07:46 +01:00 |
merwanehamadi | eb57b15380 | Add dynamic headers using environment variables (#200); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-26 21:26:03 -07:00 |
merwanehamadi | 5df710fd35 | Add helicone dynamic headers (#199); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-26 16:03:13 -07:00 |
Silen Naihin | 66d1fec07e | attempting more logs | 2023-07-26 23:36:45 +01:00 |
merwanehamadi | 01b118e590 | Add llm eval (#197); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-26 14:00:24 -07:00 |
Silen Naihin | 80506e9a3b | report # bug, adding submodule challenges (#193) | 2023-07-26 13:53:10 +01:00 |
merwanehamadi | a1e02f243c | Add safety suite (#196); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-25 20:13:01 -07:00 |
Silen Naihin | 5e3bbb946f | fix suite dependencies (#194) | 2023-07-26 01:50:53 +01:00 |
Silen Naihin | b82277515f | hotfix reports (#191) | 2023-07-25 19:07:24 +01:00 |
Silen Naihin | d9b3d7da37 | Safety challenges, adaptability challenges, suite same_task (#177) | 2023-07-24 13:57:44 -07:00 |
Silen Naihin | 2b3abeff4e | Integrate baby-agi (#168); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>; Co-authored-by: merwanehamadi <merwanehamadi@gmail.com> | 2023-07-21 11:15:42 -07:00 |
Erik Peterson | 5a3b4f3d1d | Kill subprocesses when test ends (#172); Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>; Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com> | 2023-07-20 15:41:59 -07:00 |
Silen Naihin | 12c5d54583 | Fixing memory challenges, naming, testing mini-agi, smooth retrieval scaling (#166) | 2023-07-17 19:41:58 -07:00 |
merwanehamadi | 2d8fa5ca6f | Use report location (#165) | 2023-07-17 20:15:10 -04:00 |
Silen Naihin | 8aa6452cc4 | file naming when --test (#164) | 2023-07-17 11:24:16 -04:00 |