Commit Graph

117 Commits (85f4adff2447c86e9c650e7f6deee05caf7ae603)

Author SHA1 Message Date
merwanehamadi 20c87fbc26 Fix typing () 2023-08-02 15:08:07 -07:00
merwanehamadi 59f015ab93 fix-linter () 2023-08-02 14:49:03 -07:00
merwanehamadi 8fa67ea466 Correct agent and benchmark commit sha () 2023-08-02 14:44:14 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi e3562a4b66 Add attempted metrics () 2023-08-02 13:27:57 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi f41533ce62 Fix reports and add commit sha () 2023-08-01 17:54:23 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi eeb68858d7 Only run mini-agi on tests () 2023-08-01 16:50:41 -07:00
Silen Naihin 3992f0865b comitting changes 2023-08-01 20:49:20 +01:00
Silen Naihin f4225f63bf linter and handling errs 2023-08-01 17:55:00 +01:00
Silen Naihin f8a01ef70a fixing combined charts issue 2023-08-01 17:15:15 +01:00
Silen Naihin f195840d35 fixing combined_graph 2023-08-01 14:35:14 +01:00
Silen Naihin 6f3fd2a578 fix graphs, processing, workflow 2023-08-01 13:44:32 +01:00
merwanehamadi ce24857a74 Return none as fallback Helicone () 2023-07-31 20:18:15 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi 46dce97c4e Fix reports () 2023-07-31 19:39:49 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi a2dc4693a3 Fix costs helicone () 2023-07-31 16:13:06 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Silen Naihin f9fea473f5 Refactoring for TDD () 2023-07-31 21:59:47 +01:00
merwanehamadi 719f894520 Fix send to gdrive and tracking the wrong challenge name () 2023-07-31 12:35:37 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Justin Torre 3a32adbce5 Fix f-string get_data_from_helicone.py () 2023-07-31 09:06:04 -07:00
Silen Naihin 9d75712bae ci ofr auth 2023-07-31 14:02:46 +01:00
Silen Naihin f8de706a15 removing data that didnt work 2023-07-31 13:41:45 +01:00
Silen Naihin 2ec306e850 linter fixes 2023-07-31 13:28:01 +01:00
Silen Naihin db49e8de15 helicone push 2 2023-07-31 13:26:49 +01:00
Silen Naihin 14c49fa7ea handling helicone errors 2023-07-31 12:54:27 +01:00
Silen Naihin 4011cb228f working bar and radar charts () 2023-07-31 12:22:38 +01:00
merwanehamadi ad00a0634e Get helicone costs () 2023-07-30 21:33:09 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi 6309bc9c3d Update submodule () 2023-07-30 20:03:53 -07:00
merwanehamadi d93950e6d9 Fix timeout not working () 2023-07-30 19:05:09 -07:00
Silen Naihin 19db3151dd Feature: Visualize Test Results () 2023-07-30 23:51:17 +01:00
merwanehamadi a6c3730ac8 Add timeout that allows teardown () 2023-07-29 20:02:41 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi c4554225bd Update submodules () 2023-07-29 10:18:35 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Silen Naihin ecc386ec7b returning scores () 2023-07-29 11:43:22 +01:00
    Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
Silen Naihin f07e7b60d4 Advanced LLM Evaluation Implementation () 2023-07-29 10:26:19 +01:00
    Co-authored-by: Auto-GPT-Bot <github-bot@agpt.co>
merwanehamadi 80bd0c4260 Fix tests not being run () 2023-07-27 20:50:53 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi 6098b70408 Use beebot autopackai () 2023-07-27 12:21:43 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi 31897e7892 Delete reports () 2023-07-27 11:42:24 -07:00
Silen Naihin 71e0c598d6 forcing AGENT_NAME to be defined from repo 2023-07-27 14:28:11 +01:00
Silen Naihin 0e6be16d07 helicone and llm eval fixes 2023-07-27 14:07:46 +01:00
merwanehamadi eb57b15380 Add dynamic headers using environment variables () 2023-07-26 21:26:03 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
merwanehamadi 5df710fd35 Add helicone dynamic headers () 2023-07-26 16:03:13 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Silen Naihin 66d1fec07e attempting more logs 2023-07-26 23:36:45 +01:00
merwanehamadi 01b118e590 Add llm eval () 2023-07-26 14:00:24 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Silen Naihin 80506e9a3b report # bug, adding submodule challenges () 2023-07-26 13:53:10 +01:00
merwanehamadi a1e02f243c Add safety suite () 2023-07-25 20:13:01 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
Silen Naihin 5e3bbb946f fix suite dependencies () 2023-07-26 01:50:53 +01:00
Silen Naihin b82277515f hotfix reports () 2023-07-25 19:07:24 +01:00
Silen Naihin d9b3d7da37 Safety challenges, adaptability challenges, suite same_task () 2023-07-24 13:57:44 -07:00
Silen Naihin 2b3abeff4e Integrate baby-agi () 2023-07-21 11:15:42 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
    Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
Erik Peterson 5a3b4f3d1d Kill subprocesses when test ends () 2023-07-20 15:41:59 -07:00
    Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
    Co-authored-by: Merwane Hamadi <merwanehamadi@gmail.com>
Silen Naihin 12c5d54583 Fixing memory challenges, naming, testing mini-agi, smooth retrieval scaling () 2023-07-17 19:41:58 -07:00
merwanehamadi 2d8fa5ca6f Use report location () 2023-07-17 20:15:10 -04:00
Silen Naihin 8aa6452cc4 file naming when --test () 2023-07-17 11:24:16 -04:00