Commit Graph

21 Commits (remove-git-from-cli)

Author SHA1 Message Date
Albert Örwall 4ef912d734
fix(benchmark/challenges): Improve spec and eval of TicTacToe challenge
* In challenge specification, specify `subprocess.PIPE` for `stdin` and `stderr` for completeness
* Additional tweak: let Pytest load only the current file when running the test file as a script

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-02-20 11:52:59 +01:00
Reinier van der Leer b106a61352
Clean up & fix GitHub workflows (#6313)
* ci: Mitigate security issues in autogpt-ci.yml

- Remove unnecessary pull_request_target paths and related variables and config
- Set permissions for contents to read only

* ci: Simplify steps in autogpt-ci.yml workflow using GitHub CLI

- Simplify step in 'autogpt-ci.yml' by using GitHub CLI instead of API for adding label and comment functionality
- Replace curl command with 'gh issue edit' to add "behaviour change" label to the pull request
- Replace gh api command with 'gh issue comment' to leave a comment about the changed behavior of AutoGPT in the pull request

* ci: Fix issues in workflows

- Move environment variable definition to top level in benchmark-ci.yml (because the other job also needs it)
- Removed invalid 'branches: [hackathon]' restriction in hackathon.yml workflow
- Removed redundant 'ref' and 'repository' fields in the 'checkout' step of both workflows.

* ci: Delete legacy benchmarks.yml workflow

* ci: Add triggers for CI workflows

- Add triggers to run CI workflows when they are edited.
- Update the paths for the CI workflows in the trigger configuration.

* fix: Fix benchmark lint error

- Removed unnecessary blank lines in report_types.py
- Fixed string quotes in challenge.py to maintain consistency

* fix: Update task description in password generator data.json

- Update task description in `data.json` file for the password generator challenge to clarify the input requirements and error handling.
- This change is made in an attempt to make the Benchmark CI pass.

* fix: Fix PasswordGenerator challenge in CI

- Fix the behavior of the reference password_generator.py to align with the task description
- Use default password length 8 instead of a random length in the generate_password function
- Retrieve the password length from the command line arguments if "--length" is provided, else set it to 8
2023-11-21 10:58:54 +01:00
Silen Naihin 344ef3bf8b fixing password gen and revenue retrieval 2 challenges 2023-10-17 20:28:49 -07:00
Silen Naihin 74ee69daf1
Update data.json 2023-10-14 08:04:37 -07:00
merwanehamadi 93e3ec36ed
Update test.py (#5721) 2023-10-13 06:56:52 -07:00
Albert Örwall 949ab477a8
Correct create_game method definition in the challenge input (#5460)
Co-authored-by: merwanehamadi <merwanehamadi@gmail.com>
2023-10-02 16:07:18 -07:00
merwanehamadi 8252a2fa8f
Correct Battleship Challenge (#5450)
* Update abstract_class.py

* Update abstract_class.py
2023-10-01 13:53:46 -07:00
merwanehamadi 0e804e27dd
Add more data challenges (#5390)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-28 19:30:08 -07:00
merwanehamadi 37fbb52d19
Add more challenges + cleanup (#5368)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-27 17:58:58 -07:00
merwanehamadi a0e383f4d9
Fix skill tree (#5303)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-22 13:09:57 -07:00
merwanehamadi 18e576cb53
Structure challenges (#5296) 2023-09-21 20:06:37 -07:00
merwanehamadi f67a352937
Add categories skill tree (#5295)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-21 17:39:16 -07:00
merwanehamadi f4e7b1c61c
Add eval_id and sync Skill Tree with Frontend(#5287)
Add eval_id to skill tree

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-21 13:36:17 -07:00
merwanehamadi ff4c76ba00
Make agbenchmark a proxy of the evaluated agent (#5279)
Make agbenchmark a Proxy of the evaluated agent

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-20 16:06:00 -07:00
merwanehamadi f4d319cee4
Refactor benchmark (#5247)
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-17 06:55:20 -07:00
merwanehamadi 295702867a
Ability to run by categories (#5229)
* Ability to run by categories

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>

* always use Path.cwd()

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>

---------

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-15 20:04:12 -07:00
Luke d319473e3c
Fix TestUrlShortener to prevent conflicting test.py file and clarify instructions (#5177) 2023-09-13 06:11:40 -07:00
Merwane Hamadi 1b14d304d4 Benchmark changes
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-12 12:13:39 -07:00
SwiftyOS c73e90c4e6 Fixing benchmarks 2023-09-11 17:41:27 -07:00
Merwane Hamadi c9b65353e7 add ci + fix linting 2023-09-05 12:11:00 -07:00
Auto-GPT-Bot 45c15e370f Auto-GPT-20230905085638
Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-05 10:10:03 -07:00