AutoGPT/benchmark/agbenchmark/challenges/verticals
Albert Örwall 4ef912d734
fix(benchmark/challenges): Improve spec and eval of TicTacToe challenge
* In challenge specification, specify `subprocess.PIPE` for `stdin` and `stderr` for completeness
* Additional tweak: let Pytest load only the current file when running the test file as a script

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-02-20 11:52:59 +01:00
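
The first bullet of the commit message above (piping `stdin` and `stderr` as well as `stdout`) can be illustrated with a minimal sketch. This is not the challenge's actual specification or evaluation code. `CHILD_CODE` and `run_with_moves` are hypothetical names, and the child process is an assumed stand-in that only echoes the moves it reads.

```python
import subprocess
import sys

# Hypothetical stand-in for a game script: echoes every move it reads on stdin.
CHILD_CODE = "import sys\nfor line in sys.stdin:\n    print('move', line.strip())\n"


def run_with_moves(moves):
    """Start a child process with all three streams piped and feed it moves."""
    process = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdin=subprocess.PIPE,   # lets the caller send moves to the child
        stdout=subprocess.PIPE,  # captures what the child prints
        stderr=subprocess.PIPE,  # captures error output separately
        text=True,
    )
    # communicate() writes the input, closes stdin, and waits for the child to exit.
    stdout, stderr = process.communicate("\n".join(moves))
    return process.returncode, stdout, stderr
```

Piping all three streams explicitly means nothing is inherited from the parent terminal, so the caller can inspect normal output and error output independently.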
code fix(benchmark/challenges): Improve spec and eval of TicTacToe challenge 2024-02-20 11:52:59 +01:00
data case sensitivity, updating challenges 2023-10-20 08:26:29 -07:00
scrape AGBenchmark codebase clean-up (#6650) 2024-01-02 22:23:09 +01:00
synthesize/1_basic_content_gen fix(benchmark): Mock mode, python evals, `--attempts` flag, challenge definitions 2024-02-14 01:05:34 +01:00