Commit Graph

5432 Commits (9e22409d6614c373c7c28e85eb80e474886c68e2)

Author SHA1 Message Date
Reinier van der Leer 4546dfdf17
feat(ci/arena): Add logging and debug output to workflow script 2024-03-01 11:02:41 +01:00
Reinier van der Leer 4011294da0
ci(arena): Fix `arena-intake` workflow
Sorry folks, it's been a while since I wrote javascript :')
2024-03-01 10:41:34 +01:00
Reinier van der Leer 48f6f83f05
ci(arena): Fix `arena-intake` workflow 2024-03-01 10:35:28 +01:00
Reinier van der Leer 51f5808430
ci: Add 'Arena intake' workflow to automatically check 'entering the arena' PRs 2024-03-01 00:27:10 +01:00
Reinier van der Leer 695049bfa3
ci: Auto-label PRs based on the scope of their diff 2024-02-29 19:38:04 +01:00
Reinier van der Leer 40f98f0f38
chore: Change `agbenchmark` to directory dependency in `autogpt` and `forge` (#6946)
Poetry recently released v1.8.x containing a fix for the issue we were having earlier:
https://github.com/python-poetry/poetry/issues/8548

This means unavailable optional directory dependencies no longer break the docker build.
2024-02-29 19:17:16 +01:00
Reinier van der Leer c26c79c34c
fix(benchmark/reports): Resolve error in format.py on `attempt.cost` is `None` 2024-02-29 19:01:47 +01:00
Krzysztof Czerwinski 2c96f6125f
feat(agent): Catch & disallow duplicate commands in LLM response parser (#6937)
Raise in `parse_and_process_response` if the proposed operation is the same as the last executed one.
2024-02-29 18:51:13 +01:00
Reinier van der Leer 5047fd9fce
lint(agent): Fix linting error in api_manager.py 2024-02-29 18:42:16 +01:00
edwardsp 50e5ea4e54
fix(agent/llm): Fix support for AzureOpenAI (#6927)
* Fix unmasking of `azure_endpoint` in `OpenAICredentials.get_api_access_kwargs()`
* Amend `ApiManager.get_models` to use `AzureOpenAI` client when `api_type` is set to `azure`

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-02-29 18:35:06 +01:00
Reinier van der Leer ce45c9b267
fix(agent/security): Make CORS more restrictive and configurable
* By default, allow requests originating from http://localhost:{AP_SERVER_PORT} instead of all origins
* Allow configuring allowed CORS origins through `AP_SERVER_CORS_ALLOWED_ORIGINS`
2024-02-28 21:14:49 +01:00
Krzysztof Czerwinski 1881f4f7cd
feat(agent): Gracefully handle failure to load non-existing agent (#6938) 2024-02-28 19:18:57 +01:00
Krzysztof Czerwinski 30762c211e
fix(agent/execute_code): Disable code execution commands when Docker is unavailable (#6888) 2024-02-28 19:16:02 +01:00
Elias Hohl 5090f55eba
fix(agent/security): Mitigate shell injection vulnerabilities (#6903)
* Mitigate shell injection in `MacOSTTS._speech` implementation

* Mitigate shell command control bypassing in `execute_shell` and `execute_shell_popen` commands
   - Improve implementation and docstring of `validate_command` function

---------

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-02-27 13:31:23 +01:00
Reinier van der Leer 1f1e8c9f7d
Update CODEOWNERS 2024-02-22 17:26:46 +01:00
Reinier van der Leer e44ca4185a
fix(frontend): Unbreak `ChatInputField`
Fix specification of `onSubmitted` hook in the `TextField`.
2024-02-21 02:09:23 +01:00
Reinier van der Leer 8fd2e48c1b
fix(ci/frontend): Add trigger on `push` including workflow file 2024-02-21 02:04:13 +01:00
Reinier van der Leer 69ccb185e8
fix(ci/frontend): Add and fix trigger on workflow file 2024-02-21 02:02:41 +01:00
Reinier van der Leer a88e833831
ci: Revise Frontend CI
- Rename build-frontend.yml to frontend-ci.yml
- Add a `pull_request` trigger
- Disable committing and pushing to a `frontend_build_{hash}` branch
- (Re)enable auto-creating a pull request for the new frontend build
2024-02-21 02:00:33 +01:00
Reinier van der Leer 64f48df62d
chore(agent/llm): Update model alias `gpt-3.5-turbo` -> `gpt-3.5-turbo-0125` 2024-02-20 17:13:51 +01:00
Reinier van der Leer 0f5490075b
fix(ci/benchmark): Install benchmark dependencies
Otherwise `poetry -C benchmark run benchmark/reports/format.py` fails.
2024-02-20 16:56:47 +01:00
Reinier van der Leer d5f2bbf093
fix(benchmark/reports): Make format.py executable 2024-02-20 14:50:32 +01:00
Reinier van der Leer 7dd97f2f74
fix(agent/browser): Print descriptive error if ChromeDriver install fails
We have been seeing `AttributeError: 'NoneType' object has no attribute 'split'` in Sentry.
According to https://github.com/SergeyPirogov/webdriver_manager/issues/649, this occurs
when Chrome is not installed, or when no suitable ChromeDriver version is available for
the installed version of Chrome.
Instead of the `AttributeError`, we can print a better message for the agent and user to help them fix the issue.

---
Co-authored-by: kcze <kpczerwinski@gmail.com>
2024-02-20 14:04:15 +01:00
Reinier van der Leer 8e464c53a8
fix(agent/llm): Include `id` in tool_calls in prompt
OpenAI requires the `id` property on tool calls in assistant messages.
We weren't storing it in the `AssistantChatMessage` that is created from the LLM's response,
causing an error when adding those messages back to the prompt.

Fix:
* Add `id` to `AssistantToolCall` and `AssistantToolCallDict` types in model_providers/schema.py
* Amend `_tool_calls_compat_extract_calls` to generate an ID for extracted tool calls

---
Co-authored-by: kcze <kpczerwinski@gmail.com>
2024-02-20 13:28:57 +01:00
Reinier van der Leer 7689a51f53
fix(autogpt/llm): Omit `AssistantChatMessage.tool_calls` if no tool calls are present
OpenAI likes neither `tool_calls=[]` nor `tool_calls=None`. If no `tool_calls` are in the message, the key must be omitted.

This partially reverts commit 67bafa6302.

---

Co-authored-by: kcze <kpczerwinski@gmail.com>
2024-02-20 13:04:55 +01:00
Reinier van der Leer c8a40727d1
fix(ci/benchmark): Specify poetry env path for report conversion step 2024-02-20 12:10:49 +01:00
Albert Örwall 4ef912d734
fix(benchmark/challenges): Improve spec and eval of TicTacToe challenge
* In challenge specification, specify `subprocess.PIPE` for `stdin` and `stderr` for completeness
* Additional tweak: let Pytest load only the current file when running the test file as a script

Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-02-20 11:52:59 +01:00
Thunder Drag 49a6d68200
fix(agent/setup): Fix revising constraints and best practices (#6777)
In a `for item in list` loop, removing items from the list while iterating causes it to skip over the next item. Fix: refactor `interactively_revise_ai_settings` routine to use while loop for iterating over constraints, resources, and best practices.

---------

Co-authored-by: Kripanshu Jindal <polaris@Polaris.local>
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
2024-02-20 11:06:30 +01:00
Ethan Presberg 6cfe229332
feat(frontend): Allow sending a message with the enter key (#6378)
This has not yet been tested due to an issue with compiling on WSL. This was the fix suggested by Pwuts.
2024-02-20 10:49:37 +01:00
Reinier van der Leer 1079d71699
fix(ci/benchmark): Unbreak "Push reports to data branch" step
The `report_subfolder` variable was being populated with two identical lines, because there will be two untracked files in the folder, resulting in the same dirname.
This caused later commands using that variable to fail. Fix is to `sort -u` before storing the value to `report_subfolder`.
2024-02-20 10:35:14 +01:00
Reinier van der Leer e104427767
feat(ci/benchmark): Generate step summary from benchmark report 2024-02-19 17:13:41 +01:00
Reinier van der Leer bfd479a50b
feat(benchmark): Add reports/format.py script to convert report.json to markdown 2024-02-19 17:13:05 +01:00
Reinier van der Leer fb63bf4425
chore: Update `agbenchmark` dependency for agent and forge 2024-02-19 17:11:19 +01:00
Reinier van der Leer 3a17011129
feat(benchmark): Include Steps in Report 2024-02-19 17:08:24 +01:00
Reinier van der Leer c339c6b54f
chore: Update `agbenchmark` dependency for agent and forge 2024-02-18 17:37:03 +01:00
Reinier van der Leer 7f71d6d9fd
debug(benchmark): Improve `TestResult` validation error output format 2024-02-18 17:10:14 +01:00
Reinier van der Leer 784e2bbb1c
fix(ci/benchmark): Mitigate VCS conflicts with files in data branch
`agbenchmark` currently creates files like success_rate.json in the base REPORTS_FOLDER, which causes conflicts in the last step of the benchmark workflow.
To prevent issues, these files must be removed prior to switching to the data branch.
2024-02-17 18:09:44 +01:00
Reinier van der Leer 959377f54c
fix(ci/benchmark): Add `set +e` because we expect (some) challenges to fail 2024-02-17 15:56:55 +01:00
Reinier van der Leer 6bc83e925c
chore: Update `agbenchmark` dependency for agent and forge 2024-02-17 15:56:33 +01:00
Reinier van der Leer 4ede773f5a
debug(benchmark): Add more debug code to pinpoint cause of rare crash
Target: https://github.com/Significant-Gravitas/AutoGPT/actions/runs/7941977633/job/21684817491
2024-02-17 15:48:57 +01:00
Reinier van der Leer d5ad719757
ci: Allow telemetry for non-push events, as long as it's on `master`
Also disable telemetry for AutoGPT's unit/integration tests.
2024-02-17 15:12:43 +01:00
Reinier van der Leer 1ca9b9fa93
ci: Fix setting/passing `TELEMETRY_*` environment variables 2024-02-17 14:26:03 +01:00
Reinier van der Leer 15024fb5a1
chore: Update `agbenchmark` dependency for agent and forge 2024-02-17 14:18:02 +01:00
Reinier van der Leer fa4bdef17c
ci: Update actions to newest versions
- `actions/stale` -> `v9`
- `actions/cache` -> `v4`
- `actions/checkout` -> `v4`
- `actions/setup-node` -> `v4`
- `docker/login-action` -> `v3`
- `actions/setup-python` -> `v5`
- `codecov/codecov-action` -> `v4`
- `actions/upload-artifact` -> `v4`
- `subosito/flutter-action` -> `v2`
- `docker/build-push-action` -> `v5`
- `docker/setup-buildx-action` -> `v3`
2024-02-17 13:59:13 +01:00
Reinier van der Leer e2b519ef3b
debug(benchmark): Make sure `TestResult` validator error output is sufficient to debug 2024-02-17 13:36:17 +01:00
Reinier van der Leer 09c307d679
debug(benchmark): Add log statement to validator on `TestResult`
Validation errors don't mention the values causing the error, making it hard to debug. This happened a few times in autogpts-benchmark.yml, so let's put this log statement here until we figure out what makes it crash.
2024-02-17 13:32:22 +01:00
Reinier van der Leer 880c8e804c
fix(ci/benchmark): Allow workflow to continue regardless of challenge outcomes 2024-02-17 11:52:26 +01:00
Reinier van der Leer 5f0764b65c
chore: Update agbenchmark dependency for agent and forge 2024-02-16 19:07:37 +01:00
Reinier van der Leer 63e6014b27
fix(benchmark): Fix `TestResult.fail_reason` assignment condition
The condition must be the same as for `success`, because otherwise it causes a crash when `call.excinfo` evaluates to `False` but is not `None`.
2024-02-16 19:05:00 +01:00
Reinier van der Leer 83fcd9ad16
chore: Update `agbenchmark` dependency for agent and forge 2024-02-16 18:44:58 +01:00