AutoGPT

Commit Graph

Author	SHA1	Message	Date
Nicholas Tindle	785a40ff9d	feat(server, autogpt): Add Example files and update build option (#7271)	2024-06-27 09:56:21 -05:00
Nicholas Tindle	6ec708c771	ci(server): Agent Server CI (#7193)	2024-06-12 00:29:23 +07:00
Reinier van der Leer	f107ff8cf0	Set up unified pre-commit + CI w/ linting + type checking & FIX EVERYTHING (#7171) - FIX ALL LINT/TYPE ERRORS IN AUTOGPT, FORGE, AND BENCHMARK ### Linting - Clean up linter configs for `autogpt`, `forge`, and `benchmark` - Add type checking with Pyright - Create unified pre-commit config - Create unified linting and type checking CI workflow ### Testing - Synchronize CI test setups for `autogpt`, `forge`, and `benchmark` - Add missing pytest-cov to benchmark dependencies - Mark GCS tests as slow to speed up pre-commit test runs - Repair `forge` test suite - Add `AgentDB.close()` method for test DB teardown in db_test.py - Use actual temporary dir instead of forge/test_workspace/ - Move left-behind dependencies for moved `forge`-code from autogpt to forge ### Notable type changes - Replace uses of `ChatModelProvider` by `MultiProvider` - Removed unnecessary exports from various __init__.py - Simplify `FileStorage.open_file` signature by removing `IOBase` from return type union - Implement `S3BinaryIOWrapper(BinaryIO)` type interposer for `S3FileStorage` - Expand overloads of `GCSFileStorage.open_file` for improved typing of read and write modes Had to silence type checking for the extra overloads, because (I think) Pyright is reporting a false-positive: https://github.com/microsoft/pyright/issues/8007 - Change `count_tokens`, `get_tokenizer`, `count_message_tokens` methods on `ModelProvider`s from class methods to instance methods - Move `CompletionModelFunction.schema` method -> helper function `format_function_def_for_openai` in `forge.llm.providers.openai` - Rename `ModelProvider` -> `BaseModelProvider` - Rename `ChatModelProvider` -> `BaseChatModelProvider` - Add type `ChatModelProvider` which is a union of all subclasses of `BaseChatModelProvider` ### Removed rather than fixed - Remove deprecated and broken autogpt/agbenchmark_config/benchmarks.py - Various base classes and properties on base classes in `forge.llm.providers.schema` and `forge.models.providers` ### Fixes for other issues that came to light - Clean up `forge.agent_protocol.api_router`, `forge.agent_protocol.database`, and `forge.agent.agent` - Add fallback behavior to `ImageGeneratorComponent` - Remove test for deprecated failure behavior - Fix `agbenchmark.challenges.builtin` challenge exclusion mechanism on Windows - Fix `_tool_calls_compat_extract_calls` in `forge.llm.providers.openai` - Add support for `any` (= no type specified) in `JSONSchema.typescript_type`	2024-05-28 05:04:21 +02:00
Krzysztof Czerwinski	cdae98d36b	fix(ci): Fix cli and CI (#7166) - Add a special case for cli to handle autogpt and forge agent - Remove forge agent from smoke test ci	2024-05-22 17:18:00 +01:00
Reinier van der Leer	5292736779	fix(agent): Unbreak docker builds after repo restructure (#7164) - Move `autogpt/Dockerfile` to `Dockerfile.autogpt` - Write new selective `.dockerignore` (in repo root) to keep build context clean - Amend `autogpt/docker-compose.yml` and all `autogpt-docker-*.yml` workflows accordingly - Include `forge/` in docker build context so it can be used as a path dependency - Include `frontend/` in docker builds	2024-05-22 18:11:16 +02:00
Krzysztof Czerwinski	4c325724ec	refactor(autogpt, forge): Remove `autogpts` directory (#7163) - Moved `autogpt` and `forge` to project root - Removed `autogpts` directory - Moved and renamed submodule `autogpts/autogpt/tests/vcr_cassettes` to `autogpt/tests/vcr_cassettes` - When using CLI agents will be created in `agents` directory (instead of `autogpts`) - Renamed relevant docs, code and config references from `autogpts/[forge|autogpt]` to `[forge|autogpt]` and from `../../` to `../` - Updated `CODEOWNERS`, GitHub Actions and Docker `*.yml` configs - Updated symbolic links in `docs`	2024-05-22 13:08:54 +01:00
Swifty	34fdbaa26b	Remove arena (#7134) * remove arena * refactor: Remove Arena intake workflow * Remove all mention of the arena * remove evo.ninja	2024-05-09 11:36:40 +02:00
Reinier van der Leer	90f3c5e2d9	fix(ci): Disable annoying "PR too big" auto-message	2024-04-10 17:49:31 +02:00
Reinier van der Leer	30bc761391	ci(agent): Add macOS on M1 to AutoGPT CI matrix (#7041) Use a `macos-14` runner to cover macOS on M1/arm64 - Add `macos-arm64` to `platform-os` matrix, and map it to `macos-14` runner	2024-03-22 14:26:16 +01:00
Reinier van der Leer	2a0e087461	ci(agent): Disable Python dependency caching on Windows On Windows, unpacking cached dependencies takes longer than just installing them with `poetry install`. :')	2024-03-22 14:15:43 +01:00
Reinier van der Leer	828b81e5ef	ci(agent): Fix Python dependency caching on macOS	2024-03-22 14:13:22 +01:00
Reinier van der Leer	20041d65bf	ci(agent): Fix Docker CI for PR runs from forks (vol. 2) - Fix docker image tag format error when `secrets.DOCKER_USER` is not set	2024-03-22 12:57:29 +01:00
Reinier van der Leer	9e39937072	ci(agent): Fix Docker CI for PR runs from forks - Disable 'Log in to Docker hub' step for `pull_request` runs	2024-03-22 12:50:58 +01:00
Reinier van der Leer	dde0c70a81	ci(agent): Matrix CI tests across Linux, macOS and Windows (#7029) * Matrix the AutoGPT Python CI's `test` job across Ubuntu, macOS and Windows - Set up MinIO in a step rather than specifying it under `jobs[test].services`, because services are only supported on Linux runners - Add Windows version of step to install Poetry - Add macOS compatibility patches to 'Install Poetry (Unix)' and `setup_git_auth` steps Caveats: - No Docker on macOS or Windows * Windows comes with Docker but only supports running Windows containers, while we're mainly interested in using Linux containers for code execution and/or running auxiliary services. * [The macOS runner doesn't come with Docker](https://github.com/actions/runner-images/issues/17). Setting it up is possible but takes ~3-4 minutes, and the performance of the Colima engine is poor: a `docker pull` that takes 2 seconds on Linux takes 45 seconds on macOS. - No S3 service available on Windows It seems that running a background process [isn't possible on Windows](https://github.com/actions/runner/issues/598#issuecomment-2011890429), and neither is running Linux-based Docker containers. * Add `autogpt-agent` and OS-specific flags to Codecov upload step * Improve caching of Python dependencies in CI by changing the cache key - Include hash of `poetry.lock` instead of `pyproject.toml` in key - Remove date component from key; it was included to avoid getting stuck to old cached versions of packages when we were still using `requirements.txt`. With `poetry.lock` that is no longer a concern. * Fix skip check in test_s3_file_storage.py	2024-03-21 21:15:46 +01:00
Reinier van der Leer	29d390d54d	ci: Disable annoying auto-message discouraging big PRs	2024-03-01 12:28:00 +01:00
Reinier van der Leer	b69f0b2cd0	fix(ci/arena): Fix requesting manual review Three times the charm, right?	2024-03-01 12:22:05 +01:00
Reinier van der Leer	0308fb45be	fix(ci/arena): Fix requesting manual review	2024-03-01 11:44:27 +01:00
Reinier van der Leer	0325370fed	fix(ci/arena): Fix requesting manual review	2024-03-01 11:41:49 +01:00
Reinier van der Leer	1e4bd0388f	fix(ci/arena): Skip checking file against itself for duplicates	2024-03-01 11:34:47 +01:00
Reinier van der Leer	d1b06f0be3	fix(ci/arena): Improve output format	2024-03-01 11:27:26 +01:00
Reinier van der Leer	3e40b35ef1	fix(ci/arena): Reverse check for `pr.mergeable`	2024-03-01 11:23:14 +01:00
Reinier van der Leer	70873906b7	fix(ci/arena): Make check for `pr.mergeable` more specific	2024-03-01 11:20:54 +01:00
Reinier van der Leer	f93a8a93b4	fix(ci/arena): Fix error accessing `context` & improve log output readability	2024-03-01 11:19:31 +01:00
Reinier van der Leer	4121d3712d	fix(ci/arena): Fix syntax & formatting errors	2024-03-01 11:07:54 +01:00
Reinier van der Leer	4546dfdf17	feat(ci/arena): Add logging and debug output to workflow script	2024-03-01 11:02:41 +01:00
Reinier van der Leer	4011294da0	ci(arena): Fix `arena-intake` workflow Sorry folks, it's been a while since I wrote javascript :')	2024-03-01 10:41:34 +01:00
Reinier van der Leer	48f6f83f05	ci(arena): Fix `arena-intake` workflow	2024-03-01 10:35:28 +01:00
Reinier van der Leer	51f5808430	ci: Add 'Arena intake' workflow to automatically check 'entering the arena' PRs	2024-03-01 00:27:10 +01:00
Reinier van der Leer	695049bfa3	ci: Auto-label PRs based on the scope of their diff	2024-02-29 19:38:04 +01:00
Reinier van der Leer	8fd2e48c1b	fix(ci/frontend): Add trigger on `push` including workflow file	2024-02-21 02:04:13 +01:00
Reinier van der Leer	69ccb185e8	fix(ci/frontend): Add and fix trigger on workflow file	2024-02-21 02:02:41 +01:00
Reinier van der Leer	a88e833831	ci: Revise Frontend CI - Rename build-frontend.yml to frontend-ci.yml - Add a `pull_request` trigger - Disable committing and pushing to a `frontend_build_{hash}` branch - (Re)enable auto-creating a pull request for the new frontend build	2024-02-21 02:00:33 +01:00
Reinier van der Leer	0f5490075b	fix(ci/benchmark): Install benchmark dependencies Otherwise `poetry -C benchmark run benchmark/reports/format.py` fails.	2024-02-20 16:56:47 +01:00
Reinier van der Leer	c8a40727d1	fix(ci/benchmark): Specify poetry env path for report conversion step	2024-02-20 12:10:49 +01:00
Reinier van der Leer	1079d71699	fix(ci/benchmark): Unbreak "Push reports to data branch" step The `report_subfolder` variable was being populated with two identical lines, because there will be two untracked files in the folder, resulting in the same dirname. This caused later commands using that variable to fail. Fix is to `sort -u` before storing the value to `report_subfolder`.	2024-02-20 10:35:14 +01:00
Reinier van der Leer	e104427767	feat(ci/benchmark): Generate step summary from benchmark report	2024-02-19 17:13:41 +01:00
Reinier van der Leer	784e2bbb1c	fix(ci/benchmark): Mitigate VCS conflicts with files in data branch `agbenchmark` currently creates files like success_rate.json in the base REPORTS_FOLDER, which causes conflicts in the last step of the benchmark workflow. To prevent issues, these files must be removed prior to switching to the data branch.	2024-02-17 18:09:44 +01:00
Reinier van der Leer	959377f54c	fix(ci/benchmark): Add `set +e` because we expect (some) challenges to fail	2024-02-17 15:56:55 +01:00
Reinier van der Leer	d5ad719757	ci: Allow telemetry for non-push events, as long as it's on `master` Also disable telemetry for AutoGPT's unit/integration tests.	2024-02-17 15:12:43 +01:00
Reinier van der Leer	1ca9b9fa93	ci: Fix setting/passing `TELEMETRY_*` environment variables	2024-02-17 14:26:03 +01:00
Reinier van der Leer	fa4bdef17c	ci: Update actions to newest versions - `actions/stale` -> `v9` - `actions/cache` -> `v4` - `actions/checkout` -> `v4` - `actions/setup-node` -> `v4` - `docker/login-action` -> `v3` - `actions/setup-python` -> `v5` - `codecov/codecov-action` -> `v4` - `actions/upload-artifact` -> `v4` - `subosito/flutter-action` -> `v2` - `docker/build-push-action` -> `v5` - `docker/setup-buildx-action` -> `v3`	2024-02-17 13:59:13 +01:00
Reinier van der Leer	880c8e804c	fix(ci/benchmark): Allow workflow to continue regardless of challenge outcomes	2024-02-17 11:52:26 +01:00
Reinier van der Leer	d6ab470c58	Rename autogpts-benchmark-nightly.yml to autogpts-benchmark.yml	2024-02-16 18:32:50 +01:00
Reinier van der Leer	a5de79beb6	ci(benchmark): Add nightly benchmark workflow Added autogpts-benchmark-nightly.yml, which will run every night at 02:00 UTC with a selection of challenges.	2024-02-16 17:41:58 +01:00
Reinier van der Leer	679339d00c	feat(benchmark): Make report output folder configurable - Make `AgentBenchmarkConfig.reports_folder` directly configurable (through `REPORTS_FOLDER` env variable). The default is still `./agbenchmark_config/reports`. - Change all mentions of `REPORT_LOCATION` (which fulfilled the same function at some point in the past) to `REPORTS_FOLDER`.	2024-02-15 18:07:45 +01:00
Reinier van der Leer	6017eefb32	ci: Enable telemetry in CI runs on `master`	2024-02-14 12:03:54 +01:00
Reinier van der Leer	88bbdfc7fc	ci: Pick 3 challenges to run with `--mock` in smoke test CI	2024-02-14 02:30:03 +01:00
Reinier van der Leer	14c9773890	ci(agent): Add `GIT_REVISION` label to Docker builds	2024-02-12 12:31:04 +01:00
Reinier van der Leer	25cc6ad6ae	AGBenchmark codebase clean-up (#6650) * refactor(benchmark): Deduplicate configuration loading logic - Move the configuration loading logic to a separate `load_agbenchmark_config` function in `agbenchmark/config.py` module. - Replace the duplicate loading logic in `conftest.py`, `generate_test.py`, `ReportManager.py`, `reports.py`, and `__main__.py` with calls to `load_agbenchmark_config` function. * fix(benchmark): Fix type errors, linting errors, and clean up CLI validation in __main__.py - Fixed type errors and linting errors in `__main__.py` - Improved the readability of CLI argument validation by introducing a separate function for it * refactor(benchmark): Lint and typefix app.py - Rearranged and cleaned up import statements - Fixed type errors caused by improper use of `psutil` objects - Simplified a number of `os.path` usages by converting to `pathlib` - Use `Task` and `TaskRequestBody` classes from `agent_protocol_client` instead of `.schema` * refactor(benchmark): Replace `.agent_protocol_client` by `agent-protocol-client`, clean up schema.py - Remove `agbenchmark.agent_protocol_client` (an offline copy of `agent-protocol-client`). - Add `agent-protocol-client` as a dependency and change imports to `agent_protocol_client`. - Fix type annotation on `agent_api_interface.py::upload_artifacts` (`ApiClient` -> `AgentApi`). - Remove all unused types from schema.py (= most of them). * refactor(benchmark): Use pathlib in agent_interface.py and agent_api_interface.py * refactor(benchmark): Improve typing, response validation, and readability in app.py - Simplified response generation by leveraging type checking and conversion by FastAPI. - Introduced use of `HTTPException` for error responses. - Improved naming, formatting, and typing in `app.py::create_evaluation`. - Updated the docstring on `app.py::create_agent_task`. - Fixed return type annotations of `create_single_test` and `create_challenge` in generate_test.py. - Added default values to optional attributes on models in report_types_v2.py. - Removed unused imports in `generate_test.py` * refactor(benchmark): Clean up logging and print statements - Introduced use of the `logging` library for unified logging and better readability. - Converted most print statements to use `logger.debug`, `logger.warning`, and `logger.error`. - Improved descriptiveness of log statements. - Removed unnecessary print statements. - Added log statements to unspecific and non-verbose `except` blocks. - Added `--debug` flag, which sets the log level to `DEBUG` and enables a more comprehensive log format. - Added `.utils.logging` module with `configure_logging` function to easily configure the logging library. - Converted raw escape sequences in `.utils.challenge` to use `colorama`. - Renamed `generate_test.py::generate_tests` to `load_challenges`. * refactor(benchmark): Remove unused server.py and agent_interface.py::run_agent - Remove unused server.py file - Remove unused run_agent function from agent_interface.py * refactor(benchmark): Clean up conftest.py - Fix and add type annotations - Rewrite docstrings - Disable or remove unused code - Fix definition of arguments and their types in `pytest_addoption` * refactor(benchmark): Clean up generate_test.py file - Refactored the `create_single_test` function for clarity and readability - Removed unused variables - Made creation of `Challenge` subclasses more straightforward - Made bare `except` more specific - Renamed `Challenge.setup_challenge` method to `run_challenge` - Updated type hints and annotations - Made minor code/readability improvements in `load_challenges` - Added a helper function `_add_challenge_to_module` for attaching a Challenge class to the current module * fix(benchmark): Fix and add type annotations in execute_sub_process.py * refactor(benchmark): Simplify const determination in agent_interface.py - Simplify the logic that determines the value of `HELICONE_GRAPHQL_LOGS` * fix(benchmark): Register category markers to prevent warnings - Use the `pytest_configure` hook to register the known challenge categories as markers. Otherwise, Pytest will raise "unknown marker" warnings at runtime. * refactor(benchmark/challenges): Fix indentation in 4_revenue_retrieval_2/data.json * refactor(benchmark): Update agent_api_interface.py - Add type annotations to `copy_agent_artifacts_into_temp_folder` function - Add note about broken endpoint in the `agent_protocol_client` library - Remove unused variable in `run_api_agent` function - Improve readability and resolve linting error * feat(benchmark): Improve and centralize pathfinding - Search path hierarchy for applicable `agbenchmark_config`, rather than assuming it's in the current folder. - Create `agbenchmark.utils.path_manager` with `AGBenchmarkPathManager` and exporting a `PATH_MANAGER` const. - Replace path constants defined in __main__.py with usages of `PATH_MANAGER`. * feat(benchmark/cli): Clean up and improve CLI - Updated commands, options, and their descriptions to be more intuitive and consistent - Moved slow imports into the entrypoints that use them to speed up application startup - Fixed type hints to match output types of Click options - Hid deprecated `agbenchmark start` command - Refactored code to improve readability and maintainability - Moved main entrypoint into `run` subcommand - Fixed `version` and `serve` subcommands - Added `click-default-group` package to allow using `run` implicitly (for backwards compatibility) - Renamed `--no_dep` to `--no-dep` for consistency - Fixed string formatting issues in log statements * refactor(benchmark/config): Move AgentBenchmarkConfig and related functions to config.py - Move the `AgentBenchmarkConfig` class from `utils/data_types.py` to `config.py`. - Extract the `calculate_info_test_path` function from `utils/data_types.py` and move it to `config.py` as a private helper function `_calculate_info_test_path`. - Move `load_agent_benchmark_config()` to `AgentBenchmarkConfig.load()`. - Changed simple getter methods on `AgentBenchmarkConfig` to calculated properties. - Update all code references according to the changes mentioned above. * refactor(benchmark): Fix ReportManager init parameter types and use pathlib - Fix the type annotation of the `benchmark_start_time` parameter in `ReportManager.__init__`, was mistyped as `str` instead of `datetime`. - Change the type of the `filename` parameter in the `ReportManager.__init__` method from `str` to `Path`. - Rename `self.filename` with `self.report_file` in `ReportManager`. - Change the way the report file is created, opened and saved to use the `Path` object. * refactor(benchmark): Improve typing surrounding ChallengeData and clean up its implementation - Use `ChallengeData` objects instead of untyped `dict` in app.py, generate_test.py, reports.py. - Remove unnecessary methods `serialize`, `get_data`, `get_json_from_path`, `deserialize` from `ChallengeData` class. - Remove unused methods `challenge_from_datum` and `challenge_from_test_data` from `ChallengeData` class. - Update function signatures and annotations of `create_challenge` and `generate_single_test` functions in generate_test.py. - Add types to function signatures of `generate_single_call_report` and `finalize_reports` in reports.py. - Remove unnecessary `challenge_data` parameter (in generate_test.py) and fixture (in conftest.py). * refactor(benchmark): Clean up generate_test.py, conftest.py and __main__.py - Cleaned up generate_test.py and conftest.py - Consolidated challenge creation logic in the `Challenge` class itself, most notably the new `Challenge.from_challenge_spec` method. - Moved challenge selection logic from generate_test.py to the `pytest_collection_modifyitems` hook in conftest.py. - Converted methods in the `Challenge` class to class methods where appropriate. - Improved argument handling in the `run_benchmark` function in `__main__.py`. * refactor(benchmark/config): Merge AGBenchmarkPathManager into AgentBenchmarkConfig and reduce fragmented/global state - Merge the functionality of `AGBenchmarkPathManager` into `AgentBenchmarkConfig` to consolidate the configuration management. - Remove the `.path_manager` module containing `AGBenchmarkPathManager`. - Pass the `AgentBenchmarkConfig` and its attributes through function arguments to reduce global state and improve code clarity. * feat(benchmark/serve): Configurable port for `serve` subcommand - Added `--port` option to `serve` subcommand to allow for specifying the port to run the API on. - If no `--port` option is provided, the port will default to the value specified in the `PORT` environment variable, or 8080 if not set. * feat(benchmark/cli): Add `config` subcommand - Added a new subcommand `config` to the AGBenchmark CLI, to display information about the present AGBenchmark config. * fix(benchmark): Gracefully handle incompatible challenge spec files in app.py - Added a check to skip deprecated challenges - Added logging to allow debugging of the loading process - Added handling of validation errors when parsing challenge spec files - Added missing `spec_file` attribute to `ChallengeData` * refactor(benchmark): Move `run_benchmark` entrypoint to main.py, use it in `/reports` endpoint - Move `run_benchmark` and `validate_args` from __main__.py to main.py - Replace agbenchmark subprocess in `app.py:run_single_test` with `run_benchmark` - Move `get_unique_categories` from __main__.py to challenges/__init__.py - Move `OPTIONAL_CATEGORIES` from __main__.py to challenge.py - Reduce operations on updates.json (including `initialize_updates_file`) outside of API * refactor(benchmark): Remove unused `/updates` endpoint and all related code - Remove `updates_json_file` attribute from `AgentBenchmarkConfig` - Remove `get_updates` and `_initialize_updates_file` in app.py - Remove `append_updates_file` and `create_update_json` functions in agent_api_interface.py - Remove call to `append_updates_file` in challenge.py * refactor(benchmark/config): Clean up and update docstrings on `AgentBenchmarkConfig` - Add and update docstrings - Change base class from `BaseModel` to `BaseSettings`, allow extras for backwards compatibility - Make naming of path attributes on `AgentBenchmarkConfig` more consistent - Remove unused `agent_home_directory` attribute - Remove unused `workspace` attribute * fix(benchmark): Restore mechanism to select (optional) categories in agent benchmark config * fix(benchmark): Update agent-protocol-client to v1.1.0 - Fixes issue with fetching task artifact listings	2024-01-02 22:23:09 +01:00
Reinier van der Leer	1bed3c6056	ci: Fix docker release workflow - Update autogpt-docker-release.yml to correctly sanitize image tags - This unbreaks the release workflow	2023-12-13 21:41:03 +01:00


126 Commits (785a40ff9defd01be0b57fc797ec940627f00c46)