- Added `task_cumulative_cost` and `task_total_cost` attributes to `Step.additional_output` in the `AgentProtocolServer.execute_step` endpoint (sketched below)
- Updated `agbenchmark` dependency in Agent and Forge
- Added `n_steps` attribute to `TestResult` type
- Added logic to `BuiltinChallenge.test_method`, `WebArenaChallenge.test_method`, and `.reports.add_test_result_to_report` to record the number of steps
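A minimal sketch of what attaching these figures can look like; `attach_cost_info` and the cost values are hypothetical, only the two output keys come from the change above:

```python
# Hypothetical helper: merge cost figures into a step's additional_output
# before it is returned from the execute_step endpoint.
def attach_cost_info(step, cumulative_cost: float, total_cost: float):
    step.additional_output = {
        **(step.additional_output or {}),
        "task_cumulative_cost": cumulative_cost,  # LLM spend on the task so far
        "task_total_cost": total_cost,            # final total for the task
    }
    return step
```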
- Add `challenge list` command with options `--all`, `--names`, `--json`
- Add `tabulate` dependency
- Add `.utils.utils.sorted_by_enum_index` function to easily sort lists by an enum value/property, following the order of the enum's definition (see the sketch after this list)
- Add `challenge info [name]` command with option `--json`
- Add `.utils.utils.pretty_print_model` routine to pretty-print Pydantic models
- Refactor `config` subcommand to use `pretty_print_model`
- Reduce duplicate and nested statements
- Add `skip_unavailable` parameter
Related changes:
- Add `available` and `unavailable_reason` attributes to `ChallengeInfo` and `WebArenaChallengeSpec`
- Add `pytest.skip` statement to `WebArenaChallenge.test_method` to make sure unavailable challenges are not run
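A minimal sketch of what `sorted_by_enum_index` can look like; the real signature in `.utils.utils` may differ:

```python
from enum import Enum
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")

def sorted_by_enum_index(
    sortable: Iterable[T],
    enum: type[Enum],
    *,
    key: Callable[[T], Enum | None] = lambda x: x,  # by default, items are enum members
    reverse: bool = False,
) -> list[T]:
    """Sort `sortable` by the definition order of `enum`; unknown values go last."""
    order = {member: index for index, member in enumerate(enum)}
    return sorted(
        sortable,
        key=lambda item: order.get(key(item), len(order)),
        reverse=reverse,
    )


# Example: list challenges in the order their Category enum is declared
class Category(Enum):
    CODING = "coding"
    SCRAPE_SYNTHESIZE = "scrape_synthesize"
    WEB = "web"

challenges = [{"name": "a", "category": Category.WEB}, {"name": "b", "category": Category.CODING}]
challenges = sorted_by_enum_index(challenges, Category, key=lambda c: c["category"])
```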
- Make `AgentBenchmarkConfig.reports_folder` directly configurable (through `REPORTS_FOLDER` env variable). The default is still `./agbenchmark_config/reports`.
- Change all mentions of `REPORT_LOCATION` (which previously served the same purpose) to `REPORTS_FOLDER`.
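For illustration only, assuming a Pydantic v1 `BaseSettings`-style config (the project's actual class may be wired differently):

```python
from pathlib import Path
from pydantic import BaseSettings, Field

class AgentBenchmarkConfig(BaseSettings):
    # REPORTS_FOLDER env var overrides the location; otherwise the old default applies
    reports_folder: Path = Field(
        default=Path("./agbenchmark_config/reports"), env="REPORTS_FOLDER"
    )
```

With this, setting `REPORTS_FOLDER=./my_reports` in the environment redirects report output without editing any config file.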
- Added a helper function `.app.utils.vcs_state_diverges_from_master()`. This function determines whether the relevant part of the codebase diverges from our `master`.
- Updated `.app.telemetry._setup_sentry()` to determine the default environment name using `vcs_state_diverges_from_master`.
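A rough sketch of what such a check can boil down to; the real helper may compare against the upstream remote, and the `autogpt/` path filter is a placeholder:

```python
import subprocess

def vcs_state_diverges_from_master(paths: tuple[str, ...] = ("autogpt/",)) -> bool:
    """Sketch: True if the checked-out code under `paths` differs from `master`,
    whether through commits or uncommitted changes."""
    # `git diff --quiet` exits with 0 when there is no diff, non-zero otherwise
    result = subprocess.run(
        ["git", "diff", "--quiet", "master", "--", *paths], check=False
    )
    return result.returncode != 0
```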
- Added a helper function `wait_until_conn_ready(port)` to wait for the benchmark and agent applications to finish starting
- Improved the CLI's own logging (within the `agent start` command)
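The connection-readiness helper can be as simple as a TCP polling loop; a sketch, assuming the actual function behaves similarly:

```python
import socket
import time

def wait_until_conn_ready(port: int, timeout: float = 30.0) -> None:
    """Block until something accepts a TCP connection on localhost:`port`, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection(("localhost", port), timeout=1):
                return  # the application is up
        except OSError:
            time.sleep(0.5)  # not listening yet; retry
    raise TimeoutError(f"Nothing is listening on port {port} after {timeout} seconds")
```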
- Fixed `--mock` mode
- Moved the interrupt to the beginning of the step iterator pipeline (from `BuiltinChallenge` to `agent_api_interface.py:run_api_agent`). This ensures that any finish-up code still runs after a single step is executed.
- Implemented mock mode in `WebArenaChallenge`
- Fixed `fixture 'i_attempt' not found` error when `--attempts`/`-N` is omitted
- Fixed handling of `python`/`pytest` evals in `BuiltinChallenge`
- Disabled left-over Helicone code (see 056163e)
- Fixed a couple of challenge definitions
- WebArena task 107: fix spelling of months (Sepetember, Octorbor *lmao*)
- synthesize/1_basic_content_gen (SynthesizeInfo): remove empty string from `should_contain` list
- Added some debug logging in agent_api_interface.py and challenges/builtin.py
OpenAI's newest models return JSON with markdown fences around it, breaking the `json.loads` parser.
This commit adds an `extract_list_from_response` function to json_utils/utilities.py and uses this function to replace `json.loads` in `_process_text`.
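In essence, the new parser strips any surrounding markdown fence before handing the payload to `json.loads`; a rough sketch (details of the actual function may differ):

```python
import json
import re

def extract_list_from_response(response_content: str) -> list:
    """Parse a JSON list even when the model wraps it in markdown code fences."""
    fenced = re.search(r"```(?:json)?\s*(.+?)\s*```", response_content, re.DOTALL)
    if fenced:
        response_content = fenced.group(1)
    result = json.loads(response_content)
    if not isinstance(result, list):
        raise ValueError(f"Response does not contain a JSON list: {result!r}")
    return result
```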
* Add Sentry integration for telemetry
- Add `sentry_sdk` dependency
- Add setup logic and config flow using `TELEMETRY_OPT_IN` environment variable
- Add app/telemetry.py with `setup_telemetry` helper routine (sketched after this list)
- Call `setup_telemetry` in `cli()` in app/cli.py
- Add `TELEMETRY_OPT_IN` to .env.template
- Add helper function `env_file_exists` and routine `set_env_config_value` to app/utils.py
- Add unit tests for `set_env_config_value` in test_utils.py
- Add a prompt at startup that asks whether the user wants to enable telemetry, if the env variable isn't set
* Add `capture_exception` statements for LLM parsing errors and command failures
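Putting the pieces above together, the opt-in flow is roughly as follows; the Sentry DSN is a placeholder, the prompt wording is invented, and the helper import path is assumed from the entries above:

```python
import os

import sentry_sdk

from autogpt.app.utils import (  # helpers mentioned above; import path assumed
    env_file_exists,
    set_env_config_value,
    vcs_state_diverges_from_master,
)

def setup_telemetry() -> None:
    opt_in = os.getenv("TELEMETRY_OPT_IN", "")
    if not opt_in and env_file_exists():
        # First run: ask once and persist the answer to .env
        answer = input("Help improve AutoGPT by sharing error telemetry? [y/N] ")
        opt_in = "true" if answer.strip().lower() in ("y", "yes") else "false"
        set_env_config_value("TELEMETRY_OPT_IN", opt_in)

    if opt_in.lower() == "true":
        sentry_sdk.init(
            dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
            environment="dev" if vcs_state_diverges_from_master() else "production",
        )
```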
- Change default `SMART_LLM` from `gpt-4` to `gpt-4-turbo-preview`
- Change default `FAST_LLM` from `gpt-3.5-turbo-16k` to `gpt-3.5-turbo-0125`
- Change default `EMBEDDING_MODEL` from `text-embedding-ada-002` to `text-embedding-3-small`
- Update .env.template, azure.yaml.template, and documentation accordingly
- Add `text-embedding-3-small` and `text-embedding-3-large` as `EMBEDDING_v3_S` and `EMBEDDING_v3_L` respectively
- Add `gpt-3.5-turbo-0125` as `GPT3_v4`
- Add `gpt-4-1106-vision-preview` as `GPT4_v3_VISION`
- Add GPT-4V models to info map
- Change chat model info mapping to derive info for aliases (e.g. `gpt-3.5-turbo`) from specific versions instead of the other way around
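The last bullet can be illustrated with a simplified sketch; the class and figures below are stand-ins, not the project's actual model registry:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ChatModelInfo:  # simplified stand-in for the real info class
    name: str
    max_tokens: int
    prompt_token_cost: float      # $ per token (approximate figures)
    completion_token_cost: float

# Info is defined once for the specific, dated snapshots...
CHAT_MODELS = {
    m.name: m
    for m in [
        ChatModelInfo("gpt-3.5-turbo-0125", 16_385, 0.5 / 1_000_000, 1.5 / 1_000_000),
        ChatModelInfo("gpt-4-turbo-preview", 128_000, 10 / 1_000_000, 30 / 1_000_000),
    ]
}

# ...and the aliases are derived from whichever snapshot they currently point to,
# instead of the snapshots being copied from the alias.
ALIASES = {
    "gpt-3.5-turbo": "gpt-3.5-turbo-0125",
    "gpt-4-turbo": "gpt-4-turbo-preview",
}
for alias, pinned in ALIASES.items():
    CHAT_MODELS[alias] = replace(CHAT_MODELS[pinned], name=alias)
```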
* Add `_sideload_chrome_extensions` subroutine to `open_page_in_browser` in web_selenium.py (see the sketch below)
* Sideloads the uBlock Origin and "I Still Don't Care About Cookies" extensions, downloading them if necessary
* Add 2-second delay to `open_page_in_browser` to allow time for handling cookie walls
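A sketch of how such sideloading can be done with Selenium; the Web Store IDs and download URL pattern are shown for illustration and should be verified before use:

```python
from pathlib import Path
from urllib.request import urlretrieve

from selenium.webdriver.chrome.options import Options as ChromeOptions

def _sideload_chrome_extensions(options: ChromeOptions, dl_folder: Path) -> None:
    crx_url = (
        "https://clients2.google.com/service/update2/crx"
        "?response=redirect&prodversion=121.0&acceptformat=crx3&x=id%3D{crx_id}%26uc"
    )
    extensions = {
        "uBlock Origin": "cjpalhdlnbpafiamejdnhcphjbkeiagm",
        "I Still Don't Care About Cookies": "edibdbjcniadpccecjdfdjjppcpchdlm",
    }
    dl_folder.mkdir(parents=True, exist_ok=True)
    for name, crx_id in extensions.items():
        crx_path = dl_folder / f"{crx_id}.crx"
        if not crx_path.exists():
            # Download the packed extension on first use
            urlretrieve(crx_url.format(crx_id=crx_id), crx_path)
        options.add_extension(str(crx_path))
```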
Commit 956cdc7 "fix(agent/json_utils): Decode as JSON rather than Python objects" broke these unit tests because they generated "JSON" by stringifying a Python object.
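The difference is easy to miss: `str()` of a dict produces a Python repr with single quotes, which `json.loads` rejects, so such fixtures have to be built with `json.dumps` instead:

```python
import json

data = {"a": 1, "b": "x"}

str(data)          # "{'a': 1, 'b': 'x'}"  <- Python repr, not valid JSON
json.dumps(data)   # '{"a": 1, "b": "x"}'  <- actual JSON

json.loads(json.dumps(data))  # OK
json.loads(str(data))         # raises json.JSONDecodeError
```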
* Compress steps in the prompt to reduce token usage and to let the agent run longer when using models with limited context windows
* Move multiple copies of step formatting code to `Episode.format` method
* Add `EpisodicActionHistory.handle_compression` method to handle compression of new steps
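Conceptually, compression replaces each finished step's full output with a short summary in later prompts; a simplified, synchronous sketch (the real method is async and LLM-backed):

```python
def handle_compression(episodes, summarize) -> None:
    for episode in episodes:
        if episode.result is None or episode.summary:
            continue  # step still running, or already compressed
        # From now on, prompts include this summary instead of the full
        # Episode.format() output for this step.
        episode.summary = summarize(episode.format())
```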
* Implement `extract_information` function in `autogpt.processing.text` module. This function extracts pieces of information from a body of text based on a list of topics of interest.
* Add `topics_of_interest` and `get_raw_content` parameters to the `read_webpage` command
* Limit maximum content length if `get_raw_content=true` is specified
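A rough sketch of how `extract_information` can be built on top of an LLM call; the prompt wording and provider call shape are assumptions:

```python
async def extract_information(
    source_text: str,
    topics_of_interest: list[str],
    llm_provider,
) -> list[str]:
    topics = "\n".join(f"* {topic}" for topic in topics_of_interest)
    prompt = (
        "From the text below, extract every piece of information relevant to these "
        f"topics and return it as a JSON list of strings:\n{topics}\n\n"
        f"TEXT:\n{source_text}"
    )
    response = await llm_provider.create_chat_completion(prompt)  # call shape assumed
    return extract_list_from_response(response)  # the fence-tolerant parser sketched earlier
```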