### Background
Introduced initial database schema for AutoGPT server.
It currently consists of 7 tables:
* `AgentGraph`: This model describes the Agent Graph/Flow (Multi Agent System).
* `AgentNode`: This model describes a single node in the Agent Graph/Flow (Multi Agent System).
* `AgentNodeLink`: This model describes the link between two AgentNodes.
* `AgentNodeExecution`: This model describes the execution of an AgentNode.
* `AgentBlock`: This model describes a component that will be executed by the AgentNode (all the details required, like name, code, input/output).
* `AgentBlockInputOutput`: This model describes the output (produced event) or input (consumed event) of an AgentBlock.
* `FileDefinition`: This model describe a file that can be used as input/output of an AgentNodeExecution.
### Changes 🏗️
* Add Prisma
* Add sqlite3
* Initialize database.
* Update instructions to set up OpenAI / GPT-4 access
* Add instructions to set up Anthropic access
* Add instructions to set up Groq access
* Remove GPT-specific `--gpt3only`, `--gpt4only` CLI flags and related logic
* Remove duplicate config instructions from docker setup page, replace it by a link to the standard setup instructions
### Background
###### Project Outline
Currently, the project mainly consists of these components:
*agent_api*
A component that will expose API endpoints for the creation & execution of agents.
This component will make connections to the database to persist and read the agents.
It will also trigger the agent execution by pushing its execution request to the ExecutionQueue.
*agent_executor*
A component that will execute the agents.
This component will be a pool of processes/threads that will consume the ExecutionQueue and execute the agent accordingly.
The result and progress of its execution will be persisted in the database.
###### How to test
Execute `poetry run app`.
Access the swagger page `http://localhost:8000/docs`, there is one API to trigger an execution of one dummy slow task, you fire the API a couple of times and see the `agent_executor` executes the multiple slow tasks concurrently by the pool of Python processes.
The pool size is currently set to `5` (hardcoded in app.py, the code entry point).
##### Changes 🏗️
* Initialize FastAPI for the AutoGPT server project.
* Reduced number of queues to 1 and abstracted into `ExecutionQueue` class.
* Reduced the number of main components into two `api` and `executor`.
- Add `_BaseOpenAIProvider`, `BaseOpenAIChatProvider`, and `BaseOpenAIEmbeddingProvider`, which implement the shared functionality of OpenAI-like providers, e.g. `GroqProvider` and `OpenAIProvider`
- (Re)move as much code as possible from `GroqProvider` and `OpenAIProvider` by rebasing them on `BaseOpenAI(Chat|Embedding)Provider`
Also:
- Rename `get_available_models()` to `get_available_chat_models()` on `BaseChatModelProvider`
- Add `get_available_models()` to `BaseModelProvider`
- Add `get_available_embedding_models()` to `BaseEmbeddingModelProvider`
- Move common `fix_failed_parse_tries` config attribute into base `ModelProviderConfiguration`
* Add default AutoGPT profile to ai_profile.py & disable profile generator
* Disable custom AI profile generation in agent_protocol_server.py
- Replace `generate_agent_for_task` by `create_agent`
- Make `ai_profile` parameter on `create_agent` optional (use default `AIProfile` if not passed)
* Generalize example call in profile_generator.py
Currently it's specified in an OpenAI-specific format, which might adversely affect performance with other providers.
* Remove dead `AIProfile.api_budget` attribute
* Remove `agent.ai_profile` and `agent.directives` attributes, and replace usages with `agent.state.*`
This prevents potential state inconsistency between `agent` and `agent.state` when other values are assigned to `agent.ai_profile` and `agent.directives`
- **FIX ALL LINT/TYPE ERRORS IN AUTOGPT, FORGE, AND BENCHMARK**
### Linting
- Clean up linter configs for `autogpt`, `forge`, and `benchmark`
- Add type checking with Pyright
- Create unified pre-commit config
- Create unified linting and type checking CI workflow
### Testing
- Synchronize CI test setups for `autogpt`, `forge`, and `benchmark`
- Add missing pytest-cov to benchmark dependencies
- Mark GCS tests as slow to speed up pre-commit test runs
- Repair `forge` test suite
- Add `AgentDB.close()` method for test DB teardown in db_test.py
- Use actual temporary dir instead of forge/test_workspace/
- Move left-behind dependencies for moved `forge`-code to from autogpt to forge
### Notable type changes
- Replace uses of `ChatModelProvider` by `MultiProvider`
- Removed unnecessary exports from various __init__.py
- Simplify `FileStorage.open_file` signature by removing `IOBase` from return type union
- Implement `S3BinaryIOWrapper(BinaryIO)` type interposer for `S3FileStorage`
- Expand overloads of `GCSFileStorage.open_file` for improved typing of read and write modes
Had to silence type checking for the extra overloads, because (I think) Pyright is reporting a false-positive:
https://github.com/microsoft/pyright/issues/8007
- Change `count_tokens`, `get_tokenizer`, `count_message_tokens` methods on `ModelProvider`s from class methods to instance methods
- Move `CompletionModelFunction.schema` method -> helper function `format_function_def_for_openai` in `forge.llm.providers.openai`
- Rename `ModelProvider` -> `BaseModelProvider`
- Rename `ChatModelProvider` -> `BaseChatModelProvider`
- Add type `ChatModelProvider` which is a union of all subclasses of `BaseChatModelProvider`
### Removed rather than fixed
- Remove deprecated and broken autogpt/agbenchmark_config/benchmarks.py
- Various base classes and properties on base classes in `forge.llm.providers.schema` and `forge.models.providers`
### Fixes for other issues that came to light
- Clean up `forge.agent_protocol.api_router`, `forge.agent_protocol.database`, and `forge.agent.agent`
- Add fallback behavior to `ImageGeneratorComponent`
- Remove test for deprecated failure behavior
- Fix `agbenchmark.challenges.builtin` challenge exclusion mechanism on Windows
- Fix `_tool_calls_compat_extract_calls` in `forge.llm.providers.openai`
- Add support for `any` (= no type specified) in `JSONSchema.typescript_type`
* Add `FileStorage.mount()` method, which mounts (part of) the workspace to a local path
* Add `watchdog` library to watch file changes in mount
* Amend `CodeExecutorComponent`
* Amend `execute_python_file` to execute Python files in a workspace mount
* Amend `execute_python_code` to create temporary .py file in workspace instead of as a local file
* Add support for `Path` argument to `filename` parameter on `execute_python_file`
* Fix `test_execute_python_code` (by making it async)
- Move `autogpt/Dockerfile` to `Dockerfile.autogpt`
- Write new selective `.dockerignore` (in repo root) to keep build context clean
- Amend `autogpt/docker-compose.yml` and all `autogpt-docker-*.yml` workflows accordingly
- Include `forge/` in docker build context so it can be used as a path dependency
- Include `frontend/` in docker builds
- Moved `autogpt` and `forge` to project root
- Removed `autogpts` directory
- Moved and renamed submodule `autogpts/autogpt/tests/vcr_cassettes` to `autogpt/tests/vcr_cassettes`
- When using CLI agents will be created in `agents` directory (instead of `autogpts`)
- Renamed relevant docs, code and config references from `autogpts/[forge|autogpt]` to `[forge|autogpt]` and from `*../../*` to `*../*`
- Updated `CODEOWNERS`, GitHub Actions and Docker `*.yml` configs
- Updated symbolic links in `docs`
Remove unused `forge` code and improve structure of `forge`.
* Put all Agent Protocol stuff together in `forge.agent_protocol`
* ... including `forge.agent_protocol.database` (was `forge.db`)
* Remove duplicate/unused parts from `forge`
* `forge.actions`, containing old commands; replaced by `forge.components` from `autogpt`
* `forge/agent.py` (the old one, `ForgeAgent`)
* `forge/app.py`, which was used to serve and run the `ForgeAgent`
* `forge/db.py` (`ForgeDatabase`), which was used for `ForgeAgent`
* `forge/llm.py`, which has been replaced by new `forge.llm` module which was ported from `autogpt.core.resource.model_providers`
* `forge.memory`, which is not in use and not being maintained
* `forge.sdk`, much of which was moved into other modules and the rest is deprecated
* `AccessDeniedError`: unused
* `forge_log.py`: replaced with `logging`
* `validate_yaml_file`: not needed
* `ai_settings_file` and associated loading logic and env var `AI_SETTINGS_FILE`: unused
* `prompt_settings_file` and associated loading logic and env var `PROMPT_SETTINGS_FILE`: default directives are now provided by the `SystemComponent`
* `request_user_double_check`, which was only used in `AIDirectives.load`
* `TypingConsoleHandler`: not used
Moved from `autogpt` to `forge`:
- `autogpt.config` -> `forge.config`
- `autogpt.processing` -> `forge.content_processing`
- `autogpt.file_storage` -> `forge.file_storage`
- `autogpt.logs` -> `forge.logging`
- `autogpt.speech` -> `forge.speech`
- `autogpt.agents.(base|components|protocols)` -> `forge.agent.*`
- `autogpt.command_decorator` -> `forge.command.decorator`
- `autogpt.models.(command|command_parameter)` -> `forge.command.(command|parameter)`
- `autogpt.(commands|components|features)` -> `forge.components`
- `autogpt.core.utils.json_utils` -> `forge.json.parsing`
- `autogpt.prompts.utils` -> `forge.llm.prompting.utils`
- `autogpt.core.prompting.(base|schema|utils)` -> `forge.llm.prompting.*`
- `autogpt.core.resource.model_providers` -> `forge.llm.providers`
- `autogpt.llm.providers.openai` + `autogpt.core.resource.model_providers.utils`
-> `forge.llm.providers.utils`
- `autogpt.models.action_history:Action*` -> `forge.models.action`
- `autogpt.core.configuration.schema` -> `forge.models.config`
- `autogpt.core.utils.json_schema` -> `forge.models.json_schema`
- `autogpt.core.resource.schema` -> `forge.models.providers`
- `autogpt.models.utils` -> `forge.models.utils`
- `forge.sdk.(errors|utils)` + `autogpt.utils.(exceptions|file_operations_utils|validators)`
-> `forge.utils.(exceptions|file_operations|url_validator)`
- `autogpt.utils.utils` -> `forge.utils.const` + `forge.utils.yaml_validator`
Moved within `forge`:
- forge/prompts/* -> forge/llm/prompting/*
The rest are mostly import updates, and some sporadic removals and necessary updates (for example to fix circular deps):
- Changed `CommandOutput = Any` to remove coupling with `ContextItem` (no longer needed)
- Removed unused `Singleton` class
- Reluctantly moved `speech` to forge due to coupling (tts needs to be changed into component)
- Moved `function_specs_from_commands` and `core/resource/model_providers` to `llm/providers` (resources were a `core` thing and are no longer relevant)
- Keep tests in `autogpt` to reduce changes in this PR
- Removed unused memory-related code from tests
- Removed duplicated classes: `FancyConsoleFormatter`, `BelowLevelFilter`
- `prompt_settings.yaml` is in both `autogpt` and `forge` because for some reason doesn't work when placed in just one dir (need to be taken care of)
- Removed `config` param from `clean_input`, it wasn't used and caused circular dependency
- Renamed `BaseAgentActionProposal` to `ActionProposal`
- Updated `pyproject.toml` in `forge` and `autogpt`
- Moved `Action*` models from `forge/components/action_history/model.py` to `forge/models/action.py` as those are relevant to the entire agent and not just `EventHistoryComponent` + to reduce coupling
- Renamed `DEFAULT_ASK_COMMAND` to `ASK_COMMAND` and `DEFAULT_FINISH_COMMAND` to `FINISH_COMMAND`
- Renamed `AutoGptFormatter` to `ForgeFormatter` and moved to `forge`
Includes changes from PR https://github.com/Significant-Gravitas/AutoGPT/pull/7148
---------
Co-authored-by: Reinier van der Leer <pwuts@agpt.co>
Persist the agent's `AgentContext` so that it works in rehydrated agent instances. This makes context usable in the `AgentProtocolServer`, where the agent instance is loaded and destroyed for every step.
- Make `AgentContext` a Pydantic model
- Add `context` parameter to `ContextComponent.__init__` so we can pass in an existing instance
- Add `context: AgentContext` to `AgentSettings` so it is persisted
- Add `type` attribute to `ContextItem` implementations as a discriminator
- Rename `ContextItem` base class to `BaseContextItem` and make new `ContextItem` type alias (union of the implementation types)
Documentation files were in docs/content/AutoGPT/components, symlinks in autogpts/autogpt/autogpt/(agents|commands).
Chef doesn't allow symlinks that point to locations outside of package dir.
Replacing the documentation files with symlinks, and symlinks with the actual documentation files, should fix this.
- feat(agent/core): Add `AnthropicProvider`
- Add `ANTHROPIC_API_KEY` to .env.template and docs
Notable differences in logic compared to `OpenAIProvider`:
- Merges subsequent user messages in `AnthropicProvider._get_chat_completion_args`
- Merges and extracts all system messages into `system` parameter in `AnthropicProvider._get_chat_completion_args`
- Supports prefill; merges prefill content (if any) into generated response
- Prompt changes to improve compatibility with `AnthropicProvider`
Anthropic has a slightly different API compared to OpenAI, and has much stricter input validation. E.g. Anthropic only supports a single `system` prompt, where OpenAI allows multiple `system` messages. Anthropic also forbids sequences of multiple `user` or `assistant` messages and requires that messages alternate between roles.
- Move response format instruction from separate message into main system prompt
- Fix clock message format
- Add pre-fill to `OneShot` generated prompt
- refactor(agent/core): Tweak `model_providers.schema`
- Simplify `ModelProviderUsage`
- Remove attribute `total_tokens` as it is always equal to `prompt_tokens + completion_tokens`
- Modify signature of `update_usage(..)`; no longer requires a full `ModelResponse` object as input
- Improve `ModelProviderBudget`
- Change type of attribute `usage` to `defaultdict[str, ModelProviderUsage]` -> allow per-model usage tracking
- Modify signature of `update_usage_and_cost(..)`; no longer requires a full `ModelResponse` object as input
- Allow `ModelProviderBudget` zero-argument instantiation
- Fix type of `AssistantChatMessage.role` to match `ChatMessage.role` (str -> `ChatMessage.Role`)
- Add shared attributes and constructor to `ModelProvider` base class
- Add `max_output_tokens` parameter to `create_chat_completion` interface
- Add pre-filling as a global feature
- Add `prefill_response` field to `ChatPrompt` model
- Add `prefill_response` parameter to `create_chat_completion` interface
- Add `ChatModelProvider.get_available_models()` and remove `ApiManager`
- Remove unused `OpenAIChatParser` typedef in openai.py
- Remove redundant `budget` attribute definition on `OpenAISettings`
- Remove unnecessary `usage` in `OpenAIProvider` > `default_settings` > `budget`
- feat(agent): Allow use of any available LLM provider through `MultiProvider`
- Add `MultiProvider` (`model_providers.multi`)
- Replace all references to / uses of `OpenAIProvider` with `MultiProvider`
- Change type of `Config.smart_llm` and `Config.fast_llm` from `str` to `ModelName`
- feat(agent/core): Validate function call arguments in `create_chat_completion`
- Add `validate_call` method to `CompletionModelFunction` in `model_providers.schema`
- Add `validate_tool_calls` utility function in `model_providers.utils`
- Add tool call validation step to `create_chat_completion` in `OpenAIProvider` and `AnthropicProvider`
- Remove (now redundant) command argument validation logic in agent.py and models/command.py
- refactor(agent): Rename `get_openai_command_specs` to `function_specs_from_commands`
* Introduce `BaseAgentActionProposal`, `OneShotAgentActionProposal`, and `AssistantThoughts` models to replace `ThoughtProcessResponse`, `DEFAULT_RESPONSE_SCHEMA`
* Refactor and clean up code because now we don't need to do as much type checking everywhere
* Tweak `OneShot` response format instruction
Granular:
* `autogpt.agents.prompt_strategies.one_shot`
* Replace ThoughtProcessResponse`, `DEFAULT_RESPONSE_SCHEMA` and parsing logic by `AssistantThoughts` and `OneShotAgentActionProposal`
* (TANGENTIAL) Move response format instruction into main system prompt message
* (TANGENTIAL) Adjust response format instruction
* `autogpt.agents.base`
* Add `BaseAgentActionProposal` base model -> replace `ThoughtProcessOutput`
* Change signature of `execute` method to accept `BaseAgentActionProposal` instead of separate `command_name` and `command_args`
* Add `do_not_execute(proposal, feedback)` abstract method, replacing `execute("human_feedback", ..., feedback)`
* Move `history` definition from `BaseAgentSettings` to `AgentSettings` (the only place where it's used anyway)
* `autogpt.models`
* Add `.utils` > `ModelWithSummary` base model
* Make the models in `.action_history` (`Episode`, `EpisodicActionHistory`) generic with a type parameter for the type of `Episode.action`
* `autogpt.core.resource.model_providers.schema`
* Add `__str__` to `AssistantFunctionCall` which pretty-prints the function call
All other changes are a direct result of the changes above.
## BREAKING CHANGE:
* Due to the change in `autogpt.models.action_history`, the application after this change will be unable to load/resume agents from before this change and vice versa.
* The `additional_output` field in the response of `execute_step` has changed slightly:
* Type of `.thoughts.plan` has changed from `str` to `list[str]`
* `.command` -> `.use_tool`
* `.command.args` -> `.use_tool.arguments`