On AgentServer, To create a Block like StringFormatterBlock or LllmCallBlock, we need some way to dynamically link input pins and aggregate them into a single list input. This will give a better experience for the user to construct an input and link it from the output of the other nodes. The scope of this change is adding support for that in the least intrusive way.
Proposal
To differentiate the input list name and its singular entry we are using the $_<index> prefix. For example:
For the input items: list[int], you can set a pin items with values like [1,2,3,4]. But you can also add input pins like items_$_0 or items_$_4 with values 1 or 2, which will be appended to the items input in alphabetical order.
The execution engine will guarantee to wait for the execution until all the input pin value is produced, so input pin with list input will produce fix-sized list.
* Getting started with nextjs
* fix linting
* remove gitignore for package.json
* pulling in reactflow components
* updating css
* use environment variables
* clean up css / ui a lil
* Fixed nodes/run button animation
so they are always visible
---------
Co-authored-by: Bentlybro <tomnoon9@gmail.com>
### Background
The current implementation of AgentServer doesn't allow for a single pin to be connected to multiple nodes, this will be problematic when you have a single output node that needs to be propagated into many nodes. Or multiple nodes that possibly feed the data into a single pin (first come first serve).
This infra change is also part of the preparation for changing the `block` interface to return a stream of output instead of a single output. Treating blocks as streams requires this capability.
### Changes 🏗️
* Update block run interface from returning `(output_name, output_data)` to `Generator[(output_name, output_data)]`
* Removed `agent` term in the API, replace it with `graph` for consistency.
* Reintroduced `AgentNodeExecutionInputOutput`. `AgentNodeExecution` input & output will be a list of `AgentNodeExecutionInputOutput` which describes the input & output data of its execution. Making an execution has 1-many relation to its input output data.
* Propagating the relation and block interface change into the execution engine.
### Background
Agent execution should be able to be triggered in a recurring manner.
This PR introduced an ExecutionScheduling service, a process responsible for managing the execution schedule and triggering its execution based on a predefined cron expression.
### Changes 🏗️
* Added `scheduler.py` / `ExecutionScheduler` implementation.
* Added scheduler test.
* Added `AgentExecutionSchedule` table and its logical model & prisma queries.
* Moved `add_execution` from API server to `execution_manager`
### Background
This PR adds support on IPC on autogpt_server.
To make this happen, there are a couple of refactoring efforts being made (will be described in the `Changes` section).
Currently, there are three independent processes:
```
AgentServer ----> ExecutionManager
|
--> ExecutionScheduler
```
### Changes 🏗️
* Added Pyro5 for IPC support.
* Introduced `AppService`: a class to construct an independent process that can expose a method to other running processes (this is analogous to a microservice).
* Introduced `AppProcess`: used by `AppService` a class for creating a child process that can be executed in the background.
* Adapting existing codebase to user `AppService`.
Remove many env vars and use component-level configuration that could be loaded from file instead.
### Changed
- `BaseAgent` provides `serialize_configs` and `deserialize_configs` that can save and load all component configuration as json `str`. Deserialized components/values overwrite existing values, so not all values need to be present in the serialized config.
- Decoupled `forge/content_processing/text.py` from `Config`
- Kept `execute_local_commands` in `Config` because it's needed to know if OS info should be included in the prompt
- Updated docs to reflect changes
- Renamed `Config` to `AppConfig`
### Added
- Added `ConfigurableComponent` class for components and following configs:
- `ActionHistoryConfiguration`
- `CodeExecutorConfiguration`
- `FileManagerConfiguration` - now file manager allows to have multiple agents using the same workspace
- `GitOperationsConfiguration`
- `ImageGeneratorConfiguration`
- `WebSearchConfiguration`
- `WebSeleniumConfiguration`
- `BaseConfig` in `forge` and moved `Config` (now inherits from `BaseConfig`) back to `autogpt`
- Required `config_class` attribute for the `ConfigurableComponent` class that should be set to configuration class for a component
`--component-config-file` CLI option and `COMPONENT_CONFIG_FILE` env var and field in `Config`. This option allows to load configuration from a specific file, CLI option takes precedence over env var.
- Added comments to config models
### Removed
- Unused `change_agent_id` method from `FileManagerComponent`
- Unused `allow_downloads` from `Config` and CLI options (it should be in web component config if needed)
- CLI option `--browser-name` (the option is inside `WebSeleniumConfiguration`)
- Unused `workspace_directory` from CLI options
- No longer needed variables from `Config` and docs
- Unused fields from `Config`: `image_size`, `audio_to_text_provider`, `huggingface_audio_to_text_model`
- Removed `files` and `workspace` class attributes from `FileManagerComponent`
When an agent is resumed from a mid-cycle state (having made a proposal but not executed it yet), we need to use the previously determined `current_episode.action` proposal instead of calling `agent.propose_action()` again.
* Rename `assert_config_has_openai_api_key` to `assert_config_has_required_llm_api_keys`
* Make OpenAI credential check conditional (only if an OpenAI model is selected in the config)
* Implement checks for Groq and Anthropic credentials
* Use API calls for Groq and OpenAI credential checks to make sure the keys are valid
Revert some changes to fix forge agent and enable components support.
- Rename forge `Agent` to `ProtocolAgent`
- Bring back and update `forge/app.py` and `forge/agent/forge_agent.py`
- `ForgeAgent` inherits from `BaseAgent`, supports component execution and runs the same pipelines as autogpt Agent
- Update forge version from 0.1.0 to 0.2.0
- Update code comments
### Background
This PR implements the main logic of the block execution engine for AutoGPT-Server.
An integration test is added to test the behavior.
*What you can do now with this PR*:
You can manually create a graph, by using the existing blocks as nodes (or write your own). Then execute the graph with an input.
*What you can't do yet*:
Listen to the graph execution result/update (you can follow the `AgentNodeExecution` table result, though).
### Changes 🏗️
* Split `data.py` (model file) into three modules:
* `execution`: a model for node execution.
* `graph`: a model for graph structure.
* `block`: a model for agent block/component.
* Implemented executor main logic
* Simplify db structure:
* Remove `AgentBlockInputOutput` in favor of `inputSchema` & `outputSchema` using serialized json/dict structure.
* Remove `id` on `AgentBlock` in favor of using name (class name of the block) as its identifier.
* Added `constantInput` column for `AgentNode` for hard-coded input/block configuration. Hence, removing`executionStateData` on `AgentNodeExecution`.
* Rename AgentNodeLink input/output to source/sink to avoid confusion
* Change multithreading to multiprocessing, to allow the use of multiple `prisma` asynchronous client.
Frontend broke in #7171 because of changes to the request models in `forge.agent_protocol`. This PR unbreaks it.
Changes:
- Make `input` required on `TaskRequestBody` and `StepRequestBody`
- Amend `toJson()` on `TaskRequestBody` and `StepRequestBody` to omit attributes with `null` value
### Background
Introduced initial database schema for AutoGPT server.
It currently consists of 7 tables:
* `AgentGraph`: This model describes the Agent Graph/Flow (Multi Agent System).
* `AgentNode`: This model describes a single node in the Agent Graph/Flow (Multi Agent System).
* `AgentNodeLink`: This model describes the link between two AgentNodes.
* `AgentNodeExecution`: This model describes the execution of an AgentNode.
* `AgentBlock`: This model describes a component that will be executed by the AgentNode (all the details required, like name, code, input/output).
* `AgentBlockInputOutput`: This model describes the output (produced event) or input (consumed event) of an AgentBlock.
* `FileDefinition`: This model describe a file that can be used as input/output of an AgentNodeExecution.
### Changes 🏗️
* Add Prisma
* Add sqlite3
* Initialize database.
* Update instructions to set up OpenAI / GPT-4 access
* Add instructions to set up Anthropic access
* Add instructions to set up Groq access
* Remove GPT-specific `--gpt3only`, `--gpt4only` CLI flags and related logic
* Remove duplicate config instructions from docker setup page, replace it by a link to the standard setup instructions
### Background
###### Project Outline
Currently, the project mainly consists of these components:
*agent_api*
A component that will expose API endpoints for the creation & execution of agents.
This component will make connections to the database to persist and read the agents.
It will also trigger the agent execution by pushing its execution request to the ExecutionQueue.
*agent_executor*
A component that will execute the agents.
This component will be a pool of processes/threads that will consume the ExecutionQueue and execute the agent accordingly.
The result and progress of its execution will be persisted in the database.
###### How to test
Execute `poetry run app`.
Access the swagger page `http://localhost:8000/docs`, there is one API to trigger an execution of one dummy slow task, you fire the API a couple of times and see the `agent_executor` executes the multiple slow tasks concurrently by the pool of Python processes.
The pool size is currently set to `5` (hardcoded in app.py, the code entry point).
##### Changes 🏗️
* Initialize FastAPI for the AutoGPT server project.
* Reduced number of queues to 1 and abstracted into `ExecutionQueue` class.
* Reduced the number of main components into two `api` and `executor`.
- Add `_BaseOpenAIProvider`, `BaseOpenAIChatProvider`, and `BaseOpenAIEmbeddingProvider`, which implement the shared functionality of OpenAI-like providers, e.g. `GroqProvider` and `OpenAIProvider`
- (Re)move as much code as possible from `GroqProvider` and `OpenAIProvider` by rebasing them on `BaseOpenAI(Chat|Embedding)Provider`
Also:
- Rename `get_available_models()` to `get_available_chat_models()` on `BaseChatModelProvider`
- Add `get_available_models()` to `BaseModelProvider`
- Add `get_available_embedding_models()` to `BaseEmbeddingModelProvider`
- Move common `fix_failed_parse_tries` config attribute into base `ModelProviderConfiguration`
* Add default AutoGPT profile to ai_profile.py & disable profile generator
* Disable custom AI profile generation in agent_protocol_server.py
- Replace `generate_agent_for_task` by `create_agent`
- Make `ai_profile` parameter on `create_agent` optional (use default `AIProfile` if not passed)
* Generalize example call in profile_generator.py
Currently it's specified in an OpenAI-specific format, which might adversely affect performance with other providers.
* Remove dead `AIProfile.api_budget` attribute
* Remove `agent.ai_profile` and `agent.directives` attributes, and replace usages with `agent.state.*`
This prevents potential state inconsistency between `agent` and `agent.state` when other values are assigned to `agent.ai_profile` and `agent.directives`
- **FIX ALL LINT/TYPE ERRORS IN AUTOGPT, FORGE, AND BENCHMARK**
### Linting
- Clean up linter configs for `autogpt`, `forge`, and `benchmark`
- Add type checking with Pyright
- Create unified pre-commit config
- Create unified linting and type checking CI workflow
### Testing
- Synchronize CI test setups for `autogpt`, `forge`, and `benchmark`
- Add missing pytest-cov to benchmark dependencies
- Mark GCS tests as slow to speed up pre-commit test runs
- Repair `forge` test suite
- Add `AgentDB.close()` method for test DB teardown in db_test.py
- Use actual temporary dir instead of forge/test_workspace/
- Move left-behind dependencies for moved `forge`-code to from autogpt to forge
### Notable type changes
- Replace uses of `ChatModelProvider` by `MultiProvider`
- Removed unnecessary exports from various __init__.py
- Simplify `FileStorage.open_file` signature by removing `IOBase` from return type union
- Implement `S3BinaryIOWrapper(BinaryIO)` type interposer for `S3FileStorage`
- Expand overloads of `GCSFileStorage.open_file` for improved typing of read and write modes
Had to silence type checking for the extra overloads, because (I think) Pyright is reporting a false-positive:
https://github.com/microsoft/pyright/issues/8007
- Change `count_tokens`, `get_tokenizer`, `count_message_tokens` methods on `ModelProvider`s from class methods to instance methods
- Move `CompletionModelFunction.schema` method -> helper function `format_function_def_for_openai` in `forge.llm.providers.openai`
- Rename `ModelProvider` -> `BaseModelProvider`
- Rename `ChatModelProvider` -> `BaseChatModelProvider`
- Add type `ChatModelProvider` which is a union of all subclasses of `BaseChatModelProvider`
### Removed rather than fixed
- Remove deprecated and broken autogpt/agbenchmark_config/benchmarks.py
- Various base classes and properties on base classes in `forge.llm.providers.schema` and `forge.models.providers`
### Fixes for other issues that came to light
- Clean up `forge.agent_protocol.api_router`, `forge.agent_protocol.database`, and `forge.agent.agent`
- Add fallback behavior to `ImageGeneratorComponent`
- Remove test for deprecated failure behavior
- Fix `agbenchmark.challenges.builtin` challenge exclusion mechanism on Windows
- Fix `_tool_calls_compat_extract_calls` in `forge.llm.providers.openai`
- Add support for `any` (= no type specified) in `JSONSchema.typescript_type`