Commit Graph

4237 Commits (f4e7b1c61c876b839b9071c218d4f0c46a095f24)

Author SHA1 Message Date
merwanehamadi f4e7b1c61c
Add eval_id and sync Skill Tree with Frontend(#5287)
Add eval_id to skill tree

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-21 13:36:17 -07:00
merwanehamadi 295787c948
Fix forge not being an autogpt anymore (#5288) 2023-09-21 10:19:45 -07:00
SwiftyOS 447f9963fb Added memory papers 2023-09-21 18:21:36 +02:00
SwiftyOS e6206c0ed6 Added llm tool papers 2023-09-21 18:21:27 +02:00
SwiftyOS 2dd0a61a6e Added planning papers 2023-09-21 18:21:16 +02:00
SwiftyOS 3f1df1684a formatting 2023-09-21 17:43:35 +02:00
SwiftyOS 7373933a18 Formatting changes 2023-09-21 17:26:55 +02:00
SwiftyOS 8c49b84faa added information about directory traversal 2023-09-21 17:18:42 +02:00
SwiftyOS ec7d3e73d7 reduced matching limit 2023-09-21 17:18:42 +02:00
SwiftyOS 58a183544a simplified system json format 2023-09-21 17:18:42 +02:00
SwiftyOS d0db337af8 changed chat completion to async 2023-09-21 17:18:42 +02:00
SwiftyOS 01f68601d3 Changed abilities to async 2023-09-21 17:18:42 +02:00
Reinier van der Leer de527d3fdf
AutoGPT: use config and LLM provider from `core` (#5286) 2023-09-21 17:17:11 +02:00
Reinier van der Leer c773815c70
Fix AutoGPT CI linters 2023-09-21 17:08:57 +02:00
Reinier van der Leer 8d29f97f46
AutoGPT: Fix Docker CI 2023-09-21 17:06:45 +02:00
Reinier van der Leer c14762a495
Merge branch 'master' into autogpt/integrate-re-arch 2023-09-21 16:59:20 +02:00
Reinier van der Leer c1494ba1ef
AutoGPT: started replacing monolithic `Config` by `.core.configuration` 2023-09-21 16:46:13 +02:00
Reinier van der Leer 7720f6af24
AutoGPT: replace `autogpt.llm.*` with LLM infrastructure of `autogpt.core`;
* Removed `autogpt.llm.base` and `autogpt.llm.utils`
* `core` does things async, so `Agent.think()` and `Agent.execute()` are now also async
* Renamed `dump()` and `parse()` on `JSONSchema` to `to_dict()` and `from_dict()`
* Removed `MessageHistory`

* Also, some typos and linting fixes here and there
2023-09-21 16:38:41 +02:00
SwiftyOS 3f8088b12d add abilities registry to default agent 2023-09-21 16:35:30 +02:00
SwiftyOS 1936eaa425 export llm functions 2023-09-21 16:35:04 +02:00
SwiftyOS f66c8b6f2f added prompt templates 2023-09-21 16:34:54 +02:00
SwiftyOS a9c4e6daa8 Added list abilities for prompt 2023-09-21 16:34:24 +02:00
SwiftyOS 94c511d0e0 added finish command 2023-09-21 16:34:00 +02:00
SwiftyOS 853add7e86 update registry to require a task_id 2023-09-21 16:01:53 +02:00
SwiftyOS 4de327e0e3 Add more file abilities 2023-09-21 16:01:41 +02:00
Reinier van der Leer 88f0ccfd7e
AutoGPT/core: improve `model_providers` typing and tooling
* Make .schema model names less pedantic

* Rename LanguageModel* objects to ChatModel* or CompletionModel* where appropriate

* Add `JSONSchema` utility class in `core.utils`

* Use `JSONSchema` instead of untyped dicts for `Ability` and `CompletionModelFunction` parameter specification

* Add token counting methods to `ModelProvider` interface and implementations
2023-09-21 15:30:01 +02:00
SwiftyOS 040c6bcd8c Added log messages for task and step creation 2023-09-21 15:21:58 +02:00
SwiftyOS 13c8d81f15 Disabled debug as default 2023-09-21 15:21:41 +02:00
SwiftyOS 5c0ddd3a81 Added jinja2 as a requirement 2023-09-21 15:21:18 +02:00
Swifty 12f3a321b7
change to stream response (#5285)
* change to stream response

* Changed default log level to INFO
2023-09-21 14:57:41 +02:00
Reinier van der Leer 618e7606ef
Add .flake8 2023-09-21 14:47:54 +02:00
SwiftyOS a9ad805ba9 updated run benchmark script 2023-09-21 14:44:48 +02:00
Swifty 3f83e20387
Update frontend build (#5282)
Co-authored-by: GitHub Action <action@github.com>
2023-09-21 08:07:15 +02:00
SwiftyOS 186508e75c Removed flutter and chrome from setup as not required 2023-09-21 08:06:26 +02:00
hunteraraujo 62efc6b07e Add Firebase Analytics dependency 2023-09-20 20:40:12 -07:00
hunteraraujo 22ea449850 Integrate LeaderboardService into SkillTreeViewModel
This commit integrates the `LeaderboardService` into `SkillTreeViewModel` to enable benchmark report submissions to the leaderboard. A `BenchmarkRun` object is created from the evaluation response and submitted using the `submitReport` method from `LeaderboardService`.
2023-09-20 19:36:25 -07:00
merwanehamadi ff4c76ba00
Make agbenchmark a proxy of the evaluated agent (#5279)
Make agbenchmark a Proxy of the evaluated agent

Signed-off-by: Merwane Hamadi <merwanehamadi@gmail.com>
2023-09-20 16:06:00 -07:00
hunteraraujo 1a471b73cd Fix _leaderboardBaseUrl 2023-09-20 14:44:29 -07:00
hunteraraujo 7901c750b6 Extend RestApiUtility to Support Leaderboard Base URL
This commit extends the `RestApiUtility` class to include support for a new leaderboard base URL. A new `ApiType` enum value `ApiType.leaderboard` has been added, and the `_getEffectiveBaseUrl` method has been updated to handle this new type. The leaderboard base URL is "https://leaderboard.vercel.app/".
2023-09-20 14:43:45 -07:00
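The base-URL dispatch described in this commit can be sketched roughly as follows. This is a Python illustration, not the project's Dart code. Only `ApiType.leaderboard`, `_getEffectiveBaseUrl`, and the leaderboard URL come from the commit message; the other enum members, their URLs, and the function name `get_effective_base_url` are hypothetical.

```python
from enum import Enum


class ApiType(Enum):
    # `leaderboard` is named in the commit message; `agent` and `benchmark`
    # are hypothetical members added only to make the dispatch visible.
    agent = "agent"
    benchmark = "benchmark"
    leaderboard = "leaderboard"


# Only the leaderboard URL comes from the commit message; the two localhost
# URLs are placeholder assumptions.
_BASE_URLS = {
    ApiType.agent: "http://localhost:8000/",
    ApiType.benchmark: "http://localhost:8080/",
    ApiType.leaderboard: "https://leaderboard.vercel.app/",
}


def get_effective_base_url(api_type: ApiType) -> str:
    """Return the base URL for an API type (stand-in for `_getEffectiveBaseUrl`)."""
    return _BASE_URLS[api_type]
```

A lookup table keeps the mapping in one place, so adding a new API type means adding one enum member and one dictionary entry.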
hunteraraujo a0512254ca Add LeaderboardService with submitReport Method
This commit adds a new `LeaderboardService` class featuring a `submitReport` method. This method allows for the submission of `BenchmarkRun` objects to the leaderboard via a POST request to the `/api/reports` endpoint. The new service uses the `ApiType.leaderboard` enum value.
2023-09-20 14:38:48 -07:00
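A minimal sketch of the kind of request `submitReport` is described as making, written in Python rather than the project's Dart. The `/api/reports` path and the leaderboard base URL come from the commit messages; the function name `build_submit_report_request`, the JSON content type, and the payload shape are assumptions. The request is built but not sent.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

# Base URL as stated in the related RestApiUtility commit message.
LEADERBOARD_BASE_URL = "https://leaderboard.vercel.app/"


def build_submit_report_request(report: dict) -> Request:
    """Build (but do not send) a POST request to the `/api/reports` endpoint."""
    url = urljoin(LEADERBOARD_BASE_URL, "api/reports")
    # Assumption: the report is sent as a JSON body.
    body = json.dumps(report).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the actual submission.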
hunteraraujo fe96664afb Update ApiType Enum to Include Leaderboard 2023-09-20 14:31:27 -07:00
SwiftyOS d4222519eb Added instructions about cloning and changing dir 2023-09-20 22:50:39 +02:00
SwiftyOS 88f0b04015 fixed grammar 2023-09-20 22:50:39 +02:00
hunteraraujo cfc6180233 Add BenchmarkRun Class to Model Complete Benchmark Runs
This commit introduces the `BenchmarkRun` class, designed to model a complete benchmark run. The class encapsulates all data and sub-models related to a benchmark, providing a centralized object to handle various aspects of a benchmark run.

The `BenchmarkRun` class includes the following sub-models:
- `RepositoryInfo`: Information about the repository and team.
- `RunDetails`: Specific details like the run identifier, command, and timings.
- `TaskInfo`: Information about the task being benchmarked.
- `Metrics`: Performance metrics for the benchmark run.
- `Config`: Configuration settings for the benchmark run.

A `reachedCutoff` field is also included to indicate whether a certain cutoff was reached during the benchmark run.

Methods for serializing and deserializing the object to and from JSON are also provided.
2023-09-20 13:24:19 -07:00
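The nested serialization described in this commit can be sketched as follows. This is a Python illustration under stated assumptions, not the Dart implementation: only one of the five sub-models (`Config`, with the two fields named in its own commit message) is shown, and the JSON key names and the method names `to_json`/`from_json` are assumed.

```python
from dataclasses import dataclass


@dataclass
class Config:
    # Fields follow the `Config` commit message; JSON keys are assumed to
    # match the Dart field names.
    agent_benchmark_config_path: str
    host: str

    def to_json(self) -> dict:
        return {
            "agentBenchmarkConfigPath": self.agent_benchmark_config_path,
            "host": self.host,
        }

    @classmethod
    def from_json(cls, data: dict) -> "Config":
        return cls(data["agentBenchmarkConfigPath"], data["host"])


@dataclass
class BenchmarkRun:
    # Simplified: the other four sub-models are omitted to keep the sketch short.
    config: Config
    reached_cutoff: bool

    def to_json(self) -> dict:
        # Each sub-model serializes itself; the parent only assembles the parts.
        return {
            "config": self.config.to_json(),
            "reachedCutoff": self.reached_cutoff,
        }

    @classmethod
    def from_json(cls, data: dict) -> "BenchmarkRun":
        return cls(Config.from_json(data["config"]), data["reachedCutoff"])
```

Delegating to each sub-model's own serializer means a round trip through JSON reconstructs an equal object, for example `BenchmarkRun.from_json(run.to_json()) == run`.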
hunteraraujo 311f69b7cf Add RepositoryInfo Class for Benchmark Repository and Team Details
This commit introduces the RepositoryInfo class, designed to encapsulate details about the repository and team associated with a benchmark run.

The class includes the following fields:
- repoUrl: The URL of the repository where the benchmark code resides.
- teamName: The name of the team responsible for the benchmark.
- benchmarkGitCommitSha: The Git commit SHA for the benchmark code.
- agentGitCommitSha: The Git commit SHA for the agent code.

The class supports JSON serialization and deserialization, making it easy to use with Flutter's JSON handling mechanisms.
2023-09-20 13:17:46 -07:00
hunteraraujo fc193568b9 Add RunDetails class for encapsulating benchmark run information
Added a new Dart class called `RunDetails` to represent specific details related to a benchmark run.

The class includes fields for:
- The unique run identifier (`runId`)
- The command used to initiate the benchmark (`command`)
- The time the benchmark was completed (`completionTime`)
- The time the benchmark started (`benchmarkStartTime`)
- The name of the test being run (`testName`)

Serialization and deserialization methods are also provided for JSON compatibility.
2023-09-20 13:12:03 -07:00
hunteraraujo afe77bbc4f Add TaskInfo class with serialization and documentation
Added a new TaskInfo class to encapsulate information related to a specific benchmark task.

- The TaskInfo class holds attributes like the data file path, regression status, task categories, task details, expected answer, and description.
- Included methods for JSON serialization and deserialization.
- Added comprehensive documentation to describe the purpose, properties, and methods of the TaskInfo class.
2023-09-20 13:07:54 -07:00
hunteraraujo 50ef7b31eb Add Metrics class with serialization and documentation
Added a new Metrics class to represent key performance metrics of a benchmark test run.

- The Metrics class encapsulates various data points like difficulty, success rate, attempted status, success percentage, cost, and runtime.
- Included serialization and deserialization methods for converting between Metrics objects and JSON.
- Added comprehensive documentation to describe the purpose, properties, and methods of the Metrics class.
2023-09-20 13:04:47 -07:00
hunteraraujo 39f8ae515b Add Config Class for Benchmark Configuration Management
This commit introduces a new `Config` class, designed to manage and store configuration settings related to the benchmark run. The class contains two key fields:

1. `agentBenchmarkConfigPath`: The path to the agent's benchmark configuration file.
2. `host`: The address of the host where the benchmark is running.

The class includes methods for serialization and deserialization, allowing easy conversion between `Config` objects and JSON maps.

Documentation comments have also been added for better code readability and understanding.
2023-09-20 13:00:22 -07:00
SwiftyOS c72a35e92e Added blueprint of an agent tutorial 2023-09-20 17:29:14 +02:00