added weaviate to the supported vector memory providers

pull/424/head
cs0lar 2023-04-11 11:14:13 +01:00
commit 5fe784aabe
43 changed files with 1318 additions and 374 deletions

View File

@ -2,8 +2,8 @@ PINECONE_API_KEY=your-pinecone-api-key
PINECONE_ENV=your-pinecone-region
OPENAI_API_KEY=your-openai-api-key
ELEVENLABS_API_KEY=your-elevenlabs-api-key
SMART_LLM_MODEL="gpt-4"
FAST_LLM_MODEL="gpt-3.5-turbo"
SMART_LLM_MODEL=gpt-4
FAST_LLM_MODEL=gpt-3.5-turbo
GOOGLE_API_KEY=
CUSTOM_SEARCH_ENGINE_ID=
USE_AZURE=False
@ -14,7 +14,12 @@ WEAVIATE_HOST="http://127.0.0.1"
WEAVIATE_PORT="8080"
WEAVIATE_USERNAME=
WEAVIATE_PASSWORD=
WEAVIATE_INDEX="Autogpt"
MEMORY_PROVIDER="weaviate"
IMAGE_PROVIDER=dalle
HUGGINGFACE_API_TOKEN=
OPENAI_AZURE_API_BASE=your-base-url-for-azure
OPENAI_AZURE_API_VERSION=api-version-for-azure
OPENAI_AZURE_DEPLOYMENT_ID=deployment-id-for-azure
IMAGE_PROVIDER=dalle
USE_MAC_OS_TTS=False
MEMORY_INDEX="auto-gpt"
MEMORY_BACKEND="local"

View File

@ -1,18 +1,33 @@
### Background
<!-- 📢 Announcement
We've recently noticed an increase in pull requests focusing on combining multiple changes. While the intentions behind these PRs are appreciated, it's essential to maintain a clean and manageable git history. To ensure the quality of our repository, we kindly ask you to adhere to the following guidelines when submitting PRs:
<!-- Provide a brief overview of why this change is being made. Include any relevant context, prior discussions, or links to relevant issues. -->
Focus on a single, specific change.
Do not include any unrelated or "extra" modifications.
Provide clear documentation and explanations of the changes made.
Ensure diffs are limited to the intended lines — no applying preferred formatting styles or line endings (unless that's what the PR is about).
For guidance on committing only the specific lines you have changed, refer to this helpful video: https://youtu.be/8-hSNHHbiZg
By following these guidelines, your PRs are more likely to be merged quickly after testing, as long as they align with the project's overall direction. -->
### Background
<!-- Provide a concise overview of the rationale behind this change. Include relevant context, prior discussions, or links to related issues. Ensure that the change aligns with the project's overall direction. -->
### Changes
<!-- Describe the specific, focused change made in this pull request. Detail the modifications clearly and avoid any unrelated or "extra" changes. -->
<!-- Describe the changes made in this pull request. Be specific and detailed. -->
### Documentation
<!-- Explain how your changes are documented, such as in-code comments or external documentation. Ensure that the documentation is clear, concise, and easy to understand. -->
### Test Plan
<!-- Describe how you tested this functionality. Include steps to reproduce, relevant test cases, and any other pertinent information. -->
<!-- Explain how you tested this functionality. Include the steps to reproduce and any relevant test cases. -->
### PR Quality Checklist
- [ ] My pull request is atomic and focuses on a single change.
- [ ] I have thouroughly tested my changes with multiple different prompts.
- [ ] I have considered potential risks and mitigations for my changes.
- [ ] I have documented my changes clearly and comprehensively.
- [ ] I have not snuck in any "extra" small tweaks changes <!-- Submit these as seperate Pull Reqests, they are the easiest to merge! -->
### Change Safety
<!-- If you haven't added tests, please explain why. If you have, check the appropriate box. If you've ensured your PR is atomic and well-documented, check the corresponding boxes. -->
- [ ] I have added tests to cover my changes
- [ ] I have considered potential risks and mitigations for my changes
<!-- If you haven't added tests, please explain why. If you have, check the appropriate box. -->
<!-- By submitting this, I agree that my pull request should be closed if I do not fill this out or follow the guide lines. -->

4
.gitignore vendored
View File

@ -7,5 +7,9 @@ package-lock.json
auto_gpt_workspace/*
*.mpeg
.env
venv/*
outputs/*
ai_settings.yaml
.vscode
auto-gpt.json
log.txt

View File

@ -2,6 +2,7 @@ FROM python:3.11
WORKDIR /app
COPY scripts/ /app
COPY requirements.txt /app
RUN pip install -r requirements.txt

View File

@ -57,9 +57,9 @@ Your support is greatly appreciated
- 🗃️ File storage and summarization with GPT-3.5
## 📋 Requirements
- [Python 3.7 or later](https://www.tutorialspoint.com/how-to-install-python-in-windows)
- [Python 3.8 or later](https://www.tutorialspoint.com/how-to-install-python-in-windows)
- OpenAI API key
- PINECONE API key
- [PINECONE API key](https://www.pinecone.io/)
Optional:
- ElevenLabs Key (If you want the AI to speak)
@ -81,7 +81,7 @@ git clone https://github.com/Torantulino/Auto-GPT.git
2. Navigate to the project directory:
*(Type this into your CMD window, you're aiming to navigate the CMD window to the repository you just downloaded)*
```
$ cd 'Auto-GPT'
cd 'Auto-GPT'
```
3. Install the required dependencies:
@ -93,7 +93,7 @@ pip install -r requirements.txt
4. Rename `.env.template` to `.env` and fill in your `OPENAI_API_KEY`. If you plan to use Speech Mode, fill in your `ELEVEN_LABS_API_KEY` as well.
- Obtain your OpenAI API key from: https://platform.openai.com/account/api-keys.
- Obtain your ElevenLabs API key from: https://elevenlabs.io. You can view your xi-api-key using the "Profile" tab on the website.
- If you want to use GPT on an Azure instance, set `USE_AZURE` to `True` and provide the `OPENAI_API_BASE`, `OPENAI_API_VERSION` and `OPENAI_DEPLOYMENT_ID` values as explained here: https://pypi.org/project/openai/ in the `Microsoft Azure Endpoints` section
- If you want to use GPT on an Azure instance, set `USE_AZURE` to `True` and provide the `OPENAI_AZURE_API_BASE`, `OPENAI_AZURE_API_VERSION` and `OPENAI_AZURE_DEPLOYMENT_ID` values as explained here: https://pypi.org/project/openai/ in the `Microsoft Azure Endpoints` section
## 🔧 Usage
@ -114,7 +114,7 @@ python scripts/main.py --speak
## 🔍 Google API Keys Configuration
This section is optional, use the official google api if you are having issues with error 429 when running google search.
This section is optional, use the official google api if you are having issues with error 429 when running a google search.
To use the `google_official_search` command, you need to set up your Google API keys in your environment variables.
1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
@ -127,6 +127,8 @@ To use the `google_official_search` command, you need to set up your Google API
8. Set up your search engine by following the prompts. You can choose to search the entire web or specific sites.
9. Once you've created your search engine, click on "Control Panel" and then "Basics". Copy the "Search engine ID" and set it as an environment variable named `CUSTOM_SEARCH_ENGINE_ID` on your machine. See setting up environment variables below.
*Remember that your free daily custom search quota allows only up to 100 searches. To increase this limit, you need to assign a billing account to the project to profit from up to 10K daily searches.*
### Setting up environment variables
For Windows Users:
```
@ -141,26 +143,60 @@ export CUSTOM_SEARCH_ENGINE_ID="YOUR_CUSTOM_SEARCH_ENGINE_ID"
```
## Vector based memory provider
Auto-GPT supports two providers for vector-based memory, [Pinecone](https://www.pinecone.io/) and [Weaviate](https://weaviate.io/). To select the provider to use, specify the following in your `.env`:
## Redis Setup
Install docker desktop.
Run:
```
docker run -d --name redis-stack-server -p 6379:6379 redis/redis-stack-server:latest
```
See https://hub.docker.com/r/redis/redis-stack-server for setting a password and additional configuration.
Set the following environment variables:
```
MEMORY_BACKEND=redis
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=
```
Note that this is not intended to be run facing the internet and is not secure, do not expose redis to the internet without a password or at all really.
You can optionally set
```
MEMORY_PROVIDER="pinecone" # change to "weaviate" to use weaviate as the memory provider
WIPE_REDIS_ON_START=False
```
### 🌲 Pinecone API Key Setup
Pinecone enable a vector based memory so a vast memory can be stored and only relevant memories
are loaded for the agent at any given time.
To persist memory stored in Redis.
You can specify the memory index for redis using the following:
````
MEMORY_INDEX=whatever
````
## 🌲 Pinecone API Key Setup
Pinecone enables the storage of vast amounts of vector-based memory, allowing for only relevant memories to be loaded for the agent at any given time.
1. Go to app.pinecone.io and make an account if you don't already have one.
2. Choose the `Starter` plan to avoid being charged.
3. Find your API key and region under the default project in the left sidebar.
#### Setting up environment variables
### Setting up environment variables
Simply set them in the `.env` file.
Alternatively, you can set them from the command line (advanced):
For Windows Users:
```
setx PINECONE_API_KEY "YOUR_PINECONE_API_KEY"
export PINECONE_ENV="Your pinecone region" # something like: us-east4-gcp
setx PINECONE_ENV "Your pinecone region" # something like: us-east4-gcp
```
For macOS and Linux users:
@ -170,9 +206,7 @@ export PINECONE_ENV="Your pinecone region" # something like: us-east4-gcp
```
Or you can set them in the `.env` file.
### Weaviate Setup
## Weaviate Setup
[Weaviate](https://weaviate.io/) is an open-source vector database. It allows to store data objects and vector embeddings from ML-models and scales seamlessly to billion of data objects. [An instance of Weaviate can be created locally (using Docker), on Kubernetes or using Weaviate Cloud Services](https://weaviate.io/developers/weaviate/quickstart).
@ -181,11 +215,12 @@ Or you can set them in the `.env` file.
In your `.env` file set the following:
```
MEMORY_BACKEND=weaviate
WEAVIATE_HOST="http://127.0.0.1" # the URL of the running Weaviate instance
WEAVIATE_PORT="8080"
WEAVIATE_USERNAME="your username"
WEAVIATE_PASSWORD="your password"
WEAVIATE_INDEX="Autogpt" # name of the index to create for the application
MEMORY_INDEX="Autogpt" # name of the index to create for the application
```
## View Memory Usage
@ -210,6 +245,7 @@ If you don't have access to the GPT4 api, this mode will allow you to use Auto-G
```
python scripts/main.py --gpt3only
```
It is recommended to use a virtual machine for tasks that require high security measures to prevent any potential harm to the main computer's system and data.
## 🖼 Image Generation
By default, Auto-GPT uses DALL-e for image generation. To use Stable Diffusion, a [HuggingFace API Token](https://huggingface.co/settings/tokens) is required.

View File

@ -12,4 +12,7 @@ docker
duckduckgo-search
google-api-python-client #(https://developers.google.com/custom-search/v1/overview)
pinecone-client==2.2.1
redis
orjson
Pillow
weaviate-client==3.15.4

View File

@ -7,6 +7,7 @@ agents = {} # key, (task, full_message_history, model)
# TODO: Centralise use of create_chat_completion() to globally enforce token limit
def create_agent(task, prompt, model):
"""Create a new agent and return its key"""
global next_key
global agents
@ -32,6 +33,7 @@ def create_agent(task, prompt, model):
def message_agent(key, message):
"""Send a message to an agent and return its response"""
global agents
task, messages, model = agents[int(key)]
@ -52,6 +54,7 @@ def message_agent(key, message):
def list_agents():
"""Return a list of all agents"""
global agents
# Return a list of agent keys and their tasks
@ -59,6 +62,7 @@ def list_agents():
def delete_agent(key):
"""Delete an agent and return True if successful, False otherwise"""
global agents
try:

View File

@ -3,7 +3,27 @@ import data
import os
class AIConfig:
def __init__(self, ai_name="", ai_role="", ai_goals=[]):
"""
A class object that contains the configuration information for the AI
Attributes:
ai_name (str): The name of the AI.
ai_role (str): The description of the AI's role.
ai_goals (list): The list of objectives the AI is supposed to complete.
"""
def __init__(self, ai_name: str="", ai_role: str="", ai_goals: list=[]) -> None:
"""
Initialize a class instance
Parameters:
ai_name (str): The name of the AI.
ai_role (str): The description of the AI's role.
ai_goals (list): The list of objectives the AI is supposed to complete.
Returns:
None
"""
self.ai_name = ai_name
self.ai_role = ai_role
self.ai_goals = ai_goals
@ -12,8 +32,19 @@ class AIConfig:
SAVE_FILE = os.path.join(os.path.dirname(__file__), '..', 'ai_settings.yaml')
@classmethod
def load(cls, config_file=SAVE_FILE):
# Load variables from yaml file if it exists
def load(cls: object, config_file: str=SAVE_FILE) -> object:
"""
Returns class object with parameters (ai_name, ai_role, ai_goals) loaded from yaml file if yaml file exists,
else returns class with no parameters.
Parameters:
cls (class object): An AIConfig Class object.
config_file (int): The path to the config yaml file. DEFAULT: "../ai_settings.yaml"
Returns:
cls (object): A instance of given cls object
"""
try:
with open(config_file) as file:
config_params = yaml.load(file, Loader=yaml.FullLoader)
@ -26,12 +57,32 @@ class AIConfig:
return cls(ai_name, ai_role, ai_goals)
def save(self, config_file=SAVE_FILE):
def save(self, config_file: str=SAVE_FILE) -> None:
"""
Saves the class parameters to the specified file yaml file path as a yaml file.
Parameters:
config_file(str): The path to the config yaml file. DEFAULT: "../ai_settings.yaml"
Returns:
None
"""
config = {"ai_name": self.ai_name, "ai_role": self.ai_role, "ai_goals": self.ai_goals}
with open(config_file, "w") as file:
yaml.dump(config, file)
def construct_full_prompt(self):
def construct_full_prompt(self) -> str:
"""
Returns a prompt to the user with the class information in an organized fashion.
Parameters:
None
Returns:
full_prompt (str): A string containing the intitial prompt for the user including the ai_name, ai_role and ai_goals.
"""
prompt_start = """Your decisions must always be made independently without seeking user assistance. Play to your strengths as an LLM and pursue simple strategies with no legal complications."""
# Construct full prompt
@ -41,3 +92,4 @@ class AIConfig:
full_prompt += f"\n\n{data.load_prompt()}"
return full_prompt

View File

@ -5,9 +5,17 @@ from call_ai_function import call_ai_function
from json_parser import fix_and_parse_json
cfg = Config()
# Evaluating code
def evaluate_code(code: str) -> List[str]:
"""
A function that takes in a string and returns a response from create chat completion api call.
Parameters:
code (str): Code to be evaluated.
Returns:
A result string from create chat completion. A list of suggestions to improve the code.
"""
function_string = "def analyze_code(code: str) -> List[str]:"
args = [code]
description_string = """Analyzes the given code and returns a list of suggestions for improvements."""
@ -17,9 +25,17 @@ def evaluate_code(code: str) -> List[str]:
return result_string
# Improving code
def improve_code(suggestions: List[str], code: str) -> str:
"""
A function that takes in code and suggestions and returns a response from create chat completion api call.
Parameters:
suggestions (List): A list of suggestions around what needs to be improved.
code (str): Code to be improved.
Returns:
A result string from create chat completion. Improved code in response.
"""
function_string = (
"def generate_improved_code(suggestions: List[str], code: str) -> str:"
)
@ -30,10 +46,18 @@ def improve_code(suggestions: List[str], code: str) -> str:
return result_string
# Writing tests
def write_tests(code: str, focus: List[str]) -> str:
"""
A function that takes in code and focus topics and returns a response from create chat completion api call.
Parameters:
focus (List): A list of suggestions around what needs to be improved.
code (str): Code for test cases to be generated against.
Returns:
A result string from create chat completion. Test cases for the submitted code in response.
"""
function_string = (
"def create_test_cases(code: str, focus: Optional[str] = None) -> str:"
)
@ -42,5 +66,3 @@ def write_tests(code: str, focus: List[str]) -> str:
result_string = call_ai_function(function_string, args, description_string)
return result_string

View File

@ -5,8 +5,25 @@ from llm_utils import create_chat_completion
cfg = Config()
# Define and check for local file address prefixes
def check_local_file_access(url):
local_prefixes = ['file:///', 'file://localhost', 'http://localhost', 'https://localhost']
return any(url.startswith(prefix) for prefix in local_prefixes)
def scrape_text(url):
"""Scrape text from a webpage"""
# Most basic check if the URL is valid:
if not url.startswith('http'):
return "Error: Invalid URL"
# Restrict access to local files
if check_local_file_access(url):
return "Error: Access to local files is restricted"
try:
response = requests.get(url, headers=cfg.user_agent_header)
except requests.exceptions.RequestException as e:
return "Error: " + str(e)
# Check if the response contains an HTTP error
if response.status_code >= 400:
@ -26,6 +43,7 @@ def scrape_text(url):
def extract_hyperlinks(soup):
"""Extract hyperlinks from a BeautifulSoup object"""
hyperlinks = []
for link in soup.find_all('a', href=True):
hyperlinks.append((link.text, link['href']))
@ -33,6 +51,7 @@ def extract_hyperlinks(soup):
def format_hyperlinks(hyperlinks):
"""Format hyperlinks into a list of strings"""
formatted_links = []
for link_text, link_url in hyperlinks:
formatted_links.append(f"{link_text} ({link_url})")
@ -40,6 +59,7 @@ def format_hyperlinks(hyperlinks):
def scrape_links(url):
"""Scrape links from a webpage"""
response = requests.get(url, headers=cfg.user_agent_header)
# Check if the response contains an HTTP error
@ -57,6 +77,7 @@ def scrape_links(url):
def split_text(text, max_length=8192):
"""Split text into chunks of a maximum length"""
paragraphs = text.split("\n")
current_length = 0
current_chunk = []
@ -75,12 +96,14 @@ def split_text(text, max_length=8192):
def create_message(chunk, question):
"""Create a message for the user to summarize a chunk of text"""
return {
"role": "user",
"content": f"\"\"\"{chunk}\"\"\" Using the above text, please answer the following question: \"{question}\" -- if the question cannot be answered using the text, please summarize the text."
}
def summarize_text(text, question):
"""Summarize text using the LLM model"""
if not text:
return "Error: No text to summarize"

View File

@ -1,11 +1,14 @@
from config import Config
cfg = Config()
from llm_utils import create_chat_completion
# This is a magic function that can do anything with no-code. See
# https://github.com/Torantulino/AI-Functions for more info.
def call_ai_function(function, args, description, model=cfg.smart_llm_model):
def call_ai_function(function, args, description, model=None):
"""Call an AI function"""
if model is None:
model = cfg.smart_llm_model
# For each arg, if any are None, convert to "None":
args = [str(arg) if arg is not None else "None" for arg in args]
# parse args to comma seperated string

View File

@ -3,11 +3,9 @@ import openai
from dotenv import load_dotenv
from config import Config
import token_counter
cfg = Config()
from llm_utils import create_chat_completion
cfg = Config()
def create_chat_message(role, content):
"""
@ -26,8 +24,11 @@ def create_chat_message(role, content):
def generate_context(prompt, relevant_memory, full_message_history, model):
current_context = [
create_chat_message(
"system", prompt), create_chat_message(
"system", f"Permanent memory: {relevant_memory}")]
"system", prompt),
create_chat_message(
"system", f"The current time and date is {time.strftime('%c')}"),
create_chat_message(
"system", f"This reminds you of these events from your past:\n{relevant_memory}\n\n")]
# Add messages from the full message history until we reach the token limit
next_message_to_add_index = len(full_message_history) - 1
@ -43,8 +44,8 @@ def chat_with_ai(
user_input,
full_message_history,
permanent_memory,
token_limit,
debug=False):
token_limit):
"""Interact with the OpenAI API, sending the prompt, user input, message history, and permanent memory."""
while True:
try:
"""
@ -62,13 +63,15 @@ def chat_with_ai(
"""
model = cfg.fast_llm_model # TODO: Change model from hardcode to argument
# Reserve 1000 tokens for the response
if debug:
if cfg.debug:
print(f"Token limit: {token_limit}")
send_token_limit = token_limit - 1000
relevant_memory = permanent_memory.get_relevant(str(full_message_history[-5:]), 10)
if debug:
if cfg.debug:
print('Memory Stats: ', permanent_memory.get_stats())
next_message_to_add_index, current_tokens_used, insertion_index, current_context = generate_context(
@ -107,7 +110,7 @@ def chat_with_ai(
# assert tokens_remaining >= 0, "Tokens remaining is negative. This should never happen, please submit a bug report at https://www.github.com/Torantulino/Auto-GPT"
# Debug print the current context
if debug:
if cfg.debug:
print(f"Token limit: {token_limit}")
print(f"Send Token Count: {current_tokens_used}")
print(f"Tokens remaining for response: {tokens_remaining}")

View File

@ -1,6 +1,6 @@
import browse
import json
from factory import MemoryFactory
from memory import get_memory
import datetime
import agent_manager as agents
import speak
@ -25,6 +25,7 @@ def is_valid_int(value):
return False
def get_command(response):
"""Parse the response and return the command name and arguments"""
try:
response_json = fix_and_parse_json(response)
@ -41,9 +42,6 @@ def get_command(response):
# Use an empty dictionary if 'args' field is not present in 'command' object
arguments = command.get("args", {})
if not arguments:
arguments = {}
return command_name, arguments
except json.decoder.JSONDecodeError:
return "Error:", "Invalid JSON"
@ -53,7 +51,9 @@ def get_command(response):
def execute_command(command_name, arguments):
memory = MemoryFactory.get_memory(cfg.memory_provider)
"""Execute the command and return the result"""
memory = get_memory(cfg)
try:
if command_name == "google":
@ -105,21 +105,25 @@ def execute_command(command_name, arguments):
return execute_python_file(arguments["file"])
elif command_name == "generate_image":
return generate_image(arguments["prompt"])
elif command_name == "do_nothing":
return "No action performed."
elif command_name == "task_complete":
shutdown()
else:
return f"Unknown command {command_name}"
return f"Unknown command '{command_name}'. Please refer to the 'COMMANDS' list for availabe commands and only respond in the specified JSON format."
# All errors, return "Error: + error message"
except Exception as e:
return "Error: " + str(e)
def get_datetime():
"""Return the current date and time"""
return "Current date and time: " + \
datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
def google_search(query, num_results=8):
"""Return the results of a google search"""
search_results = []
for j in ddg(query, max_results=num_results):
search_results.append(j)
@ -127,6 +131,7 @@ def google_search(query, num_results=8):
return json.dumps(search_results, ensure_ascii=False, indent=4)
def google_official_search(query, num_results=8):
"""Return the results of a google search using the official Google API"""
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError
import json
@ -162,6 +167,7 @@ def google_official_search(query, num_results=8):
return search_results_links
def browse_website(url, question):
"""Browse a website and return the summary and links"""
summary = get_text_summary(url, question)
links = get_hyperlinks(url)
@ -175,22 +181,72 @@ def browse_website(url, question):
def get_text_summary(url, question):
"""Return the results of a google search"""
text = browse.scrape_text(url)
summary = browse.summarize_text(text, question)
return """ "Result" : """ + summary
def get_hyperlinks(url):
"""Return the results of a google search"""
link_list = browse.scrape_links(url)
return link_list
def commit_memory(string):
"""Commit a string to memory"""
_text = f"""Committing memory with string "{string}" """
mem.permanent_memory.append(string)
return _text
def delete_memory(key):
"""Delete a memory with a given key"""
if key >= 0 and key < len(mem.permanent_memory):
_text = "Deleting memory with key " + str(key)
del mem.permanent_memory[key]
print(_text)
return _text
else:
print("Invalid key, cannot delete memory.")
return None
def overwrite_memory(key, string):
"""Overwrite a memory with a given key and string"""
# Check if the key is a valid integer
if is_valid_int(key):
key_int = int(key)
# Check if the integer key is within the range of the permanent_memory list
if 0 <= key_int < len(mem.permanent_memory):
_text = "Overwriting memory with key " + str(key) + " and string " + string
# Overwrite the memory slot with the given integer key and string
mem.permanent_memory[key_int] = string
print(_text)
return _text
else:
print(f"Invalid key '{key}', out of range.")
return None
# Check if the key is a valid string
elif isinstance(key, str):
_text = "Overwriting memory with key " + key + " and string " + string
# Overwrite the memory slot with the given string key and string
mem.permanent_memory[key] = string
print(_text)
return _text
else:
print(f"Invalid key '{key}', must be an integer or a string.")
return None
def shutdown():
"""Shut down the program"""
print("Shutting down...")
quit()
def start_agent(name, task, prompt, model=cfg.fast_llm_model):
"""Start an agent with a given name, task, and prompt"""
global cfg
# Remove underscores from name
@ -214,6 +270,7 @@ def start_agent(name, task, prompt, model=cfg.fast_llm_model):
def message_agent(key, message):
"""Message an agent with a given key and message"""
global cfg
# Check if the key is a valid integer
@ -232,10 +289,12 @@ def message_agent(key, message):
def list_agents():
"""List all agents"""
return agents.list_agents()
def delete_agent(key):
"""Delete an agent with a given key"""
result = agents.delete_agent(key)
if not result:
return f"Agent {key} does not exist."

View File

@ -1,3 +1,4 @@
import abc
import os
import openai
from dotenv import load_dotenv
@ -5,7 +6,7 @@ from dotenv import load_dotenv
load_dotenv()
class Singleton(type):
class Singleton(abc.ABCMeta, type):
"""
Singleton metaclass for ensuring only one instance of a class.
"""
@ -13,6 +14,7 @@ class Singleton(type):
_instances = {}
def __call__(cls, *args, **kwargs):
"""Call method for the singleton metaclass."""
if cls not in cls._instances:
cls._instances[cls] = super(
Singleton, cls).__call__(
@ -20,15 +22,21 @@ class Singleton(type):
return cls._instances[cls]
class AbstractSingleton(abc.ABC, metaclass=Singleton):
pass
class Config(metaclass=Singleton):
"""
Configuration class to store the state of bools for different scripts access.
"""
def __init__(self):
"""Initialize the Config class"""
self.debug = False
self.continuous_mode = False
self.speak_mode = False
# TODO - make these models be self-contained, using langchain, so we can configure them once and call it good
self.fast_llm_model = os.getenv("FAST_LLM_MODEL", "gpt-3.5-turbo")
self.smart_llm_model = os.getenv("SMART_LLM_MODEL", "gpt-4")
self.fast_token_limit = int(os.getenv("FAST_TOKEN_LIMIT", 4000))
@ -38,19 +46,21 @@ class Config(metaclass=Singleton):
self.use_azure = False
self.use_azure = os.getenv("USE_AZURE") == 'True'
if self.use_azure:
self.openai_api_base = os.getenv("OPENAI_API_BASE")
self.openai_api_version = os.getenv("OPENAI_API_VERSION")
self.openai_deployment_id = os.getenv("OPENAI_DEPLOYMENT_ID")
self.openai_api_base = os.getenv("OPENAI_AZURE_API_BASE")
self.openai_api_version = os.getenv("OPENAI_AZURE_API_VERSION")
self.openai_deployment_id = os.getenv("OPENAI_AZURE_DEPLOYMENT_ID")
openai.api_type = "azure"
openai.api_base = self.openai_api_base
openai.api_version = self.openai_api_version
self.elevenlabs_api_key = os.getenv("ELEVENLABS_API_KEY")
self.use_mac_os_tts = False
self.use_mac_os_tts = os.getenv("USE_MAC_OS_TTS")
self.google_api_key = os.getenv("GOOGLE_API_KEY")
self.custom_search_engine_id = os.getenv("CUSTOM_SEARCH_ENGINE_ID")
self.memory_provider = os.getenv("MEMORY_PROVIDER", 'pinecone')
self.pinecone_api_key = os.getenv("PINECONE_API_KEY")
self.pinecone_region = os.getenv("PINECONE_ENV")
@ -59,7 +69,6 @@ class Config(metaclass=Singleton):
self.weaviate_username = os.getenv("WEAVIATE_USERNAME", None)
self.weaviate_password = os.getenv("WEAVIATE_PASSWORD", None)
self.weaviate_scopes = os.getenv("WEAVIATE_SCOPES", None)
self.weaviate_index = os.getenv("WEAVIATE_INDEX", 'auto-gpt')
self.image_provider = os.getenv("IMAGE_PROVIDER")
self.huggingface_api_token = os.getenv("HUGGINGFACE_API_TOKEN")
@ -67,42 +76,68 @@ class Config(metaclass=Singleton):
# User agent headers to use when browsing web
# Some websites might just completely deny request with an error code if no user agent was found.
self.user_agent_header = {"User-Agent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.97 Safari/537.36"}
self.redis_host = os.getenv("REDIS_HOST", "localhost")
self.redis_port = os.getenv("REDIS_PORT", "6379")
self.redis_password = os.getenv("REDIS_PASSWORD", "")
self.wipe_redis_on_start = os.getenv("WIPE_REDIS_ON_START", "True") == 'True'
self.memory_index = os.getenv("MEMORY_INDEX", 'auto-gpt')
# Note that indexes must be created on db 0 in redis, this is not configureable.
self.memory_backend = os.getenv("MEMORY_BACKEND", 'local')
# Initialize the OpenAI API client
openai.api_key = self.openai_api_key
def set_continuous_mode(self, value: bool):
"""Set the continuous mode value."""
self.continuous_mode = value
def set_speak_mode(self, value: bool):
"""Set the speak mode value."""
self.speak_mode = value
def set_debug_mode(self, value: bool):
self.debug_mode = value
def set_fast_llm_model(self, value: str):
"""Set the fast LLM model value."""
self.fast_llm_model = value
def set_smart_llm_model(self, value: str):
"""Set the smart LLM model value."""
self.smart_llm_model = value
def set_fast_token_limit(self, value: int):
"""Set the fast token limit value."""
self.fast_token_limit = value
def set_smart_token_limit(self, value: int):
"""Set the smart token limit value."""
self.smart_token_limit = value
def set_openai_api_key(self, value: str):
"""Set the OpenAI API key value."""
self.openai_api_key = value
def set_elevenlabs_api_key(self, value: str):
"""Set the ElevenLabs API key value."""
self.elevenlabs_api_key = value
def set_google_api_key(self, value: str):
"""Set the Google API key value."""
self.google_api_key = value
def set_custom_search_engine_id(self, value: str):
"""Set the custom search engine id value."""
self.custom_search_engine_id = value
def set_pinecone_api_key(self, value: str):
"""Set the Pinecone API key value."""
self.pinecone_api_key = value
def set_pinecone_region(self, value: str):
"""Set the Pinecone region value."""
self.pinecone_region = value
def set_debug_mode(self, value: bool):
"""Set the debug mode value."""
self.debug = value

View File

@ -2,6 +2,7 @@ import os
from pathlib import Path
def load_prompt():
"""Load the prompt from data/prompt.txt"""
try:
# get directory of this file:
file_dir = Path(__file__).parent

View File

@ -24,6 +24,7 @@ COMMANDS:
18. Execute Python File: "execute_python_file", args: "file": "<file>"
19. Task Complete (Shutdown): "task_complete", args: "reason": "<reason>"
20. Generate Image: "generate_image", args: "prompt": "<prompt>"
21. Do Nothing: "do_nothing", args: ""
RESOURCES:

View File

@ -3,6 +3,7 @@ import os
def execute_python_file(file):
"""Execute a Python file in a Docker container and return the output"""
workspace_folder = "auto_gpt_workspace"
print (f"Executing file '{file}' in workspace '{workspace_folder}'")

View File

@ -1,13 +0,0 @@
from providers.pinecone import PineconeMemory
from providers.weaviate import WeaviateMemory
class MemoryFactory:
@staticmethod
def get_memory(mem_type):
if mem_type == 'pinecone':
return PineconeMemory()
if mem_type == 'weaviate':
return WeaviateMemory()
raise ValueError('Unknown memory provider')

View File

@ -4,11 +4,13 @@ import os.path
# Set a dedicated folder for file I/O
working_directory = "auto_gpt_workspace"
# Create the directory if it doesn't exist
if not os.path.exists(working_directory):
os.makedirs(working_directory)
def safe_join(base, *paths):
"""Join one or more path components intelligently."""
new_path = os.path.join(base, *paths)
norm_new_path = os.path.normpath(new_path)
@ -19,9 +21,10 @@ def safe_join(base, *paths):
def read_file(filename):
"""Read a file and return the contents"""
try:
filepath = safe_join(working_directory, filename)
with open(filepath, "r") as f:
with open(filepath, "r", encoding='utf-8') as f:
content = f.read()
return content
except Exception as e:
@ -29,6 +32,7 @@ def read_file(filename):
def write_to_file(filename, text):
"""Write text to a file"""
try:
filepath = safe_join(working_directory, filename)
directory = os.path.dirname(filepath)
@ -42,6 +46,7 @@ def write_to_file(filename, text):
def append_to_file(filename, text):
"""Append text to a file"""
try:
filepath = safe_join(working_directory, filename)
with open(filepath, "a") as f:
@ -52,6 +57,7 @@ def append_to_file(filename, text):
def delete_file(filename):
"""Delete a file"""
try:
filepath = safe_join(working_directory, filename)
os.remove(filepath)

View File

@ -1,10 +1,12 @@
import json
from typing import Any, Dict, Union
from call_ai_function import call_ai_function
from config import Config
from json_utils import correct_json
cfg = Config()
def fix_and_parse_json(json_str: str, try_to_fix_with_gpt: bool = True):
json_schema = """
JSON_SCHEMA = """
{
"command": {
"name": "command name",
@ -23,37 +25,65 @@ def fix_and_parse_json(json_str: str, try_to_fix_with_gpt: bool = True):
}
"""
def fix_and_parse_json(
json_str: str,
try_to_fix_with_gpt: bool = True
) -> Union[str, Dict[Any, Any]]:
"""Fix and parse JSON string"""
try:
json_str = json_str.replace('\t', '')
return json.loads(json_str)
except Exception as e:
# Let's do something manually - sometimes GPT responds with something BEFORE the braces:
# "I'm sorry, I don't understand. Please try again."{"text": "I'm sorry, I don't understand. Please try again.", "confidence": 0.0}
# So let's try to find the first brace and then parse the rest of the string
except json.JSONDecodeError as _: # noqa: F841
json_str = correct_json(json_str)
try:
return json.loads(json_str)
except json.JSONDecodeError as _: # noqa: F841
pass
# Let's do something manually:
# sometimes GPT responds with something BEFORE the braces:
# "I'm sorry, I don't understand. Please try again."
# {"text": "I'm sorry, I don't understand. Please try again.",
# "confidence": 0.0}
# So let's try to find the first brace and then parse the rest
# of the string
try:
brace_index = json_str.index("{")
json_str = json_str[brace_index:]
last_brace_index = json_str.rindex("}")
json_str = json_str[:last_brace_index+1]
return json.loads(json_str)
except Exception as e:
except json.JSONDecodeError as e: # noqa: F841
if try_to_fix_with_gpt:
print(f"Warning: Failed to parse AI output, attempting to fix.\n If you see this warning frequently, it's likely that your prompt is confusing the AI. Try changing it up slightly.")
print("Warning: Failed to parse AI output, attempting to fix."
"\n If you see this warning frequently, it's likely that"
" your prompt is confusing the AI. Try changing it up"
" slightly.")
# Now try to fix this up using the ai_functions
ai_fixed_json = fix_json(json_str, json_schema, False)
ai_fixed_json = fix_json(json_str, JSON_SCHEMA)
if ai_fixed_json != "failed":
return json.loads(ai_fixed_json)
else:
print(f"Failed to fix ai output, telling the AI.") # This allows the AI to react to the error message, which usually results in it correcting its ways.
# This allows the AI to react to the error message,
# which usually results in it correcting its ways.
print("Failed to fix ai output, telling the AI.")
return json_str
else:
raise e
def fix_json(json_str: str, schema: str, debug=False) -> str:
def fix_json(json_str: str, schema: str) -> str:
"""Fix the given JSON string to make it parseable and fully complient with the provided schema."""
# Try to fix the JSON using gpt:
function_string = "def fix_json(json_str: str, schema:str=None) -> str:"
args = [f"'''{json_str}'''", f"'''{schema}'''"]
description_string = """Fixes the provided JSON string to make it parseable and fully complient with the provided schema.\n If an object or field specifed in the schema isn't contained within the correct JSON, it is ommited.\n This function is brilliant at guessing when the format is incorrect."""
description_string = "Fixes the provided JSON string to make it parseable"\
" and fully complient with the provided schema.\n If an object or"\
" field specified in the schema isn't contained within the correct"\
" JSON, it is ommited.\n This function is brilliant at guessing"\
" when the format is incorrect."
# If it doesn't already start with a "`", add one:
if not json_str.startswith("`"):
@ -61,16 +91,17 @@ def fix_json(json_str: str, schema: str, debug=False) -> str:
result_string = call_ai_function(
function_string, args, description_string, model=cfg.fast_llm_model
)
if debug:
if cfg.debug:
print("------------ JSON FIX ATTEMPT ---------------")
print(f"Original JSON: {json_str}")
print("-----------")
print(f"Fixed JSON: {result_string}")
print("----------- END OF FIX ATTEMPT ----------------")
try:
json.loads(result_string) # just check the validity
return result_string
except:
except: # noqa: E722
# Get the call stack:
# import traceback
# call_stack = traceback.format_exc()

127
scripts/json_utils.py Normal file
View File

@ -0,0 +1,127 @@
import re
import json
from config import Config
cfg = Config()
def extract_char_position(error_message: str) -> int:
"""Extract the character position from the JSONDecodeError message.
Args:
error_message (str): The error message from the JSONDecodeError
exception.
Returns:
int: The character position.
"""
import re
char_pattern = re.compile(r'\(char (\d+)\)')
if match := char_pattern.search(error_message):
return int(match[1])
else:
raise ValueError("Character position not found in the error message.")
def add_quotes_to_property_names(json_string: str) -> str:
"""
Add quotes to property names in a JSON string.
Args:
json_string (str): The JSON string.
Returns:
str: The JSON string with quotes added to property names.
"""
def replace_func(match):
return f'"{match.group(1)}":'
property_name_pattern = re.compile(r'(\w+):')
corrected_json_string = property_name_pattern.sub(
replace_func,
json_string)
try:
json.loads(corrected_json_string)
return corrected_json_string
except json.JSONDecodeError as e:
raise e
def balance_braces(json_string: str) -> str:
"""
Balance the braces in a JSON string.
Args:
json_string (str): The JSON string.
Returns:
str: The JSON string with braces balanced.
"""
open_braces_count = json_string.count('{')
close_braces_count = json_string.count('}')
while open_braces_count > close_braces_count:
json_string += '}'
close_braces_count += 1
while close_braces_count > open_braces_count:
json_string = json_string.rstrip('}')
close_braces_count -= 1
try:
json.loads(json_string)
return json_string
except json.JSONDecodeError as e:
pass
def fix_invalid_escape(json_str: str, error_message: str) -> str:
while error_message.startswith('Invalid \\escape'):
bad_escape_location = extract_char_position(error_message)
json_str = json_str[:bad_escape_location] + \
json_str[bad_escape_location + 1:]
try:
json.loads(json_str)
return json_str
except json.JSONDecodeError as e:
if cfg.debug:
print('json loads error - fix invalid escape', e)
error_message = str(e)
return json_str
def correct_json(json_str: str) -> str:
"""
Correct common JSON errors.
Args:
json_str (str): The JSON string.
"""
try:
if cfg.debug:
print("json", json_str)
json.loads(json_str)
return json_str
except json.JSONDecodeError as e:
if cfg.debug:
print('json loads error', e)
error_message = str(e)
if error_message.startswith('Invalid \\escape'):
json_str = fix_invalid_escape(json_str, error_message)
if error_message.startswith('Expecting property name enclosed in double quotes'):
json_str = add_quotes_to_property_names(json_str)
try:
json.loads(json_str)
return json_str
except json.JSONDecodeError as e:
if cfg.debug:
print('json loads error - add quotes', e)
error_message = str(e)
if balanced_str := balance_braces(json_str):
return balanced_str
return json_str

View File

@ -6,6 +6,7 @@ openai.api_key = cfg.openai_api_key
# Overly simple abstraction until we create something better
def create_chat_completion(messages, model=None, temperature=None, max_tokens=None)->str:
"""Create a chat completion using the OpenAI API"""
if cfg.use_azure:
response = openai.ChatCompletion.create(
deployment_id=cfg.openai_deployment_id,

View File

@ -1,22 +1,41 @@
import json
import random
import commands as cmd
from factory import MemoryFactory
import utils
from memory import get_memory
import data
import chat
from colorama import Fore, Style
from spinner import Spinner
import time
import speak
from enum import Enum, auto
import sys
from config import Config
from json_parser import fix_and_parse_json
from ai_config import AIConfig
import traceback
import yaml
import argparse
import logging
cfg = Config()
def configure_logging():
logging.basicConfig(filename='log.txt',
filemode='a',
format='%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s',
datefmt='%H:%M:%S',
level=logging.DEBUG)
return logging.getLogger('AutoGPT')
def check_openai_api_key():
"""Check if the OpenAI API key is set in config.py or as an environment variable."""
if not cfg.openai_api_key:
print(
Fore.RED +
"Please set your OpenAI API key in config.py or as an environment variable."
)
print("You can get your key from https://beta.openai.com/account/api-keys")
exit(1)
def print_to_console(
title,
@ -25,11 +44,14 @@ def print_to_console(
speak_text=False,
min_typing_speed=0.05,
max_typing_speed=0.01):
"""Prints text to the console with a typing effect"""
global cfg
global logger
if speak_text and cfg.speak_mode:
speak.say_text(f"{title}. {content}")
print(title_color + title + " " + Style.RESET_ALL, end="")
if content:
logger.info(title + ': ' + content)
if isinstance(content, list):
content = " ".join(content)
words = content.split()
@ -46,6 +68,7 @@ def print_to_console(
def print_assistant_thoughts(assistant_reply):
"""Prints the assistant's thoughts to the console"""
global ai_name
global cfg
try:
@ -105,7 +128,7 @@ def print_assistant_thoughts(assistant_reply):
def load_variables(config_file="config.yaml"):
# Load variables from yaml file if it exists
"""Load variables from yaml file if it exists, otherwise prompt the user for input"""
try:
with open(config_file) as file:
config = yaml.load(file, Loader=yaml.FullLoader)
@ -119,12 +142,12 @@ def load_variables(config_file="config.yaml"):
# Prompt the user for input if config file is missing or empty values
if not ai_name:
ai_name = input("Name your AI: ")
ai_name = utils.clean_input("Name your AI: ")
if ai_name == "":
ai_name = "Entrepreneur-GPT"
if not ai_role:
ai_role = input(f"{ai_name} is: ")
ai_role = utils.clean_input(f"{ai_name} is: ")
if ai_role == "":
ai_role = "an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth."
@ -134,7 +157,7 @@ def load_variables(config_file="config.yaml"):
print("Enter nothing to load defaults, enter nothing when finished.")
ai_goals = []
for i in range(5):
ai_goal = input(f"Goal {i+1}: ")
ai_goal = utils.clean_input(f"Goal {i+1}: ")
if ai_goal == "":
break
ai_goals.append(ai_goal)
@ -159,6 +182,7 @@ def load_variables(config_file="config.yaml"):
def construct_prompt():
"""Construct the prompt for the AI to respond to"""
config = AIConfig.load()
if config.ai_name:
print_to_console(
@ -166,7 +190,7 @@ def construct_prompt():
Fore.GREEN,
f"Would you like me to return to being {config.ai_name}?",
speak_text=True)
should_continue = input(f"""Continue with the last settings?
should_continue = utils.clean_input(f"""Continue with the last settings?
Name: {config.ai_name}
Role: {config.ai_role}
Goals: {config.ai_goals}
@ -187,6 +211,7 @@ Continue (y/n): """)
def prompt_user():
"""Prompt the user for input"""
ai_name = ""
# Construct the prompt
print_to_console(
@ -200,7 +225,7 @@ def prompt_user():
"Name your AI: ",
Fore.GREEN,
"For example, 'Entrepreneur-GPT'")
ai_name = input("AI Name: ")
ai_name = utils.clean_input("AI Name: ")
if ai_name == "":
ai_name = "Entrepreneur-GPT"
@ -215,7 +240,7 @@ def prompt_user():
"Describe your AI's role: ",
Fore.GREEN,
"For example, 'an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth.'")
ai_role = input(f"{ai_name} is: ")
ai_role = utils.clean_input(f"{ai_name} is: ")
if ai_role == "":
ai_role = "an AI designed to autonomously develop and run businesses with the sole goal of increasing your net worth."
@ -227,7 +252,7 @@ def prompt_user():
print("Enter nothing to load defaults, enter nothing when finished.", flush=True)
ai_goals = []
for i in range(5):
ai_goal = input(f"{Fore.LIGHTBLUE_EX}Goal{Style.RESET_ALL} {i+1}: ")
ai_goal = utils.clean_input(f"{Fore.LIGHTBLUE_EX}Goal{Style.RESET_ALL} {i+1}: ")
if ai_goal == "":
break
ai_goals.append(ai_goal)
@ -239,6 +264,7 @@ def prompt_user():
return config
def parse_arguments():
"""Parses the arguments passed to the script"""
global cfg
cfg.set_continuous_mode(False)
cfg.set_speak_mode(False)
@ -262,14 +288,23 @@ def parse_arguments():
print_to_console("Speak Mode: ", Fore.GREEN, "ENABLED")
cfg.set_speak_mode(True)
if args.debug:
print_to_console("Debug Mode: ", Fore.GREEN, "ENABLED")
cfg.set_debug_mode(True)
if args.gpt3only:
print_to_console("GPT3.5 Only Mode: ", Fore.GREEN, "ENABLED")
cfg.set_smart_llm_model(cfg.fast_llm_model)
if args.debug:
print_to_console("Debug Mode: ", Fore.GREEN, "ENABLED")
cfg.set_debug_mode(True)
# TODO: fill in llm values here
check_openai_api_key()
cfg = Config()
logger = configure_logging()
parse_arguments()
ai_name = ""
prompt = construct_prompt()
@ -283,8 +318,7 @@ user_input = "Determine which next command to use, and respond using the format
# Initialize memory and make sure it is empty.
# this is particularly important for indexing and referencing pinecone memory
memory = MemoryFactory.get_memory(cfg.memory_provider)
memory.clear()
memory = get_memory(cfg, init=True)
print('Using memory of type: ' + memory.__class__.__name__)
# Interaction Loop
@ -320,7 +354,7 @@ while True:
f"Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for {ai_name}...",
flush=True)
while True:
console_input = input(Fore.MAGENTA + "Input:" + Style.RESET_ALL)
console_input = utils.clean_input(Fore.MAGENTA + "Input:" + Style.RESET_ALL)
if console_input.lower() == "y":
user_input = "GENERATE NEXT COMMAND JSON"
break
@ -356,7 +390,7 @@ while True:
f"COMMAND = {Fore.CYAN}{command_name}{Style.RESET_ALL} ARGUMENTS = {Fore.CYAN}{arguments}{Style.RESET_ALL}")
# Execute command
if command_name.lower() == "error":
if command_name.lower().startswith( "error" ):
result = f"Command {command_name} threw the following error: " + arguments
elif command_name == "human_feedback":
result = f"Human feedback: {user_input}"
@ -381,4 +415,3 @@ while True:
chat.create_chat_message(
"system", "Unable to execute command"))
print_to_console("SYSTEM: ", Fore.YELLOW, "Unable to execute command")

View File

@ -0,0 +1,56 @@
from memory.local import LocalCache
try:
from memory.redismem import RedisMemory
except ImportError:
print("Redis not installed. Skipping import.")
RedisMemory = None
try:
from memory.pinecone import PineconeMemory
except ImportError:
print("Pinecone not installed. Skipping import.")
PineconeMemory = None
try:
from memory.weaviate import WeaviateMemory
except ImportError:
print("Weaviate not installed. Skipping import.")
WeaviateMemory = None
def get_memory(cfg, init=False):
memory = None
if cfg.memory_backend == "pinecone":
if not PineconeMemory:
print("Error: Pinecone is not installed. Please install pinecone"
" to use Pinecone as a memory backend.")
else:
memory = PineconeMemory(cfg)
if init:
memory.clear()
elif cfg.memory_backend == "redis":
if not RedisMemory:
print("Error: Redis is not installed. Please install redis-py to"
" use Redis as a memory backend.")
else:
memory = RedisMemory(cfg)
elif cfg.memory_backend == "weaviate":
if not WeaviateMemory:
print("Error: Weaviate is not installed. Please install weaviate-client to"
" use Weaviate as a memory backend.")
else:
memory = WeaviateMemory(cfg)
if memory is None:
memory = LocalCache(cfg)
if init:
memory.clear()
return memory
__all__ = [
"get_memory",
"LocalCache",
"RedisMemory",
"PineconeMemory",
"WeaviateMemory"
]

31
scripts/memory/base.py Normal file
View File

@ -0,0 +1,31 @@
"""Base class for memory providers."""
import abc
from config import AbstractSingleton
import openai
def get_ada_embedding(text):
text = text.replace("\n", " ")
return openai.Embedding.create(input=[text], model="text-embedding-ada-002")["data"][0]["embedding"]
class MemoryProviderSingleton(AbstractSingleton):
@abc.abstractmethod
def add(self, data):
pass
@abc.abstractmethod
def get(self, data):
pass
@abc.abstractmethod
def clear(self):
pass
@abc.abstractmethod
def get_relevant(self, data, num_relevant=5):
pass
@abc.abstractmethod
def get_stats(self):
pass

114
scripts/memory/local.py Normal file
View File

@ -0,0 +1,114 @@
import dataclasses
import orjson
from typing import Any, List, Optional
import numpy as np
import os
from memory.base import MemoryProviderSingleton, get_ada_embedding
EMBED_DIM = 1536
SAVE_OPTIONS = orjson.OPT_SERIALIZE_NUMPY | orjson.OPT_SERIALIZE_DATACLASS
def create_default_embeddings():
return np.zeros((0, EMBED_DIM)).astype(np.float32)
@dataclasses.dataclass
class CacheContent:
texts: List[str] = dataclasses.field(default_factory=list)
embeddings: np.ndarray = dataclasses.field(
default_factory=create_default_embeddings
)
class LocalCache(MemoryProviderSingleton):
# on load, load our database
def __init__(self, cfg) -> None:
self.filename = f"{cfg.memory_index}.json"
if os.path.exists(self.filename):
with open(self.filename, 'rb') as f:
loaded = orjson.loads(f.read())
self.data = CacheContent(**loaded)
else:
self.data = CacheContent()
def add(self, text: str):
"""
Add text to our list of texts, add embedding as row to our
embeddings-matrix
Args:
text: str
Returns: None
"""
if 'Command Error:' in text:
return ""
self.data.texts.append(text)
embedding = get_ada_embedding(text)
vector = np.array(embedding).astype(np.float32)
vector = vector[np.newaxis, :]
self.data.embeddings = np.concatenate(
[
vector,
self.data.embeddings,
],
axis=0,
)
with open(self.filename, 'wb') as f:
out = orjson.dumps(
self.data,
option=SAVE_OPTIONS
)
f.write(out)
return text
def clear(self) -> str:
"""
Clears the redis server.
Returns: A message indicating that the memory has been cleared.
"""
self.data = CacheContent()
return "Obliviated"
def get(self, data: str) -> Optional[List[Any]]:
"""
Gets the data from the memory that is most relevant to the given data.
Args:
data: The data to compare to.
Returns: The most relevant data.
"""
return self.get_relevant(data, 1)
def get_relevant(self, text: str, k: int) -> List[Any]:
""""
matrix-vector mult to find score-for-each-row-of-matrix
get indices for top-k winning scores
return texts for those indices
Args:
text: str
k: int
Returns: List[str]
"""
embedding = get_ada_embedding(text)
scores = np.dot(self.data.embeddings, embedding)
top_k_indices = np.argsort(scores)[-k:][::-1]
return [self.data.texts[i] for i in top_k_indices]
def get_stats(self):
"""
Returns: The stats of the local cache.
"""
return len(self.data.texts), self.data.embeddings.shape

View File

@ -1,14 +1,10 @@
from config import Config
from providers.memory import Memory, get_ada_embedding
import pinecone
cfg = Config()
from memory.base import MemoryProviderSingleton, get_ada_embedding
class PineconeMemory(Memory):
def __init__(self):
# raise an exception if pinecone_api_key or region is not provided
if not cfg.pinecone_api_key or not cfg.pinecone_region: raise Exception("Please provide pinecone_api_key and pinecone_region")
class PineconeMemory(MemoryProviderSingleton):
def __init__(self, cfg):
pinecone_api_key = cfg.pinecone_api_key
pinecone_region = cfg.pinecone_region
pinecone.init(api_key=pinecone_api_key, environment=pinecone_region)

143
scripts/memory/redismem.py Normal file
View File

@ -0,0 +1,143 @@
"""Redis memory provider."""
from typing import Any, List, Optional
import redis
from redis.commands.search.field import VectorField, TextField
from redis.commands.search.query import Query
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
import numpy as np
from memory.base import MemoryProviderSingleton, get_ada_embedding
SCHEMA = [
TextField("data"),
VectorField(
"embedding",
"HNSW",
{
"TYPE": "FLOAT32",
"DIM": 1536,
"DISTANCE_METRIC": "COSINE"
}
),
]
class RedisMemory(MemoryProviderSingleton):
def __init__(self, cfg):
"""
Initializes the Redis memory provider.
Args:
cfg: The config object.
Returns: None
"""
redis_host = cfg.redis_host
redis_port = cfg.redis_port
redis_password = cfg.redis_password
self.dimension = 1536
self.redis = redis.Redis(
host=redis_host,
port=redis_port,
password=redis_password,
db=0 # Cannot be changed
)
self.cfg = cfg
if cfg.wipe_redis_on_start:
self.redis.flushall()
try:
self.redis.ft(f"{cfg.memory_index}").create_index(
fields=SCHEMA,
definition=IndexDefinition(
prefix=[f"{cfg.memory_index}:"],
index_type=IndexType.HASH
)
)
except Exception as e:
print("Error creating Redis search index: ", e)
existing_vec_num = self.redis.get(f'{cfg.memory_index}-vec_num')
self.vec_num = int(existing_vec_num.decode('utf-8')) if\
existing_vec_num else 0
def add(self, data: str) -> str:
"""
Adds a data point to the memory.
Args:
data: The data to add.
Returns: Message indicating that the data has been added.
"""
if 'Command Error:' in data:
return ""
vector = get_ada_embedding(data)
vector = np.array(vector).astype(np.float32).tobytes()
data_dict = {
b"data": data,
"embedding": vector
}
pipe = self.redis.pipeline()
pipe.hset(f"{self.cfg.memory_index}:{self.vec_num}", mapping=data_dict)
_text = f"Inserting data into memory at index: {self.vec_num}:\n"\
f"data: {data}"
self.vec_num += 1
pipe.set(f'{self.cfg.memory_index}-vec_num', self.vec_num)
pipe.execute()
return _text
def get(self, data: str) -> Optional[List[Any]]:
"""
Gets the data from the memory that is most relevant to the given data.
Args:
data: The data to compare to.
Returns: The most relevant data.
"""
return self.get_relevant(data, 1)
def clear(self) -> str:
"""
Clears the redis server.
Returns: A message indicating that the memory has been cleared.
"""
self.redis.flushall()
return "Obliviated"
def get_relevant(
self,
data: str,
num_relevant: int = 5
) -> Optional[List[Any]]:
"""
Returns all the data in the memory that is relevant to the given data.
Args:
data: The data to compare to.
num_relevant: The number of relevant data to return.
Returns: A list of the most relevant data.
"""
query_embedding = get_ada_embedding(data)
base_query = f"*=>[KNN {num_relevant} @embedding $vector AS vector_score]"
query = Query(base_query).return_fields(
"data",
"vector_score"
).sort_by("vector_score").dialect(2)
query_vector = np.array(query_embedding).astype(np.float32).tobytes()
try:
results = self.redis.ft(f"{self.cfg.memory_index}").search(
query, query_params={"vector": query_vector}
)
except Exception as e:
print("Error calling Redis search: ", e)
return None
return [result.data for result in results.docs]
def get_stats(self):
"""
Returns: The stats of the memory index.
"""
return self.redis.ft(f"{self.cfg.memory_index}").info()

View File

@ -1,15 +1,13 @@
from config import Config
from providers.memory import Memory, get_ada_embedding
from weaviate import Client
import weaviate
from memory.base import MemoryProviderSingleton, get_ada_embedding
import uuid
import weaviate
from weaviate import Client
from weaviate.util import generate_uuid5
cfg = Config()
SCHEMA = {
"class": cfg.weaviate_index,
def default_schema(weaviate_index):
return {
"class": weaviate_index,
"properties": [
{
"name": "raw_text",
@ -19,23 +17,22 @@ SCHEMA = {
],
}
class WeaviateMemory(Memory):
def __init__(self):
auth_credentials = self._build_auth_credentials()
class WeaviateMemory(MemoryProviderSingleton):
def __init__(self, cfg):
auth_credentials = self._build_auth_credentials(cfg)
url = f'{cfg.weaviate_host}:{cfg.weaviate_port}'
self.client = Client(url, auth_client_secret=auth_credentials)
self.index = cfg.memory_index
self._create_schema()
def _create_schema(self):
if not self.client.schema.contains(SCHEMA):
self.client.schema.create_class(SCHEMA)
schema = default_schema(self.index)
if not self.client.schema.contains(schema):
self.client.schema.create_class(schema)
@staticmethod
def _build_auth_credentials():
def _build_auth_credentials(self, cfg):
if cfg.weaviate_username and cfg.weaviate_password:
return weaviate_auth.AuthClientPassword(cfg.weaviate_username, cfg.weaviate_password)
else:
@ -44,9 +41,9 @@ class WeaviateMemory(Memory):
def add(self, data):
vector = get_ada_embedding(data)
doc_uuid = generate_uuid5(data, cfg.weaviate_index)
doc_uuid = generate_uuid5(data, self.index)
data_object = {
'class': cfg.weaviate_index,
'class': self.index,
'raw_text': data
}
@ -54,7 +51,7 @@ class WeaviateMemory(Memory):
batch.add_data_object(
uuid=doc_uuid,
data_object=data_object,
class_name=cfg.weaviate_index,
class_name=self.index,
vector=vector
)
@ -80,15 +77,13 @@ class WeaviateMemory(Memory):
def get_relevant(self, data, num_relevant=5):
query_embedding = get_ada_embedding(data)
try:
results = self.client.query.get(cfg.weaviate_index, ['raw_text']) \
results = self.client.query.get(self.index, ['raw_text']) \
.with_near_vector({'vector': query_embedding, 'certainty': 0.7}) \
.with_limit(num_relevant) \
.do()
print(results)
if len(results['data']['Get'][cfg.weaviate_index]) > 0:
return [str(item['raw_text']) for item in results['data']['Get'][cfg.weaviate_index]]
if len(results['data']['Get'][self.index]) > 0:
return [str(item['raw_text']) for item in results['data']['Get'][self.index]]
else:
return []
@ -97,4 +92,9 @@ class WeaviateMemory(Memory):
return []
def get_stats(self):
return self.client.index_stats.get(cfg.weaviate_index)
result = self.client.query.aggregate(self.index) \
.with_meta_count() \
.do()
class_data = result['data']['Aggregate'][self.index]
return class_data[0]['meta'] if class_data else {}

View File

@ -1,26 +0,0 @@
from config import Singleton
import openai
def get_ada_embedding(text):
text = text.replace("\n", " ")
return openai.Embedding.create(input=[text], model="text-embedding-ada-002")["data"][0]["embedding"]
def get_text_from_embedding(embedding):
return openai.Embedding.retrieve(embedding, model="text-embedding-ada-002")["data"][0]["text"]
class Memory(metaclass=Singleton):
def add(self, data):
raise NotImplementedError()
def get(self, data):
raise NotImplementedError()
def clear(self):
raise NotImplementedError()
def get_relevant(self, data, num_relevant=5):
raise NotImplementedError()
def get_stats(self):
raise NotImplementedError()

View File

@ -4,6 +4,8 @@ import requests
from config import Config
cfg = Config()
import gtts
import threading
from threading import Lock, Semaphore
# TODO: Nicer names for these ids
@ -14,7 +16,11 @@ tts_headers = {
"xi-api-key": cfg.elevenlabs_api_key
}
mutex_lock = Lock() # Ensure only one sound is played at a time
queue_semaphore = Semaphore(1) # The amount of sounds to queue before blocking the main thread
def eleven_labs_speech(text, voice_index=0):
"""Speak text using elevenlabs.io's API"""
tts_url = "https://api.elevenlabs.io/v1/text-to-speech/{voice_id}".format(
voice_id=voices[voice_index])
formatted_message = {"text": text}
@ -22,9 +28,10 @@ def eleven_labs_speech(text, voice_index=0):
tts_url, headers=tts_headers, json=formatted_message)
if response.status_code == 200:
with mutex_lock:
with open("speech.mpeg", "wb") as f:
f.write(response.content)
playsound("speech.mpeg")
playsound("speech.mpeg", True)
os.remove("speech.mpeg")
return True
else:
@ -34,15 +41,29 @@ def eleven_labs_speech(text, voice_index=0):
def gtts_speech(text):
tts = gtts.gTTS(text)
with mutex_lock:
tts.save("speech.mp3")
playsound("speech.mp3")
playsound("speech.mp3", True)
os.remove("speech.mp3")
def macos_tts_speech(text):
os.system(f'say "{text}"')
def say_text(text, voice_index=0):
def speak():
if not cfg.elevenlabs_api_key:
if cfg.use_mac_os_tts == 'True':
macos_tts_speech(text)
else:
gtts_speech(text)
else:
success = eleven_labs_speech(text, voice_index)
if not success:
gtts_speech(text)
queue_semaphore.release()
queue_semaphore.acquire(True)
thread = threading.Thread(target=speak)
thread.start()

View File

@ -5,7 +5,9 @@ import time
class Spinner:
"""A simple spinner class"""
def __init__(self, message="Loading...", delay=0.1):
"""Initialize the spinner class"""
self.spinner = itertools.cycle(['-', '/', '|', '\\'])
self.delay = delay
self.message = message
@ -13,6 +15,7 @@ class Spinner:
self.spinner_thread = None
def spin(self):
"""Spin the spinner"""
while self.running:
sys.stdout.write(next(self.spinner) + " " + self.message + "\r")
sys.stdout.flush()
@ -20,11 +23,13 @@ class Spinner:
sys.stdout.write('\b' * (len(self.message) + 2))
def __enter__(self):
"""Start the spinner"""
self.running = True
self.spinner_thread = threading.Thread(target=self.spin)
self.spinner_thread.start()
def __exit__(self, exc_type, exc_value, exc_traceback):
"""Stop the spinner"""
self.running = False
self.spinner_thread.join()
sys.stdout.write('\r' + ' ' * (len(self.message) + 2) + '\r')

8
scripts/utils.py Normal file
View File

@ -0,0 +1,8 @@
def clean_input(prompt: str=''):
try:
return input(prompt)
except KeyboardInterrupt:
print("You interrupted Auto-GPT")
print("Quitting...")
exit(0)

View File

@ -1,55 +0,0 @@
import unittest
from unittest import mock
import sys
import os
sys.path.append(os.path.abspath('./scripts'))
from factory import MemoryFactory
from providers.weaviate import WeaviateMemory
from providers.pinecone import PineconeMemory
class TestMemoryFactory(unittest.TestCase):
def test_invalid_memory_provider(self):
with self.assertRaises(ValueError):
memory = MemoryFactory.get_memory('Thanos')
def test_create_pinecone_provider(self):
# mock the init function of the provider to bypass
# connection to the external pinecone service
def __init__(self):
pass
with mock.patch.object(PineconeMemory, '__init__', __init__):
memory = MemoryFactory.get_memory('pinecone')
self.assertIsInstance(memory, PineconeMemory)
def test_create_weaviate_provider(self):
# mock the init function of the provider to bypass
# connection to the external weaviate service
def __init__(self):
pass
with mock.patch.object(WeaviateMemory, '__init__', __init__):
memory = MemoryFactory.get_memory('weaviate')
self.assertIsInstance(memory, WeaviateMemory)
def test_provider_is_singleton(self):
def __init__(self):
pass
with mock.patch.object(WeaviateMemory, '__init__', __init__):
instance = MemoryFactory.get_memory('weaviate')
other_instance = MemoryFactory.get_memory('weaviate')
self.assertIs(instance, other_instance)
if __name__ == '__main__':
unittest.main()

View File

@ -0,0 +1,99 @@
# Generated by CodiumAI
import requests
import pytest
from scripts.browse import scrape_text
"""
Code Analysis
Objective:
The objective of the "scrape_text" function is to scrape the text content from a given URL and return it as a string, after removing any unwanted HTML tags and scripts.
Inputs:
- url: a string representing the URL of the webpage to be scraped.
Flow:
1. Send a GET request to the given URL using the requests library and the user agent header from the config file.
2. Check if the response contains an HTTP error. If it does, return an error message.
3. Use BeautifulSoup to parse the HTML content of the response and extract all script and style tags.
4. Get the text content of the remaining HTML using the get_text() method of BeautifulSoup.
5. Split the text into lines and then into chunks, removing any extra whitespace.
6. Join the chunks into a single string with newline characters between them.
7. Return the cleaned text.
Outputs:
- A string representing the cleaned text content of the webpage.
Additional aspects:
- The function uses the requests library and BeautifulSoup to handle the HTTP request and HTML parsing, respectively.
- The function removes script and style tags from the HTML to avoid including unwanted content in the text output.
- The function uses a generator expression to split the text into lines and chunks, which can improve performance for large amounts of text.
"""
class TestScrapeText:
# Tests that scrape_text() returns the expected text when given a valid URL.
def test_scrape_text_with_valid_url(self, mocker):
# Mock the requests.get() method to return a response with expected text
expected_text = "This is some sample text"
mock_response = mocker.Mock()
mock_response.status_code = 200
mock_response.text = f"<html><body><div><p style='color: blue;'>{expected_text}</p></div></body></html>"
mocker.patch("requests.get", return_value=mock_response)
# Call the function with a valid URL and assert that it returns the expected text
url = "http://www.example.com"
assert scrape_text(url) == expected_text
# Tests that the function returns an error message when an invalid or unreachable url is provided.
def test_invalid_url(self, mocker):
# Mock the requests.get() method to raise an exception
mocker.patch("requests.get", side_effect=requests.exceptions.RequestException)
# Call the function with an invalid URL and assert that it returns an error message
url = "http://www.invalidurl.com"
error_message = scrape_text(url)
assert "Error:" in error_message
# Tests that the function returns an empty string when the html page contains no text to be scraped.
def test_no_text(self, mocker):
# Mock the requests.get() method to return a response with no text
mock_response = mocker.Mock()
mock_response.status_code = 200
mock_response.text = "<html><body></body></html>"
mocker.patch("requests.get", return_value=mock_response)
# Call the function with a valid URL and assert that it returns an empty string
url = "http://www.example.com"
assert scrape_text(url) == ""
# Tests that the function returns an error message when the response status code is an http error (>=400).
def test_http_error(self, mocker):
# Mock the requests.get() method to return a response with a 404 status code
mocker.patch('requests.get', return_value=mocker.Mock(status_code=404))
# Call the function with a URL
result = scrape_text("https://www.example.com")
# Check that the function returns an error message
assert result == "Error: HTTP 404 error"
# Tests that scrape_text() properly handles HTML tags.
def test_scrape_text_with_html_tags(self, mocker):
# Create a mock response object with HTML containing tags
html = "<html><body><p>This is <b>bold</b> text.</p></body></html>"
mock_response = mocker.Mock()
mock_response.status_code = 200
mock_response.text = html
mocker.patch("requests.get", return_value=mock_response)
# Call the function with a URL
result = scrape_text("https://www.example.com")
# Check that the function properly handles HTML tags
assert result == "This is bold text."

View File

@ -0,0 +1,99 @@
import unittest
from unittest import mock
import sys
import os
from weaviate import Client
from weaviate.util import get_valid_uuid
from uuid import uuid4
sys.path.append(os.path.abspath('./scripts'))
from config import Config
from memory.weaviate import WeaviateMemory
from memory.base import get_ada_embedding
@mock.patch.dict(os.environ, {
"WEAVIATE_HOST": "http://127.0.0.1",
"WEAVIATE_PORT": "8080",
"WEAVIATE_USERNAME": '',
"WEAVIATE_PASSWORD": '',
"MEMORY_INDEX": "AutogptTests"
})
class TestWeaviateMemory(unittest.TestCase):
"""
In order to run these tests you will need a local instance of
Weaviate running. Refer to https://weaviate.io/developers/weaviate/installation/docker-compose
for creating local instances using docker.
"""
def setUp(self):
self.cfg = Config()
self.client = Client('http://127.0.0.1:8080')
try:
self.client.schema.delete_class(self.cfg.memory_index)
except:
pass
self.memory = WeaviateMemory(self.cfg)
def test_add(self):
doc = 'You are a Titan name Thanos and you are looking for the Infinity Stones'
self.memory.add(doc)
result = self.client.query.get(self.cfg.memory_index, ['raw_text']).do()
actual = result['data']['Get'][self.cfg.memory_index]
self.assertEqual(len(actual), 1)
self.assertEqual(actual[0]['raw_text'], doc)
def test_get(self):
doc = 'You are an Avenger and swore to defend the Galaxy from a menace called Thanos'
with self.client.batch as batch:
batch.add_data_object(
uuid=get_valid_uuid(uuid4()),
data_object={'raw_text': doc},
class_name=self.cfg.memory_index,
vector=get_ada_embedding(doc)
)
batch.flush()
actual = self.memory.get(doc)
self.assertEqual(len(actual), 1)
self.assertEqual(actual[0], doc)
def test_get_stats(self):
docs = [
'You are now about to count the number of docs in this index',
'And then you about to find out if you can count correctly'
]
[self.memory.add(doc) for doc in docs]
stats = self.memory.get_stats()
self.assertTrue(stats)
self.assertTrue('count' in stats)
self.assertEqual(stats['count'], 2)
def test_clear(self):
docs = [
'Shame this is the last test for this class',
'Testing is fun when someone else is doing it'
]
[self.memory.add(doc) for doc in docs]
self.assertEqual(self.memory.get_stats()['count'], 2)
self.memory.clear()
self.assertEqual(self.memory.get_stats()['count'], 0)
if __name__ == '__main__':
unittest.main()