AutoGPT/autogpt_platform/backend/migrations/20250527091422_node_executi...
Swifty 17e973a8cb
fix(platform): Optimise Query Indexes (#10038)
# Query Optimization for AgentNodeExecution Tables

## Overview
This PR describes the database index optimizations applied to improve
the performance of slow queries in the AutoGPT platform backend.

## Problem Analysis
The following queries were identified as consuming significant database
resources:

### 1. Complex Filtering Query (19.3% of total time)
```sql
SELECT ... FROM "AgentNodeExecution" 
WHERE "agentNodeId" = $1 
  AND "agentGraphExecutionId" = $2 
  AND "executionStatus" = $3 
  AND "id" NOT IN (
    SELECT "referencedByInputExecId" 
    FROM "AgentNodeExecutionInputOutput" 
    WHERE "name" = $4 AND "referencedByInputExecId" IS NOT NULL
  )
ORDER BY "addedTime" ASC
```

### 2. Multi-table JOIN Query (8.9% of total time)
```sql
SELECT ... FROM "AgentNodeExecution" 
LEFT JOIN "AgentNode" ON ...
LEFT JOIN "AgentBlock" ON ...
WHERE "AgentBlock"."id" IN (...) 
  AND "executionStatus" != $11
  AND "agentGraphExecutionId" IN (...)
ORDER BY "queuedTime" DESC, "addedTime" DESC
```

### 3. Bulk Graph Execution Queries (multiple variations, ~10% combined)
Multiple queries filtering by `agentGraphExecutionId` with various
ordering requirements.

## Optimization Strategy

### 1. Composite Indexes for AgentNodeExecution

Set the following composite indexes to the `AgentNodeExecution` model:

```prisma
@@index([agentGraphExecutionId, agentNodeId, executionStatus])
@@index([addedTime, queuedTime])
```

#### Benefits:
- **Index 1**: Covers the exact WHERE clause of the complex filtering
query, allowing index-only scans
- **Index 2**: Optimizes queries filtering by graph execution and status
- **Index 3**: Supports efficient sorting when filtering by graph
execution
- **Index 4**: Optimizes ORDER BY operations on time fields

### 2. Composite Index for AgentNodeExecutionInputOutput

Added the following composite index:

```prisma
  // Input and Output pin names are unique for each AgentNodeExecution.
  @@unique([referencedByInputExecId, referencedByOutputExecId, name])
  @@index([referencedByOutputExecId])
  // Composite index for `upsert_execution_input`.
  @@index([name, time])
```

#### Benefits:
- Dramatically improves the NOT IN subquery performance in Query 1
- Allows the database to use an index scan instead of a full table scan
- Reduces the subquery execution time from O(n) to O(log n)

## Expected Performance Improvements

1. **Query 1 (19.3% of total time)**: 
   - Expected improvement: 80-90% reduction in execution time
- The composite index on `[agentNodeId, agentGraphExecutionId,
executionStatus]` will allow index-only scans
- The subquery will benefit from the new index on
`AgentNodeExecutionInputOutput`

2. **Query 2 (8.9% of total time)**:
   - Expected improvement: 50-70% reduction in execution time
- The `[agentGraphExecutionId, executionStatus]` index will reduce the
initial filtering cost

3. **Bulk Queries (10% combined)**:
   - Expected improvement: 60-80% reduction in execution time
- Composite indexes including time fields will optimize sorting
operations

## Migration Considerations

1. **Index Creation Time**: Creating these indexes on existing large
tables may take time
2. **Storage Impact**: Each index requires additional storage space
3. **Write Performance**: Slight decrease in INSERT/UPDATE performance
due to index maintenance


## Additional Optimizations

### NotificationEvent Table Index

Added index for notification batch queries:

```prisma
@@index([userNotificationBatchId])
```

This optimizes the query:
```sql
SELECT ... FROM "NotificationEvent" 
WHERE "userNotificationBatchId" IN (...)
```

#### Benefits:
- Eliminates full table scans when filtering by batch ID
- Improves query performance from O(n) to O(log n)
- Particularly beneficial for users with many notification events

## Future Optimizations

Consider these additional optimizations if needed:
1. Partitioning `AgentNodeExecution` table by `createdAt` or
`agentGraphExecutionId`
2. Implementing materialized views for frequently accessed aggregate
data
3. Adding covering indexes for specific query patterns
4. Implementing query result caching at the application level

---------

Co-authored-by: Zamil Majdy <zamil.majdy@agpt.co>
2025-05-27 09:20:36 +00:00
..
migration.sql fix(platform): Optimise Query Indexes (#10038) 2025-05-27 09:20:36 +00:00