LLM Module Tests

This directory contains comprehensive tests for the pgAdmin LLM/AI functionality.

Test Files

Python Tests

test_client.py - LLM Client Tests

Tests the core LLM client functionality including:

  • Provider initialization (Anthropic, OpenAI, Ollama)
  • API key loading from files and environment variables
  • Graceful handling of missing API keys
  • User preference overrides
  • Provider selection logic
  • Whitespace handling in API keys

Key Features:

  • Tests pass even without API keys configured
  • Mocks external API calls
  • Tests all three provider types

test_reports.py - Report Generation Tests

Tests report generation functionality including:

  • Security, performance, and design report types
  • Server, database, and schema level reports
  • Report request validation
  • Progress callback functionality
  • Error handling during generation
  • Markdown formatting

Key Features:

  • Tests data collection from PostgreSQL
  • Validates report structure
  • Tests streaming progress updates
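
The progress-callback shape these tests cover can be sketched like this (all names are illustrative, not the real report API):

```python
def generate_report(collect, render, on_progress=None):
    """Hypothetical sketch: each phase reports a label and a
    percentage before running, so callers can stream progress."""
    if on_progress:
        on_progress('collecting', 0)
    data = collect()          # e.g. queries against PostgreSQL (mocked in tests)
    if on_progress:
        on_progress('rendering', 50)
    report = render(data)     # e.g. Markdown formatting of the findings
    if on_progress:
        on_progress('done', 100)
    return report


events = []
report = generate_report(
    collect=lambda: {'tables': 3},
    render=lambda d: '# Report\n\n%d tables scanned' % d['tables'],
    on_progress=lambda label, pct: events.append((label, pct)),
)
assert events == [('collecting', 0), ('rendering', 50), ('done', 100)]
```

A test can then assert on both the rendered output and the recorded progress events without touching a real database or LLM.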

test_chat.py - Chat Session Tests

Tests interactive chat functionality including:

  • Chat session initialization
  • Message history management
  • Context passing (database, SQL queries)
  • Streaming responses
  • Token counting for context management
  • Maximum history limits
  • Error handling

Key Features:

  • Tests conversation flow
  • Validates context integration
  • Tests memory management
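
The maximum-history-limit behavior can be sketched as below; `trim_history` is a hypothetical simplification, not the actual chat module:

```python
def trim_history(messages, max_messages=20):
    """Hypothetical sketch of a maximum-history limit: keep the first
    message (the original question/context) plus the most recent ones,
    capped at max_messages total."""
    if len(messages) <= max_messages:
        return list(messages)
    return [messages[0]] + messages[-(max_messages - 1):]


history = [{'role': 'user', 'content': 'msg %d' % i} for i in range(50)]
trimmed = trim_history(history, max_messages=10)
assert len(trimmed) == 10
assert trimmed[0] == history[0]      # first message survives
assert trimmed[1:] == history[-9:]   # plus the 9 most recent
```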

test_compaction.py - Conversation Compaction Tests

Tests the conversation history compaction module including:

  • Token estimation with provider-specific ratios
  • SQL content token multiplier
  • History compaction with token budget enforcement
  • First message and recent window preservation
  • Low-value message dropping by importance classification
  • Tool call/result pair integrity during compaction
  • History deserialization from frontend JSON format
  • Conversational message filtering (stripping tool internals)

Key Features:

  • Tests all five importance classification tiers
  • Validates tool pair preservation (no orphaned tool results)
  • Tests round-trip serialization/deserialization
  • Tests edge cases (empty history, within-budget, unknown roles)

test_views.py - API Endpoint Tests

Tests Flask endpoints including:

  • /llm/status - LLM availability check
  • /llm/reports/security/* - Security report endpoints
  • /llm/reports/performance/* - Performance report endpoints
  • /llm/reports/design/* - Design review endpoints
  • /llm/chat - Chat endpoint
  • Streaming endpoints with SSE

Key Features:

  • Tests authentication and permissions
  • Tests API error responses
  • Tests SSE streaming format
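
One subtlety these streaming-endpoint tests must respect: Flask builds streaming responses around a generator, so backend side effects only happen when the body is iterated, and asserting mock calls before consuming the response fails spuriously. A stdlib-only sketch of the pitfall (the `backend` mock and `sse_stream` generator are illustrative):

```python
from unittest import mock

backend = mock.Mock(return_value=['data: one\n\n', 'data: two\n\n'])

def sse_stream():
    """Hypothetical streaming view body: nothing below runs until the
    generator is iterated, which is how Flask streams responses."""
    for chunk in backend():
        yield chunk

resp = sse_stream()          # building the generator calls nothing yet
backend.assert_not_called()  # so mock assertions here would be premature

body = ''.join(resp)         # consuming the stream drives the generator
backend.assert_called_once()
assert 'data: one' in body
```

Hence the rule in these tests: drain the SSE response data first, then assert on the mocks.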

JavaScript Tests

AIReport.spec.js - AIReport Component Tests

Tests the React component for AI report display including:

  • Component rendering in light and dark modes
  • Theme detection from body styles
  • Progress display during generation
  • Error handling
  • Markdown rendering
  • Download functionality
  • SSE event handling
  • Support for all report categories and types

Key Features:

  • Tests with React Testing Library
  • Mocks EventSource for SSE
  • Tests theme transitions
  • Validates accessibility

Running the Tests

Python Tests

From the web directory:

# Run all LLM tests
python -m pytest pgadmin/llm/tests/

# Run specific test file
python -m pytest pgadmin/llm/tests/test_client.py

# Run specific test case
python -m pytest pgadmin/llm/tests/test_client.py::LLMClientTestCase::test_anthropic_provider_with_api_key

# Run with coverage
python -m pytest --cov=pgadmin/llm pgadmin/llm/tests/

JavaScript Tests

From the web directory:

# Run all JavaScript tests
yarn run test:karma

# Run specific test file
yarn run test:karma -- --file regression/javascript/llm/AIReport.spec.js

Test Coverage

What's Tested

  • LLM client initialization with all providers
  • API key loading from files and environment
  • Graceful handling of missing API keys
  • User preference overrides
  • Report generation for all categories (security, performance, design)
  • Report generation for all levels (server, database, schema)
  • Chat session management and history
  • Conversation history compaction and token budgets
  • Conversational message filtering
  • History serialization/deserialization round-trip
  • Streaming progress updates via SSE
  • API endpoint authentication and authorization
  • React component rendering in both themes
  • Dark mode text color detection
  • Error handling throughout the stack

What's Mocked

  • External LLM API calls (Anthropic, OpenAI, Ollama)
  • PostgreSQL database connections
  • File system access for API keys
  • EventSource for SSE streaming
  • Theme detection (window.getComputedStyle)

Environment Variables for Testing

These environment variables can be set for integration testing with real APIs:

# For Anthropic
export ANTHROPIC_API_KEY="your-api-key"

# For OpenAI
export OPENAI_API_KEY="your-api-key"

# For Ollama
export OLLAMA_API_URL="http://localhost:11434"

Note: Tests are designed to pass without these variables set. They will mock API responses when keys are not available.

Test Philosophy

  1. Graceful Degradation: All tests pass even without API keys configured
  2. Mocking by Default: External APIs are mocked to avoid dependencies
  3. Comprehensive Coverage: Tests cover happy paths, error cases, and edge cases
  4. Documentation: Tests serve as documentation for expected behavior
  5. Integration Ready: Tests can be run with real APIs when keys are provided

Adding New Tests

When adding new functionality to the LLM module:

  1. Add unit tests to the appropriate test file
  2. Mock external dependencies
  3. Test both success and failure cases
  4. Test with and without API keys/configuration
  5. Update this README with new test coverage

Troubleshooting

Common Issues

Import errors: Make sure you're running tests from the web directory

API key warnings: These are expected - tests should pass without API keys

Theme mocking errors: Ensure fake_theme.js is available in regression/javascript/

EventSource not found: EventSource is mocked in the JavaScript tests; ensure the mocks are properly set up