Files
iov_data_analysis_agent/.kiro/specs/agent-robustness-optimization/tasks.md

75 lines
5.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Tasks — Agent Robustness Optimization
## Priority 1: Configuration Foundation
- [x] 1. Add new config fields to AppConfig
- [x] 1.1 Add `max_data_context_retries` field (default=2) with `APP_MAX_DATA_CONTEXT_RETRIES` env override to `config/app_config.py`
- [x] 1.2 Add `conversation_window_size` field (default=10) with `APP_CONVERSATION_WINDOW_SIZE` env override to `config/app_config.py`
- [x] 1.3 Add `max_parallel_profiles` field (default=4) with `APP_MAX_PARALLEL_PROFILES` env override to `config/app_config.py`
## Priority 2: Data Privacy Fallback (R1R3)
- [ ] 2. Implement error classification
- [-] 2.1 Add `_classify_error(error_message: str) -> str` method to `DataAnalysisAgent` in `data_analysis_agent.py` with regex patterns for KeyError, ValueError, NameError, empty DataFrame
- [-] 2.2 Add `_extract_column_from_error(error_message: str) -> Optional[str]` function to `utils/data_privacy.py`
- [-] 2.3 Add `_lookup_column_in_profile(column_name, safe_profile) -> Optional[dict]` function to `utils/data_privacy.py`
- [ ] 3. Implement enriched hint generation
- [-] 3.1 Add `generate_enriched_hint(error_message: str, safe_profile: str) -> str` function to `utils/data_privacy.py`
- [-] 3.2 Integrate retry logic into the `analyze()` loop in `data_analysis_agent.py`: add per-round retry counter, call `_classify_error` on failures, generate enriched hint when below retry limit, fall back to normal error handling at limit
## Priority 3: Conversation History Trimming (R4R5)
- [ ] 4. Implement conversation trimming
- [~] 4.1 Add `_trim_conversation_history()` method to `DataAnalysisAgent` implementing sliding window with first-message preservation
- [~] 4.2 Add `_compress_trimmed_messages(messages: list) -> str` method to `DataAnalysisAgent` that generates summary with action types and success/failure, excluding code blocks and raw output
- [~] 4.3 Call `_trim_conversation_history()` at the start of each round in the `analyze()` loop, after the first round
## Priority 4: Analysis Template System (R6R8)
- [ ] 5. Backend template integration
- [~] 5.1 Add optional `template_name` parameter to `DataAnalysisAgent.analyze()` method; retrieve template via `get_template()`, prepend `get_full_prompt()` to user requirement
- [~] 5.2 Add `GET /api/templates` endpoint to `web/main.py` returning `list_templates()` result
- [~] 5.3 Add optional `template` field to `StartRequest` model in `web/main.py`; pass template name to agent in `run_analysis_task`
- [ ] 6. Frontend template selector
- [~] 6.1 Add template selector HTML section (cards above requirement input) to `web/static/index.html`
- [~] 6.2 Add template fetching, selection logic, and "No Template" default to `web/static/script.js`
- [~] 6.3 Add template card styles (`.template-card`, `.template-card.selected`) to `web/static/clean_style.css`
## Priority 5: Frontend Progress Bar (R9)
- [ ] 7. Backend progress updates
- [~] 7.1 Add `set_progress_callback(callback)` method to `DataAnalysisAgent`; call callback at start of each round in `analyze()` loop
- [~] 7.2 Wire progress callback in `run_analysis_task` in `web/main.py` to update `SessionData` progress fields
- [~] 7.3 Add `current_round`, `max_rounds`, `progress_percentage`, `status_message` to `GET /api/status` response in `web/main.py`
- [ ] 8. Frontend progress bar
- [~] 8.1 Add progress bar HTML element below the status bar area in `web/static/index.html`
- [~] 8.2 Add `updateProgressBar(percentage, message)` function to `web/static/script.js`; call it during polling when `is_running` is true; set to 100% on completion
- [~] 8.3 Add progress bar styles with CSS transition animation to `web/static/clean_style.css`
## Priority 6: Multi-File Chunked & Parallel Loading (R10R11)
- [ ] 9. Chunked loading enhancement
- [~] 9.1 Add `_profile_chunked(file_path: str) -> str` function to `utils/data_loader.py` that profiles using first chunk + sampled subsequent chunks
- [~] 9.2 Add `load_and_profile_data_smart(file_paths, max_file_size_mb) -> str` function to `utils/data_loader.py` that selects chunked vs full loading based on file size threshold
- [~] 9.3 Update `DataAnalysisAgent.analyze()` to use smart loader and expose chunked iterator in Code_Executor namespace for large files
- [ ] 10. Parallel profiling
- [~] 10.1 Add `_profile_files_parallel(file_paths: list) -> tuple[str, str]` method to `DataAnalysisAgent` using `ThreadPoolExecutor` with `max_parallel_profiles` workers
- [~] 10.2 Update `DataAnalysisAgent.analyze()` to call `_profile_files_parallel` when multiple files are provided, replacing sequential `build_safe_profile` + `build_local_profile` calls
## Priority 7: Testing
- [ ] 11. Write property-based tests
- [ ] 11.1 ~PBT~ Property test for error classification correctness (Property 1) using `hypothesis`
- [ ] 11.2 ~PBT~ Property test for enriched hint content and privacy (Property 3) using `hypothesis`
- [ ] 11.3 ~PBT~ Property test for env var config override (Property 4) using `hypothesis`
- [ ] 11.4 ~PBT~ Property test for sliding window trimming invariants (Property 5) using `hypothesis`
- [ ] 11.5 ~PBT~ Property test for trimming summary content (Property 6) using `hypothesis`
- [ ] 11.6 ~PBT~ Property test for template prompt integration (Property 7) using `hypothesis`
- [ ] 11.7 ~PBT~ Property test for invalid template error (Property 8) using `hypothesis`
- [ ] 11.8 ~PBT~ Property test for parallel profile merge with error resilience (Property 11) using `hypothesis`
- [ ] 12. Write unit and integration tests
- [ ] 12.1 Unit tests for error classifier with known error messages
- [ ] 12.2 Unit tests for conversation trimming at boundary conditions
- [ ] 12.3 Integration tests for `GET /api/templates` and `POST /api/start` with template field
- [ ] 12.4 Integration tests for `GET /api/status` progress fields