iov_data_analysis_agent/.kiro/specs/agent-robustness-optimization/tasks.md

Tasks — Agent Robustness Optimization

Priority 1: Configuration Foundation

  • 1. Add new config fields to AppConfig
    • 1.1 Add max_data_context_retries field (default=2) with APP_MAX_DATA_CONTEXT_RETRIES env override to config/app_config.py
    • 1.2 Add conversation_window_size field (default=10) with APP_CONVERSATION_WINDOW_SIZE env override to config/app_config.py
    • 1.3 Add max_parallel_profiles field (default=4) with APP_MAX_PARALLEL_PROFILES env override to config/app_config.py
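The three fields above share one pattern: a hard-coded default that an `APP_*` environment variable can override. A minimal sketch of that pattern (the real `AppConfig` in config/app_config.py will have more fields; the helper name `_env_int` is illustrative):

```python
import os
from dataclasses import dataclass, field

def _env_int(var: str, default: int) -> int:
    """Read an integer override from the environment, falling back to default."""
    raw = os.environ.get(var)
    return int(raw) if raw is not None else default

@dataclass
class AppConfig:
    # New fields with their APP_* environment overrides (tasks 1.1-1.3).
    max_data_context_retries: int = field(
        default_factory=lambda: _env_int("APP_MAX_DATA_CONTEXT_RETRIES", 2))
    conversation_window_size: int = field(
        default_factory=lambda: _env_int("APP_CONVERSATION_WINDOW_SIZE", 10))
    max_parallel_profiles: int = field(
        default_factory=lambda: _env_int("APP_MAX_PARALLEL_PROFILES", 4))
```

Using `default_factory` (rather than a plain default) means the environment is read at instantiation time, which keeps tests that patch `os.environ` straightforward.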

Priority 2: Data Privacy Fallback (R1–R3)

  • 2. Implement error classification
    • [-] 2.1 Add _classify_error(error_message: str) -> str method to DataAnalysisAgent in data_analysis_agent.py with regex patterns for KeyError, ValueError, NameError, empty DataFrame
    • [-] 2.2 Add _extract_column_from_error(error_message: str) -> Optional[str] function to utils/data_privacy.py
    • [-] 2.3 Add _lookup_column_in_profile(column_name, safe_profile) -> Optional[dict] function to utils/data_privacy.py
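Tasks 2.1 and 2.2 can be sketched as ordered regex matching over the raw traceback string. The category names and patterns below are assumptions for illustration; the real taxonomy belongs to `_classify_error`'s spec:

```python
import re
from typing import Optional

# Ordered (pattern, category) pairs; first match wins (task 2.1).
_ERROR_PATTERNS = [
    (re.compile(r"KeyError"), "missing_column"),
    (re.compile(r"NameError"), "undefined_name"),
    (re.compile(r"ValueError"), "value_error"),
    (re.compile(r"[Ee]mpty DataFrame"), "empty_dataframe"),
]

def classify_error(error_message: str) -> str:
    """Map a raw traceback string to a coarse error category."""
    for pattern, category in _ERROR_PATTERNS:
        if pattern.search(error_message):
            return category
    return "unknown"

def extract_column_from_error(error_message: str) -> Optional[str]:
    """Pull the offending column name out of a KeyError message (task 2.2)."""
    match = re.search(r"KeyError: ['\"]([^'\"]+)['\"]", error_message)
    return match.group(1) if match else None
```

Returning `"unknown"` for unmatched messages gives the retry logic in task 3.2 a clean signal to fall back to normal error handling.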
  • 3. Implement enriched hint generation
    • [-] 3.1 Add generate_enriched_hint(error_message: str, safe_profile: str) -> str function to utils/data_privacy.py
    • [-] 3.2 Integrate retry logic into the analyze() loop in data_analysis_agent.py: add per-round retry counter, call _classify_error on failures, generate enriched hint when below retry limit, fall back to normal error handling at limit
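The control flow of task 3.2 is a bounded retry loop around one execution round. A sketch with stub callables standing in for `_classify_error` and `generate_enriched_hint` (the function and return-value names here are hypothetical):

```python
def run_with_data_context_retries(execute, classify, make_hint, max_retries=2):
    """Per-round retry integration (task 3.2).

    `execute(hint)` runs one code round and returns (ok, error_message).
    An enriched hint is generated only while under the retry budget and
    only for classifiable errors; otherwise we fall back to normal
    error handling.
    """
    hint = None
    for attempt in range(max_retries + 1):
        ok, error = execute(hint)
        if ok:
            return "success"
        if attempt == max_retries or classify(error) == "unknown":
            break  # budget exhausted or unclassifiable: no enriched hint
        hint = make_hint(error)  # fed into the next attempt
    return "fallback_error_handling"
```

In the real `analyze()` loop the "hint" would be appended to the conversation before the retry rather than passed as an argument; the budget `max_retries` comes from `max_data_context_retries` in task 1.1.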

Priority 3: Conversation History Trimming (R4–R5)

  • 4. Implement conversation trimming
    • [~] 4.1 Add _trim_conversation_history() method to DataAnalysisAgent implementing sliding window with first-message preservation
    • [~] 4.2 Add _compress_trimmed_messages(messages: list) -> str method to DataAnalysisAgent that generates summary with action types and success/failure, excluding code blocks and raw output
    • [~] 4.3 Call _trim_conversation_history() at the start of each round in the analyze() loop, after the first round
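The sliding window with first-message preservation (task 4.1) can be sketched as follows; the inline summary message is a simplified stand-in for `_compress_trimmed_messages` (task 4.2), which would also record action types and success/failure:

```python
def trim_conversation_history(messages, window_size=10):
    """Keep the first message plus the most recent `window_size` messages.

    The first message (system prompt / data context) is always preserved;
    everything between it and the window is replaced by a one-line summary.
    """
    if len(messages) <= window_size + 1:
        return list(messages)  # nothing to trim yet
    dropped = messages[1:-window_size]
    summary = {"role": "system",
               "content": f"[{len(dropped)} earlier messages trimmed]"}
    return [messages[0], summary] + messages[-window_size:]
```

Calling this at the start of each round after the first (task 4.3) bounds prompt growth to `window_size + 2` messages regardless of how many rounds have run.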

Priority 4: Analysis Template System (R6–R8)

  • 5. Backend template integration
    • [~] 5.1 Add optional template_name parameter to DataAnalysisAgent.analyze() method; retrieve template via get_template(), prepend get_full_prompt() to user requirement
    • [~] 5.2 Add GET /api/templates endpoint to web/main.py returning list_templates() result
    • [~] 5.3 Add optional template field to StartRequest model in web/main.py; pass template name to agent in run_analysis_task
  • 6. Frontend template selector
    • [~] 6.1 Add template selector HTML section (cards above requirement input) to web/static/index.html
    • [~] 6.2 Add template fetching, selection logic, and "No Template" default to web/static/script.js
    • [~] 6.3 Add template card styles (.template-card, .template-card.selected) to web/static/clean_style.css
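The backend side of tasks 5.1–5.3 reduces to: look the template up, prepend its full prompt, and leave the requirement untouched when no template is selected. A sketch with a hypothetical in-memory registry (the real `get_template`/`get_full_prompt`/`list_templates` live in a template module; the template names and prompt text here are invented):

```python
from typing import Optional

# Hypothetical registry standing in for the real template module.
_TEMPLATES = {
    "trend_analysis": "Focus on time-series trends and seasonality.",
}

def list_templates() -> list:
    """Backs GET /api/templates (task 5.2)."""
    return sorted(_TEMPLATES)

def get_full_prompt(template_name: str) -> str:
    if template_name not in _TEMPLATES:
        raise ValueError(f"Unknown template: {template_name}")
    return _TEMPLATES[template_name]

def build_requirement(user_requirement: str,
                      template_name: Optional[str] = None) -> str:
    """Prepend the template prompt to the user requirement (task 5.1)."""
    if template_name is None:
        return user_requirement  # "No Template" default (task 6.2)
    return f"{get_full_prompt(template_name)}\n\n{user_requirement}"
```

Raising on an unknown name (rather than silently ignoring it) matches the invalid-template error property tested in task 11.7.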

Priority 5: Frontend Progress Bar (R9)

  • 7. Backend progress updates
    • [~] 7.1 Add set_progress_callback(callback) method to DataAnalysisAgent; call callback at start of each round in analyze() loop
    • [~] 7.2 Wire progress callback in run_analysis_task in web/main.py to update SessionData progress fields
    • [~] 7.3 Add current_round, max_rounds, progress_percentage, status_message to GET /api/status response in web/main.py
  • 8. Frontend progress bar
    • [~] 8.1 Add progress bar HTML element below the status bar area in web/static/index.html
    • [~] 8.2 Add updateProgressBar(percentage, message) function to web/static/script.js; call it during polling when is_running is true; set to 100% on completion
    • [~] 8.3 Add progress bar styles with CSS transition animation to web/static/clean_style.css
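The callback plumbing in tasks 7.1–7.3 can be sketched as a small reporter; the class and argument names are illustrative, but the fields mirror the ones task 7.3 adds to GET /api/status:

```python
class ProgressReporter:
    """Round-based progress reporting (tasks 7.1-7.3, sketch)."""

    def __init__(self, max_rounds: int):
        self.max_rounds = max_rounds
        self._callback = None

    def set_progress_callback(self, callback):
        """Register the observer wired up in run_analysis_task (task 7.2)."""
        self._callback = callback

    def report_round(self, current_round: int, status_message: str = ""):
        """Called at the start of each round in the analyze() loop."""
        percentage = int(current_round / self.max_rounds * 100)
        if self._callback:
            self._callback(current_round, self.max_rounds,
                           percentage, status_message)
```

On the web side, `run_analysis_task` would register a callback that copies these four values into `SessionData`, where the polling loop in script.js picks them up and drives `updateProgressBar`.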

Priority 6: Multi-File Chunked & Parallel Loading (R10–R11)

  • 9. Chunked loading enhancement
    • [~] 9.1 Add _profile_chunked(file_path: str) -> str function to utils/data_loader.py that profiles using first chunk + sampled subsequent chunks
    • [~] 9.2 Add load_and_profile_data_smart(file_paths, max_file_size_mb) -> str function to utils/data_loader.py that selects chunked vs full loading based on file size threshold
    • [~] 9.3 Update DataAnalysisAgent.analyze() to use smart loader and expose chunked iterator in Code_Executor namespace for large files
  • 10. Parallel profiling
    • [~] 10.1 Add _profile_files_parallel(file_paths: list) -> tuple[str, str] method to DataAnalysisAgent using ThreadPoolExecutor with max_parallel_profiles workers
    • [~] 10.2 Update DataAnalysisAgent.analyze() to call _profile_files_parallel when multiple files are provided, replacing sequential build_safe_profile + build_local_profile calls
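Task 10.1's parallel profiling is a `ThreadPoolExecutor.map` over the file list; the sketch below also bakes in the error resilience that task 11.8 (Property 11) tests, so one unreadable file degrades to an error note instead of failing the batch. `profile_one` stands in for the real per-file profiler:

```python
from concurrent.futures import ThreadPoolExecutor

def profile_files_parallel(file_paths, profile_one, max_workers=4):
    """Profile files concurrently, preserving input order (task 10.1).

    `max_workers` corresponds to the max_parallel_profiles config field;
    a failure in one file yields an error placeholder rather than
    aborting the whole merge (Property 11).
    """
    def safe(path):
        try:
            return profile_one(path)
        except Exception as exc:  # one bad file must not sink the batch
            return f"[profiling failed for {path}: {exc}]"

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, file_paths))
```

`pool.map` returns results in submission order, so the merged safe/local profiles stay aligned with the original file list when task 10.2 replaces the sequential calls.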

Priority 7: Testing

  • 11. Write property-based tests
    • 11.1 Property-based test for error classification correctness (Property 1) using hypothesis
    • 11.2 Property-based test for enriched hint content and privacy (Property 3) using hypothesis
    • 11.3 Property-based test for env var config override (Property 4) using hypothesis
    • 11.4 Property-based test for sliding window trimming invariants (Property 5) using hypothesis
    • 11.5 Property-based test for trimming summary content (Property 6) using hypothesis
    • 11.6 Property-based test for template prompt integration (Property 7) using hypothesis
    • 11.7 Property-based test for invalid template error (Property 8) using hypothesis
    • 11.8 Property-based test for parallel profile merge with error resilience (Property 11) using hypothesis
  • 12. Write unit and integration tests
    • 12.1 Unit tests for error classifier with known error messages
    • 12.2 Unit tests for conversation trimming at boundary conditions
    • 12.3 Integration tests for GET /api/templates and POST /api/start with template field
    • 12.4 Integration tests for GET /api/status progress fields
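The shape of these property tests is: generate an arbitrary valid input, exercise the code path, and assert the invariant. The real tests in task 11 use hypothesis (`@given` with integer strategies); the sketch below illustrates the same shape for Property 4 (task 11.3) using stdlib `random` as a stand-in, so it carries no test dependency:

```python
import os
import random

def max_retries_from_env(default: int = 2) -> int:
    """Config accessor under test: the env var wins over the default."""
    raw = os.environ.get("APP_MAX_DATA_CONTEXT_RETRIES")
    return int(raw) if raw is not None else default

def check_env_override_property(trials: int = 100):
    """Property 4: for any integer value, setting the env var overrides
    the default, and unsetting it restores the default."""
    for _ in range(trials):
        value = random.randint(0, 1000)
        os.environ["APP_MAX_DATA_CONTEXT_RETRIES"] = str(value)
        try:
            assert max_retries_from_env() == value
        finally:
            del os.environ["APP_MAX_DATA_CONTEXT_RETRIES"]
    assert max_retries_from_env() == 2  # default once the var is unset
```

With hypothesis, the loop and `random.randint` would be replaced by `@given(st.integers(min_value=0))`, which also shrinks failing values to a minimal counterexample.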