iov_data_analysis_agent/.kiro/specs/analysis-dashboard-redesign/tasks.md

Tasks: Analysis Dashboard Redesign

Phase 1: Backend Data Model + API Changes (Foundation)

  • 1. Extend SessionData model

    • 1.1 Add rounds: List[Dict] attribute to SessionData.__init__() in web/main.py, initialized to empty list
    • 1.2 Add data_files: List[Dict] attribute to SessionData.__init__() in web/main.py, initialized to empty list
    • 1.3 Update _reconstruct_session() to load rounds and data_files from results.json when reconstructing historical sessions
    • 1.4 Update run_analysis_task() to persist session.rounds and session.data_files to results.json on analysis completion
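
Task 1 can be sketched roughly as below. This is a minimal illustration, not the actual `web/main.py` class: the pre-existing attributes and the `results.json` shape beyond `rounds`/`data_files` are assumptions.

```python
from typing import Dict, List


class SessionData:
    """Hypothetical sketch of the extended session model (tasks 1.1-1.2).

    Only the two new attributes are shown; the real SessionData in
    web/main.py carries additional state (log, progress, etc.).
    """

    def __init__(self) -> None:
        # One dict per analysis round, appended as the agent progresses.
        self.rounds: List[Dict] = []
        # One dict per generated data file (auto-exported or prompt-saved).
        self.data_files: List[Dict] = []

    def to_results_dict(self) -> Dict:
        # Task 1.4: the portion persisted to results.json on completion,
        # which _reconstruct_session() (task 1.3) would read back.
        return {"rounds": self.rounds, "data_files": self.data_files}
```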
  • 2. Update Status API response

    • 2.1 Add rounds field to GET /api/status response dict, returning session.rounds
    • 2.2 Verify backward compatibility: ensure log, is_running, has_report, progress_percentage, current_round, max_rounds, status_message fields remain unchanged
  • 3. Add Data Files API endpoints

    • 3.1 Implement GET /api/data-files endpoint: return session.data_files merged with a fallback directory scan for CSV/XLSX files, each entry containing filename, description, rows, cols, size_bytes
    • 3.2 Implement GET /api/data-files/preview endpoint: read CSV/XLSX via pandas, return {columns: [...], rows: [...first 5 rows as dicts...]}; return 404 if file not found
    • 3.3 Implement GET /api/data-files/download endpoint: return FileResponse with correct MIME type (text/csv or application/vnd.openxmlformats-officedocument.spreadsheetml.sheet); return 404 if file not found
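
The preview logic of task 3.2 might look like the following sketch. The function name and the FileNotFoundError-to-404 mapping are assumptions; reading XLSX via `pd.read_excel` additionally requires `openpyxl` at runtime.

```python
from pathlib import Path

import pandas as pd


def preview_data_file(path: str, n_rows: int = 5) -> dict:
    """Sketch of the logic behind GET /api/data-files/preview.

    Returns {"columns": [...], "rows": [...]} with at most n_rows rows;
    raises FileNotFoundError, which the endpoint would map to a 404.
    """
    file = Path(path)
    if not file.exists():
        raise FileNotFoundError(path)
    if file.suffix.lower() == ".csv":
        df = pd.read_csv(file)
    else:
        # .xlsx branch; requires the openpyxl engine to be installed.
        df = pd.read_excel(file)
    head = df.head(n_rows)
    return {
        "columns": list(head.columns),
        "rows": head.to_dict(orient="records"),
    }
```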
  • 4. Enhance Report API for evidence linking

    • 4.1 Implement _extract_evidence_annotations(paragraphs, session) function: parse <!-- evidence:round_N --> comments from paragraph content, look up session.rounds[N-1].evidence_rows, build supporting_data mapping keyed by paragraph ID
    • 4.2 Update GET /api/report to include supporting_data mapping in response JSON
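
The annotation parsing of task 4.1 could be sketched as follows. The paragraph-ID scheme (`para-{index}`) is a placeholder assumption; the real implementation keys by whatever paragraph IDs the report renderer uses.

```python
import re

# Matches the evidence comment emitted by the report prompt (task 9.2).
EVIDENCE_RE = re.compile(r"<!--\s*evidence:round_(\d+)\s*-->")


def extract_evidence_annotations(paragraphs, rounds):
    """Sketch of _extract_evidence_annotations (task 4.1).

    paragraphs: list of Markdown paragraph strings.
    rounds:     list mirroring session.rounds, each with evidence_rows.
    Returns {paragraph_id: evidence_rows} for annotated paragraphs only.
    """
    supporting_data = {}
    for idx, text in enumerate(paragraphs):
        match = EVIDENCE_RE.search(text)
        if not match:
            continue  # non-annotated paragraphs are excluded (PBT property 7)
        round_no = int(match.group(1))
        if 1 <= round_no <= len(rounds):
            supporting_data[f"para-{idx}"] = rounds[round_no - 1].get(
                "evidence_rows", []
            )
    return supporting_data
```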

Phase 2: CodeExecutor Enhancements

  • 5. Add evidence capture to CodeExecutor

    • 5.1 In execute_code(), after successful execution, check if result.result is a DataFrame; if so, capture result.result.head(10).to_dict(orient='records') as evidence_rows; wrap in try/except returning empty list on failure
    • 5.2 Also check the last-assigned DataFrame variable in the namespace as a fallback evidence source when result.result is not a DataFrame
    • 5.3 Include evidence_rows key in the returned result dict
  • 6. Add DataFrame auto-detection and export

    • 6.1 Before shell.run_cell(code), snapshot DataFrame variables: {name: id(obj) for name, obj in shell.user_ns.items() if isinstance(obj, pd.DataFrame)}
    • 6.2 After execution, compare snapshots to detect new or changed DataFrame variables
    • 6.3 For each new DataFrame, export to {output_dir}/{var_name}.csv with numeric suffix deduplication if file exists
    • 6.4 Record metadata for each export: {variable_name, filename, rows, cols, columns} in auto_exported_files list
    • 6.5 Include auto_exported_files key in the returned result dict
  • 7. Add DATA_FILE_SAVED marker parsing

    • 7.1 After execution, scan captured.stdout for lines matching [DATA_FILE_SAVED] filename: {name}, rows: {count}, description: {desc}
    • 7.2 Parse each marker line and record {filename, rows, description} in prompt_saved_files list
    • 7.3 Include prompt_saved_files key in the returned result dict

Phase 3: Agent Changes

  • 8. Structured Round_Data construction in DataAnalysisAgent

    • 8.1 Add _summarize_result(result) method: produce a one-line summary from the execution result, e.g. "执行成功,输出 DataFrame (150行×8列)" ("execution succeeded, output DataFrame (150 rows × 8 cols)") or "执行失败: {error}" ("execution failed: {error}")
    • 8.2 In _handle_generate_code(), construct round_data dict with fields: round, reasoning (from yaml_data.get("reasoning", "")), code, result_summary, evidence_rows, raw_log, auto_exported_files, prompt_saved_files
    • 8.3 After constructing round_data, append it to SessionData.rounds (via progress callback or direct reference)
    • 8.4 Merge file metadata from auto_exported_files and prompt_saved_files into SessionData.data_files
  • 9. Update system prompts

    • 9.1 Add intermediate data saving instructions to data_analysis_system_prompt in prompts.py: instruct LLM to save intermediate results and print [DATA_FILE_SAVED] marker
    • 9.2 Add evidence annotation instructions to final_report_system_prompt in prompts.py: instruct LLM to add <!-- evidence:round_N --> comments to report paragraphs
    • 9.3 Update _build_final_report_prompt() in data_analysis_agent.py to include collected evidence data from all rounds in the prompt context

Phase 4: Frontend Tab Restructuring

  • 10. HTML restructuring

    • 10.1 In index.html, replace tab labels: "Live Log" → "执行过程" (Execution Process), add a "数据文件" (Data Files) tab, keep "Report"; remove the "Gallery" tab
    • 10.2 Replace the logsTab div content with an Execution Process container (executionTab) containing a scrollable round-cards wrapper
    • 10.3 Add a datafilesTab div with a file-cards grid container and a preview panel area
    • 10.4 Remove the Gallery tab HTML: carousel container, navigation buttons, image info panel
  • 11. JavaScript: Execution Process Tab

    • 11.1 Add lastRenderedRound state variable and renderRoundCards(rounds) function: compare rounds.length with lastRenderedRound, create and append new Round_Card DOM elements for new entries only
    • 11.2 Implement Round_Card HTML generation: collapsed state shows round number + result_summary; expanded state shows reasoning, code (collapsible), result_summary, evidence table ("本轮数据案例", "data samples for this round"), raw log (collapsible)
    • 11.3 Add click handler for Round_Card toggle (collapse/expand)
    • 11.4 Add auto-scroll logic: when analysis is running, scroll Execution Process container to bottom after appending new cards
  • 12. JavaScript: Data Files Tab

    • 12.1 Implement loadDataFiles(): fetch GET /api/data-files, render file cards showing filename, description, row count
    • 12.2 Implement previewDataFile(filename): fetch GET /api/data-files/preview, render a table with column headers and up to 5 rows
    • 12.3 Implement downloadDataFile(filename): trigger download via GET /api/data-files/download
    • 12.4 In startPolling(), call loadDataFiles() on each polling cycle when Data Files tab is active or when analysis is running
  • 13. JavaScript: Gallery removal and tab updates

    • 13.1 Remove gallery functions: loadGallery, renderGalleryImage, prevImage, nextImage and state variables galleryImages, currentImageIndex
    • 13.2 Update switchTab() to handle execution, datafiles, report identifiers instead of logs, report, gallery
    • 13.3 Update startPolling() to call renderRoundCards() with data.rounds on each polling cycle
  • 14. JavaScript: Supporting data in Report

    • 14.1 Update loadReport() to store supporting_data mapping from API response
    • 14.2 Update renderParagraphReport() to add a "查看支撑数据" (view supporting data) button below paragraphs that have entries in supporting_data
    • 14.3 Implement showSupportingData(paraId): display a popover/modal with evidence rows rendered as a table
  • 15. CSS updates

    • 15.1 Add .round-card, .round-card-header, .round-card-body, .round-card-collapsed, .round-card-expanded styles
    • 15.2 Add .data-file-card, .data-preview-table styles
    • 15.3 Add .supporting-data-btn, .supporting-data-popover styles
    • 15.4 Remove .carousel-* styles (carousel-container, carousel-slide, carousel-btn, image-info, image-title, image-desc)

Phase 5: Property-Based Tests

  • 16. Write property-based tests
    • 16.1 PBT Property 1: Round_Data structural completeness — generate random execution results, verify all required fields present with correct types and insertion order preserved
    • 16.2 PBT Property 2: Evidence capture bounded — generate random DataFrames (0-10000 rows, 1-50 cols), verify evidence_rows length <= 10 and each row dict has correct keys
    • 16.3 PBT Property 3: Filename deduplication — generate sequences of same-name exports (1-20), verify all filenames unique
    • 16.4 PBT Property 4: Auto-export metadata completeness — generate random DataFrames, verify metadata contains variable_name, filename, rows, cols, columns with correct values
    • 16.5 PBT Property 5: DATA_FILE_SAVED marker parsing round-trip — generate random filenames/rows/descriptions, verify parse(format(x)) == x
    • 16.6 PBT Property 6: Data file preview bounded rows — generate random CSVs (0-10000 rows), verify preview returns at most 5 rows with correct column names
    • 16.7 PBT Property 7: Evidence annotation parsing — generate random annotated Markdown, verify correct round extraction and non-annotated paragraph exclusion
    • 16.8 PBT Property 8: SessionData JSON round-trip — generate random rounds/data_files, verify serialize then deserialize produces equal data
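
Property 5 above is the easiest to illustrate. The spec presumably intends a framework like Hypothesis; the sketch below hand-rolls the same idea with the stdlib `random` module, and the marker format/regex are assumptions matching task 7.

```python
import random
import re
import string

MARKER_RE = re.compile(
    r"^\[DATA_FILE_SAVED\]\s*filename:\s*(?P<filename>[^,]+),\s*"
    r"rows:\s*(?P<rows>\d+),\s*description:\s*(?P<description>.*)$"
)


def format_marker(rec: dict) -> str:
    """Render a record in the assumed DATA_FILE_SAVED marker format."""
    return (f"[DATA_FILE_SAVED] filename: {rec['filename']}, "
            f"rows: {rec['rows']}, description: {rec['description']}")


def parse_marker(line: str):
    m = MARKER_RE.match(line)
    if not m:
        return None
    return {"filename": m.group("filename").strip(),
            "rows": int(m.group("rows")),
            "description": m.group("description").strip()}


def check_roundtrip(trials: int = 200) -> int:
    """PBT property 5 as a seeded random sweep: parse(format(x)) == x."""
    rng = random.Random(0)
    for _ in range(trials):
        rec = {
            # Comma-free filenames: the format uses commas as separators.
            "filename": "".join(
                rng.choices(string.ascii_lowercase, k=rng.randint(1, 12))
            ) + ".csv",
            "rows": rng.randint(0, 10_000),
            # Stripped, since the parser strips surrounding whitespace.
            "description": "".join(
                rng.choices(string.ascii_letters + " ", k=rng.randint(0, 30))
            ).strip(),
        }
        assert parse_marker(format_marker(rec)) == rec
    return trials
```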