`iov_data_analysis_agent/.kiro/specs/analysis-dashboard-redesign/tasks.md`

# Tasks: Analysis Dashboard Redesign
## Phase 1: Backend Data Model + API Changes (Foundation)
- [x] 1. Extend SessionData model
  - [x] 1.1 Add `rounds: List[Dict]` attribute to `SessionData.__init__()` in `web/main.py`, initialized to an empty list
  - [x] 1.2 Add `data_files: List[Dict]` attribute to `SessionData.__init__()` in `web/main.py`, initialized to an empty list
  - [x] 1.3 Update `_reconstruct_session()` to load `rounds` and `data_files` from `results.json` when reconstructing historical sessions
  - [x] 1.4 Update `run_analysis_task()` to persist `session.rounds` and `session.data_files` to `results.json` on analysis completion
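The model extension in task 1 can be sketched as follows. This is an illustrative minimal version: the attribute names `rounds` and `data_files` come from the spec, while the serialization helpers and their names are assumptions standing in for the real `_reconstruct_session()` / `run_analysis_task()` persistence code.

```python
# Hypothetical sketch of the SessionData fields added in task 1.
# Only the two attribute names are from the spec; the JSON helpers
# are illustrative stand-ins for the real persistence logic.
import json
from typing import Any, Dict, List


class SessionData:
    def __init__(self) -> None:
        self.rounds: List[Dict[str, Any]] = []      # task 1.1
        self.data_files: List[Dict[str, Any]] = []  # task 1.2

    def to_results_json(self) -> str:
        # task 1.4: persist both lists alongside the other result fields
        return json.dumps({"rounds": self.rounds, "data_files": self.data_files})

    @classmethod
    def from_results_json(cls, raw: str) -> "SessionData":
        # task 1.3: reconstruct historical sessions, tolerating older
        # results.json files that lack the new keys
        data = json.loads(raw)
        session = cls()
        session.rounds = data.get("rounds", [])
        session.data_files = data.get("data_files", [])
        return session
```

Using `.get(..., [])` on load keeps reconstruction backward compatible with pre-redesign `results.json` files.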
- [x] 2. Update Status API response
  - [x] 2.1 Add `rounds` field to `GET /api/status` response dict, returning `session.rounds`
  - [x] 2.2 Verify backward compatibility: ensure `log`, `is_running`, `has_report`, `progress_percentage`, `current_round`, `max_rounds`, `status_message` fields remain unchanged
- [x] 3. Add Data Files API endpoints
  - [x] 3.1 Implement `GET /api/data-files` endpoint: return `session.data_files` merged with a fallback directory scan for CSV/XLSX files, each entry containing filename, description, rows, cols, size_bytes
  - [x] 3.2 Implement `GET /api/data-files/preview` endpoint: read CSV/XLSX via pandas, return `{columns: [...], rows: [...first 5 rows as dicts...]}`; return 404 if file not found
  - [x] 3.3 Implement `GET /api/data-files/download` endpoint: return `FileResponse` with correct MIME type (`text/csv` or `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`); return 404 if file not found
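The preview payload from task 3.2 can be sketched as a plain function. Note the assumptions: the real endpoint uses pandas and FastAPI per the spec, while this dependency-free illustration uses the stdlib `csv` module and returns `None` where the endpoint would return HTTP 404.

```python
# Illustrative sketch of the /api/data-files/preview payload (task 3.2).
# The real implementation reads via pandas; csv.DictReader is used here
# only to keep the sketch self-contained.
import csv
from pathlib import Path
from typing import Optional

PREVIEW_ROWS = 5  # spec: at most the first 5 rows


def preview_data_file(path: str) -> Optional[dict]:
    file = Path(path)
    if not file.exists():
        return None  # the endpoint maps this case to HTTP 404
    with file.open(newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        rows = []
        for i, row in enumerate(reader):
            if i >= PREVIEW_ROWS:
                break  # never load more than the preview window
            rows.append(row)
        return {"columns": reader.fieldnames or [], "rows": rows}
```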
- [x] 4. Enhance Report API for evidence linking
  - [x] 4.1 Implement `_extract_evidence_annotations(paragraphs, session)` function: parse `<!-- evidence:round_N -->` comments from paragraph content, look up `session.rounds[N-1].evidence_rows`, build `supporting_data` mapping keyed by paragraph ID
  - [x] 4.2 Update `GET /api/report` to include `supporting_data` mapping in response JSON
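The annotation parsing in task 4.1 can be sketched like this. The comment grammar and the `N-1` index mapping are from the spec; the `para-{idx}` paragraph-ID scheme and the flat `rounds` list argument are assumptions for illustration.

```python
# Illustrative parser for `<!-- evidence:round_N -->` comments (task 4.1).
# Paragraph IDs ("para-0", ...) are a hypothetical naming scheme.
import re
from typing import Dict, List

EVIDENCE_RE = re.compile(r"<!--\s*evidence:round_(\d+)\s*-->")


def extract_evidence_annotations(paragraphs: List[str],
                                 rounds: List[dict]) -> Dict[str, list]:
    supporting_data: Dict[str, list] = {}
    for idx, text in enumerate(paragraphs):
        match = EVIDENCE_RE.search(text)
        if not match:
            continue  # non-annotated paragraphs are excluded
        round_no = int(match.group(1))
        if 1 <= round_no <= len(rounds):
            # annotations are 1-indexed, the rounds list is 0-indexed
            supporting_data[f"para-{idx}"] = rounds[round_no - 1].get(
                "evidence_rows", [])
    return supporting_data
```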
## Phase 2: CodeExecutor Enhancements
- [x] 5. Add evidence capture to CodeExecutor
  - [x] 5.1 In `execute_code()`, after successful execution, check if `result.result` is a DataFrame; if so, capture `result.result.head(10).to_dict(orient='records')` as `evidence_rows`; wrap in try/except returning empty list on failure
  - [x] 5.2 Also check the last-assigned DataFrame variable in the namespace as a fallback evidence source when `result.result` is not a DataFrame
  - [x] 5.3 Include `evidence_rows` key in the returned result dict
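The guarded capture in task 5.1 might look like the sketch below; the `head(10)` / `to_dict(orient="records")` calls and the empty-list fallback are from the spec, while the standalone function shape (rather than inline `execute_code()` logic) is an assumption.

```python
# Sketch of evidence capture (tasks 5.1/5.3): take at most 10 rows from a
# DataFrame result, and fall back to [] for anything else or on any error.
import pandas as pd

MAX_EVIDENCE_ROWS = 10


def capture_evidence_rows(value) -> list:
    try:
        if isinstance(value, pd.DataFrame):
            return value.head(MAX_EVIDENCE_ROWS).to_dict(orient="records")
    except Exception:
        pass  # spec: never let evidence capture break execution
    return []
```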
- [x] 6. Add DataFrame auto-detection and export
  - [x] 6.1 Before `shell.run_cell(code)`, snapshot DataFrame variables: `{name: id(obj) for name, obj in shell.user_ns.items() if isinstance(obj, pd.DataFrame)}`
  - [x] 6.2 After execution, compare snapshots to detect new or changed DataFrame variables
  - [x] 6.3 For each new DataFrame, export to `{output_dir}/{var_name}.csv` with numeric suffix deduplication if file exists
  - [x] 6.4 Record metadata for each export: `{variable_name, filename, rows, cols, columns}` in `auto_exported_files` list
  - [x] 6.5 Include `auto_exported_files` key in the returned result dict
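Tasks 6.1-6.4 combine into a snapshot/diff/export flow. In this sketch the `id()`-based snapshot comparison and the metadata field names follow the spec; the function boundaries and the `_{n}.csv` suffix format are illustrative assumptions.

```python
# Sketch of DataFrame auto-export (tasks 6.1-6.4): snapshot before
# execution, diff after, export new/changed frames with deduped names.
from pathlib import Path
import pandas as pd


def snapshot_dataframes(namespace: dict) -> dict:
    # task 6.1: record object identity of every DataFrame in the namespace
    return {name: id(obj) for name, obj in namespace.items()
            if isinstance(obj, pd.DataFrame)}


def dedup_filename(output_dir: Path, var_name: str) -> Path:
    # task 6.3: append a numeric suffix until the name is free
    candidate = output_dir / f"{var_name}.csv"
    suffix = 1
    while candidate.exists():
        candidate = output_dir / f"{var_name}_{suffix}.csv"
        suffix += 1
    return candidate


def export_new_dataframes(before: dict, namespace: dict,
                          output_dir: Path) -> list:
    exported = []
    for name, obj in namespace.items():
        if not isinstance(obj, pd.DataFrame):
            continue
        if before.get(name) == id(obj):
            continue  # task 6.2: unchanged since the snapshot
        target = dedup_filename(output_dir, name)
        obj.to_csv(target, index=False)
        exported.append({"variable_name": name,        # task 6.4 metadata
                         "filename": target.name,
                         "rows": len(obj),
                         "cols": obj.shape[1],
                         "columns": list(obj.columns)})
    return exported
```

Comparing `id()` values rather than contents keeps the diff cheap, at the cost of re-exporting a frame that was reassigned to an equal value.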
- [x] 7. Add DATA_FILE_SAVED marker parsing
  - [x] 7.1 After execution, scan `captured.stdout` for lines matching `[DATA_FILE_SAVED] filename: {name}, rows: {count}, description: {desc}`
  - [x] 7.2 Parse each marker line and record `{filename, rows, description}` in `prompt_saved_files` list
  - [x] 7.3 Include `prompt_saved_files` key in the returned result dict
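The marker grammar in task 7.1 pairs naturally with a formatter, which is what Property 5 later exercises as a round-trip. The marker text is taken verbatim from the spec; the regex and function names are illustrative.

```python
# Illustrative parser/formatter pair for the [DATA_FILE_SAVED] stdout
# marker (task 7). parse(format(x)) == [x] is the Property 5 round-trip.
import re
from typing import List

MARKER_RE = re.compile(
    r"^\[DATA_FILE_SAVED\] filename: (?P<filename>[^,]+), "
    r"rows: (?P<rows>\d+), description: (?P<description>.*)$"
)


def format_marker(filename: str, rows: int, description: str) -> str:
    return (f"[DATA_FILE_SAVED] filename: {filename}, "
            f"rows: {rows}, description: {description}")


def parse_saved_file_markers(stdout: str) -> List[dict]:
    saved = []
    for line in stdout.splitlines():
        match = MARKER_RE.match(line.strip())
        if match:
            entry = match.groupdict()
            entry["rows"] = int(entry["rows"])  # rows comes back as an int
            saved.append(entry)
    return saved
```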
## Phase 3: Agent Changes
- [x] 8. Structured Round_Data construction in DataAnalysisAgent
  - [x] 8.1 Add `_summarize_result(result)` method: produce a one-line summary from the execution result (e.g., "执行成功,输出 DataFrame (150行×8列)" ("execution succeeded, output DataFrame (150 rows × 8 cols)") or "执行失败: {error}" ("execution failed: {error}"))
  - [x] 8.2 In `_handle_generate_code()`, construct `round_data` dict with fields: round, reasoning (from `yaml_data.get("reasoning", "")`), code, result_summary, evidence_rows, raw_log, auto_exported_files, prompt_saved_files
  - [x] 8.3 After constructing round_data, append it to `SessionData.rounds` (via progress callback or direct reference)
  - [x] 8.4 Merge file metadata from `auto_exported_files` and `prompt_saved_files` into `SessionData.data_files`
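A minimal sketch of tasks 8.1-8.2, with English summary strings for illustration (the real summaries are Chinese, as shown above). The eight round_data field names follow the spec; the shape of the `result` dict is an assumption.

```python
# Hypothetical versions of _summarize_result and the round_data dict
# (tasks 8.1/8.2). The result-dict keys ("error", "shape", ...) are
# assumptions standing in for the real execution-result object.
def summarize_result(result: dict) -> str:
    if result.get("error"):
        return f"Execution failed: {result['error']}"
    shape = result.get("shape")  # e.g. (150, 8) when the output is a DataFrame
    if shape:
        return (f"Execution succeeded, output DataFrame "
                f"({shape[0]} rows x {shape[1]} cols)")
    return "Execution succeeded"


def build_round_data(round_no: int, reasoning: str, code: str,
                     result: dict) -> dict:
    # field order matches the spec; dicts preserve insertion order,
    # which Property 1 below checks
    return {
        "round": round_no,
        "reasoning": reasoning,
        "code": code,
        "result_summary": summarize_result(result),
        "evidence_rows": result.get("evidence_rows", []),
        "raw_log": result.get("raw_log", ""),
        "auto_exported_files": result.get("auto_exported_files", []),
        "prompt_saved_files": result.get("prompt_saved_files", []),
    }
```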
- [x] 9. Update system prompts
  - [x] 9.1 Add intermediate data saving instructions to `data_analysis_system_prompt` in `prompts.py`: instruct LLM to save intermediate results and print `[DATA_FILE_SAVED]` marker
  - [x] 9.2 Add evidence annotation instructions to `final_report_system_prompt` in `prompts.py`: instruct LLM to add `<!-- evidence:round_N -->` comments to report paragraphs
  - [x] 9.3 Update `_build_final_report_prompt()` in `data_analysis_agent.py` to include collected evidence data from all rounds in the prompt context
## Phase 4: Frontend Tab Restructuring
- [x] 10. HTML restructuring
  - [x] 10.1 In `index.html`, replace tab labels: "Live Log" → "执行过程" (Execution Process), add a "数据文件" (Data Files) tab, keep "Report"; remove the "Gallery" tab
  - [x] 10.2 Replace the `logsTab` div content with an Execution Process container (`executionTab`) containing a scrollable round-cards wrapper
  - [x] 10.3 Add a `datafilesTab` div with a file-cards grid container and a preview panel area
  - [x] 10.4 Remove the Gallery tab HTML: carousel container, navigation buttons, image info panel
- [x] 11. JavaScript: Execution Process Tab
  - [x] 11.1 Add `lastRenderedRound` state variable and `renderRoundCards(rounds)` function: compare `rounds.length` with `lastRenderedRound`, create and append new Round_Card DOM elements for new entries only
  - [x] 11.2 Implement Round_Card HTML generation: collapsed state shows round number + result_summary; expanded state shows reasoning, code (collapsible), result_summary, evidence table ("本轮数据案例", "data samples for this round"), raw log (collapsible)
  - [x] 11.3 Add click handler for Round_Card toggle (collapse/expand)
  - [x] 11.4 Add auto-scroll logic: when analysis is running, scroll the Execution Process container to the bottom after appending new cards
- [x] 12. JavaScript: Data Files Tab
  - [x] 12.1 Implement `loadDataFiles()`: fetch `GET /api/data-files`, render file cards showing filename, description, row count
  - [x] 12.2 Implement `previewDataFile(filename)`: fetch `GET /api/data-files/preview`, render a table with column headers and up to 5 rows
  - [x] 12.3 Implement `downloadDataFile(filename)`: trigger download via `GET /api/data-files/download`
  - [x] 12.4 In `startPolling()`, call `loadDataFiles()` on each polling cycle when the Data Files tab is active or when analysis is running
- [x] 13. JavaScript: Gallery removal and tab updates
  - [x] 13.1 Remove gallery functions: `loadGallery`, `renderGalleryImage`, `prevImage`, `nextImage` and state variables `galleryImages`, `currentImageIndex`
  - [x] 13.2 Update `switchTab()` to handle `execution`, `datafiles`, `report` identifiers instead of `logs`, `report`, `gallery`
  - [x] 13.3 Update `startPolling()` to call `renderRoundCards()` with `data.rounds` on each polling cycle
- [x] 14. JavaScript: Supporting data in Report
  - [x] 14.1 Update `loadReport()` to store the `supporting_data` mapping from the API response
  - [x] 14.2 Update `renderParagraphReport()` to add a "查看支撑数据" ("view supporting data") button below paragraphs that have entries in `supporting_data`
  - [x] 14.3 Implement `showSupportingData(paraId)`: display a popover/modal with evidence rows rendered as a table
- [x] 15. CSS updates
  - [x] 15.1 Add `.round-card`, `.round-card-header`, `.round-card-body`, `.round-card-collapsed`, `.round-card-expanded` styles
  - [x] 15.2 Add `.data-file-card`, `.data-preview-table` styles
  - [x] 15.3 Add `.supporting-data-btn`, `.supporting-data-popover` styles
  - [x] 15.4 Remove `.carousel-*` styles (carousel-container, carousel-slide, carousel-btn, image-info, image-title, image-desc)
## Phase 5: Property-Based Tests
- [x] 16. Write property-based tests
  - [x] 16.1 [PBT] Property 1: Round_Data structural completeness — generate random execution results, verify all required fields present with correct types and insertion order preserved
  - [x] 16.2 [PBT] Property 2: Evidence capture bounded — generate random DataFrames (0-10000 rows, 1-50 cols), verify evidence_rows length <= 10 and each row dict has correct keys
  - [x] 16.3 [PBT] Property 3: Filename deduplication — generate sequences of same-name exports (1-20), verify all filenames unique
  - [x] 16.4 [PBT] Property 4: Auto-export metadata completeness — generate random DataFrames, verify metadata contains variable_name, filename, rows, cols, columns with correct values
  - [x] 16.5 [PBT] Property 5: DATA_FILE_SAVED marker parsing round-trip — generate random filenames/rows/descriptions, verify parse(format(x)) == x
  - [x] 16.6 [PBT] Property 6: Data file preview bounded rows — generate random CSVs (0-10000 rows), verify preview returns at most 5 rows with correct column names
  - [x] 16.7 [PBT] Property 7: Evidence annotation parsing — generate random annotated Markdown, verify correct round extraction and non-annotated paragraph exclusion
  - [x] 16.8 [PBT] Property 8: SessionData JSON round-trip — generate random rounds/data_files, verify serialize then deserialize produces equal data
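As one example of the checks above, Property 3 (filename deduplication) can be sketched without a property-based framework, using stdlib `random` in place of a library such as Hypothesis; the `dedup` helper here is a hypothetical in-memory stand-in for the filesystem-backed suffixing in task 6.3.

```python
# Dependency-free sketch of Property 3: repeated same-name exports must
# always yield unique filenames. Uses random sampling instead of a PBT
# framework; a fixed seed keeps the check reproducible.
import random


def dedup(name: str, taken: set) -> str:
    # in-memory analogue of the numeric-suffix deduplication (task 6.3)
    candidate, suffix = f"{name}.csv", 1
    while candidate in taken:
        candidate = f"{name}_{suffix}.csv"
        suffix += 1
    return candidate


def check_dedup_property(trials: int = 100) -> bool:
    rng = random.Random(0)
    for _ in range(trials):
        taken: set = set()
        names = []
        for _ in range(rng.randint(1, 20)):  # spec: 1-20 same-name exports
            name = dedup("result", taken)
            taken.add(name)
            names.append(name)
        if len(names) != len(set(names)):  # every filename must be unique
            return False
    return True
```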