iov_data_analysis_agent/.kiro/specs/analysis-dashboard-redesign/design.md

# Design Document: Analysis Dashboard Redesign

## Overview

This design transforms the Analysis Dashboard from a raw-log-centric 3-tab layout (Live Log, Report, Gallery) into a structured, evidence-driven 3-tab layout (Execution Process, Data Files, Report). The core architectural change is introducing a Round_Data structured data model that flows from the agent's execution loop through the API to the frontend, replacing the current raw text log approach.

Key design decisions:

- **`Round_Data` as the central abstraction**: Every analysis round produces a structured object containing reasoning, code, result summary, data evidence, and raw log. This single model drives the Execution Process tab, evidence linking, and data file tracking.
- **Auto-detection at the CodeExecutor level**: DataFrame detection and CSV export happen transparently in `CodeExecutor.execute_code()`, requiring no LLM cooperation. Prompt guidance is additive — it encourages the LLM to save files explicitly, but the system doesn't depend on it.
- **Gallery absorbed into Report**: Images are already rendered inline via marked.js Markdown parsing. Removing the Gallery tab is a subtraction, not an addition.
- **Evidence linking via HTML comments**: The LLM annotates report paragraphs with `<!-- evidence:round_N -->` comments during final report generation. The backend parses these to build a `supporting_data` mapping. This is a best-effort approach — missing annotations simply mean no "查看支撑数据" ("view supporting data") button.

## Architecture

```mermaid
flowchart TD
    subgraph Backend
        A[DataAnalysisAgent] -->|produces| B[Round_Data objects]
        A -->|calls| C[CodeExecutor]
        C -->|auto-detects DataFrames| D[CSV export to session dir]
        C -->|captures evidence rows| B
        C -->|parses DATA_FILE_SAVED markers| E[File metadata]
        B -->|stored on| F[SessionData]
        E -->|stored on| F
        F -->|serves| G[GET /api/status]
        F -->|serves| H[GET /api/data-files]
        F -->|serves| I[GET /api/report]
    end

    subgraph Frontend
        G -->|rounds array| J[Execution Process Tab]
        H -->|file list + preview| K[Data Files Tab]
        I -->|paragraphs + supporting_data| L[Report Tab]
    end
```

### Data Flow

1. **Agent loop** (`DataAnalysisAgent.analyze`): Each round calls `CodeExecutor.execute_code()`, which returns an enriched result dict containing `evidence_rows`, `auto_exported_files`, and `prompt_saved_files`. The agent wraps this into a Round_Data dict and appends it to `SessionData.rounds`.

2. **Status polling**: The frontend polls `GET /api/status` every 2 seconds. The response now includes a `rounds` array. The frontend incrementally appends new Round_Card elements — it tracks the last-seen round count and only renders new entries.

3. **Data Files**: `GET /api/data-files` reads `SessionData.data_files` plus scans the session directory for CSV/XLSX files (fallback discovery). Preview reads the first 5 rows via pandas.

4. **Report with evidence**: `GET /api/report` parses `<!-- evidence:round_N -->` annotations, looks up `SessionData.rounds[N].evidence_rows`, and builds a `supporting_data` mapping keyed by paragraph ID.
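Step 4's annotation lookup can be sketched as a small parser. This is a hedged sketch — the `p-{index}` paragraph ID scheme is an assumption inferred from the example IDs (`p-3`, `p-7`) shown later in this document:

```python
import re

# Matches the annotation convention described in this design.
EVIDENCE_RE = re.compile(r"<!--\s*evidence:round_(\d+)\s*-->")

def extract_evidence_rounds(paragraphs: list) -> dict:
    """Map paragraph IDs ("p-0", "p-1", ...) to the round number referenced
    by an <!-- evidence:round_N --> annotation. Paragraphs without an
    annotation are simply omitted from the mapping (best-effort linking)."""
    mapping = {}
    for i, text in enumerate(paragraphs):
        match = EVIDENCE_RE.search(text)
        if match:
            mapping[f"p-{i}"] = int(match.group(1))
    return mapping
```

The endpoint would then replace each round number with `rounds[N-1]["evidence_rows"]` to produce the `supporting_data` payload.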

## Components and Interfaces

### 1. CodeExecutor Enhancements (`utils/code_executor.py`)

New behavior in `execute_code()`:

```python
def execute_code(self, code: str) -> Dict[str, Any]:
    """Returns dict with keys: success, output, error, variables,
       evidence_rows, auto_exported_files, prompt_saved_files"""
```
- **DataFrame snapshot before/after**: Before execution, capture `{name: id(obj)}` for all DataFrame variables. After execution, detect new names or changed `id()` values.
- **Evidence capture**: If the execution result is a DataFrame (via `result.result`), call `.head(10).to_dict(orient='records')` to produce `evidence_rows`. Also check the last assigned DataFrame variable in the namespace.
- **Auto-export**: For each newly detected DataFrame, export to `{session_dir}/{var_name}.csv` with a dedup suffix. Record metadata in the `auto_exported_files` list.
- **Marker parsing**: Scan `captured.stdout` for `[DATA_FILE_SAVED]` lines, parse filename/rows/description, and record them in the `prompt_saved_files` list.
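The snapshot and evidence-capture steps might look like the following sketch. The helper names are assumptions for illustration; the real logic would live inside `execute_code()`:

```python
import pandas as pd

def snapshot_dataframes(namespace: dict) -> dict:
    """Pre-execution snapshot: {name: id(obj)} for every DataFrame variable."""
    return {
        name: id(obj)
        for name, obj in namespace.items()
        if isinstance(obj, pd.DataFrame)
    }

def detect_new_dataframes(before: dict, namespace: dict) -> list:
    """Names of DataFrame variables that are new or rebound relative to the
    pre-execution snapshot (a changed id() means the name was reassigned)."""
    return [
        name
        for name, obj in namespace.items()
        if isinstance(obj, pd.DataFrame) and before.get(name) != id(obj)
    ]

def capture_evidence(df: pd.DataFrame, limit: int = 10) -> list:
    """Up to `limit` rows as [{column: value}, ...]; empty list on failure,
    so evidence capture can never break the execution pipeline."""
    try:
        return df.head(limit).to_dict(orient="records")
    except Exception:
        return []
```

Comparing `id()` values (rather than object equality) keeps the detection cheap and side-effect free even for large DataFrames.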

Interface contract:

```python
# evidence_rows: list[dict]  — up to 10 rows as dicts
# auto_exported_files: list[dict] — [{variable_name, filename, rows, cols, columns}]
# prompt_saved_files: list[dict] — [{filename, rows, description}]
```

### 2. DataAnalysisAgent Changes (`data_analysis_agent.py`)

Round_Data construction in `_handle_generate_code()` and the main loop:

```python
round_data = {
    "round": self.current_round,
    "reasoning": yaml_data.get("reasoning", ""),
    "code": code,
    "result_summary": self._summarize_result(result),
    "evidence_rows": result.get("evidence_rows", []),
    "raw_log": feedback,
    "auto_exported_files": result.get("auto_exported_files", []),
    "prompt_saved_files": result.get("prompt_saved_files", []),
}
```

The agent appends `round_data` to `SessionData.rounds` (accessed via the progress callback or a direct reference). File metadata from both `auto_exported_files` and `prompt_saved_files` is merged into `SessionData.data_files`.
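One possible shape for that merge — the dedup-by-filename policy and the `merge_file_metadata` name are assumptions; the `source` tag follows the File Metadata model defined under Data Models:

```python
def merge_file_metadata(existing: list, new_files: list, source: str) -> list:
    """Append new file metadata dicts to the session's list, tagging each
    with its source ("auto" or "prompt") and skipping filenames that are
    already recorded so repeated rounds don't produce duplicates."""
    known = {f["filename"] for f in existing}
    for meta in new_files:
        if meta["filename"] not in known:
            existing.append({**meta, "source": source})
            known.add(meta["filename"])
    return existing
```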

`_summarize_result()`: Produces a one-line summary from the execution result — e.g., `"执行成功,输出 DataFrame (150行×8列)"` or `"执行失败: KeyError: 'col_x'"`.
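A minimal sketch of such a summarizer, written here as a standalone function. The `success`, `error`, and `result` keys are assumptions following the `execute_code()` contract above:

```python
import pandas as pd

def summarize_result(result: dict) -> str:
    """One-line summary of an execute_code() result dict, phrased like the
    examples in this design (Chinese UI strings are intentional)."""
    if not result.get("success"):
        return f"执行失败: {result.get('error', 'unknown error')}"
    value = result.get("result")  # assumed key for the expression result
    if isinstance(value, pd.DataFrame):
        rows, cols = value.shape
        return f"执行成功,输出 DataFrame ({rows}行×{cols}列)"
    return "执行成功"
```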

### 3. SessionData Extension (`web/main.py`)

```python
class SessionData:
    def __init__(self, session_id: str):
        # ... existing fields ...
        self.rounds: List[Dict] = []        # Round_Data objects
        self.data_files: List[Dict] = []    # File metadata dicts
```

Persistence: `rounds` and `data_files` are written to `results.json` on analysis completion (existing pattern).

### 4. API Changes (`web/main.py`)

`GET /api/status` — add `rounds` to the response:

```python
return {
    # ... existing fields ...
    "rounds": session.rounds,
}
```

`GET /api/data-files` — new endpoint:

```python
@app.get("/api/data-files")
async def list_data_files(session_id: str = Query(...)):
    # Returns session.data_files + fallback directory scan
```

`GET /api/data-files/preview` — new endpoint:

```python
@app.get("/api/data-files/preview")
async def preview_data_file(session_id: str = Query(...), filename: str = Query(...)):
    # Reads CSV/XLSX, returns {columns: [...], rows: [...first 5...]}
```
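The preview logic itself can be isolated into a testable helper along these lines. `preview_file` is an assumed name; the endpoint would call it and translate exceptions into the HTTP errors described under Error Handling:

```python
import pandas as pd

def preview_file(path: str, n: int = 5) -> dict:
    """Read a CSV/XLSX file and return {"columns": [...], "rows": [...]}
    with at most n rows, matching the /api/data-files/preview shape."""
    if path.endswith((".xlsx", ".xls")):
        df = pd.read_excel(path)  # requires openpyxl for .xlsx
    else:
        df = pd.read_csv(path)
    return {
        "columns": list(df.columns),
        "rows": df.head(n).to_dict(orient="records"),
    }
```

Keeping the file reading out of the route function makes the bounded-rows property (Property 6 below) directly unit-testable.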

`GET /api/data-files/download` — new endpoint:

```python
@app.get("/api/data-files/download")
async def download_data_file(session_id: str = Query(...), filename: str = Query(...)):
    # Returns FileResponse with appropriate MIME type
```

`GET /api/report` — enhanced response:

```python
return {
    "content": content,
    "base_path": web_base_path,
    "paragraphs": paragraphs,
    "supporting_data": supporting_data_map,  # NEW: {paragraph_id: [evidence_rows]}
}
```

### 5. Prompt Changes (`prompts.py`)

Add to `data_analysis_system_prompt` after the existing code generation rules:

**中间数据保存规则**
- 当你生成了有价值的中间数据(筛选子集、聚合表、聚类结果等),请主动保存为CSV/XLSX文件。
- 保存后必须打印标记行:`[DATA_FILE_SAVED] filename: {文件名}, rows: {行数}, description: {描述}`
- 示例:
  ```python
  top_issues.to_csv(os.path.join(session_output_dir, "TOP问题汇总.csv"), index=False)
  print(f"[DATA_FILE_SAVED] filename: TOP问题汇总.csv, rows: {len(top_issues)}, description: 各类型TOP问题聚合统计")
  ```

Add to `final_report_system_prompt` for evidence annotation:

**证据标注规则**
- 当报告段落的结论来源于某一轮分析的数据时,请在段落末尾添加HTML注释标注 `<!-- evidence:round_N -->`
- N 为产生该数据的分析轮次编号,从1开始
- 示例:某段落描述了第3轮分析发现的车型分布规律,则在段落末尾添加 `<!-- evidence:round_3 -->`

### 6. Frontend Changes

**`index.html`**:
- Replace tab labels: "Live Log" → "执行过程", add "数据文件", keep "Report"
- Remove Gallery tab HTML and carousel container
- Add Execution Process tab container with round card template
- Add Data Files tab container with file card template

**`script.js`**:
- Remove gallery functions and state
- Add `renderRoundCards(rounds)` — incremental rendering using a `lastRenderedRound` counter
- Add `loadDataFiles()`, `previewDataFile(filename)`, `downloadDataFile(filename)`
- Modify `startPolling()` to call `renderRoundCards()` and `loadDataFiles()` on each cycle
- Add `showSupportingData(paraId)` for the evidence popover
- Modify `renderParagraphReport()` to add "查看支撑数据" buttons when `supporting_data[paraId]` exists
- Update `switchTab()` to handle `execution`, `datafiles`, `report`

**`clean_style.css`**:
- Add `.round-card`, `.round-card-header`, `.round-card-body` styles
- Add `.data-file-card`, `.data-preview-table` styles
- Add `.supporting-data-btn`, `.supporting-data-popover` styles
- Remove `.carousel-*` styles

## Data Models

### Round_Data (Python dict)

```python
{
    "round": int,                    # 1-indexed round number
    "reasoning": str,                # LLM reasoning text (may be empty)
    "code": str,                     # Generated Python code
    "result_summary": str,           # One-line execution summary
    "evidence_rows": list[dict],     # Up to 10 rows as [{col: val, ...}]
    "raw_log": str,                  # Full execution feedback text
    "auto_exported_files": list[dict],  # Auto-detected DataFrame exports
    "prompt_saved_files": list[dict],   # LLM-guided file saves
}
```

### File Metadata (Python dict)

```python
{
    "filename": str,          # e.g., "top_issues.csv"
    "description": str,       # Human-readable description
    "rows": int,              # Row count
    "cols": int,              # Column count (optional, may be 0)
    "columns": list[str],     # Column names (optional)
    "size_bytes": int,        # File size
    "source": str,            # "auto" | "prompt" — how the file was created
}
```

### SessionData Extension

```python
class SessionData:
    rounds: List[Dict] = []       # List of Round_Data dicts
    data_files: List[Dict] = []   # List of File Metadata dicts
```

### API Response: `GET /api/status` (extended)

```json
{
    "is_running": true,
    "log": "...",
    "has_report": false,
    "rounds": [
        {
            "round": 1,
            "reasoning": "正在执行阶段1...",
            "code": "import pandas as pd\n...",
            "result_summary": "执行成功,输出 DataFrame (150行×8列)",
            "evidence_rows": [{"车型": "...", "模块": "..."}],
            "raw_log": "..."
        }
    ],
    "progress_percentage": 25.0,
    "current_round": 1,
    "max_rounds": 20,
    "status_message": "第1/20轮分析中..."
}
```

### API Response: `GET /api/data-files`

```json
{
    "files": [
        {
            "filename": "top_issues.csv",
            "description": "各类型TOP问题聚合统计",
            "rows": 25,
            "cols": 6,
            "size_bytes": 2048
        }
    ]
}
```

### API Response: `GET /api/report` (extended)

```json
{
    "content": "...",
    "base_path": "/outputs/session_xxx",
    "paragraphs": [...],
    "supporting_data": {
        "p-3": [{"车型": "A", "模块": "TSP", "数量": 42}],
        "p-7": [{"问题类型": "远控", "占比": "35%"}]
    }
}
```

## Correctness Properties

A property is a characteristic or behavior that should hold true across all valid executions of a system — essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.

### Property 1: Round_Data Structural Completeness and Ordering

For any sequence of analysis rounds (varying in count from 1 to N, with varying execution results including successes, failures, and missing YAML fields), every Round_Data object appended to SessionData.rounds SHALL contain all required fields (round, reasoning, code, result_summary, evidence_rows, raw_log) with correct types, and the list SHALL preserve insertion order (i.e., rounds[i].round <= rounds[i+1].round for all consecutive pairs).

Validates: Requirements 1.1, 1.3, 1.4

### Property 2: Evidence Capture Bounded and Correctly Serialized

For any DataFrame of arbitrary size (0 to 10,000 rows, 1 to 50 columns) produced by code execution, the evidence capture SHALL return a list of at most 10 dictionaries, where each dictionary's keys exactly match the DataFrame's column names, and the list length equals min(10, len(dataframe)).

Validates: Requirements 4.1, 4.2, 4.3

### Property 3: Filename Deduplication Uniqueness

For any sequence of auto-export operations (1 to 20) targeting the same variable name in the same session directory, all generated filenames SHALL be unique (no two exports produce the same filename), and no previously existing file SHALL be overwritten.

Validates: Requirements 5.3
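A dedup helper satisfying this property might look like the following sketch. The name `dedup_filename` and the counter-suffix policy are assumptions; any scheme that never reuses or overwrites an existing name would do:

```python
import os

def dedup_filename(directory: str, base: str, ext: str = ".csv") -> str:
    """Return a filename not yet present in directory:
    base.csv, then base_1.csv, base_2.csv, ... Existing files are
    never overwritten because only unused names are returned."""
    candidate = f"{base}{ext}"
    i = 1
    while os.path.exists(os.path.join(directory, candidate)):
        candidate = f"{base}_{i}{ext}"
        i += 1
    return candidate
```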

### Property 4: Auto-Export Metadata Completeness

For any newly detected DataFrame variable (with arbitrary variable name, row count, column count, and column names), the auto-export metadata dict SHALL contain all required fields (variable_name, filename, rows, cols, columns) with values matching the source DataFrame's actual properties.

Validates: Requirements 5.4, 5.5

### Property 5: DATA_FILE_SAVED Marker Parsing Round-Trip

For any valid filename string (alphanumeric, Chinese characters, underscores, hyphens, with .csv or .xlsx extension), any positive integer row count, and any non-empty description string, formatting these values into the standardized marker format [DATA_FILE_SAVED] filename: {name}, rows: {count}, description: {desc} and then parsing the marker SHALL recover the original filename, row count, and description exactly.

Validates: Requirements 6.3
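A format/parse pair satisfying this round-trip could be sketched as follows. The `[^,]+` filename group relies on the property's guarantee that valid filenames contain no commas; malformed lines return `None` so they can be skipped silently, per Error Handling:

```python
import re

def format_marker(filename: str, rows: int, description: str) -> str:
    """Emit the standardized marker line the prompt asks the LLM to print."""
    return f"[DATA_FILE_SAVED] filename: {filename}, rows: {rows}, description: {description}"

MARKER_RE = re.compile(
    r"\[DATA_FILE_SAVED\] filename: (?P<filename>[^,]+), "
    r"rows: (?P<rows>\d+), description: (?P<description>.+)"
)

def parse_marker(line: str):
    """Recover {filename, rows, description} from a marker line, or None
    if the line doesn't match the expected format."""
    m = MARKER_RE.search(line)
    if m is None:
        return None
    return {
        "filename": m.group("filename"),
        "rows": int(m.group("rows")),
        "description": m.group("description"),
    }
```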

### Property 6: Data File Preview Bounded Rows

For any CSV file containing 0 to 10,000 rows and 1 to 50 columns, the preview function SHALL return a result with columns matching the file's column names exactly, and rows containing at most 5 dictionaries, where each dictionary's keys match the column names.

Validates: Requirements 7.2

### Property 7: Evidence Annotation Parsing Correctness

For any Markdown report text containing a mix of paragraphs with and without <!-- evidence:round_N --> annotations (where N varies from 1 to 100), the annotation parser SHALL: (a) correctly extract the round number for every annotated paragraph, (b) exclude non-annotated paragraphs from the supporting_data mapping, and (c) produce a mapping where each key is a valid paragraph ID and each value references a valid round number.

Validates: Requirements 11.3, 11.4

### Property 8: SessionData JSON Serialization Round-Trip

For any SessionData instance with arbitrary rounds (list of Round_Data dicts) and data_files (list of file metadata dicts), serializing these to JSON and deserializing back SHALL produce lists that are equal to the originals.

Validates: Requirements 12.4

## Error Handling

### CodeExecutor Errors

- **DataFrame evidence capture failure**: If `.head(10).to_dict(orient='records')` raises an exception (e.g., mixed types, memory issues), catch the exception and return an empty `evidence_rows` list. Log a warning but do not fail the execution.
- **Auto-export failure**: If CSV writing fails for a detected DataFrame (e.g., permission error, disk full), catch the exception, log a warning with the variable name, and skip that export. Other detected DataFrames should still be exported.
- **Marker parsing failure**: If a `[DATA_FILE_SAVED]` line doesn't match the expected format, skip it silently. Malformed markers should not crash the execution pipeline.

### API Errors

- **Missing session**: All new endpoints return HTTP 404 with `{"detail": "Session not found"}` for invalid session IDs.
- **Missing file**: `GET /api/data-files/preview` and `GET /api/data-files/download` return HTTP 404 with `{"detail": "File not found: {filename}"}` when the requested file doesn't exist in the session directory.
- **Corrupt CSV**: If a CSV file can't be read by pandas during preview, return HTTP 500 with `{"detail": "Failed to read file: {error}"}`.

### Frontend Errors

- **Polling with missing rounds**: If `rounds` is undefined or null in the status response, treat it as an empty array. Don't crash the rendering loop.
- **Evidence popover with empty data**: If `supporting_data[paraId]` is an empty array, don't show the button (same as missing).
- **Incremental rendering mismatch**: If `rounds.length < lastRenderedRound` (server restart scenario), reset `lastRenderedRound` to 0 and re-render all cards.

### Agent Errors

- **Missing reasoning field**: Already handled — store an empty string (Requirement 1.4).
- **Evidence annotation missing**: Already handled — paragraphs without annotations simply don't get supporting-data buttons. This is by design, not an error.

## Testing Strategy

### Property-Based Tests (Hypothesis)

The project already uses `hypothesis` with `max_examples=20` for fast execution (see `tests/test_properties.py`). New property tests will follow the same pattern.

- **Library**: `hypothesis` (already installed)
- **Configuration**: `max_examples=100` minimum per property (increased from the existing 20 for new properties)
- **Tag format**: `Feature: analysis-dashboard-redesign, Property {N}: {title}`

Properties to implement:

  1. Round_Data structural completeness — Generate random execution results, verify Round_Data fields
  2. Evidence capture bounded — Generate random DataFrames, verify evidence row count and format
  3. Filename deduplication — Generate sequences of same-name exports, verify uniqueness
  4. Auto-export metadata — Generate random DataFrames, verify metadata fields
  5. Marker parsing round-trip — Generate random filenames/rows/descriptions, verify parse(format(x)) == x
  6. Preview bounded rows — Generate random CSVs, verify preview row count and columns
  7. Evidence annotation parsing — Generate random annotated Markdown, verify extraction
  8. SessionData JSON round-trip — Generate random rounds/data_files, verify serialize/deserialize identity

### Unit Tests

- **Prompt content assertions (6.1, 6.2, 11.2)**: Verify prompt strings contain the required instruction text
- **SessionData initialization (12.1, 12.2)**: Verify new attributes exist with correct defaults
- **API response shape (2.1, 2.3)**: Verify the status endpoint returns the `rounds` array and `log` field
- **Tab switching (9.4)**: Verify `switchTab` handles the new tab identifiers

### Integration Tests

- **End-to-end round capture**: Run a mini analysis session, verify rounds are populated
- **Data file API flow**: Create files, call list/preview/download endpoints, verify responses
- **Report evidence linking**: Generate a report with annotations, call the report API, verify the `supporting_data` mapping

### Manual Testing

- **UI layout verification (3.1-3.6, 8.1-8.5, 9.1-9.3, 10.1-10.4)**: Visual inspection of tab layout, round cards, data file cards, inline images, and supporting data popovers