Add AI automation foundation

codex, 1 month ago
parent
commit
398fa25833

+ 406 - 3
api-docs.md

@@ -60,6 +60,121 @@
 
 Built-in tags cannot be deleted.
 
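+## AI Configuration and Invocation
+
+The AI module is split into a unified internal AI service layer and a provider-specific API integration layer. Currently supported:
+
+| Type | Description |
+| --- | --- |
+| OPENAI | OpenAI Chat Completions API; the default Base URL is `https://api.openai.com/v1` |
+| OPENAI_COMPATIBLE | OpenAI-compatible API, suitable for LM Studio, local model servers, or other compatible services |
+| GOOGLE_GEMINI | Google Gemini Generate Content API; the default Base URL is `https://generativelanguage.googleapis.com/v1beta` |
+
+A provider's API key may be empty. Responses only include `api_key_set` and never return the plaintext key.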
+
+### List AI providers
+
+`GET /api/ai/providers`
+
+### Create an AI provider
+
+`POST /api/ai/providers`
+
+```json
+{
+  "name": "本地 LM Studio",
+  "provider_type": "OPENAI_COMPATIBLE",
+  "base_url": "http://127.0.0.1:1234/v1",
+  "api_key": "",
+  "enabled": true
+}
+```
+
+### Update an AI provider
+
+`PATCH /api/ai/providers/{provider_id}`
+
+```json
+{
+  "name": "OpenAI",
+  "provider_type": "OPENAI",
+  "base_url": "https://api.openai.com/v1",
+  "api_key": "sk-xxxx",
+  "clear_api_key": false,
+  "enabled": true
+}
+```
+
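+Note: when editing, leaving `api_key` empty keeps the existing key; `clear_api_key = true` clears the existing key.
+
+### Delete an AI provider
+
+`DELETE /api/ai/providers/{provider_id}`
+
+Deleting a provider also deletes, by cascade, the model configurations under it.
+
+### List AI models
+
+`GET /api/ai/models?provider_id=1`
+
+`provider_id` is optional; when omitted, all models are returned.
+
+### Create an AI model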
+
+`POST /api/ai/models`
+
+```json
+{
+  "provider_id": 1,
+  "name": "gpt-4o-mini",
+  "display_name": "GPT-4o mini",
+  "is_default": true
+}
+```
+
+### Update an AI model
+
+`PATCH /api/ai/models/{model_id}`
+
+```json
+{
+  "provider_id": 1,
+  "name": "gemini-1.5-flash",
+  "display_name": "Gemini Flash",
+  "is_default": true
+}
+```
+
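+Each provider can have only one default model. Setting a new default automatically clears the default flag on the provider's other models.
+
+### Delete an AI model
+
+`DELETE /api/ai/models/{model_id}`
+
+### Test the AI service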
+
+`POST /api/ai/test`
+
+```json
+{
+  "provider_id": 1,
+  "model_id": 1,
+  "prompt": "请用一句话回答:你已经可以连接了吗?",
+  "temperature": 0.2
+}
+```
+
+Response:
+
+```json
+{
+  "provider": {},
+  "model": {},
+  "content": "AI 输出文本",
+  "raw_response": {}
+}
+```
+
 ## Scanning
 
 ### Run a full scan
@@ -213,12 +328,51 @@ smartctl -a -d jmb39x,1 /dev/sdb
       "judgement": "TRUSTED",
       "risk_level": "LOW",
       "reason": "Microsoft 官方安全组件。",
-      "suggestion": "保持运行。"
+      "suggestion": "保持运行。",
+      "tags": ["windows系统"]
     }
   ]
 }
 ```
 
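+Notes:
+
+- `tags` is optional and takes an array of tag names; payloads in the old format without `tags` can still be imported.
+- If `tags` is provided, the backend replaces the tags of the corresponding service with that array.
+- Tags that do not exist in the database are created automatically, defaulting to `is_controllable = true` and `is_builtin = false`.
+
+### Analyze services with AI and return results pending confirmation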
+
+`POST /api/services/analyze-ai`
+
+```json
+{
+  "provider_id": 1,
+  "model_id": 1,
+  "temperature": 0.2,
+  "scope": "selected",
+  "ids": [1, 2]
+}
+```
+
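+Notes:
+
+- The backend reuses the service AI prompt template and calls the configured AI provider and model.
+- The endpoint only returns the parsed AI output and an update preview; it does not write to the database.
+- After the frontend confirms, it goes on to call `/api/services/import-ai` to persist the results.
+
+Main response fields:
+
+| Field | Description |
+| --- | --- |
+| items | Parsed AI JSON array |
+| preview | Comparison of current database values and AI-suggested values |
+| raw_output | Raw AI output text |
+| provider | Provider used for this call |
+| model | Model used for this call |
+| prompt_text | Prompt sent to the AI for this call |
+| markdown_table | Table for manual review |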
+
 ### Generate the service AI analysis prompt
 
 `POST /api/services/ai-prompt`
@@ -330,12 +484,38 @@ smartctl -a -d jmb39x,1 /dev/sdb
       "judgement": "SUSPICIOUS",
       "risk_level": "HIGH",
       "reason": "路径和命令行异常。",
-      "suggestion": "建议隔离并检查哈希和网络连接。"
+      "suggestion": "建议隔离并检查哈希和网络连接。",
+      "tags": ["可疑程序"]
     }
   ]
 }
 ```
 
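+Notes:
+
+- `tags` is optional and takes an array of tag names; payloads in the old format without `tags` can still be imported.
+- If `tags` is provided, the backend replaces the tags of the corresponding process with that array.
+- Tags that do not exist in the database are created automatically, defaulting to `is_controllable = true` and `is_builtin = false`.
+
+### Analyze processes with AI and return results pending confirmation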
+
+`POST /api/processes/analyze-ai`
+
+```json
+{
+  "provider_id": 1,
+  "model_id": 1,
+  "temperature": 0.2,
+  "scope": "pending"
+}
+```
+
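+Notes:
+
+- The backend reuses the process AI prompt template and calls the configured AI provider and model.
+- The endpoint only returns the parsed AI output and an update preview; it does not write to the database.
+- After the frontend confirms, it goes on to call `/api/processes/import-ai` to persist the results.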
+
 ### Generate the process AI analysis prompt
 
 `POST /api/processes/ai-prompt`
@@ -379,10 +559,233 @@ smartctl -a -d jmb39x,1 /dev/sdb
 - Control is not allowed when any associated tag has `is_controllable = false`.
 - Stopping a process requires that the process still existed in the most recent scan.
 
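+## Windows Automation
+
+The automation endpoints are intended for orchestrating more complex local operations later on. Shutdown, restart, mouse, and keyboard operations directly affect the current Windows desktop session, so the calling business layer should explicitly confirm them before invoking these endpoints.
+
+### Shut down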
+
+`POST /api/automation/power/shutdown`
+
+```json
+{
+  "delay_seconds": 60,
+  "force": false,
+  "reason": "计划关机"
+}
+```
+
+### Restart
+
+`POST /api/automation/power/restart`
+
+The request body is the same as for the shutdown endpoint.
+
+### Cancel a scheduled shutdown or restart
+
+`POST /api/automation/power/cancel`
+
+### Start a program
+
+`POST /api/automation/programs/start`
+
+```json
+{
+  "command": "notepad.exe",
+  "cwd": null,
+  "shell": true
+}
+```
+
+### Stop a program
+
+`POST /api/automation/programs/stop`
+
+```json
+{
+  "pid": 1234,
+  "name": null,
+  "timeout_seconds": 8,
+  "kill_after_timeout": true
+}
+```
+
+Note: at least one of `pid` and `name` must be provided; stopping by name matches the processes that share that name.
+
+### Screenshot
+
+`POST /api/automation/screenshot`
+
+```json
+{
+  "save_path": "C:/Temp/screen.png",
+  "include_base64": true
+}
+```
+
+### Mouse operations
+
+`POST /api/automation/mouse`
+
+```json
+{
+  "action": "click",
+  "x": 100,
+  "y": 200,
+  "button": "left",
+  "clicks": 1,
+  "duration": 0,
+  "amount": 0
+}
+```
+
+`action` supports: `move_to`, `move_rel`, `click`, `double_click`, `right_click`, `drag_to`, `scroll`.
+
+### Keyboard operations
+
+`POST /api/automation/keyboard`
+
+```json
+{
+  "action": "hotkey",
+  "keys": ["ctrl", "s"],
+  "interval": 0
+}
+```
+
+`action` supports: `press`, `hotkey`, `write`, `key_down`, `key_up`.
+
+### AI vision analysis of the current screen
+
+`POST /api/automation/vision/analyze`
+
+```json
+{
+  "provider_id": 1,
+  "model_id": 1,
+  "temperature": 0.1
+}
+```
+
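+The backend captures the current Windows screen and calls a vision-capable AI model to identify the interface name, a description, whether it is the Windows desktop, whether it is a web page in a browser, and a list of actionable elements. Percentage coordinates returned by the AI are converted to pixel coordinates using the original screenshot resolution; the screenshot and the recognition result are saved to the database.
+
+### High-level automation action endpoints
+
+The endpoints below fetch the current process list once before the action runs, fetch it again afterwards, and compare the two to find newly started processes. If the request carries a `screen_id`, the backend first captures the current screen and sends it, together with the target screenshot stored in the database, to the AI for comparison; on a mismatch it writes an automation error record and aborts the action.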
+
+`POST /api/automation/actions/mouse`
+
+```json
+{
+  "screen_id": 1,
+  "provider_id": 1,
+  "model_id": 1,
+  "temperature": 0.1,
+  "x": 420,
+  "y": 260,
+  "mouse_action": "click"
+}
+```
+
+`mouse_action` supports: `click`, `right_click`, `double_click`.
+
+`POST /api/automation/actions/keyboard`
+
+```json
+{
+  "screen_id": 1,
+  "provider_id": 1,
+  "model_id": 1,
+  "keys": ["ctrl", "s"]
+}
+```
+
+`POST /api/automation/actions/text-input`
+
+```json
+{
+  "screen_id": 1,
+  "provider_id": 1,
+  "model_id": 1,
+  "text": "要输入的中文文本"
+}
+```
+
+Text input is pasted via the clipboard, which avoids the instability of simulating keystrokes directly for Chinese text.
+
+`POST /api/automation/actions/start-program`
+
+```json
+{
+  "command": "msedge",
+  "cwd": null,
+  "shell": true
+}
+```
+
+`POST /api/automation/actions/close-opened-programs`
+
+```json
+{
+  "pids": [1234, 5678]
+}
+```
+
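+If `pids` is omitted, the backend tries to close the automation-started processes recorded within the current backend process.
+
+### Automation workflows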
+
+`GET /api/automation/workflows`
+
+`POST /api/automation/workflows`
+
+```json
+{
+  "name": "打开浏览器并点击",
+  "description": "示例工作流",
+  "nodes": [
+    {
+      "node_type": "start_program",
+      "title": "启动 Edge",
+      "config": { "command": "msedge" }
+    },
+    {
+      "node_type": "mouse",
+      "screen_id": 1,
+      "title": "点击按钮",
+      "config": { "x": 420, "y": 260, "mouse_action": "click" }
+    }
+  ]
+}
+```
+
+Node types currently supported: `mouse`, `keyboard`, `text_input`, `start_program`, `close_programs`.
+
+`GET /api/automation/workflows/{workflow_id}`
+
+`PUT /api/automation/workflows/{workflow_id}`
+
+`DELETE /api/automation/workflows/{workflow_id}`
+
+### Recognized screens
+
+`GET /api/automation/screens`
+
+`GET /api/automation/screens/{screen_id}?include_image=true`
+
+`DELETE /api/automation/screens/{screen_id}`
+
+### Automation error records
+
+`GET /api/automation/errors`
+
+`GET /api/automation/errors/{error_id}?include_images=true`
+
 ## Tag information in AI prompts
 
 The AI prompts for services and processes include:
 
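 - The `tags` field of each object to be analyzed.
 - The list of existing tags in the system, including `name`, `description`, and `is_controllable`.
-- Tag guidance: the AI should take existing tags into account; if an existing tag fits, it may recommend that tag name in `suggestion`, but it must not invent tags that do not exist.
+- Tag guidance: the AI should take existing tags into account and return, in the `tags` field of each output object, the array of tag names it recommends binding.
+- The AI may use existing tags and, when genuinely necessary, return new short tag names; tags that do not exist are created automatically on import.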
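
The endpoints documented in `api-docs.md` above could be exercised from a small client along these lines. This is only a sketch: `BACKEND_URL` and the `build_request` helper are assumptions made for illustration (the diff does not state where the backend listens), and the provider and model IDs are placeholders. The requests are built but not sent.

```python
from __future__ import annotations

import json
import urllib.request

# Assumption: host and port are not given in the diff; adjust to the real deployment.
BACKEND_URL = "http://127.0.0.1:8000"


def build_request(method: str, path: str, body: dict | None = None) -> urllib.request.Request:
    """Build (but do not send) a JSON request against the documented API."""
    data = json.dumps(body, ensure_ascii=False).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(f"{BACKEND_URL}{path}", data=data, headers=headers, method=method)


# Register a local OpenAI-compatible provider, using the documented fields.
create_provider = build_request(
    "POST",
    "/api/ai/providers",
    {
        "name": "Local LM Studio",
        "provider_type": "OPENAI_COMPATIBLE",
        "base_url": "http://127.0.0.1:1234/v1",
        "api_key": "",
        "enabled": True,
    },
)

# Smoke-test a provider/model pair; the IDs here are placeholders.
test_call = build_request(
    "POST",
    "/api/ai/test",
    {"provider_id": 1, "model_id": 1, "prompt": "Reply in one sentence.", "temperature": 0.2},
)

# To actually send a request: urllib.request.urlopen(test_call), then JSON-decode the body.
```

Building the request separately from sending it keeps the payload easy to inspect before anything reaches a real AI provider.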

+ 74 - 0
backend/app/ai_gemini.py

@@ -0,0 +1,74 @@
+from __future__ import annotations
+
+from typing import Any
+from urllib.parse import quote
+
+import httpx
+
+
+DEFAULT_GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
+
+
+def normalize_base_url(base_url: str | None) -> str:
+    return (base_url or DEFAULT_GEMINI_BASE_URL).rstrip("/")
+
+
+def chat(provider: dict[str, Any], model: dict[str, Any], prompt: str, temperature: float) -> dict[str, Any]:
+    base_url = normalize_base_url(provider.get("base_url"))
+    headers = {"Content-Type": "application/json"}
+    api_key = provider.get("api_key")
+    if api_key:
+        headers["x-goog-api-key"] = api_key
+
+    payload = {
+        "contents": [{"parts": [{"text": prompt}]}],
+        "generationConfig": {"temperature": temperature},
+    }
+    model_name = quote(model["name"], safe="")
+    with httpx.Client(timeout=120) as client:
+        response = client.post(f"{base_url}/models/{model_name}:generateContent", json=payload, headers=headers)
+        response.raise_for_status()
+        data = response.json()
+
+    parts = (data.get("candidates") or [{}])[0].get("content", {}).get("parts", [])
+    content = "".join(part.get("text", "") for part in parts)
+    return {"content": content, "raw_response": data}
+
+
+def chat_with_images(
+    provider: dict[str, Any],
+    model: dict[str, Any],
+    prompt: str,
+    images: list[dict[str, str]],
+    temperature: float,
+) -> dict[str, Any]:
+    base_url = normalize_base_url(provider.get("base_url"))
+    headers = {"Content-Type": "application/json"}
+    api_key = provider.get("api_key")
+    if api_key:
+        headers["x-goog-api-key"] = api_key
+
+    parts: list[dict[str, Any]] = [{"text": prompt}]
+    for image in images:
+        parts.append(
+            {
+                "inlineData": {
+                    "mimeType": image["mime_type"],
+                    "data": image["base64"],
+                }
+            }
+        )
+
+    payload = {
+        "contents": [{"parts": parts}],
+        "generationConfig": {"temperature": temperature},
+    }
+    model_name = quote(model["name"], safe="")
+    with httpx.Client(timeout=180) as client:
+        response = client.post(f"{base_url}/models/{model_name}:generateContent", json=payload, headers=headers)
+        response.raise_for_status()
+        data = response.json()
+
+    response_parts = (data.get("candidates") or [{}])[0].get("content", {}).get("parts", [])
+    content = "".join(part.get("text", "") for part in response_parts)
+    return {"content": content, "raw_response": data}
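
Both functions in `ai_gemini.py` read the reply text from the parts of the first candidate. A minimal sketch of that extraction as a standalone helper follows. `extract_gemini_text` is a hypothetical name, and the two sample payloads are hand-written for illustration rather than captured API responses. It also shows one way to tolerate a response whose `candidates` list is missing or empty.

```python
from __future__ import annotations

from typing import Any


def extract_gemini_text(data: dict[str, Any]) -> str:
    """Join the text parts of the first candidate; return "" when there is none."""
    candidates = data.get("candidates") or []
    if not candidates:
        return ""
    parts = (candidates[0].get("content") or {}).get("parts") or []
    return "".join(part.get("text", "") for part in parts)


# Hand-written sample payloads (illustrative only).
ok = {"candidates": [{"content": {"parts": [{"text": "Hello, "}, {"text": "world"}]}}]}
blocked = {"candidates": [], "promptFeedback": {"blockReason": "SAFETY"}}

print(extract_gemini_text(ok))             # Hello, world
print(repr(extract_gemini_text(blocked)))  # ''
```

Returning an empty string for an empty candidate list matches what the module already does when `candidates` is absent, so callers see the same shape in both cases.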

+ 71 - 0
backend/app/ai_openai.py

@@ -0,0 +1,71 @@
+from __future__ import annotations
+
+from typing import Any
+
+import httpx
+
+
+DEFAULT_OPENAI_BASE_URL = "https://api.openai.com/v1"
+
+
+def normalize_base_url(base_url: str | None) -> str:
+    return (base_url or DEFAULT_OPENAI_BASE_URL).rstrip("/")
+
+
+def chat(provider: dict[str, Any], model: dict[str, Any], prompt: str, temperature: float) -> dict[str, Any]:
+    base_url = normalize_base_url(provider.get("base_url"))
+    headers = {"Content-Type": "application/json"}
+    api_key = provider.get("api_key")
+    if api_key:
+        headers["Authorization"] = f"Bearer {api_key}"
+
+    payload = {
+        "model": model["name"],
+        "messages": [{"role": "user", "content": prompt}],
+        "temperature": temperature,
+    }
+    with httpx.Client(timeout=120) as client:
+        response = client.post(f"{base_url}/chat/completions", json=payload, headers=headers)
+        response.raise_for_status()
+        data = response.json()
+
+    content = (data.get("choices") or [{}])[0].get("message", {}).get("content")
+    if content is None:
+        content = ""
+    return {"content": content, "raw_response": data}
+
+
+def chat_with_images(
+    provider: dict[str, Any],
+    model: dict[str, Any],
+    prompt: str,
+    images: list[dict[str, str]],
+    temperature: float,
+) -> dict[str, Any]:
+    base_url = normalize_base_url(provider.get("base_url"))
+    headers = {"Content-Type": "application/json"}
+    api_key = provider.get("api_key")
+    if api_key:
+        headers["Authorization"] = f"Bearer {api_key}"
+
+    content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
+    for image in images:
+        content.append(
+            {
+                "type": "image_url",
+                "image_url": {"url": f"data:{image['mime_type']};base64,{image['base64']}"},
+            }
+        )
+
+    payload = {
+        "model": model["name"],
+        "messages": [{"role": "user", "content": content}],
+        "temperature": temperature,
+    }
+    with httpx.Client(timeout=180) as client:
+        response = client.post(f"{base_url}/chat/completions", json=payload, headers=headers)
+        response.raise_for_status()
+        data = response.json()
+
+    result = (data.get("choices") or [{}])[0].get("message", {}).get("content") or ""
+    return {"content": result, "raw_response": data}
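
`chat_with_images` in `ai_openai.py` sends each screenshot as a base64 `data:` URL inside the message `content` list. The sketch below isolates that step. `build_vision_content` is a hypothetical helper name, and the image bytes are a stand-in rather than a real PNG.

```python
from __future__ import annotations

import base64
from typing import Any


def build_vision_content(prompt: str, images: list[dict[str, str]]) -> list[dict[str, Any]]:
    """Return the `content` list for one user message: text first, then images."""
    content: list[dict[str, Any]] = [{"type": "text", "text": prompt}]
    for image in images:
        url = f"data:{image['mime_type']};base64,{image['base64']}"
        content.append({"type": "image_url", "image_url": {"url": url}})
    return content


# Stand-in bytes, not a real PNG; only the encoding step matters here.
fake_image = {"mime_type": "image/png", "base64": base64.b64encode(b"demo").decode("ascii")}
content = build_vision_content("Describe this screenshot.", [fake_image])
print(content[1]["image_url"]["url"])  # data:image/png;base64,ZGVtbw==
```

The image dictionaries use the same `mime_type` and `base64` keys that `image_to_base64` in `automation_service.py` produces.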

+ 311 - 0
backend/app/ai_service.py

@@ -0,0 +1,311 @@
+from __future__ import annotations
+
+import json
+import re
+import sqlite3
+from typing import Any
+
+import httpx
+from fastapi import HTTPException
+from pydantic import ValidationError
+
+from . import ai_gemini, ai_openai
+from .database import get_db
+from .scanner import now_iso
+from .schemas import (
+    AiImportItem,
+    AiModelCreate,
+    AiModelUpdate,
+    AiProviderCreate,
+    AiProviderUpdate,
+)
+
+
+def public_provider(row: dict[str, Any]) -> dict[str, Any]:
+    item = dict(row)
+    item["enabled"] = bool(item["enabled"])
+    item["api_key_set"] = bool(item.get("api_key"))
+    item.pop("api_key", None)
+    return item
+
+
+def public_model(row: dict[str, Any]) -> dict[str, Any]:
+    item = dict(row)
+    item["is_default"] = bool(item["is_default"])
+    return item
+
+
+def list_providers() -> list[dict[str, Any]]:
+    with get_db() as conn:
+        rows = conn.execute("SELECT * FROM ai_providers ORDER BY name ASC").fetchall()
+    return [public_provider(row) for row in rows]
+
+
+def create_provider(payload: AiProviderCreate) -> dict[str, Any]:
+    now = now_iso()
+    try:
+        with get_db() as conn:
+            cursor = conn.execute(
+                """
+                INSERT INTO ai_providers (name, provider_type, base_url, api_key, enabled, created_at, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?, ?)
+                """,
+                (
+                    payload.name.strip(),
+                    payload.provider_type,
+                    clean_optional(payload.base_url),
+                    clean_optional(payload.api_key),
+                    1 if payload.enabled else 0,
+                    now,
+                    now,
+                ),
+            )
+            row = conn.execute("SELECT * FROM ai_providers WHERE id = ?", (cursor.lastrowid,)).fetchone()
+    except sqlite3.IntegrityError as exc:
+        raise HTTPException(status_code=409, detail="AI provider name already exists") from exc
+    return public_provider(row)
+
+
+def update_provider(provider_id: int, payload: AiProviderUpdate) -> dict[str, Any]:
+    now = now_iso()
+    with get_db() as conn:
+        existing = conn.execute("SELECT * FROM ai_providers WHERE id = ?", (provider_id,)).fetchone()
+        if not existing:
+            raise HTTPException(status_code=404, detail="AI provider not found")
+        api_key = existing.get("api_key")
+        if payload.clear_api_key:
+            api_key = None
+        elif payload.api_key is not None and payload.api_key != "":
+            api_key = payload.api_key
+        try:
+            conn.execute(
+                """
+                UPDATE ai_providers
+                SET name = ?, provider_type = ?, base_url = ?, api_key = ?, enabled = ?, updated_at = ?
+                WHERE id = ?
+                """,
+                (
+                    payload.name.strip(),
+                    payload.provider_type,
+                    clean_optional(payload.base_url),
+                    clean_optional(api_key),
+                    1 if payload.enabled else 0,
+                    now,
+                    provider_id,
+                ),
+            )
+        except sqlite3.IntegrityError as exc:
+            raise HTTPException(status_code=409, detail="AI provider name already exists") from exc
+        row = conn.execute("SELECT * FROM ai_providers WHERE id = ?", (provider_id,)).fetchone()
+    return public_provider(row)
+
+
+def delete_provider(provider_id: int) -> dict[str, Any]:
+    with get_db() as conn:
+        cursor = conn.execute("DELETE FROM ai_providers WHERE id = ?", (provider_id,))
+    if cursor.rowcount == 0:
+        raise HTTPException(status_code=404, detail="AI provider not found")
+    return {"deleted": cursor.rowcount}
+
+
+def list_models(provider_id: int | None = None) -> list[dict[str, Any]]:
+    where = "WHERE m.provider_id = ?" if provider_id is not None else ""
+    params = [provider_id] if provider_id is not None else []
+    with get_db() as conn:
+        rows = conn.execute(
+            f"""
+            SELECT m.*, p.name AS provider_name, p.provider_type
+            FROM ai_models m
+            JOIN ai_providers p ON p.id = m.provider_id
+            {where}
+            ORDER BY p.name ASC, m.is_default DESC, m.name ASC
+            """,
+            params,
+        ).fetchall()
+    return [public_model(row) for row in rows]
+
+
+def create_model(payload: AiModelCreate) -> dict[str, Any]:
+    now = now_iso()
+    with get_db() as conn:
+        ensure_provider_exists(conn, payload.provider_id)
+        try:
+            cursor = conn.execute(
+                """
+                INSERT INTO ai_models (provider_id, name, display_name, is_default, created_at, updated_at)
+                VALUES (?, ?, ?, ?, ?, ?)
+                """,
+                (
+                    payload.provider_id,
+                    payload.name.strip(),
+                    clean_optional(payload.display_name),
+                    1 if payload.is_default else 0,
+                    now,
+                    now,
+                ),
+            )
+        except sqlite3.IntegrityError as exc:
+            raise HTTPException(status_code=409, detail="AI model already exists for this provider") from exc
+        if payload.is_default:
+            clear_other_default_models(conn, payload.provider_id, cursor.lastrowid)
+        row = get_model_row(conn, cursor.lastrowid)
+    return public_model(row)
+
+
+def update_model(model_id: int, payload: AiModelUpdate) -> dict[str, Any]:
+    now = now_iso()
+    with get_db() as conn:
+        ensure_provider_exists(conn, payload.provider_id)
+        if not conn.execute("SELECT id FROM ai_models WHERE id = ?", (model_id,)).fetchone():
+            raise HTTPException(status_code=404, detail="AI model not found")
+        try:
+            conn.execute(
+                """
+                UPDATE ai_models
+                SET provider_id = ?, name = ?, display_name = ?, is_default = ?, updated_at = ?
+                WHERE id = ?
+                """,
+                (
+                    payload.provider_id,
+                    payload.name.strip(),
+                    clean_optional(payload.display_name),
+                    1 if payload.is_default else 0,
+                    now,
+                    model_id,
+                ),
+            )
+        except sqlite3.IntegrityError as exc:
+            raise HTTPException(status_code=409, detail="AI model already exists for this provider") from exc
+        if payload.is_default:
+            clear_other_default_models(conn, payload.provider_id, model_id)
+        row = get_model_row(conn, model_id)
+    return public_model(row)
+
+
+def delete_model(model_id: int) -> dict[str, Any]:
+    with get_db() as conn:
+        cursor = conn.execute("DELETE FROM ai_models WHERE id = ?", (model_id,))
+    if cursor.rowcount == 0:
+        raise HTTPException(status_code=404, detail="AI model not found")
+    return {"deleted": cursor.rowcount}
+
+
+def chat(provider_id: int, model_id: int, prompt: str, temperature: float) -> dict[str, Any]:
+    provider, model = get_provider_and_model(provider_id, model_id)
+    try:
+        if provider["provider_type"] in {"OPENAI", "OPENAI_COMPATIBLE"}:
+            result = ai_openai.chat(provider, model, prompt, temperature)
+        elif provider["provider_type"] == "GOOGLE_GEMINI":
+            result = ai_gemini.chat(provider, model, prompt, temperature)
+        else:
+            raise HTTPException(status_code=400, detail="Unsupported AI provider type")
+    except httpx.HTTPStatusError as exc:
+        detail = exc.response.text[:1000] if exc.response is not None else str(exc)
+        raise HTTPException(status_code=502, detail=f"AI provider returned an error: {detail}") from exc
+    except httpx.HTTPError as exc:
+        raise HTTPException(status_code=502, detail=f"AI provider request failed: {exc}") from exc
+    return {
+        "provider": public_provider(provider),
+        "model": public_model(model),
+        **result,
+    }
+
+
+def chat_with_images(
+    provider_id: int,
+    model_id: int,
+    prompt: str,
+    images: list[dict[str, str]],
+    temperature: float,
+) -> dict[str, Any]:
+    provider, model = get_provider_and_model(provider_id, model_id)
+    try:
+        if provider["provider_type"] in {"OPENAI", "OPENAI_COMPATIBLE"}:
+            result = ai_openai.chat_with_images(provider, model, prompt, images, temperature)
+        elif provider["provider_type"] == "GOOGLE_GEMINI":
+            result = ai_gemini.chat_with_images(provider, model, prompt, images, temperature)
+        else:
+            raise HTTPException(status_code=400, detail="Unsupported AI provider type")
+    except httpx.HTTPStatusError as exc:
+        detail = exc.response.text[:1000] if exc.response is not None else str(exc)
+        raise HTTPException(status_code=502, detail=f"AI provider returned an error: {detail}") from exc
+    except httpx.HTTPError as exc:
+        raise HTTPException(status_code=502, detail=f"AI provider request failed: {exc}") from exc
+    return {
+        "provider": public_provider(provider),
+        "model": public_model(model),
+        **result,
+    }
+
+
+def parse_ai_items(content: str) -> list[dict[str, Any]]:
+    parsed = json.loads(extract_json_text(content))
+    items = parsed.get("items") if isinstance(parsed, dict) else parsed
+    if not isinstance(items, list):
+        raise ValueError("AI output must be a JSON array or an object containing items")
+    validated = []
+    for item in items:
+        validated.append(AiImportItem.model_validate(item).model_dump())
+    return validated
+
+
+def extract_json_text(content: str) -> str:
+    text = content.strip()
+    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL | re.IGNORECASE)
+    if fenced:
+        text = fenced.group(1).strip()
+    if text.startswith("[") or text.startswith("{"):
+        return text
+    start_candidates = [index for index in [text.find("["), text.find("{")] if index >= 0]
+    if not start_candidates:
+        return text
+    start = min(start_candidates)
+    end = max(text.rfind("]"), text.rfind("}"))
+    return text[start : end + 1] if end > start else text[start:]
+
+
+def get_provider_and_model(provider_id: int, model_id: int) -> tuple[dict[str, Any], dict[str, Any]]:
+    with get_db() as conn:
+        provider = conn.execute("SELECT * FROM ai_providers WHERE id = ?", (provider_id,)).fetchone()
+        model = conn.execute("SELECT * FROM ai_models WHERE id = ?", (model_id,)).fetchone()
+    if not provider:
+        raise HTTPException(status_code=404, detail="AI provider not found")
+    if not provider["enabled"]:
+        raise HTTPException(status_code=400, detail="AI provider is disabled")
+    if not model or model["provider_id"] != provider_id:
+        raise HTTPException(status_code=400, detail="AI model does not belong to this provider")
+    return provider, model
+
+
+def ensure_provider_exists(conn, provider_id: int) -> None:
+    if not conn.execute("SELECT id FROM ai_providers WHERE id = ?", (provider_id,)).fetchone():
+        raise HTTPException(status_code=404, detail="AI provider not found")
+
+
+def get_model_row(conn, model_id: int) -> dict[str, Any]:
+    row = conn.execute(
+        """
+        SELECT m.*, p.name AS provider_name, p.provider_type
+        FROM ai_models m
+        JOIN ai_providers p ON p.id = m.provider_id
+        WHERE m.id = ?
+        """,
+        (model_id,),
+    ).fetchone()
+    if not row:
+        raise HTTPException(status_code=404, detail="AI model not found")
+    return row
+
+
+def clear_other_default_models(conn, provider_id: int, model_id: int) -> None:
+    conn.execute(
+        "UPDATE ai_models SET is_default = 0, updated_at = ? WHERE provider_id = ? AND id <> ?",
+        (now_iso(), provider_id, model_id),
+    )
+
+
+def clean_optional(value: str | None) -> str | None:
+    if value is None:
+        return None
+    stripped = value.strip()
+    return stripped or None
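
To see how `extract_json_text` behaves on typical model output, the function can be run on its own. The body below is adapted from the diff above; the only change is that the fence marker is built with string repetition so the example can sit inside a Markdown code block. The two sample replies are made up for illustration.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built indirectly for the reason given above


def extract_json_text(content: str) -> str:
    """Prefer the body of a fenced block; otherwise cut from the first to the last bracket."""
    text = content.strip()
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL | re.IGNORECASE)
    if fenced:
        text = fenced.group(1).strip()
    if text.startswith("[") or text.startswith("{"):
        return text
    start_candidates = [index for index in [text.find("["), text.find("{")] if index >= 0]
    if not start_candidates:
        return text
    start = min(start_candidates)
    end = max(text.rfind("]"), text.rfind("}"))
    return text[start : end + 1] if end > start else text[start:]


# A reply that wraps the JSON in a code fence, and one that surrounds it with prose.
fenced_reply = "Here is the result:\n" + FENCE + 'json\n[{"name": "svc"}]\n' + FENCE
noisy_reply = 'Result: {"items": []} done'

print(extract_json_text(fenced_reply))             # [{"name": "svc"}]
print(json.loads(extract_json_text(noisy_reply)))  # {'items': []}
```

Text without any bracket is returned unchanged, so the `json.loads` call in `parse_ai_items` is what ultimately rejects non-JSON output.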

+ 635 - 0
backend/app/automation_service.py

@@ -0,0 +1,635 @@
+from __future__ import annotations
+
+import base64
+import json
+import mimetypes
+import time
+from pathlib import Path
+from typing import Any
+
+import psutil
+from fastapi import HTTPException
+
+from . import ai_service, windows_automation
+from .database import DATA_DIR, get_db
+from .scanner import now_iso
+from .schemas import (
+    AutomationKeyboardActionRequest,
+    AutomationMouseActionRequest,
+    AutomationStartProgramRequest,
+    AutomationTextInputRequest,
+    AutomationVisionAnalyzeRequest,
+    AutomationWorkflowSaveRequest,
+)
+
+
+AUTOMATION_DIR = DATA_DIR / "automation"
+SCREEN_DIR = AUTOMATION_DIR / "screens"
+ERROR_DIR = AUTOMATION_DIR / "errors"
+RUNTIME_DIR = AUTOMATION_DIR / "runtime"
+OPENED_PROCESS_IDS: set[int] = set()
+
+SCREEN_ANALYZE_PROMPT = """请作为 AI 视觉自动化助手分析这张 Windows 屏幕截图,并严格只输出 JSON 对象。
+
+输出字段:
+- interface_name:界面名称,简洁中文。
+- description:界面描述,说明当前主要窗口或桌面内容。
+- is_windows_desktop:boolean,截图是否处于 Windows 桌面。
+- is_browser_webpage:boolean,截图是否为浏览器中的网页。
+- elements:可操作元素数组。
+
+元素字段:
+- name:元素名称。
+- x_percent:元素中心点 X 相对整张截图宽度的百分比,范围 0-100,可以保留 2 位小数。
+- y_percent:元素中心点 Y 相对整张截图高度的百分比,范围 0-100,可以保留 2 位小数。
+
+判断规则:
+1. 如果截图位于 Windows 桌面,请识别桌面图标、开始菜单入口、任务栏应用、托盘区域等可操作元素。
+2. 如果不是 Windows 桌面,也就是存在打开的前台窗口或全屏界面,只识别该前台窗口内的可操作元素,不要识别被遮挡的桌面元素。
+3. 不要输出 Markdown,不要解释,只输出 JSON。
+"""
+
+SCREEN_COMPARE_PROMPT = """请作为 AI 视觉自动化校验器判断两张截图是否处于同一个目标界面。
+
+图片1是当前实际屏幕截图。图片2是数据库中保存的目标界面截图。
+目标界面描述如下:
+{description}
+
+请严格只输出 JSON 对象,字段为:
+- is_match:boolean,图片1是否仍然处于目标界面。
+- similarity:0 到 1 的数值,表示相似度。
+- reason:简短中文原因。
+
+判断时可以允许小的光标位置、时间、列表内容滚动或轻微刷新差异,但如果前台窗口、网页、弹窗、主要页面或应用已经不同,应返回 false。
+"""
+
+
+def ensure_dirs() -> None:
+    """Ensure the automation screenshot, error screenshot, and runtime directories exist."""
+    for path in [SCREEN_DIR, ERROR_DIR, RUNTIME_DIR]:
+        path.mkdir(parents=True, exist_ok=True)
+
+
+def image_to_base64(path: str | Path) -> dict[str, str]:
+    """Read an image file and convert it to the base64 structure accepted by the AI service."""
+    file_path = Path(path)
+    mime_type = mimetypes.guess_type(file_path.name)[0] or "image/png"
+    return {
+        "base64": base64.b64encode(file_path.read_bytes()).decode("ascii"),
+        "mime_type": mime_type,
+    }
+
+
+def json_from_ai(content: str) -> dict[str, Any]:
+    """Extract a JSON object from AI output, tolerating models that wrongly wrap it in a code fence."""
+    parsed = json.loads(ai_service.extract_json_text(content))
+    if not isinstance(parsed, dict):
+        raise ValueError("AI output must be a JSON object")
+    return parsed
+
+
+def take_screenshot_file(folder: Path, prefix: str) -> dict[str, Any]:
+    """Capture the current screen to a PNG file and also return its base64 data and resolution."""
+    ensure_dirs()
+    filename = f"{prefix}_{int(time.time() * 1000)}.png"
+    path = folder / filename
+    result = windows_automation.take_screenshot(str(path), include_base64=True)
+    result["path"] = str(path)
+    return result
+
+
+def analyze_screen(payload: AutomationVisionAnalyzeRequest) -> dict[str, Any]:
+    """Capture the current screen, ask the AI to identify the interface and its actionable elements, and save the result."""
+    screenshot = take_screenshot_file(SCREEN_DIR, "screen")
+    image = image_to_base64(screenshot["path"])
+    ai_result = ai_service.chat_with_images(
+        payload.provider_id,
+        payload.model_id,
+        SCREEN_ANALYZE_PROMPT,
+        [image],
+        payload.temperature,
+    )
+    try:
+        parsed = json_from_ai(ai_result["content"])
+    except (json.JSONDecodeError, ValueError) as exc:
+        raise HTTPException(status_code=502, detail=f"AI vision output is not valid JSON: {exc}") from exc
+
+    width = int(screenshot["width"])
+    height = int(screenshot["height"])
+    elements = normalize_elements(parsed.get("elements"), width, height)
+    now = now_iso()
+    with get_db() as conn:
+        cursor = conn.execute(
+            """
+            INSERT INTO automation_screens (
+                interface_name, description, image_path, width, height,
+                is_windows_desktop, is_browser_webpage, raw_ai_json, created_at, updated_at
+            )
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            (
+                str(parsed.get("interface_name") or "未命名界面")[:160],
+                parsed.get("description"),
+                screenshot["path"],
+                width,
+                height,
+                1 if bool(parsed.get("is_windows_desktop")) else 0,
+                1 if bool(parsed.get("is_browser_webpage")) else 0,
+                json.dumps(parsed, ensure_ascii=False),
+                now,
+                now,
+            ),
+        )
+        screen_id = cursor.lastrowid
+        for index, element in enumerate(elements, start=1):
+            conn.execute(
+                """
+                INSERT INTO automation_screen_elements (
+                    screen_id, element_index, name, x_percent, y_percent, x, y, raw_json, created_at
+                )
+                VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
+                """,
+                (
+                    screen_id,
+                    index,
+                    element["name"],
+                    element["x_percent"],
+                    element["y_percent"],
+                    element["x"],
+                    element["y"],
+                    json.dumps(element.get("raw") or element, ensure_ascii=False),
+                    now,
+                ),
+            )
+    detail = get_screen(screen_id)
+    detail["image_base64"] = screenshot["image_base64"]
+    detail["mime_type"] = screenshot["mime_type"]
+    detail["ai_raw_content"] = ai_result["content"]
+    return detail
+
+
+def normalize_elements(raw_elements: Any, width: int, height: int) -> list[dict[str, Any]]:
+    """Convert the percentage coordinates returned by the AI into screenshot pixel coordinates."""
+    if not isinstance(raw_elements, list):
+        return []
+    result = []
+    for item in raw_elements:
+        if not isinstance(item, dict):
+            continue
+        name = str(item.get("name") or f"元素 {len(result) + 1}")[:160]
+        x_percent = normalize_percent(item.get("x_percent"))
+        y_percent = normalize_percent(item.get("y_percent"))
+        x = round(width * x_percent / 100)
+        y = round(height * y_percent / 100)
+        result.append(
+            {
+                "name": name,
+                "x_percent": x_percent,
+                "y_percent": y_percent,
+                "x": max(0, min(width - 1, x)),
+                "y": max(0, min(height - 1, y)),
+                "raw": item,
+            }
+        )
+    return result
+
+
+def normalize_percent(value: Any) -> float:
+    """Normalize a percentage value, tolerating models that occasionally output a 0-1 fraction."""
+    try:
+        number = float(value)
+    except (TypeError, ValueError):
+        return 0.0
+    if 0 <= number <= 1:
+        number *= 100
+    return max(0.0, min(100.0, round(number, 2)))
+
+
+def list_screens(page: int, page_size: int) -> dict[str, Any]:
+    """Return a paginated list of recognized screens."""
+    offset = (page - 1) * page_size
+    with get_db() as conn:
+        total = conn.execute("SELECT COUNT(*) AS total FROM automation_screens").fetchone()["total"]
+        rows = conn.execute(
+            """
+            SELECT s.*, COUNT(e.id) AS element_count
+            FROM automation_screens s
+            LEFT JOIN automation_screen_elements e ON e.screen_id = s.id
+            GROUP BY s.id
+            ORDER BY s.created_at DESC
+            LIMIT ? OFFSET ?
+            """,
+            (page_size, offset),
+        ).fetchall()
+    return {"items": [public_screen(row) for row in rows], "total": total, "page": page, "page_size": page_size}
+
+
+def get_screen(screen_id: int, include_image: bool = False) -> dict[str, Any]:
+    """Read the details and actionable elements of a single recognized screen."""
+    with get_db() as conn:
+        screen = conn.execute("SELECT * FROM automation_screens WHERE id = ?", (screen_id,)).fetchone()
+        if not screen:
+            raise HTTPException(status_code=404, detail="Automation screen not found")
+        elements = conn.execute(
+            "SELECT * FROM automation_screen_elements WHERE screen_id = ? ORDER BY element_index ASC",
+            (screen_id,),
+        ).fetchall()
+    item = public_screen(screen)
+    item["elements"] = [public_element(row) for row in elements]
+    if include_image and Path(item["image_path"]).exists():
+        image = image_to_base64(item["image_path"])
+        item["image_base64"] = image["base64"]
+        item["mime_type"] = image["mime_type"]
+    return item
+
+
+def delete_screen(screen_id: int) -> dict[str, Any]:
+    """Delete a recognized screen record; the image file is kept for auditing."""
+    with get_db() as conn:
+        cursor = conn.execute("DELETE FROM automation_screens WHERE id = ?", (screen_id,))
+    if cursor.rowcount == 0:
+        raise HTTPException(status_code=404, detail="Automation screen not found")
+    return {"deleted": cursor.rowcount}
+
+
+def public_screen(row: dict[str, Any]) -> dict[str, Any]:
+    """Convert a screen row from the database into the API response format."""
+    item = dict(row)
+    item["is_windows_desktop"] = bool(item.get("is_windows_desktop"))
+    item["is_browser_webpage"] = bool(item.get("is_browser_webpage"))
+    return item
+
+
+def public_element(row: dict[str, Any]) -> dict[str, Any]:
+    """Convert an element row from the database into the API response format."""
+    item = dict(row)
+    return item
+
+
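+def process_snapshot() -> dict[int, dict[str, Any]]:
+    """Snapshot the current processes for before/after comparison around an automation action; nothing is written to the process scan table."""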
+    snapshot: dict[int, dict[str, Any]] = {}
+    for proc in psutil.process_iter(["pid", "name", "exe"]):
+        try:
+            snapshot[int(proc.info["pid"])] = {
+                "pid": int(proc.info["pid"]),
+                "name": proc.info.get("name"),
+                "exe": proc.info.get("exe"),
+            }
+        except (psutil.Error, OSError, TypeError, ValueError):
+            continue
+    return snapshot
+
+
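+def diff_new_processes(before: dict[int, dict[str, Any]], after: dict[int, dict[str, Any]]) -> list[dict[str, Any]]:
+    """Compare before/after process snapshots, return the processes started by this action, and record their PIDs in OPENED_PROCESS_IDS."""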
+    new_items = [after[pid] for pid in sorted(set(after) - set(before))]
+    OPENED_PROCESS_IDS.update(item["pid"] for item in new_items)
+    return new_items
+
+
+def validate_screen_before_action(
+    screen_id: int | None,
+    provider_id: int | None,
+    model_id: int | None,
+    temperature: float,
+    action_type: str,
+    workflow_id: int | None = None,
+    node_id: int | None = None,
+) -> dict[str, Any] | None:
+    """If the action is bound to a screen ID, first ask the AI whether the current screen still matches the target screen."""
+    if screen_id is None:
+        return None
+    if provider_id is None or model_id is None:
+        raise HTTPException(status_code=400, detail="provider_id and model_id are required when screen_id is provided")
+
+    target = get_screen(screen_id)
+    current = take_screenshot_file(RUNTIME_DIR, "compare_current")
+    prompt = SCREEN_COMPARE_PROMPT.replace("{description}", target.get("description") or target.get("interface_name") or "")
+    ai_result = ai_service.chat_with_images(
+        provider_id,
+        model_id,
+        prompt,
+        [image_to_base64(current["path"]), image_to_base64(target["image_path"])],
+        temperature,
+    )
+    try:
+        parsed = json_from_ai(ai_result["content"])
+    except (json.JSONDecodeError, ValueError) as exc:
+        raise HTTPException(status_code=502, detail=f"AI compare output is not valid JSON: {exc}") from exc
+
+    is_match = bool(parsed.get("is_match"))
+    similarity = safe_float(parsed.get("similarity"))
+    if not is_match:
+        error = record_error(
+            action_type=action_type,
+            message=str(parsed.get("reason") or "界面对比失败,当前屏幕不是目标界面"),
+            screen_id=screen_id,
+            workflow_id=workflow_id,
+            node_id=node_id,
+            similarity=similarity,
+            expected_image_path=target["image_path"],
+            actual_image_path=current["path"],
+            compare_result=parsed,
+        )
+        raise HTTPException(status_code=409, detail={"message": error["message"], "error": error})
+    return parsed
+
+
+def safe_float(value: Any) -> float | None:
+    """Convert a value to float, returning None if it cannot be converted."""
+    try:
+        return float(value)
+    except (TypeError, ValueError):
+        return None
+
+
+def record_error(
+    action_type: str,
+    message: str,
+    screen_id: int | None = None,
+    workflow_id: int | None = None,
+    node_id: int | None = None,
+    similarity: float | None = None,
+    expected_image_path: str | None = None,
+    actual_image_path: str | None = None,
+    compare_result: dict[str, Any] | None = None,
+) -> dict[str, Any]:
+    """Save an automation error record so it can be reviewed later in the error log menu."""
+    now = now_iso()
+    with get_db() as conn:
+        cursor = conn.execute(
+            """
+            INSERT INTO automation_errors (
+                workflow_id, node_id, screen_id, action_type, message, similarity,
+                expected_image_path, actual_image_path, compare_result_json, created_at
+            )
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            (
+                workflow_id,
+                node_id,
+                screen_id,
+                action_type,
+                message,
+                similarity,
+                expected_image_path,
+                actual_image_path,
+                json.dumps(compare_result or {}, ensure_ascii=False),
+                now,
+            ),
+        )
+        row = conn.execute("SELECT * FROM automation_errors WHERE id = ?", (cursor.lastrowid,)).fetchone()
+    return public_error(row)
+
+
+def execute_mouse_action(payload: AutomationMouseActionRequest) -> dict[str, Any]:
+    """Run a mouse click action and record the processes that appear between the before and after snapshots."""
+    before = process_snapshot()
+    compare = validate_screen_before_action(
+        payload.screen_id,
+        payload.provider_id,
+        payload.model_id,
+        payload.temperature,
+        f"mouse_{payload.mouse_action}",
+        payload.workflow_id,
+        payload.node_id,
+    )
+    action_map = {"click": "click", "double_click": "double_click", "right_click": "right_click"}
+    result = windows_automation.mouse_action(action_map[payload.mouse_action], x=payload.x, y=payload.y)
+    time.sleep(0.5)
+    new_processes = diff_new_processes(before, process_snapshot())
+    return {"result": result, "compare": compare, "new_processes": new_processes}
+
+
+def execute_keyboard_action(payload: AutomationKeyboardActionRequest) -> dict[str, Any]:
+    """Run a key press or hotkey action and record the processes that appear between the before and after snapshots."""
+    before = process_snapshot()
+    compare = validate_screen_before_action(
+        payload.screen_id,
+        payload.provider_id,
+        payload.model_id,
+        payload.temperature,
+        "keyboard",
+        payload.workflow_id,
+        payload.node_id,
+    )
+    result = windows_automation.keyboard_action("hotkey" if len(payload.keys) > 1 else "press", key=payload.keys[0], keys=payload.keys)
+    time.sleep(0.5)
+    new_processes = diff_new_processes(before, process_snapshot())
+    return {"result": result, "compare": compare, "new_processes": new_processes}
+
+
+def execute_text_input(payload: AutomationTextInputRequest) -> dict[str, Any]:
+    """Paste text via the clipboard, avoiding unreliable Chinese input when keystrokes are simulated directly."""
+    before = process_snapshot()
+    compare = validate_screen_before_action(
+        payload.screen_id,
+        payload.provider_id,
+        payload.model_id,
+        payload.temperature,
+        "text_input",
+        payload.workflow_id,
+        payload.node_id,
+    )
+    try:
+        import pyperclip
+    except ImportError as exc:
+        raise HTTPException(status_code=500, detail="pyperclip is not installed") from exc
+    pyperclip.copy(payload.text)
+    result = windows_automation.keyboard_action("hotkey", keys=["ctrl", "v"])
+    time.sleep(0.5)
+    new_processes = diff_new_processes(before, process_snapshot())
+    return {"result": result, "compare": compare, "new_processes": new_processes}
+
+
+def execute_start_program(payload: AutomationStartProgramRequest) -> dict[str, Any]:
+    """Start a program and record the processes that appear afterwards as programs opened by this automation run."""
+    before = process_snapshot()
+    compare = validate_screen_before_action(
+        payload.screen_id,
+        payload.provider_id,
+        payload.model_id,
+        payload.temperature,
+        "start_program",
+        payload.workflow_id,
+        payload.node_id,
+    )
+    result = windows_automation.start_program(payload.command, payload.cwd, payload.shell)
+    time.sleep(1)
+    new_processes = diff_new_processes(before, process_snapshot())
+    if result.get("pid"):
+        OPENED_PROCESS_IDS.add(int(result["pid"]))
+    return {"result": result, "compare": compare, "new_processes": new_processes}
+
+
+def close_opened_programs(pids: list[int] | None = None) -> dict[str, Any]:
+    """Close the new processes recorded during this automation run."""
+    targets = sorted(set(pids or list(OPENED_PROCESS_IDS)))
+    closed = []
+    for pid in targets:
+        try:
+            closed.append(windows_automation.stop_program(pid=pid))
+            OPENED_PROCESS_IDS.discard(pid)
+        except HTTPException as exc:
+            closed.append({"pid": pid, "error": exc.detail})
+    return {"action": "close_opened_programs", "items": closed}
+
+
+def save_workflow(payload: AutomationWorkflowSaveRequest) -> dict[str, Any]:
+    """Save an automation workflow and its nodes, whether recorded by the frontend or edited by hand."""
+    now = now_iso()
+    raw_json = payload.model_dump()
+    with get_db() as conn:
+        cursor = conn.execute(
+            """
+            INSERT INTO automation_workflows (name, description, raw_json, created_at, updated_at)
+            VALUES (?, ?, ?, ?, ?)
+            """,
+            (payload.name.strip(), payload.description, json.dumps(raw_json, ensure_ascii=False), now, now),
+        )
+        workflow_id = cursor.lastrowid
+        insert_workflow_nodes(conn, workflow_id, payload.nodes, now)
+    return get_workflow(workflow_id)
+
+
+def update_workflow(workflow_id: int, payload: AutomationWorkflowSaveRequest) -> dict[str, Any]:
+    """Update a workflow's basic information and replace its node list."""
+    now = now_iso()
+    raw_json = payload.model_dump()
+    with get_db() as conn:
+        existing = conn.execute("SELECT id FROM automation_workflows WHERE id = ?", (workflow_id,)).fetchone()
+        if not existing:
+            raise HTTPException(status_code=404, detail="Automation workflow not found")
+        conn.execute(
+            """
+            UPDATE automation_workflows
+            SET name = ?, description = ?, raw_json = ?, updated_at = ?
+            WHERE id = ?
+            """,
+            (payload.name.strip(), payload.description, json.dumps(raw_json, ensure_ascii=False), now, workflow_id),
+        )
+        conn.execute("DELETE FROM automation_workflow_nodes WHERE workflow_id = ?", (workflow_id,))
+        insert_workflow_nodes(conn, workflow_id, payload.nodes, now)
+    return get_workflow(workflow_id)
+
+
+def insert_workflow_nodes(conn, workflow_id: int, nodes: list[Any], now: str) -> None:
+    """Insert the workflow's nodes in order, numbering them from 1."""
+    for index, node in enumerate(nodes, start=1):
+        conn.execute(
+            """
+            INSERT INTO automation_workflow_nodes (
+                workflow_id, node_index, node_type, screen_id, title, config_json, created_at, updated_at
+            )
+            VALUES (?, ?, ?, ?, ?, ?, ?, ?)
+            """,
+            (
+                workflow_id,
+                index,
+                node.node_type,
+                node.screen_id,
+                node.title,
+                json.dumps(node.config, ensure_ascii=False),
+                now,
+                now,
+            ),
+        )
+
+
+def list_workflows(page: int, page_size: int) -> dict[str, Any]:
+    """Return a paginated list of automation workflows."""
+    offset = (page - 1) * page_size
+    with get_db() as conn:
+        total = conn.execute("SELECT COUNT(*) AS total FROM automation_workflows").fetchone()["total"]
+        rows = conn.execute(
+            """
+            SELECT w.*, COUNT(n.id) AS node_count
+            FROM automation_workflows w
+            LEFT JOIN automation_workflow_nodes n ON n.workflow_id = w.id
+            GROUP BY w.id
+            ORDER BY w.updated_at DESC
+            LIMIT ? OFFSET ?
+            """,
+            (page_size, offset),
+        ).fetchall()
+    return {"items": [dict(row) for row in rows], "total": total, "page": page, "page_size": page_size}
+
+
+def get_workflow(workflow_id: int) -> dict[str, Any]:
+    """Read a workflow's details and its node list."""
+    with get_db() as conn:
+        workflow = conn.execute("SELECT * FROM automation_workflows WHERE id = ?", (workflow_id,)).fetchone()
+        if not workflow:
+            raise HTTPException(status_code=404, detail="Automation workflow not found")
+        nodes = conn.execute(
+            "SELECT * FROM automation_workflow_nodes WHERE workflow_id = ? ORDER BY node_index ASC",
+            (workflow_id,),
+        ).fetchall()
+    item = dict(workflow)
+    item["nodes"] = [public_node(row) for row in nodes]
+    return item
+
+
+def delete_workflow(workflow_id: int) -> dict[str, Any]:
+    """Delete a workflow and its nodes."""
+    with get_db() as conn:
+        cursor = conn.execute("DELETE FROM automation_workflows WHERE id = ?", (workflow_id,))
+    if cursor.rowcount == 0:
+        raise HTTPException(status_code=404, detail="Automation workflow not found")
+    return {"deleted": cursor.rowcount}
+
+
+def public_node(row: dict[str, Any]) -> dict[str, Any]:
+    """Convert a workflow node row into the API response format."""
+    item = dict(row)
+    try:
+        item["config"] = json.loads(item.pop("config_json") or "{}")
+    except json.JSONDecodeError:
+        item["config"] = {}
+    return item
+
+
+def list_errors(page: int, page_size: int) -> dict[str, Any]:
+    """Return a paginated list of automation error records."""
+    offset = (page - 1) * page_size
+    with get_db() as conn:
+        total = conn.execute("SELECT COUNT(*) AS total FROM automation_errors").fetchone()["total"]
+        rows = conn.execute(
+            """
+            SELECT e.*, s.interface_name
+            FROM automation_errors e
+            LEFT JOIN automation_screens s ON s.id = e.screen_id
+            ORDER BY e.created_at DESC
+            LIMIT ? OFFSET ?
+            """,
+            (page_size, offset),
+        ).fetchall()
+    return {"items": [public_error(row) for row in rows], "total": total, "page": page, "page_size": page_size}
+
+
+def get_error(error_id: int, include_images: bool = False) -> dict[str, Any]:
+    """Read a single automation error record, optionally attaching the expected and actual screenshots."""
+    with get_db() as conn:
+        row = conn.execute("SELECT * FROM automation_errors WHERE id = ?", (error_id,)).fetchone()
+    if not row:
+        raise HTTPException(status_code=404, detail="Automation error not found")
+    item = public_error(row)
+    if include_images:
+        for key in ["expected_image_path", "actual_image_path"]:
+            path = item.get(key)
+            if path and Path(path).exists():
+                image = image_to_base64(path)
+                item[key.replace("_path", "_base64")] = image["base64"]
+                item[key.replace("_path", "_mime_type")] = image["mime_type"]
+    return item
+
+
+def public_error(row: dict[str, Any]) -> dict[str, Any]:
+    """Convert an error record row into the API response format."""
+    item = dict(row)
+    try:
+        item["compare_result"] = json.loads(item.pop("compare_result_json") or "{}")
+    except json.JSONDecodeError:
+        item["compare_result"] = {}
+    return item
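
Review note on the coordinate handling in `normalize_elements` and `normalize_percent` above: the block below is a minimal standalone sketch of the same arithmetic, useful for checking edge cases. `normalize_percent` is copied from the diff; `percent_to_pixel` is a hypothetical helper name introduced here only to isolate the scale-and-clamp step that the diff performs inline.

```python
from typing import Any


def normalize_percent(value: Any) -> float:
    """Clamp a percentage to 0-100, scaling values in [0, 1] up by 100."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return 0.0
    if 0 <= number <= 1:
        # Values in [0, 1] are assumed to be fractions, not percentages.
        number *= 100
    return max(0.0, min(100.0, round(number, 2)))


def percent_to_pixel(percent: float, size: int) -> int:
    """Hypothetical helper: map a 0-100 percentage to a pixel index in [0, size - 1]."""
    return max(0, min(size - 1, round(size * percent / 100)))


if __name__ == "__main__":
    for raw in (50, 0.5, 1, "abc", 150, -5):
        print(repr(raw), "->", normalize_percent(raw))
    print(percent_to_pixel(100.0, 1920))  # clamped to the last valid column
```

One consequence worth keeping in mind: because anything in [0, 1] is treated as a fraction, a model that genuinely means 1% (or 0.5%) is read as 100% (or 50%). That trade-off seems reasonable for models that mix the two conventions, but elements very close to the top or left edge may land in the wrong place.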

+ 103 - 0
backend/app/database.py

@@ -123,6 +123,109 @@ def init_db() -> None:
 
             CREATE INDEX IF NOT EXISTS idx_item_tags_item
                 ON item_tags(item_type, item_id);
+
+            CREATE TABLE IF NOT EXISTS ai_providers (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                name TEXT NOT NULL UNIQUE,
+                provider_type TEXT NOT NULL CHECK(provider_type IN ('OPENAI', 'OPENAI_COMPATIBLE', 'GOOGLE_GEMINI')),
+                base_url TEXT,
+                api_key TEXT,
+                enabled INTEGER NOT NULL DEFAULT 1,
+                created_at TEXT NOT NULL,
+                updated_at TEXT NOT NULL
+            );
+
+            CREATE TABLE IF NOT EXISTS ai_models (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                provider_id INTEGER NOT NULL,
+                name TEXT NOT NULL,
+                display_name TEXT,
+                is_default INTEGER NOT NULL DEFAULT 0,
+                created_at TEXT NOT NULL,
+                updated_at TEXT NOT NULL,
+                UNIQUE(provider_id, name),
+                FOREIGN KEY(provider_id) REFERENCES ai_providers(id) ON DELETE CASCADE
+            );
+
+            CREATE INDEX IF NOT EXISTS idx_ai_models_provider
+                ON ai_models(provider_id);
+
+            CREATE TABLE IF NOT EXISTS automation_screens (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                interface_name TEXT NOT NULL,
+                description TEXT,
+                image_path TEXT NOT NULL,
+                width INTEGER NOT NULL,
+                height INTEGER NOT NULL,
+                is_windows_desktop INTEGER NOT NULL DEFAULT 0,
+                is_browser_webpage INTEGER NOT NULL DEFAULT 0,
+                raw_ai_json TEXT,
+                created_at TEXT NOT NULL,
+                updated_at TEXT NOT NULL
+            );
+
+            CREATE TABLE IF NOT EXISTS automation_screen_elements (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                screen_id INTEGER NOT NULL,
+                element_index INTEGER NOT NULL,
+                name TEXT NOT NULL,
+                x_percent REAL NOT NULL,
+                y_percent REAL NOT NULL,
+                x INTEGER NOT NULL,
+                y INTEGER NOT NULL,
+                raw_json TEXT,
+                created_at TEXT NOT NULL,
+                FOREIGN KEY(screen_id) REFERENCES automation_screens(id) ON DELETE CASCADE
+            );
+
+            CREATE INDEX IF NOT EXISTS idx_automation_screen_elements_screen
+                ON automation_screen_elements(screen_id, element_index);
+
+            CREATE TABLE IF NOT EXISTS automation_workflows (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                name TEXT NOT NULL,
+                description TEXT,
+                raw_json TEXT,
+                created_at TEXT NOT NULL,
+                updated_at TEXT NOT NULL
+            );
+
+            CREATE TABLE IF NOT EXISTS automation_workflow_nodes (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                workflow_id INTEGER NOT NULL,
+                node_index INTEGER NOT NULL,
+                node_type TEXT NOT NULL,
+                screen_id INTEGER,
+                title TEXT,
+                config_json TEXT NOT NULL,
+                created_at TEXT NOT NULL,
+                updated_at TEXT NOT NULL,
+                FOREIGN KEY(workflow_id) REFERENCES automation_workflows(id) ON DELETE CASCADE,
+                FOREIGN KEY(screen_id) REFERENCES automation_screens(id) ON DELETE SET NULL
+            );
+
+            CREATE INDEX IF NOT EXISTS idx_automation_workflow_nodes_workflow
+                ON automation_workflow_nodes(workflow_id, node_index);
+
+            CREATE TABLE IF NOT EXISTS automation_errors (
+                id INTEGER PRIMARY KEY AUTOINCREMENT,
+                workflow_id INTEGER,
+                node_id INTEGER,
+                screen_id INTEGER,
+                action_type TEXT,
+                message TEXT NOT NULL,
+                similarity REAL,
+                expected_image_path TEXT,
+                actual_image_path TEXT,
+                compare_result_json TEXT,
+                created_at TEXT NOT NULL,
+                FOREIGN KEY(workflow_id) REFERENCES automation_workflows(id) ON DELETE SET NULL,
+                FOREIGN KEY(node_id) REFERENCES automation_workflow_nodes(id) ON DELETE SET NULL,
+                FOREIGN KEY(screen_id) REFERENCES automation_screens(id) ON DELETE SET NULL
+            );
+
+            CREATE INDEX IF NOT EXISTS idx_automation_errors_created
+                ON automation_errors(created_at DESC);
             """
         )
         seed_default_tags(conn)
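
Review note on the `ON DELETE CASCADE` and `ON DELETE SET NULL` clauses in the schema above: SQLite only enforces foreign key actions on connections where `PRAGMA foreign_keys = ON` has been issued. Whether `get_db` does this is not visible in this diff, so the sketch below is only an illustration of the difference, using simplified, hypothetical table names (`workflows`, `workflow_nodes`) rather than the project's real tables.

```python
import sqlite3

# Simplified stand-in for automation_workflows / automation_workflow_nodes.
SCHEMA = """
CREATE TABLE workflows (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL
);
CREATE TABLE workflow_nodes (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    workflow_id INTEGER NOT NULL,
    FOREIGN KEY(workflow_id) REFERENCES workflows(id) ON DELETE CASCADE
);
"""


def remaining_nodes_after_delete(enforce_foreign_keys: bool) -> int:
    """Delete the parent row and report how many child rows are left."""
    conn = sqlite3.connect(":memory:")
    # The pragma must be set per connection and outside a transaction.
    conn.execute(f"PRAGMA foreign_keys = {'ON' if enforce_foreign_keys else 'OFF'}")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO workflows (name) VALUES ('demo')")
    conn.execute("INSERT INTO workflow_nodes (workflow_id) VALUES (1)")
    conn.execute("DELETE FROM workflows WHERE id = 1")
    count = conn.execute("SELECT COUNT(*) FROM workflow_nodes").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    print("with enforcement:", remaining_nodes_after_delete(True))
    print("without enforcement:", remaining_nodes_after_delete(False))
```

If the pragma is not enabled in `get_db`, functions such as `delete_workflow` and `delete_screen` would leave orphaned node and element rows behind, so it may be worth confirming before merging.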

+ 355 - 24
backend/app/main.py

@@ -7,6 +7,7 @@ from typing import Any
 from fastapi import FastAPI, HTTPException, Query
 from fastapi.middleware.cors import CORSMiddleware
 
+from . import ai_service, automation_service, windows_automation
 from .control import (
     CONFIRMED_CONTROL_STATUSES,
     restart_service,
@@ -18,7 +19,34 @@ from .control import (
 from .database import get_db, init_db
 from .scanner import now_iso, run_full_scan
 from .sensors import collect_sensors
-from .schemas import AiImportRequest, BatchStatusUpdate, PromptRequest, StatusUpdate, TagAssignRequest, TagCreate, TagUpdate
+from .schemas import (
+    AiAnalyzeRequest,
+    AiChatRequest,
+    AiImportRequest,
+    AiModelCreate,
+    AiModelUpdate,
+    AiProviderCreate,
+    AiProviderUpdate,
+    AutomationKeyboardRequest,
+    AutomationKeyboardActionRequest,
+    AutomationMouseRequest,
+    AutomationMouseActionRequest,
+    AutomationPowerRequest,
+    AutomationCloseProgramsRequest,
+    AutomationStartProgramRequest,
+    AutomationProgramStartRequest,
+    AutomationProgramStopRequest,
+    AutomationScreenshotRequest,
+    AutomationTextInputRequest,
+    AutomationVisionAnalyzeRequest,
+    AutomationWorkflowSaveRequest,
+    BatchStatusUpdate,
+    PromptRequest,
+    StatusUpdate,
+    TagAssignRequest,
+    TagCreate,
+    TagUpdate,
+)
 from .smart import collect_all_smart, get_device_smart, scan_devices
 
 
@@ -26,14 +54,14 @@ AI_PROMPT_TEMPLATE = """请作为资深的 Windows 系统安全专家,帮我
 
 输出要求:
 1. 必须且只能输出纯 JSON 数组,不要输出任何额外的解释、问候语,也不要使用 Markdown 代码块(如 ```json)包裹。
-2. 每个对象必须包含以下 7 个字段:type、name、description、judgement、risk_level、reason、suggestion。
+2. 每个对象必须包含以下 8 个字段:type、name、description、judgement、risk_level、reason、suggestion、tags。
 3. type 只能是 "service" 或 "process"。
 4. description 请简要说明该服务或进程的官方用途或常规功能(如果是未知/恶意程序,请描述其伪装意图或表现)。
 5. judgement 只能是 "TRUSTED"、"SUSPICIOUS"、"NEED_MORE_INFO"。
 6. risk_level 只能是 "LOW"、"MEDIUM"、"HIGH"。
 7. 如果提供的信息不足以做出判断,请将 judgement 设为 "NEED_MORE_INFO"。
-8. 请结合每个对象的 tags 字段进行判断。已有标签是人工上下文,不代表最终结论,但如果标签显示为“windows系统”或“本系统相关”,请在 reason 或 suggestion 中体现这一点。
-9. 如果你认为某个对象适合系统已有标签,可以在 suggestion 中建议使用对应标签名称;不要创造不存在的新标签。
+8. 待分析数据里的 tags 字段是当前已有标签上下文,不代表最终结论,但如果标签显示为“windows系统”或“本系统相关”,请在 reason 或 suggestion 中体现这一点。
+9. 输出对象里的 tags 字段必须是字符串数组,填写你建议系统最终绑定到该对象上的标签名称。可以使用系统已有标签,也可以在确有必要时给出新的短标签名称;标签名称应简洁稳定,不要把长句放入标签。
 
 JSON 格式示例:
 
@@ -45,7 +73,8 @@ JSON 格式示例:
     "judgement": "TRUSTED",
     "risk_level": "LOW",
     "reason": "这是 Microsoft 官方的安全组件,路径和名称符合系统原生服务的标准特征。",
-    "suggestion": "可标记为可信,建议保持运行。"
+    "suggestion": "可标记为可信,建议保持运行。",
+    "tags": ["windows系统"]
   },
   {
     "type": "process",
@@ -54,7 +83,8 @@ JSON 格式示例:
     "judgement": "SUSPICIOUS",
     "risk_level": "HIGH",
     "reason": "进程位于用户 AppData 临时目录,启动命令行异常,且缺少有效的官方数字签名。",
-    "suggestion": "建议立即隔离,检查文件的 SHA256 散列值及外部网络连接记录,不要直接运行或信任。"
+    "suggestion": "建议立即隔离,检查文件的 SHA256 散列值及外部网络连接记录,不要直接运行或信任。",
+    "tags": ["可疑程序"]
   }
 ]
 
@@ -377,33 +407,140 @@ def ensure_control_allowed(table: str, item_id: int) -> dict[str, Any]:
     return item
 
 
+def normalize_import_tag_names(tag_names: list[str] | None) -> list[str]:
+    if tag_names is None:
+        return []
+    normalized = []
+    seen = set()
+    for tag_name in tag_names:
+        name = str(tag_name).strip()[:80]
+        if not name or name in seen:
+            continue
+        seen.add(name)
+        normalized.append(name)
+    return normalized
+
+
+def ensure_tag_ids(conn, tag_names: list[str]) -> list[int]:
+    tag_ids = []
+    now = now_iso()
+    for tag_name in tag_names:
+        row = conn.execute("SELECT id FROM tags WHERE name = ?", (tag_name,)).fetchone()
+        if row:
+            tag_ids.append(row["id"])
+            continue
+        cursor = conn.execute(
+            """
+            INSERT INTO tags (name, description, is_controllable, is_builtin, created_at, updated_at)
+            VALUES (?, ?, 1, 0, ?, ?)
+            """,
+            (tag_name, "AI 自动新增标签", now, now),
+        )
+        tag_ids.append(cursor.lastrowid)
+    return tag_ids
+
+
+def replace_item_tags(conn, item_type: str, item_id: int, tag_ids: list[int]) -> None:
+    now = now_iso()
+    conn.execute("DELETE FROM item_tags WHERE item_type = ? AND item_id = ?", (item_type, item_id))
+    for tag_id in tag_ids:
+        conn.execute(
+            "INSERT INTO item_tags (item_type, item_id, tag_id, created_at) VALUES (?, ?, ?, ?)",
+            (item_type, item_id, tag_id, now),
+        )
+
+
 def import_ai_results(table: str, item_type: str, payload: AiImportRequest) -> dict[str, Any]:
     updated = 0
     with get_db() as conn:
         for item in payload.items:
             if item.type != item_type:
                 continue
-            cursor = conn.execute(
-                f"""
-                UPDATE {table}
-                SET confirm_status = ?, ai_description = ?, ai_reason = ?,
-                    ai_suggestion = ?, risk_level = ?, updated_at = ?
-                WHERE name = ?
-                """,
-                (
-                    item.judgement,
-                    item.description,
-                    item.reason,
-                    item.suggestion,
-                    item.risk_level,
-                    now_iso(),
-                    item.name,
-                ),
-            )
-            updated += cursor.rowcount
+            matched_rows = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (item.name,)).fetchall()
+            tag_ids = ensure_tag_ids(conn, normalize_import_tag_names(item.tags)) if item.tags is not None else None
+            for row in matched_rows:
+                cursor = conn.execute(
+                    f"""
+                    UPDATE {table}
+                    SET confirm_status = ?, ai_description = ?, ai_reason = ?,
+                        ai_suggestion = ?, risk_level = ?, updated_at = ?
+                    WHERE id = ?
+                    """,
+                    (
+                        item.judgement,
+                        item.description,
+                        item.reason,
+                        item.suggestion,
+                        item.risk_level,
+                        now_iso(),
+                        row["id"],
+                    ),
+                )
+                if tag_ids is not None:
+                    replace_item_tags(conn, item_type, row["id"], tag_ids)
+                updated += cursor.rowcount
     return {"updated": updated}
 
 
+def ai_update_preview(table: str, item_type: str, proposed_items: list[dict[str, Any]]) -> list[dict[str, Any]]:
+    names = [item["name"] for item in proposed_items if item.get("type") == item_type and item.get("name")]
+    if not names:
+        return []
+    placeholders = ",".join("?" for _ in names)
+    with get_db() as conn:
+        rows = conn.execute(
+            f"""
+            SELECT id, name, confirm_status, ai_description, ai_reason, ai_suggestion, risk_level
+            FROM {table}
+            WHERE name IN ({placeholders})
+            """,
+            names,
+        ).fetchall()
+        tag_map = tags_for_items(conn, item_type, [row["id"] for row in rows])
+    row_map = {}
+    for row in rows:
+        current = dict(row)
+        current["tags"] = [tag["name"] for tag in tag_map.get(row["id"], [])]
+        row_map[row["name"]] = current
+    preview = []
+    for item in proposed_items:
+        if item.get("type") != item_type:
+            continue
+        current = row_map.get(item.get("name"))
+        preview.append(
+            {
+                "matched": current is not None,
+                "current": current,
+                "proposed": item,
+            }
+        )
+    return preview
+
+
+def analyze_items_with_ai(table: str, item_type: str, payload: AiAnalyzeRequest) -> dict[str, Any]:
+    rows = rows_for_prompt(table, item_type, PromptRequest(scope=payload.scope, ids=payload.ids))
+    if not rows:
+        raise HTTPException(status_code=400, detail="No items available for AI analysis")
+    prompt_data = prompt_response(rows)
+    result = ai_service.chat(payload.provider_id, payload.model_id, prompt_data["prompt_text"], payload.temperature)
+    try:
+        parsed_items = ai_service.parse_ai_items(result["content"])
+    except (json.JSONDecodeError, ValueError) as exc:
+        raise HTTPException(
+            status_code=502,
+            detail=f"AI output is not valid import JSON: {exc}",
+        ) from exc
+    return {
+        "items": parsed_items,
+        "preview": ai_update_preview(table, item_type, parsed_items),
+        "raw_output": result["content"],
+        "provider": result["provider"],
+        "model": result["model"],
+        "prompt_text": prompt_data["prompt_text"],
+        "markdown_table": prompt_data["markdown_table"],
+    }
+
+
 @app.get("/api/dashboard")
 def dashboard() -> dict[str, Any]:
     with get_db() as conn:
@@ -493,6 +630,190 @@ def tag_delete(tag_id: int) -> dict[str, Any]:
     return {"deleted": cursor.rowcount}
 
 
+@app.get("/api/ai/providers")
+def ai_providers() -> dict[str, Any]:
+    return {"items": ai_service.list_providers()}
+
+
+@app.post("/api/ai/providers")
+def ai_provider_create(payload: AiProviderCreate) -> dict[str, Any]:
+    return ai_service.create_provider(payload)
+
+
+@app.patch("/api/ai/providers/{provider_id}")
+def ai_provider_update(provider_id: int, payload: AiProviderUpdate) -> dict[str, Any]:
+    return ai_service.update_provider(provider_id, payload)
+
+
+@app.delete("/api/ai/providers/{provider_id}")
+def ai_provider_delete(provider_id: int) -> dict[str, Any]:
+    return ai_service.delete_provider(provider_id)
+
+
+@app.get("/api/ai/models")
+def ai_models(provider_id: int | None = None) -> dict[str, Any]:
+    return {"items": ai_service.list_models(provider_id)}
+
+
+@app.post("/api/ai/models")
+def ai_model_create(payload: AiModelCreate) -> dict[str, Any]:
+    return ai_service.create_model(payload)
+
+
+@app.patch("/api/ai/models/{model_id}")
+def ai_model_update(model_id: int, payload: AiModelUpdate) -> dict[str, Any]:
+    return ai_service.update_model(model_id, payload)
+
+
+@app.delete("/api/ai/models/{model_id}")
+def ai_model_delete(model_id: int) -> dict[str, Any]:
+    return ai_service.delete_model(model_id)
+
+
+@app.post("/api/ai/test")
+def ai_test(payload: AiChatRequest) -> dict[str, Any]:
+    return ai_service.chat(payload.provider_id, payload.model_id, payload.prompt, payload.temperature)
+
+
+@app.post("/api/automation/power/shutdown")
+def automation_shutdown(payload: AutomationPowerRequest) -> dict[str, Any]:
+    return windows_automation.shutdown_windows(payload.delay_seconds, payload.force, payload.reason)
+
+
+@app.post("/api/automation/power/restart")
+def automation_restart(payload: AutomationPowerRequest) -> dict[str, Any]:
+    return windows_automation.restart_windows(payload.delay_seconds, payload.force, payload.reason)
+
+
+@app.post("/api/automation/power/cancel")
+def automation_power_cancel() -> dict[str, Any]:
+    return windows_automation.cancel_power_action()
+
+
+@app.post("/api/automation/programs/start")
+def automation_program_start(payload: AutomationProgramStartRequest) -> dict[str, Any]:
+    return windows_automation.start_program(payload.command, payload.cwd, payload.shell)
+
+
+@app.post("/api/automation/programs/stop")
+def automation_program_stop(payload: AutomationProgramStopRequest) -> dict[str, Any]:
+    return windows_automation.stop_program(
+        pid=payload.pid,
+        name=payload.name,
+        timeout_seconds=payload.timeout_seconds,
+        kill_after_timeout=payload.kill_after_timeout,
+    )
+
+
+@app.post("/api/automation/screenshot")
+def automation_screenshot(payload: AutomationScreenshotRequest) -> dict[str, Any]:
+    return windows_automation.take_screenshot(payload.save_path, payload.include_base64)
+
+
+@app.post("/api/automation/mouse")
+def automation_mouse(payload: AutomationMouseRequest) -> dict[str, Any]:
+    return windows_automation.mouse_action(
+        action=payload.action,
+        x=payload.x,
+        y=payload.y,
+        duration=payload.duration,
+        button=payload.button,
+        clicks=payload.clicks,
+        amount=payload.amount,
+    )
+
+
+@app.post("/api/automation/keyboard")
+def automation_keyboard(payload: AutomationKeyboardRequest) -> dict[str, Any]:
+    return windows_automation.keyboard_action(
+        action=payload.action,
+        key=payload.key,
+        keys=payload.keys,
+        text=payload.text,
+        interval=payload.interval,
+    )
+
+
+@app.post("/api/automation/vision/analyze")
+def automation_vision_analyze(payload: AutomationVisionAnalyzeRequest) -> dict[str, Any]:
+    return automation_service.analyze_screen(payload)
+
+
+@app.post("/api/automation/actions/mouse")
+def automation_action_mouse(payload: AutomationMouseActionRequest) -> dict[str, Any]:
+    return automation_service.execute_mouse_action(payload)
+
+
+@app.post("/api/automation/actions/keyboard")
+def automation_action_keyboard(payload: AutomationKeyboardActionRequest) -> dict[str, Any]:
+    return automation_service.execute_keyboard_action(payload)
+
+
+@app.post("/api/automation/actions/text-input")
+def automation_action_text_input(payload: AutomationTextInputRequest) -> dict[str, Any]:
+    return automation_service.execute_text_input(payload)
+
+
+@app.post("/api/automation/actions/start-program")
+def automation_action_start_program(payload: AutomationStartProgramRequest) -> dict[str, Any]:
+    return automation_service.execute_start_program(payload)
+
+
+@app.post("/api/automation/actions/close-opened-programs")
+def automation_action_close_opened_programs(payload: AutomationCloseProgramsRequest) -> dict[str, Any]:
+    return automation_service.close_opened_programs(payload.pids)
+
+
+@app.get("/api/automation/workflows")
+def automation_workflows(page: int = Query(default=1, ge=1), page_size: int = Query(default=20, ge=1, le=200)) -> dict[str, Any]:
+    return automation_service.list_workflows(page, page_size)
+
+
+@app.post("/api/automation/workflows")
+def automation_workflow_create(payload: AutomationWorkflowSaveRequest) -> dict[str, Any]:
+    return automation_service.save_workflow(payload)
+
+
+@app.get("/api/automation/workflows/{workflow_id}")
+def automation_workflow_detail(workflow_id: int) -> dict[str, Any]:
+    return automation_service.get_workflow(workflow_id)
+
+
+@app.put("/api/automation/workflows/{workflow_id}")
+def automation_workflow_update(workflow_id: int, payload: AutomationWorkflowSaveRequest) -> dict[str, Any]:
+    return automation_service.update_workflow(workflow_id, payload)
+
+
+@app.delete("/api/automation/workflows/{workflow_id}")
+def automation_workflow_delete(workflow_id: int) -> dict[str, Any]:
+    return automation_service.delete_workflow(workflow_id)
+
+
+@app.get("/api/automation/screens")
+def automation_screens(page: int = Query(default=1, ge=1), page_size: int = Query(default=20, ge=1, le=200)) -> dict[str, Any]:
+    return automation_service.list_screens(page, page_size)
+
+
+@app.get("/api/automation/screens/{screen_id}")
+def automation_screen_detail(screen_id: int, include_image: bool = False) -> dict[str, Any]:
+    return automation_service.get_screen(screen_id, include_image)
+
+
+@app.delete("/api/automation/screens/{screen_id}")
+def automation_screen_delete(screen_id: int) -> dict[str, Any]:
+    return automation_service.delete_screen(screen_id)
+
+
+@app.get("/api/automation/errors")
+def automation_errors(page: int = Query(default=1, ge=1), page_size: int = Query(default=20, ge=1, le=200)) -> dict[str, Any]:
+    return automation_service.list_errors(page, page_size)
+
+
+@app.get("/api/automation/errors/{error_id}")
+def automation_error_detail(error_id: int, include_images: bool = False) -> dict[str, Any]:
+    return automation_service.get_error(error_id, include_images)
+
+
 @app.get("/api/sensors")
 def sensors() -> dict[str, Any]:
     return collect_sensors()
@@ -572,6 +893,11 @@ def service_import_ai(payload: AiImportRequest) -> dict[str, Any]:
     return import_ai_results("windows_services", "service", payload)
 
 
+@app.post("/api/services/analyze-ai")
+def service_analyze_ai(payload: AiAnalyzeRequest) -> dict[str, Any]:
+    return analyze_items_with_ai("windows_services", "service", payload)
+
+
 @app.post("/api/services/ai-prompt")
 def service_ai_prompt(payload: PromptRequest) -> dict[str, Any]:
     return prompt_response(rows_for_prompt("windows_services", "service", payload))
@@ -667,6 +993,11 @@ def process_import_ai(payload: AiImportRequest) -> dict[str, Any]:
     return import_ai_results("windows_processes", "process", payload)
 
 
+@app.post("/api/processes/analyze-ai")
+def process_analyze_ai(payload: AiAnalyzeRequest) -> dict[str, Any]:
+    return analyze_items_with_ai("windows_processes", "process", payload)
+
+
 @app.post("/api/processes/ai-prompt")
 def process_ai_prompt(payload: PromptRequest) -> dict[str, Any]:
     return prompt_response(rows_for_prompt("windows_processes", "process", payload))
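
Reviewer sketch (not part of the change set): a minimal illustration of how a client might address two of the routes added above, using only the standard library. The base URL `http://127.0.0.1:8000` and the helper names `build_post` and `build_get` are assumptions made for this example. The payload fields follow `AiChatRequest`, and the query parameters follow the `page` / `page_size` arguments of the workflow listing route. The requests are only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the backend listens on this address; adjust to the real deployment.
BASE_URL = "http://127.0.0.1:8000"


def build_post(path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST request for the given API path."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get(path: str, params: dict) -> Request:
    """Build (but do not send) a GET request with URL-encoded query parameters."""
    return Request(f"{BASE_URL}{path}?{urlencode(params)}", method="GET")


# Field names mirror AiChatRequest: provider_id, model_id, prompt, temperature.
ai_test_request = build_post(
    "/api/ai/test",
    {"provider_id": 1, "model_id": 1, "prompt": "ping", "temperature": 0.2},
)

# The listing route accepts page >= 1 and 1 <= page_size <= 200.
workflows_request = build_get("/api/automation/workflows", {"page": 1, "page_size": 20})
```

Passing either object to `urllib.request.urlopen` would send it; that step is left out so the sketch has no side effects.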

+ 164 - 2
backend/app/schemas.py

@@ -1,12 +1,16 @@
 from __future__ import annotations
 
-from typing import Literal
+from typing import Any, Literal
 
-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, field_validator
 
 
 ConfirmStatus = Literal["PENDING", "TRUSTED", "SUSPICIOUS", "IGNORED", "NEED_MORE_INFO"]
 ItemType = Literal["service", "process"]
+AiProviderType = Literal["OPENAI", "OPENAI_COMPATIBLE", "GOOGLE_GEMINI"]
+MouseAutomationAction = Literal["move_to", "move_rel", "click", "double_click", "right_click", "drag_to", "scroll"]
+KeyboardAutomationAction = Literal["press", "hotkey", "write", "key_down", "key_up"]
+AutomationNodeType = Literal["mouse", "keyboard", "text_input", "start_program", "close_programs"]
 
 
 class StatusUpdate(BaseModel):
@@ -28,6 +32,24 @@ class AiImportItem(BaseModel):
     risk_level: Literal["LOW", "MEDIUM", "HIGH"]
     reason: str | None = None
     suggestion: str | None = None
+    tags: list[str] | None = None
+
+    @field_validator("tags", mode="before")
+    @classmethod
+    def normalize_tags(cls, value: Any) -> list[str] | None:
+        if value is None:
+            return None
+        if isinstance(value, str):
+            return [value]
+        if isinstance(value, list):
+            names: list[str] = []
+            for item in value:
+                if isinstance(item, str):
+                    names.append(item)
+                elif isinstance(item, dict) and item.get("name"):
+                    names.append(str(item["name"]))
+            return names
+        raise ValueError("tags must be a string, list of strings, or list of objects with name")
 
 
 class AiImportRequest(BaseModel):
@@ -58,3 +80,143 @@ class TagAssignRequest(BaseModel):
 class ProcessStartRequest(BaseModel):
     command: str = Field(min_length=1)
     cwd: str | None = None
+
+
+class AutomationPowerRequest(BaseModel):
+    delay_seconds: int = Field(default=0, ge=0, le=86400)
+    force: bool = False
+    reason: str | None = Field(default=None, max_length=512)
+
+
+class AutomationProgramStartRequest(BaseModel):
+    command: str = Field(min_length=1)
+    cwd: str | None = None
+    shell: bool = True
+
+
+class AutomationProgramStopRequest(BaseModel):
+    pid: int | None = Field(default=None, ge=0)
+    name: str | None = Field(default=None, min_length=1)
+    timeout_seconds: float = Field(default=8, ge=0, le=60)
+    kill_after_timeout: bool = True
+
+
+class AutomationScreenshotRequest(BaseModel):
+    save_path: str | None = None
+    include_base64: bool = True
+
+
+class AutomationMouseRequest(BaseModel):
+    action: MouseAutomationAction
+    x: int | None = None
+    y: int | None = None
+    duration: float = Field(default=0, ge=0, le=60)
+    button: Literal["left", "middle", "right"] = "left"
+    clicks: int = Field(default=1, ge=1, le=20)
+    amount: int = 0
+
+
+class AutomationKeyboardRequest(BaseModel):
+    action: KeyboardAutomationAction
+    key: str | None = None
+    keys: list[str] | None = None
+    text: str | None = None
+    interval: float = Field(default=0, ge=0, le=10)
+
+
+class AutomationVisionAnalyzeRequest(BaseModel):
+    provider_id: int
+    model_id: int
+    temperature: float = Field(default=0.1, ge=0, le=2)
+
+
+class AutomationActionBase(BaseModel):
+    screen_id: int | None = None
+    provider_id: int | None = None
+    model_id: int | None = None
+    temperature: float = Field(default=0.1, ge=0, le=2)
+    workflow_id: int | None = None
+    node_id: int | None = None
+
+
+class AutomationMouseActionRequest(AutomationActionBase):
+    x: int
+    y: int
+    mouse_action: Literal["click", "double_click", "right_click"]
+
+
+class AutomationKeyboardActionRequest(AutomationActionBase):
+    keys: list[str] = Field(min_length=1)
+
+
+class AutomationTextInputRequest(AutomationActionBase):
+    text: str
+
+
+class AutomationStartProgramRequest(AutomationActionBase):
+    command: str = Field(min_length=1)
+    cwd: str | None = None
+    shell: bool = True
+
+
+class AutomationCloseProgramsRequest(BaseModel):
+    pids: list[int] | None = None
+
+
+class AutomationWorkflowNode(BaseModel):
+    node_type: AutomationNodeType
+    screen_id: int | None = None
+    title: str | None = None
+    config: dict[str, Any] = Field(default_factory=dict)
+
+
+class AutomationWorkflowSaveRequest(BaseModel):
+    name: str = Field(min_length=1, max_length=160)
+    description: str | None = None
+    nodes: list[AutomationWorkflowNode] = Field(default_factory=list)
+
+
+class AiProviderCreate(BaseModel):
+    name: str = Field(min_length=1, max_length=120)
+    provider_type: AiProviderType
+    base_url: str | None = None
+    api_key: str | None = None
+    enabled: bool = True
+
+
+class AiProviderUpdate(BaseModel):
+    name: str = Field(min_length=1, max_length=120)
+    provider_type: AiProviderType
+    base_url: str | None = None
+    api_key: str | None = None
+    clear_api_key: bool = False
+    enabled: bool = True
+
+
+class AiModelCreate(BaseModel):
+    provider_id: int
+    name: str = Field(min_length=1, max_length=160)
+    display_name: str | None = None
+    is_default: bool = False
+
+
+class AiModelUpdate(BaseModel):
+    provider_id: int
+    name: str = Field(min_length=1, max_length=160)
+    display_name: str | None = None
+    is_default: bool = False
+
+
+class AiChatRequest(BaseModel):
+    provider_id: int
+    model_id: int
+    prompt: str = Field(min_length=1)
+    temperature: float = Field(default=0.2, ge=0, le=2)
+
+
+class AiAnalyzeRequest(BaseModel):
+    provider_id: int
+    model_id: int
+    temperature: float = Field(default=0.2, ge=0, le=2)
+    ids: list[int] | None = None
+    scope: Literal["selected", "pending"] = "pending"
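
Reviewer sketch (not part of the change set): the `normalize_tags` validator on `AiImportItem` accepts several input shapes. The standalone function below restates the same branching without pydantic so the accepted shapes can be checked in isolation. It is an illustration of the logic shown in the diff, not a replacement for the validator.

```python
from typing import Any, Optional


def normalize_tags(value: Any) -> Optional[list[str]]:
    """Standalone restatement of the validator logic shown in the diff."""
    if value is None:
        return None
    if isinstance(value, str):
        # A single string becomes a one-element list.
        return [value]
    if isinstance(value, list):
        names: list[str] = []
        for item in value:
            if isinstance(item, str):
                names.append(item)
            elif isinstance(item, dict) and item.get("name"):
                # Objects contribute their "name" field.
                names.append(str(item["name"]))
            # Any other element (numbers, objects without a name) is silently skipped.
        return names
    raise ValueError("tags must be a string, list of strings, or list of objects with name")


examples = {
    "single string": normalize_tags("system"),
    "mixed list": normalize_tags(["driver", {"name": "vendor"}, {"id": 3}, 7]),
    "missing": normalize_tags(None),
}
```

Note that unsupported list elements are dropped rather than rejected, so a list containing only invalid entries normalizes to an empty list.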

+ 258 - 0
backend/app/windows_automation.py

@@ -0,0 +1,258 @@
+from __future__ import annotations
+
+import base64
+import locale
+import os
+import subprocess
+from pathlib import Path
+from typing import Any, Literal
+
+import psutil
+from fastapi import HTTPException
+
+
+MouseAction = Literal["move_to", "move_rel", "click", "double_click", "right_click", "drag_to", "scroll"]
+KeyboardAction = Literal["press", "hotkey", "write", "key_down", "key_up"]
+
+
+def hidden_creationflags() -> int:
+    """Return the process creation flags that hide the console window on Windows."""
+    if os.name != "nt":
+        return 0
+    return subprocess.CREATE_NO_WINDOW
+
+
+def command_encoding() -> str:
+    """Return the system's preferred encoding for decoding command output, to avoid garbled text on localized (e.g. Chinese) Windows."""
+    return locale.getpreferredencoding(False) or "utf-8"
+
+
+def ensure_windows() -> None:
+    """Ensure the current platform is Windows; system power actions are only allowed on Windows."""
+    if os.name != "nt":
+        raise HTTPException(status_code=400, detail="Windows automation is only available on Windows")
+
+
+def load_pyautogui():
+    """Import pyautogui on demand so a missing dependency does not prevent the other backend endpoints from starting."""
+    try:
+        import pyautogui
+    except ImportError as exc:
+        raise HTTPException(
+            status_code=500,
+            detail="pyautogui is not installed. Run pip install -r backend/requirements.txt",
+        ) from exc
+    pyautogui.FAILSAFE = True
+    return pyautogui
+
+
+def run_shutdown_command(args: list[str], timeout: int = 10) -> dict[str, Any]:
+    """Run shutdown.exe with the given arguments and return its output in a uniform shape."""
+    ensure_windows()
+    result = subprocess.run(
+        ["shutdown.exe", *args],
+        capture_output=True,
+        text=True,
+        encoding=command_encoding(),
+        errors="replace",
+        timeout=timeout,
+        creationflags=hidden_creationflags(),
+        check=False,
+    )
+    output = "\n".join(part for part in [result.stdout.strip(), result.stderr.strip()] if part)
+    if result.returncode != 0:
+        raise HTTPException(status_code=500, detail=output or f"shutdown.exe exited with {result.returncode}")
+    return {"returncode": result.returncode, "output": output}
+
+
+def shutdown_windows(delay_seconds: int = 0, force: bool = False, reason: str | None = None) -> dict[str, Any]:
+    """Shut down Windows, with an optional delay in seconds and optional forced closing of running programs."""
+    args = ["/s", "/t", str(delay_seconds)]
+    if force:
+        args.append("/f")
+    if reason:
+        args.extend(["/c", reason[:512]])
+    result = run_shutdown_command(args)
+    return {"action": "shutdown", "delay_seconds": delay_seconds, "force": force, **result}
+
+
+def restart_windows(delay_seconds: int = 0, force: bool = False, reason: str | None = None) -> dict[str, Any]:
+    """Restart Windows, with an optional delay in seconds and optional forced closing of running programs."""
+    args = ["/r", "/t", str(delay_seconds)]
+    if force:
+        args.append("/f")
+    if reason:
+        args.extend(["/c", reason[:512]])
+    result = run_shutdown_command(args)
+    return {"action": "restart", "delay_seconds": delay_seconds, "force": force, **result}
+
+
+def cancel_power_action() -> dict[str, Any]:
+    """Cancel a shutdown or restart that has been scheduled but has not yet run."""
+    result = run_shutdown_command(["/a"])
+    return {"action": "cancel_power_action", **result}
+
+
+def start_program(command: str, cwd: str | None = None, shell: bool = True) -> dict[str, Any]:
+    """Start a program or command and return the new process PID so later automation steps can track it."""
+    working_dir = cwd if cwd and os.path.isdir(cwd) else None
+    try:
+        proc = subprocess.Popen(
+            command,
+            cwd=working_dir,
+            shell=shell,
+            creationflags=hidden_creationflags(),
+        )
+    except OSError as exc:
+        raise HTTPException(status_code=500, detail=str(exc)) from exc
+    return {"action": "start_program", "pid": proc.pid, "command": command, "cwd": working_dir}
+
+
+def stop_program(pid: int | None = None, name: str | None = None, timeout_seconds: float = 8, kill_after_timeout: bool = True) -> dict[str, Any]:
+    """Stop programs by PID or process name; terminate gracefully first, then optionally kill after the timeout."""
+    processes = find_processes(pid=pid, name=name)
+    if not processes:
+        raise HTTPException(status_code=404, detail="No matching process found")
+
+    stopped: list[dict[str, Any]] = []
+    for proc in processes:
+        item: dict[str, Any] = {"pid": proc.pid, "name": safe_proc_name(proc)}
+        try:
+            proc.terminate()
+            try:
+                proc.wait(timeout=timeout_seconds)
+                item["stopped_by"] = "terminate"
+            except psutil.TimeoutExpired:
+                if kill_after_timeout:
+                    # Escalate inside the outer try so that errors raised by
+                    # kill()/wait() are reported per item instead of aborting the loop.
+                    proc.kill()
+                    proc.wait(timeout=5)
+                    item["stopped_by"] = "kill"
+                else:
+                    item["stopped_by"] = None
+                    item["error"] = "terminate timeout"
+        except psutil.TimeoutExpired:
+            # The process survived kill() for the extra 5 seconds.
+            item["stopped_by"] = None
+            item["error"] = "kill timeout"
+        except psutil.NoSuchProcess:
+            item["already_stopped"] = True
+        except psutil.AccessDenied as exc:
+            item["error"] = f"access denied: {exc}"
+        stopped.append(item)
+    return {"action": "stop_program", "matched": len(processes), "items": stopped}
+
+
+def find_processes(pid: int | None = None, name: str | None = None) -> list[psutil.Process]:
+    """Find processes by PID or process name; shared by stop_program and similar actions."""
+    if pid is None and not name:
+        raise HTTPException(status_code=400, detail="pid or name is required")
+    if pid is not None:
+        try:
+            return [psutil.Process(pid)]
+        except psutil.NoSuchProcess:
+            return []
+        except psutil.AccessDenied as exc:
+            raise HTTPException(status_code=403, detail=f"Access denied: {exc}") from exc
+
+    target = (name or "").lower()
+    matched = []
+    for proc in psutil.process_iter(["name"]):
+        proc_name = (proc.info.get("name") or "").lower()
+        if proc_name == target:
+            matched.append(proc)
+    return matched
+
+
+def safe_proc_name(proc: psutil.Process) -> str | None:
+    """Read the process name safely so a vanished process or missing permission does not interrupt the automation flow."""
+    try:
+        return proc.name()
+    except (psutil.Error, OSError):
+        return None
+
+
+def take_screenshot(save_path: str | None = None, include_base64: bool = True) -> dict[str, Any]:
+    """Capture the current screen; optionally save it as a PNG file and/or return it as base64 for direct preview via the API."""
+    pyautogui = load_pyautogui()
+    image = pyautogui.screenshot()
+    width, height = image.size
+
+    result: dict[str, Any] = {"action": "screenshot", "width": width, "height": height}
+    if save_path:
+        path = Path(save_path).expanduser().resolve()
+        path.parent.mkdir(parents=True, exist_ok=True)
+        image.save(path, format="PNG")
+        result["path"] = str(path)
+
+    if include_base64:
+        from io import BytesIO
+
+        buffer = BytesIO()
+        image.save(buffer, format="PNG")
+        result["image_base64"] = base64.b64encode(buffer.getvalue()).decode("ascii")
+        result["mime_type"] = "image/png"
+    return result
+
+
+def mouse_action(
+    action: MouseAction,
+    x: int | None = None,
+    y: int | None = None,
+    duration: float = 0,
+    button: str = "left",
+    clicks: int = 1,
+    amount: int = 0,
+) -> dict[str, Any]:
+    """Perform a mouse action: move, click, drag, or scroll."""
+    pyautogui = load_pyautogui()
+    if action in {"move_to", "drag_to"} and (x is None or y is None):
+        raise HTTPException(status_code=400, detail="x and y are required for this mouse action")
+
+    if action == "move_to":
+        pyautogui.moveTo(x, y, duration=duration)
+    elif action == "move_rel":
+        pyautogui.moveRel(x or 0, y or 0, duration=duration)
+    elif action == "click":
+        pyautogui.click(x=x, y=y, clicks=clicks, button=button)
+    elif action == "double_click":
+        pyautogui.doubleClick(x=x, y=y, button=button)
+    elif action == "right_click":
+        pyautogui.rightClick(x=x, y=y)
+    elif action == "drag_to":
+        pyautogui.dragTo(x, y, duration=duration, button=button)
+    elif action == "scroll":
+        pyautogui.scroll(amount)
+    else:
+        raise HTTPException(status_code=400, detail="Unsupported mouse action")
+
+    position = pyautogui.position()
+    return {"action": f"mouse_{action}", "x": position.x, "y": position.y}
+
+
+def keyboard_action(
+    action: KeyboardAction,
+    key: str | None = None,
+    keys: list[str] | None = None,
+    text: str | None = None,
+    interval: float = 0,
+) -> dict[str, Any]:
+    """Perform a keyboard action: single key press, hotkey combination, text input, key down, or key up."""
+    pyautogui = load_pyautogui()
+    if action == "press":
+        if not key:
+            raise HTTPException(status_code=400, detail="key is required")
+        pyautogui.press(key, interval=interval)
+    elif action == "hotkey":
+        if not keys:
+            raise HTTPException(status_code=400, detail="keys are required")
+        pyautogui.hotkey(*keys, interval=interval)
+    elif action == "write":
+        if text is None:
+            raise HTTPException(status_code=400, detail="text is required")
+        pyautogui.write(text, interval=interval)
+    elif action == "key_down":
+        if not key:
+            raise HTTPException(status_code=400, detail="key is required")
+        pyautogui.keyDown(key)
+    elif action == "key_up":
+        if not key:
+            raise HTTPException(status_code=400, detail="key is required")
+        pyautogui.keyUp(key)
+    else:
+        raise HTTPException(status_code=400, detail="Unsupported keyboard action")
+    return {"action": f"keyboard_{action}", "key": key, "keys": keys}
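
Reviewer sketch (not part of the change set): `shutdown_windows` and `restart_windows` assemble their `shutdown.exe` arguments the same way and differ only in the leading flag. The helper below, with the hypothetical name `build_shutdown_args`, shows one way that shared step could be factored out and unit-tested without touching the system. The flags and the 512-character truncation of the comment are taken from the diff; the rest is an assumption for illustration.

```python
from typing import Optional

# Leading flag per action, as used in the diff: /s shuts down, /r restarts.
ACTION_FLAGS = {"shutdown": "/s", "restart": "/r"}


def build_shutdown_args(
    action: str,
    delay_seconds: int = 0,
    force: bool = False,
    reason: Optional[str] = None,
) -> list[str]:
    """Return the argument list for shutdown.exe without executing anything."""
    if action not in ACTION_FLAGS:
        raise ValueError(f"unsupported action: {action}")
    args = [ACTION_FLAGS[action], "/t", str(delay_seconds)]
    if force:
        args.append("/f")
    if reason:
        # Same truncation as in the diff: keep at most 512 characters of the comment.
        args.extend(["/c", reason[:512]])
    return args


shutdown_args = build_shutdown_args("shutdown", delay_seconds=60)
restart_args = build_shutdown_args("restart", force=True, reason="maintenance")
```

Keeping argument construction pure in this way would let the two public functions shrink to a call to the helper followed by `run_shutdown_command`.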

+ 4 - 0
backend/requirements.txt

@@ -2,3 +2,7 @@ fastapi>=0.115.0
 uvicorn[standard]>=0.30.0
 psutil>=6.0.0
 pydantic>=2.8.0
+httpx>=0.27.0
+pyautogui>=0.9.54
+pillow>=10.0.0
+pyperclip>=1.9.0

+ 80 - 1
frontend/src/App.vue

@@ -8,6 +8,19 @@
         <el-menu-item index="services">Windows 服务</el-menu-item>
         <el-menu-item index="processes">Windows 进程</el-menu-item>
         <el-menu-item index="tags">标签管理</el-menu-item>
+        <el-sub-menu index="ai">
+          <template #title>AI 配置</template>
+          <el-menu-item index="ai-providers">AI 服务商管理</el-menu-item>
+          <el-menu-item index="ai-models">AI 模型管理</el-menu-item>
+          <el-menu-item index="ai-test">AI 服务测试</el-menu-item>
+        </el-sub-menu>
+        <el-sub-menu index="automation">
+          <template #title>自动化</template>
+          <el-menu-item index="automation-actions">自动化操作</el-menu-item>
+          <el-menu-item index="automation-workflows">自动化工作流</el-menu-item>
+          <el-menu-item index="automation-screens">已识别界面</el-menu-item>
+          <el-menu-item index="automation-errors">自动化错误记录</el-menu-item>
+        </el-sub-menu>
         <el-menu-item index="sensors">传感器信息</el-menu-item>
         <el-menu-item index="smart">硬盘 SMART</el-menu-item>
         <el-menu-item index="scans">扫描历史</el-menu-item>
@@ -18,7 +31,7 @@
       <div class="topbar">
         <div>
           <div class="page-title">{{ title }}</div>
-          <div class="muted">采集 Windows 服务和进程,确认可信状态,并整理给 AI 分析的数据。</div>
+          <div class="muted">{{ subtitle }}</div>
         </div>
         <el-button type="primary" :loading="scanning" @click="runScan">执行扫描</el-button>
       </div>
@@ -64,6 +77,34 @@
         <TagManager />
       </section>
 
+      <section v-if="activeView === 'ai-providers'">
+        <AiProviderManager ref="aiProviderManager" />
+      </section>
+
+      <section v-if="activeView === 'ai-models'">
+        <AiModelManager ref="aiModelManager" />
+      </section>
+
+      <section v-if="activeView === 'ai-test'">
+        <AiTestView ref="aiTestView" />
+      </section>
+
+      <section v-if="activeView === 'automation-actions'">
+        <AutomationActionView ref="automationActionView" />
+      </section>
+
+      <section v-if="activeView === 'automation-workflows'">
+        <AutomationWorkflowView ref="automationWorkflowView" />
+      </section>
+
+      <section v-if="activeView === 'automation-screens'">
+        <AutomationScreensView ref="automationScreensView" />
+      </section>
+
+      <section v-if="activeView === 'automation-errors'">
+        <AutomationErrorsView ref="automationErrorsView" />
+      </section>
+
       <section v-if="activeView === 'sensors'">
         <SensorView />
       </section>
@@ -93,6 +134,13 @@
 import { computed, nextTick, onMounted, ref, watch } from 'vue'
 import { ElMessage } from 'element-plus'
 import { api } from './api'
+import AiModelManager from './components/AiModelManager.vue'
+import AiProviderManager from './components/AiProviderManager.vue'
+import AiTestView from './components/AiTestView.vue'
+import AutomationActionView from './components/AutomationActionView.vue'
+import AutomationErrorsView from './components/AutomationErrorsView.vue'
+import AutomationScreensView from './components/AutomationScreensView.vue'
+import AutomationWorkflowView from './components/AutomationWorkflowView.vue'
 import ItemTable from './components/ItemTable.vue'
 import SensorView from './components/SensorView.vue'
 import SmartView from './components/SmartView.vue'
@@ -107,6 +155,13 @@ const serviceTable = ref(null)
 const processTable = ref(null)
 const pendingServiceTable = ref(null)
 const pendingProcessTable = ref(null)
+const aiProviderManager = ref(null)
+const aiModelManager = ref(null)
+const aiTestView = ref(null)
+const automationActionView = ref(null)
+const automationWorkflowView = ref(null)
+const automationScreensView = ref(null)
+const automationErrorsView = ref(null)
 
 const title = computed(() => ({
   dashboard: '仪表盘',
@@ -114,11 +169,25 @@ const title = computed(() => ({
   services: 'Windows 服务',
   processes: 'Windows 进程',
   tags: '标签管理',
+  'ai-providers': 'AI 服务商管理',
+  'ai-models': 'AI 模型管理',
+  'ai-test': 'AI 服务测试',
+  'automation-actions': '自动化操作',
+  'automation-workflows': '自动化工作流',
+  'automation-screens': '已识别界面',
+  'automation-errors': '自动化错误记录',
   sensors: '传感器信息',
   smart: '硬盘 SMART',
   scans: '扫描历史',
 })[activeView.value])
 
+const subtitle = computed(() => {
+  if (String(activeView.value).startsWith('automation')) {
+    return '通过 AI 视觉识别 Windows 界面,执行自动化动作,并沉淀可复用工作流。'
+  }
+  return '采集 Windows 服务和进程,确认可信状态,并整理给 AI 分析的数据。'
+})
+
 async function loadDashboard() {
   const { data } = await api.get('/api/dashboard')
   dashboard.value = data
@@ -137,6 +206,13 @@ async function refreshCurrent() {
   processTable.value?.load()
   pendingServiceTable.value?.load()
   pendingProcessTable.value?.load()
+  aiProviderManager.value?.load()
+  aiModelManager.value?.refreshAll()
+  aiTestView.value?.loadOptions()
+  automationActionView.value?.loadOptions()
+  automationWorkflowView.value?.load()
+  automationScreensView.value?.load()
+  automationErrorsView.value?.load()
 }
 
 async function runScan() {
@@ -155,6 +231,9 @@ async function runScan() {
 watch(activeView, async (view) => {
   if (view === 'dashboard') await loadDashboard()
   if (view === 'scans') await loadScans()
+  if (view === 'automation-workflows') await automationWorkflowView.value?.load()
+  if (view === 'automation-screens') await automationScreensView.value?.load()
+  if (view === 'automation-errors') await automationErrorsView.value?.load()
 })
 
 onMounted(refreshCurrent)

+ 145 - 0
frontend/src/components/AiModelManager.vue

@@ -0,0 +1,145 @@
+<template>
+  <div class="panel">
+    <div class="toolbar">
+      <div class="filters">
+        <el-select v-model="providerFilter" clearable placeholder="筛选服务商" style="width: 220px" @change="load">
+          <el-option v-for="provider in providers" :key="provider.id" :label="provider.name" :value="provider.id" />
+        </el-select>
+        <el-button type="primary" @click="openCreate">新增模型</el-button>
+        <el-button @click="refreshAll">刷新</el-button>
+      </div>
+    </div>
+
+    <el-table :data="models" border stripe>
+      <el-table-column prop="provider_name" label="服务商" min-width="160" />
+      <el-table-column prop="name" label="模型名称" min-width="180" />
+      <el-table-column prop="display_name" label="显示名称" min-width="180" />
+      <el-table-column label="默认" width="90">
+        <template #default="{ row }">
+          <el-tag v-if="row.is_default" type="success">默认</el-tag>
+          <span v-else>-</span>
+        </template>
+      </el-table-column>
+      <el-table-column label="操作" width="180" fixed="right">
+        <template #default="{ row }">
+          <el-button size="small" @click="openEdit(row)">编辑</el-button>
+          <el-button size="small" type="danger" @click="remove(row)">删除</el-button>
+        </template>
+      </el-table-column>
+    </el-table>
+
+    <el-dialog v-model="dialog" :title="form.id ? '编辑模型' : '新增模型'" width="560px">
+      <el-form label-width="110px">
+        <el-form-item label="服务商">
+          <el-select v-model="form.provider_id" style="width: 100%">
+            <el-option v-for="provider in providers" :key="provider.id" :label="provider.name" :value="provider.id" />
+          </el-select>
+        </el-form-item>
+        <el-form-item label="模型名称">
+          <el-input v-model="form.name" placeholder="例如 gpt-4o-mini、gemini-1.5-flash、local-model" />
+        </el-form-item>
+        <el-form-item label="显示名称">
+          <el-input v-model="form.display_name" />
+        </el-form-item>
+        <el-form-item label="默认模型">
+          <el-switch v-model="form.is_default" active-text="默认" inactive-text="普通" />
+        </el-form-item>
+      </el-form>
+      <template #footer>
+        <el-button @click="dialog = false">取消</el-button>
+        <el-button type="primary" @click="save">保存</el-button>
+      </template>
+    </el-dialog>
+  </div>
+</template>
+
+<script setup>
+import { onMounted, reactive, ref } from 'vue'
+import { ElMessage, ElMessageBox } from 'element-plus'
+import { api } from '../api'
+
+const providers = ref([])
+const models = ref([])
+const providerFilter = ref(null)
+const dialog = ref(false)
+const form = reactive({
+  id: null,
+  provider_id: null,
+  name: '',
+  display_name: '',
+  is_default: false,
+})
+
+async function loadProviders() {
+  const { data } = await api.get('/api/ai/providers')
+  providers.value = data.items
+}
+
+async function load() {
+  const params = providerFilter.value ? { provider_id: providerFilter.value } : {}
+  const { data } = await api.get('/api/ai/models', { params })
+  models.value = data.items
+}
+
+async function refreshAll() {
+  await loadProviders()
+  await load()
+}
+
+function resetForm() {
+  form.id = null
+  form.provider_id = providerFilter.value || providers.value[0]?.id || null
+  form.name = ''
+  form.display_name = ''
+  form.is_default = false
+}
+
+function openCreate() {
+  if (!providers.value.length) {
+    ElMessage.warning('请先新增 AI 服务商')
+    return
+  }
+  resetForm()
+  dialog.value = true
+}
+
+function openEdit(row) {
+  form.id = row.id
+  form.provider_id = row.provider_id
+  form.name = row.name
+  form.display_name = row.display_name || ''
+  form.is_default = row.is_default
+  dialog.value = true
+}
+
+async function save() {
+  if (!form.provider_id || !form.name.trim()) {
+    ElMessage.warning('请选择服务商并填写模型名称')
+    return
+  }
+  const payload = {
+    provider_id: form.provider_id,
+    name: form.name.trim(),
+    display_name: form.display_name,
+    is_default: form.is_default,
+  }
+  if (form.id) {
+    await api.patch(`/api/ai/models/${form.id}`, payload)
+  } else {
+    await api.post('/api/ai/models', payload)
+  }
+  dialog.value = false
+  ElMessage.success('已保存')
+  await load()
+}
+
+async function remove(row) {
+  await ElMessageBox.confirm(`确认删除模型“${row.name}”?`, '删除模型', { type: 'warning' })
+  await api.delete(`/api/ai/models/${row.id}`)
+  ElMessage.success('已删除')
+  await load()
+}
+
+defineExpose({ refreshAll })
+onMounted(refreshAll)
+</script>

+ 162 - 0
frontend/src/components/AiProviderManager.vue

@@ -0,0 +1,162 @@
+<template>
+  <div class="panel">
+    <div class="toolbar">
+      <div class="filters">
+        <el-button type="primary" @click="openCreate">新增服务商</el-button>
+        <el-button @click="load">刷新</el-button>
+      </div>
+    </div>
+
+    <el-table :data="providers" border stripe>
+      <el-table-column prop="name" label="服务商名称" min-width="160" />
+      <el-table-column label="类型" width="170">
+        <template #default="{ row }">{{ providerTypeLabel(row.provider_type) }}</template>
+      </el-table-column>
+      <el-table-column prop="base_url" label="Base URL" min-width="260">
+        <template #default="{ row }">{{ row.base_url || defaultBaseUrl(row.provider_type) }}</template>
+      </el-table-column>
+      <el-table-column label="API Key" width="110">
+        <template #default="{ row }">
+          <el-tag :type="row.api_key_set ? 'success' : 'info'">{{ row.api_key_set ? '已设置' : '未设置' }}</el-tag>
+        </template>
+      </el-table-column>
+      <el-table-column label="启用" width="90">
+        <template #default="{ row }">
+          <el-tag :type="row.enabled ? 'success' : 'info'">{{ row.enabled ? '启用' : '停用' }}</el-tag>
+        </template>
+      </el-table-column>
+      <el-table-column label="操作" width="180" fixed="right">
+        <template #default="{ row }">
+          <el-button size="small" @click="openEdit(row)">编辑</el-button>
+          <el-button size="small" type="danger" @click="remove(row)">删除</el-button>
+        </template>
+      </el-table-column>
+    </el-table>
+
+    <el-dialog v-model="dialog" :title="form.id ? '编辑服务商' : '新增服务商'" width="620px">
+      <el-form label-width="120px">
+        <el-form-item label="服务商名称">
+          <el-input v-model="form.name" />
+        </el-form-item>
+        <el-form-item label="服务商类型">
+          <el-select v-model="form.provider_type" style="width: 100%">
+            <el-option v-for="item in providerTypes" :key="item.value" :label="item.label" :value="item.value" />
+          </el-select>
+        </el-form-item>
+        <el-form-item label="Base URL">
+          <el-input v-model="form.base_url" :placeholder="defaultBaseUrl(form.provider_type)" />
+        </el-form-item>
+        <el-form-item label="API Key">
+          <el-input v-model="form.api_key" type="password" show-password :placeholder="form.id ? '留空则不修改' : '可选'" />
+        </el-form-item>
+        <el-form-item v-if="form.id" label="清空 API Key">
+          <el-checkbox v-model="form.clear_api_key">保存时清空已设置的 API Key</el-checkbox>
+        </el-form-item>
+        <el-form-item label="启用">
+          <el-switch v-model="form.enabled" active-text="启用" inactive-text="停用" />
+        </el-form-item>
+      </el-form>
+      <template #footer>
+        <el-button @click="dialog = false">取消</el-button>
+        <el-button type="primary" @click="save">保存</el-button>
+      </template>
+    </el-dialog>
+  </div>
+</template>
+
+<script setup>
+import { onMounted, reactive, ref } from 'vue'
+import { ElMessage, ElMessageBox } from 'element-plus'
+import { api } from '../api'
+
+const providerTypes = [
+  { label: 'OpenAI', value: 'OPENAI' },
+  { label: 'OpenAI 兼容', value: 'OPENAI_COMPATIBLE' },
+  { label: 'Google Gemini', value: 'GOOGLE_GEMINI' },
+]
+
+const providers = ref([])
+const dialog = ref(false)
+const form = reactive({
+  id: null,
+  name: '',
+  provider_type: 'OPENAI',
+  base_url: '',
+  api_key: '',
+  clear_api_key: false,
+  enabled: true,
+})
+
+function providerTypeLabel(value) {
+  return providerTypes.find((item) => item.value === value)?.label || value
+}
+
+function defaultBaseUrl(value) {
+  if (value === 'GOOGLE_GEMINI') return 'https://generativelanguage.googleapis.com/v1beta'
+  return 'https://api.openai.com/v1'
+}
+
+async function load() {
+  const { data } = await api.get('/api/ai/providers')
+  providers.value = data.items
+}
+
+function resetForm() {
+  form.id = null
+  form.name = ''
+  form.provider_type = 'OPENAI'
+  form.base_url = ''
+  form.api_key = ''
+  form.clear_api_key = false
+  form.enabled = true
+}
+
+function openCreate() {
+  resetForm()
+  dialog.value = true
+}
+
+function openEdit(row) {
+  form.id = row.id
+  form.name = row.name
+  form.provider_type = row.provider_type
+  form.base_url = row.base_url || ''
+  form.api_key = ''
+  form.clear_api_key = false
+  form.enabled = row.enabled
+  dialog.value = true
+}
+
+async function save() {
+  if (!form.name.trim()) {
+    ElMessage.warning('请输入服务商名称')
+    return
+  }
+  const payload = {
+    name: form.name.trim(),
+    provider_type: form.provider_type,
+    base_url: form.base_url.trim(),
+    api_key: form.api_key || null,
+    enabled: form.enabled,
+  }
+  if (form.id) {
+    payload.clear_api_key = form.clear_api_key
+    await api.patch(`/api/ai/providers/${form.id}`, payload)
+  } else {
+    await api.post('/api/ai/providers', payload)
+  }
+  dialog.value = false
+  ElMessage.success('已保存')
+  await load()
+}
+
+async function remove(row) {
+  if (!(await ElMessageBox.confirm(`确认删除 AI 服务商“${row.name}”?关联模型也会删除。`, '删除服务商', { type: 'warning' }).catch(() => false))) return
+  await api.delete(`/api/ai/providers/${row.id}`)
+  ElMessage.success('已删除')
+  await load()
+}
+
+defineExpose({ load })
+onMounted(load)
+</script>

+ 88 - 0
frontend/src/components/AiTestView.vue

@@ -0,0 +1,88 @@
+<template>
+  <div class="panel">
+    <div class="toolbar">
+      <div class="filters">
+        <el-select v-model="form.provider_id" placeholder="AI 服务商" style="width: 220px" @change="selectDefaultModel">
+          <el-option v-for="provider in enabledProviders" :key="provider.id" :label="provider.name" :value="provider.id" />
+        </el-select>
+        <el-select v-model="form.model_id" placeholder="AI 模型" style="width: 220px">
+          <el-option
+            v-for="model in providerModels"
+            :key="model.id"
+            :label="model.display_name || model.name"
+            :value="model.id"
+          />
+        </el-select>
+        <span class="muted">温度</span>
+        <el-input-number v-model="form.temperature" :min="0" :max="2" :step="0.1" />
+        <el-button type="primary" :loading="loading" @click="send">发送</el-button>
+      </div>
+    </div>
+
+    <el-input v-model="form.prompt" class="prompt-box" type="textarea" placeholder="输入测试提示词" />
+    <div style="margin-top: 12px">
+      <div class="section-title">输出</div>
+      <pre class="raw-output">{{ output || '暂无输出' }}</pre>
+    </div>
+  </div>
+</template>
+
+<script setup>
+import { computed, onMounted, reactive, ref } from 'vue'
+import { ElMessage } from 'element-plus'
+import { api } from '../api'
+
+const providers = ref([])
+const models = ref([])
+const loading = ref(false)
+const output = ref('')
+const form = reactive({
+  provider_id: null,
+  model_id: null,
+  temperature: 0.2,
+  prompt: '请用一句话回答:你已经可以连接了吗?',
+})
+
+const enabledProviders = computed(() => providers.value.filter((item) => item.enabled))
+const providerModels = computed(() => models.value.filter((item) => item.provider_id === form.provider_id))
+
+async function loadOptions() {
+  const [providerResult, modelResult] = await Promise.all([
+    api.get('/api/ai/providers'),
+    api.get('/api/ai/models'),
+  ])
+  providers.value = providerResult.data.items
+  models.value = modelResult.data.items
+  if (!enabledProviders.value.some((item) => item.id === form.provider_id)) form.provider_id = enabledProviders.value[0]?.id || null
+  selectDefaultModel()
+}
+
+function selectDefaultModel() {
+  const available = providerModels.value
+  form.model_id = available.find((item) => item.is_default)?.id || available[0]?.id || null
+}
+
+async function send() {
+  if (!form.provider_id || !form.model_id || !form.prompt.trim()) {
+    ElMessage.warning('请选择服务商、模型并输入内容')
+    return
+  }
+  loading.value = true
+  try {
+    const { data } = await api.post('/api/ai/test', {
+      provider_id: form.provider_id,
+      model_id: form.model_id,
+      prompt: form.prompt,
+      temperature: form.temperature,
+    })
+    output.value = data.content
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail || 'AI 测试失败')
+  } finally {
+    loading.value = false
+  }
+}
+
+defineExpose({ loadOptions })
+onMounted(loadOptions)
+</script>

+ 366 - 0
frontend/src/components/AutomationActionView.vue

@@ -0,0 +1,366 @@
+<template>
+  <div class="automation-workspace">
+    <div class="automation-main panel">
+      <div class="toolbar">
+        <div class="filters">
+          <el-select v-model="ai.provider_id" placeholder="AI 服务商" style="width: 190px" @change="selectDefaultModel">
+            <el-option v-for="provider in enabledProviders" :key="provider.id" :label="provider.name" :value="provider.id" />
+          </el-select>
+          <el-select v-model="ai.model_id" placeholder="AI 模型" style="width: 210px">
+            <el-option v-for="model in providerModels" :key="model.id" :label="model.display_name || model.name" :value="model.id" />
+          </el-select>
+          <el-input-number v-model="ai.temperature" :min="0" :max="2" :step="0.1" />
+          <el-button type="primary" :loading="analyzing" @click="analyzeScreen">分析界面</el-button>
+        </div>
+      </div>
+
+      <div class="screenshot-stage">
+        <div v-if="imageSrc" class="screenshot-canvas" :style="canvasStyle">
+          <img class="screenshot-image" :src="imageSrc" alt="当前 Windows 截图" />
+          <template v-if="currentScreen?.elements">
+            <button
+              v-for="element in currentScreen.elements"
+              :key="element.id || element.element_index"
+              class="element-marker"
+              :style="markerStyle(element)"
+              :title="`${element.element_index}. ${element.name}`"
+            >
+              {{ element.element_index }}
+            </button>
+          </template>
+        </div>
+        <div v-else class="screenshot-empty">暂无截图</div>
+      </div>
+    </div>
+
+    <aside class="automation-side panel">
+      <div class="side-section">
+        <div class="section-title">步骤记录</div>
+        <div class="record-row">
+          <el-button v-if="!recording" type="success" @click="startRecording">开始记录步骤</el-button>
+          <el-button v-else type="danger" :loading="savingWorkflow" @click="finishRecording">结束记录步骤</el-button>
+          <el-tag :type="recording ? 'success' : 'info'">{{ recording ? `记录中:${recordedNodes.length}` : '未记录' }}</el-tag>
+        </div>
+      </div>
+
+      <div class="side-section">
+        <div class="section-title">快捷操作</div>
+        <div class="action-grid">
+          <el-button @click="openKeyboardDialog">执行键盘操作</el-button>
+          <el-button @click="openTextDialog">键盘输入</el-button>
+          <el-button @click="openProgramDialog">启动程序</el-button>
+          <el-button type="warning" @click="closeOpenedPrograms">关闭程序</el-button>
+        </div>
+      </div>
+
+      <div class="side-section">
+        <div class="section-title">可操作元素</div>
+        <el-table :data="currentScreen?.elements || []" height="420" border stripe>
+          <el-table-column prop="element_index" label="#" width="54" />
+          <el-table-column prop="name" label="名称" min-width="130" show-overflow-tooltip />
+          <el-table-column label="坐标" width="110">
+            <template #default="{ row }">{{ row.x }}, {{ row.y }}</template>
+          </el-table-column>
+          <el-table-column label="操作" width="100" fixed="right">
+            <template #default="{ row }">
+              <el-dropdown @command="(command) => runElementMouse(row, command)">
+                <el-button size="small">点击</el-button>
+                <template #dropdown>
+                  <el-dropdown-menu>
+                    <el-dropdown-item command="click">左键点击</el-dropdown-item>
+                    <el-dropdown-item command="right_click">右键点击</el-dropdown-item>
+                    <el-dropdown-item command="double_click">双击</el-dropdown-item>
+                  </el-dropdown-menu>
+                </template>
+              </el-dropdown>
+            </template>
+          </el-table-column>
+        </el-table>
+      </div>
+    </aside>
+
+    <el-dialog v-model="keyboardDialog" title="执行键盘操作" width="420px" @opened="focusKeyCapture">
+      <div ref="keyCaptureRef" class="key-capture" tabindex="0" @keydown.prevent="captureKey">
+        <div class="muted">点击此区域后按下单键或组合键</div>
+        <div class="key-list">
+          <el-tag v-for="key in capturedKeys" :key="key">{{ key }}</el-tag>
+        </div>
+      </div>
+      <template #footer>
+        <el-button @click="keyboardDialog = false">取消</el-button>
+        <el-button type="primary" @click="runKeyboard">确定</el-button>
+      </template>
+    </el-dialog>
+
+    <el-dialog v-model="textDialog" title="键盘输入" width="520px">
+      <el-input v-model="textInput" type="textarea" :rows="5" placeholder="请输入要粘贴的文本" />
+      <template #footer>
+        <el-button @click="textDialog = false">取消</el-button>
+        <el-button type="primary" @click="runTextInput">确定</el-button>
+      </template>
+    </el-dialog>
+
+    <el-dialog v-model="programDialog" title="启动程序" width="520px">
+      <el-select v-model="quickProgram" placeholder="常用程序" style="width: 100%; margin-bottom: 10px" @change="programCommand = quickProgram">
+        <el-option label="Microsoft Edge" value="msedge" />
+        <el-option label="Chrome" value="chrome" />
+        <el-option label="Firefox" value="firefox" />
+      </el-select>
+      <el-input v-model="programCommand" placeholder="例如:notepad.exe" />
+      <template #footer>
+        <el-button @click="programDialog = false">取消</el-button>
+        <el-button type="primary" @click="runStartProgram">确定</el-button>
+      </template>
+    </el-dialog>
+  </div>
+</template>
+
+<script setup>
+import { computed, nextTick, onMounted, reactive, ref } from 'vue'
+import { ElMessage, ElMessageBox } from 'element-plus'
+import { api } from '../api'
+
+const providers = ref([])
+const models = ref([])
+const analyzing = ref(false)
+const savingWorkflow = ref(false)
+const currentScreen = ref(null)
+const recording = ref(false)
+const recordedNodes = ref([])
+const openedPids = ref([])
+const keyboardDialog = ref(false)
+const textDialog = ref(false)
+const programDialog = ref(false)
+const capturedKeys = ref([])
+const keyCaptureRef = ref(null)
+const textInput = ref('')
+const quickProgram = ref('')
+const programCommand = ref('')
+
+const ai = reactive({
+  provider_id: null,
+  model_id: null,
+  temperature: 0.1,
+})
+
+const enabledProviders = computed(() => providers.value.filter((item) => item.enabled))
+const providerModels = computed(() => models.value.filter((item) => item.provider_id === ai.provider_id))
+const imageSrc = computed(() => {
+  if (!currentScreen.value?.image_base64) return ''
+  return `data:${currentScreen.value.mime_type || 'image/png'};base64,${currentScreen.value.image_base64}`
+})
+const canvasStyle = computed(() => {
+  if (!currentScreen.value?.width || !currentScreen.value?.height) return {}
+  return { aspectRatio: `${currentScreen.value.width} / ${currentScreen.value.height}` }
+})
+
+async function loadOptions() {
+  const [providerResult, modelResult] = await Promise.all([api.get('/api/ai/providers'), api.get('/api/ai/models')])
+  providers.value = providerResult.data.items
+  models.value = modelResult.data.items
+  if (!enabledProviders.value.some((item) => item.id === ai.provider_id)) ai.provider_id = enabledProviders.value[0]?.id || null
+  selectDefaultModel()
+}
+
+function selectDefaultModel() {
+  const available = providerModels.value
+  ai.model_id = available.find((item) => item.is_default)?.id || available[0]?.id || null
+}
+
+function ensureAiSelected() {
+  if (!ai.provider_id || !ai.model_id) {
+    ElMessage.warning('请先选择 AI 服务商和模型')
+    return false
+  }
+  return true
+}
+
+async function analyzeScreen() {
+  if (!ensureAiSelected()) return
+  analyzing.value = true
+  try {
+    const { data } = await api.post('/api/automation/vision/analyze', {
+      provider_id: ai.provider_id,
+      model_id: ai.model_id,
+      temperature: ai.temperature,
+    })
+    currentScreen.value = data
+    ElMessage.success(`识别完成:${data.elements?.length || 0} 个元素`)
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail || '界面分析失败')
+  } finally {
+    analyzing.value = false
+  }
+}
+
+function markerStyle(element) {
+  const left = currentScreen.value?.width ? (element.x / currentScreen.value.width) * 100 : element.x_percent
+  const top = currentScreen.value?.height ? (element.y / currentScreen.value.height) * 100 : element.y_percent
+  return { left: `${left}%`, top: `${top}%` }
+}
+
+function actionBase() {
+  return {
+    screen_id: currentScreen.value?.id || null,
+    provider_id: ai.provider_id,
+    model_id: ai.model_id,
+    temperature: ai.temperature,
+  }
+}
+
+function rememberProcesses(items = []) {
+  for (const item of items) {
+    if (item.pid && !openedPids.value.includes(item.pid)) openedPids.value.push(item.pid)
+  }
+}
+
+function addNode(node) {
+  if (!recording.value) return
+  recordedNodes.value.push(node)
+}
+
+async function runElementMouse(element, mouseAction) {
+  if (!ensureAiSelected()) return
+  try {
+    const { data } = await api.post('/api/automation/actions/mouse', {
+      ...actionBase(),
+      x: element.x,
+      y: element.y,
+      mouse_action: mouseAction,
+    })
+    rememberProcesses(data.new_processes)
+    addNode({
+      node_type: 'mouse',
+      screen_id: currentScreen.value?.id || null,
+      title: `${element.name} ${mouseAction}`,
+      config: { x: element.x, y: element.y, mouse_action: mouseAction, element_name: element.name },
+    })
+    ElMessage.success('鼠标操作已执行')
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail?.message || error.response?.data?.detail || '鼠标操作失败')
+  }
+}
+
+function openKeyboardDialog() {
+  capturedKeys.value = []
+  keyboardDialog.value = true
+}
+
+async function focusKeyCapture() {
+  await nextTick()
+  keyCaptureRef.value?.focus()
+}
+
+function captureKey(event) {
+  const key = normalizeKey(event.key)
+  if (!capturedKeys.value.includes(key)) capturedKeys.value.push(key)
+}
+
+function normalizeKey(key) {
+  const map = { Control: 'ctrl', Shift: 'shift', Alt: 'alt', Meta: 'win', Escape: 'esc', ' ': 'space' }
+  return map[key] || key.toLowerCase()
+}
+
+async function runKeyboard() {
+  if (!capturedKeys.value.length || !ensureAiSelected()) return
+  try {
+    const keys = [...capturedKeys.value]
+    const { data } = await api.post('/api/automation/actions/keyboard', { ...actionBase(), keys })
+    rememberProcesses(data.new_processes)
+    addNode({ node_type: 'keyboard', screen_id: currentScreen.value?.id || null, title: keys.join('+'), config: { keys } })
+    keyboardDialog.value = false
+    ElMessage.success('键盘操作已执行')
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail?.message || error.response?.data?.detail || '键盘操作失败')
+  }
+}
+
+function openTextDialog() {
+  textInput.value = ''
+  textDialog.value = true
+}
+
+async function runTextInput() {
+  if (!textInput.value || !ensureAiSelected()) return
+  try {
+    const { data } = await api.post('/api/automation/actions/text-input', { ...actionBase(), text: textInput.value })
+    rememberProcesses(data.new_processes)
+    addNode({ node_type: 'text_input', screen_id: currentScreen.value?.id || null, title: '键盘输入', config: { text: textInput.value } })
+    textDialog.value = false
+    ElMessage.success('文本已输入')
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail?.message || error.response?.data?.detail || '文本输入失败')
+  }
+}
+
+function openProgramDialog() {
+  quickProgram.value = ''
+  programCommand.value = ''
+  programDialog.value = true
+}
+
+async function runStartProgram() {
+  if (!programCommand.value.trim()) return
+  try {
+    const { data } = await api.post('/api/automation/actions/start-program', { command: programCommand.value.trim() })
+    rememberProcesses(data.new_processes)
+    if (data.result) rememberProcesses([data.result])
+    addNode({ node_type: 'start_program', title: `启动 ${programCommand.value.trim()}`, config: { command: programCommand.value.trim() } })
+    programDialog.value = false
+    ElMessage.success('程序已启动')
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail || '启动程序失败')
+  }
+}
+
+async function closeOpenedPrograms() {
+  try {
+    await ElMessageBox.confirm('确认关闭本次自动化过程中打开过的程序?', '关闭程序', { type: 'warning' })
+  } catch {
+    return
+  }
+  const { data } = await api.post('/api/automation/actions/close-opened-programs', { pids: openedPids.value })
+  addNode({ node_type: 'close_programs', title: '关闭本次打开的程序', config: { pids: [...openedPids.value] } })
+  openedPids.value = []
+  ElMessage.success(`已处理 ${data.items?.length || 0} 个进程`)
+}
+
+function startRecording() {
+  recordedNodes.value = []
+  recording.value = true
+  ElMessage.success('已开始记录步骤')
+}
+
+async function finishRecording() {
+  if (!recordedNodes.value.length) {
+    recording.value = false
+    ElMessage.info('没有记录到步骤')
+    return
+  }
+  let value
+  try {
+    const result = await ElMessageBox.prompt('请输入工作流名称', '保存工作流', {
+      inputValue: `自动化工作流 ${new Date().toLocaleString()}`,
+    })
+    value = result.value
+  } catch {
+    return
+  }
+  savingWorkflow.value = true
+  try {
+    await api.post('/api/automation/workflows', {
+      name: value,
+      description: currentScreen.value?.description || '',
+      nodes: recordedNodes.value,
+    })
+    recording.value = false
+    recordedNodes.value = []
+    ElMessage.success('工作流已保存')
+  } finally {
+    savingWorkflow.value = false
+  }
+}
+
+defineExpose({ loadOptions })
+onMounted(loadOptions)
+</script>

+ 76 - 0
frontend/src/components/AutomationErrorsView.vue

@@ -0,0 +1,76 @@
+<template>
+  <div class="panel">
+    <div class="toolbar">
+      <div class="filters">
+        <el-button @click="load">刷新</el-button>
+      </div>
+    </div>
+
+    <el-table :data="errors.items" border stripe>
+      <el-table-column prop="id" label="ID" width="80" />
+      <el-table-column prop="action_type" label="动作" width="150" />
+      <el-table-column prop="interface_name" label="目标界面" min-width="160" />
+      <el-table-column prop="message" label="错误信息" min-width="320" show-overflow-tooltip />
+      <el-table-column prop="similarity" label="相似度" width="100">
+        <template #default="{ row }">{{ row.similarity ?? '-' }}</template>
+      </el-table-column>
+      <el-table-column prop="created_at" label="时间" min-width="180" />
+      <el-table-column label="操作" width="100" fixed="right">
+        <template #default="{ row }">
+          <el-button size="small" @click="openDetail(row)">详情</el-button>
+        </template>
+      </el-table-column>
+    </el-table>
+
+    <el-dialog v-model="dialog" title="错误详情" width="1040px">
+      <div v-if="detail">
+        <el-descriptions :column="2" border>
+          <el-descriptions-item label="动作">{{ detail.action_type }}</el-descriptions-item>
+          <el-descriptions-item label="相似度">{{ detail.similarity ?? '-' }}</el-descriptions-item>
+          <el-descriptions-item label="错误信息" :span="2">{{ detail.message }}</el-descriptions-item>
+          <el-descriptions-item label="AI 判断" :span="2">{{ detail.compare_result?.reason || '-' }}</el-descriptions-item>
+        </el-descriptions>
+        <div class="error-images">
+          <div>
+            <div class="section-title">目标界面</div>
+            <img v-if="expectedImageSrc" class="error-image" :src="expectedImageSrc" alt="目标界面截图" />
+          </div>
+          <div>
+            <div class="section-title">实际屏幕</div>
+            <img v-if="actualImageSrc" class="error-image" :src="actualImageSrc" alt="实际屏幕截图" />
+          </div>
+        </div>
+      </div>
+    </el-dialog>
+  </div>
+</template>
+
+<script setup>
+import { computed, onMounted, ref } from 'vue'
+import { api } from '../api'
+
+const errors = ref({ items: [] })
+const detail = ref(null)
+const dialog = ref(false)
+
+const expectedImageSrc = computed(() => imageSrc(detail.value?.expected_image_base64, detail.value?.expected_image_mime_type))
+const actualImageSrc = computed(() => imageSrc(detail.value?.actual_image_base64, detail.value?.actual_image_mime_type))
+
+function imageSrc(base64, mimeType) {
+  return base64 ? `data:${mimeType || 'image/png'};base64,${base64}` : ''
+}
+
+async function load() {
+  const { data } = await api.get('/api/automation/errors')
+  errors.value = data
+}
+
+async function openDetail(row) {
+  const { data } = await api.get(`/api/automation/errors/${row.id}`, { params: { include_images: true } })
+  detail.value = data
+  dialog.value = true
+}
+
+defineExpose({ load })
+onMounted(load)
+</script>

+ 110 - 0
frontend/src/components/AutomationScreensView.vue

@@ -0,0 +1,110 @@
+<template>
+  <div class="panel">
+    <div class="toolbar">
+      <div class="filters">
+        <el-button @click="load">刷新</el-button>
+      </div>
+    </div>
+
+    <el-table :data="screens.items" border stripe>
+      <el-table-column prop="id" label="ID" width="80" />
+      <el-table-column prop="interface_name" label="界面名称" min-width="180" />
+      <el-table-column prop="description" label="描述" min-width="320" show-overflow-tooltip />
+      <el-table-column label="类型" width="180">
+        <template #default="{ row }">
+          <el-tag v-if="row.is_windows_desktop" type="success">Windows 桌面</el-tag>
+          <el-tag v-if="row.is_browser_webpage" type="primary">浏览器网页</el-tag>
+          <el-tag v-if="!row.is_windows_desktop && !row.is_browser_webpage" type="info">窗口界面</el-tag>
+        </template>
+      </el-table-column>
+      <el-table-column prop="element_count" label="元素" width="90" />
+      <el-table-column prop="created_at" label="识别时间" min-width="180" />
+      <el-table-column label="操作" width="180" fixed="right">
+        <template #default="{ row }">
+          <el-button size="small" @click="openDetail(row)">详情</el-button>
+          <el-button size="small" type="danger" @click="remove(row)">删除</el-button>
+        </template>
+      </el-table-column>
+    </el-table>
+
+    <el-dialog v-model="dialog" title="界面详情" width="980px">
+      <div v-if="detail" class="screen-detail-layout">
+        <div>
+          <div class="screenshot-stage small">
+            <div v-if="detailImageSrc" class="screenshot-canvas" :style="canvasStyle">
+              <img class="screenshot-image" :src="detailImageSrc" alt="已识别界面截图" />
+              <button
+                v-for="element in detail.elements || []"
+                :key="element.id"
+                class="element-marker"
+                :style="markerStyle(element)"
+              >
+                {{ element.element_index }}
+              </button>
+            </div>
+          </div>
+        </div>
+        <div>
+          <el-descriptions :column="1" border>
+            <el-descriptions-item label="名称">{{ detail.interface_name }}</el-descriptions-item>
+            <el-descriptions-item label="描述">{{ detail.description || '-' }}</el-descriptions-item>
+            <el-descriptions-item label="分辨率">{{ detail.width }} x {{ detail.height }}</el-descriptions-item>
+          </el-descriptions>
+          <el-table :data="detail.elements || []" height="360" border stripe style="margin-top: 12px">
+            <el-table-column prop="element_index" label="#" width="54" />
+            <el-table-column prop="name" label="名称" min-width="140" show-overflow-tooltip />
+            <el-table-column label="坐标" width="120">
+              <template #default="{ row }">{{ row.x }}, {{ row.y }}</template>
+            </el-table-column>
+          </el-table>
+        </div>
+      </div>
+    </el-dialog>
+  </div>
+</template>
+
+<script setup>
+import { computed, onMounted, ref } from 'vue'
+import { ElMessage, ElMessageBox } from 'element-plus'
+import { api } from '../api'
+
+const screens = ref({ items: [] })
+const detail = ref(null)
+const dialog = ref(false)
+
+const detailImageSrc = computed(() => {
+  if (!detail.value?.image_base64) return ''
+  return `data:${detail.value.mime_type || 'image/png'};base64,${detail.value.image_base64}`
+})
+const canvasStyle = computed(() => {
+  if (!detail.value?.width || !detail.value?.height) return {}
+  return { aspectRatio: `${detail.value.width} / ${detail.value.height}` }
+})
+
+async function load() {
+  const { data } = await api.get('/api/automation/screens')
+  screens.value = data
+}
+
+async function openDetail(row) {
+  const { data } = await api.get(`/api/automation/screens/${row.id}`, { params: { include_image: true } })
+  detail.value = data
+  dialog.value = true
+}
+
+function markerStyle(element) {
+  const left = detail.value?.width ? (element.x / detail.value.width) * 100 : element.x_percent
+  const top = detail.value?.height ? (element.y / detail.value.height) * 100 : element.y_percent
+  return { left: `${left}%`, top: `${top}%` }
+}
+
+async function remove(row) {
+  if (!(await ElMessageBox.confirm(`确认删除界面“${row.interface_name}”?`, '删除界面', { type: 'warning' }).catch(() => false))) return
+  await api.delete(`/api/automation/screens/${row.id}`)
+  ElMessage.success('已删除')
+  await load()
+}
+
+defineExpose({ load })
+onMounted(load)
+</script>

+ 184 - 0
frontend/src/components/AutomationWorkflowView.vue

@@ -0,0 +1,184 @@
+<template>
+  <div class="panel">
+    <div class="toolbar">
+      <div class="filters">
+        <el-button type="primary" @click="createEmpty">新建空工作流</el-button>
+        <el-button @click="load">刷新</el-button>
+      </div>
+    </div>
+
+    <el-table :data="workflows.items" border stripe>
+      <el-table-column prop="id" label="ID" width="80" />
+      <el-table-column prop="name" label="名称" min-width="180" />
+      <el-table-column prop="description" label="描述" min-width="260" show-overflow-tooltip />
+      <el-table-column prop="node_count" label="节点数" width="100" />
+      <el-table-column prop="updated_at" label="更新时间" min-width="180" />
+      <el-table-column label="操作" width="210" fixed="right">
+        <template #default="{ row }">
+          <el-button size="small" @click="openEdit(row)">编辑</el-button>
+          <el-button size="small" type="danger" @click="remove(row)">删除</el-button>
+        </template>
+      </el-table-column>
+    </el-table>
+
+    <el-dialog v-model="dialog" :title="form.id ? '编辑工作流' : '新建工作流'" width="860px">
+      <el-form label-width="90px">
+        <el-form-item label="名称">
+          <el-input v-model="form.name" />
+        </el-form-item>
+        <el-form-item label="描述">
+          <el-input v-model="form.description" type="textarea" :rows="2" />
+        </el-form-item>
+      </el-form>
+
+      <div class="toolbar">
+        <div class="filters">
+          <el-select v-model="newNodeType" style="width: 180px">
+            <el-option v-for="item in nodeTypes" :key="item.value" :label="item.label" :value="item.value" />
+          </el-select>
+          <el-button @click="addNode">添加节点</el-button>
+        </div>
+      </div>
+
+      <div class="node-editor">
+        <div
+          v-for="(node, index) in form.nodes"
+          :key="index"
+          class="node-row"
+          draggable="true"
+          @dragstart="dragIndex = index"
+          @dragover.prevent
+          @drop="dropNode(index)"
+        >
+          <div class="node-order">{{ index + 1 }}</div>
+          <el-select v-model="node.node_type" style="width: 150px">
+            <el-option v-for="item in nodeTypes" :key="item.value" :label="item.label" :value="item.value" />
+          </el-select>
+          <el-input v-model="node.title" placeholder="节点标题" style="width: 190px" />
+          <el-input-number v-model="node.screen_id" :min="1" placeholder="界面 ID" />
+          <el-input
+            v-model="node.configText"
+            type="textarea"
+            :rows="2"
+            placeholder="节点 JSON 配置"
+            class="node-config"
+          />
+          <el-button type="danger" @click="form.nodes.splice(index, 1)">删除</el-button>
+        </div>
+      </div>
+
+      <template #footer>
+        <el-button @click="dialog = false">取消</el-button>
+        <el-button type="primary" @click="save">保存</el-button>
+      </template>
+    </el-dialog>
+  </div>
+</template>
+
+<script setup>
+import { onMounted, reactive, ref } from 'vue'
+import { ElMessage, ElMessageBox } from 'element-plus'
+import { api } from '../api'
+
+const nodeTypes = [
+  { label: '鼠标操作', value: 'mouse' },
+  { label: '键盘操作', value: 'keyboard' },
+  { label: '键盘输入', value: 'text_input' },
+  { label: '启动程序', value: 'start_program' },
+  { label: '关闭程序', value: 'close_programs' },
+]
+
+const workflows = ref({ items: [] })
+const dialog = ref(false)
+const newNodeType = ref('mouse')
+const dragIndex = ref(null)
+const form = reactive({
+  id: null,
+  name: '',
+  description: '',
+  nodes: [],
+})
+
+async function load() {
+  const { data } = await api.get('/api/automation/workflows')
+  workflows.value = data
+}
+
+function resetForm() {
+  form.id = null
+  form.name = ''
+  form.description = ''
+  form.nodes = []
+}
+
+function createEmpty() {
+  resetForm()
+  form.name = `空工作流 ${new Date().toLocaleString()}`
+  dialog.value = true
+}
+
+async function openEdit(row) {
+  const { data } = await api.get(`/api/automation/workflows/${row.id}`)
+  form.id = data.id
+  form.name = data.name
+  form.description = data.description || ''
+  form.nodes = (data.nodes || []).map((node) => ({
+    node_type: node.node_type,
+    screen_id: node.screen_id,
+    title: node.title || '',
+    configText: JSON.stringify(node.config || {}, null, 2),
+  }))
+  dialog.value = true
+}
+
+function addNode() {
+  form.nodes.push({
+    node_type: newNodeType.value,
+    screen_id: null,
+    title: nodeTypes.find((item) => item.value === newNodeType.value)?.label || newNodeType.value,
+    configText: '{}',
+  })
+}
+
+function dropNode(targetIndex) {
+  const sourceIndex = dragIndex.value
+  // Always clear the drag state, including when the row is dropped on itself.
+  dragIndex.value = null
+  if (sourceIndex === null || sourceIndex === targetIndex) return
+  const [node] = form.nodes.splice(sourceIndex, 1)
+  form.nodes.splice(targetIndex, 0, node)
+}
+
+async function save() {
+  if (!form.name.trim()) {
+    ElMessage.warning('请输入工作流名称')
+    return
+  }
+  let nodes
+  try {
+    nodes = form.nodes.map((node) => ({
+      node_type: node.node_type,
+      screen_id: node.screen_id || null,
+      title: node.title,
+      config: JSON.parse(node.configText || '{}'),
+    }))
+  } catch {
+    ElMessage.error('节点配置不是合法 JSON')
+    return
+  }
+  const payload = { name: form.name.trim(), description: form.description, nodes }
+  if (form.id) await api.put(`/api/automation/workflows/${form.id}`, payload)
+  else await api.post('/api/automation/workflows', payload)
+  dialog.value = false
+  ElMessage.success('已保存')
+  await load()
+}
+
+async function remove(row) {
+  try {
+    await ElMessageBox.confirm(`确认删除工作流“${row.name}”?`, '删除工作流', { type: 'warning' })
+  } catch {
+    // ElMessageBox.confirm rejects when the dialog is cancelled or closed.
+    return
+  }
+  await api.delete(`/api/automation/workflows/${row.id}`)
+  ElMessage.success('已删除')
+  await load()
+}
+
+defineExpose({ load })
+onMounted(load)
+</script>

+ 186 - 1
frontend/src/components/ItemTable.vue

@@ -15,6 +15,8 @@
       <div class="filters">
         <el-button @click="openPrompt('selected')" :disabled="!selected.length">复制选中给 AI</el-button>
         <el-button @click="openPrompt('pending')">复制全部待确认给 AI</el-button>
+        <el-button type="primary" plain @click="openAiAnalyze('selected')" :disabled="!selected.length">AI 分析选中</el-button>
+        <el-button type="primary" plain @click="openAiAnalyze('pending')">AI 分析全部待确认</el-button>
         <el-button @click="importDialog = true">导入 AI JSON</el-button>
         <el-dropdown :disabled="!selected.length" @command="batchUpdate">
           <el-button>批量标记</el-button>
@@ -153,11 +155,97 @@
         <el-button type="primary" @click="importAi">导入</el-button>
       </template>
     </el-dialog>
+
+    <el-dialog v-model="aiDialog" title="发送给 AI 分析" width="620px">
+      <el-form label-width="110px">
+        <el-form-item label="AI 服务商">
+          <el-select v-model="aiForm.provider_id" placeholder="选择 AI 服务商" style="width: 100%" @change="selectDefaultAiModel">
+            <el-option v-for="provider in enabledProviders" :key="provider.id" :label="provider.name" :value="provider.id" />
+          </el-select>
+        </el-form-item>
+        <el-form-item label="AI 模型">
+          <el-select v-model="aiForm.model_id" placeholder="选择 AI 模型" style="width: 100%">
+            <el-option
+              v-for="model in providerModels"
+              :key="model.id"
+              :label="model.display_name || model.name"
+              :value="model.id"
+            />
+          </el-select>
+        </el-form-item>
+        <el-form-item label="温度">
+          <el-input-number v-model="aiForm.temperature" :min="0" :max="2" :step="0.1" />
+        </el-form-item>
+      </el-form>
+      <template #footer>
+        <el-button @click="aiDialog = false">取消</el-button>
+        <el-button type="primary" :loading="aiLoading" @click="runAiAnalyze">发送分析</el-button>
+      </template>
+    </el-dialog>
+
+    <el-dialog v-model="aiPreviewDialog" title="确认 AI 分析结果" width="960px">
+      <el-table :data="aiResult?.preview || []" border stripe max-height="420">
+        <el-table-column label="匹配" width="80">
+          <template #default="{ row }">
+            <el-tag :type="row.matched ? 'success' : 'danger'">{{ row.matched ? '已匹配' : '未匹配' }}</el-tag>
+          </template>
+        </el-table-column>
+        <el-table-column label="名称" min-width="160">
+          <template #default="{ row }">{{ row.proposed.name }}</template>
+        </el-table-column>
+        <el-table-column label="当前状态" width="120">
+          <template #default="{ row }">
+            <el-tag v-if="row.current" :type="statusMeta(row.current.confirm_status).type">
+              {{ statusMeta(row.current.confirm_status).label }}
+            </el-tag>
+            <span v-else>-</span>
+          </template>
+        </el-table-column>
+        <el-table-column label="AI 建议状态" width="130">
+          <template #default="{ row }">
+            <el-tag :type="statusMeta(row.proposed.judgement).type">{{ statusMeta(row.proposed.judgement).label }}</el-tag>
+          </template>
+        </el-table-column>
+        <el-table-column label="风险" width="90">
+          <template #default="{ row }">{{ row.proposed.risk_level }}</template>
+        </el-table-column>
+        <el-table-column label="标签变更" min-width="190">
+          <template #default="{ row }">
+            <div class="muted">当前</div>
+            <el-tag v-for="tag in row.current?.tags || []" :key="`current-${tag}`" type="info" style="margin: 2px">
+              {{ tag }}
+            </el-tag>
+            <span v-if="!row.current?.tags?.length" class="muted">无</span>
+            <div class="muted" style="margin-top: 6px">AI 建议</div>
+            <el-tag v-for="tag in row.proposed.tags || []" :key="`proposed-${tag}`" type="success" style="margin: 2px">
+              {{ tag }}
+            </el-tag>
+            <span v-if="!row.proposed.tags?.length" class="muted">无</span>
+          </template>
+        </el-table-column>
+        <el-table-column label="说明/原因/建议" min-width="300">
+          <template #default="{ row }">
+            <div><strong>说明:</strong>{{ row.proposed.description || '-' }}</div>
+            <div><strong>原因:</strong>{{ row.proposed.reason || '-' }}</div>
+            <div><strong>建议:</strong>{{ row.proposed.suggestion || '-' }}</div>
+          </template>
+        </el-table-column>
+      </el-table>
+      <el-collapse style="margin-top: 12px">
+        <el-collapse-item title="查看 AI 原始输出" name="raw">
+          <pre class="raw-output">{{ aiResult?.raw_output || '' }}</pre>
+        </el-collapse-item>
+      </el-collapse>
+      <template #footer>
+        <el-button @click="aiPreviewDialog = false">取消</el-button>
+        <el-button type="primary" :disabled="!(aiResult?.items || []).length" @click="confirmAiImport">确认更新入库</el-button>
+      </template>
+    </el-dialog>
   </div>
 </template>
 
 <script setup>
-import { onMounted, reactive, ref } from 'vue'
+import { computed, onMounted, reactive, ref } from 'vue'
 import { ElMessage, ElMessageBox } from 'element-plus'
 import { api, statusMeta, statusOptions } from '../api'
 
@@ -173,14 +261,21 @@ const detailDialog = ref(false)
 const promptDialog = ref(false)
 const importDialog = ref(false)
 const tagDialog = ref(false)
+const aiDialog = ref(false)
+const aiPreviewDialog = ref(false)
+const aiLoading = ref(false)
 const promptText = ref('')
 const importJson = ref('')
 const current = ref(null)
 const currentStatus = ref('PENDING')
 const note = ref('')
 const allTags = ref([])
+const aiProviders = ref([])
+const aiModels = ref([])
 const tagSelection = ref([])
 const tagTarget = ref(null)
+const aiScope = ref('pending')
+const aiResult = ref(null)
 const query = reactive({
   keyword: '',
   confirm_status: props.confirmStatus || '',
@@ -190,6 +285,14 @@ const query = reactive({
   page: 1,
   page_size: 20,
 })
+const aiForm = reactive({
+  provider_id: null,
+  model_id: null,
+  temperature: 0.2,
+})
+
+const enabledProviders = computed(() => aiProviders.value.filter((item) => item.enabled))
+const providerModels = computed(() => aiModels.value.filter((item) => item.provider_id === aiForm.provider_id))
 
 async function load() {
   query.confirm_status = props.confirmStatus || query.confirm_status
@@ -216,6 +319,26 @@ async function loadTags() {
   allTags.value = data.items
 }
 
+async function loadAiOptions() {
+  const [providerResult, modelResult] = await Promise.all([
+    api.get('/api/ai/providers'),
+    api.get('/api/ai/models'),
+  ])
+  aiProviders.value = providerResult.data.items
+  aiModels.value = modelResult.data.items
+  if (!aiForm.provider_id || !enabledProviders.value.some((item) => item.id === aiForm.provider_id)) {
+    aiForm.provider_id = enabledProviders.value[0]?.id || null
+  }
+  selectDefaultAiModel()
+}
+
+function selectDefaultAiModel() {
+  const available = providerModels.value
+  if (!available.some((item) => item.id === aiForm.model_id)) {
+    aiForm.model_id = available.find((item) => item.is_default)?.id || available[0]?.id || null
+  }
+}
+
 async function singleUpdate(row, confirm_status) {
   await api.patch(`${basePath}/${row.id}`, { confirm_status, user_note: row.user_note })
   ElMessage.success('已更新')
@@ -292,6 +415,67 @@ async function importAi() {
   await load()
 }
 
+async function openAiAnalyze(scope) {
+  await loadAiOptions()
+  if (!enabledProviders.value.length) {
+    ElMessage.warning('请先在 AI 配置中新增并启用服务商')
+    return
+  }
+  if (!providerModels.value.length) {
+    ElMessage.warning('请先为当前 AI 服务商新增模型')
+    return
+  }
+  aiScope.value = scope
+  aiDialog.value = true
+}
+
+async function runAiAnalyze() {
+  if (!aiForm.provider_id || !aiForm.model_id) {
+    ElMessage.warning('请选择 AI 服务商和模型')
+    return
+  }
+  aiLoading.value = true
+  try {
+    const payload = aiScope.value === 'selected'
+      ? {
+          scope: 'selected',
+          ids: selected.value.map((row) => row.id),
+          provider_id: aiForm.provider_id,
+          model_id: aiForm.model_id,
+          temperature: aiForm.temperature,
+        }
+      : {
+          scope: 'pending',
+          provider_id: aiForm.provider_id,
+          model_id: aiForm.model_id,
+          temperature: aiForm.temperature,
+        }
+    const { data } = await api.post(`${basePath}/analyze-ai`, payload)
+    aiResult.value = data
+    aiDialog.value = false
+    aiPreviewDialog.value = true
+  } catch (error) {
+    ElMessage.error(formatError(error, 'AI 分析失败'))
+  } finally {
+    aiLoading.value = false
+  }
+}
+
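+async function confirmAiImport() {
+  const items = aiResult.value?.items || []
+  try {
+    const { data } = await api.post(`${basePath}/import-ai`, { items })
+    ElMessage.success(`AI 结果已确认,更新 ${data.updated} 条`)
+  } catch (error) {
+    // Keep the preview open so the AI result is not lost when the import fails.
+    ElMessage.error(formatError(error, 'AI 结果导入失败'))
+    return
+  }
+  aiPreviewDialog.value = false
+  aiResult.value = null
+  await load()
+}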
+
+function formatError(error, fallback) {
+  const detail = error.response?.data?.detail
+  if (!detail) return fallback
+  return typeof detail === 'string' ? detail : JSON.stringify(detail)
+}
+
 function controlActions(row) {
   if (!row.can_control) return []
   if (props.type === 'service' && row.is_present_now) {
@@ -324,6 +508,7 @@ async function control(row, action) {
 defineExpose({ load })
 onMounted(async () => {
   await loadTags()
+  await loadAiOptions()
   await load()
 })
 </script>

+ 180 - 0
frontend/src/styles.css

@@ -150,6 +150,170 @@ body {
   margin-left: 0;
 }
 
+.automation-workspace {
+  display: grid;
+  grid-template-columns: minmax(0, 1fr) 420px;
+  gap: 14px;
+}
+
+.automation-main,
+.automation-side {
+  min-width: 0;
+}
+
+.automation-side {
+  display: flex;
+  flex-direction: column;
+  gap: 14px;
+}
+
+.side-section {
+  border-bottom: 1px solid #e5e7eb;
+  padding-bottom: 12px;
+}
+
+.side-section:last-child {
+  border-bottom: 0;
+  padding-bottom: 0;
+}
+
+.record-row,
+.action-grid {
+  display: flex;
+  gap: 8px;
+  flex-wrap: wrap;
+  align-items: center;
+}
+
+.screenshot-stage {
+  position: relative;
+  display: grid;
+  place-items: center;
+  min-height: 620px;
+  overflow: hidden;
+  border: 1px solid #d1d5db;
+  border-radius: 8px;
+  background: #111827;
+}
+
+.screenshot-stage.small {
+  min-height: 420px;
+}
+
+.screenshot-canvas {
+  position: relative;
+  width: 100%;
+  max-height: 720px;
+}
+
+.screenshot-stage.small .screenshot-canvas {
+  max-height: 480px;
+}
+
+.screenshot-image {
+  display: block;
+  width: 100%;
+  height: 100%;
+  object-fit: fill;
+}
+
+.screenshot-empty {
+  color: #9ca3af;
+  font-size: 16px;
+}
+
+.element-marker {
+  position: absolute;
+  transform: translate(-50%, -50%);
+  min-width: 24px;
+  height: 24px;
+  border: 2px solid #fff;
+  border-radius: 999px;
+  background: #dc2626;
+  color: #fff;
+  font-size: 12px;
+  font-weight: 700;
+  line-height: 19px;
+  text-align: center;
+  box-shadow: 0 2px 8px rgb(0 0 0 / 35%);
+  cursor: default;
+}
+
+.key-capture {
+  min-height: 140px;
+  padding: 18px;
+  border: 1px dashed #9ca3af;
+  border-radius: 8px;
+  outline: none;
+}
+
+.key-capture:focus {
+  border-color: #409eff;
+}
+
+.key-list {
+  display: flex;
+  gap: 8px;
+  flex-wrap: wrap;
+  margin-top: 14px;
+}
+
+.screen-detail-layout {
+  display: grid;
+  grid-template-columns: minmax(0, 1.2fr) minmax(320px, 0.8fr);
+  gap: 14px;
+}
+
+.node-editor {
+  display: flex;
+  flex-direction: column;
+  gap: 10px;
+  max-height: 460px;
+  overflow: auto;
+}
+
+.node-row {
+  display: grid;
+  grid-template-columns: 42px 150px 190px 130px minmax(220px, 1fr) 72px;
+  gap: 8px;
+  align-items: center;
+  padding: 10px;
+  border: 1px solid #e5e7eb;
+  border-radius: 8px;
+  background: #fff;
+}
+
+.node-order {
+  display: grid;
+  place-items: center;
+  width: 30px;
+  height: 30px;
+  border-radius: 999px;
+  background: #eef2ff;
+  color: #3730a3;
+  font-weight: 700;
+}
+
+.node-config textarea {
+  font-family: Consolas, monospace;
+}
+
+.error-images {
+  display: grid;
+  grid-template-columns: repeat(2, minmax(0, 1fr));
+  gap: 14px;
+  margin-top: 14px;
+}
+
+.error-image {
+  width: 100%;
+  max-height: 520px;
+  object-fit: contain;
+  border: 1px solid #e5e7eb;
+  border-radius: 8px;
+  background: #111827;
+}
+
 @media (max-width: 760px) {
   .app-shell {
     display: block;
@@ -162,4 +326,20 @@ body {
   .main {
     padding: 14px;
   }
+
+  .automation-workspace,
+  .screen-detail-layout,
+  .error-images {
+    grid-template-columns: 1fr;
+  }
+
+  .node-row {
+    grid-template-columns: 1fr;
+  }
+}
+
+@media (max-width: 1180px) {
+  .automation-workspace {
+    grid-template-columns: 1fr;
+  }
 }

+ 20 - 0
task.md

@@ -35,6 +35,22 @@
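 - [x] Hide control actions based on confirmation status and the "uncontrollable" tag
 - [x] Add tag descriptions to the AI prompt, including the tags that already exist in the system
 - [x] Add sorting to the Windows service and process list queries
+- [x] Add an AI service module that decouples the project's AI service from the vendor-specific API integrations
+- [x] Add the OpenAI / OpenAI-compatible API integration module
+- [x] Add the Google Gemini API integration module
+- [x] Add SQLite configuration tables for AI providers and AI models
+- [x] Add endpoints for AI provider management, AI model management, and AI service testing
+- [x] Add the frontend AI configuration menu with provider management, model management, and test pages
+- [x] For pending services/processes, add a flow that calls AI analysis directly, previews the differences, and writes to the database after confirmation
+- [x] Keep the original flows for copying the AI prompt and importing AI JSON
+- [x] Add a tags field to the AI output JSON; on import, automatically create missing tags and bind them to the service/process
+- [x] Add a Windows automation module supporting shutdown, restart, starting/closing programs, screenshots, and mouse and keyboard operations
+- [x] Add Windows automation APIs for later, more complex automation tasks to call
+- [x] Add documentation for the Windows automation endpoints
+- [x] Add the base tables for AI visual automation: recognized screens, screen elements, automation workflows, workflow nodes, and automation error records
+- [x] Extend the AI service module to support vision calls with image input for OpenAI / OpenAI-compatible and Gemini
+- [x] Add endpoints for AI visual screen recognition, screen comparison checks, automation action execution, workflow management, and error record queries
+- [x] Add the frontend automation menu with the automation operations, automation workflows, recognized screens, and automation error records pages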
 
 ## Progress log
 
@@ -47,3 +63,7 @@
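 - 2026-05-09: Started adding tags, service/process control, and AI tag context.
 - 2026-05-09: Completed tag management, service/process tag editing, service/process control endpoints and frontend buttons, and AI tag context; verified compilation, the build, and the core endpoints; added protection for core system processes.
 - 2026-05-10: Added remote sorting to the service and process lists; the frontend supports sorting by clicking column headers.
+- 2026-05-10: Completed AI provider/model configuration, the OpenAI / OpenAI-compatible / Gemini call modules, the AI test page, and the direct AI analysis and confirmation flow for pending services/processes; verified Python compilation, the frontend build, the AI configuration endpoints, and the pages in a browser.
+- 2026-05-10: Added a tags field to the AI prompt and import format; on import, the backend can automatically create missing tags and update the service/process tag associations, and the frontend confirmation dialog shows the tag changes.
+- 2026-05-10: Started and completed the Windows automation module, adding endpoints for shutdown/restart, program start/close, screenshots, and pyautogui mouse and keyboard operations; updated the backend dependencies and the API documentation.
+- 2026-05-10: Completed the basic AI visual automation features: recognizing screenshots and saving screen elements, comparing the screen before an action, error records, and saving automation workflows with node management; added four automation menu pages to the frontend and verified backend compilation, the frontend build, and page rendering.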
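
The tag-change preview mentioned in the log entries above can be illustrated with a small sketch. It assumes each preview row carries the `current.tags` and `proposed.tags` arrays that the confirmation dialog in `ItemTable.vue` renders. The helper name `diffTags` and the sample row are hypothetical and not part of this commit, and whether the backend actually removes tags missing from the proposal is not stated in the source.

```javascript
// Hypothetical helper, not part of the commit: computes which tags an AI
// proposal would add or remove for a single preview row.
function diffTags(row) {
  // Rows without a matched record have no `current`, so fall back to empty lists.
  const current = new Set(row.current?.tags || [])
  const proposed = new Set(row.proposed?.tags || [])
  return {
    added: [...proposed].filter((tag) => !current.has(tag)),
    removed: [...current].filter((tag) => !proposed.has(tag)),
  }
}

// Sample row shaped like the preview entries the dialog displays (values invented).
const sampleRow = {
  matched: true,
  current: { tags: ['system'] },
  proposed: { name: 'ExampleService', tags: ['system', 'network'] },
}

console.log(diffTags(sampleRow))
```

A diff like this could back a more compact "tag changes" column, showing only added and removed tags instead of both full lists.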