
Add visual automation workflows and remote control UI

codex 1 week ago
Parent
commit
a8808dd83a
53 files changed with 4869 additions and 145 deletions
  1. 1 0
      .gitignore
  2. 124 11
      api-docs.md
  3. 17 2
      backend/app/automation/nodes/__init__.py
  4. 122 0
      backend/app/automation/nodes/browser_control.py
  5. 64 0
      backend/app/automation/nodes/media.py
  6. 393 0
      backend/app/automation/nodes/research.py
  7. 302 0
      backend/app/automation/nodes/video.py
  8. 412 0
      backend/app/automation/nodes/vision.py
  9. 490 0
      backend/app/automation/nodes/web_search.py
  10. 4 0
      backend/app/automation/registry.py
  11. 270 7
      backend/app/automation_service.py
  12. 33 1
      backend/app/database.py
  13. 95 15
      backend/app/main.py
  14. 5 0
      backend/app/schemas.py
  15. 268 0
      backend/app/workflow_task_service.py
  16. 1 0
      backend/requirements.txt
  17. 34 0
      backend/tests/test_automation_token.py
  18. 59 0
      backend/tests/test_video_workflows.py
  19. 52 0
      backend/tests/test_vision_node.py
  20. 196 0
      backend/tests/test_web_search.py
  21. 38 0
      backend/tests/test_workflow_node_registry.py
  22. 65 0
      backend/tests/test_workflow_zip.py
  23. 64 64
      deployment.md
  24. 86 6
      frontend/src/App.vue
  25. 45 0
      frontend/src/api.js
  26. 220 14
      frontend/src/components/AutomationWorkflowEditorPage.vue
  27. 123 0
      frontend/src/components/AutomationWorkflowTasksView.vue
  28. 114 6
      frontend/src/components/AutomationWorkflowView.vue
  29. 9 3
      frontend/src/components/SystemSettingsView.vue
  30. 178 1
      frontend/src/styles.css
  31. 5 0
      start.cmd
  32. 205 0
      start.ps1
  33. 5 0
      stop.cmd
  34. 97 0
      stop.ps1
  35. 3 0
      task.md
  36. 48 15
      workflow-format.md
  37. 105 0
      workflows/ai-web-research.workflow.json
  38. 52 0
      workflows/bilibili-home-random-video.workflow.json
  39. 58 0
      workflows/bilibili-up-latest-video.workflow.json
  40. 48 0
      workflows/douyin-next-video.workflow.json
  41. 58 0
      workflows/douyin-random-video.workflow.json
  42. 17 0
      workflows/entertainment-browser-back.workflow.json
  43. 17 0
      workflows/entertainment-close-browser.workflow.json
  44. 17 0
      workflows/entertainment-escape.workflow.json
  45. 17 0
      workflows/entertainment-fullscreen.workflow.json
  46. 17 0
      workflows/entertainment-mute.workflow.json
  47. 17 0
      workflows/entertainment-next.workflow.json
  48. 17 0
      workflows/entertainment-play-pause.workflow.json
  49. 17 0
      workflows/entertainment-previous.workflow.json
  50. 17 0
      workflows/entertainment-volume-down.workflow.json
  51. 17 0
      workflows/entertainment-volume-up.workflow.json
  52. 65 0
      workflows/youtube-channel-latest-video.workflow.json
  53. 66 0
      workflows/youtube-home-random-video.workflow.json

+ 1 - 0
.gitignore

@@ -6,6 +6,7 @@ __pycache__/
 .venv/
 venv/
 env/
+.runtime/
 .env
 .env.*
 !.env.example

+ 124 - 11
api-docs.md

@@ -99,7 +99,15 @@ The AI module is divided into the project's internal unified AI service layer and the provider-specific API integration layer
 }
 ```
 
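-Paths must be relative; the backend resolves them under the `backend/data` directory. `automation_remote_token` is used to run workflows remotely by key and is passed in the `X-Automation-Token` request header on remote calls.
+Paths must be relative; the backend resolves them under the `backend/data` directory. `automation_remote_token` is used to run workflows remotely and to query workflow task status. Once it is configured, callers must pass the token in one of the following ways: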
+
+```text
+X-Automation-Token: your-remote-token
+Authorization: Bearer your-remote-token
+?automation_token=your-remote-token
+```
+
+iOS Shortcuts should prefer a request header; the query parameter is a fallback for clients that cannot set request headers.
 
 ### Query AI providers
 
@@ -820,7 +828,11 @@ smartctl -a -d jmb39x,1 /dev/sdb
 
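 Returns the node definitions registered in the backend; the frontend can use them to build the node library and parameter forms. Each node includes `type`, `category`, `label`, `params`, `inputs`, `outputs`, and `control_ports`.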
 
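-Common node examples: `browser.open_url` opens a web page, `wait.seconds` waits, `mouse.double_click` double-clicks a coordinate, and `text.input` types text.
+Common node examples: `browser.open_url` opens a web page, `browser.web_search` performs web search research with a real browser and a multimodal model, `wait.seconds` waits, `mouse.double_click` double-clicks a coordinate, and `text.input` types text.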
+
+`POST /api/automation/workflows/templates/web-search`
+
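+Creates or reads the AI multi-round web research workflow whose key is `ai-web-research`. The workflow has the AI plan query terms, run visual searches, validate the result against the objective and the JSON Schema, and continue with another round when the objective is not yet met.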
 
 `POST /api/automation/workflows/plan`
 
@@ -948,11 +960,64 @@ smartctl -a -d jmb39x,1 /dev/sdb
 
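 An edge's `kind` is either `control` or `data`. `control` edges determine execution order; `data` edges write the source node's output into the target node's input. See `workflow-format.md` for the full format.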
 
+`POST /api/automation/workflows/import`
+
+Imports a portable workflow JSON:
+
+```json
+{
+  "workflow": {
+    "schema_version": "workflow/v1",
+    "workflow_key": "ai-web-research",
+    "name": "AI 多轮网页搜索研究",
+    "variables": {},
+    "settings": {},
+    "nodes": [],
+    "edges": []
+  },
+  "conflict_strategy": "replace"
+}
+```
+
+`conflict_strategy` can be `error` or `replace`. The template bundled with the repository is at `workflows/ai-web-research.workflow.json`.
+
+`GET /api/automation/workflows/export.zip`
+
+Exports all workflows in bulk and returns `application/zip`. The ZIP contains:
+
+- `manifest.json`: export time, count, and file list.
+- `workflows/{workflow_key}.workflow.json`: the portable JSON for each workflow, without database IDs or timestamp fields.
+
+`POST /api/automation/workflows/import.zip`
+
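+Imports a ZIP in bulk. Send the raw ZIP bytes as the request body with `Content-Type` set to `application/zip`. Each workflow JSON is validated individually during import; if the target device already has a workflow with the same `workflow_key`, that entry is skipped rather than overwritten.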
+
+Example response:
+
+```json
+{
+  "created_count": 2,
+  "skipped_count": 1,
+  "failed_count": 0,
+  "created": [
+    {"path": "workflows/a.workflow.json", "id": 12, "workflow_key": "a", "name": "A"}
+  ],
+  "skipped": [
+    {"path": "workflows/b.workflow.json", "workflow_key": "b", "reason": "workflow_key already exists"}
+  ],
+  "failed": []
+}
+```
+
+`GET /api/automation/workflows/by-key/{workflow_key}/export`
+
+Exports `workflow/v1` JSON without device-specific fields such as the database ID and creation time, so it can be imported directly on another device.
+
 `GET /api/automation/workflows/{workflow_id}`
 
 `GET /api/automation/workflows/by-key/{workflow_key}`
 
-`POST /api/automation/workflows/{workflow_id}/run`
+`POST /api/automation/workflows/by-key/{workflow_key}/run`
 
 ```json
 {
@@ -960,31 +1025,79 @@ smartctl -a -d jmb39x,1 /dev/sdb
   "model_id": null,
   "temperature": null,
   "variables": {
-    "target_text": "Override value for this run"
+    "objective": "Research the main services offered by the official Python website",
+    "output_schema": {
+      "type": "object",
+      "required": ["official_url", "services"],
+      "properties": {
+        "official_url": {"type": "string"},
+        "services": {"type": "array", "items": {"type": "string"}}
+      }
+    },
+    "constraints": {
+      "language": "zh-CN",
+      "min_sources": 1,
+      "required_domains": ["python.org"]
+    },
+    "max_attempts": 3
   }
 }
 ```
 
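-If no AI parameters are passed, the backend uses the default AI provider, model, and temperature from the system settings. The backend starts from `flow.start` or from nodes with no incoming edges and executes along `control` edges. Node outputs are written to the run context so that later nodes can read them through `node_output` or `data` edges. When a node fails, the returned item includes `artifacts.screenshot_path` where possible, so the frontend can show the failure screenshot and continue asking the user.
+The endpoint returns HTTP `202`; it only creates an asynchronous task and does not wait for the workflow to finish: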
 
-`POST /api/automation/workflows/by-key/{workflow_key}/run`
+```json
+{
+  "id": "task-uuid",
+  "workflow_key": "ai-web-research",
+  "status": "QUEUED",
+  "queue_position": 1,
+  "created_at": "2026-06-15T10:00:00+08:00"
+}
+```
 
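-Runs a workflow remotely by its stable key. This endpoint requires `automation_remote_token` to be configured in the system settings first, and the request must carry the header:
+All workflows share one global execution queue: at most one task is `RUNNING` at any time, and the rest wait in order of creation time. If `automation_remote_token` is configured in the system settings, remote execution and task queries must carry the token, in any of the following ways: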
 
 ```text
 X-Automation-Token: your-remote-token
+Authorization: Bearer your-remote-token
+GET /api/automation/workflow-tasks/{task_id}?automation_token=your-remote-token
 ```
 
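-The request body is the same as for running by ID:
+Calls are allowed when no token is configured, but setting one is recommended for production deployments. After the token is saved on the frontend system settings page, a copy is stored locally in the current browser and `X-Automation-Token` is attached automatically to subsequent requests.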
+
+`GET /api/automation/workflow-tasks/{task_id}`
+
+Gets the status and the final data:
 
 ```json
 {
-  "variables": {
-    "channel_url": "https://tv.cctv.com/live/cctv5"
-  }
+  "id": "task-uuid",
+  "workflow_key": "ai-web-research",
+  "status": "SUCCESS",
+  "queue_position": null,
+  "request": {},
+  "return_data": {
+    "status": "GOAL_ACHIEVED",
+    "goal_achieved": true,
+    "data": {},
+    "validation": {"schema_valid": true, "constraints_valid": true, "valid": true},
+    "sources": [],
+    "attempts": []
+  },
+  "result": {},
+  "created_at": "...",
+  "started_at": "...",
+  "finished_at": "..."
 }
 ```
 
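+Statuses are `QUEUED`, `RUNNING`, `SUCCESS`, `FAILED`, and `PAUSED`. `return_data` is specified by the workflow's `settings.return` and is meant for direct consumption by external programs; `result` keeps the full node execution record for auditing.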
+
+`GET /api/automation/workflow-tasks?page=1&page_size=20&status=RUNNING`
+
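+Paginated query of the automation task history. The synchronous run-by-ID endpoint has been removed; the database ID is now used only for editing and deleting workflows.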
+
 `PUT /api/automation/workflows/{workflow_id}`
 
 `DELETE /api/automation/workflows/{workflow_id}`
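
Reviewer sketch for the api-docs.md changes above: the new run endpoint returns `202` with a task, so a caller has to poll `GET /api/automation/workflow-tasks/{task_id}` for the result. The following is a minimal client sketch under stated assumptions, not part of the commit. `run_and_wait`, `http_json`, `auth_headers`, the `send` hook, and the choice to stop polling on `PAUSED` are all hypothetical; only the paths, the header name, and the status values come from the docs.

```python
from __future__ import annotations

import json
import time
import urllib.request
from typing import Any, Callable

# Assumption: PAUSED is treated as "stop polling" because it likely needs a human.
STOP_STATUSES = {"SUCCESS", "FAILED", "PAUSED"}


def auth_headers(token: str | None) -> dict[str, str]:
    """Return the X-Automation-Token header, or nothing when no token is set."""
    return {"X-Automation-Token": token} if token else {}


def http_json(method: str, url: str, headers: dict[str, str], body: Any = None) -> dict[str, Any]:
    """Send one JSON request with urllib and decode the JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json", **headers},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def run_and_wait(
    base_url: str,
    workflow_key: str,
    variables: dict[str, Any],
    token: str | None = None,
    send: Callable[..., dict[str, Any]] = http_json,
    poll_seconds: float = 2.0,
    max_polls: int = 300,
) -> dict[str, Any]:
    """Queue a workflow by key, then poll its task until it stops."""
    headers = auth_headers(token)
    run_url = f"{base_url}/api/automation/workflows/by-key/{workflow_key}/run"
    task = send("POST", run_url, headers, {"variables": variables})
    task_url = f"{base_url}/api/automation/workflow-tasks/{task['id']}"
    for _ in range(max_polls):
        task = send("GET", task_url, headers)
        if task["status"] in STOP_STATUSES:
            return task
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task['id']} did not finish after {max_polls} polls")
```

Passing a fake `send` callable makes the loop testable without a running backend; a real caller would read `return_data` from the returned task.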

+ 17 - 2
backend/app/automation/nodes/__init__.py

@@ -1,3 +1,18 @@
-from . import flow, human, keyboard, mouse, program, screen, text, wait
+from . import browser_control, flow, human, keyboard, media, mouse, program, research, screen, text, video, vision, wait, web_search
 
-__all__ = ["flow", "human", "keyboard", "mouse", "program", "screen", "text", "wait"]
+__all__ = [
+    "browser_control",
+    "flow",
+    "human",
+    "keyboard",
+    "media",
+    "mouse",
+    "program",
+    "research",
+    "screen",
+    "text",
+    "video",
+    "vision",
+    "wait",
+    "web_search",
+]

+ 122 - 0
backend/app/automation/nodes/browser_control.py

@@ -0,0 +1,122 @@
+from __future__ import annotations
+
+import time
+from typing import Any
+
+from ... import windows_automation
+from ..context import WorkflowContext
+from ..registry import control_ports, field_def, register_node
+
+
+def _number(value: Any, default: float = 0) -> float:
+    try:
+        return float(value)
+    except (TypeError, ValueError):
+        return default
+
+
+def _boolean(value: Any, default: bool = False) -> bool:
+    if value in (None, ""):
+        return default
+    if isinstance(value, str):
+        return value.strip().lower() in {"1", "true", "yes", "y", "on"}
+    return bool(value)
+
+
+def ensure_browser_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = node.get("params", {})
+    browser = str(inputs.get("browser", params.get("browser")) or "edge")
+    url = str(inputs.get("url", params.get("url")) or "").strip()
+    maximize = _boolean(inputs.get("maximize", params.get("maximize")), True)
+    wait_seconds = max(0, min(_number(inputs.get("wait_seconds", params.get("wait_seconds")), 2), 30))
+
+    if url:
+        opened = windows_automation.open_url(url=url, browser=browser, new_window=bool(params.get("new_window", True)))
+    elif browser.lower() in {"edge", "msedge"}:
+        opened = windows_automation.start_program("cmd.exe /c start msedge", shell=True)
+    else:
+        opened = windows_automation.start_program("cmd.exe /c start chrome", shell=True)
+    context.remember_pid(opened.get("pid"))
+    if wait_seconds:
+        time.sleep(wait_seconds)
+    if maximize:
+        windows_automation.keyboard_action("hotkey", keys=["win", "up"])
+    return {"action": "ensure_browser", "browser": browser, "url": url, "opened": opened, "maximized": maximize}
+
+
+def browser_key_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = node.get("params", {})
+    action = str(inputs.get("action", params.get("action")) or "fullscreen")
+    if action == "fullscreen":
+        result = windows_automation.keyboard_action("press", key="f11")
+    elif action == "back":
+        result = windows_automation.keyboard_action("hotkey", keys=["alt", "left"])
+    elif action == "forward":
+        result = windows_automation.keyboard_action("hotkey", keys=["alt", "right"])
+    elif action == "refresh":
+        result = windows_automation.keyboard_action("press", key="f5")
+    elif action == "escape":
+        result = windows_automation.keyboard_action("press", key="escape")
+    elif action == "close_tab":
+        result = windows_automation.keyboard_action("hotkey", keys=["ctrl", "w"])
+    elif action == "close_window":
+        result = windows_automation.keyboard_action("hotkey", keys=["alt", "f4"])
+    else:
+        result = {"action": "browser_noop", "key": None}
+    return {"action": action, "keyboard": result}
+
+
+register_node(
+    {
+        "type": "browser.ensure_foreground",
+        "category": "browser",
+        "label": "确保浏览器前台",
+        "description": "打开或唤起浏览器,可选打开指定网址,并尝试最大化窗口。",
+        "params": {
+            "url": field_def("text", "网址", "", description="为空时仅尝试启动浏览器。"),
+            "browser": field_def("select", "浏览器", "edge", options=["default", "edge"]),
+            "new_window": field_def("boolean", "新窗口", True),
+            "maximize": field_def("boolean", "最大化", True),
+            "wait_seconds": field_def("number", "等待秒数", 2, minimum=0, maximum=30),
+        },
+        "inputs": {
+            "url": field_def("string", "网址"),
+            "browser": field_def("string", "浏览器"),
+            "maximize": field_def("boolean", "最大化"),
+            "wait_seconds": field_def("number", "等待秒数"),
+        },
+        "outputs": {
+            "action": {"type": "string", "label": "动作"},
+            "browser": {"type": "string", "label": "浏览器"},
+            "url": {"type": "string", "label": "网址"},
+            "opened": {"type": "object", "label": "启动结果"},
+            "maximized": {"type": "boolean", "label": "是否最大化"},
+        },
+        "control_ports": control_ports(),
+    },
+    ensure_browser_node,
+)
+
+register_node(
+    {
+        "type": "browser.control",
+        "category": "browser",
+        "label": "浏览器快捷控制",
+        "description": "对当前浏览器窗口执行全屏、返回、刷新、关闭标签页等快捷操作。",
+        "params": {
+            "action": field_def(
+                "select",
+                "动作",
+                "fullscreen",
+                options=["fullscreen", "back", "forward", "refresh", "escape", "close_tab", "close_window"],
+            ),
+        },
+        "inputs": {"action": field_def("string", "动作")},
+        "outputs": {
+            "action": {"type": "string", "label": "动作"},
+            "keyboard": {"type": "object", "label": "键盘结果"},
+        },
+        "control_ports": control_ports(),
+    },
+    browser_key_node,
+)
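
Reviewer sketch for browser_control.py: the `if/elif` chain in `browser_key_node` could also be written as a lookup table, which keeps the `options` list and the handler from drifting apart. This is one possible layout rather than the commit's code. `BROWSER_KEY_ACTIONS` and `run_browser_action` are hypothetical names, and `keyboard_action` is injected here instead of importing `windows_automation`.

```python
from __future__ import annotations

from typing import Any, Callable

# Same shortcuts as the if/elif chain in browser_key_node, expressed as data.
BROWSER_KEY_ACTIONS: dict[str, tuple[str, dict[str, Any]]] = {
    "fullscreen": ("press", {"key": "f11"}),
    "back": ("hotkey", {"keys": ["alt", "left"]}),
    "forward": ("hotkey", {"keys": ["alt", "right"]}),
    "refresh": ("press", {"key": "f5"}),
    "escape": ("press", {"key": "escape"}),
    "close_tab": ("hotkey", {"keys": ["ctrl", "w"]}),
    "close_window": ("hotkey", {"keys": ["alt", "f4"]}),
}


def run_browser_action(
    action: str,
    keyboard_action: Callable[..., dict[str, Any]],
) -> dict[str, Any]:
    """Dispatch one action; unknown actions become a no-op, as in the diff."""
    entry = BROWSER_KEY_ACTIONS.get(action)
    if entry is None:
        return {"action": action, "keyboard": {"action": "browser_noop", "key": None}}
    mode, kwargs = entry
    return {"action": action, "keyboard": keyboard_action(mode, **kwargs)}
```

With this layout the `options` list for the `select` field could be derived as `list(BROWSER_KEY_ACTIONS)`.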

+ 64 - 0
backend/app/automation/nodes/media.py

@@ -0,0 +1,64 @@
+from __future__ import annotations
+
+from typing import Any
+
+from ... import windows_automation
+from ..context import WorkflowContext
+from ..registry import control_ports, field_def, register_node
+
+
+MEDIA_KEY_MAP = {
+    "play_pause": "space",
+    "site_fullscreen": "f",
+    "mute": "m",
+    "volume_up": "volumeup",
+    "volume_down": "volumedown",
+    "system_mute": "volumemute",
+    "next": "down",
+    "previous": "up",
+    "escape": "escape",
+}
+
+
+def media_control_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = node.get("params", {})
+    action = str(inputs.get("action", params.get("action")) or "play_pause")
+    key = MEDIA_KEY_MAP.get(action, "space")
+    result = windows_automation.keyboard_action("press", key=key)
+    return {"action": action, "key": key, "keyboard": result}
+
+
+register_node(
+    {
+        "type": "media.control",
+        "category": "media",
+        "label": "媒体遥控",
+        "description": "对当前播放页面执行播放暂停、网页全屏、静音、音量和上下条等客厅遥控动作。",
+        "params": {
+            "action": field_def(
+                "select",
+                "动作",
+                "play_pause",
+                options=[
+                    "play_pause",
+                    "site_fullscreen",
+                    "mute",
+                    "volume_up",
+                    "volume_down",
+                    "system_mute",
+                    "next",
+                    "previous",
+                    "escape",
+                ],
+            )
+        },
+        "inputs": {"action": field_def("string", "动作")},
+        "outputs": {
+            "action": {"type": "string", "label": "动作"},
+            "key": {"type": "string", "label": "按键"},
+            "keyboard": {"type": "object", "label": "键盘结果"},
+        },
+        "control_ports": control_ports(),
+    },
+    media_control_node,
+)
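
Reviewer sketch for media.py: `MEDIA_KEY_MAP.get(action, "space")` means a misspelled action silently toggles play/pause. The sketch below shows the same lookup with an optional strict mode. It is an illustration under assumptions: `resolve_media_key` and its `strict` flag are hypothetical and do not exist in the commit.

```python
from __future__ import annotations

# Copied from the diff so the sketch is self-contained.
MEDIA_KEY_MAP = {
    "play_pause": "space",
    "site_fullscreen": "f",
    "mute": "m",
    "volume_up": "volumeup",
    "volume_down": "volumedown",
    "system_mute": "volumemute",
    "next": "down",
    "previous": "up",
    "escape": "escape",
}


def resolve_media_key(action: str, strict: bool = False) -> str:
    """Map an action to its key; optionally reject unknown actions."""
    if action in MEDIA_KEY_MAP:
        return MEDIA_KEY_MAP[action]
    if strict:
        raise ValueError(f"unknown media action: {action!r}")
    # Non-strict mode keeps the behavior in the diff: fall back to play/pause.
    return "space"
```

Whether to reject unknown actions is a design choice; the fallback is forgiving for remote-control clients, while strict mode surfaces typos in workflow JSON earlier.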

+ 393 - 0
backend/app/automation/nodes/research.py

@@ -0,0 +1,393 @@
+from __future__ import annotations
+
+import json
+from typing import Any
+from urllib.parse import urlparse
+
+from fastapi import HTTPException
+
+from ... import ai_service
+from ..context import WorkflowContext
+from ..registry import control_ports, field_def, register_node
+from .web_search import WebSearchRunner, _integer
+
+
+def parse_object(value: Any, field_name: str) -> dict[str, Any]:
+    """Accept both objects passed directly by the API and JSON text saved by the editor."""
+    if isinstance(value, dict):
+        return value
+    if isinstance(value, str) and value.strip():
+        try:
+            parsed = json.loads(value)
+        except json.JSONDecodeError as exc:
+            raise HTTPException(status_code=400, detail=f"{field_name} 不是有效 JSON 对象: {exc}") from exc
+        if isinstance(parsed, dict):
+            return parsed
+    raise HTTPException(status_code=400, detail=f"{field_name} 必须是 JSON 对象")
+
+
+def validate_json_data(data: Any, schema: dict[str, Any]) -> dict[str, Any]:
+    """Validate the final returned data against a standard JSON Schema."""
+    try:
+        from jsonschema import Draft202012Validator
+    except ImportError as exc:
+        raise HTTPException(status_code=500, detail="jsonschema is not installed") from exc
+
+    try:
+        Draft202012Validator.check_schema(schema)
+    except Exception as exc:
+        raise HTTPException(status_code=400, detail=f"output_schema 无效: {exc}") from exc
+
+    validator = Draft202012Validator(schema)
+    errors = sorted(validator.iter_errors(data), key=lambda item: list(item.absolute_path))
+    return {
+        "schema_valid": not errors,
+        "errors": [
+            {
+                "path": ".".join(str(part) for part in error.absolute_path),
+                "message": error.message,
+            }
+            for error in errors
+        ],
+    }
+
+
+def validate_research_result(
+    data: Any,
+    schema: dict[str, Any],
+    constraints: dict[str, Any],
+    evidence: list[dict[str, Any]],
+) -> dict[str, Any]:
+    """Combine JSON Schema validation with deterministic constraints such as source count and required domains."""
+    result = validate_json_data(data, schema)
+    constraint_errors: list[str] = []
+    sources = sources_from_evidence(evidence)
+    try:
+        min_sources = max(0, int(constraints.get("min_sources", 0)))
+    except (TypeError, ValueError):
+        min_sources = 0
+    if len(sources) < min_sources:
+        constraint_errors.append(f"来源数量 {len(sources)} 少于要求的 {min_sources}")
+
+    required_domains = constraints.get("required_domains") or []
+    if isinstance(required_domains, list):
+        source_hosts = {urlparse(item["url"]).netloc.lower() for item in sources}
+        for domain in required_domains:
+            normalized = str(domain).lower().strip()
+            if normalized and not any(host == normalized or host.endswith(f".{normalized}") for host in source_hosts):
+                constraint_errors.append(f"缺少必需来源域名: {normalized}")
+
+    result["constraints_valid"] = not constraint_errors
+    result["constraint_errors"] = constraint_errors
+    result["valid"] = result["schema_valid"] and result["constraints_valid"]
+    return result
+
+
+class AiWebResearchRunner:
+    """AI-driven state machine for multi-round visual web research."""
+
+    def __init__(self, context: WorkflowContext, params: dict[str, Any]) -> None:
+        if not context.provider_id or not context.model_id:
+            raise HTTPException(status_code=400, detail="AI 搜索研究节点需要配置默认 AI 服务商和模型")
+        self.context = context
+        self.params = params
+        self.objective = str(params.get("objective") or "").strip()
+        if not self.objective:
+            raise HTTPException(status_code=400, detail="研究目标不能为空")
+        self.output_schema = parse_object(params.get("output_schema"), "output_schema")
+        self.constraints = parse_object(params.get("constraints") or {}, "constraints")
+        self.max_attempts = _integer(params.get("max_attempts"), 3, 1, 10)
+        self.search_engine = str(params.get("search_engine") or "bing")
+        self.browser = str(params.get("browser") or "edge")
+        self.max_search_pages = _integer(params.get("max_search_pages"), 2, 1, 10)
+        self.result_count = _integer(params.get("result_count"), 2, 1, 5)
+        self.detail_max_pages = _integer(params.get("detail_max_pages"), 2, 1, 10)
+
+    def run(self) -> dict[str, Any]:
+        plan = self._create_plan()
+        pending_queries = self._normalize_queries(plan.get("queries"))
+        if not pending_queries:
+            pending_queries = [self.objective]
+
+        searched_queries: list[str] = []
+        evidence: list[dict[str, Any]] = []
+        attempts: list[dict[str, Any]] = []
+        latest_assessment: dict[str, Any] = {}
+        latest_data: Any = {}
+        latest_validation = validate_research_result(latest_data, self.output_schema, self.constraints, evidence)
+
+        for attempt_number in range(1, self.max_attempts + 1):
+            query = self._next_query(pending_queries, searched_queries, latest_assessment)
+            searched_queries.append(query)
+            search_output = WebSearchRunner(
+                self.context,
+                {
+                    "query": query,
+                    "search_engine": self.search_engine,
+                    "browser": self.browser,
+                    "max_search_pages": self.max_search_pages,
+                    "result_count": self.result_count,
+                    "detail_max_pages": self.detail_max_pages,
+                    "click_attempts": self.params.get("click_attempts", 2),
+                    "page_load_wait_seconds": self.params.get("page_load_wait_seconds", 8),
+                    "action_wait_seconds": self.params.get("action_wait_seconds", 1),
+                    "close_browser": True,
+                    "include_debug_analyses": False,
+                },
+            ).run()
+            attempt_evidence = compact_evidence(search_output)
+            evidence.extend(attempt_evidence)
+            latest_assessment = self._assess_progress(plan, searched_queries, evidence)
+            latest_data = latest_assessment.get("candidate_data")
+            latest_validation = validate_research_result(
+                latest_data,
+                self.output_schema,
+                self.constraints,
+                evidence,
+            )
+            goal_achieved = bool(latest_assessment.get("goal_achieved")) and latest_validation["valid"]
+            attempts.append(
+                {
+                    "attempt": attempt_number,
+                    "query": query,
+                    "search_result_count": search_output.get("result_count", 0),
+                    "researched_count": search_output.get("researched_count", 0),
+                    "sources": sources_from_evidence(attempt_evidence),
+                    "assessment": {
+                        "goal_achieved": bool(latest_assessment.get("goal_achieved")),
+                        "confidence": latest_assessment.get("confidence"),
+                        "reason": latest_assessment.get("reason"),
+                        "missing_information": latest_assessment.get("missing_information") or [],
+                    },
+                    "validation": latest_validation,
+                }
+            )
+            if goal_achieved:
+                return self._build_output(
+                    plan,
+                    attempts,
+                    evidence,
+                    latest_data,
+                    latest_validation,
+                    latest_assessment,
+                    True,
+                )
+            pending_queries.extend(self._normalize_queries(latest_assessment.get("next_queries")))
+
+        return self._build_output(
+            plan,
+            attempts,
+            evidence,
+            latest_data,
+            latest_validation,
+            latest_assessment,
+            False,
+        )
+
+    def _text_json(self, prompt: str) -> dict[str, Any]:
+        result = ai_service.chat(
+            int(self.context.provider_id),
+            int(self.context.model_id),
+            prompt,
+            self.context.temperature,
+        )
+        try:
+            parsed = json.loads(ai_service.extract_json_text(result["content"]))
+        except (json.JSONDecodeError, TypeError, ValueError) as exc:
+            raise HTTPException(status_code=502, detail=f"AI 研究模型未返回有效 JSON: {exc}") from exc
+        if not isinstance(parsed, dict):
+            raise HTTPException(status_code=502, detail="AI 研究模型返回值必须是 JSON 对象")
+        return parsed
+
+    def _create_plan(self) -> dict[str, Any]:
+        prompt = f"""请为一个使用真实浏览器和视觉截图的网页研究任务制定搜索计划。
+
+研究目标:
+{self.objective}
+
+最终输出 JSON Schema:
+{json.dumps(self.output_schema, ensure_ascii=False, indent=2)}
+
+约束:
+{json.dumps(self.constraints, ensure_ascii=False, indent=2)}
+
+最多尝试次数:{self.max_attempts}
+
+请严格只输出 JSON:
+{{
+  "summary": string,
+  "acceptance_criteria": [string],
+  "queries": [string],
+  "source_preferences": [string],
+  "risks": [string]
+}}
+
+queries 应按优先级排列,数量不超过最多尝试次数。"""
+        return self._text_json(prompt)
+
+    def _assess_progress(
+        self,
+        plan: dict[str, Any],
+        searched_queries: list[str],
+        evidence: list[dict[str, Any]],
+    ) -> dict[str, Any]:
+        prompt = f"""请评估网页研究任务是否已经达成,并生成符合指定 JSON Schema 的候选数据。
+
+研究目标:
+{self.objective}
+
+研究计划:
+{json.dumps(plan, ensure_ascii=False)}
+
+输出 JSON Schema:
+{json.dumps(self.output_schema, ensure_ascii=False, indent=2)}
+
+约束:
+{json.dumps(self.constraints, ensure_ascii=False)}
+
+已搜索查询:
+{json.dumps(searched_queries, ensure_ascii=False)}
+
+已获得证据:
+{json.dumps(evidence[-20:], ensure_ascii=False)}
+
+判断规则:
+1. 只有证据足以覆盖研究目标和计划中的验收标准时,goal_achieved 才能为 true。
+2. candidate_data 必须严格匹配给定 JSON Schema,不要添加 Schema 未允许的包装字段。
+3. 缺少信息时给出下一轮更精确、且与已搜索内容不同的查询词。
+4. 不要把搜索摘要中的推测当作已验证事实。
+
+严格只输出 JSON:
+{{
+  "goal_achieved": boolean,
+  "confidence": number,
+  "reason": string,
+  "missing_information": [string],
+  "next_queries": [string],
+  "candidate_data": object
+}}"""
+        return self._text_json(prompt)
+
+    def _next_query(
+        self,
+        pending_queries: list[str],
+        searched_queries: list[str],
+        assessment: dict[str, Any],
+    ) -> str:
+        searched = {item.strip().lower() for item in searched_queries}
+        while pending_queries:
+            query = pending_queries.pop(0).strip()
+            if query and query.lower() not in searched:
+                return query
+        missing = assessment.get("missing_information") or []
+        suffix = " ".join(str(item) for item in missing[:2])
+        return f"{self.objective} {suffix} 补充资料 第{len(searched_queries) + 1}轮".strip()
+
+    @staticmethod
+    def _normalize_queries(value: Any) -> list[str]:
+        if not isinstance(value, list):
+            return []
+        return [str(item).strip() for item in value if str(item).strip()]
+
+    def _build_output(
+        self,
+        plan: dict[str, Any],
+        attempts: list[dict[str, Any]],
+        evidence: list[dict[str, Any]],
+        data: Any,
+        validation: dict[str, Any],
+        assessment: dict[str, Any],
+        goal_achieved: bool,
+    ) -> dict[str, Any]:
+        return {
+            "status": "GOAL_ACHIEVED" if goal_achieved else "MAX_ATTEMPTS_REACHED",
+            "goal_achieved": goal_achieved,
+            "objective": self.objective,
+            "attempts_used": len(attempts),
+            "max_attempts": self.max_attempts,
+            "data": data,
+            "validation": validation,
+            "assessment": {
+                "confidence": assessment.get("confidence"),
+                "reason": assessment.get("reason"),
+                "missing_information": assessment.get("missing_information") or [],
+            },
+            "sources": sources_from_evidence(evidence),
+            "plan": plan,
+            "attempts": attempts,
+            "next_port": "success" if goal_achieved else "partial",
+        }
+
+
+def compact_evidence(search_output: dict[str, Any]) -> list[dict[str, Any]]:
+    """Keep only the fields needed for assessment, to limit prompt length across rounds."""
+    evidence: list[dict[str, Any]] = []
+    for detail in search_output.get("researched_details") or []:
+        if not isinstance(detail, dict):
+            continue
+        result = detail.get("result") if isinstance(detail.get("result"), dict) else {}
+        cleaned = detail.get("cleaned") if isinstance(detail.get("cleaned"), dict) else {}
+        evidence.append(
+            {
+                "title": cleaned.get("clean_title") or result.get("title"),
+                "url": detail.get("visited_url") or result.get("url"),
+                "text": cleaned.get("clean_text") or detail.get("error") or "",
+                "key_points": cleaned.get("key_points") or [],
+                "opened_detail_page": bool(detail.get("opened_detail_page")),
+            }
+        )
+    return evidence
+
+
+def sources_from_evidence(evidence: list[dict[str, Any]]) -> list[dict[str, str]]:
+    sources: list[dict[str, str]] = []
+    seen: set[str] = set()
+    for item in evidence:
+        url = str(item.get("url") or "").strip()
+        if not url or url in seen:
+            continue
+        seen.add(url)
+        sources.append({"title": str(item.get("title") or url), "url": url})
+    return sources
+
+
+def ai_web_research_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = {**(node.get("params") or {}), **inputs}
+    return AiWebResearchRunner(context, params).run()
+
+
+register_node(
+    {
+        "type": "research.ai_web_research",
+        "category": "research",
+        "label": "AI 多轮网页研究",
+        "params": {
+            "objective": field_def("textarea", "研究目标", required=True),
+            "output_schema": field_def("textarea", "返回 JSON Schema", required=True),
+            "constraints": field_def("textarea", "研究约束", "{}"),
+            "max_attempts": field_def("number", "最多尝试次数", 3, minimum=1, maximum=10),
+            "search_engine": field_def("select", "搜索引擎", "bing", options=["google", "bing"]),
+            "browser": field_def("select", "浏览器", "edge", options=["default", "edge"]),
+            "max_search_pages": field_def("number", "每轮搜索页屏", 2, minimum=1, maximum=10),
+            "result_count": field_def("number", "每轮研究结果数", 2, minimum=1, maximum=5),
+            "detail_max_pages": field_def("number", "每个详情页屏", 2, minimum=1, maximum=10),
+        },
+        "inputs": {
+            "objective": field_def("string", "研究目标"),
+            "output_schema": field_def("object", "返回 JSON Schema"),
+            "constraints": field_def("object", "研究约束"),
+            "max_attempts": field_def("number", "最多尝试次数"),
+        },
+        "outputs": {
+            "status": {"type": "string", "label": "研究状态"},
+            "goal_achieved": {"type": "boolean", "label": "是否达成目标"},
+            "data": {"type": "object", "label": "符合 Schema 的数据"},
+            "validation": {"type": "object", "label": "Schema 校验结果"},
+            "assessment": {"type": "object", "label": "目标评估"},
+            "sources": {"type": "array", "label": "来源"},
+            "attempts": {"type": "array", "label": "尝试记录"},
+        },
+        "control_ports": control_ports(["success", "partial", "failure"]),
+    },
+    ai_web_research_node,
+)
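
Reviewer sketch for research.py: the `required_domains` check in `validate_research_result` accepts a source when its host equals the domain or ends with `.` plus the domain. The snippet below isolates that rule so its edge cases are easy to see. `source_matches_domain` is a hypothetical helper name; the comparison itself mirrors the diff, including its use of `netloc`.

```python
from __future__ import annotations

from urllib.parse import urlparse


def source_matches_domain(url: str, domain: str) -> bool:
    """Apply the same host rule as the required_domains check in the diff."""
    host = urlparse(url).netloc.lower()
    normalized = domain.lower().strip()
    if not normalized:
        return False
    # Exact host match, or a subdomain match on a dot boundary.
    return host == normalized or host.endswith(f".{normalized}")
```

The dot boundary is what keeps `notpython.org` from satisfying `python.org`. Because `netloc` keeps an explicit port, a URL such as `https://python.org:443/` does not match under this rule; `urlparse(url).hostname` would drop the port if that case matters.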

+ 302 - 0
backend/app/automation/nodes/video.py

@@ -0,0 +1,302 @@
+from __future__ import annotations
+
+import json
+import hashlib
+import random
+import re
+import time
+from html import unescape
+from typing import Any
+from urllib.parse import parse_qs, quote, urlencode, urljoin, urlparse
+from urllib.request import Request, urlopen
+
+from ... import windows_automation
+from ..context import WorkflowContext
+from ..registry import control_ports, field_def, register_node
+
+
+REQUEST_HEADERS = {
+    "User-Agent": (
+        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
+        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0 Safari/537.36"
+    ),
+    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
+}
+
+YOUTUBE_VIDEO_RE = re.compile(r'(?:/watch\?v=|["\']videoId["\']\s*:\s*["\'])([A-Za-z0-9_-]{11})')
+BILIBILI_BVID_RE = re.compile(r'(?:/video/|["\']bvid["\']\s*:\s*["\'])(BV[0-9A-Za-z]+)')
+BILIBILI_MIXIN_KEY_ENC_TABLE = [
+    46, 47, 18, 2, 53, 8, 23, 32,
+    15, 50, 10, 31, 58, 3, 45, 35,
+    27, 43, 5, 49, 33, 9, 42, 19,
+    29, 28, 14, 39, 12, 38, 41, 13,
+    37, 48, 7, 16, 24, 55, 40, 61,
+    26, 17, 0, 1, 60, 51, 30, 4,
+    22, 25, 54, 21, 56, 59, 6, 63,
+    57, 62, 11, 36, 20, 34, 44, 52,
+]
+
+
+def fetch_text(url: str, timeout_seconds: float = 12) -> str:
+    """Fetch a page's HTML so that public video links can be parsed from it."""
+    request = Request(url, headers=REQUEST_HEADERS)
+    with urlopen(request, timeout=timeout_seconds) as response:
+        charset = response.headers.get_content_charset() or "utf-8"
+        return response.read().decode(charset, errors="replace")
+
+
+def unique_matches(pattern: re.Pattern[str], text: str) -> list[str]:
+    """Deduplicate matches in order of appearance so a repeated video is not over-weighted in random selection."""
+    seen: set[str] = set()
+    values: list[str] = []
+    for match in pattern.finditer(text):
+        value = unescape(match.group(1))
+        if value not in seen:
+            seen.add(value)
+            values.append(value)
+    return values
+
+
+def with_query(url: str, extra: dict[str, str]) -> str:
+    """Append playback parameters while keeping the original query parameters."""
+    parsed = urlparse(url)
+    query = parse_qs(parsed.query)
+    for key, value in extra.items():
+        query[key] = [value]
+    return parsed._replace(query=urlencode(query, doseq=True)).geturl()
+
+
+def extract_bilibili_mid(url: str) -> str | None:
+    """Extract the uploader's mid from a Bilibili space URL."""
+    match = re.search(r"space\.bilibili\.com/(\d+)", url)
+    return match.group(1) if match else None
+
+
+def bilibili_mixin_key() -> str:
+    """Fetch the mixin key required for Bilibili WBI signing."""
+    payload = json.loads(fetch_text("https://api.bilibili.com/x/web-interface/nav"))
+    wbi_img = ((payload.get("data") or {}).get("wbi_img") or {})
+    img_key = urlparse(str(wbi_img.get("img_url") or "")).path.rsplit("/", 1)[-1].split(".")[0]
+    sub_key = urlparse(str(wbi_img.get("sub_url") or "")).path.rsplit("/", 1)[-1].split(".")[0]
+    raw_key = img_key + sub_key
+    if len(raw_key) < 64:
+        raise ValueError("无法获取 B 站 WBI 签名 key")
+    return "".join(raw_key[index] for index in BILIBILI_MIXIN_KEY_ENC_TABLE)[:32]
+
+
+def bilibili_signed_query(params: dict[str, Any]) -> str:
+    """Build the WBI-signed query string for Bilibili space endpoints."""
+    signed = {key: value for key, value in params.items() if value not in (None, "")}
+    signed["wts"] = int(time.time())
+    clean = {
+        key: re.sub(r"[!'()*]", "", str(value))
+        for key, value in sorted(signed.items())
+    }
+    query = urlencode(clean)
+    clean["w_rid"] = hashlib.md5((query + bilibili_mixin_key()).encode("utf-8")).hexdigest()
+    return urlencode(clean)
+
+
+def choose_youtube_home_video() -> str:
+    video_ids: list[str] = []
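+    # The logged-out YouTube home page sometimes returns only the page shell and a sign-in prompt;
+    # when it yields no candidates, fall back to search pages for popular videos.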
+    for url in [
+        "https://www.youtube.com/",
+        "https://www.youtube.com/results?search_query=%E7%83%AD%E9%97%A8%E8%A7%86%E9%A2%91",
+        "https://www.youtube.com/results?search_query=popular%20videos",
+    ]:
+        video_ids = unique_matches(YOUTUBE_VIDEO_RE, fetch_text(url))
+        if video_ids:
+            break
+    if not video_ids:
+        raise ValueError("未在 YouTube 首页解析到推荐视频")
+    return f"https://www.youtube.com/watch?v={random.choice(video_ids)}&autoplay=1"
+
+
+def choose_youtube_channel_latest(channel_url: str) -> str:
+    base_url = channel_url.rstrip("/")
+    videos_url = base_url if base_url.endswith("/videos") else f"{base_url}/videos"
+    html = fetch_text(videos_url)
+    video_ids = unique_matches(YOUTUBE_VIDEO_RE, html)
+    if not video_ids:
+        raise ValueError("未在 YouTube 主播视频页解析到最新视频")
+    return f"https://www.youtube.com/watch?v={video_ids[0]}&autoplay=1"
+
+
+def choose_bilibili_home_video() -> str:
+    html = fetch_text("https://www.bilibili.com/")
+    bvids = unique_matches(BILIBILI_BVID_RE, html)
+    if not bvids:
+        raise ValueError("未在 B 站首页解析到推荐视频")
+    return f"https://www.bilibili.com/video/{random.choice(bvids)}?autoplay=1"
+
+
+def choose_bilibili_up_latest(up_url: str) -> str:
+    mid = extract_bilibili_mid(up_url)
+    if mid:
+        for endpoint, params in [
+            ("https://api.bilibili.com/x/space/wbi/arc/search", {"mid": mid, "ps": 1, "pn": 1, "order": "pubdate"}),
+            ("https://api.bilibili.com/x/space/arc/search", {"mid": mid, "ps": 1, "pn": 1, "order": "pubdate"}),
+        ]:
+            try:
+                query = bilibili_signed_query(params) if "/wbi/" in endpoint else urlencode(params)
+                payload = json.loads(fetch_text(f"{endpoint}?{query}"))
+                videos = (((payload.get("data") or {}).get("list") or {}).get("vlist") or [])
+                if videos:
+                    video = videos[0]
+                    if video.get("bvid"):
+                        return f"https://www.bilibili.com/video/{video['bvid']}?autoplay=1"
+                    if video.get("aid"):
+                        return f"https://www.bilibili.com/video/av{video['aid']}?autoplay=1"
+            except Exception:
+                # The Bilibili API occasionally fails because of risk control; on failure, try the next source.
+                pass
+
+    videos_url = up_url.rstrip("/")
+    if "/video" not in urlparse(videos_url).path:
+        videos_url = urljoin(videos_url + "/", "video")
+    html = fetch_text(videos_url)
+    bvids = unique_matches(BILIBILI_BVID_RE, html)
+    if not bvids:
+        raise ValueError("未在 B 站 UP 主视频页解析到最新视频")
+    return f"https://www.bilibili.com/video/{bvids[0]}?autoplay=1"
+
+
+def selected_video_url(action: str, params: dict[str, Any], inputs: dict[str, Any]) -> str | None:
+    """Resolve the target video URL for the given action type."""
+    if action == "youtube_home_random":
+        return choose_youtube_home_video()
+    if action == "youtube_channel_latest":
+        channel_url = str(inputs.get("channel_url", params.get("channel_url")) or "").strip()
+        if not channel_url:
+            raise ValueError("channel_url is required")
+        return choose_youtube_channel_latest(channel_url)
+    if action == "bilibili_home_random":
+        return choose_bilibili_home_video()
+    if action == "bilibili_up_latest":
+        up_url = str(inputs.get("up_url", params.get("up_url")) or "").strip()
+        if not up_url:
+            raise ValueError("up_url is required")
+        return choose_bilibili_up_latest(up_url)
+    if action == "douyin_random":
+        return str(inputs.get("douyin_url", params.get("douyin_url")) or "https://www.douyin.com/").strip()
+    return None
+
+
+def video_action_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = node.get("params", {})
+    action = str(inputs.get("action", params.get("action")) or "").strip()
+    browser = inputs.get("browser", params.get("browser")) or "edge"
+
+    if action == "douyin_next":
+        result = windows_automation.keyboard_action("press", key="down")
+        return {"action": action, **result}
+
+    target_url = selected_video_url(action, params, inputs)
+    if not target_url:
+        raise ValueError(f"Unsupported video action: {action}")
+
+    opened = windows_automation.open_url(target_url, browser=browser, new_window=bool(params.get("new_window", True)))
+    context.remember_pid(opened.get("pid"))
+    return {"action": action, "selected_url": target_url, "browser": browser, "opened": opened}
+
+
+register_node(
+    {
+        "type": "browser.video_action",
+        "category": "browser",
+        "label": "视频平台动作",
+        "params": {
+            "action": field_def(
+                "select",
+                "动作",
+                "youtube_home_random",
+                required=True,
+                options=[
+                    "youtube_home_random",
+                    "youtube_channel_latest",
+                    "bilibili_home_random",
+                    "bilibili_up_latest",
+                    "douyin_random",
+                    "douyin_next",
+                ],
+            ),
+            "browser": field_def("select", "浏览器", "edge", options=["default", "edge"]),
+            "new_window": field_def("boolean", "新窗口", True),
+            "channel_url": field_def("text", "YouTube 主播地址"),
+            "up_url": field_def("text", "B 站 UP 主空间地址"),
+            "douyin_url": field_def("text", "抖音入口地址", "https://www.douyin.com/"),
+        },
+        "inputs": {
+            "action": field_def("string", "动作"),
+            "browser": field_def("string", "浏览器"),
+            "channel_url": field_def("string", "YouTube 主播地址"),
+            "up_url": field_def("string", "B 站 UP 主空间地址"),
+            "douyin_url": field_def("string", "抖音入口地址"),
+        },
+        "outputs": {
+            "action": {"type": "string", "label": "动作"},
+            "selected_url": {"type": "string", "label": "选中的视频 URL"},
+            "browser": {"type": "string", "label": "浏览器"},
+            "opened": {"type": "object", "label": "打开结果"},
+        },
+        "control_ports": control_ports(),
+    },
+    video_action_node,
+)
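
The WBI signing steps in `bilibili_mixin_key` and `bilibili_signed_query` above can be exercised offline. The following is a minimal sketch, not the commit's code: `mixin_key` and `signed_query` are illustrative names, and `IMG_KEY`, `SUB_KEY`, and the fixed `wts` are placeholder values assumed here so the result is reproducible (the real code reads both keys from the `nav` endpoint and uses the current time).

```python
import hashlib
import re
from urllib.parse import urlencode

# Permutation table copied from BILIBILI_MIXIN_KEY_ENC_TABLE in the diff above.
MIXIN_KEY_ENC_TABLE = [
    46, 47, 18, 2, 53, 8, 23, 32, 15, 50, 10, 31, 58, 3, 45, 35,
    27, 43, 5, 49, 33, 9, 42, 19, 29, 28, 14, 39, 12, 38, 41, 13,
    37, 48, 7, 16, 24, 55, 40, 61, 26, 17, 0, 1, 60, 51, 30, 4,
    22, 25, 54, 21, 56, 59, 6, 63, 57, 62, 11, 36, 20, 34, 44, 52,
]


def mixin_key(img_key: str, sub_key: str) -> str:
    """Permute the concatenated keys with the table and keep the first 32 characters."""
    raw_key = img_key + sub_key
    if len(raw_key) < 64:
        raise ValueError("raw key must contain at least 64 characters")
    return "".join(raw_key[index] for index in MIXIN_KEY_ENC_TABLE)[:32]


def signed_query(params: dict, key: str, wts: int) -> str:
    """Sort the parameters, strip !'()* from values, and append w_rid = md5(query + key)."""
    signed = {k: v for k, v in params.items() if v not in (None, "")}
    signed["wts"] = wts
    clean = {k: re.sub(r"[!'()*]", "", str(v)) for k, v in sorted(signed.items())}
    query = urlencode(clean)
    clean["w_rid"] = hashlib.md5((query + key).encode("utf-8")).hexdigest()
    return urlencode(clean)


# Placeholder keys (assumptions for illustration only, not real Bilibili keys).
IMG_KEY = "0123456789abcdef0123456789abcdef"
SUB_KEY = "fedcba9876543210fedcba9876543210"

key = mixin_key(IMG_KEY, SUB_KEY)
query = signed_query({"mid": "1", "ps": 1, "pn": 1, "order": "pubdate"}, key, wts=1700000000)
print(query)
```

Passing `wts` and the key in as arguments, rather than fetching them inside the function, is only a choice made for this sketch so the signature can be checked without network access.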

+ 412 - 0
backend/app/automation/nodes/vision.py

@@ -0,0 +1,412 @@
+from __future__ import annotations
+
+import json
+import random
+import time
+import uuid
+from pathlib import Path
+from typing import Any
+
+from fastapi import HTTPException
+
+from ... import ai_service, settings_service, windows_automation
+from ..context import WorkflowContext
+from ..registry import control_ports, field_def, register_node
+
+
+LOCATE_TARGET_PROMPT = """请作为 AI 视觉自动化定位助手,在这张真实屏幕截图中寻找用户指定的可点击目标。
+
+目标描述:
+{target_description}
+
+当前页面/操作上下文:
+{screen_context}
+
+选择要求:
+1. 如果有多个候选目标,{selection_rule}
+2. 返回目标可点击区域的中心点,不要返回窗口、浏览器地址栏或整块页面的中心。
+3. 坐标必须是相对整张截图宽高的百分比,范围 0-100。
+4. 如果目标不可见、被遮挡、需要滚动、页面未加载完成或你不确定,请返回 found=false。
+
+严格只输出 JSON 对象,不要输出 Markdown:
+{{
+  "found": boolean,
+  "x_percent": number|null,
+  "y_percent": number|null,
+  "confidence": number,
+  "target_label": string,
+  "reason": string
+}}"""
+
+VERIFY_PAGE_PROMPT = """请作为 AI 视觉自动化校验器,判断当前屏幕是否符合预期状态。
+
+预期状态:
+{expected_state}
+
+当前页面/操作上下文:
+{screen_context}
+
+严格只输出 JSON 对象,不要输出 Markdown:
+{{
+  "matched": boolean,
+  "page_state": string,
+  "confidence": number,
+  "reason": string
+}}"""
+
+
+def _number(value: Any, default: float = 0) -> float:
+    try:
+        return float(value)
+    except (TypeError, ValueError):
+        return default
+
+
+def _boolean(value: Any, default: bool = False) -> bool:
+    if value in (None, ""):
+        return default
+    if isinstance(value, str):
+        return value.strip().lower() in {"1", "true", "yes", "y", "on"}
+    return bool(value)
+
+
+def _percent(value: Any) -> float | None:
+    try:
+        number = float(value)
+    except (TypeError, ValueError):
+        return None
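+    # Heuristic: values in [0, 1] are treated as fractions and scaled to percentages,
+    # so a genuine value below 1% is scaled as well.
+    if 0 <= number <= 1:
+        number *= 100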
+    return max(0.0, min(100.0, number))
+
+
+def _runtime_screenshot_path() -> Path:
+    """Build a runtime screenshot path for the workflow, to aid failure diagnosis and task result tracing."""
+    folder = settings_service.resolve_data_path("automation_runtime_path", "automation/runtime")
+    folder.mkdir(parents=True, exist_ok=True)
+    return folder / f"vision_locate_{int(time.time() * 1000)}_{uuid.uuid4().hex[:8]}.png"
+
+
+def _capture_screen(save_screenshot: bool) -> dict[str, Any]:
+    save_path = _runtime_screenshot_path() if save_screenshot else None
+    screenshot = windows_automation.take_screenshot(str(save_path) if save_path else None, include_base64=True)
+    screenshot["mime_type"] = "image/png"
+    return screenshot
+
+
+def _parse_ai_json(content: str) -> dict[str, Any]:
+    parsed = json.loads(ai_service.extract_json_text(content))
+    if not isinstance(parsed, dict):
+        raise ValueError("AI vision output must be a JSON object")
+    return parsed
+
+
+def _vision_json(context: WorkflowContext, prompt: str, screenshot: dict[str, Any], temperature: float) -> tuple[dict[str, Any], dict[str, Any]]:
+    ai_result = ai_service.chat_with_images(
+        int(context.provider_id),
+        int(context.model_id),
+        prompt,
+        [{"base64": screenshot["image_base64"], "mime_type": screenshot["mime_type"]}],
+        temperature,
+    )
+    return _parse_ai_json(ai_result["content"]), ai_result
+
+
+def _locate_target(
+    context: WorkflowContext,
+    target_description: str,
+    screen_context: str,
+    randomize: bool,
+    save_screenshot: bool,
+    temperature: float,
+) -> dict[str, Any]:
+    screenshot = _capture_screen(save_screenshot)
+    if screenshot.get("path"):
+        context.runtime["current_screenshot_path"] = screenshot["path"]
+
+    if randomize:
+        selection_rule = f"请结合随机种子 {random.randint(1, 1_000_000)},从可见候选中随机挑选一个"
+    else:
+        selection_rule = "请选择最符合目标描述、最容易点击的一个"
+    prompt = LOCATE_TARGET_PROMPT.format(
+        target_description=target_description,
+        screen_context=screen_context,
+        selection_rule=selection_rule,
+    )
+    try:
+        parsed, ai_result = _vision_json(context, prompt, screenshot, temperature)
+    except (json.JSONDecodeError, ValueError) as exc:
+        raise HTTPException(status_code=502, detail=f"AI locate output is not valid JSON: {exc}") from exc
+
+    found = bool(parsed.get("found"))
+    x_percent = _percent(parsed.get("x_percent"))
+    y_percent = _percent(parsed.get("y_percent"))
+    base = {
+        "screenshot_path": screenshot.get("path"),
+        "width": screenshot.get("width"),
+        "height": screenshot.get("height"),
+        "ai_result": parsed,
+        "ai_raw_content": ai_result["content"],
+    }
+    if not found or x_percent is None or y_percent is None:
+        return {"located": False, "found": False, "next_port": "not_found", **base}
+
+    width = int(screenshot["width"])
+    height = int(screenshot["height"])
+    x = max(0, min(width - 1, round(width * x_percent / 100)))
+    y = max(0, min(height - 1, round(height * y_percent / 100)))
+    return {
+        "located": True,
+        "found": True,
+        "x_percent": x_percent,
+        "y_percent": y_percent,
+        "x": x,
+        "y": y,
+        "confidence": parsed.get("confidence"),
+        "target_label": parsed.get("target_label"),
+        "reason": parsed.get("reason"),
+        **base,
+    }
+
+
+def locate_element_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = node.get("params", {})
+    if not context.provider_id or not context.model_id:
+        raise HTTPException(status_code=400, detail="AI 视觉定位节点需要配置默认 AI 服务商和模型")
+
+    target_description = str(inputs.get("target_description", params.get("target_description")) or "").strip()
+    if not target_description:
+        raise HTTPException(status_code=400, detail="target_description is required")
+
+    screen_context = str(inputs.get("screen_context", params.get("screen_context")) or "当前屏幕").strip()
+    randomize = _boolean(inputs.get("randomize", params.get("randomize")), False)
+    save_screenshot = _boolean(inputs.get("save_screenshot", params.get("save_screenshot")), True)
+    fail_if_not_found = _boolean(inputs.get("fail_if_not_found", params.get("fail_if_not_found")), True)
+    temperature = _number(inputs.get("temperature", params.get("temperature")), context.temperature)
+
+    result = _locate_target(
+        context,
+        target_description=target_description,
+        screen_context=screen_context,
+        randomize=randomize,
+        save_screenshot=save_screenshot,
+        temperature=temperature,
+    )
+    if not result.get("located") and fail_if_not_found:
+        ai_result = result.get("ai_result") if isinstance(result.get("ai_result"), dict) else {}
+        raise HTTPException(status_code=404, detail=ai_result.get("reason") or "AI 未定位到目标元素")
+    return result
+
+
+def verify_page_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = node.get("params", {})
+    if not context.provider_id or not context.model_id:
+        raise HTTPException(status_code=400, detail="AI 页面校验节点需要配置默认 AI 服务商和模型")
+    expected_state = str(inputs.get("expected_state", params.get("expected_state")) or "").strip()
+    if not expected_state:
+        raise HTTPException(status_code=400, detail="expected_state is required")
+    screen_context = str(inputs.get("screen_context", params.get("screen_context")) or "当前屏幕").strip()
+    save_screenshot = _boolean(inputs.get("save_screenshot", params.get("save_screenshot")), True)
+    temperature = _number(inputs.get("temperature", params.get("temperature")), context.temperature)
+    screenshot = _capture_screen(save_screenshot)
+    if screenshot.get("path"):
+        context.runtime["current_screenshot_path"] = screenshot["path"]
+    prompt = VERIFY_PAGE_PROMPT.format(expected_state=expected_state, screen_context=screen_context)
+    try:
+        parsed, ai_result = _vision_json(context, prompt, screenshot, temperature)
+    except (json.JSONDecodeError, ValueError) as exc:
+        raise HTTPException(status_code=502, detail=f"AI verify output is not valid JSON: {exc}") from exc
+    matched = bool(parsed.get("matched"))
+    return {
+        "matched": matched,
+        "next_port": "matched" if matched else "not_matched",
+        "page_state": parsed.get("page_state"),
+        "confidence": parsed.get("confidence"),
+        "reason": parsed.get("reason"),
+        "screenshot_path": screenshot.get("path"),
+        "width": screenshot.get("width"),
+        "height": screenshot.get("height"),
+        "ai_result": parsed,
+        "ai_raw_content": ai_result["content"],
+    }
+
+
+def click_target_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
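+    params = node.get("params", {})
+    # Fail early with a 400, as the locate and verify nodes do, instead of a TypeError from int(None).
+    if not context.provider_id or not context.model_id:
+        raise HTTPException(status_code=400, detail="AI click node requires a default AI provider and model")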
+    target_description = str(inputs.get("target_description", params.get("target_description")) or "").strip()
+    if not target_description:
+        raise HTTPException(status_code=400, detail="target_description is required")
+    screen_context = str(inputs.get("screen_context", params.get("screen_context")) or "当前屏幕").strip()
+    randomize = _boolean(inputs.get("randomize", params.get("randomize")), False)
+    save_screenshot = _boolean(inputs.get("save_screenshot", params.get("save_screenshot")), True)
+    fail_if_not_found = _boolean(inputs.get("fail_if_not_found", params.get("fail_if_not_found")), True)
+    temperature = _number(inputs.get("temperature", params.get("temperature")), context.temperature)
+    button = str(inputs.get("button", params.get("button")) or "left")
+    clicks = int(max(1, min(_number(inputs.get("clicks", params.get("clicks")), 1), 20)))
+    result = _locate_target(context, target_description, screen_context, randomize, save_screenshot, temperature)
+    if not result.get("located"):
+        if fail_if_not_found:
+            ai_result = result.get("ai_result") if isinstance(result.get("ai_result"), dict) else {}
+            raise HTTPException(status_code=404, detail=ai_result.get("reason") or "AI 未定位到可点击目标")
+        return result
+    clicked = windows_automation.mouse_action("click", x=int(result["x"]), y=int(result["y"]), button=button, clicks=clicks)
+    return {**result, "clicked": True, "click": clicked, "button": button, "clicks": clicks}
+
+
+def close_popups_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
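+    params = node.get("params", {})
+    # Fail early with a 400, as the locate and verify nodes do, instead of a TypeError from int(None).
+    if not context.provider_id or not context.model_id:
+        raise HTTPException(status_code=400, detail="AI popup-closing node requires a default AI provider and model")
+    target_description = str(
+        inputs.get("target_description", params.get("target_description"))
+        or "当前页面可见的弹窗关闭按钮、跳过按钮、稍后再说按钮、我知道了按钮或拒绝按钮"
+    ).strip()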
+    screen_context = str(inputs.get("screen_context", params.get("screen_context")) or "当前浏览器页面").strip()
+    attempts = int(max(1, min(_number(inputs.get("attempts", params.get("attempts")), 2), 5)))
+    temperature = _number(inputs.get("temperature", params.get("temperature")), context.temperature)
+    closed: list[dict[str, Any]] = []
+    for _ in range(attempts):
+        result = _locate_target(context, target_description, screen_context, False, True, temperature)
+        if not result.get("located"):
+            return {"closed_count": len(closed), "items": closed, "next_port": "success"}
+        clicked = windows_automation.mouse_action("click", x=int(result["x"]), y=int(result["y"]))
+        closed.append({**result, "click": clicked})
+        time.sleep(0.8)
+    return {"closed_count": len(closed), "items": closed, "next_port": "success"}
+
+
+register_node(
+    {
+        "type": "vision.locate_element",
+        "category": "vision",
+        "label": "AI 视觉定位元素",
+        "params": {
+            "target_description": field_def("text", "目标描述", required=True),
+            "screen_context": field_def("text", "页面上下文"),
+            "randomize": field_def("boolean", "多候选随机选择", False),
+            "save_screenshot": field_def("boolean", "保存截图", True),
+            "fail_if_not_found": field_def("boolean", "找不到时报错", True),
+            "temperature": field_def("number", "定位温度", 0.1, minimum=0, maximum=2),
+        },
+        "inputs": {
+            "target_description": field_def("string", "目标描述"),
+            "screen_context": field_def("string", "页面上下文"),
+            "randomize": field_def("boolean", "多候选随机选择"),
+            "save_screenshot": field_def("boolean", "保存截图"),
+            "fail_if_not_found": field_def("boolean", "找不到时报错"),
+            "temperature": field_def("number", "定位温度"),
+        },
+        "outputs": {
+            "located": {"type": "boolean", "label": "是否定位成功"},
+            "x_percent": {"type": "number", "label": "X 百分比"},
+            "y_percent": {"type": "number", "label": "Y 百分比"},
+            "x": {"type": "number", "label": "X 坐标"},
+            "y": {"type": "number", "label": "Y 坐标"},
+            "confidence": {"type": "number", "label": "置信度"},
+            "target_label": {"type": "string", "label": "目标标签"},
+            "screenshot_path": {"type": "string", "label": "截图路径"},
+            "ai_result": {"type": "object", "label": "AI 结果"},
+        },
+        "control_ports": control_ports(["success", "not_found", "failure"]),
+    },
+    locate_element_node,
+)
+
+register_node(
+    {
+        "type": "vision.verify_page",
+        "category": "vision",
+        "label": "AI 校验页面状态",
+        "description": "截取当前屏幕,让多模态 AI 判断页面是否符合预期,并按 matched/not_matched 分支继续。",
+        "params": {
+            "expected_state": field_def("text", "预期状态", required=True),
+            "screen_context": field_def("text", "页面上下文"),
+            "save_screenshot": field_def("boolean", "保存截图", True),
+            "temperature": field_def("number", "校验温度", 0.1, minimum=0, maximum=2),
+        },
+        "inputs": {
+            "expected_state": field_def("string", "预期状态"),
+            "screen_context": field_def("string", "页面上下文"),
+            "save_screenshot": field_def("boolean", "保存截图"),
+            "temperature": field_def("number", "校验温度"),
+        },
+        "outputs": {
+            "matched": {"type": "boolean", "label": "是否匹配"},
+            "page_state": {"type": "string", "label": "页面状态"},
+            "confidence": {"type": "number", "label": "置信度"},
+            "reason": {"type": "string", "label": "原因"},
+            "screenshot_path": {"type": "string", "label": "截图路径"},
+            "ai_result": {"type": "object", "label": "AI 结果"},
+        },
+        "control_ports": control_ports(["matched", "not_matched", "failure"]),
+    },
+    verify_page_node,
+)
+
+register_node(
+    {
+        "type": "vision.click_target",
+        "category": "vision",
+        "label": "AI 定位并点击",
+        "description": "截屏定位目标元素,换算坐标后立即点击,适合封装常见视觉点击步骤。",
+        "params": {
+            "target_description": field_def("text", "目标描述", required=True),
+            "screen_context": field_def("text", "页面上下文"),
+            "randomize": field_def("boolean", "多候选随机选择", False),
+            "button": field_def("select", "按键", "left", options=["left", "middle", "right"]),
+            "clicks": field_def("number", "点击次数", 1, minimum=1, maximum=20),
+            "save_screenshot": field_def("boolean", "保存截图", True),
+            "fail_if_not_found": field_def("boolean", "找不到时报错", True),
+            "temperature": field_def("number", "定位温度", 0.1, minimum=0, maximum=2),
+        },
+        "inputs": {
+            "target_description": field_def("string", "目标描述"),
+            "screen_context": field_def("string", "页面上下文"),
+            "randomize": field_def("boolean", "多候选随机选择"),
+            "button": field_def("string", "按键"),
+            "clicks": field_def("number", "点击次数"),
+            "save_screenshot": field_def("boolean", "保存截图"),
+            "fail_if_not_found": field_def("boolean", "找不到时报错"),
+            "temperature": field_def("number", "定位温度"),
+        },
+        "outputs": {
+            "located": {"type": "boolean", "label": "是否定位成功"},
+            "clicked": {"type": "boolean", "label": "是否已点击"},
+            "x": {"type": "number", "label": "X 坐标"},
+            "y": {"type": "number", "label": "Y 坐标"},
+            "confidence": {"type": "number", "label": "置信度"},
+            "target_label": {"type": "string", "label": "目标标签"},
+            "click": {"type": "object", "label": "点击结果"},
+            "ai_result": {"type": "object", "label": "AI 结果"},
+        },
+        "control_ports": control_ports(["success", "not_found", "failure"]),
+    },
+    click_target_node,
+)
+
+register_node(
+    {
+        "type": "vision.close_popups",
+        "category": "vision",
+        "label": "AI 关闭弹窗",
+        "description": "尝试识别并点击当前页面上的关闭、跳过、稍后再说等弹窗按钮。",
+        "params": {
+            "target_description": field_def("text", "关闭目标", "当前页面可见的弹窗关闭按钮、跳过按钮、稍后再说按钮、我知道了按钮或拒绝按钮"),
+            "screen_context": field_def("text", "页面上下文", "当前浏览器页面"),
+            "attempts": field_def("number", "最多尝试", 2, minimum=1, maximum=5),
+            "temperature": field_def("number", "定位温度", 0.1, minimum=0, maximum=2),
+        },
+        "inputs": {
+            "target_description": field_def("string", "关闭目标"),
+            "screen_context": field_def("string", "页面上下文"),
+            "attempts": field_def("number", "最多尝试"),
+            "temperature": field_def("number", "定位温度"),
+        },
+        "outputs": {
+            "closed_count": {"type": "number", "label": "关闭数量"},
+            "items": {"type": "array", "label": "关闭记录"},
+        },
+        "control_ports": control_ports(),
+    },
+    close_popups_node,
+)
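
The percentage-to-pixel conversion used by `_locate_target` above can be checked in isolation. This is a minimal sketch under stated assumptions: `to_percent` mirrors `_percent`, while `to_pixel` is a hypothetical helper name that bundles the clamping done inline in `_locate_target`.

```python
from __future__ import annotations

from typing import Any


def to_percent(value: Any) -> float | None:
    """Parse a model-supplied coordinate into a percentage clamped to 0-100."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    # Values in [0, 1] are assumed to be fractions and are scaled to percentages.
    if 0 <= number <= 1:
        number *= 100
    return max(0.0, min(100.0, number))


def to_pixel(x_percent: Any, y_percent: Any, width: int, height: int) -> tuple[int, int] | None:
    """Convert percentages to pixel coordinates that stay inside the screenshot."""
    x = to_percent(x_percent)
    y = to_percent(y_percent)
    if x is None or y is None or width <= 0 or height <= 0:
        return None
    # Clamp to width - 1 / height - 1 so that 100% still maps to an on-screen pixel.
    px = max(0, min(width - 1, round(width * x / 100)))
    py = max(0, min(height - 1, round(height * y / 100)))
    return px, py


print(to_pixel(50, 50, 1920, 1080))
print(to_pixel(100, 100, 1920, 1080))
```

Clamping to `width - 1` and `height - 1` keeps a model answer of exactly 100% from producing a click just outside the captured screen.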

+ 490 - 0
backend/app/automation/nodes/web_search.py

@@ -0,0 +1,490 @@
+from __future__ import annotations
+
+import json
+import time
+from pathlib import Path
+from typing import Any
+from urllib.parse import quote_plus
+
+from fastapi import HTTPException
+
+from ... import ai_service, settings_service, windows_automation
+from ..context import WorkflowContext
+from ..registry import control_ports, field_def, register_node
+
+
+SEARCH_ENGINES = {
+    "google": "https://www.google.com/search?q={query}",
+    "bing": "https://www.bing.com/search?q={query}",
+}
+
+
+def _number(value: Any, default: float, minimum: float, maximum: float) -> float:
+    try:
+        number = float(value)
+    except (TypeError, ValueError):
+        number = default
+    return max(minimum, min(maximum, number))
+
+
+def _integer(value: Any, default: int, minimum: int, maximum: int) -> int:
+    return int(_number(value, default, minimum, maximum))
+
+
+def _percent(value: Any) -> float | None:
+    try:
+        number = float(value)
+    except (TypeError, ValueError):
+        return None
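+    # Heuristic: values in [0, 1] are treated as fractions and scaled to percentages,
+    # so a genuine value below 1% is scaled as well.
+    if 0 <= number <= 1:
+        number *= 100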
+    return max(0.0, min(100.0, number))
+
+
+def _screen_point(x_percent: Any, y_percent: Any, width: Any, height: Any) -> tuple[int | None, int | None]:
+    x = _percent(x_percent)
+    y = _percent(y_percent)
+    try:
+        screen_width = int(width)
+        screen_height = int(height)
+    except (TypeError, ValueError):
+        return None, None
+    if x is None or y is None or screen_width <= 0 or screen_height <= 0:
+        return None, None
+    return round(screen_width * x / 100), round(screen_height * y / 100)
+
+
+def normalize_search_result(item: Any, scroll_page: int, width: Any, height: Any) -> dict[str, Any] | None:
+    """Normalize a search result returned by the vision model and convert the title's click coordinates to pixels."""
+    if not isinstance(item, dict):
+        return None
+    title = str(item.get("title") or "").strip()
+    url = str(item.get("url") or "").strip()
+    if not title and not url:
+        return None
+    x_percent = _percent(item.get("title_center_x_percent"))
+    y_percent = _percent(item.get("title_center_y_percent"))
+    x, y = _screen_point(x_percent, y_percent, width, height)
+    return {
+        "title": title,
+        "url": url,
+        "snippet": str(item.get("snippet") or "").strip(),
+        "position": item.get("position") if isinstance(item.get("position"), (int, float)) else None,
+        "scroll_page": scroll_page,
+        "title_center_x_percent": x_percent,
+        "title_center_y_percent": y_percent,
+        "title_center_x": x,
+        "title_center_y": y,
+    }
+
+
+def result_identity(item: dict[str, Any]) -> str:
+    """Deduplicate by URL first; fall back to the title when the vision model did not recognize a URL."""
+    return str(item.get("url") or item.get("title") or "").strip().lower()
+
+
+class WebSearchRunner:
+    """Run web search research using a real browser, screenshots, and a multimodal model."""
+
+    def __init__(self, context: WorkflowContext, params: dict[str, Any]) -> None:
+        if not context.provider_id or not context.model_id:
+            raise HTTPException(status_code=400, detail="网页搜索节点需要配置默认 AI 服务商和模型")
+        self.context = context
+        self.params = params
+        self.query = str(params.get("query") or "").strip()
+        if not self.query:
+            raise HTTPException(status_code=400, detail="网页搜索关键词不能为空")
+        self.page_wait = _number(params.get("page_load_wait_seconds"), 8, 0, 120)
+        self.action_wait = _number(params.get("action_wait_seconds"), 1, 0, 30)
+        self.max_search_pages = _integer(params.get("max_search_pages"), 4, 1, 20)
+        self.result_count = _integer(params.get("result_count"), 3, 1, 10)
+        self.detail_max_pages = _integer(params.get("detail_max_pages"), 4, 1, 20)
+        self.click_attempts = _integer(params.get("click_attempts"), 2, 1, 5)
+        self.analyses: list[dict[str, Any]] = []
+
+    def run(self) -> dict[str, Any]:
+        browser = str(self.params.get("browser") or "edge")
+        engine = str(self.params.get("search_engine") or "google").lower()
+        template = SEARCH_ENGINES.get(engine, SEARCH_ENGINES["google"])
+        search_url = template.format(query=quote_plus(self.query))
+        opened = windows_automation.open_url(search_url, browser=browser, new_window=True)
+        self.context.remember_pid(opened.get("pid"))
+        time.sleep(self.page_wait)
+
+        try:
+            results = self._collect_results(engine)
+            ranked = self._rank_results(results)
+            details = self._research_results(ranked)
+            final_summary = self._summarize(details, ranked)
+            report_path = self._write_report(results, ranked, details, final_summary)
+            output = {
+                "query": self.query,
+                "search_url": search_url,
+                "result_count": len(results),
+                "researched_count": len(details),
+                "results": results,
+                "ranked_results": ranked,
+                "researched_details": details,
+                "summary": str(final_summary.get("summary") or ""),
+                "key_points": final_summary.get("key_points") or [],
+                "conclusion": str(final_summary.get("conclusion") or ""),
+                "report_path": report_path,
+                "next_port": "success" if results else "no_results",
+            }
+            if bool(self.params.get("include_debug_analyses", False)):
+                output["analyses"] = self.analyses
+            return output
+        finally:
+            if bool(self.params.get("close_browser", True)):
+                try:
+                    windows_automation.keyboard_action("hotkey", keys=["alt", "f4"])
+                    time.sleep(self.action_wait)
+                except Exception:
+                    # A failed browser cleanup must not mask search results already obtained or the original exception.
+                    pass
+
+    def _capture(self) -> dict[str, Any]:
+        return windows_automation.take_screenshot(None, include_base64=True)
+
+    def _vision_json(self, prompt: str, screenshot: dict[str, Any]) -> dict[str, Any]:
+        result = ai_service.chat_with_images(
+            int(self.context.provider_id),
+            int(self.context.model_id),
+            prompt,
+            [{"base64": screenshot["image_base64"], "mime_type": screenshot.get("mime_type", "image/png")}],
+            self.context.temperature,
+        )
+        try:
+            parsed = json.loads(ai_service.extract_json_text(result["content"]))
+        except (json.JSONDecodeError, ValueError, TypeError) as exc:
+            raise HTTPException(status_code=502, detail=f"网页视觉模型未返回有效 JSON: {exc}") from exc
+        if not isinstance(parsed, dict):
+            raise HTTPException(status_code=502, detail="网页视觉模型返回值必须是 JSON 对象")
+        return parsed
+
+    def _text_json(self, prompt: str) -> dict[str, Any]:
+        result = ai_service.chat(
+            int(self.context.provider_id),
+            int(self.context.model_id),
+            prompt,
+            self.context.temperature,
+        )
+        try:
+            parsed = json.loads(ai_service.extract_json_text(result["content"]))
+        except (json.JSONDecodeError, ValueError, TypeError) as exc:
+            raise HTTPException(status_code=502, detail=f"网页搜索模型未返回有效 JSON: {exc}") from exc
+        if not isinstance(parsed, dict):
+            raise HTTPException(status_code=502, detail="网页搜索模型返回值必须是 JSON 对象")
+        return parsed
+
+    def _collect_results(self, engine: str) -> list[dict[str, Any]]:
+        results: list[dict[str, Any]] = []
+        seen: set[str] = set()
+        for scroll_page in range(self.max_search_pages):
+            screenshot = self._capture()
+            prompt = f"""请分析真实 Windows 浏览器中的搜索结果截图。当前搜索引擎:{engine},查询词:{self.query}。
+
+任务:
+1. 判断当前页面是否为搜索结果页、验证码/阻止页或其他页面。
+2. 提取可见的自然搜索结果,忽略广告、导航、相关搜索和重复项。
+3. 估算每个结果标题中心点相对整张截图的百分比坐标。
+4. 判断是否已经到当前搜索结果页底部。
+5. 严格只输出 JSON:
+{{
+  "is_bottom": boolean,
+  "page_state": "search_results|blocked|captcha|consent|other",
+  "results": [{{
+    "title": string,
+    "url": string,
+    "snippet": string,
+    "position": number|null,
+    "title_center_x_percent": number|null,
+    "title_center_y_percent": number|null
+  }}],
+  "notes": string
+}}"""
+            analysis = self._vision_json(prompt, screenshot)
+            analysis["scroll_page"] = scroll_page
+            self.analyses.append({"type": "search_page", **analysis})
+            if analysis.get("page_state") not in {None, "search_results"}:
+                break
+            for raw_item in analysis.get("results") or []:
+                item = normalize_search_result(raw_item, scroll_page, screenshot.get("width"), screenshot.get("height"))
+                if not item:
+                    continue
+                identity = result_identity(item)
+                if not identity or identity in seen:
+                    continue
+                seen.add(identity)
+                results.append(item)
+            if bool(analysis.get("is_bottom")):
+                break
+            windows_automation.keyboard_action("press", key="pagedown")
+            time.sleep(self.action_wait)
+        return results
+
+    def _rank_results(self, results: list[dict[str, Any]]) -> list[dict[str, Any]]:
+        if not results:
+            return []
+        indexed = [{"original_index": index, **item} for index, item in enumerate(results)]
+        prompt = f"""请对网页搜索结果去重并按与查询词的相关性排序。
+查询词:{self.query}
+最多选择:{self.result_count}
+
+严格只输出 JSON:
+{{
+  "ranked_results": [{{
+    "original_index": number,
+    "relevance_score": number,
+    "dedupe_reason": string,
+    "why_relevant": string
+  }}],
+  "notes": string
+}}
+
+搜索结果:
+{json.dumps(indexed, ensure_ascii=False, indent=2)}"""
+        ranking = self._text_json(prompt)
+        self.analyses.append({"type": "ranking", **ranking})
+        ranked: list[dict[str, Any]] = []
+        used: set[int] = set()
+        for rank_item in ranking.get("ranked_results") or []:
+            if not isinstance(rank_item, dict):
+                continue
+            try:
+                index = int(rank_item.get("original_index"))
+            except (TypeError, ValueError):
+                continue
+            if index in used or index < 0 or index >= len(results):
+                continue
+            used.add(index)
+            ranked.append({**results[index], **rank_item, "original_index": index})
+            if len(ranked) >= self.result_count:
+                break
+        if not ranked:
+            ranked = [{**item, "original_index": index} for index, item in enumerate(results[: self.result_count])]
+        return ranked
+
+    def _research_results(self, ranked: list[dict[str, Any]]) -> list[dict[str, Any]]:
+        details: list[dict[str, Any]] = []
+        for rank, result in enumerate(ranked[: self.result_count], start=1):
+            classification = self._open_result(result)
+            if not classification.get("opened_detail_page"):
+                details.append({"rank": rank, "result": result, "opened_detail_page": False, "error": classification.get("notes")})
+                self._restore_search_page_if_needed(classification)
+                continue
+            visited_url = self._current_url()
+            chunks = self._extract_detail(result)
+            cleaned = self._clean_detail(result, visited_url, chunks)
+            details.append({
+                "rank": rank,
+                "result": result,
+                "visited_url": visited_url,
+                "opened_detail_page": True,
+                "chunks": chunks,
+                "cleaned": cleaned,
+            })
+            windows_automation.keyboard_action("hotkey", keys=["alt", "left"])
+            time.sleep(self.page_wait)
+        return details
+
+    def _go_to_scroll_page(self, scroll_page: int) -> None:
+        windows_automation.keyboard_action("press", key="home")
+        time.sleep(self.action_wait)
+        for _ in range(max(0, scroll_page)):
+            windows_automation.keyboard_action("press", key="pagedown")
+            time.sleep(self.action_wait)
+
+    def _open_result(self, result: dict[str, Any]) -> dict[str, Any]:
+        title = str(result.get("title") or "")
+        scroll_page = _integer(result.get("scroll_page"), 0, 0, self.max_search_pages)
+        last: dict[str, Any] = {
+            "opened_detail_page": False,
+            "is_search_results_page": True,
+            "notes": "未执行点击",
+        }
+        for attempt in range(1, self.click_attempts + 1):
+            self._go_to_scroll_page(scroll_page)
+            x = result.get("title_center_x") if attempt == 1 else None
+            y = result.get("title_center_y") if attempt == 1 else None
+            if x is None or y is None:
+                screenshot = self._capture()
+                prompt = f"""请在搜索结果截图中定位与目标标题最匹配的可点击标题。
+目标标题:{title}
+严格只输出 JSON:
+{{"found": boolean, "center_x_percent": number|null, "center_y_percent": number|null, "confidence": number, "notes": string}}"""
+                location = self._vision_json(prompt, screenshot)
+                self.analyses.append({"type": "result_location", "title": title, **location})
+                if not location.get("found"):
+                    last = {"opened_detail_page": False, "is_search_results_page": True, **location}
+                    continue
+                x, y = _screen_point(
+                    location.get("center_x_percent"),
+                    location.get("center_y_percent"),
+                    screenshot.get("width"),
+                    screenshot.get("height"),
+                )
+            if x is None or y is None:
+                last = {
+                    "opened_detail_page": False,
+                    "is_search_results_page": True,
+                    "notes": "模型未返回可用点击坐标",
+                }
+                continue
+            windows_automation.mouse_action("click", x=int(x), y=int(y))
+            time.sleep(self.page_wait)
+            screenshot = self._capture()
+            prompt = f"""请判断点击搜索结果后当前浏览器页面的类型。
+预期标题:{title}
+严格只输出 JSON:
+{{
+  "is_search_results_page": boolean,
+  "is_article_or_detail_page": boolean,
+  "page_state": "search_results|article_or_detail|captcha|blocked|other",
+  "confidence": number,
+  "notes": string
+}}"""
+            classification = self._vision_json(prompt, screenshot)
+            classification["attempt"] = attempt
+            self.analyses.append({"type": "clicked_page", "title": title, **classification})
+            if classification.get("is_article_or_detail_page") and not classification.get("is_search_results_page"):
+                return {"opened_detail_page": True, **classification}
+            last = {"opened_detail_page": False, **classification}
+            if not classification.get("is_search_results_page"):
+                break
+        return last
+
+    def _restore_search_page_if_needed(self, classification: dict[str, Any]) -> None:
+        if classification.get("is_search_results_page"):
+            return
+        windows_automation.keyboard_action("hotkey", keys=["alt", "left"])
+        time.sleep(self.page_wait)
+
+    def _current_url(self) -> str:
+        try:
+            import pyperclip
+        except ImportError as exc:
+            raise HTTPException(status_code=500, detail="pyperclip is not installed") from exc
+        windows_automation.keyboard_action("hotkey", keys=["alt", "d"])
+        time.sleep(self.action_wait)
+        windows_automation.keyboard_action("hotkey", keys=["ctrl", "c"])
+        time.sleep(self.action_wait)
+        url = str(pyperclip.paste() or "").strip()
+        windows_automation.keyboard_action("press", key="escape")
+        time.sleep(self.action_wait)
+        return url
+
+    def _extract_detail(self, result: dict[str, Any]) -> list[dict[str, Any]]:
+        chunks: list[dict[str, Any]] = []
+        title = str(result.get("title") or "")
+        for detail_page in range(self.detail_max_pages):
+            screenshot = self._capture()
+            prompt = f"""请提取文章、文档或详情页截图中与研究问题相关的可见信息。
+研究问题:{self.query}
+原搜索结果标题:{title}
+忽略广告、导航、Cookie 提示和重复页眉页脚。
+严格只输出 JSON:
+{{
+  "is_bottom": boolean,
+  "page_state": "article_or_detail|blocked|captcha|other",
+  "visible_information": string,
+  "confidence": number,
+  "notes": string
+}}"""
+            extraction = self._vision_json(prompt, screenshot)
+            extraction["detail_page"] = detail_page
+            chunks.append(extraction)
+            self.analyses.append({"type": "detail_extraction", "title": title, **extraction})
+            if extraction.get("is_bottom") or extraction.get("page_state") in {"blocked", "captcha"}:
+                break
+            windows_automation.keyboard_action("press", key="pagedown")
+            time.sleep(self.action_wait)
+        return chunks
+
+    def _clean_detail(self, result: dict[str, Any], visited_url: str, chunks: list[dict[str, Any]]) -> dict[str, Any]:
+        prompt = f"""请清理、去重并组织一个网页搜索结果中提取的信息。
+研究问题:{self.query}
+搜索结果:{json.dumps({**result, 'visited_url': visited_url}, ensure_ascii=False)}
+提取片段:{json.dumps(chunks, ensure_ascii=False)}
+严格只输出 JSON:
+{{"clean_title": string, "clean_text": string, "key_points": [string], "notes": string}}"""
+        cleaned = self._text_json(prompt)
+        self.analyses.append({"type": "clean_detail", "title": result.get("title"), **cleaned})
+        return cleaned
+
+    def _summarize(self, details: list[dict[str, Any]], ranked: list[dict[str, Any]]) -> dict[str, Any]:
+        if not details:
+            return {"summary": "未获取到可研究的网页详情。", "key_points": [], "conclusion": "", "notes": ""}
+        prompt = f"""请根据网页搜索研究结果生成事实清晰、避免重复的中文总结。
+研究问题:{self.query}
+排序结果:{json.dumps(ranked, ensure_ascii=False)}
+详情:{json.dumps(details, ensure_ascii=False)}
+严格只输出 JSON:
+{{"summary": string, "key_points": [string], "conclusion": string, "notes": string}}"""
+        summary = self._text_json(prompt)
+        self.analyses.append({"type": "final_summary", **summary})
+        return summary
+
+    def _write_report(
+        self,
+        results: list[dict[str, Any]],
+        ranked: list[dict[str, Any]],
+        details: list[dict[str, Any]],
+        summary: dict[str, Any],
+    ) -> str:
+        report_dir = settings_service.resolve_data_path("automation_runtime_path", "automation/runtime") / "web_search"
+        report_dir.mkdir(parents=True, exist_ok=True)
+        path = report_dir / f"search_{int(time.time() * 1000)}.json"
+        payload = {
+            "query": self.query,
+            "results": results,
+            "ranked_results": ranked,
+            "researched_details": details,
+            "final_summary": summary,
+        }
+        path.write_text(json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8")
+        return str(path)
+
+
+def web_search_node(node: dict[str, Any], inputs: dict[str, Any], context: WorkflowContext) -> dict[str, Any]:
+    params = {**(node.get("params") or {}), **inputs}
+    return WebSearchRunner(context, params).run()
+
+
+register_node(
+    {
+        "type": "browser.web_search",
+        "category": "browser",
+        "label": "网页搜索研究",
+        "params": {
+            "query": field_def("text", "搜索关键词", required=True),
+            "search_engine": field_def("select", "搜索引擎", "google", options=["google", "bing"]),
+            "browser": field_def("select", "浏览器", "edge", options=["default", "edge"]),
+            "max_search_pages": field_def("number", "最多搜索页屏", 4, minimum=1, maximum=20),
+            "result_count": field_def("number", "研究结果数", 3, minimum=1, maximum=10),
+            "detail_max_pages": field_def("number", "每页最多滚动", 4, minimum=1, maximum=20),
+            "click_attempts": field_def("number", "标题点击重试", 2, minimum=1, maximum=5),
+            "page_load_wait_seconds": field_def("number", "页面加载等待秒数", 8, minimum=0, maximum=120),
+            "action_wait_seconds": field_def("number", "操作等待秒数", 1, minimum=0, maximum=30),
+            "close_browser": field_def("boolean", "完成后关闭浏览器", True),
+            "include_debug_analyses": field_def("boolean", "返回调试分析", False),
+        },
+        "inputs": {
+            "query": field_def("string", "搜索关键词"),
+            "search_engine": field_def("string", "搜索引擎"),
+            "browser": field_def("string", "浏览器"),
+        },
+        "outputs": {
+            "query": {"type": "string", "label": "搜索关键词"},
+            "results": {"type": "array", "label": "搜索结果"},
+            "ranked_results": {"type": "array", "label": "排序结果"},
+            "researched_details": {"type": "array", "label": "详情研究结果"},
+            "summary": {"type": "string", "label": "总结"},
+            "key_points": {"type": "array", "label": "要点"},
+            "conclusion": {"type": "string", "label": "结论"},
+            "report_path": {"type": "string", "label": "结果文件"},
+        },
+        "control_ports": control_ports(["success", "no_results", "failure"]),
+    },
+    web_search_node,
+)
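
Review note on `_rank_results`: the ranking returned by the model is handled as untrusted input. Entries that are not objects, indexes that do not parse, duplicates, and out-of-range values are dropped, and the original order is used when nothing usable remains. The sketch below restates that filtering as a standalone function so it can be tested without a browser or a model. `select_ranked` is a hypothetical name used only for illustration; it is not part of this commit.

```python
from typing import Any


def select_ranked(results: list[dict[str, Any]], ranking: dict[str, Any], limit: int) -> list[dict[str, Any]]:
    """Keep only valid, unique, in-range indexes from a model-produced ranking."""
    ranked: list[dict[str, Any]] = []
    used: set[int] = set()
    for rank_item in ranking.get("ranked_results") or []:
        if not isinstance(rank_item, dict):
            continue  # the model may return strings or numbers instead of objects
        try:
            index = int(rank_item.get("original_index"))
        except (TypeError, ValueError):
            continue  # missing or non-numeric index
        if index in used or not 0 <= index < len(results):
            continue  # duplicate or out-of-range index
        used.add(index)
        ranked.append({**results[index], **rank_item, "original_index": index})
        if len(ranked) >= limit:
            break
    if not ranked:
        # Fall back to the original order when the ranking is unusable.
        ranked = [{**item, "original_index": i} for i, item in enumerate(results[:limit])]
    return ranked


results = [{"title": "A"}, {"title": "B"}, {"title": "C"}]
ranking = {
    "ranked_results": [
        {"original_index": 2},
        {"original_index": "x"},
        {"original_index": 2},
        {"original_index": 9},
        {"original_index": 0},
    ]
}
picked = select_ranked(results, ranking, limit=2)
fallback = select_ranked(results, {"ranked_results": "oops"}, limit=2)
```

Here `picked` keeps only the entries for indexes 2 and 0, and `fallback` returns the first two results in their original order.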

+ 4 - 0
backend/app/automation/registry.py

@@ -14,6 +14,7 @@ NODE_EXECUTORS: dict[str, NodeExecutor] = {}
 
 def register_node(definition: dict[str, Any], executor: NodeExecutor) -> None:
     node_type = str(definition["type"])
+    definition.setdefault("description", f"执行“{definition.get('label') or node_type}”节点。")
     NODE_REGISTRY[node_type] = definition
     NODE_EXECUTORS[node_type] = executor
 
@@ -36,6 +37,7 @@ def field_def(
     options: list[Any] | None = None,
     minimum: float | None = None,
     maximum: float | None = None,
+    description: str | None = None,
 ) -> dict[str, Any]:
     item: dict[str, Any] = {
         "type": field_type,
@@ -50,6 +52,8 @@ def field_def(
         item["min"] = minimum
     if maximum is not None:
         item["max"] = maximum
+    if description:
+        item["description"] = description
     return item
 
 

+ 270 - 7
backend/app/automation_service.py

@@ -1,12 +1,14 @@
 from __future__ import annotations
 
 import base64
+import io
 import json
 import mimetypes
 import re
 import sqlite3
 import time
 import uuid
+import zipfile
 from pathlib import Path
 from typing import Any
 
@@ -677,9 +679,129 @@ def update_workflow(workflow_id: int, payload: AutomationWorkflowSaveRequest) ->
     return get_workflow(workflow_id)
 
 
+def import_workflow(payload: AutomationWorkflowSaveRequest, conflict_strategy: str) -> dict[str, Any]:
+    """Import a workflow/v1 document; in replace mode, overwrite the workflow that has the same workflow_key on the target device."""
+    workflow_key = normalize_workflow_key(payload.workflow_key)
+    if conflict_strategy == "replace" and workflow_key:
+        with get_db() as conn:
+            existing = conn.execute(
+                "SELECT id FROM automation_workflows WHERE workflow_key = ?",
+                (workflow_key,),
+            ).fetchone()
+        if existing:
+            return update_workflow(int(existing["id"]), payload)
+    return save_workflow(payload)
+
+
+def export_workflows_zip() -> bytes:
+    """Pack every workflow into a ZIP archive for bulk migration between devices."""
+    with get_db() as conn:
+        rows = conn.execute(
+            """
+            SELECT *
+            FROM automation_workflows
+            ORDER BY workflow_key ASC, name ASC, id ASC
+            """
+        ).fetchall()
+
+    buffer = io.BytesIO()
+    exported_items: list[dict[str, Any]] = []
+    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
+        for row in rows:
+            workflow = workflow_export_payload(workflow_to_public(row))
+            key = normalize_workflow_key(str(workflow.get("workflow_key") or "")) or f"workflow-{row['id']}"
+            filename = f"workflows/{safe_zip_name(key)}.workflow.json"
+            archive.writestr(filename, json.dumps(workflow, ensure_ascii=False, indent=2))
+            exported_items.append({"workflow_key": workflow.get("workflow_key"), "name": workflow.get("name"), "path": filename})
+        archive.writestr(
+            "manifest.json",
+            json.dumps(
+                {
+                    "schema_version": "workflow-zip/v1",
+                    "exported_at": now_iso(),
+                    "count": len(exported_items),
+                    "items": exported_items,
+                },
+                ensure_ascii=False,
+                indent=2,
+            ),
+        )
+    return buffer.getvalue()
+
+
+def import_workflows_zip(content: bytes) -> dict[str, Any]:
+    """Bulk-import workflows from a ZIP archive; skip duplicate workflow_key values instead of overwriting local configuration."""
+    if not content:
+        raise HTTPException(status_code=400, detail="zip content is required")
+
+    created: list[dict[str, Any]] = []
+    skipped: list[dict[str, Any]] = []
+    failed: list[dict[str, Any]] = []
+    try:
+        archive = zipfile.ZipFile(io.BytesIO(content))
+    except zipfile.BadZipFile as exc:
+        raise HTTPException(status_code=400, detail="Invalid workflow zip file") from exc
+
+    with archive:
+        names = [name for name in archive.namelist() if name.lower().endswith(".json") and Path(name).name != "manifest.json"]
+        for name in names:
+            try:
+                with archive.open(name) as file:
+                    raw = json.loads(file.read().decode("utf-8"))
+                payload = AutomationWorkflowSaveRequest.model_validate(raw)
+                workflow_key = normalize_workflow_key(payload.workflow_key)
+                if workflow_key and workflow_key_exists(workflow_key):
+                    skipped.append({"path": name, "workflow_key": workflow_key, "reason": "workflow_key already exists"})
+                    continue
+                saved = save_workflow(payload)
+                created.append({"path": name, "id": saved["id"], "workflow_key": saved.get("workflow_key"), "name": saved.get("name")})
+            except Exception as exc:
+                failed.append({"path": name, "error": str(exc)})
+
+    return {
+        "created_count": len(created),
+        "skipped_count": len(skipped),
+        "failed_count": len(failed),
+        "created": created,
+        "skipped": skipped,
+        "failed": failed,
+    }
+
+
+def workflow_key_exists(workflow_key: str) -> bool:
+    """Check whether the same workflow key already exists on the target device."""
+    with get_db() as conn:
+        row = conn.execute("SELECT id FROM automation_workflows WHERE workflow_key = ?", (workflow_key,)).fetchone()
+    return bool(row)
+
+
+def safe_zip_name(value: str) -> str:
+    """Build a safe ZIP filename fragment that avoids path-rule differences between systems."""
+    name = re.sub(r"[^A-Za-z0-9_.-]+", "-", value).strip(".-")
+    return name or "workflow"
+
+
+def workflow_export_payload(workflow: dict[str, Any]) -> dict[str, Any]:
+    """Extract the portable workflow/v1 fields, excluding the local database ID and timestamps."""
+    return {
+        key: workflow.get(key)
+        for key in [
+            "schema_version",
+            "workflow_key",
+            "name",
+            "description",
+            "variables",
+            "settings",
+            "nodes",
+            "edges",
+        ]
+    }
+
+
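 def normalize_workflow_payload(payload: AutomationWorkflowSaveRequest) -> dict[str, Any]:
     """Convert the request model into the persisted workflow/v1 JSON."""
-    workflow_json = payload.model_dump()
+    # Exclude the nulls Pydantic fills in for optional reference fields so that an import followed by an export stays compact and stable.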
+    workflow_json = payload.model_dump(exclude_none=True)
     workflow_json["schema_version"] = "workflow/v1"
     workflow_json["workflow_key"] = normalize_workflow_key(payload.workflow_key)
     workflow_json["name"] = payload.name.strip()
@@ -715,6 +837,126 @@ def list_workflows(page: int, page_size: int) -> dict[str, Any]:
     return {"items": [workflow_summary(row) for row in rows], "total": total, "page": page, "page_size": page_size}
 
 
+def web_search_workflow_template() -> dict[str, Any]:
+    """Return the AI multi-round web research workflow that external programs can call."""
+    return {
+        "schema_version": "workflow/v1",
+        "workflow_key": "ai-web-research",
+        "name": "AI 多轮网页搜索研究",
+        "description": "AI 制定搜索计划,使用视觉浏览器多轮研究,并按调用方提供的 JSON Schema 返回结果。",
+        "variables": {
+            "objective": {
+                "type": "string",
+                "default": "",
+                "description": "要搜索和研究的目标",
+            },
+            "output_schema": {
+                "type": "object",
+                "default": {
+                    "type": "object",
+                    "required": ["summary", "facts"],
+                    "properties": {
+                        "summary": {"type": "string"},
+                        "facts": {"type": "array", "items": {"type": "string"}},
+                    },
+                    "additionalProperties": False,
+                },
+                "description": "最终 data 字段必须满足的 JSON Schema",
+            },
+            "constraints": {
+                "type": "object",
+                "default": {"language": "zh-CN", "min_sources": 1},
+                "description": "来源、语言、时间范围等研究约束",
+            },
+            "max_attempts": {
+                "type": "number",
+                "default": 3,
+                "description": "AI 评估未达标时最多执行的搜索轮数",
+            },
+        },
+        "settings": {
+            "max_steps": 10,
+            "default_timeout_ms": 1800000,
+            "on_unhandled_error": "pause_for_user",
+            "return": {"node_id": "research"},
+        },
+        "nodes": [
+            {
+                "id": "start",
+                "type": "flow.start",
+                "title": "开始",
+                "position": {"x": 80, "y": 180},
+                "params": {},
+                "inputs": {},
+            },
+            {
+                "id": "research",
+                "type": "research.ai_web_research",
+                "title": "AI 规划并循环研究",
+                "position": {"x": 360, "y": 180},
+                "params": {
+                    "search_engine": "bing",
+                    "browser": "edge",
+                    "max_search_pages": 2,
+                    "result_count": 2,
+                    "detail_max_pages": 2,
+                },
+                "inputs": {
+                    "objective": {"source": "variable", "name": "objective"},
+                    "output_schema": {"source": "variable", "name": "output_schema"},
+                    "constraints": {"source": "variable", "name": "constraints"},
+                    "max_attempts": {"source": "variable", "name": "max_attempts"},
+                },
+            },
+            {
+                "id": "end",
+                "type": "flow.end",
+                "title": "结束",
+                "position": {"x": 680, "y": 180},
+                "params": {},
+                "inputs": {},
+            },
+        ],
+        "edges": [
+            {
+                "id": "start_to_research",
+                "kind": "control",
+                "source": "start",
+                "source_port": "next",
+                "target": "research",
+                "target_port": "run",
+            },
+            {
+                "id": "research_to_end",
+                "kind": "control",
+                "source": "research",
+                "source_port": "success",
+                "target": "end",
+                "target_port": "run",
+            },
+            {
+                "id": "partial_to_end",
+                "kind": "control",
+                "source": "research",
+                "source_port": "partial",
+                "target": "end",
+                "target_port": "run",
+            },
+        ],
+    }
+
+
+def create_web_search_workflow() -> dict[str, Any]:
+    """Create the full research workflow; return the existing one if present so user adjustments are not overwritten."""
+    template = AutomationWorkflowSaveRequest.model_validate(web_search_workflow_template())
+    try:
+        return get_workflow_by_key(str(template.workflow_key))
+    except HTTPException as exc:
+        if exc.status_code != 404:
+            raise
+    return save_workflow(template)
+
+
 def get_workflow(workflow_id: int) -> dict[str, Any]:
     """Read the details of a workflow/v1 workflow."""
     with get_db() as conn:
@@ -737,6 +979,12 @@ def get_workflow_by_key(workflow_key: str) -> dict[str, Any]:
     return workflow_to_public(workflow)
 
 
+def export_workflow_by_key(workflow_key: str) -> dict[str, Any]:
+    """Export portable workflow/v1 JSON without the database ID or time fields."""
+    workflow = get_workflow_by_key(workflow_key)
+    return workflow_export_payload(workflow)
+
+
 def delete_workflow(workflow_id: int) -> dict[str, Any]:
     """Delete a workflow and its nodes."""
     with get_db() as conn:
@@ -746,9 +994,9 @@ def delete_workflow(workflow_id: int) -> dict[str, Any]:
     return {"deleted": cursor.rowcount}
 
 
-def run_workflow(workflow_id: int, payload: AutomationWorkflowRunRequest) -> dict[str, Any]:
-    """Execute a workflow/v1 workflow graph."""
-    workflow = get_workflow(workflow_id)
+def execute_workflow(workflow: dict[str, Any], payload: AutomationWorkflowRunRequest) -> dict[str, Any]:
+    """Execute a workflow snapshot synchronously; intended only for the background task worker."""
+    workflow_id = workflow.get("id")
     defaults = settings_service.default_ai_params()
     provider_id = payload.provider_id or defaults.get("provider_id")
     model_id = payload.model_id or defaults.get("model_id")
@@ -815,9 +1063,24 @@ def run_workflow(workflow_id: int, payload: AutomationWorkflowRunRequest) -> dic
     return {"workflow_id": workflow_id, "status": "SUCCESS", "results": results, "outputs": context.outputs}
 
 
-def run_workflow_by_key(workflow_key: str, payload: AutomationWorkflowRunRequest) -> dict[str, Any]:
-    workflow = get_workflow_by_key(workflow_key)
-    return run_workflow(int(workflow["id"]), payload)
+def run_workflow(workflow_id: int, payload: AutomationWorkflowRunRequest) -> dict[str, Any]:
+    """Kept for internal callers; the HTTP API no longer executes synchronously by database ID."""
+    return execute_workflow(get_workflow(workflow_id), payload)
+
+
+def workflow_return_data(workflow: dict[str, Any], result: dict[str, Any]) -> Any:
+    """Extract the final data for external callers according to workflow.settings.return."""
+    return_config = (workflow.get("settings") or {}).get("return") or {}
+    node_id = str(return_config.get("node_id") or "").strip()
+    output_name = str(return_config.get("output") or "").strip()
+    if not node_id:
+        return result
+    node_outputs = (result.get("outputs") or {}).get(node_id)
+    if not output_name:
+        return node_outputs
+    if isinstance(node_outputs, dict):
+        return node_outputs.get(output_name)
+    return None
 
 
 def execute_workflow_node(

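Review note on the ZIP export/import pair: export writes one `workflows/<key>.workflow.json` entry per workflow plus a `manifest.json`, and import reads every `.json` entry except the manifest, skipping keys that already exist locally. The sketch below reproduces that round trip against an in-memory dict. `export_zip`, `import_zip`, and `store` are hypothetical stand-ins for the database-backed functions, and the key normalization done by `normalize_workflow_key` is left out because its rules are not visible in this diff.

```python
import io
import json
import re
import zipfile
from pathlib import Path
from typing import Any


def safe_zip_name(value: str) -> str:
    """Same sanitizing rule as in the diff: collapse unsafe runs to '-'."""
    name = re.sub(r"[^A-Za-z0-9_.-]+", "-", value).strip(".-")
    return name or "workflow"


def export_zip(workflows: list[dict[str, Any]]) -> bytes:
    """Write each workflow as its own JSON entry, then a small manifest."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for wf in workflows:
            entry = f"workflows/{safe_zip_name(wf['workflow_key'])}.workflow.json"
            archive.writestr(entry, json.dumps(wf, ensure_ascii=False))
        manifest = {"schema_version": "workflow-zip/v1", "count": len(workflows)}
        archive.writestr("manifest.json", json.dumps(manifest))
    return buffer.getvalue()


def import_zip(content: bytes, store: dict[str, dict[str, Any]]) -> dict[str, list[str]]:
    """Create missing workflows and skip keys that are already in the store."""
    report: dict[str, list[str]] = {"created": [], "skipped": []}
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".json") or Path(name).name == "manifest.json":
                continue
            wf = json.loads(archive.read(name).decode("utf-8"))
            key = wf["workflow_key"]
            if key in store:
                report["skipped"].append(key)  # never overwrite local configuration
                continue
            store[key] = wf
            report["created"].append(key)
    return report


store = {"ai-web-research": {"workflow_key": "ai-web-research", "name": "local copy"}}
blob = export_zip([
    {"workflow_key": "ai-web-research", "name": "exported"},
    {"workflow_key": "bilibili home/random", "name": "video"},
])
report = import_zip(blob, store)
```

One point that may be worth checking: two keys that sanitize to the same fragment would produce duplicate entry names in the archive. Whether `normalize_workflow_key` already rules that out cannot be told from this diff.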
+ 33 - 1
backend/app/database.py

@@ -221,6 +221,38 @@ def init_db() -> None:
             CREATE INDEX IF NOT EXISTS idx_automation_workflow_nodes_workflow
                 ON automation_workflow_nodes(workflow_id, node_index);
 
+            CREATE TABLE IF NOT EXISTS automation_workflow_tasks (
+                id TEXT PRIMARY KEY,
+                workflow_id INTEGER,
+                workflow_key TEXT NOT NULL,
+                workflow_name TEXT NOT NULL,
+                status TEXT NOT NULL,
+                request_json TEXT NOT NULL,
+                workflow_snapshot_json TEXT NOT NULL,
+                result_json TEXT,
+                return_data_json TEXT,
+                error_message TEXT,
+                created_at TEXT NOT NULL,
+                started_at TEXT,
+                finished_at TEXT,
+                FOREIGN KEY(workflow_id) REFERENCES automation_workflows(id) ON DELETE SET NULL
+            );
+
+            CREATE INDEX IF NOT EXISTS idx_automation_workflow_tasks_status_created
+                ON automation_workflow_tasks(status, created_at);
+            CREATE INDEX IF NOT EXISTS idx_automation_workflow_tasks_created
+                ON automation_workflow_tasks(created_at DESC);
+
+            CREATE TABLE IF NOT EXISTS automation_workflow_runtime (
+                id INTEGER PRIMARY KEY CHECK(id = 1),
+                active_task_id TEXT,
+                updated_at TEXT,
+                FOREIGN KEY(active_task_id) REFERENCES automation_workflow_tasks(id) ON DELETE SET NULL
+            );
+
+            INSERT OR IGNORE INTO automation_workflow_runtime (id, active_task_id, updated_at)
+            VALUES (1, NULL, NULL);
+
             CREATE TABLE IF NOT EXISTS automation_errors (
                 id INTEGER PRIMARY KEY AUTOINCREMENT,
                 workflow_id INTEGER,
@@ -307,7 +339,7 @@ def seed_default_settings(conn: sqlite3.Connection) -> None:
         ("automation_runtime_path", "automation/runtime", "自动化临时截图保存路径"),
         ("automation_auto_screenshot_enabled", "0", "自动化操作页面是否默认自动截屏"),
         ("automation_auto_screenshot_interval", "30", "自动化操作页面默认自动截屏间隔秒数"),
-        ("automation_remote_token", "", "远程执行工作流接口 Token,设置后按 key 执行接口必须携带 X-Automation-Token"),
+        ("automation_remote_token", "", "远程执行工作流和查询任务状态的 Token,设置后需通过 X-Automation-Token 或 Bearer Token 传入"),
     ]
     now = __import__("datetime").datetime.now().astimezone().isoformat(timespec="seconds")
     for key, value, description in settings:
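The `automation_workflow_runtime` table in this hunk is limited to one row by `CHECK(id = 1)` and seeded with `INSERT OR IGNORE`. A minimal sketch of how that pattern behaves, assuming an in-memory SQLite database and a trimmed-down, hypothetical `runtime` table rather than the project's real schema (the helper names are illustrative only):

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for automation_workflow_runtime.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runtime (
    id INTEGER PRIMARY KEY CHECK(id = 1),
    active_task_id TEXT,
    updated_at TEXT
);
INSERT OR IGNORE INTO runtime (id, active_task_id, updated_at)
VALUES (1, NULL, NULL);
"""


def init_runtime(conn: sqlite3.Connection) -> None:
    """Create the table and seed its only row; safe to call repeatedly."""
    conn.executescript(SCHEMA)


def row_count(conn: sqlite3.Connection) -> int:
    """Return how many rows the runtime table currently holds."""
    return conn.execute("SELECT COUNT(*) FROM runtime").fetchone()[0]


def try_insert_second_row(conn: sqlite3.Connection) -> bool:
    """Return True if a second row was accepted; the CHECK should prevent it."""
    try:
        conn.execute("INSERT INTO runtime (id) VALUES (2)")
    except sqlite3.IntegrityError:
        return False
    return True


conn = sqlite3.connect(":memory:")
init_runtime(conn)
init_runtime(conn)  # repeated start-up: IF NOT EXISTS / OR IGNORE make this a no-op
```

Because the seed row always exists, later code can issue a plain `UPDATE ... WHERE id = 1` without first checking whether the row is present.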

+ 95 - 15
backend/app/main.py

@@ -5,10 +5,10 @@ import secrets
 import sqlite3
 from typing import Any
 
-from fastapi import FastAPI, Header, HTTPException, Query
+from fastapi import Body, FastAPI, Header, HTTPException, Query, Response
 from fastapi.middleware.cors import CORSMiddleware
 
-from . import ai_service, automation_service, settings_service, windows_automation
+from . import ai_service, automation_service, settings_service, windows_automation, workflow_task_service
 from .control import (
     CONFIRMED_CONTROL_STATUSES,
     restart_service,
@@ -44,6 +44,7 @@ from .schemas import (
     AutomationVisionAnalyzeRequest,
     AutomationWorkflowRunRequest,
     AutomationWorkflowSaveRequest,
+    AutomationWorkflowImportRequest,
     AutomationWorkflowPlanRequest,
     AutomationWorkflowPlanContinueRequest,
     BatchStatusUpdate,
@@ -115,17 +116,44 @@ app.add_middleware(
 )
 
 
-def verify_automation_token(x_automation_token: str | None = Header(default=None)) -> None:
+def bearer_token(authorization: str | None) -> str | None:
+    """Extract the Bearer token from the Authorization header; compatible with the usual iOS Shortcuts format."""
+    value = str(authorization or "").strip()
+    if not value:
+        return None
+    prefix = "bearer "
+    if value.lower().startswith(prefix):
+        return value[len(prefix):].strip()
+    return None
+
+
+def verify_automation_token(
+    x_automation_token: str | None = None,
+    authorization: str | None = None,
+    automation_token: str | None = None,
+) -> None:
+    """Validate the remote automation token; when no token is configured, skip the check to stay compatible with development setups."""
     configured = settings_service.automation_remote_token()
     if not configured:
-        raise HTTPException(status_code=403, detail="Automation remote token is not configured")
-    if not x_automation_token or not secrets.compare_digest(x_automation_token, configured):
+        return
+    candidates = [
+        str(x_automation_token or "").strip(),
+        str(bearer_token(authorization) or "").strip(),
+        str(automation_token or "").strip(),
+    ]
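+    expected = configured.encode("utf-8")
+    # Compare as bytes: secrets.compare_digest() raises TypeError on non-ASCII str input.
+    if not any(candidate and secrets.compare_digest(candidate.encode("utf-8"), expected) for candidate in candidates):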
         raise HTTPException(status_code=401, detail="Invalid automation token")
 
 
 @app.on_event("startup")
 def startup() -> None:
     init_db()
+    workflow_task_service.start_worker()
+
+
+@app.on_event("shutdown")
+def shutdown() -> None:
+    workflow_task_service.stop_worker()
 
 
 def build_where(
@@ -819,14 +847,34 @@ def automation_workflow_plan_continue(payload: AutomationWorkflowPlanContinueReq
     return automation_service.continue_workflow_plan(payload)
 
 
+@app.post("/api/automation/workflows/templates/web-search")
+def automation_web_search_workflow_create() -> dict[str, Any]:
+    return automation_service.create_web_search_workflow()
+
+
 @app.post("/api/automation/workflows")
 def automation_workflow_create(payload: AutomationWorkflowSaveRequest) -> dict[str, Any]:
     return automation_service.save_workflow(payload)
 
 
-@app.get("/api/automation/workflows/{workflow_id}")
-def automation_workflow_detail(workflow_id: int) -> dict[str, Any]:
-    return automation_service.get_workflow(workflow_id)
+@app.post("/api/automation/workflows/import")
+def automation_workflow_import(payload: AutomationWorkflowImportRequest) -> dict[str, Any]:
+    return automation_service.import_workflow(payload.workflow, payload.conflict_strategy)
+
+
+@app.get("/api/automation/workflows/export.zip")
+def automation_workflows_export_zip() -> Response:
+    content = automation_service.export_workflows_zip()
+    return Response(
+        content=content,
+        media_type="application/zip",
+        headers={"Content-Disposition": 'attachment; filename="workflows.zip"'},
+    )
+
+
+@app.post("/api/automation/workflows/import.zip")
+def automation_workflows_import_zip(content: bytes = Body(..., media_type="application/zip")) -> dict[str, Any]:
+    return automation_service.import_workflows_zip(content)
 
 
 @app.get("/api/automation/workflows/by-key/{workflow_key}")
@@ -834,19 +882,51 @@ def automation_workflow_detail_by_key(workflow_key: str) -> dict[str, Any]:
     return automation_service.get_workflow_by_key(workflow_key)
 
 
-@app.post("/api/automation/workflows/{workflow_id}/run")
-def automation_workflow_run(workflow_id: int, payload: AutomationWorkflowRunRequest) -> dict[str, Any]:
-    return automation_service.run_workflow(workflow_id, payload)
+@app.get("/api/automation/workflows/by-key/{workflow_key}/export")
+def automation_workflow_export(workflow_key: str) -> dict[str, Any]:
+    return automation_service.export_workflow_by_key(workflow_key)
 
 
-@app.post("/api/automation/workflows/by-key/{workflow_key}/run")
+@app.get("/api/automation/workflows/{workflow_id}")
+def automation_workflow_detail(workflow_id: int) -> dict[str, Any]:
+    return automation_service.get_workflow(workflow_id)
+
+
+@app.post("/api/automation/workflows/by-key/{workflow_key}/run", status_code=202)
 def automation_workflow_run_by_key(
     workflow_key: str,
     payload: AutomationWorkflowRunRequest,
-    x_automation_token: str | None = Header(default=None),
+    x_automation_token: str | None = Header(default=None, alias="X-Automation-Token"),
+    authorization: str | None = Header(default=None, alias="Authorization"),
+    automation_token: str | None = Query(default=None),
+) -> dict[str, Any]:
+    verify_automation_token(x_automation_token, authorization, automation_token)
+    workflow = automation_service.get_workflow_by_key(workflow_key)
+    return workflow_task_service.enqueue_workflow_task(workflow, payload)
+
+
+@app.get("/api/automation/workflow-tasks")
+def automation_workflow_tasks(
+    page: int = Query(default=1, ge=1),
+    page_size: int = Query(default=20, ge=1, le=200),
+    status: str | None = None,
+    x_automation_token: str | None = Header(default=None, alias="X-Automation-Token"),
+    authorization: str | None = Header(default=None, alias="Authorization"),
+    automation_token: str | None = Query(default=None),
+) -> dict[str, Any]:
+    verify_automation_token(x_automation_token, authorization, automation_token)
+    return workflow_task_service.list_workflow_tasks(page, page_size, status)
+
+
+@app.get("/api/automation/workflow-tasks/{task_id}")
+def automation_workflow_task_detail(
+    task_id: str,
+    x_automation_token: str | None = Header(default=None, alias="X-Automation-Token"),
+    authorization: str | None = Header(default=None, alias="Authorization"),
+    automation_token: str | None = Query(default=None),
 ) -> dict[str, Any]:
-    verify_automation_token(x_automation_token)
-    return automation_service.run_workflow_by_key(workflow_key, payload)
+    verify_automation_token(x_automation_token, authorization, automation_token)
+    return workflow_task_service.get_workflow_task(task_id)
 
 
 @app.put("/api/automation/workflows/{workflow_id}")
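With this change the run-by-key endpoint answers `202` with a queued task, and the outcome has to be fetched from `/api/automation/workflow-tasks/{task_id}`. A minimal sketch of a client-side polling loop, assuming a hypothetical `fetch_task` callable that the caller wires to an HTTP client of their choice (sending `X-Automation-Token` or a Bearer token when one is configured); the terminal statuses mirror `TERMINAL_STATUSES` in `workflow_task_service.py`, while the interval and timeout defaults are illustrative:

```python
from __future__ import annotations

import time
from typing import Any, Callable

# Mirrors TERMINAL_STATUSES in workflow_task_service.py.
TERMINAL_STATUSES = {"SUCCESS", "FAILED", "PAUSED"}


def wait_for_task(
    fetch_task: Callable[[str], dict[str, Any]],
    task_id: str,
    interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll until the task reaches a terminal status or the timeout elapses.

    fetch_task is a hypothetical callable; in practice it would issue
    GET /api/automation/workflow-tasks/{task_id} and return the JSON body.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if str(task.get("status", "")).upper() in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} is still {task.get('status')!r}")
        sleep(interval)
```

Injecting `sleep` keeps the loop easy to exercise in tests without real delays.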

+ 5 - 0
backend/app/schemas.py

@@ -213,6 +213,11 @@ class AutomationWorkflowRunRequest(BaseModel):
     variables: dict[str, Any] = Field(default_factory=dict)
 
 
+class AutomationWorkflowImportRequest(BaseModel):
+    workflow: AutomationWorkflowSaveRequest
+    conflict_strategy: Literal["error", "replace"] = "error"
+
+
 class AutomationWorkflowPlanRequest(BaseModel):
     requirement: str = Field(min_length=1, max_length=4000)
     provider_id: int | None = None

+ 268 - 0
backend/app/workflow_task_service.py

@@ -0,0 +1,268 @@
+from __future__ import annotations
+
+import json
+import threading
+import time
+import uuid
+from typing import Any
+
+from fastapi import HTTPException
+
+from .database import get_db
+from .scanner import now_iso
+from .schemas import AutomationWorkflowRunRequest
+
+
+TERMINAL_STATUSES = {"SUCCESS", "FAILED", "PAUSED"}
+_worker_thread: threading.Thread | None = None
+_worker_guard = threading.Lock()
+_stop_event = threading.Event()
+_wake_event = threading.Event()
+
+
+def start_worker() -> None:
+    """Start the single background consumer thread and recover tasks left behind by an abnormal service exit."""
+    global _worker_thread
+    with _worker_guard:
+        if _worker_thread and _worker_thread.is_alive():
+            return
+        with get_db() as conn:
+            conn.execute(
+                """
+                UPDATE automation_workflow_tasks
+                SET status = 'QUEUED', started_at = NULL,
+                    error_message = CASE
+                        WHEN error_message IS NULL OR error_message = '' THEN '服务重启后重新排队'
+                        ELSE error_message
+                    END
+                WHERE status = 'RUNNING'
+                """
+            )
+            conn.execute(
+                "UPDATE automation_workflow_runtime SET active_task_id = NULL, updated_at = ? WHERE id = 1",
+                (now_iso(),),
+            )
+        _stop_event.clear()
+        _worker_thread = threading.Thread(target=_worker_loop, name="workflow-task-worker", daemon=True)
+        _worker_thread.start()
+
+
+def stop_worker() -> None:
+    """Signal the background thread to stop; a node that is still executing ends when the current process exits."""
+    _stop_event.set()
+    _wake_event.set()
+
+
+def enqueue_workflow_task(workflow: dict[str, Any], payload: AutomationWorkflowRunRequest) -> dict[str, Any]:
+    """Save a snapshot of the task and add it to the global serial queue."""
+    workflow_key = str(workflow.get("workflow_key") or "").strip()
+    if not workflow_key:
+        raise HTTPException(status_code=400, detail="Workflow key is required before execution")
+    task_id = str(uuid.uuid4())
+    created_at = now_iso()
+    request_json = json.dumps(payload.model_dump(), ensure_ascii=False)
+    snapshot_json = json.dumps(workflow, ensure_ascii=False)
+    with get_db() as conn:
+        conn.execute(
+            """
+            INSERT INTO automation_workflow_tasks (
+                id, workflow_id, workflow_key, workflow_name, status,
+                request_json, workflow_snapshot_json, created_at
+            ) VALUES (?, ?, ?, ?, 'QUEUED', ?, ?, ?)
+            """,
+            (
+                task_id,
+                workflow.get("id"),
+                workflow_key,
+                str(workflow.get("name") or workflow_key),
+                request_json,
+                snapshot_json,
+                created_at,
+            ),
+        )
+    _wake_event.set()
+    return get_workflow_task(task_id)
+
+
+def list_workflow_tasks(page: int, page_size: int, status: str | None = None) -> dict[str, Any]:
+    """Read the asynchronous workflow task history, one page at a time."""
+    clauses: list[str] = []
+    params: list[Any] = []
+    if status:
+        clauses.append("status = ?")
+        params.append(status.upper())
+    where_sql = "WHERE " + " AND ".join(clauses) if clauses else ""
+    offset = (page - 1) * page_size
+    with get_db() as conn:
+        total = conn.execute(
+            f"SELECT COUNT(*) AS total FROM automation_workflow_tasks {where_sql}",
+            params,
+        ).fetchone()["total"]
+        rows = conn.execute(
+            f"""
+            SELECT * FROM automation_workflow_tasks
+            {where_sql}
+            ORDER BY created_at DESC
+            LIMIT ? OFFSET ?
+            """,
+            [*params, page_size, offset],
+        ).fetchall()
+    return {
+        "items": [task_to_public(row, include_payload=False) for row in rows],
+        "total": total,
+        "page": page,
+        "page_size": page_size,
+    }
+
+
+def get_workflow_task(task_id: str) -> dict[str, Any]:
+    with get_db() as conn:
+        row = conn.execute("SELECT * FROM automation_workflow_tasks WHERE id = ?", (task_id,)).fetchone()
+    if not row:
+        raise HTTPException(status_code=404, detail="Workflow task not found")
+    return task_to_public(row, include_payload=True)
+
+
+def task_to_public(row: dict[str, Any], include_payload: bool) -> dict[str, Any]:
+    item = {
+        "id": row["id"],
+        "workflow_id": row.get("workflow_id"),
+        "workflow_key": row["workflow_key"],
+        "workflow_name": row["workflow_name"],
+        "status": row["status"],
+        "error_message": row.get("error_message"),
+        "created_at": row["created_at"],
+        "started_at": row.get("started_at"),
+        "finished_at": row.get("finished_at"),
+    }
+    if row["status"] == "QUEUED":
+        item["queue_position"] = queue_position(row["id"], row["created_at"])
+    else:
+        item["queue_position"] = None
+    if include_payload:
+        item["request"] = parse_json_object(row.get("request_json"))
+        item["result"] = parse_json_value(row.get("result_json"))
+        item["return_data"] = parse_json_value(row.get("return_data_json"))
+    return item
+
+
+def queue_position(task_id: str, created_at: str) -> int:
+    with get_db() as conn:
+        count = conn.execute(
+            """
+            SELECT COUNT(*) AS total
+            FROM automation_workflow_tasks
+            WHERE status = 'QUEUED'
+              AND (created_at < ? OR (created_at = ? AND id <= ?))
+            """,
+            (created_at, created_at, task_id),
+        ).fetchone()["total"]
+    return int(count)
+
+
+def parse_json_object(value: str | None) -> dict[str, Any]:
+    parsed = parse_json_value(value)
+    return parsed if isinstance(parsed, dict) else {}
+
+
+def parse_json_value(value: str | None) -> Any:
+    if not value:
+        return None
+    try:
+        return json.loads(value)
+    except json.JSONDecodeError:
+        return None
+
+
+def _worker_loop() -> None:
+    while not _stop_event.is_set():
+        task = _claim_next_task()
+        if not task:
+            _wake_event.wait(1)
+            _wake_event.clear()
+            continue
+        _execute_task(task)
+
+
+def _claim_next_task() -> dict[str, Any] | None:
+    """Claim a task under SQLite's write lock so that at most one task is globally active at any time."""
+    with get_db() as conn:
+        conn.execute("BEGIN IMMEDIATE")
+        runtime = conn.execute(
+            "SELECT active_task_id FROM automation_workflow_runtime WHERE id = 1"
+        ).fetchone()
+        if runtime and runtime.get("active_task_id"):
+            return None
+        task = conn.execute(
+            """
+            SELECT * FROM automation_workflow_tasks
+            WHERE status = 'QUEUED'
+            ORDER BY created_at ASC, id ASC
+            LIMIT 1
+            """
+        ).fetchone()
+        if not task:
+            return None
+        started_at = now_iso()
+        conn.execute(
+            "UPDATE automation_workflow_tasks SET status = 'RUNNING', started_at = ?, error_message = NULL WHERE id = ?",
+            (started_at, task["id"]),
+        )
+        conn.execute(
+            "UPDATE automation_workflow_runtime SET active_task_id = ?, updated_at = ? WHERE id = 1",
+            (task["id"], started_at),
+        )
+        task["status"] = "RUNNING"
+        task["started_at"] = started_at
+        return task
+
+
+def _execute_task(task: dict[str, Any]) -> None:
+    from . import automation_service
+
+    result: dict[str, Any] | None = None
+    error_message: str | None = None
+    status = "FAILED"
+    try:
+        workflow = parse_json_object(task.get("workflow_snapshot_json"))
+        payload = AutomationWorkflowRunRequest.model_validate(parse_json_object(task.get("request_json")))
+        result = automation_service.execute_workflow(workflow, payload)
+        raw_status = str(result.get("status") or "FAILED").upper()
+        status = raw_status if raw_status in TERMINAL_STATUSES else "FAILED"
+        if status == "FAILED":
+            error_message = workflow_failure_message(result)
+        return_data = automation_service.workflow_return_data(workflow, result)
+    except Exception as exc:
+        error_message = str(exc)
+        return_data = None
+        result = {"status": "FAILED", "detail": error_message}
+    finished_at = now_iso()
+    with get_db() as conn:
+        conn.execute("BEGIN IMMEDIATE")
+        conn.execute(
+            """
+            UPDATE automation_workflow_tasks
+            SET status = ?, result_json = ?, return_data_json = ?, error_message = ?, finished_at = ?
+            WHERE id = ?
+            """,
+            (
+                status,
+                json.dumps(result, ensure_ascii=False),
+                json.dumps(return_data, ensure_ascii=False) if return_data is not None else None,
+                error_message,
+                finished_at,
+                task["id"],
+            ),
+        )
+        conn.execute(
+            "UPDATE automation_workflow_runtime SET active_task_id = NULL, updated_at = ? WHERE id = 1",
+            (finished_at,),
+        )
+    _wake_event.set()
+
+
+def workflow_failure_message(result: dict[str, Any]) -> str | None:
+    failed = result.get("failed")
+    if isinstance(failed, dict) and failed.get("detail"):
+        return str(failed["detail"])
+    return str(result.get("detail") or "Workflow execution failed")
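`_claim_next_task` takes SQLite's write lock with `BEGIN IMMEDIATE` before it reads the runtime row, so the check for an active task and the claim happen in one transaction. A minimal sketch of that pattern with the standard `sqlite3` module, assuming hypothetical `tasks` and `runtime` tables and a connection opened with `isolation_level=None` (the project's `get_db` is not shown in this diff and may manage transactions differently):

```python
from __future__ import annotations

import sqlite3


def open_db() -> sqlite3.Connection:
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN IMMEDIATE / COMMIT below are the only transaction boundaries.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.row_factory = sqlite3.Row
    conn.executescript(
        """
        CREATE TABLE tasks (
            id TEXT PRIMARY KEY,
            status TEXT NOT NULL,
            created_at TEXT NOT NULL
        );
        CREATE TABLE runtime (
            id INTEGER PRIMARY KEY CHECK(id = 1),
            active_task_id TEXT
        );
        INSERT INTO runtime (id, active_task_id) VALUES (1, NULL);
        """
    )
    return conn


def claim_next(conn: sqlite3.Connection) -> str | None:
    """Claim the oldest queued task, or return None if one is already active."""
    conn.execute("BEGIN IMMEDIATE")  # acquire the write lock before reading
    try:
        task_id = None
        active = conn.execute(
            "SELECT active_task_id FROM runtime WHERE id = 1"
        ).fetchone()["active_task_id"]
        if active is None:
            row = conn.execute(
                "SELECT id FROM tasks WHERE status = 'QUEUED' "
                "ORDER BY created_at ASC, id ASC LIMIT 1"
            ).fetchone()
            if row is not None:
                task_id = row["id"]
                conn.execute("UPDATE tasks SET status = 'RUNNING' WHERE id = ?", (task_id,))
                conn.execute("UPDATE runtime SET active_task_id = ? WHERE id = 1", (task_id,))
        conn.execute("COMMIT")
        return task_id
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

The explicit `COMMIT` on every non-error path releases the write lock promptly, including the cases where nothing is claimed.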

+ 1 - 0
backend/requirements.txt

@@ -6,3 +6,4 @@ httpx>=0.27.0
 pyautogui>=0.9.54
 pillow>=10.0.0
 pyperclip>=1.9.0
+jsonschema>=4.23.0

+ 34 - 0
backend/tests/test_automation_token.py

@@ -0,0 +1,34 @@
+from __future__ import annotations
+
+import unittest
+from unittest.mock import patch
+
+from fastapi import HTTPException
+
+from app.main import verify_automation_token
+
+
+class AutomationTokenTest(unittest.TestCase):
+    def test_token_is_optional_when_not_configured(self) -> None:
+        with patch("app.main.settings_service.automation_remote_token", return_value=""):
+            verify_automation_token()
+
+    def test_accepts_supported_token_locations(self) -> None:
+        with patch("app.main.settings_service.automation_remote_token", return_value="secret"):
+            verify_automation_token(x_automation_token="secret")
+            verify_automation_token(authorization="Bearer secret")
+            verify_automation_token(automation_token="secret")
+
+    def test_rejects_missing_or_invalid_token_when_configured(self) -> None:
+        with patch("app.main.settings_service.automation_remote_token", return_value="secret"):
+            with self.assertRaises(HTTPException) as missing:
+                verify_automation_token()
+            self.assertEqual(missing.exception.status_code, 401)
+
+            with self.assertRaises(HTTPException) as invalid:
+                verify_automation_token(x_automation_token="wrong")
+            self.assertEqual(invalid.exception.status_code, 401)
+
+
+if __name__ == "__main__":
+    unittest.main()

+ 59 - 0
backend/tests/test_video_workflows.py

@@ -0,0 +1,59 @@
+from __future__ import annotations
+
+import json
+import unittest
+from pathlib import Path
+
+from app.automation import get_node_definitions
+from app.schemas import AutomationWorkflowSaveRequest
+
+
+VIDEO_WORKFLOW_KEYS = {
+    "youtube-home-random-video",
+    "youtube-channel-latest-video",
+    "bilibili-home-random-video",
+    "bilibili-up-latest-video",
+    "douyin-random-video",
+    "douyin-next-video",
+}
+
+
+class VideoWorkflowTemplateTest(unittest.TestCase):
+    def test_video_action_node_is_registered(self) -> None:
+        definitions = {item["type"]: item for item in get_node_definitions()}
+
+        self.assertIn("browser.video_action", definitions)
+        self.assertIn("vision.locate_element", definitions)
+        self.assertEqual(
+            definitions["browser.video_action"]["params"]["action"]["options"],
+            [
+                "youtube_home_random",
+                "youtube_channel_latest",
+                "bilibili_home_random",
+                "bilibili_up_latest",
+                "douyin_random",
+                "douyin_next",
+            ],
+        )
+
+    def test_video_workflow_templates_match_schema(self) -> None:
+        workflow_dir = Path(__file__).resolve().parents[2] / "workflows"
+        checked_keys = set()
+
+        for path in workflow_dir.glob("*.workflow.json"):
+            raw = json.loads(path.read_text(encoding="utf-8"))
+            if raw.get("workflow_key") not in VIDEO_WORKFLOW_KEYS:
+                continue
+            workflow = AutomationWorkflowSaveRequest.model_validate(raw)
+            checked_keys.add(str(workflow.workflow_key))
+
+            # Video templates locate the target from a real page screenshot, then hand the coordinates to the mouse node to click.
+            self.assertTrue(any(node.type == "vision.locate_element" for node in workflow.nodes))
+            self.assertTrue(any(node.type == "mouse.click" for node in workflow.nodes))
+            self.assertEqual(workflow.settings.get("return"), {"node_id": "locate"})
+
+        self.assertEqual(checked_keys, VIDEO_WORKFLOW_KEYS)
+
+
+if __name__ == "__main__":
+    unittest.main()

+ 52 - 0
backend/tests/test_vision_node.py

@@ -0,0 +1,52 @@
+from __future__ import annotations
+
+import unittest
+from unittest.mock import patch
+
+from app.automation.context import WorkflowContext
+from app.automation.nodes.vision import locate_element_node
+
+
+class VisionLocateNodeTest(unittest.TestCase):
+    def test_locate_element_converts_percent_to_screen_coordinates(self) -> None:
+        context = WorkflowContext(workflow_id=1, provider_id=1, model_id=1, temperature=0.1)
+        node = {
+            "params": {
+                "target_description": "点击第一个推荐视频",
+                "screen_context": "视频首页",
+                "save_screenshot": False,
+            }
+        }
+
+        with (
+            patch(
+                "app.automation.nodes.vision.windows_automation.take_screenshot",
+                return_value={
+                    "image_base64": "abc",
+                    "mime_type": "image/png",
+                    "width": 800,
+                    "height": 600,
+                    "path": None,
+                },
+            ),
+            patch(
+                "app.automation.nodes.vision.ai_service.chat_with_images",
+                return_value={
+                    "content": (
+                        '{"found": true, "x_percent": 25, "y_percent": 50, '
+                        '"confidence": 0.9, "target_label": "推荐视频", "reason": "清晰可见"}'
+                    )
+                },
+            ),
+        ):
+            result = locate_element_node(node, {}, context)
+
+        self.assertTrue(result["located"])
+        self.assertEqual(result["x"], 200)
+        self.assertEqual(result["y"], 300)
+        self.assertEqual(result["x_percent"], 25)
+        self.assertEqual(result["y_percent"], 50)
+
+
+if __name__ == "__main__":
+    unittest.main()
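The test above expects a model answer of 25% / 50% on an 800×600 screenshot to become the screen point (200, 300). A minimal sketch of such a conversion, assuming a hypothetical helper `percent_to_pixel`; the rounding and the clamping to the screenshot bounds are assumptions, since the actual logic in `vision.py` is not shown in this part of the diff:

```python
def percent_to_pixel(percent: float, size: int) -> int:
    """Map a 0-100 percentage onto a pixel coordinate in [0, size - 1].

    Hypothetical helper: out-of-range percentages are clamped so a sloppy
    model answer cannot produce a point outside the screenshot.
    """
    clamped = min(max(float(percent), 0.0), 100.0)
    pixel = int(round(size * clamped / 100.0))
    return min(pixel, max(size - 1, 0))
```

For example, `percent_to_pixel(25, 800)` and `percent_to_pixel(50, 600)` give the `x` and `y` values asserted in the test.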

+ 196 - 0
backend/tests/test_web_search.py

@@ -0,0 +1,196 @@
+from __future__ import annotations
+
+import json
+import unittest
+from pathlib import Path
+from unittest.mock import patch
+
+from app.automation.context import WorkflowContext
+from app.automation.nodes.web_search import WebSearchRunner, normalize_search_result, result_identity
+from app.automation.nodes.research import AiWebResearchRunner, compact_evidence, validate_json_data, validate_research_result
+from app.automation_service import web_search_workflow_template, workflow_return_data
+from app.schemas import AutomationWorkflowSaveRequest
+
+
+class WebSearchHelpersTest(unittest.TestCase):
+    def test_normalize_search_result_converts_percent_to_screen_point(self) -> None:
+        result = normalize_search_result(
+            {
+                "title": "示例结果",
+                "url": "https://example.com",
+                "snippet": "摘要",
+                "title_center_x_percent": 25,
+                "title_center_y_percent": 40,
+            },
+            scroll_page=2,
+            width=1920,
+            height=1080,
+        )
+
+        self.assertIsNotNone(result)
+        self.assertEqual(result["title_center_x"], 480)
+        self.assertEqual(result["title_center_y"], 432)
+        self.assertEqual(result["scroll_page"], 2)
+
+    def test_result_identity_prefers_url_and_falls_back_to_title(self) -> None:
+        self.assertEqual(result_identity({"url": "HTTPS://EXAMPLE.COM", "title": "标题"}), "https://example.com")
+        self.assertEqual(result_identity({"url": "", "title": " 标题 "}), "标题")
+
+    def test_ranking_ignores_invalid_and_duplicate_indexes(self) -> None:
+        runner = WebSearchRunner(
+            WorkflowContext(workflow_id=1, provider_id=1, model_id=1),
+            {"query": "测试", "result_count": 2},
+        )
+        runner._text_json = lambda prompt: {
+            "ranked_results": [
+                {"original_index": 1, "relevance_score": 9},
+                {"original_index": 1, "relevance_score": 8},
+                {"original_index": 99, "relevance_score": 7},
+                {"original_index": 0, "relevance_score": 6},
+            ]
+        }
+
+        ranked = runner._rank_results([{"title": "A"}, {"title": "B"}])
+
+        self.assertEqual([item["title"] for item in ranked], ["B", "A"])
+
+    def test_failed_title_location_keeps_search_page_state(self) -> None:
+        runner = WebSearchRunner(
+            WorkflowContext(workflow_id=1, provider_id=1, model_id=1),
+            {"query": "测试", "click_attempts": 1},
+        )
+        runner._go_to_scroll_page = lambda scroll_page: None
+        runner._capture = lambda: {"width": 1920, "height": 1080, "image_base64": "", "mime_type": "image/png"}
+        runner._vision_json = lambda prompt, screenshot: {"found": False, "notes": "未找到标题"}
+
+        result = runner._open_result({"title": "不存在的标题", "scroll_page": 0})
+
+        self.assertFalse(result["opened_detail_page"])
+        self.assertTrue(result["is_search_results_page"])
+
+
+class WebSearchWorkflowTemplateTest(unittest.TestCase):
+    def test_template_matches_workflow_schema(self) -> None:
+        workflow = AutomationWorkflowSaveRequest.model_validate(web_search_workflow_template())
+
+        self.assertEqual(workflow.schema_version, "workflow/v1")
+        self.assertEqual(workflow.workflow_key, "ai-web-research")
+        self.assertEqual(workflow.variables["objective"]["default"], "")
+        self.assertEqual([node.type for node in workflow.nodes], ["flow.start", "research.ai_web_research", "flow.end"])
+
+    def test_checked_in_workflow_matches_template(self) -> None:
+        path = Path(__file__).resolve().parents[2] / "workflows" / "ai-web-research.workflow.json"
+        checked_in = json.loads(path.read_text(encoding="utf-8"))
+
+        self.assertEqual(checked_in, web_search_workflow_template())
+
+
+class AiResearchHelpersTest(unittest.TestCase):
+    def test_json_schema_validation_reports_missing_required_field(self) -> None:
+        schema = {
+            "type": "object",
+            "required": ["summary"],
+            "properties": {"summary": {"type": "string"}},
+        }
+
+        result = validate_json_data({}, schema)
+
+        self.assertFalse(result["schema_valid"])
+        self.assertTrue(result["errors"])
+
+    def test_compact_evidence_keeps_source_and_cleaned_content(self) -> None:
+        evidence = compact_evidence(
+            {
+                "researched_details": [
+                    {
+                        "visited_url": "https://example.com",
+                        "opened_detail_page": True,
+                        "result": {"title": "原始标题"},
+                        "cleaned": {"clean_title": "清理标题", "clean_text": "正文", "key_points": ["要点"]},
+                    }
+                ]
+            }
+        )
+
+        self.assertEqual(evidence[0]["title"], "清理标题")
+        self.assertEqual(evidence[0]["url"], "https://example.com")
+
+    def test_research_validation_enforces_minimum_sources(self) -> None:
+        result = validate_research_result(
+            {"summary": "完成"},
+            {
+                "type": "object",
+                "required": ["summary"],
+                "properties": {"summary": {"type": "string"}},
+            },
+            {"min_sources": 2},
+            [{"title": "A", "url": "https://example.com"}],
+        )
+
+        self.assertTrue(result["schema_valid"])
+        self.assertFalse(result["constraints_valid"])
+        self.assertFalse(result["valid"])
+
+    def test_workflow_return_data_uses_configured_node(self) -> None:
+        workflow = {"settings": {"return": {"node_id": "research"}}}
+        result = {"outputs": {"research": {"data": {"answer": 1}}}}
+
+        self.assertEqual(workflow_return_data(workflow, result), {"data": {"answer": 1}})
+
+    def test_ai_research_retries_until_assessment_is_valid(self) -> None:
+        runner = AiWebResearchRunner(
+            WorkflowContext(workflow_id=1, provider_id=1, model_id=1),
+            {
+                "objective": "测试目标",
+                "output_schema": {
+                    "type": "object",
+                    "required": ["answer"],
+                    "properties": {"answer": {"type": "string"}},
+                },
+                "constraints": {"min_sources": 1},
+                "max_attempts": 2,
+            },
+        )
+        runner._create_plan = lambda: {"queries": ["第一轮", "第二轮"]}
+        assessments = iter(
+            [
+                {
+                    "goal_achieved": False,
+                    "candidate_data": {},
+                    "missing_information": ["答案"],
+                    "next_queries": ["第二轮"],
+                },
+                {
+                    "goal_achieved": True,
+                    "candidate_data": {"answer": "完成"},
+                    "missing_information": [],
+                    "next_queries": [],
+                },
+            ]
+        )
+        runner._assess_progress = lambda plan, queries, evidence: next(assessments)
+        fake_output = {
+            "result_count": 1,
+            "researched_count": 1,
+            "researched_details": [
+                {
+                    "visited_url": "https://example.com",
+                    "opened_detail_page": True,
+                    "result": {"title": "来源"},
+                    "cleaned": {"clean_text": "证据"},
+                }
+            ],
+        }
+
+        with patch("app.automation.nodes.research.WebSearchRunner") as search_runner:
+            search_runner.return_value.run.return_value = fake_output
+            result = runner.run()
+
+        self.assertTrue(result["goal_achieved"])
+        self.assertEqual(result["attempts_used"], 2)
+        self.assertEqual(result["data"], {"answer": "完成"})
+
+
+if __name__ == "__main__":
+    unittest.main()

+ 38 - 0
backend/tests/test_workflow_node_registry.py

@@ -0,0 +1,38 @@
+from __future__ import annotations
+
+import json
+import unittest
+from pathlib import Path
+
+from app.automation import get_node_definitions
+from app.schemas import AutomationWorkflowSaveRequest
+
+
+class WorkflowNodeRegistryTest(unittest.TestCase):
+    def test_registered_nodes_have_chinese_descriptions(self) -> None:
+        for definition in get_node_definitions():
+            self.assertTrue(definition.get("description"), definition["type"])
+            self.assertTrue(definition.get("label"), definition["type"])
+
+    def test_reliability_and_console_nodes_are_registered(self) -> None:
+        definitions = {item["type"]: item for item in get_node_definitions()}
+        for node_type in [
+            "browser.ensure_foreground",
+            "browser.control",
+            "media.control",
+            "vision.verify_page",
+            "vision.click_target",
+            "vision.close_popups",
+        ]:
+            self.assertIn(node_type, definitions)
+
+    def test_all_checked_in_workflows_match_schema(self) -> None:
+        workflow_dir = Path(__file__).resolve().parents[2] / "workflows"
+        paths = sorted(workflow_dir.glob("*.workflow.json"))
+        self.assertGreater(len(paths), 0)
+        for path in paths:
+            AutomationWorkflowSaveRequest.model_validate(json.loads(path.read_text(encoding="utf-8")))
+
+
+if __name__ == "__main__":
+    unittest.main()

+ 65 - 0
backend/tests/test_workflow_zip.py

@@ -0,0 +1,65 @@
+from __future__ import annotations
+
+import io
+import json
+import unittest
+import zipfile
+from unittest.mock import patch
+
+from app.automation_service import import_workflows_zip, safe_zip_name, workflow_export_payload
+
+
+def workflow_payload(workflow_key: str, name: str = "测试工作流") -> dict:
+    return {
+        "schema_version": "workflow/v1",
+        "workflow_key": workflow_key,
+        "name": name,
+        "description": "用于 ZIP 导入测试",
+        "variables": {},
+        "settings": {},
+        "nodes": [],
+        "edges": [],
+    }
+
+
+def zip_bytes(items: dict[str, object]) -> bytes:
+    buffer = io.BytesIO()
+    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
+        for name, payload in items.items():
+            content = payload if isinstance(payload, str) else json.dumps(payload, ensure_ascii=False)
+            archive.writestr(name, content)
+    return buffer.getvalue()
+
+
+class WorkflowZipTest(unittest.TestCase):
+    def test_import_zip_skips_duplicate_key_and_keeps_processing(self) -> None:
+        content = zip_bytes(
+            {
+                "workflows/new.workflow.json": workflow_payload("zip-new"),
+                "workflows/existing.workflow.json": workflow_payload("zip-existing"),
+                "workflows/broken.workflow.json": "{bad json",
+                "manifest.json": {"schema_version": "workflow-zip/v1"},
+            }
+        )
+
+        with (
+            patch("app.automation_service.workflow_key_exists", side_effect=lambda key: key == "zip-existing"),
+            patch("app.automation_service.save_workflow", return_value={"id": 99, "workflow_key": "zip-new", "name": "测试工作流"}),
+        ):
+            result = import_workflows_zip(content)
+
+        self.assertEqual(result["created_count"], 1)
+        self.assertEqual(result["skipped_count"], 1)
+        self.assertEqual(result["failed_count"], 1)
+        self.assertEqual(result["skipped"][0]["workflow_key"], "zip-existing")
+
+    def test_export_payload_removes_database_fields(self) -> None:
+        payload = workflow_payload("zip-export")
+        exported = workflow_export_payload({**payload, "id": 1, "created_at": "x", "updated_at": "y"})
+
+        self.assertEqual(exported, payload)
+        self.assertEqual(safe_zip_name("a/b:c*中文"), "a-b-c")
+
+
+if __name__ == "__main__":
+    unittest.main()
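
Note on the two ZIP tests above: `import_workflows_zip` and `safe_zip_name` live in `backend/app/automation_service.py`, which is outside this excerpt, so the following is only a minimal sketch of the behavior the tests pin down, not the project's implementation. The names `safe_zip_name_sketch`, `import_zip_sketch`, the `key_exists` callback, and the `file` and `error` fields are hypothetical. The sketch assumes that only `*.workflow.json` entries count as workflows (so `manifest.json` is ignored) and that a broken entry is recorded as failed without aborting the import.

```python
import io
import json
import re
import zipfile
from typing import Callable


def safe_zip_name_sketch(name: str) -> str:
    """Collapse each run of characters outside [A-Za-z0-9._-] into one hyphen."""
    return re.sub(r"[^A-Za-z0-9._-]+", "-", name).strip("-")


def import_zip_sketch(content: bytes, key_exists: Callable[[str], bool]) -> dict:
    """Classify every *.workflow.json entry as created, skipped, or failed."""
    created, skipped, failed = [], [], []
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        for entry in archive.namelist():
            if not entry.endswith(".workflow.json"):
                continue  # assumption: other files such as manifest.json are ignored
            try:
                payload = json.loads(archive.read(entry).decode("utf-8"))
                key = payload["workflow_key"]
            except (ValueError, KeyError, TypeError) as exc:
                # Invalid JSON or a missing key fails this entry only.
                failed.append({"file": entry, "error": str(exc)})
                continue
            target = skipped if key_exists(key) else created
            target.append({"file": entry, "workflow_key": key})
    return {
        "created_count": len(created),
        "skipped_count": len(skipped),
        "failed_count": len(failed),
        "created": created,
        "skipped": skipped,
        "failed": failed,
    }
```

Passing the duplicate check in as a callback keeps the sketch free of database access, which mirrors how the test patches `workflow_key_exists` instead of touching real storage.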

+ 64 - 64
deployment.md

@@ -112,6 +112,50 @@ root\OpenHardwareMonitor
 
 ## 7. Start the backend
 
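+### Recommended: start the backend and frontend together with the project scripts
+
+The project root provides Windows PowerShell start and stop scripts. The start script launches the services in the background, saves their PIDs to `.runtime\processes.json`, and writes logs to `.runtime\logs`: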
+
+```powershell
+.\start.ps1
+```
+
+You can also double-click `start.cmd`, or run it from CMD:
+
+```cmd
+start.cmd
+```
+
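+Default addresses:
+
+- Backend: `http://127.0.0.1:8000`
+- Frontend: `http://127.0.0.1:5173`
+
+By default the services listen on `0.0.0.0`, so they are reachable from the LAN. To start only the backend, or to skip opening a browser automatically, use: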
+
+```powershell
+.\start.ps1 -SkipFrontend
+.\start.ps1 -NoBrowser
+```
+
+To change the ports:
+
+```powershell
+.\start.ps1 -BackendPort 8080 -FrontendPort 5180
+```
+
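+For local access, a custom backend port is passed on to Vite automatically. To reach a custom backend port from the LAN, specify the device address explicitly: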
+
+```powershell
+.\start.ps1 -BackendPort 8080 -ApiBaseUrl "http://192.168.1.10:8080"
+```
+
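+Before starting, create `.venv`, install the backend dependencies, and run `npm install` in the `frontend` directory.
+
+Vision automation, mouse and keyboard control, and screen capture must run in the interactive desktop session of the currently logged-in user. For a living-room entertainment center deployment, have Windows log in a dedicated user automatically and run `start.ps1 -SkipFrontend -NoBrowser` from Task Scheduler with the "At log on" trigger. Do not run the backend in a non-interactive service session, or it may be unable to capture the projector screen or control the mouse and keyboard.
+
+### Start the backend manually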
+
 For development or local use:
 
 ```powershell
@@ -159,40 +203,14 @@ $env:VITE_API_BASE="http://目标设备IP:8000"
 npm run build
 ```
 
-## 9. Run the backend as a Windows service
-
-You can use NSSM:
-
-```powershell
-nssm install WinMonitorApi
-```
-
-Recommended configuration:
-
-- Application path: `C:\Apps\win_monitor\.venv\Scripts\python.exe`
-- Startup directory: `C:\Apps\win_monitor\backend`
-- Arguments: `-m uvicorn app.main:app --host 127.0.0.1 --port 8000`
-
-To let LAN devices reach the backend, change Arguments to:
-
-```text
--m uvicorn app.main:app --host 0.0.0.0 --port 8000
-```
-
-Save, then start the service:
-
-```powershell
-nssm start WinMonitorApi
-```
-
-## 10. Data and logs
+## 9. Data and logs
 
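 - SQLite database location: `backend\data\win_monitor.db`
 - Service, process, and scan records are written to the database.
 - Sensor and SMART information is read live and is not written to the database.
 - To back up, stop the backend and then copy `backend\data\win_monitor.db`.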
 
-## 11. Processes added by the project itself
+## 10. Processes added by the project itself
 
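 After the project starts, the process list shows some processes that belong to the project itself or that it invokes temporarily. When reviewing service and process scan results, you can mark these as trusted according to your actual deployment.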
 
@@ -210,13 +228,6 @@ C:\Apps\win_monitor\.venv\Scripts\python.exe -m uvicorn app.main:app --host 127.
 | --- | --- |
 | `python.exe` | Main FastAPI backend process; runs `uvicorn app.main:app` and handles the API, scans, sensors, and SMART queries |
 
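-If the backend is registered as a Windows service with NSSM, you will also see:
-
-| Process | Description |
-| --- | --- |
-| `nssm.exe` | NSSM service wrapper that hosts the backend Python process |
-| `python.exe` | Main FastAPI backend process started by NSSM |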
-
 ### Frontend development-mode processes
 
 If the frontend is started in development mode:
@@ -253,12 +264,27 @@ npm run dev
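 | Process | Always running | Purpose |
 | --- | --- | --- |
 | `python.exe` | Yes | Backend API service |
-| `nssm.exe` | Yes, optional | Present only when NSSM hosts the backend service |
 | `nginx.exe` / `w3wp.exe` / other static server processes | Yes, optional | Present only when the frontend is served by a separate static web server |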
 
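 Running `npm run dev` long-term in production is not recommended: it keeps an extra `node.exe` development server process alive, which suits debugging but not serving the frontend in production.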
 
-## 12. Stop the program
+## 11. Stop the program
+
+### Recommended: use the project stop script
+
+If the project was started by `start.ps1` or `start.cmd`, run this from the project root:
+
+```powershell
+.\stop.ps1
+```
+
+Or double-click or invoke:
+
+```cmd
+stop.cmd
+```
+
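+The stop script reads `.runtime\processes.json` first and terminates the backend and frontend that this project started, by process tree, so it does not kill other `python.exe` or `node.exe` processes by name. If the state file is missing, it falls back to stopping only processes on the default ports whose command line clearly matches this project.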
 
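 Before stopping the program, confirm which startup method you used. Prefer the normal stop command for that method, and avoid directly killing unrelated `python.exe`, `node.exe`, or web server processes.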
 
@@ -285,32 +311,6 @@ $pid = (Get-NetTCPConnection -LocalPort 8000 -State Listen).OwningProcess
 Stop-Process -Id $pid
 ```
 
-### Backend started as an NSSM service
-
-If the backend is registered as an NSSM service, for example under the service name `WinMonitorApi`:
-
-```powershell
-nssm stop WinMonitorApi
-```
-
-Or use the Windows service command:
-
-```powershell
-Stop-Service WinMonitorApi
-```
-
-To disable automatic start at boot:
-
-```powershell
-Set-Service WinMonitorApi -StartupType Manual
-```
-
-To remove the service completely:
-
-```powershell
-nssm remove WinMonitorApi confirm
-```
-
 ### Frontend development server
 
 If the frontend was started in development mode:
@@ -409,7 +409,7 @@ Get-NetTCPConnection -LocalPort 5173 -State Listen -ErrorAction SilentlyContinue
 
 If there is no output, no process is listening on that port.
 
-## 13. Common problems
+## 12. Common problems
 
 ### The page shows no temperatures
 

+ 86 - 6
frontend/src/App.vue

@@ -1,5 +1,20 @@
 <template>
-  <div class="app-shell">
+  <div v-if="!authenticated" class="login-page">
+    <div class="login-panel">
+      <div class="login-brand">Windows 监控</div>
+      <h1>输入访问 Token</h1>
+      <p>用于进入局域网自动化控制台。当前后端未设置 Token 时,可以留空直接进入。</p>
+      <el-input
+        v-model="loginToken"
+        show-password
+        size="large"
+        placeholder="远程执行 Token,可留空"
+        @keyup.enter="login"
+      />
+      <el-button type="primary" size="large" :loading="loggingIn" @click="login">进入控制台</el-button>
+    </div>
+  </div>
+  <div v-else class="app-shell">
     <aside v-if="activeView !== 'automation-workflow-editor'" class="sidebar">
       <div class="brand">Windows 监控</div>
       <el-menu :default-active="activeView" background-color="#1f2937" text-color="#d1d5db" active-text-color="#fff" @select="activeView = $event">
@@ -19,6 +34,7 @@
           <template #title>自动化</template>
           <el-menu-item index="automation-actions">自动化操作</el-menu-item>
           <el-menu-item index="automation-workflows">自动化工作流</el-menu-item>
+          <el-menu-item index="automation-tasks">自动化任务记录</el-menu-item>
           <el-menu-item index="automation-screens">已识别界面</el-menu-item>
           <el-menu-item index="automation-errors">自动化错误记录</el-menu-item>
         </el-sub-menu>
@@ -34,7 +50,10 @@
           <div class="page-title">{{ title }}</div>
           <div class="muted">{{ subtitle }}</div>
         </div>
-        <el-button type="primary" :loading="scanning" @click="runScan">执行扫描</el-button>
+        <div class="filters">
+          <el-button type="primary" :loading="scanning" @click="runScan">执行扫描</el-button>
+          <el-button @click="logout">退出登录</el-button>
+        </div>
       </div>
 
       <section v-if="activeView === 'dashboard'">
@@ -99,7 +118,12 @@
       </section>
 
       <section v-if="activeView === 'automation-workflows'">
-        <AutomationWorkflowView ref="automationWorkflowView" @create="openWorkflowEditor(null)" @edit="openWorkflowEditor" />
+        <AutomationWorkflowView
+          ref="automationWorkflowView"
+          @create="openWorkflowEditor(null)"
+          @edit="openWorkflowEditor"
+          @task-created="openTaskHistory"
+        />
       </section>
 
       <section v-if="activeView === 'automation-workflow-editor'">
@@ -108,9 +132,14 @@
           :workflow-id="editingWorkflowId"
           @back="closeWorkflowEditor"
           @saved="editingWorkflowId = $event"
+          @task-created="openTaskHistory"
         />
       </section>
 
+      <section v-if="activeView === 'automation-tasks'">
+        <AutomationWorkflowTasksView ref="automationWorkflowTasksView" />
+      </section>
+
       <section v-if="activeView === 'automation-screens'">
         <AutomationScreensView ref="automationScreensView" />
       </section>
@@ -145,9 +174,9 @@
 </template>
 
 <script setup>
-import { computed, nextTick, onMounted, ref, watch } from 'vue'
+import { computed, nextTick, onMounted, onUnmounted, ref, watch } from 'vue'
 import { ElMessage } from 'element-plus'
-import { api } from './api'
+import { api, clearAutomationAuth, hasAutomationAuth, setAutomationToken } from './api'
 import AiModelManager from './components/AiModelManager.vue'
 import AiProviderManager from './components/AiProviderManager.vue'
 import AiTestView from './components/AiTestView.vue'
@@ -156,6 +185,7 @@ import AutomationErrorsView from './components/AutomationErrorsView.vue'
 import AutomationWorkflowEditorPage from './components/AutomationWorkflowEditorPage.vue'
 import AutomationScreensView from './components/AutomationScreensView.vue'
 import AutomationWorkflowView from './components/AutomationWorkflowView.vue'
+import AutomationWorkflowTasksView from './components/AutomationWorkflowTasksView.vue'
 import ItemTable from './components/ItemTable.vue'
 import SensorView from './components/SensorView.vue'
 import SmartView from './components/SmartView.vue'
@@ -177,10 +207,14 @@ const aiTestView = ref(null)
 const systemSettingsView = ref(null)
 const automationActionView = ref(null)
 const automationWorkflowView = ref(null)
+const automationWorkflowTasksView = ref(null)
 const automationScreensView = ref(null)
 const automationErrorsView = ref(null)
 const editingWorkflowId = ref(null)
 const workflowEditorKey = ref(0)
+const authenticated = ref(hasAutomationAuth())
+const loginToken = ref('')
+const loggingIn = ref(false)
 
 const title = computed(() => ({
   dashboard: '仪表盘',
@@ -194,6 +228,7 @@ const title = computed(() => ({
   'ai-test': 'AI 服务测试',
   'automation-actions': '自动化操作',
   'automation-workflows': '自动化工作流',
+  'automation-tasks': '自动化任务记录',
   'automation-workflow-editor': '工作流编辑器',
   'automation-screens': '已识别界面',
   'automation-errors': '自动化错误记录',
@@ -220,6 +255,7 @@ async function loadScans() {
 }
 
 async function refreshCurrent() {
+  if (!authenticated.value) return
   await loadDashboard()
   if (activeView.value === 'scans') await loadScans()
   await nextTick()
@@ -233,10 +269,32 @@ async function refreshCurrent() {
   systemSettingsView.value?.load()
   automationActionView.value?.loadOptions()
   automationWorkflowView.value?.load()
+  automationWorkflowTasksView.value?.load()
   automationScreensView.value?.load()
   automationErrorsView.value?.load()
 }
 
+async function login() {
+  loggingIn.value = true
+  try {
+    setAutomationToken(loginToken.value)
+    authenticated.value = true
+    await refreshCurrent()
+  } catch (error) {
+    clearAutomationAuth()
+    authenticated.value = false
+    ElMessage.error(error.response?.data?.detail || 'Token 校验失败')
+  } finally {
+    loggingIn.value = false
+  }
+}
+
+function logout() {
+  clearAutomationAuth()
+  authenticated.value = false
+  loginToken.value = ''
+}
+
 function openWorkflowEditor(id) {
   editingWorkflowId.value = id
   workflowEditorKey.value += 1
@@ -249,6 +307,13 @@ async function closeWorkflowEditor() {
   await automationWorkflowView.value?.load()
 }
 
+async function openTaskHistory(task) {
+  activeView.value = 'automation-tasks'
+  await nextTick()
+  await automationWorkflowTasksView.value?.load()
+  if (task?.id) await automationWorkflowTasksView.value?.showDetail(task)
+}
+
 async function runScan() {
   scanning.value = true
   try {
@@ -263,13 +328,28 @@ async function runScan() {
 }
 
 watch(activeView, async (view) => {
+  if (!authenticated.value) return
   if (view === 'dashboard') await loadDashboard()
   if (view === 'scans') await loadScans()
   if (view === 'settings') await systemSettingsView.value?.load()
   if (view === 'automation-workflows') await automationWorkflowView.value?.load()
+  if (view === 'automation-tasks') await automationWorkflowTasksView.value?.load()
   if (view === 'automation-screens') await automationScreensView.value?.load()
   if (view === 'automation-errors') await automationErrorsView.value?.load()
 })
 
-onMounted(refreshCurrent)
+function handleAuthRequired() {
+  authenticated.value = false
+  loginToken.value = ''
+  ElMessage.warning('访问 Token 无效或已失效,请重新输入')
+}
+
+onMounted(() => {
+  window.addEventListener('automation-auth-required', handleAuthRequired)
+  if (authenticated.value) refreshCurrent().catch(() => {})
+})
+
+onUnmounted(() => {
+  window.removeEventListener('automation-auth-required', handleAuthRequired)
+})
 </script>

+ 45 - 0
frontend/src/api.js

@@ -1,12 +1,57 @@
 import axios from 'axios'
 
 const defaultBaseUrl = `${window.location.protocol}//${window.location.hostname}:8000`
+const tokenKey = 'automation_remote_token'
+const unlockedKey = 'automation_auth_unlocked'
 
 export const api = axios.create({
   baseURL: import.meta.env.VITE_API_BASE || defaultBaseUrl,
   timeout: 120000,
 })
 
+api.interceptors.request.use((config) => {
+  const token = automationToken()
+  if (token) {
+    config.headers = config.headers || {}
+    config.headers['X-Automation-Token'] = token
+  }
+  return config
+})
+
+api.interceptors.response.use(
+  (response) => response,
+  (error) => {
+    if ([401, 403].includes(error.response?.status)) {
+      clearAutomationAuth()
+      window.dispatchEvent(new CustomEvent('automation-auth-required'))
+    }
+    return Promise.reject(error)
+  },
+)
+
+export function automationToken() {
+  return window.localStorage.getItem(tokenKey) || ''
+}
+
+export function setAutomationToken(token) {
+  const value = String(token || '').trim()
+  if (value) {
+    window.localStorage.setItem(tokenKey, value)
+  } else {
+    window.localStorage.removeItem(tokenKey)
+  }
+  window.localStorage.setItem(unlockedKey, '1')
+}
+
+export function clearAutomationAuth() {
+  window.localStorage.removeItem(tokenKey)
+  window.localStorage.removeItem(unlockedKey)
+}
+
+export function hasAutomationAuth() {
+  return window.localStorage.getItem(unlockedKey) === '1' || Boolean(automationToken())
+}
+
 export const statusOptions = [
   { label: '待确认', value: 'PENDING', type: 'warning' },
   { label: '可信', value: 'TRUSTED', type: 'success' },
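
Note for callers outside the browser: the same token convention can be sketched in Python. This assumes the backend reads the `X-Automation-Token` header that the request interceptor above sets, and that it exposes the `by-key` run endpoint the workflow editor in this commit posts to. `build_run_request` and its parameters are hypothetical names, the empty JSON body simply mirrors what the editor sends, and the request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_run_request(base_url: str, workflow_key: str, token: str = "") -> urllib.request.Request:
    """Build (but do not send) a POST request that queues a workflow run by key."""
    # Percent-encode the key, similar to encodeURIComponent in the frontend.
    key = urllib.parse.quote(workflow_key.strip(), safe="")
    url = f"{base_url.rstrip('/')}/api/automation/workflows/by-key/{key}/run"
    headers = {"Content-Type": "application/json"}
    # Like the axios interceptor, attach the header only when a token is set.
    if token.strip():
        headers["X-Automation-Token"] = token.strip()
    body = json.dumps({}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To send it, pass the result to `urllib.request.urlopen`. The shape of the response is not shown in this excerpt, so the sketch leaves response handling out.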

+ 220 - 14
frontend/src/components/AutomationWorkflowEditorPage.vue

@@ -22,6 +22,7 @@
             <button v-for="definition in group.items" :key="definition.type" class="palette-node" type="button" @click="addNode(definition)">
               <span>{{ definition.label }}</span>
               <small>{{ definition.type }}</small>
+              <em>{{ definition.description }}</em>
             </button>
           </el-collapse-item>
         </el-collapse>
@@ -42,6 +43,46 @@
         >
           <Background pattern-color="#cbd5e1" :gap="24" />
           <Controls />
+          <template #node-default="{ id, data, selected }">
+            <div class="workflow-card-node" :class="{ selected, collapsed: data.collapsed }">
+              <div class="workflow-card-head">
+                <div class="workflow-card-title-wrap">
+                  <div class="workflow-card-type">{{ nodeDefinition(data.nodeType)?.label || data.nodeType }}</div>
+                  <div class="workflow-card-title">{{ data.title }}</div>
+                </div>
+                <button class="workflow-card-toggle" type="button" @click.stop="toggleNodeCollapsed(id)">
+                  {{ data.collapsed ? '展开' : '折叠' }}
+                </button>
+              </div>
+              <div class="workflow-card-desc">{{ nodeDefinition(data.nodeType)?.description || data.nodeType }}</div>
+              <div v-if="!data.collapsed" class="workflow-card-body">
+                <div class="workflow-card-section">
+                  <strong>参数</strong>
+                  <span v-if="!nodeParamEntries(data).length">无</span>
+                  <span v-for="item in nodeParamEntries(data)" :key="item.key" class="workflow-card-chip">
+                    {{ item.label }}={{ shortValue(item.value) }}
+                  </span>
+                </div>
+                <div class="workflow-card-section">
+                  <strong>输入</strong>
+                  <span v-if="!nodeInputEntries(data).length">无</span>
+                  <span v-for="item in nodeInputEntries(data)" :key="item.key" class="workflow-card-chip input">
+                    {{ item.label }}←{{ bindingSummary(item.binding) }}
+                  </span>
+                </div>
+                <div class="workflow-card-section">
+                  <strong>输出</strong>
+                  <span v-if="!nodeOutputEntries(data.nodeType).length">无</span>
+                  <span v-for="item in nodeOutputEntries(data.nodeType).slice(0, 4)" :key="item.key" class="workflow-card-chip output">
+                    {{ item.key }}:{{ item.type }}
+                  </span>
+                  <span v-if="nodeOutputEntries(data.nodeType).length > 4" class="workflow-card-more">
+                    +{{ nodeOutputEntries(data.nodeType).length - 4 }}
+                  </span>
+                </div>
+              </div>
+            </div>
+          </template>
         </VueFlow>
       </main>
 
@@ -58,8 +99,19 @@
                 <el-option label="文本" value="string" />
                 <el-option label="数字" value="number" />
                 <el-option label="布尔" value="boolean" />
+                <el-option label="对象" value="object" />
+                <el-option label="数组" value="array" />
               </el-select>
+              <el-input
+                v-if="['object', 'array'].includes(ensureVariableObject(name).type)"
+                :model-value="variableJsonText(ensureVariableObject(name).default)"
+                type="textarea"
+                :rows="5"
+                placeholder="JSON"
+                @change="updateVariableJson(name, $event)"
+              />
               <component
+                v-else
                 :is="variableDefaultComponent(ensureVariableObject(name))"
                 v-model="ensureVariableObject(name).default"
                 v-bind="variableDefaultProps(ensureVariableObject(name))"
@@ -80,6 +132,9 @@
             <el-form-item label="类型">
               <el-tag>{{ selectedNode.data.nodeType }}</el-tag>
             </el-form-item>
+            <el-form-item label="说明">
+              <div class="node-description">{{ selectedDefinition?.description || '暂无说明' }}</div>
+            </el-form-item>
           </el-form>
 
           <div class="section-title">参数</div>
@@ -112,15 +167,35 @@
               <el-select v-model="inputBinding(key).node_id" placeholder="节点">
                 <el-option v-for="node in flowNodes.filter((item) => item.id !== selectedNode.id)" :key="node.id" :label="node.data.title" :value="node.id" />
               </el-select>
-              <el-input v-model="inputBinding(key).output" placeholder="输出名" />
+              <el-select v-model="inputBinding(key).output" placeholder="输出">
+                <el-option v-for="output in outputEntriesForNode(inputBinding(key).node_id)" :key="output.key" :label="`${output.label} (${output.key})`" :value="output.key" />
+              </el-select>
             </template>
             <el-input v-if="inputBinding(key).source === 'runtime'" v-model="inputBinding(key).name" placeholder="运行时键名" />
+            <div v-if="field.description" class="field-help">{{ field.description }}</div>
           </div>
           <div v-if="!Object.keys(selectedDefinition?.inputs || {}).length" class="muted">此节点不需要输入。</div>
 
-          <div class="section-title">连线</div>
+          <div class="section-title">输出数据结构</div>
+          <el-table :data="nodeOutputEntries(selectedNode.data.nodeType)" size="small" border empty-text="此节点没有声明输出">
+            <el-table-column prop="key" label="字段" width="110" />
+            <el-table-column prop="label" label="含义" min-width="130" />
+            <el-table-column prop="type" label="类型" width="90" />
+            <el-table-column prop="description" label="说明" min-width="150" show-overflow-tooltip />
+          </el-table>
+
+          <div class="section-title">相关连线</div>
           <el-table :data="selectedEdges" size="small" border>
-            <el-table-column prop="label" label="连线" min-width="130" />
+            <el-table-column label="来源" min-width="150">
+              <template #default="{ row }">
+                {{ edgeEndpointLabel(row.source, row.sourceHandle || edgeSourcePort(row)) }}
+              </template>
+            </el-table-column>
+            <el-table-column label="目标" min-width="150">
+              <template #default="{ row }">
+                {{ edgeEndpointLabel(row.target, row.targetHandle || edgeTargetPort(row)) }}
+              </template>
+            </el-table-column>
             <el-table-column label="类型" width="100">
               <template #default="{ row }">
                 <el-select v-model="row.data.kind" size="small" @change="syncEdgeLabel(row)">
@@ -142,6 +217,22 @@
         </template>
         <div v-else class="muted">从左侧添加节点,或点击画布中的节点编辑参数和输入绑定。</div>
 
+        <div class="section-title">全部连线</div>
+        <el-table :data="flowEdges" size="small" border empty-text="暂无连线">
+          <el-table-column label="连线" min-width="230">
+            <template #default="{ row }">
+              {{ readableEdgeLabel(row) }}
+            </template>
+          </el-table-column>
+          <el-table-column label="类型" width="82">
+            <template #default="{ row }">
+              <el-tag size="small" :type="row.data?.kind === 'data' ? 'success' : 'primary'">
+                {{ edgeKindLabel(row.data?.kind) }}
+              </el-tag>
+            </template>
+          </el-table-column>
+        </el-table>
+
         <div class="section-title">执行结果</div>
         <pre class="workflow-run-output">{{ runOutput || '暂无执行结果' }}</pre>
       </aside>
@@ -164,7 +255,7 @@ const props = defineProps({
   workflowId: { type: Number, default: null },
 })
 
-const emit = defineEmits(['back', 'saved'])
+const emit = defineEmits(['back', 'saved', 'task-created'])
 const { fitView } = useVueFlow()
 
 const nodeDefinitions = ref([])
@@ -177,7 +268,7 @@ const workflowVariables = ref({})
 const workflowSettings = ref({ max_steps: 100, default_timeout_ms: 30000, on_unhandled_error: 'pause_for_user' })
 const workflowId = ref(props.workflowId)
 const selectedNodeId = ref(null)
-const openedCategories = ref(['flow', 'mouse', 'keyboard'])
+const openedCategories = ref(['flow', 'browser', 'vision', 'media', 'mouse', 'keyboard'])
 const saving = ref(false)
 const running = ref(false)
 const runOutput = ref('')
@@ -196,17 +287,25 @@ const groupedNodeDefinitions = computed(() => {
 
 function categoryLabel(category) {
   return {
+    browser: '浏览器',
     flow: '流程',
-    mouse: '鼠标',
+    human: '人工交互',
     keyboard: '键盘',
-    text: '文本',
+    media: '媒体',
+    mouse: '鼠标',
     program: '程序',
+    research: '研究',
     screen: '屏幕',
+    text: '文本',
+    vision: '视觉 AI',
     wait: '等待',
-    human: '人工交互',
   }[category] || category
 }
 
+function nodeDefinition(nodeType) {
+  return nodeDefinitions.value.find((item) => item.type === nodeType)
+}
+
 async function loadDefinitions() {
   const { data } = await api.get('/api/automation/workflow-nodes')
   nodeDefinitions.value = data.items || []
@@ -231,6 +330,7 @@ async function loadWorkflow() {
   workflowSettings.value = data.settings || {}
   flowNodes.value = (data.nodes || []).map(workflowNodeToFlow)
   flowEdges.value = (data.edges || []).map(workflowEdgeToFlow)
+  refreshEdgeLabels()
 }
 
 function workflowNodeToFlow(node) {
@@ -244,6 +344,7 @@ function workflowNodeToFlow(node) {
       nodeType: node.type,
       params: structuredClone(node.params || {}),
       inputs: structuredClone(node.inputs || {}),
+      collapsed: false,
     },
   }
 }
@@ -256,7 +357,7 @@ function workflowEdgeToFlow(edge) {
     target: edge.target,
     sourceHandle: edge.source_port || null,
     targetHandle: edge.target_port || null,
-    label: edgeLabel(kind),
+    label: '',
     data: { kind },
     animated: kind === 'control',
     style: { stroke: kind === 'data' ? '#10b981' : '#2563eb' },
@@ -275,6 +376,7 @@ function addNode(definition) {
       nodeType: definition.type,
       params: defaultValues(definition.params || {}),
       inputs: {},
+      collapsed: false,
     },
   }
   flowNodes.value.push(node)
@@ -302,11 +404,12 @@ function onConnect(connection) {
   flowEdges.value.push({
     id: `edge_${Date.now()}`,
     ...connection,
-    label: edgeLabel(kind),
+    label: '',
     data: { kind },
     animated: true,
     style: { stroke: '#2563eb' },
   })
+  refreshEdgeLabels()
 }
 
 function onNodeClick(event) {
@@ -316,10 +419,11 @@ function onNodeClick(event) {
 function syncNodeLabel() {
   if (!selectedNode.value) return
   selectedNode.value.label = selectedNode.value.data.title
+  refreshEdgeLabels()
 }
 
 function syncEdgeLabel(edge) {
-  edge.label = edgeLabel(edge.data.kind)
+  edge.label = readableEdgeLabel(edge)
   edge.animated = edge.data.kind === 'control'
   edge.style = { stroke: edge.data.kind === 'data' ? '#10b981' : '#2563eb' }
 }
@@ -328,6 +432,92 @@ function edgeLabel(kind) {
   return kind === 'data' ? '数据' : '控制'
 }
 
+function edgeKindLabel(kind) {
+  return kind === 'data' ? '数据' : '控制'
+}
+
+function edgeSourcePort(edge) {
+  return edge.sourceHandle || (edge.data?.kind === 'data' ? 'value' : 'success')
+}
+
+function edgeTargetPort(edge) {
+  return edge.targetHandle || (edge.data?.kind === 'data' ? 'value' : 'run')
+}
+
+function edgeEndpointLabel(nodeId, port) {
+  const node = flowNodes.value.find((item) => item.id === nodeId)
+  const title = node?.data?.title || nodeId || '未知节点'
+  return port ? `${title}.${port}` : title
+}
+
+function readableEdgeLabel(edge) {
+  return `${edgeEndpointLabel(edge.source, edgeSourcePort(edge))} → ${edgeEndpointLabel(edge.target, edgeTargetPort(edge))}`
+}
+
+function refreshEdgeLabels() {
+  flowEdges.value.forEach((edge) => {
+    edge.label = readableEdgeLabel(edge)
+  })
+}
+
+function toggleNodeCollapsed(nodeId) {
+  const node = flowNodes.value.find((item) => item.id === nodeId)
+  if (node) node.data.collapsed = !node.data.collapsed
+}
+
+function fieldEntries(fields = {}) {
+  return Object.entries(fields).map(([key, field]) => ({
+    key,
+    label: field.label || key,
+    type: field.type || 'any',
+    description: field.description || '',
+  }))
+}
+
+function nodeOutputEntries(nodeType) {
+  return fieldEntries(nodeDefinition(nodeType)?.outputs || {})
+}
+
+function outputEntriesForNode(nodeId) {
+  const node = flowNodes.value.find((item) => item.id === nodeId)
+  return node ? nodeOutputEntries(node.data.nodeType) : []
+}
+
+function nodeParamEntries(data) {
+  const definition = nodeDefinition(data.nodeType)
+  const params = data.params || {}
+  return fieldEntries(definition?.params || {})
+    .filter((item) => params[item.key] !== undefined && params[item.key] !== '')
+    .map((item) => ({ ...item, value: params[item.key] }))
+}
+
+function nodeInputEntries(data) {
+  const definition = nodeDefinition(data.nodeType)
+  const inputs = data.inputs || {}
+  return fieldEntries(definition?.inputs || {})
+    .filter((item) => inputs[item.key])
+    .map((item) => ({ ...item, binding: inputs[item.key] }))
+}
+
+function shortValue(value) {
+  if (value === null || value === undefined || value === '') return '空'
+  if (typeof value === 'boolean') return value ? '是' : '否'
+  if (typeof value === 'object') {
+    const text = JSON.stringify(value)
+    return text.length > 28 ? `${text.slice(0, 28)}...` : text
+  }
+  const text = String(value)
+  return text.length > 28 ? `${text.slice(0, 28)}...` : text
+}
+
+function bindingSummary(binding) {
+  if (!binding) return '未设置'
+  if (binding.source === 'variable') return `变量:${binding.name || '-'}`
+  if (binding.source === 'node_output') return `${edgeEndpointLabel(binding.node_id, binding.output || '-')}`
+  if (binding.source === 'runtime') return `运行时:${binding.name || '-'}`
+  return shortValue(binding.value)
+}
+
 function fieldComponent(field) {
   if (field.type === 'boolean') return 'el-switch'
   if (field.type === 'select') return 'el-select'
@@ -373,6 +563,18 @@ function variableDefaultProps(variable) {
   return {}
 }
 
+function variableJsonText(value) {
+  return JSON.stringify(value ?? {}, null, 2)
+}
+
+function updateVariableJson(name, value) {
+  try {
+    ensureVariableObject(name).default = JSON.parse(value)
+  } catch {
+    ElMessage.error(`变量 ${name} 的默认值不是有效 JSON`)
+  }
+}
+
 async function addVariable() {
   const { value } = await ElMessageBox.prompt('请输入变量名', '新增变量', {
     inputPattern: /^[A-Za-z_][A-Za-z0-9_]*$/,
@@ -447,12 +649,16 @@ async function save() {
 }
 
 async function runWorkflow() {
-  if (!workflowId.value) return
+  if (!workflowId.value || !workflowKey.value.trim()) {
+    ElMessage.warning('请先保存并设置 workflow key')
+    return
+  }
   running.value = true
   try {
-    const { data } = await api.post(`/api/automation/workflows/${workflowId.value}/run`, {})
+    const { data } = await api.post(`/api/automation/workflows/by-key/${encodeURIComponent(workflowKey.value.trim())}/run`, {})
     runOutput.value = JSON.stringify(data, null, 2)
-    ElMessage[data.status === 'SUCCESS' ? 'success' : 'warning'](data.status === 'SUCCESS' ? '工作流执行完成' : '工作流执行中止')
+    ElMessage.success('任务已加入执行队列')
+    emit('task-created', data)
   } catch (error) {
     ElMessage.error(error.response?.data?.detail || '执行工作流失败')
   } finally {

+ 123 - 0
frontend/src/components/AutomationWorkflowTasksView.vue

@@ -0,0 +1,123 @@
+<template>
+  <div class="panel">
+    <div class="toolbar">
+      <div class="filters">
+        <el-select v-model="statusFilter" clearable placeholder="全部状态" style="width: 160px" @change="load">
+          <el-option v-for="item in statusOptions" :key="item" :label="statusLabel(item)" :value="item" />
+        </el-select>
+        <el-button @click="load">刷新</el-button>
+      </div>
+    </div>
+
+    <el-table :data="tasks.items" border stripe height="680" @row-dblclick="showDetail">
+      <el-table-column prop="id" label="任务 ID" min-width="270" show-overflow-tooltip />
+      <el-table-column prop="workflow_name" label="工作流" min-width="190" show-overflow-tooltip />
+      <el-table-column prop="workflow_key" label="Key" min-width="160" show-overflow-tooltip />
+      <el-table-column label="状态" width="110">
+        <template #default="{ row }">
+          <el-tag :type="statusType(row.status)">{{ statusLabel(row.status) }}</el-tag>
+        </template>
+      </el-table-column>
+      <el-table-column label="队列位置" width="100">
+        <template #default="{ row }">{{ row.queue_position || '-' }}</template>
+      </el-table-column>
+      <el-table-column prop="created_at" label="创建时间" min-width="180" />
+      <el-table-column prop="started_at" label="开始时间" min-width="180" />
+      <el-table-column prop="finished_at" label="完成时间" min-width="180" />
+      <el-table-column prop="error_message" label="错误" min-width="220" show-overflow-tooltip />
+      <el-table-column label="操作" width="90" fixed="right">
+        <template #default="{ row }">
+          <el-button size="small" type="primary" @click="showDetail(row)">详情</el-button>
+        </template>
+      </el-table-column>
+    </el-table>
+
+    <el-dialog v-model="detailVisible" title="自动化任务详情" width="900px">
+      <el-descriptions v-if="detail" :column="2" border>
+        <el-descriptions-item label="任务 ID">{{ detail.id }}</el-descriptions-item>
+        <el-descriptions-item label="状态">{{ statusLabel(detail.status) }}</el-descriptions-item>
+        <el-descriptions-item label="工作流">{{ detail.workflow_name }}</el-descriptions-item>
+        <el-descriptions-item label="Key">{{ detail.workflow_key }}</el-descriptions-item>
+        <el-descriptions-item label="创建时间">{{ detail.created_at }}</el-descriptions-item>
+        <el-descriptions-item label="完成时间">{{ detail.finished_at || '-' }}</el-descriptions-item>
+      </el-descriptions>
+      <div class="section-title">调用参数</div>
+      <pre class="workflow-run-output">{{ pretty(detail?.request) }}</pre>
+      <div class="section-title">返回数据</div>
+      <pre class="workflow-run-output">{{ pretty(detail?.return_data) }}</pre>
+      <div class="section-title">完整执行结果</div>
+      <pre class="workflow-run-output">{{ pretty(detail?.result) }}</pre>
+    </el-dialog>
+  </div>
+</template>
+
+<script setup>
+import { onBeforeUnmount, onMounted, ref } from 'vue'
+import { ElMessage } from 'element-plus'
+import { api } from '../api'
+
+const tasks = ref({ items: [] })
+const statusFilter = ref('')
+const detailVisible = ref(false)
+const detail = ref(null)
+const statusOptions = ['QUEUED', 'RUNNING', 'SUCCESS', 'FAILED', 'PAUSED']
+let timer = null
+
+function statusLabel(status) {
+  return {
+    QUEUED: '排队中',
+    RUNNING: '执行中',
+    SUCCESS: '成功',
+    FAILED: '失败',
+    PAUSED: '已暂停',
+  }[status] || status
+}
+
+function statusType(status) {
+  return {
+    QUEUED: 'info',
+    RUNNING: 'warning',
+    SUCCESS: 'success',
+    FAILED: 'danger',
+    PAUSED: 'primary',
+  }[status] || 'info'
+}
+
+function pretty(value) {
+  return value == null ? '暂无数据' : JSON.stringify(value, null, 2)
+}
+
+async function load() {
+  const params = statusFilter.value ? { status: statusFilter.value } : {}
+  const { data } = await api.get('/api/automation/workflow-tasks', { params })
+  tasks.value = data
+  if (detail.value?.id && ['QUEUED', 'RUNNING'].includes(detail.value.status)) {
+    await loadDetail(detail.value.id, false)
+  }
+}
+
+async function loadDetail(taskId, showError = true) {
+  try {
+    const { data } = await api.get(`/api/automation/workflow-tasks/${taskId}`)
+    detail.value = data
+  } catch (error) {
+    if (showError) ElMessage.error(error.response?.data?.detail || '读取任务详情失败')
+  }
+}
+
+async function showDetail(row) {
+  await loadDetail(row.id)
+  detailVisible.value = true
+}
+
+function startPolling() {
+  timer = window.setInterval(() => load().catch(() => {}), 3000)
+}
+
+defineExpose({ load, showDetail })
+onMounted(async () => {
+  await load()
+  startPolling()
+})
+onBeforeUnmount(() => window.clearInterval(timer))
+</script>
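
Review note on the polling in `AutomationWorkflowTasksView.vue`: `window.setInterval` fires every 3000 ms whether or not the previous `load()` has finished, so slow responses can overlap. The following is a minimal sketch of a non-overlapping alternative. `createPoller` and its injectable `timers` argument are hypothetical names that are not part of this commit; the idea is simply to re-arm a timeout only after the task has settled.

```javascript
// Hypothetical helper, not part of the commit: polls `task` without overlap.
// `timers` is injectable so the scheduling can be exercised without real time.
function createPoller(task, intervalMs, timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle),
}) {
  let handle = null
  let running = false

  async function tick() {
    handle = null
    try {
      await task()
    } catch (error) {
      // Swallow errors like the original `load().catch(() => {})`; the next tick retries.
    }
    // Re-arm only after the task settled and only if stop() was not called meanwhile.
    if (running) handle = timers.set(tick, intervalMs)
  }

  return {
    start() {
      if (running) return
      running = true
      handle = timers.set(tick, intervalMs)
    },
    stop() {
      running = false
      if (handle !== null) timers.clear(handle)
      handle = null
    },
  }
}
```

In the component this could be wired as `const poller = createPoller(load, 3000)`, with `poller.start()` in `onMounted` and `poller.stop()` in `onBeforeUnmount`.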

+ 114 - 6
frontend/src/components/AutomationWorkflowView.vue

@@ -3,6 +3,12 @@
     <div class="toolbar">
       <div class="filters">
         <el-button type="primary" @click="$emit('create')">新建工作流</el-button>
+        <el-button type="warning" :loading="creatingWebSearch" @click="createWebSearch">AI 研究模板</el-button>
+        <el-button @click="selectImportFile">导入 JSON</el-button>
+        <input ref="importFileInput" type="file" accept="application/json,.json" hidden @change="importWorkflow" />
+        <el-button @click="selectZipImportFile">导入 ZIP</el-button>
+        <input ref="zipImportFileInput" type="file" accept="application/zip,.zip" hidden @change="importWorkflowZip" />
+        <el-button :loading="exportingZip" @click="exportWorkflowZip">导出 ZIP</el-button>
         <el-button type="success" @click="planDialogVisible = true">AI 生成</el-button>
         <el-button @click="load">刷新</el-button>
       </div>
@@ -16,11 +22,12 @@
       <el-table-column prop="node_count" label="节点" width="80" />
       <el-table-column prop="edge_count" label="连线" width="80" />
       <el-table-column prop="updated_at" label="更新时间" min-width="180" />
-      <el-table-column label="操作" width="230" fixed="right">
+      <el-table-column label="操作" width="300" fixed="right">
         <template #default="{ row }">
           <div class="row-actions">
             <el-button size="small" type="primary" @click="$emit('edit', row.id)">编辑</el-button>
             <el-button size="small" type="success" :loading="runningId === row.id" @click="run(row)">执行</el-button>
+            <el-button size="small" @click="exportWorkflow(row)">导出</el-button>
             <el-button size="small" type="danger" @click="remove(row)">删除</el-button>
           </div>
         </template>
@@ -46,7 +53,7 @@ import { onMounted, ref } from 'vue'
 import { ElMessage, ElMessageBox } from 'element-plus'
 import { api } from '../api'
 
-const emit = defineEmits(['create', 'edit'])
+const emit = defineEmits(['create', 'edit', 'task-created'])
 
 const workflows = ref({ items: [] })
 const runningId = ref(null)
@@ -55,6 +62,10 @@ const runDialogVisible = ref(false)
 const planDialogVisible = ref(false)
 const planRequirement = ref('')
 const planning = ref(false)
+const creatingWebSearch = ref(false)
+const exportingZip = ref(false)
+const importFileInput = ref(null)
+const zipImportFileInput = ref(null)
 
 async function load() {
   const { data } = await api.get('/api/automation/workflows')
@@ -62,12 +73,15 @@ async function load() {
 }
 
 async function run(row) {
+  if (!row.workflow_key) {
+    ElMessage.warning('请先为工作流设置 workflow key')
+    return
+  }
   runningId.value = row.id
   try {
-    const { data } = await api.post(`/api/automation/workflows/${row.id}/run`, {})
-    runOutput.value = JSON.stringify(data, null, 2)
-    runDialogVisible.value = true
-    ElMessage[data.status === 'SUCCESS' ? 'success' : 'warning'](data.status === 'SUCCESS' ? '工作流执行完成' : '工作流执行中止')
+    const { data } = await api.post(`/api/automation/workflows/by-key/${encodeURIComponent(row.workflow_key)}/run`, {})
+    ElMessage.success(`任务已加入队列,位置 ${data.queue_position || 1}`)
+    emit('task-created', data)
   } catch (error) {
     ElMessage.error(error.response?.data?.detail || '执行工作流失败')
   } finally {
@@ -75,6 +89,100 @@ async function run(row) {
   }
 }
 
+function selectImportFile() {
+  importFileInput.value?.click()
+}
+
+function selectZipImportFile() {
+  zipImportFileInput.value?.click()
+}
+
+async function importWorkflow(event) {
+  const file = event.target.files?.[0]
+  event.target.value = ''
+  if (!file) return
+  try {
+    const workflow = JSON.parse(await file.text())
+    try {
+      await api.post('/api/automation/workflows/import', { workflow, conflict_strategy: 'error' })
+    } catch (error) {
+      if (error.response?.status !== 409) throw error
+      await ElMessageBox.confirm('相同 workflow key 已存在,是否覆盖?', '导入工作流', { type: 'warning' })
+      await api.post('/api/automation/workflows/import', { workflow, conflict_strategy: 'replace' })
+    }
+    ElMessage.success('工作流导入成功')
+    await load()
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail || error.message || '工作流导入失败')
+  }
+}
+
+async function importWorkflowZip(event) {
+  const file = event.target.files?.[0]
+  event.target.value = ''
+  if (!file) return
+  try {
+    const { data } = await api.post('/api/automation/workflows/import.zip', await file.arrayBuffer(), {
+      headers: { 'Content-Type': 'application/zip' },
+    })
+    ElMessage.success(`ZIP 导入完成:新增 ${data.created_count},跳过 ${data.skipped_count},失败 ${data.failed_count}`)
+    if (data.failed_count) {
+      runOutput.value = JSON.stringify(data, null, 2)
+      runDialogVisible.value = true
+    }
+    await load()
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail || error.message || 'ZIP 导入失败')
+  }
+}
+
+async function exportWorkflow(row) {
+  if (!row.workflow_key) {
+    ElMessage.warning('没有 workflow key 的工作流不能导出')
+    return
+  }
+  const { data } = await api.get(`/api/automation/workflows/by-key/${encodeURIComponent(row.workflow_key)}/export`)
+  const blob = new Blob([JSON.stringify(data, null, 2)], { type: 'application/json;charset=utf-8' })
+  const url = URL.createObjectURL(blob)
+  const link = document.createElement('a')
+  link.href = url
+  link.download = `${row.workflow_key}.workflow.json`
+  link.click()
+  URL.revokeObjectURL(url)
+}
+
+async function exportWorkflowZip() {
+  exportingZip.value = true
+  try {
+    const { data } = await api.get('/api/automation/workflows/export.zip', { responseType: 'blob' })
+    const blob = new Blob([data], { type: 'application/zip' })
+    const url = URL.createObjectURL(blob)
+    const link = document.createElement('a')
+    link.href = url
+    link.download = `workflows-${new Date().toISOString().slice(0, 10)}.zip`
+    link.click()
+    URL.revokeObjectURL(url)
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail || '导出 ZIP 失败')
+  } finally {
+    exportingZip.value = false
+  }
+}
+
+async function createWebSearch() {
+  creatingWebSearch.value = true
+  try {
+    const { data } = await api.post('/api/automation/workflows/templates/web-search')
+    ElMessage.success('已创建网页搜索工作流')
+    await load()
+    emit('edit', data.id)
+  } catch (error) {
+    ElMessage.error(error.response?.data?.detail || '创建网页搜索工作流失败')
+  } finally {
+    creatingWebSearch.value = false
+  }
+}
+
 async function remove(row) {
   await ElMessageBox.confirm(`确认删除工作流“${row.name}”?`, '删除工作流', { type: 'warning' })
   await api.delete(`/api/automation/workflows/${row.id}`)
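
Review note on `exportWorkflowZip` above: `new Date().toISOString().slice(0, 10)` produces the UTC date, which can differ from the user's local calendar date around midnight. If the local date is what the file name should show, a small helper along these lines would do it. `zipExportFileName` is a hypothetical name, not something defined in this commit.

```javascript
// Hypothetical helper: builds the ZIP download name from the local calendar date.
function zipExportFileName(date = new Date()) {
  const pad = (value) => String(value).padStart(2, '0')
  const year = date.getFullYear()
  const month = pad(date.getMonth() + 1) // getMonth() is zero-based
  const day = pad(date.getDate())
  return `workflows-${year}-${month}-${day}.zip`
}
```

The call site would then read `link.download = zipExportFileName()`.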

+ 9 - 3
frontend/src/components/SystemSettingsView.vue

@@ -47,9 +47,9 @@
       </el-form-item>
 
       <div class="section-title">远程执行</div>
-      <el-alert type="warning" show-icon :closable="false" title="按 key 远程执行工作流时必须携带此 Token。请只在可信局域网或 VPN 内开放后端端口。" />
+      <el-alert type="warning" show-icon :closable="false" title="远程执行和任务查询会校验此 Token。支持 X-Automation-Token、Bearer Token 或 automation_token 查询参数;请只在可信局域网或 VPN 内开放后端端口。" />
       <el-form-item label="远程执行 Token">
-        <el-input v-model="form.automation_remote_token" show-password placeholder="用于 X-Automation-Token 请求头" />
+        <el-input v-model="form.automation_remote_token" show-password placeholder="用于 iOS 快捷指令和前端自动化请求" />
       </el-form-item>
     </el-form>
   </div>
@@ -58,7 +58,7 @@
 <script setup>
 import { computed, onMounted, reactive, ref } from 'vue'
 import { ElMessage } from 'element-plus'
-import { api } from '../api'
+import { api, setAutomationToken } from '../api'
 
 const providers = ref([])
 const models = ref([])
@@ -88,6 +88,7 @@ async function load() {
   providers.value = providerResult.data.items
   models.value = modelResult.data.items
   Object.assign(form, settingsResult.data.settings)
+  syncAutomationToken()
 }
 
 function selectDefaultModel() {
@@ -101,6 +102,7 @@ async function save() {
   saving.value = true
   try {
     await api.put('/api/settings', form)
+    syncAutomationToken()
     ElMessage.success('系统设置已保存')
     await load()
   } catch (error) {
@@ -110,6 +112,10 @@ async function save() {
   }
 }
 
+function syncAutomationToken() {
+  setAutomationToken(form.automation_remote_token)
+}
+
 defineExpose({ load })
 onMounted(load)
 </script>
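
The updated alert text names three ways to present the token: the `X-Automation-Token` header, a Bearer token, or the `automation_token` query parameter. The sketch below shows how a client might build each form. `automationAuth` is a hypothetical helper; the Bearer form is assumed to be the usual `Authorization: Bearer <token>` header, and how the backend actually parses these values is not visible in this diff.

```javascript
// Hypothetical helper: returns headers/params for one of the three token styles
// listed in the settings alert. Backend parsing is not shown in this diff.
function automationAuth(token, style = 'header') {
  const value = String(token || '').trim()
  if (!value) return { headers: {}, params: {} }
  switch (style) {
    case 'header':
      return { headers: { 'X-Automation-Token': value }, params: {} }
    case 'bearer':
      return { headers: { Authorization: `Bearer ${value}` }, params: {} }
    case 'query':
      return { headers: {}, params: { automation_token: value } }
    default:
      throw new Error(`unknown token style: ${style}`)
  }
}
```

Assuming `api` is an axios-style client that accepts `headers` and `params` in its request config, a call could look like `api.get('/api/automation/workflow-tasks', automationAuth(token, 'header'))`. The header forms are generally preferable where the caller can set headers, because query strings tend to end up in access logs and browser history.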

+ 178 - 1
frontend/src/styles.css

@@ -14,6 +14,45 @@ body {
   display: flex;
 }
 
+.login-page {
+  min-height: 100vh;
+  display: grid;
+  place-items: center;
+  padding: 24px;
+  background:
+    radial-gradient(circle at 20% 20%, rgb(37 99 235 / 10%), transparent 28%),
+    linear-gradient(135deg, #f8fafc, #eef2f7);
+}
+
+.login-panel {
+  width: min(440px, 100%);
+  display: grid;
+  gap: 14px;
+  padding: 28px;
+  border: 1px solid #d8dee8;
+  border-radius: 8px;
+  background: #fff;
+  box-shadow: 0 18px 45px rgb(15 23 42 / 12%);
+}
+
+.login-brand {
+  color: #2563eb;
+  font-size: 14px;
+  font-weight: 700;
+}
+
+.login-panel h1 {
+  margin: 0;
+  color: #111827;
+  font-size: 26px;
+}
+
+.login-panel p {
+  margin: 0;
+  color: #64748b;
+  line-height: 1.6;
+}
+
 .sidebar {
   width: 220px;
   flex: 0 0 220px;
@@ -498,6 +537,123 @@ body {
   background-size: 24px 24px;
 }
 
+.workflow-flow .vue-flow__node-default {
+  width: 300px;
+  padding: 0;
+  border: 0;
+  border-radius: 8px;
+  background: transparent;
+  box-shadow: none;
+}
+
+.workflow-card-node {
+  width: 300px;
+  overflow: hidden;
+  border: 1px solid #cfd8e3;
+  border-radius: 8px;
+  background: #fff;
+  box-shadow: 0 8px 22px rgb(15 23 42 / 10%);
+}
+
+.workflow-card-node.selected {
+  border-color: #2563eb;
+  box-shadow: 0 0 0 3px rgb(37 99 235 / 18%), 0 8px 22px rgb(15 23 42 / 12%);
+}
+
+.workflow-card-head {
+  display: flex;
+  align-items: flex-start;
+  justify-content: space-between;
+  gap: 8px;
+  padding: 10px 10px 8px;
+  border-bottom: 1px solid #eef2f7;
+}
+
+.workflow-card-title-wrap {
+  min-width: 0;
+}
+
+.workflow-card-type {
+  color: #2563eb;
+  font-size: 12px;
+  font-weight: 700;
+}
+
+.workflow-card-title {
+  margin-top: 3px;
+  color: #111827;
+  font-weight: 700;
+  line-height: 1.3;
+  word-break: break-word;
+}
+
+.workflow-card-toggle {
+  flex: 0 0 auto;
+  padding: 3px 7px;
+  border: 1px solid #d8dee8;
+  border-radius: 6px;
+  background: #f8fafc;
+  color: #475569;
+  font-size: 12px;
+  cursor: pointer;
+}
+
+.workflow-card-desc {
+  padding: 8px 10px;
+  color: #64748b;
+  font-size: 12px;
+  line-height: 1.45;
+  word-break: break-word;
+}
+
+.workflow-card-node.collapsed .workflow-card-desc {
+  border-bottom: 0;
+}
+
+.workflow-card-body {
+  display: grid;
+  gap: 8px;
+  padding: 0 10px 10px;
+}
+
+.workflow-card-section {
+  display: flex;
+  align-items: center;
+  gap: 5px;
+  flex-wrap: wrap;
+  color: #6b7280;
+  font-size: 12px;
+}
+
+.workflow-card-section strong {
+  color: #374151;
+}
+
+.workflow-card-chip {
+  max-width: 260px;
+  padding: 2px 6px;
+  overflow: hidden;
+  border-radius: 999px;
+  background: #eef2ff;
+  color: #3730a3;
+  text-overflow: ellipsis;
+  white-space: nowrap;
+}
+
+.workflow-card-chip.input {
+  background: #ecfdf5;
+  color: #047857;
+}
+
+.workflow-card-chip.output {
+  background: #fef3c7;
+  color: #92400e;
+}
+
+.workflow-card-more {
+  color: #64748b;
+}
+
 .palette-node {
   display: block;
   width: 100%;
@@ -517,7 +673,8 @@ body {
 }
 
 .palette-node span,
-.palette-node small {
+.palette-node small,
+.palette-node em {
   display: block;
 }
 
@@ -526,6 +683,26 @@ body {
   color: #64748b;
 }
 
+.palette-node em {
+  margin-top: 5px;
+  color: #6b7280;
+  font-size: 12px;
+  font-style: normal;
+  line-height: 1.4;
+}
+
+.node-description,
+.field-help {
+  color: #64748b;
+  font-size: 13px;
+  line-height: 1.5;
+}
+
+.field-help {
+  grid-column: 3 / 4;
+  margin-top: -4px;
+}
+
 .workflow-inspector-form {
   margin-bottom: 12px;
 }

+ 5 - 0
start.cmd

@@ -0,0 +1,5 @@
+@echo off
+setlocal
+powershell.exe -NoProfile -ExecutionPolicy Bypass -File "%~dp0start.ps1" %*
+if errorlevel 1 pause
+endlocal

+ 205 - 0
start.ps1

@@ -0,0 +1,205 @@
+[CmdletBinding()]
+param(
+    [string]$BackendBindAddress = "0.0.0.0",
+    [int]$BackendPort = 8000,
+    [string]$FrontendBindAddress = "0.0.0.0",
+    [int]$FrontendPort = 5173,
+    [string]$ApiBaseUrl = "",
+    [switch]$SkipFrontend,
+    [switch]$NoBrowser
+)
+
+$ErrorActionPreference = "Stop"
+
+$ProjectRoot = Split-Path -Parent $MyInvocation.MyCommand.Path
+$RuntimeDir = Join-Path $ProjectRoot ".runtime"
+$LogDir = Join-Path $RuntimeDir "logs"
+$StatePath = Join-Path $RuntimeDir "processes.json"
+$PythonPath = Join-Path $ProjectRoot ".venv\Scripts\python.exe"
+$BackendDir = Join-Path $ProjectRoot "backend"
+$FrontendDir = Join-Path $ProjectRoot "frontend"
+$NodeModulesDir = Join-Path $FrontendDir "node_modules"
+
+function Get-ListeningProcessId {
+    param([int]$Port)
+
+    $connection = Get-NetTCPConnection -LocalPort $Port -State Listen -ErrorAction SilentlyContinue |
+        Select-Object -First 1
+    if ($null -eq $connection) {
+        return $null
+    }
+    return [int]$connection.OwningProcess
+}
+
+function Wait-ForPort {
+    param(
+        [int]$Port,
+        [System.Diagnostics.Process]$Launcher,
+        [int]$TimeoutSeconds = 30
+    )
+
+    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
+    while ((Get-Date) -lt $deadline) {
+        $listenerPid = Get-ListeningProcessId -Port $Port
+        if ($null -ne $listenerPid) {
+            return $listenerPid
+        }
+        if ($Launcher.HasExited) {
+            throw "启动进程已退出,退出码:$($Launcher.ExitCode)"
+        }
+        Start-Sleep -Milliseconds 300
+        $Launcher.Refresh()
+    }
+    throw "等待端口 $Port 监听超时(${TimeoutSeconds} 秒)"
+}
+
+function Stop-ProcessTree {
+    param([int]$RootProcessId)
+
+    if ($RootProcessId -le 0) {
+        return
+    }
+
+    # Stop child processes first so that no orphaned Vite process survives after npm/cmd exits.
+    $pending = [System.Collections.Generic.List[int]]::new()
+    $pending.Add($RootProcessId)
+    $allProcessIds = [System.Collections.Generic.List[int]]::new()
+    while ($pending.Count -gt 0) {
+        $currentProcessId = $pending[0]
+        $pending.RemoveAt(0)
+        $allProcessIds.Add($currentProcessId)
+        Get-CimInstance Win32_Process -Filter "ParentProcessId = $currentProcessId" -ErrorAction SilentlyContinue |
+            ForEach-Object { $pending.Add([int]$_.ProcessId) }
+    }
+
+    $orderedProcessIds = @($allProcessIds)
+    [array]::Reverse($orderedProcessIds)
+    foreach ($processId in $orderedProcessIds) {
+        Stop-Process -Id $processId -Force -ErrorAction SilentlyContinue
+    }
+}
+
+if (-not (Test-Path -LiteralPath $PythonPath -PathType Leaf)) {
+    throw "未找到 Python 虚拟环境:$PythonPath。请先执行 python -m venv .venv 并安装后端依赖。"
+}
+if (-not (Test-Path -LiteralPath $BackendDir -PathType Container)) {
+    throw "未找到后端目录:$BackendDir"
+}
+if (-not $SkipFrontend) {
+    if (-not (Get-Command npm.cmd -ErrorAction SilentlyContinue)) {
+        throw "未找到 npm.cmd,请先安装 Node.js 并确认 npm 已加入 PATH。"
+    }
+    if (-not (Test-Path -LiteralPath $NodeModulesDir -PathType Container)) {
+        throw "未找到前端依赖目录:$NodeModulesDir。请先在 frontend 目录执行 npm install。"
+    }
+}
+
+$portsToCheck = @($BackendPort)
+if (-not $SkipFrontend) {
+    $portsToCheck += $FrontendPort
+}
+foreach ($port in $portsToCheck) {
+    if ($null -ne (Get-ListeningProcessId -Port $port)) {
+        throw "端口 $port 已被占用。请先关闭占用进程,或通过参数指定其他端口。"
+    }
+}
+
+New-Item -ItemType Directory -Path $LogDir -Force | Out-Null
+if (Test-Path -LiteralPath $StatePath) {
+    Remove-Item -LiteralPath $StatePath -Force
+}
+
+$backendProcess = $null
+$frontendProcess = $null
+try {
+    Write-Host "正在启动后端:http://127.0.0.1:$BackendPort" -ForegroundColor Cyan
+    $backendProcess = Start-Process `
+        -FilePath $PythonPath `
+        -ArgumentList @("-m", "uvicorn", "app.main:app", "--host", $BackendBindAddress, "--port", $BackendPort) `
+        -WorkingDirectory $BackendDir `
+        -WindowStyle Hidden `
+        -RedirectStandardOutput (Join-Path $LogDir "backend.out.log") `
+        -RedirectStandardError (Join-Path $LogDir "backend.err.log") `
+        -PassThru
+    $backendListenerPid = Wait-ForPort -Port $BackendPort -Launcher $backendProcess
+
+    $frontendListenerPid = $null
+    if (-not $SkipFrontend) {
+        Write-Host "正在启动前端:http://127.0.0.1:$FrontendPort" -ForegroundColor Cyan
+        $originalApiBaseUrl = $env:VITE_API_BASE
+        try {
+            if (-not [string]::IsNullOrWhiteSpace($ApiBaseUrl)) {
+                $env:VITE_API_BASE = $ApiBaseUrl.TrimEnd("/")
+            } elseif ($BackendPort -ne 8000) {
+                # With a custom backend port, point the frontend API base at the local backend automatically.
+                $env:VITE_API_BASE = "http://127.0.0.1:$BackendPort"
+            }
+            $frontendProcess = Start-Process `
+                -FilePath "npm.cmd" `
+                -ArgumentList @("run", "dev", "--", "--host", $FrontendBindAddress, "--port", $FrontendPort) `
+                -WorkingDirectory $FrontendDir `
+                -WindowStyle Hidden `
+                -RedirectStandardOutput (Join-Path $LogDir "frontend.out.log") `
+                -RedirectStandardError (Join-Path $LogDir "frontend.err.log") `
+                -PassThru
+        } finally {
+            if ($null -eq $originalApiBaseUrl) {
+                Remove-Item Env:VITE_API_BASE -ErrorAction SilentlyContinue
+            } else {
+                $env:VITE_API_BASE = $originalApiBaseUrl
+            }
+        }
+        $frontendListenerPid = Wait-ForPort -Port $FrontendPort -Launcher $frontendProcess
+    }
+
+    $state = [ordered]@{
+        started_at = (Get-Date).ToString("o")
+        project_root = $ProjectRoot
+        backend = [ordered]@{
+            launcher_pid = $backendProcess.Id
+            listener_pid = $backendListenerPid
+            bind_address = $BackendBindAddress
+            port = $BackendPort
+        }
+        frontend = if ($SkipFrontend) {
+            $null
+        } else {
+            [ordered]@{
+                launcher_pid = $frontendProcess.Id
+                listener_pid = $frontendListenerPid
+                bind_address = $FrontendBindAddress
+                port = $FrontendPort
+                api_base_url = if (-not [string]::IsNullOrWhiteSpace($ApiBaseUrl)) {
+                    $ApiBaseUrl.TrimEnd("/")
+                } elseif ($BackendPort -ne 8000) {
+                    "http://127.0.0.1:$BackendPort"
+                } else {
+                    $null
+                }
+            }
+        }
+    }
+    $state | ConvertTo-Json -Depth 5 | Set-Content -LiteralPath $StatePath -Encoding UTF8
+
+    Write-Host "项目启动成功。" -ForegroundColor Green
+    Write-Host "后端:http://127.0.0.1:$BackendPort"
+    if (-not $SkipFrontend) {
+        $frontendUrl = "http://127.0.0.1:$FrontendPort"
+        Write-Host "前端:$frontendUrl"
+        if (-not $NoBrowser) {
+            Start-Process $frontendUrl
+        }
+    }
+    Write-Host "日志目录:$LogDir"
+    Write-Host "关闭项目:.\stop.ps1"
+} catch {
+    Write-Host "启动失败:$($_.Exception.Message)" -ForegroundColor Red
+    if ($null -ne $frontendProcess) {
+        Stop-ProcessTree -RootProcessId $frontendProcess.Id
+    }
+    if ($null -ne $backendProcess) {
+        Stop-ProcessTree -RootProcessId $backendProcess.Id
+    }
+    Remove-Item -LiteralPath $StatePath -Force -ErrorAction SilentlyContinue
+    throw
+}

+ 5 - 0
stop.cmd

@@ -0,0 +1,5 @@
+@echo off
+setlocal
+powershell.exe -NoProfile -ExecutionPolicy Bypass -File "%~dp0stop.ps1" %*
+if errorlevel 1 pause
+endlocal

+ 97 - 0
stop.ps1

@@ -0,0 +1,97 @@
+[CmdletBinding()]
+param()
+
+$ErrorActionPreference = "Stop"
+
+$ProjectRoot = Split-Path -Parent $MyInvocation.MyCommand.Path
+$StatePath = Join-Path $ProjectRoot ".runtime\processes.json"
+
+function Get-ListeningProcessId {
+    param([int]$Port)
+
+    $connection = Get-NetTCPConnection -LocalPort $Port -State Listen -ErrorAction SilentlyContinue |
+        Select-Object -First 1
+    if ($null -eq $connection) {
+        return $null
+    }
+    return [int]$connection.OwningProcess
+}
+
+function Get-ProcessCommandLine {
+    param([int]$ProcessId)
+
+    $process = Get-CimInstance Win32_Process -Filter "ProcessId = $ProcessId" -ErrorAction SilentlyContinue
+    return [string]$process.CommandLine
+}
+
+function Stop-ProcessTree {
+    param([int]$RootProcessId)
+
+    if ($RootProcessId -le 0 -or -not (Get-Process -Id $RootProcessId -ErrorAction SilentlyContinue)) {
+        return
+    }
+
+    # Collect the whole process tree and stop it deepest-first so that no node/python child processes are left behind.
+    $pending = [System.Collections.Generic.List[int]]::new()
+    $pending.Add($RootProcessId)
+    $allProcessIds = [System.Collections.Generic.List[int]]::new()
+    while ($pending.Count -gt 0) {
+        $currentProcessId = $pending[0]
+        $pending.RemoveAt(0)
+        $allProcessIds.Add($currentProcessId)
+        Get-CimInstance Win32_Process -Filter "ParentProcessId = $currentProcessId" -ErrorAction SilentlyContinue |
+            ForEach-Object { $pending.Add([int]$_.ProcessId) }
+    }
+
+    $orderedProcessIds = @($allProcessIds)
+    [array]::Reverse($orderedProcessIds)
+    foreach ($processId in $orderedProcessIds) {
+        Stop-Process -Id $processId -Force -ErrorAction SilentlyContinue
+    }
+}
+
+$processIds = [System.Collections.Generic.HashSet[int]]::new()
+if (Test-Path -LiteralPath $StatePath -PathType Leaf) {
+    try {
+        $state = Get-Content -LiteralPath $StatePath -Raw | ConvertFrom-Json
+        foreach ($entry in @($state.backend, $state.frontend)) {
+            if ($null -eq $entry) {
+                continue
+            }
+            foreach ($value in @($entry.listener_pid, $entry.launcher_pid)) {
+                if ($null -ne $value -and [int]$value -gt 0) {
+                    [void]$processIds.Add([int]$value)
+                }
+            }
+        }
+    } catch {
+        Write-Warning "运行状态文件无法读取,将使用端口和命令行进行兜底检查。"
+    }
+}
+
+# Fallback for a missing state file: only take over default-port listeners whose command line clearly belongs to this project.
+foreach ($port in @(8000, 5173)) {
+    $listenerPid = Get-ListeningProcessId -Port $port
+    if ($null -eq $listenerPid) {
+        continue
+    }
+    $commandLine = Get-ProcessCommandLine -ProcessId $listenerPid
+    $isBackend = $port -eq 8000 -and $commandLine -like "*uvicorn*app.main:app*"
+    $isFrontend = $port -eq 5173 -and $commandLine -like "*vite*"
+    if ($isBackend -or $isFrontend) {
+        [void]$processIds.Add($listenerPid)
+    }
+}
+
+if ($processIds.Count -eq 0) {
+    Remove-Item -LiteralPath $StatePath -Force -ErrorAction SilentlyContinue
+    Write-Host "项目当前未运行。" -ForegroundColor Yellow
+    exit 0
+}
+
+foreach ($processId in $processIds) {
+    Stop-ProcessTree -RootProcessId $processId
+}
+
+Remove-Item -LiteralPath $StatePath -Force -ErrorAction SilentlyContinue
+Write-Host "项目已关闭。" -ForegroundColor Green
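
`Stop-ProcessTree` in both scripts walks the process tree breadth-first and then reverses the collected list, so every child is stopped before its parent. The same ordering is sketched here in JavaScript. `stopOrder` and the `childrenOf` lookup are hypothetical; `childrenOf` stands in for the `Win32_Process` query filtered by `ParentProcessId`.

```javascript
// Sketch of the ordering used by Stop-ProcessTree; `childrenOf` is a hypothetical
// stand-in for the Win32_Process query filtered by ParentProcessId.
function stopOrder(rootPid, childrenOf) {
  const pending = [rootPid]
  const visited = []
  while (pending.length > 0) {
    const current = pending.shift() // breadth-first: parents are recorded before children
    visited.push(current)
    for (const child of childrenOf(current)) pending.push(child)
  }
  return visited.reverse() // reversed: every child precedes its parent
}
```

For a tree where process 1 spawned 2 and 3, and 2 spawned 4, the breadth-first list is 1, 2, 3, 4 and the stop order is its reverse.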

+ 3 - 0
task.md

@@ -68,6 +68,9 @@
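 - [x] Write the new workflow data format and node reference document, and update the API documentation to match
 - [x] Add workflow_key so that workflows can be run remotely by a stable key
 - [x] Add a remote execution Token setting for automation, and validate X-Automation-Token on the run-by-key endpoint
+- [x] Add a web search research node and workflow template based on real-browser screenshots and a multimodal model
+- [x] Change workflow execution to create asynchronous tasks by key, and add a global single-task queue, task history, and status queries
+- [x] Add workflow JSON import/export and a portable AI multi-round web research workflow
 
 ## Progress log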
 

+ 48 - 15
workflow-format.md

@@ -20,7 +20,8 @@
   "settings": {
     "max_steps": 100,
     "default_timeout_ms": 30000,
-    "on_unhandled_error": "pause_for_user"
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "result_node", "output": "data"}
   },
   "nodes": [],
   "edges": []
@@ -32,6 +33,7 @@
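 - `workflow_key`: optional stable key for invoking the workflow. It may contain only letters, digits, underscores, and hyphens, and suits remote entry points such as phone shortcuts that run a workflow by key.
 - `variables`: workflow variables. A run request can pass variables of the same name to override the defaults.
 - `settings.max_steps`: caps the number of executed steps so that loops or bad jumps cannot run forever.
+- `settings.return`: once the asynchronous task finishes, `return_data` is extracted from the named node's outputs. When `output` is omitted, all of that node's outputs are returned.
 - `nodes`: list of node instances.
 - `edges`: connections between nodes, covering both control flow and data flow.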
 
@@ -107,6 +109,16 @@
 - `keyboard.press`、`keyboard.hotkey`、`keyboard.key_down`、`keyboard.key_up`
 - `text.input`
 - `browser.open_url`
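+- `browser.ensure_foreground`: opens the browser or brings it to the foreground, optionally opening a URL and maximizing the window.
+- `browser.control`: browser shortcuts such as fullscreen, back, refresh, and close tab.
+- `browser.video_action`: opens a YouTube/Bilibili/Douyin video, or switches to the next item in the Douyin video feed.
+- `browser.web_search`: uses real-browser screenshots and a multimodal model to extract search results, deduplicate and rank them, research detail pages, and summarize.
+- `research.ai_web_research`: has the AI plan queries, run multiple rounds of visual search, evaluate the objective, and produce the final data according to a JSON Schema.
+- `vision.locate_element`: captures the current screen, asks a multimodal AI for the target element's position as relative percentages, and outputs the converted screen coordinates.
+- `vision.click_target`: locates the target and clicks it immediately, which avoids the common "locate node + mouse node" wiring.
+- `vision.verify_page`: uses a multimodal AI to judge whether the current screen matches the expected state and outputs the matching branch.
+- `vision.close_popups`: tries to recognize and click popup buttons such as close, skip, or "maybe later".
+- `media.control`: play/pause, fullscreen, mute, volume, and previous/next controls for living-room remote-control scenarios.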
 - `program.start`、`program.stop`、`program.close_opened`
 - `screen.screenshot`
 - `wait.seconds`
@@ -114,26 +126,47 @@
 
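 The parameters, inputs, outputs, and control ports of each node type are defined by `GET /api/automation/workflow-nodes`. The frontend node library and property panel should be generated dynamically from that endpoint.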
 
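-## Execution results
+## Asynchronous execution
 
-The workflow run endpoint returns results per node:
+Workflows are executed only through their stable key: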
+
+```text
+POST /api/automation/workflows/by-key/{workflow_key}/run
+```
+
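+The endpoint returns a task ID immediately. All workflows share one serial queue, so only one task runs at any given time. Use the following endpoints to read status and results: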
+
+```text
+GET /api/automation/workflow-tasks/{task_id}
+GET /api/automation/workflow-tasks
+```
+
+A finished task exposes both the complete `result` and the `return_data` extracted according to `settings.return`:
 
 ```json
 {
-  "workflow_id": 1,
+  "id": "task-uuid",
   "status": "SUCCESS",
-  "results": [
-    {
-      "node_id": "program_1",
-      "status": "SUCCESS",
-      "inputs": {},
-      "outputs": { "pid": 1234, "command": "notepad" }
-    }
-  ],
-  "outputs": {
-    "program_1": { "pid": 1234, "command": "notepad" }
-  }
+  "return_data": {"answer": "..."},
+  "result": {"status": "SUCCESS", "results": [], "outputs": {}}
 }
 ```
 
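 If a node needs a decision from the user, the execution result returns `status: "PAUSED"` together with the paused node, the question, and an optional screenshot path. When a node fails, the backend makes a best effort to save a screenshot of the current screen to the automation error directory and returns its path in the failed item's `artifacts.screenshot_path`, so that the frontend can show it to the user for further analysis.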
+
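+## AI multi-round web research workflow
+
+The repository ships `workflows/ai-web-research.workflow.json`. The workflow's key is `ai-web-research`, and its variables are:
+
+- `objective`: what to search for and research.
+- `output_schema`: the JSON Schema that the final `data` must satisfy.
+- `constraints`: constraints such as language, minimum number of sources, and required domains.
+- `max_attempts`: maximum number of search rounds while the objective has not been met.
+
+The node first has the local AI draw up a query plan, then calls the visual web search in a loop. After each round it runs JSON Schema validation, source-constraint validation, and an AI semantic check of the objective, and it stops when the objective is met or the attempt limit is reached.
+
+## Import and export
+
+- `POST /api/automation/workflows/import`: imports a workflow JSON, with the option to overwrite when the same key already exists.
+- `GET /api/automation/workflows/by-key/{workflow_key}/export`: exports a portable JSON.
+- The frontend automation workflow page provides the "导入 JSON" (Import JSON) and "导出" (Export) buttons.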
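
The `settings.return` rule documented in this file can be illustrated with a small sketch. It assumes the task result carries an `outputs` map keyed by node ID, as in the JSON example of the asynchronous execution section. `extractReturnData` is a hypothetical name, returning `null` for a missing node or output is an assumption, and the backend's actual implementation is not shown in this diff.

```javascript
// Hypothetical illustration of the `settings.return` rule; not the backend code.
function extractReturnData(outputs, returnSetting) {
  if (!returnSetting || !returnSetting.node_id) return null
  const nodeOutputs = outputs?.[returnSetting.node_id]
  if (nodeOutputs == null) return null
  // Without `output`, the whole output object of the node is returned.
  if (returnSetting.output === undefined) return nodeOutputs
  // With `output`, only that named output is returned (null if absent: an assumption).
  return nodeOutputs[returnSetting.output] ?? null
}
```

With `{"node_id": "result_node", "output": "data"}` only the `data` output of `result_node` is returned, while `{"node_id": "research"}` returns everything the `research` node produced.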

+ 105 - 0
workflows/ai-web-research.workflow.json

@@ -0,0 +1,105 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "ai-web-research",
+  "name": "AI 多轮网页搜索研究",
+  "description": "AI 制定搜索计划,使用视觉浏览器多轮研究,并按调用方提供的 JSON Schema 返回结果。",
+  "variables": {
+    "objective": {
+      "type": "string",
+      "default": "",
+      "description": "要搜索和研究的目标"
+    },
+    "output_schema": {
+      "type": "object",
+      "default": {
+        "type": "object",
+        "required": ["summary", "facts"],
+        "properties": {
+          "summary": {"type": "string"},
+          "facts": {"type": "array", "items": {"type": "string"} }
+        },
+        "additionalProperties": false
+      },
+      "description": "最终 data 字段必须满足的 JSON Schema"
+    },
+    "constraints": {
+      "type": "object",
+      "default": {"language": "zh-CN", "min_sources": 1},
+      "description": "来源、语言、时间范围等研究约束"
+    },
+    "max_attempts": {
+      "type": "number",
+      "default": 3,
+      "description": "AI 评估未达标时最多执行的搜索轮数"
+    }
+  },
+  "settings": {
+    "max_steps": 10,
+    "default_timeout_ms": 1800000,
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "research"}
+  },
+  "nodes": [
+    {
+      "id": "start",
+      "type": "flow.start",
+      "title": "开始",
+      "position": {"x": 80, "y": 180},
+      "params": {},
+      "inputs": {}
+    },
+    {
+      "id": "research",
+      "type": "research.ai_web_research",
+      "title": "AI 规划并循环研究",
+      "position": {"x": 360, "y": 180},
+      "params": {
+        "search_engine": "bing",
+        "browser": "edge",
+        "max_search_pages": 2,
+        "result_count": 2,
+        "detail_max_pages": 2
+      },
+      "inputs": {
+        "objective": {"source": "variable", "name": "objective"},
+        "output_schema": {"source": "variable", "name": "output_schema"},
+        "constraints": {"source": "variable", "name": "constraints"},
+        "max_attempts": {"source": "variable", "name": "max_attempts"}
+      }
+    },
+    {
+      "id": "end",
+      "type": "flow.end",
+      "title": "结束",
+      "position": {"x": 680, "y": 180},
+      "params": {},
+      "inputs": {}
+    }
+  ],
+  "edges": [
+    {
+      "id": "start_to_research",
+      "kind": "control",
+      "source": "start",
+      "source_port": "next",
+      "target": "research",
+      "target_port": "run"
+    },
+    {
+      "id": "research_to_end",
+      "kind": "control",
+      "source": "research",
+      "source_port": "success",
+      "target": "end",
+      "target_port": "run"
+    },
+    {
+      "id": "partial_to_end",
+      "kind": "control",
+      "source": "research",
+      "source_port": "partial",
+      "target": "end",
+      "target_port": "run"
+    }
+  ]
+}
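
Reviewer note: the `research` node above routes both its `success` and `partial` ports to `end`, and `settings.return` names a node by id. A minimal sketch of a consistency check for that wiring follows. The function name `check_workflow` and the reduced `example` dict are hypothetical and are not part of this commit; the sketch only assumes the `nodes`, `edges`, and `settings.return.node_id` fields visible in the JSON.

```python
from typing import Any


def check_workflow(workflow: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems found in a workflow dict."""
    problems: list[str] = []
    node_ids = [node["id"] for node in workflow.get("nodes", [])]
    known = set(node_ids)

    # Node ids are used as references, so they should be unique.
    if len(known) != len(node_ids):
        problems.append("duplicate node id")

    # Every edge endpoint must name an existing node.
    for edge in workflow.get("edges", []):
        for end in ("source", "target"):
            if edge[end] not in known:
                problems.append(f"edge {edge['id']}: unknown {end} {edge[end]!r}")

    # The return node, when declared, must exist as well.
    return_id = workflow.get("settings", {}).get("return", {}).get("node_id")
    if return_id is not None and return_id not in known:
        problems.append(f"settings.return: unknown node {return_id!r}")
    return problems


# Reduced copy of the research workflow: only the fields the check reads.
example = {
    "settings": {"return": {"node_id": "research"}},
    "nodes": [{"id": "start"}, {"id": "research"}, {"id": "end"}],
    "edges": [
        {"id": "start_to_research", "source": "start", "target": "research"},
        {"id": "research_to_end", "source": "research", "target": "end"},
        {"id": "partial_to_end", "source": "research", "target": "end"},
    ],
}
print(check_workflow(example))
```

A check of this kind could run in a test over every file under `workflows/`, though whether the backend already validates this at load time is not visible in this part of the diff.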

+ 52 - 0
workflows/bilibili-home-random-video.workflow.json

@@ -0,0 +1,52 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "bilibili-home-random-video",
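+  "name": "Visual click: random Bilibili homepage recommended video",
+  "description": "Open the Bilibili homepage in a real browser, use multimodal AI to locate a random video among the current homepage recommendations, and click it to play.",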
+  "variables": {},
+  "settings": {
+    "max_steps": 10,
+    "default_timeout_ms": 120000,
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "locate"}
+  },
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 180}, "params": {}, "inputs": {}},
+    {"id": "open", "type": "browser.open_url", "title": "Open Bilibili homepage", "position": {"x": 300, "y": 180}, "params": {"url": "https://www.bilibili.com/", "browser": "edge", "new_window": true}, "inputs": {}},
+    {"id": "wait", "type": "wait.seconds", "title": "Wait for page to load", "position": {"x": 540, "y": 180}, "params": {"seconds": 6}, "inputs": {}},
+    {
+      "id": "locate",
+      "type": "vision.locate_element",
+      "title": "AI locates a random homepage video",
+      "position": {"x": 780, "y": 180},
+      "params": {
+        "target_description": "From the recommended video cards visible on the current Bilibili homepage, randomly select the center of one clickable video cover or title. Do not select the top navigation, search box, live-stream entry, ad banners, anime (bangumi) navigation, user avatar, or blank carousel areas.",
+        "screen_context": "Bilibili homepage recommendation feed; the current browser may be signed in to a user account.",
+        "randomize": true,
+        "save_screenshot": true,
+        "fail_if_not_found": true,
+        "temperature": 0.2
+      },
+      "inputs": {}
+    },
+    {
+      "id": "click",
+      "type": "mouse.click",
+      "title": "Click to play",
+      "position": {"x": 1040, "y": 180},
+      "params": {"button": "left", "clicks": 1, "duration": 0},
+      "inputs": {
+        "x": {"source": "node_output", "node_id": "locate", "output": "x"},
+        "y": {"source": "node_output", "node_id": "locate", "output": "y"}
+      }
+    },
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 1280, "y": 180}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_open", "kind": "control", "source": "start", "source_port": "next", "target": "open", "target_port": "run"},
+    {"id": "open_to_wait", "kind": "control", "source": "open", "source_port": "success", "target": "wait", "target_port": "run"},
+    {"id": "wait_to_locate", "kind": "control", "source": "wait", "source_port": "success", "target": "locate", "target_port": "run"},
+    {"id": "locate_to_click", "kind": "control", "source": "locate", "source_port": "success", "target": "click", "target_port": "run"},
+    {"id": "click_to_end", "kind": "control", "source": "click", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}
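
Reviewer note: the `click` node takes its `x` and `y` from the `locate` node through `{"source": "node_output", ...}` bindings, while other workflows in this commit bind inputs with `{"source": "variable", ...}`. The sketch below shows one way such bindings could be resolved before a node runs. `resolve_inputs` and the sample coordinates are hypothetical; the actual resolution logic in `automation_service.py` is not shown in this part of the diff.

```python
from typing import Any


def resolve_inputs(
    inputs: dict[str, dict[str, Any]],
    variables: dict[str, Any],
    node_outputs: dict[str, dict[str, Any]],
) -> dict[str, Any]:
    """Turn a node's `inputs` bindings into concrete values."""
    resolved: dict[str, Any] = {}
    for name, binding in inputs.items():
        source = binding["source"]
        if source == "variable":
            # Workflow-level variable, e.g. a URL supplied by the caller.
            resolved[name] = variables[binding["name"]]
        elif source == "node_output":
            # Named output of a node that has already run.
            resolved[name] = node_outputs[binding["node_id"]][binding["output"]]
        else:
            raise ValueError(f"unsupported input source: {source!r}")
    return resolved


click_inputs = {
    "x": {"source": "node_output", "node_id": "locate", "output": "x"},
    "y": {"source": "node_output", "node_id": "locate", "output": "y"},
}
# Hypothetical coordinates standing in for what a locate node might return.
outputs = {"locate": {"x": 640, "y": 360}}
resolved = resolve_inputs(click_inputs, variables={}, node_outputs=outputs)
print(resolved)
```

Because `locate` only reaches `click` through its `success` port and `fail_if_not_found` is `true`, the click node should never see missing coordinates, assuming the engine honors those ports as the edge list suggests.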

+ 58 - 0
workflows/bilibili-up-latest-video.workflow.json

@@ -0,0 +1,58 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "bilibili-up-latest-video",
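+  "name": "Visual click: latest video from a specified Bilibili uploader",
+  "description": "Open the video page of a specified Bilibili uploader, use multimodal AI to locate the latest upload, and click it to play.",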
+  "variables": {
+    "up_url": {
+      "type": "string",
+      "default": "https://space.bilibili.com/2/video",
+      "description": "URL of the Bilibili uploader's video page; a URL ending in /video is recommended"
+    }
+  },
+  "settings": {
+    "max_steps": 10,
+    "default_timeout_ms": 120000,
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "locate"}
+  },
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 180}, "params": {}, "inputs": {}},
+    {"id": "open", "type": "browser.open_url", "title": "Open uploader video page", "position": {"x": 300, "y": 180}, "params": {"browser": "edge", "new_window": true}, "inputs": {"url": {"source": "variable", "name": "up_url"}}},
+    {"id": "wait", "type": "wait.seconds", "title": "Wait for page to load", "position": {"x": 540, "y": 180}, "params": {"seconds": 6}, "inputs": {}},
+    {
+      "id": "locate",
+      "type": "vision.locate_element",
+      "title": "AI locates the latest upload",
+      "position": {"x": 780, "y": 180},
+      "params": {
+        "target_description": "Locate the most recently published public video in the Bilibili uploader's video list, usually the center of the cover or title of the first video at the top of the upload list or video grid. Do not select feed posts, collections, playlists, avatars, the follow button, the search box, or navigation tabs.",
+        "screen_context": "Bilibili uploader space, /video page.",
+        "randomize": false,
+        "save_screenshot": true,
+        "fail_if_not_found": true,
+        "temperature": 0.1
+      },
+      "inputs": {}
+    },
+    {
+      "id": "click",
+      "type": "mouse.click",
+      "title": "Click to play",
+      "position": {"x": 1040, "y": 180},
+      "params": {"button": "left", "clicks": 1, "duration": 0},
+      "inputs": {
+        "x": {"source": "node_output", "node_id": "locate", "output": "x"},
+        "y": {"source": "node_output", "node_id": "locate", "output": "y"}
+      }
+    },
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 1280, "y": 180}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_open", "kind": "control", "source": "start", "source_port": "next", "target": "open", "target_port": "run"},
+    {"id": "open_to_wait", "kind": "control", "source": "open", "source_port": "success", "target": "wait", "target_port": "run"},
+    {"id": "wait_to_locate", "kind": "control", "source": "wait", "source_port": "success", "target": "locate", "target_port": "run"},
+    {"id": "locate_to_click", "kind": "control", "source": "locate", "source_port": "success", "target": "click", "target_port": "run"},
+    {"id": "click_to_end", "kind": "control", "source": "click", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 48 - 0
workflows/douyin-next-video.workflow.json

@@ -0,0 +1,48 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "douyin-next-video",
+  "name": "Visual click: next Douyin video",
+  "description": "In the current Douyin web video feed, use multimodal AI to locate the next/switch-down button and click it.",
+  "variables": {},
+  "settings": {
+    "max_steps": 8,
+    "default_timeout_ms": 60000,
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "locate"}
+  },
+  "nodes": [
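+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 180}, "params": {}, "inputs": {}},
+    {
+      "id": "locate",
+      "type": "vision.locate_element",
+      "title": "AI locates the next-video button",
+      "position": {"x": 360, "y": 180},
+      "params": {
+        "target_description": "Locate the center of the down arrow, next button, or next-video switch control on the right side of the page that switches to the next video in the Douyin video feed. Do not select the like, comment, favorite, share, avatar, follow, or search buttons.",
+        "screen_context": "The browser is already on the Douyin web video playback page.",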
+        "randomize": false,
+        "save_screenshot": true,
+        "fail_if_not_found": true,
+        "temperature": 0.1
+      },
+      "inputs": {}
+    },
+    {
+      "id": "click",
+      "type": "mouse.click",
+      "title": "Click next",
+      "position": {"x": 620, "y": 180},
+      "params": {"button": "left", "clicks": 1, "duration": 0},
+      "inputs": {
+        "x": {"source": "node_output", "node_id": "locate", "output": "x"},
+        "y": {"source": "node_output", "node_id": "locate", "output": "y"}
+      }
+    },
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 860, "y": 180}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_locate", "kind": "control", "source": "start", "source_port": "next", "target": "locate", "target_port": "run"},
+    {"id": "locate_to_click", "kind": "control", "source": "locate", "source_port": "success", "target": "click", "target_port": "run"},
+    {"id": "click_to_end", "kind": "control", "source": "click", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 58 - 0
workflows/douyin-random-video.workflow.json

@@ -0,0 +1,58 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "douyin-random-video",
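+  "name": "Visual click: open a random Douyin video",
+  "description": "Open the Douyin recommendation page in a real browser, use multimodal AI to locate the currently recommended video area, and click it to play/focus.",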
+  "variables": {
+    "douyin_url": {
+      "type": "string",
+      "default": "https://www.douyin.com/",
+      "description": "URL of the Douyin recommendation entry point; the default usually works"
+    }
+  },
+  "settings": {
+    "max_steps": 10,
+    "default_timeout_ms": 120000,
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "locate"}
+  },
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 180}, "params": {}, "inputs": {}},
+    {"id": "open", "type": "browser.open_url", "title": "Open Douyin recommendation page", "position": {"x": 300, "y": 180}, "params": {"browser": "edge", "new_window": true}, "inputs": {"url": {"source": "variable", "name": "douyin_url"}}},
+    {"id": "wait", "type": "wait.seconds", "title": "Wait for page to load", "position": {"x": 540, "y": 180}, "params": {"seconds": 8}, "inputs": {}},
+    {
+      "id": "locate",
+      "type": "vision.locate_element",
+      "title": "AI locates the current video",
+      "position": {"x": 780, "y": 180},
+      "params": {
+        "target_description": "Locate the center of the main video currently shown on the Douyin recommendation page, preferring the playable area of the video frame. Do not select the search box, avatar, follow button, comment button, like button, share button, or browser controls.",
+        "screen_context": "Douyin web recommended video feed.",
+        "randomize": false,
+        "save_screenshot": true,
+        "fail_if_not_found": true,
+        "temperature": 0.1
+      },
+      "inputs": {}
+    },
+    {
+      "id": "click",
+      "type": "mouse.click",
+      "title": "Click to play/focus",
+      "position": {"x": 1040, "y": 180},
+      "params": {"button": "left", "clicks": 1, "duration": 0},
+      "inputs": {
+        "x": {"source": "node_output", "node_id": "locate", "output": "x"},
+        "y": {"source": "node_output", "node_id": "locate", "output": "y"}
+      }
+    },
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 1280, "y": 180}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_open", "kind": "control", "source": "start", "source_port": "next", "target": "open", "target_port": "run"},
+    {"id": "open_to_wait", "kind": "control", "source": "open", "source_port": "success", "target": "wait", "target_port": "run"},
+    {"id": "wait_to_locate", "kind": "control", "source": "wait", "source_port": "success", "target": "locate", "target_port": "run"},
+    {"id": "locate_to_click", "kind": "control", "source": "locate", "source_port": "success", "target": "click", "target_port": "run"},
+    {"id": "click_to_end", "kind": "control", "source": "click", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-browser-back.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-browser-back",
+  "name": "Living-room remote: browser back",
+  "description": "Send Alt+Left to the current browser window to go back to the previous page.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "browser"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "browser", "type": "browser.control", "title": "Browser back", "position": {"x": 340, "y": 160}, "params": {"action": "back"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_browser", "kind": "control", "source": "start", "source_port": "next", "target": "browser", "target_port": "run"},
+    {"id": "browser_to_end", "kind": "control", "source": "browser", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-close-browser.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-close-browser",
+  "name": "Living-room remote: close browser window",
+  "description": "Send Alt+F4 to the current browser window.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "browser"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "browser", "type": "browser.control", "title": "Close window", "position": {"x": 340, "y": 160}, "params": {"action": "close_window"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_browser", "kind": "control", "source": "start", "source_port": "next", "target": "browser", "target_port": "run"},
+    {"id": "browser_to_end", "kind": "control", "source": "browser", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-escape.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-escape",
+  "name": "Living-room remote: exit / close overlay",
+  "description": "Press Escape in the current window; useful for exiting fullscreen or closing an overlay.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "browser"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "browser", "type": "browser.control", "title": "Press Escape", "position": {"x": 340, "y": 160}, "params": {"action": "escape"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_browser", "kind": "control", "source": "start", "source_port": "next", "target": "browser", "target_port": "run"},
+    {"id": "browser_to_end", "kind": "control", "source": "browser", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-fullscreen.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-fullscreen",
+  "name": "Living-room remote: web player fullscreen",
+  "description": "Press F on the current video page to toggle the web player's fullscreen mode.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "media"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "media", "type": "media.control", "title": "Web fullscreen", "position": {"x": 340, "y": 160}, "params": {"action": "site_fullscreen"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_media", "kind": "control", "source": "start", "source_port": "next", "target": "media", "target_port": "run"},
+    {"id": "media_to_end", "kind": "control", "source": "media", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-mute.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-mute",
+  "name": "Living-room remote: system mute",
+  "description": "Toggle Windows system mute.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "media"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "media", "type": "media.control", "title": "System mute", "position": {"x": 340, "y": 160}, "params": {"action": "system_mute"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_media", "kind": "control", "source": "start", "source_port": "next", "target": "media", "target_port": "run"},
+    {"id": "media_to_end", "kind": "control", "source": "media", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-next.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-next",
+  "name": "Living-room remote: next item",
+  "description": "Press the Down arrow key on a short-video or recommendation-feed page to switch to the next item.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "media"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "media", "type": "media.control", "title": "Next", "position": {"x": 340, "y": 160}, "params": {"action": "next"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_media", "kind": "control", "source": "start", "source_port": "next", "target": "media", "target_port": "run"},
+    {"id": "media_to_end", "kind": "control", "source": "media", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-play-pause.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-play-pause",
+  "name": "Living-room remote: play/pause",
+  "description": "Press Space on the current playback page to toggle between play and pause.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "media"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "media", "type": "media.control", "title": "Play/Pause", "position": {"x": 340, "y": 160}, "params": {"action": "play_pause"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_media", "kind": "control", "source": "start", "source_port": "next", "target": "media", "target_port": "run"},
+    {"id": "media_to_end", "kind": "control", "source": "media", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-previous.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-previous",
+  "name": "Living-room remote: previous item",
+  "description": "Press the Up arrow key on a short-video or recommendation-feed page to switch to the previous item.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "media"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "media", "type": "media.control", "title": "Previous", "position": {"x": 340, "y": 160}, "params": {"action": "previous"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_media", "kind": "control", "source": "start", "source_port": "next", "target": "media", "target_port": "run"},
+    {"id": "media_to_end", "kind": "control", "source": "media", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-volume-down.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-volume-down",
+  "name": "Living-room remote: volume down",
+  "description": "Send the system volume-down key.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "media"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "media", "type": "media.control", "title": "Volume down", "position": {"x": 340, "y": 160}, "params": {"action": "volume_down"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_media", "kind": "control", "source": "start", "source_port": "next", "target": "media", "target_port": "run"},
+    {"id": "media_to_end", "kind": "control", "source": "media", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 17 - 0
workflows/entertainment-volume-up.workflow.json

@@ -0,0 +1,17 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "entertainment-volume-up",
+  "name": "Living-room remote: volume up",
+  "description": "Send the system volume-up key.",
+  "variables": {},
+  "settings": {"max_steps": 5, "default_timeout_ms": 15000, "on_unhandled_error": "pause_for_user", "return": {"node_id": "media"}},
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 160}, "params": {}, "inputs": {}},
+    {"id": "media", "type": "media.control", "title": "Volume up", "position": {"x": 340, "y": 160}, "params": {"action": "volume_up"}, "inputs": {}},
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 600, "y": 160}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_media", "kind": "control", "source": "start", "source_port": "next", "target": "media", "target_port": "run"},
+    {"id": "media_to_end", "kind": "control", "source": "media", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}
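
Reviewer note: the ten `entertainment-*` files above share one start, action, end shape and differ only in key, name, description, node id and type, title, and `action`. A sketch of generating them from a single template is given below. `make_remote_workflow` is a hypothetical helper, not something this commit contains, and the "Start"/"End" titles are illustrative display strings.

```python
from typing import Any


def make_remote_workflow(
    key: str, name: str, description: str,
    node_id: str, node_type: str, title: str, action: str,
) -> dict[str, Any]:
    """Build a start -> action -> end workflow dict (hypothetical helper)."""

    def node(nid: str, ntype: str, ntitle: str, x: int,
             params: dict[str, Any]) -> dict[str, Any]:
        return {"id": nid, "type": ntype, "title": ntitle,
                "position": {"x": x, "y": 160}, "params": params, "inputs": {}}

    def edge(source: str, port: str, target: str) -> dict[str, Any]:
        return {"id": f"{source}_to_{target}", "kind": "control",
                "source": source, "source_port": port,
                "target": target, "target_port": "run"}

    return {
        "schema_version": "workflow/v1",
        "workflow_key": key,
        "name": name,
        "description": description,
        "variables": {},
        # Same settings values as the committed entertainment-* files.
        "settings": {"max_steps": 5, "default_timeout_ms": 15000,
                     "on_unhandled_error": "pause_for_user",
                     "return": {"node_id": node_id}},
        "nodes": [
            node("start", "flow.start", "Start", 80, {}),
            node(node_id, node_type, title, 340, {"action": action}),
            node("end", "flow.end", "End", 600, {}),
        ],
        "edges": [edge("start", "next", node_id),
                  edge(node_id, "success", "end")],
    }


volume_up = make_remote_workflow(
    key="entertainment-volume-up", name="Volume up",
    description="Send the system volume-up key.",
    node_id="media", node_type="media.control",
    title="Volume up", action="volume_up",
)
print([e["id"] for e in volume_up["edges"]])
```

Keeping the files as plain JSON, as the commit does, has the advantage that each workflow can be imported or edited on its own; a generator of this kind mainly helps keep the shared settings in sync.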

+ 65 - 0
workflows/youtube-channel-latest-video.workflow.json

@@ -0,0 +1,65 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "youtube-channel-latest-video",
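+  "name": "Visual click: latest video from a specified YouTube channel",
+  "description": "Open the videos page of a specified YouTube channel, use multimodal AI to locate the cover or title of the topmost public video, and click it to play.",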
+  "variables": {
+    "channel_url": {
+      "type": "string",
+      "default": "https://www.youtube.com/@Google/videos",
+      "description": "URL of the YouTube channel's videos page; a URL ending in /videos is recommended"
+    }
+  },
+  "settings": {
+    "max_steps": 10,
+    "default_timeout_ms": 120000,
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "locate"}
+  },
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 180}, "params": {}, "inputs": {}},
+    {
+      "id": "open",
+      "type": "browser.open_url",
+      "title": "Open channel videos page",
+      "position": {"x": 300, "y": 180},
+      "params": {"browser": "edge", "new_window": true},
+      "inputs": {"url": {"source": "variable", "name": "channel_url"}}
+    },
+    {"id": "wait", "type": "wait.seconds", "title": "Wait for page to load", "position": {"x": 540, "y": 180}, "params": {"seconds": 6}, "inputs": {}},
+    {
+      "id": "locate",
+      "type": "vision.locate_element",
+      "title": "AI locates the latest video",
+      "position": {"x": 780, "y": 180},
+      "params": {
+        "target_description": "Locate the most recently published regular public video in the YouTube channel's video list, usually the center of the cover or title of the first video at the top-left of the video grid. Do not select Shorts, playlists, the channel avatar, the subscribe button, or navigation tabs.",
+        "screen_context": "YouTube channel /videos page.",
+        "randomize": false,
+        "save_screenshot": true,
+        "fail_if_not_found": true,
+        "temperature": 0.1
+      },
+      "inputs": {}
+    },
+    {
+      "id": "click",
+      "type": "mouse.click",
+      "title": "Click to play",
+      "position": {"x": 1040, "y": 180},
+      "params": {"button": "left", "clicks": 1, "duration": 0},
+      "inputs": {
+        "x": {"source": "node_output", "node_id": "locate", "output": "x"},
+        "y": {"source": "node_output", "node_id": "locate", "output": "y"}
+      }
+    },
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 1280, "y": 180}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_open", "kind": "control", "source": "start", "source_port": "next", "target": "open", "target_port": "run"},
+    {"id": "open_to_wait", "kind": "control", "source": "open", "source_port": "success", "target": "wait", "target_port": "run"},
+    {"id": "wait_to_locate", "kind": "control", "source": "wait", "source_port": "success", "target": "locate", "target_port": "run"},
+    {"id": "locate_to_click", "kind": "control", "source": "locate", "source_port": "success", "target": "click", "target_port": "run"},
+    {"id": "click_to_end", "kind": "control", "source": "click", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}

+ 66 - 0
workflows/youtube-home-random-video.workflow.json

@@ -0,0 +1,66 @@
+{
+  "schema_version": "workflow/v1",
+  "workflow_key": "youtube-home-random-video",
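+  "name": "Visual click: random YouTube homepage recommended video",
+  "description": "Open the YouTube homepage in a real browser, use multimodal AI to locate a random clickable video among the current account's homepage recommendations, then click it with the mouse to play.",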
+  "variables": {},
+  "settings": {
+    "max_steps": 10,
+    "default_timeout_ms": 120000,
+    "on_unhandled_error": "pause_for_user",
+    "return": {"node_id": "locate"}
+  },
+  "nodes": [
+    {"id": "start", "type": "flow.start", "title": "Start", "position": {"x": 80, "y": 180}, "params": {}, "inputs": {}},
+    {
+      "id": "open",
+      "type": "browser.open_url",
+      "title": "Open YouTube homepage",
+      "position": {"x": 300, "y": 180},
+      "params": {"url": "https://www.youtube.com/", "browser": "edge", "new_window": true},
+      "inputs": {}
+    },
+    {
+      "id": "wait",
+      "type": "wait.seconds",
+      "title": "Wait for page to load",
+      "position": {"x": 540, "y": 180},
+      "params": {"seconds": 6},
+      "inputs": {}
+    },
+    {
+      "id": "locate",
+      "type": "vision.locate_element",
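+      "title": "AI locates a random recommended video",
+      "position": {"x": 780, "y": 180},
+      "params": {
+        "target_description": "From the regular recommended videos visible on the current YouTube homepage, randomly select the center of one clickable video cover or title. Do not select Shorts, ads, channel avatars, the navigation bar, the search box, the sign-in button, or the notification button.",
+        "screen_context": "YouTube homepage recommendation feed; the current browser may be signed in to a user account.",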
+        "randomize": true,
+        "save_screenshot": true,
+        "fail_if_not_found": true,
+        "temperature": 0.2
+      },
+      "inputs": {}
+    },
+    {
+      "id": "click",
+      "type": "mouse.click",
+      "title": "Click to play",
+      "position": {"x": 1040, "y": 180},
+      "params": {"button": "left", "clicks": 1, "duration": 0},
+      "inputs": {
+        "x": {"source": "node_output", "node_id": "locate", "output": "x"},
+        "y": {"source": "node_output", "node_id": "locate", "output": "y"}
+      }
+    },
+    {"id": "end", "type": "flow.end", "title": "End", "position": {"x": 1280, "y": 180}, "params": {}, "inputs": {}}
+  ],
+  "edges": [
+    {"id": "start_to_open", "kind": "control", "source": "start", "source_port": "next", "target": "open", "target_port": "run"},
+    {"id": "open_to_wait", "kind": "control", "source": "open", "source_port": "success", "target": "wait", "target_port": "run"},
+    {"id": "wait_to_locate", "kind": "control", "source": "wait", "source_port": "success", "target": "locate", "target_port": "run"},
+    {"id": "locate_to_click", "kind": "control", "source": "locate", "source_port": "success", "target": "click", "target_port": "run"},
+    {"id": "click_to_end", "kind": "control", "source": "click", "source_port": "success", "target": "end", "target_port": "run"}
+  ]
+}