
safari-control

Use Safari directly on macOS when work must happen in the user's real Safari session instead of a separate automation browser. Best for reading the current tab, inspecting the live session layout, operating Safari menu bar or native toolbar controls, reading page text and structure, running page JavaScript, waiting for page conditions, exporting page artifacts, and performing lightweight DOM interactions in the active Safari tab. If Safari JavaScript from Apple Events is disabled, guide the user

Author: admin | Source: ClawHub
Version: 1.0.1 | Security check: Passed | Downloads: 78 | Favorites: 1

# Safari Control

Use this skill when the task must happen in **Safari itself**:

- The user is already logged in there
- Safari extensions, cookies, or profiles matter
- You need the current tab, tab list, page URL, or page text
- A headless browser would not reflect the real session

Do **not** force Safari to behave like Playwright. Prefer Safari's session access and DOM helpers first, Safari native UI automation second, and `desktop-control` only for pointer-heavy UI.

## Core Tool

Run commands from the `safari-control` skill directory so the relative `scripts/` path resolves correctly.

```bash
swift scripts/safari_control.swift <command>
```

## Recommended Flow

Use this exact flow unless the task clearly starts in one specific layer:

1. Check environment and active page:

   ```bash
   swift scripts/safari_control.swift doctor
   swift scripts/safari_control.swift current --json
   ```

   `doctor` is the branch point:

   - `safari_js=false`: do not use DOM commands until the user enables JavaScript from Apple Events.
   - `accessibility=false`: do not use native Safari chrome commands.
   - `screenshot_background=false`: use `screenshot --mode foreground` when you need an image.
   - If Safari is missing required permissions or features, tell the user before proceeding. In particular, check whether Safari allows Automation, JavaScript in the Smart Search field, and JavaScript from Apple Events. Enabling these unlocks more functionality, but also increases risk because automation and scripted interactions gain more power over the user's Safari session. If the user refuses, do not keep pushing the recommendation; continue with the reduced feature set.

2. If Safari session state matters, inspect it before mutating anything:

   ```bash
   swift scripts/safari_control.swift list-windows --json
   swift scripts/safari_control.swift list-tabs --json
   ```

3. If you need page content or web-page actions, verify DOM access:

   ```bash
   swift scripts/safari_control.swift check-js
   swift scripts/safari_control.swift snapshot --interactive-only --limit 30
   ```

4. Choose the lowest layer that fits:

   - Prefer **DOM commands** for page content, forms, links, and standard web controls.
   - Use **native Safari controls or menus** for the smart search field, toolbar buttons, tab groups, and menu bar commands.
   - Use **desktop-control** only for extension popups, native pickers, pointer-heavy drag/drop, or UI that Accessibility cannot expose.
   - When you want to draw the user's attention to important content on the page, you may use JavaScript to visually highlight it, such as adding a red outline, background tint, or temporary annotation.

## Command Layers

### 1. Environment and Session

Use for machine-readable state and tab/window management.

```bash
swift scripts/safari_control.swift doctor
swift scripts/safari_control.swift current --json
swift scripts/safari_control.swift list-windows --json
swift scripts/safari_control.swift list-tabs --json
swift scripts/safari_control.swift save-session ./session.json --front-only
swift scripts/safari_control.swift restore-session ./session.json
swift scripts/safari_control.swift new-window https://example.com
swift scripts/safari_control.swift switch-window 2
swift scripts/safari_control.swift switch-tab 2 --window 1
swift scripts/safari_control.swift switch-tab-title 'Dashboard'
swift scripts/safari_control.swift switch-tab-url '/dashboard'
swift scripts/safari_control.swift duplicate-tab
swift scripts/safari_control.swift close-tab --tab 2 --window 1
swift scripts/safari_control.swift close-window --window 2
```

Rules:

- Prefer `--json` for agent-readable output.
- Most read-only page commands support `--window N --tab M`, so you can inspect background tabs without switching focus.
- Run `save-session` before invasive work when preserving the user's layout matters.
- `restore-session` is additive: it recreates windows and tabs; it does not merge with or close current ones.

### 2. Page Read and Inspect

Use when JavaScript from Apple Events is enabled and the target is inside the web page.

```bash
swift scripts/safari_control.swift snapshot --limit 20 --heading-limit 10 --form-limit 5
swift scripts/safari_control.swift snapshot --interactive-only --limit 30
swift scripts/safari_control.swift interactive --json
swift scripts/safari_control.swift query 'form input' --json
swift scripts/safari_control.swift element-info 'button[type="submit"]'
swift scripts/safari_control.swift find-text 'Buy now' --json
swift scripts/safari_control.swift exists 'button[data-testid="submit"]'
swift scripts/safari_control.swift count 'button'
swift scripts/safari_control.swift get-text --mode article
swift scripts/safari_control.swift extract-links --json
swift scripts/safari_control.swift extract-tables --json
swift scripts/safari_control.swift run-js 'document.title'
swift scripts/safari_control.swift eval-js 'document.querySelectorAll("a").length'
```

Rules:

- Prefer `snapshot` before manual probing on unfamiliar pages.
- Prefer `snapshot --interactive-only` when you only need actionable controls and want smaller output.
- Prefer `get-text --mode article` before `body`; it is usually less noisy.
- Prefer `element-info` when a selector exists but behaves unexpectedly.

### 3. Page Actions and Waits

Use for standard DOM interactions after confirming the target is inside the page.
```bash
swift scripts/safari_control.swift focus 'input[name="email"]'
swift scripts/safari_control.swift focus-text 'Email' --selector 'input, textarea, select'
swift scripts/safari_control.swift fill 'input[name="email"]' 'user@example.com'
swift scripts/safari_control.swift click 'button[type="submit"]'
swift scripts/safari_control.swift click-text 'Continue' --exact
swift scripts/safari_control.swift select-option 'select[name="country"]' CN --by value
swift scripts/safari_control.swift check '#agree'
swift scripts/safari_control.swift uncheck '#agree'
swift scripts/safari_control.swift upload 'input[type="file"]' ./avatar.png
swift scripts/safari_control.swift submit 'form'
swift scripts/safari_control.swift wait-selector 'form' --visible
swift scripts/safari_control.swift wait-text 'Success'
swift scripts/safari_control.swift wait-count '.result-item' 10 --op ge
swift scripts/safari_control.swift wait-title 'Success'
swift scripts/safari_control.swift wait-url '/success'
swift scripts/safari_control.swift wait-download 'report-*.csv'
swift scripts/safari_control.swift reload
swift scripts/safari_control.swift back
swift scripts/safari_control.swift forward
```

Rules:

- Prefer `exists`, `count`, `wait-*`, and `snapshot` to branch explicitly before acting.
- After an action, wait on a concrete condition instead of sleeping.
- Use `press-key` and `press-shortcut` only for page-level DOM handlers, not Safari chrome.
- `back`, `forward`, and `reload` return structured before/after state for change detection.

### 4. Native Safari UI

Use when the target is Safari chrome, not the page.
```bash
swift scripts/safari_control.swift list-menu-bar --json
swift scripts/safari_control.swift list-menu-items View --json
swift scripts/safari_control.swift click-menu View 'Reload Page'
swift scripts/safari_control.swift list-native-controls --json
swift scripts/safari_control.swift focus-native-control WEB_BROWSER_ADDRESS_AND_SEARCH_FIELD --field identifier --exact
swift scripts/safari_control.swift set-native-value WEB_BROWSER_ADDRESS_AND_SEARCH_FIELD 'https://example.com' --field identifier --exact
swift scripts/safari_control.swift press-native-control ReloadButton --field identifier --exact
swift scripts/safari_control.swift perform-native-action WEB_BROWSER_ADDRESS_AND_SEARCH_FIELD AXConfirm --field identifier --exact
swift scripts/safari_control.swift list-native-menu-items NewTabGroupButton --field identifier --exact --json
swift scripts/safari_control.swift click-native-menu-item NewTabGroupButton 'New Empty Tab Group' --field identifier --exact
swift scripts/safari_control.swift native-open-url https://example.com
swift scripts/safari_control.swift native-search 'open claw'
swift scripts/safari_control.swift native-search 'open claw' --confirm-mode both
swift scripts/safari_control.swift press-system-shortcut Cmd+L
swift scripts/safari_control.swift press-system-key Enter
```

Rules:

- The menu examples in this section assume Safari is running with an English UI.
- Start with `list-native-controls --json` before guessing identifiers.
- The native control JSON now includes `pressable`, `focusable`, `settable`, and `menuable` fields. Use those before choosing `press-native-control`, `focus-native-control`, `set-native-value`, or menu actions.
- `native-open-url` and `native-search` default to `--confirm-mode ax`. Use `enter` or `both` only if Safari's native confirm path is insufficient on that machine.
- Use `list-menu-items` or `list-native-menu-items` before clicking when Safari is localized.
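As one way to apply the last rule above, the sketch below lists a menu's items first and clicks only when the expected title is present. It is a sketch under assumptions, not part of the tool: the `sc` wrapper and the `click_menu_if_present` helper are hypothetical names, and it assumes that `list-menu-items --json` prints each item title as a quoted JSON string, which this document does not specify.

```shell
# Hypothetical wrapper so the script path is written only once.
sc() {
  swift scripts/safari_control.swift "$@"
}

# Click MENU > ITEM only if ITEM shows up in the listed menu items.
# Assumption: the --json output contains the item title as a quoted string.
click_menu_if_present() {
  menu="$1"
  item="$2"
  if sc list-menu-items "$menu" --json | grep -qF "\"$item\""; then
    sc click-menu "$menu" "$item"
  else
    echo "menu item not found: $menu > $item" >&2
    return 1
  fi
}
```

A call such as `click_menu_if_present View 'Reload Page'` would then fail loudly on a localized Safari instead of clicking a menu title that does not exist.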
## Common Task Templates

### Read the current page

```bash
swift scripts/safari_control.swift doctor
swift scripts/safari_control.swift check-js
swift scripts/safari_control.swift snapshot --interactive-only --limit 30
swift scripts/safari_control.swift get-text --mode article
```

### Fill a form in the real Safari session

```bash
swift scripts/safari_control.swift current --json
swift scripts/safari_control.swift snapshot --interactive-only --limit 30
swift scripts/safari_control.swift fill 'input[name="email"]' 'user@example.com'
swift scripts/safari_control.swift click 'button[type="submit"]'
swift scripts/safari_control.swift wait-url '/success'
```

### Operate Safari's own address bar or toolbar

```bash
swift scripts/safari_control.swift doctor
swift scripts/safari_control.swift list-native-controls --json
swift scripts/safari_control.swift native-open-url https://example.com
swift scripts/safari_control.swift press-native-control ReloadButton --field identifier --exact
```

## Export and Evidence

Use when you need durable files instead of transient terminal output.

```bash
swift scripts/safari_control.swift screenshot ./safari.png
swift scripts/safari_control.swift screenshot ./safari-foreground.png --mode foreground
swift scripts/safari_control.swift snapshot-with-screenshot ./safari.png --path ./snapshot.json
swift scripts/safari_control.swift save-html ./page.html
swift scripts/safari_control.swift save-text ./page.txt --mode article
swift scripts/safari_control.swift save-links ./links.json
swift scripts/safari_control.swift save-tables ./tables.json
swift scripts/safari_control.swift save-snapshot ./snapshot.json --interactive-only
swift scripts/safari_control.swift save-page-bundle ./page-bundle --interactive-only --zip
```

Rules:

- `auto` screenshot mode tries background capture first and falls back to foreground.
- `snapshot-with-screenshot` and `save-page-bundle` can read a background tab via `--window` / `--tab`, but the screenshot still comes from the visible Safari window.

## JavaScript Access Requirement

If `check-js` reports disabled, tell the user to enable:

1. Safari Settings
2. Advanced
3. Developer features if needed
4. Developer
5. `Allow JavaScript from Apple Events`

Then retry `check-js`.

If native Safari automation or script-driven workflows are blocked, also tell the user to verify Safari settings such as:

1. Allow Automation
2. Allow JavaScript in the Smart Search field
3. Allow JavaScript from Apple Events

Explain that these settings enable more Safari-control features, but they also grant more power to automation and scripted interactions. If the user declines, continue with only the features that remain available.

## Release and Distribution

These are maintenance commands, not the normal interaction path:

```bash
swift scripts/safari_control.swift build ./dist
swift scripts/safari_control.swift build ./dist --zip
swift scripts/safari_control.swift release
swift scripts/safari_control.swift release --name safari-control-demo --notes 'Internal preview build'
swift scripts/safari_control.swift version
```

Use these when you want a compiled binary, a release directory, zip artifacts, or build metadata with git and environment information.

## Limits

- Safari scripting is strong for session state, structured page inspection, waits, and standard DOM interactions.
- Safari scripting is not a robust replacement for full browser automation.
- Complex form workflows, native pickers, extension UIs, canvas apps, and pointer-heavy interactions should move to `desktop-control`.
- `upload` is for small files injected into `input[type=file]`, not large native picker workflows.
- `wait-download` is filesystem polling, not a Safari download API.
- `export-cookies` exposes `document.cookie`, not HttpOnly cookies or the full Safari cookie jar.
- `export-storage` only sees the current origin.
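
To make the `wait-download` limit concrete, the following is a minimal sketch of what filesystem polling looks like in general. It is not the tool's implementation: the function name `poll_for_file`, its arguments, and the one-second interval are assumptions chosen for illustration.

```shell
# Poll DIR until a file matching the glob PATTERN appears,
# or give up once TIMEOUT seconds (default 30) have elapsed.
# Prints the first matching path on success; returns 1 on timeout.
poll_for_file() {
  dir="$1"
  pattern="$2"
  timeout="${3:-30}"
  elapsed=0
  while :; do
    # $pattern is left unquoted on purpose so the shell expands the glob.
    for candidate in "$dir"/$pattern; do
      # An unmatched glob stays literal, so confirm the file really exists.
      if [ -f "$candidate" ]; then
        printf '%s\n' "$candidate"
        return 0
      fi
    done
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
```

Because a loop like this only observes the directory, it cannot tell which tab or request produced the file. That is the practical meaning of "not a Safari download API".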

Tags: skill, ai

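Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the safari-control-1776050702 skill

Option 2: Set SkillHub as the preferred skill source

Set SkillHub as my preferred skill source, then help me install the safari-control-1776050702 skill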

Install via command line

skillhub install safari-control-1776050702

Download Zip package

safari-control v1.0.1

File size: 39.12 KB | Published: 2026-4-14 11:18

v1.0.1 (latest) 2026-4-14 11:18
## Changelog

### Initial release

`safari-control` is now available on ClawHub.

This first release brings full real-session Safari automation to macOS, built for workflows where the user's actual Safari state matters: logged-in sessions, cookies, open tabs, native Safari chrome, and live page content inside the real browser.

### What’s included

- Real Safari session control
  - Read the current tab, inspect windows and tabs, switch tabs, duplicate tabs, open new tabs or windows, close tabs or windows, and save or restore Safari session layouts
- Native Safari UI automation
  - List Safari menu bar items and nested menu items
  - Click Safari menu commands
  - Inspect native Safari controls such as the smart search field, toolbar buttons, tab controls, and popup controls
  - Focus controls, press them, set values, open native menus, and invoke specific AX actions such as `AXConfirm`
- Live page inspection and DOM automation
  - Run JavaScript in Safari tabs
  - Evaluate expressions as JSON-friendly output
  - Read page text, HTML, links, tables, interactive elements, and form state
  - Query selectors, inspect elements, find visible text, and check element existence or counts
  - Perform DOM actions including click, fill, focus, hover, scroll, drag, select, check, uncheck, upload, submit, keypress, shortcut dispatch, and custom event dispatch
- Real Safari wait and export workflows
  - Wait for selectors, text, titles, URLs, JavaScript conditions, DOM mutations, element counts, and downloads
  - Save HTML, extracted text, links, tables, snapshots, screenshots, and complete page bundles with manifests
- Packaging and release support
  - Build standalone binaries
  - Create release folders and zip archives
  - Generate build and release manifests with checksums, git metadata, and environment metadata

### v1 improvements included in this release

- Structured JSON-based Safari session inspection for current tab, windows, and tabs
- Rich native control metadata including `pressable`, `focusable`, `settable`, and `menuable`
- Configurable smart search confirmation modes with `ax`, `enter`, and `both`
- Post-action tab-state waiting for native URL open and native search flows
- Agent-friendly workflow documentation organized by session, DOM, native UI, and export layers
- Public-facing docs updated to use relative paths and English UI examples

### Important note about Safari settings

To unlock more advanced functionality, users may need to enable Safari capabilities such as:

- Allow Automation
- Allow JavaScript in the Smart Search field
- Allow JavaScript from Apple Events

These settings make more of `safari-control` available, especially for native Safari automation and script-driven workflows. They also increase risk, because automation and scripting gain more power over the user’s real Safari session. Users should enable them intentionally and only when they understand that tradeoff.
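
When some of these settings stay disabled, the reduced feature set can be worked out from the `doctor` flags named in the skill text (`safari_js`, `accessibility`, `screenshot_background`). The sketch below is illustrative only: it assumes `doctor` prints plain `key=value` lines, which the documentation implies but does not guarantee, and `doctor_flag` and `plan_layers` are hypothetical helper names.

```shell
# Print the value of one key from doctor-style "key=value" text on stdin.
# Assumption: one flag per line, for example "safari_js=false".
doctor_flag() {
  sed -n "s/^[[:space:]]*$1=\([A-Za-z]*\).*/\1/p" | head -n 1
}

# Given the doctor report as a single string, print which layers stay usable.
# A missing flag is treated as unavailable, which is the cautious reading.
plan_layers() {
  report="$1"
  js=$(printf '%s\n' "$report" | doctor_flag safari_js)
  ax=$(printf '%s\n' "$report" | doctor_flag accessibility)
  bg=$(printf '%s\n' "$report" | doctor_flag screenshot_background)

  if [ "$js" = "true" ]; then echo "dom: available"; else echo "dom: unavailable"; fi
  if [ "$ax" = "true" ]; then echo "native-ui: available"; else echo "native-ui: unavailable"; fi
  if [ "$bg" = "true" ]; then echo "screenshot: auto"; else echo "screenshot: --mode foreground"; fi
}
```

A call such as `plan_layers "$(swift scripts/safari_control.swift doctor)"` would then give an agent a short summary to branch on before it touches the page or Safari chrome.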
