browser-automation

Use when controlling web pages with the OpenClaw browser tool, especially multi-step flows, login checks, tab management, or recovery from stale refs/timeouts.

browser-automation 是什么？

browser-automation is a Claude Code agent skill that use when controlling web pages with the OpenClaw browser tool, especially multi-step flows, login checks, tab management, or recovery from stale refs/timeouts.

兼容平台~Claude Code~Codex CLI~Cursor

npx skills add https://github.com/clawdbot/clawdbot/tree/main/extensions/browser/skills/browser-automation

文档

Browser Automation

Use this skill when you need the browser tool for anything beyond a single page check.

Operating Loop

Check browser state before acting:
- openclaw browser doctor or action="status" when the browser/plugin setup itself may be broken.
- action="status" for availability.
- action="profiles" if login state or profile choice matters.
- action="tabs" before opening a new tab if retries/timeouts may have left windows behind.
Prefer stable tab handles:
- Open important tabs with label, for example label="meet".
- After action="tabs" or action="open", store suggestedTargetId and pass it as targetId in later calls.
- suggestedTargetId is the label when one exists, otherwise the stable tabId handle like t1.
- Avoid relying on raw DevTools targetId except for immediate diagnostics; it can change under Chromium target replacement.
Read before you click:
- Use action="snapshot" on the intended targetId.
- Use the same targetId for follow-up actions so refs stay on the same tab.
- For durable Playwright refs, request refs="aria" when supported. If you receive axN refs from snapshotFormat="aria", use them only after that same snapshot call; stale or unbound axN refs fail fast and need a fresh snapshot.
- Use urls=true when link text is ambiguous or a direct navigation target would avoid brittle clicks.
- Use labels=true on snapshot or screenshot when visual position matters. On Playwright-backed profiles, the response includes an annotations array ({ref, number, role, name?, box}) with each ref's bounding box in the captured image's coordinate space, so you can reason about position without re-snapshotting; screenshot labels can also combine with fullPage=true (CLI: --full-page) to label the whole document, or ref / element to clip to one element. profile="user" and other existing-session (chrome-mcp) profiles render an overlay into page screenshots but do not attach annotations or use the Playwright full-page/ref/element projection helper, so read positions from the labeled image itself on those profiles. The raw-CDP fallback (no Playwright) does not support labeled screenshots at all and returns a 501, so only request labels when Playwright is available.
Act narrowly:
- Prefer action="act" with a ref from the latest snapshot.
- After navigation, modal changes, or form submission, snapshot again before the next action.
- Avoid blind waits. Wait for visible UI state when possible.
Report real blockers:
- If the page needs login, permission, captcha, 2FA, camera/microphone approval, or another manual step, stop and tell the user exactly what is needed.
- Do not claim the browser is not logged in just because the current page shows a permission or onboarding dialog. Inspect the visible UI first.

Tab Hygiene

Before creating a tab for a named task, list tabs and reuse an existing matching label or URL when it is still usable.

Example:

{ "action": "tabs" }

If no suitable tab exists:

{ "action": "open", "url": "https://example.com", "label": "task" }

Then target it by label:

{ "action": "snapshot", "targetId": "task", "refs": "aria" }

If a retry creates duplicates, close the extras by tabId:

{ "action": "close", "targetId": "t3" }

Do not pass bare numbers like "2" as targetId. Numeric tab positions are only for the CLI openclaw browser tab select 2 helper; browser tool calls need a suggestedTargetId, label, tabId, or raw target id.

Stale Ref Recovery

If an action fails with a missing or stale ref:

Snapshot the same targetId again.
Find the current visible control.
Retry once with the new ref.
If the UI moved to a blocker state, report the blocker instead of looping.

Existing User Browser

Use profile="user" only when existing cookies/login matter. This attaches to the user's running Chromium-based browser.

On macOS, action="importprofile" is the alternative when the agent should use an isolated managed browser with cookies copied from a real Chrome-family profile. First use action="profiles" and inspect systemProfiles, then import into a fresh managed profile name. Import asks for one Keychain/Touch ID consent prompt. It copies cookies, not local storage or IndexedDB; device-bound session credentials (DBSC) mean some Google sessions may still require re-authentication.

For profile="user" and other existing-session profiles, omit timeoutMs on act:type, hover, scrollIntoView, drag, select, and fill; that driver rejects per-call timeout overrides for those actions. act:evaluate accepts timeoutMs.

Google Meet Notes

When creating or joining a Meet:

Treat camera/microphone permission screens as progress, not login failure.
If asked whether people can hear you, click the microphone option when voice is required.
If Google asks for sign-in, 2FA, account chooser confirmation, or permission that needs user approval, report the exact manual action.
Use one labeled tab per meeting flow, for example label="meet", and reuse it during retries.

browser-automation

browser-automation 是什么？

在你喜欢的 AI 中提问

文档

Browser Automation

Operating Loop

Tab Hygiene

Stale Ref Recovery

Existing User Browser

Google Meet Notes

Individual skills in this repo

1password

acp-router

agent-transcript

apple-notes

apple-reminders

autoreview

bear-notes

blacksmith-testbox

blogwatcher

blucli

bluebubbles

camsnap

canvas

channel-message-flows

clawdtributor

clawhub

claw-score

clawsweeper

clownfish-cloud-pr

codex-review

相关技能

steipete/bluebubbles

steipete/eightctl

steipete/blucli

steipete/bear-notes

steipete/camsnap

steipete/gifgrep