Browser Automation
Use this skill when you need the browser tool for anything beyond a single page check.
Operating Loop
- Check browser state before acting:
openclaw browser doctororaction="status"when the browser/plugin setup itself may be broken.action="status"for availability.action="profiles"if login state or profile choice matters.action="tabs"before opening a new tab if retries/timeouts may have left windows behind.
- Prefer stable tab handles:
- Open important tabs with
label, for examplelabel="meet". - After
action="tabs"oraction="open", storesuggestedTargetIdand pass it astargetIdin later calls. suggestedTargetIdis the label when one exists, otherwise the stabletabIdhandle liket1.- Avoid relying on raw DevTools
targetIdexcept for immediate diagnostics; it can change under Chromium target replacement.
- Open important tabs with
- Read before you click:
- Use
action="snapshot"on the intendedtargetId. - Use the same
targetIdfor follow-up actions so refs stay on the same tab. - For durable Playwright refs, request
refs="aria"when supported. If you receiveaxNrefs fromsnapshotFormat="aria", use them only after that same snapshot call; stale or unboundaxNrefs fail fast and need a fresh snapshot. - Use
urls=truewhen link text is ambiguous or a direct navigation target would avoid brittle clicks. - Use
labels=trueon snapshot or screenshot when visual position matters. On Playwright-backed profiles, the response includes anannotationsarray ({ref, number, role, name?, box}) with each ref's bounding box in the captured image's coordinate space, so you can reason about position without re-snapshotting; screenshot labels can also combine withfullPage=true(CLI:--full-page) to label the whole document, orref/elementto clip to one element.profile="user"and other existing-session (chrome-mcp) profiles render an overlay into page screenshots but do not attachannotationsor use the Playwright full-page/ref/element projection helper, so read positions from the labeled image itself on those profiles. The raw-CDP fallback (no Playwright) does not support labeled screenshots at all and returns a 501, so only requestlabelswhen Playwright is available.
- Use
- Act narrowly:
- Prefer
action="act"with a ref from the latest snapshot. - After navigation, modal changes, or form submission, snapshot again before the next action.
- Avoid blind waits. Wait for visible UI state when possible.
- Prefer
- Report real blockers:
- If the page needs login, permission, captcha, 2FA, camera/microphone approval, or another manual step, stop and tell the user exactly what is needed.
- Do not claim the browser is not logged in just because the current page shows a permission or onboarding dialog. Inspect the visible UI first.
Tab Hygiene
Before creating a tab for a named task, list tabs and reuse an existing matching label or URL when it is still usable.
Example:
{ "action": "tabs" }
If no suitable tab exists:
{ "action": "open", "url": "https://example.com", "label": "task" }
Then target it by label:
{ "action": "snapshot", "targetId": "task", "refs": "aria" }
If a retry creates duplicates, close the extras by tabId:
{ "action": "close", "targetId": "t3" }
Do not pass bare numbers like "2" as targetId. Numeric tab positions are only for the CLI openclaw browser tab select 2 helper; browser tool calls need a suggestedTargetId, label, tabId, or raw target id.
Stale Ref Recovery
If an action fails with a missing or stale ref:
- Snapshot the same
targetIdagain. - Find the current visible control.
- Retry once with the new ref.
- If the UI moved to a blocker state, report the blocker instead of looping.
Existing User Browser
Use profile="user" only when existing cookies/login matter. This attaches to the user's running Chromium-based browser.
For profile="user" and other existing-session profiles, omit timeoutMs on act:type, evaluate, hover, scrollIntoView, drag, select, and fill; that driver rejects per-call timeout overrides for those actions.
Google Meet Notes
When creating or joining a Meet:
- Treat camera/microphone permission screens as progress, not login failure.
- If asked whether people can hear you, click the microphone option when voice is required.
- If Google asks for sign-in, 2FA, account chooser confirmation, or permission that needs user approval, report the exact manual action.
- Use one labeled tab per meeting flow, for example
label="meet", and reuse it during retries.