OpenBrowse

Tools Reference

Complete reference for all browser agent tools.

The agent ships with a built-in set of tools. When you ask it to perform a task, it selects and chains the tools it needs on its own.

Security & Approvals

Sensitive actions (like executeOnPage or modifying memory) prompt for your explicit approval. Choose Allow once, or Always allow on this site to streamline repeat workflows.
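The difference between the two choices can be sketched as a small store of standing grants. This is an illustrative model only: the names ApprovalStore, ApprovalChoice, isPreApproved, and record are hypothetical, and keying grants by origin plus tool name is an assumption, since the scope of "this site" is not spelled out here.

```typescript
// Hypothetical model of the approval flow; none of these names come from OpenBrowse.
type ApprovalChoice = "allow_once" | "always_allow_on_site";

class ApprovalStore {
  // Assumption: standing grants are keyed by "<origin>::<tool>".
  private readonly standing = new Set<string>();

  private key(origin: string, tool: string): string {
    return `${origin}::${tool}`;
  }

  // True when an earlier "Always allow on this site" grant covers this call.
  isPreApproved(origin: string, tool: string): boolean {
    return this.standing.has(this.key(origin, tool));
  }

  // Record the user's choice; only the "always" choice is remembered.
  record(origin: string, tool: string, choice: ApprovalChoice): void {
    if (choice === "always_allow_on_site") {
      this.standing.add(this.key(origin, tool));
    }
  }
}

const approvals = new ApprovalStore();

approvals.record("https://example.com", "executeOnPage", "allow_once");
const afterOnce = approvals.isPreApproved("https://example.com", "executeOnPage");

approvals.record("https://example.com", "executeOnPage", "always_allow_on_site");
const afterAlways = approvals.isPreApproved("https://example.com", "executeOnPage");
const otherSite = approvals.isPreApproved("https://example.org", "executeOnPage");
```

In this sketch, "Allow once" leaves no trace, so the next sensitive call on the same site prompts again, while "Always allow on this site" suppresses the prompt only for that origin.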

Tab handles

Every tab-interacting tool takes an explicit tab argument — a stable handle (t1, t2, ...) that identifies which tab the call should act on. Handles are minted by navigate (when it opens a new tab) and selectTab (when binding an external tab into the conversation). Once minted, a handle keeps referring to the same tab on subsequent turns and across service-worker restarts.

The agent sees the live set of available handles in a ## Tabs in this conversation block injected into its system prompt every turn. Calling listTabs returns the same shape and is the way to discover tabs the user has open elsewhere.

To bootstrap a fresh conversation, the agent calls navigate({ url }) with no tab arg — that opens a new background tab and returns the new handle in the response.
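The stability guarantee described above can be pictured as a two-way map between handles and browser tab IDs that is serialized so it survives a restart. The following is a minimal sketch under stated assumptions: TabHandleRegistry and its methods are hypothetical, numeric tab IDs are assumed, and the actual persistence mechanism is not documented here.

```typescript
// Hypothetical registry; the real minting logic is not documented in this reference.
interface RegistryState {
  next: number;
  entries: [string, number][];
}

class TabHandleRegistry {
  private next = 1;
  private readonly byHandle = new Map<string, number>(); // handle -> browser tab id
  private readonly byTabId = new Map<number, string>(); // browser tab id -> handle

  // Mint a handle for a tab, or return the existing one so handles stay stable.
  bind(tabId: number): string {
    const existing = this.byTabId.get(tabId);
    if (existing !== undefined) return existing;
    const handle = `t${this.next++}`;
    this.byHandle.set(handle, tabId);
    this.byTabId.set(tabId, handle);
    return handle;
  }

  resolve(handle: string): number | undefined {
    return this.byHandle.get(handle);
  }

  // Called by JSON.stringify, so the mapping can be written to storage.
  toJSON(): RegistryState {
    return { next: this.next, entries: Array.from(this.byHandle.entries()) };
  }

  // Rebuild the registry after a restart from the stored state.
  static fromJSON(state: RegistryState): TabHandleRegistry {
    const registry = new TabHandleRegistry();
    registry.next = state.next;
    for (const [handle, tabId] of state.entries) {
      registry.byHandle.set(handle, tabId);
      registry.byTabId.set(tabId, handle);
    }
    return registry;
  }
}

const registry = new TabHandleRegistry();
const first = registry.bind(101);
const second = registry.bind(205);
const again = registry.bind(101); // same tab, same handle

// Simulate a restart: serialize, discard, and restore.
const restored = TabHandleRegistry.fromJSON(JSON.parse(JSON.stringify(registry)));
const third = restored.bind(309);
const stillSecond = restored.resolve("t2");
```

Persisting the counter along with the entries is what keeps a restored registry from reissuing a handle that is already in use.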

Browsing & Interaction

The agent navigates the web and interacts with elements just like a human user. Each tool below requires tab (except navigate, where omitting tab opens a new tab).

| Tool | Description |
| --- | --- |
| navigate | { url, tab? }. With tab: navigate that tab. Without tab: open a new background tab and return its handle. |
| snapshot | { tab, mode?, selector?, diff? }. Capture a structural accessibility-tree snapshot of a tab. |
| readPage | { tab }. Extract text, links, and metadata from a tab. |
| screenshot | { tab, annotate?, fullPage? }. Capture a visual screenshot of a tab. |
| scrollPage | { tab, direction, amount? }. Scroll up or down. |
| clickElement | { tab, target }. Click a specific element by its @ref (from a snapshot of THAT tab) or CSS selector. |
| typeInElement | { tab, target, text, clearFirst?, submit? }. Type into an input. submit: true presses Enter and waits for navigation. |
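To make the argument shapes concrete, here is a sketch of a short chain of calls expressed as typed data. The argument names come from the table above; everything else is an assumption for illustration: the ToolCall union, the opensNewTab helper, the example URL and selector, the boolean types of clearFirst and submit, and the idea that the first navigate returns the handle t1.

```typescript
// Illustrative types covering a subset of the tools; not the agent's real type definitions.
type ToolCall =
  | { tool: "navigate"; args: { url: string; tab?: string } }
  | { tool: "snapshot"; args: { tab: string } }
  | { tool: "readPage"; args: { tab: string } }
  | { tool: "clickElement"; args: { tab: string; target: string } }
  | {
      tool: "typeInElement";
      args: { tab: string; target: string; text: string; clearFirst?: boolean; submit?: boolean };
    };

// navigate is the only tool that may omit tab; doing so opens a new background tab.
function opensNewTab(call: ToolCall): boolean {
  return call.tool === "navigate" && call.args.tab === undefined;
}

// A hypothetical search flow. "t1" assumes the first navigate returned that handle.
const plan: ToolCall[] = [
  { tool: "navigate", args: { url: "https://example.com/search" } },
  { tool: "snapshot", args: { tab: "t1" } },
  {
    tool: "typeInElement",
    args: { tab: "t1", target: "input[name='q']", text: "opfs quota", clearFirst: true, submit: true },
  },
  { tool: "readPage", args: { tab: "t1" } },
];

const newTabCalls = plan.filter(opensNewTab).length;
const reusesTab = opensNewTab({ tool: "navigate", args: { url: "https://example.com", tab: "t1" } });
```

Modeling tab as required on every call except navigate lets the type checker reject a call that forgets it.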

Tab Management

| Tool | Description |
| --- | --- |
| listTabs | List all open tabs across windows and spaces; emits handles. |
| selectTab | { tab }. Bind an external tab into this conversation so it appears in the legend and can be passed as tab to other tools. |

Extraction

| Tool | Description |
| --- | --- |
| extract | { tab, instruction, selector?, schema? }. Extract structured data from a tab using its accessibility tree. Preferred over raw DOM-scraping via executeOnPage for text-based data like search results, product lists, table rows, or articles. URL fields ({ type: "string", format: "uri" }) are substituted with numeric IDs and rehydrated to prevent hallucination. |
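The URL substitution step can be sketched as a lookup table: real URLs are swapped for numeric IDs before the model sees them, and IDs in the model's output are mapped back afterwards, so a URL that was never on the page cannot appear in the result. This is a minimal sketch of that idea, not the tool's actual implementation; UrlTable, idFor, and rehydrate are hypothetical names, and returning null for an unknown ID is an assumption.

```typescript
// Hypothetical helper illustrating ID substitution and rehydration.
class UrlTable {
  private readonly urls: string[] = [];
  private readonly ids = new Map<string, number>();

  // Return a stable numeric ID for a URL, assigning a new one on first sight.
  idFor(url: string): number {
    const known = this.ids.get(url);
    if (known !== undefined) return known;
    const id = this.urls.length;
    this.urls.push(url);
    this.ids.set(url, id);
    return id;
  }

  // Map an ID from the model's output back to the real URL.
  // IDs that were never issued yield null instead of a guessed URL.
  rehydrate(id: number): string | null {
    return this.urls[id] ?? null;
  }
}

const table = new UrlTable();

// Step 1: URLs found on the page are replaced by IDs before extraction.
const pageLinks = ["https://example.com/a", "https://example.com/b"];
const issuedIds = pageLinks.map((href) => table.idFor(href));

// Step 2: the model answers in terms of IDs (the second entry is invented).
const modelOutput = [
  { title: "Result B", url: 1 },
  { title: "Made up", url: 7 },
];

// Step 3: IDs are rehydrated; the invented one cannot resolve to a URL.
const results = modelOutput.map((row) => ({ title: row.title, url: table.rehydrate(row.url) }));
```

Because the model only ever handles small integers, any URL in the final result is one that was actually collected from the page.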

Code Execution

The agent can execute code in three environments, depending on the task. Two are sandboxed; the third runs in the page itself and requires approval.

| Tool | Description | Security |
| --- | --- | --- |
| executeCode | Run JavaScript in an isolated worker (no DOM access). Best for data processing. | Safe (Isolated) |
| executeOnPage | { tab, code, args? }. Run JavaScript directly in a tab's page context. Use when you need DOM access. | Requires Approval |
| executePython | Run CPython 3 in Pyodide, bundled in a sandboxed iframe. The conversation's workspace is mounted at /workspace; /skills is mounted read-only. Optional network access via pyfetch. | Safe (Sandboxed) |

For Python specifically, the agent loads the bundled python-env skill on demand to pick up Pyodide-specific gotchas (no subprocess, mismatched import names like import fitz for pymupdf, top-level await, latin-1 PDF unicode pitfalls, etc.).

Filesystem

These tools operate on the conversation's virtual Workspace, which is backed by OPFS (the browser's Origin Private File System).

| Tool | Description |
| --- | --- |
| Read | Read a file. Returns lines prefixed with line numbers; supports offset + limit. Can also read skill bundle files via a /skills/<name>/… path. |
| Write | Create or overwrite a file. Auto-creates parent directories. |
| Edit | Exact string replacement in an existing file. |
| Glob | Find files matching a glob pattern (e.g. src/**/*.ts). |
| Grep | Regex search across file contents. Returns path:line:content. |
| LS | List files and directories in a folder. |
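The output shapes of Read and Grep can be approximated with two small functions over in-memory strings. This is a sketch only: readWithLineNumbers and grepFiles are hypothetical stand-ins, and treating offset as a 1-based starting line and separating the line number from the text with a tab are assumptions not confirmed by this reference. Only the path:line:content shape for Grep comes from the table above.

```typescript
// Hypothetical stand-in for Read: number the lines, then apply offset and limit.
// Assumption: offset is the 1-based first line to return; limit is a line count.
function readWithLineNumbers(content: string, offset = 1, limit = Infinity): string {
  const lines = content.split("\n");
  const start = Math.max(offset, 1) - 1;
  return lines
    .slice(start, start + limit)
    .map((line, i) => `${start + i + 1}\t${line}`)
    .join("\n");
}

// Hypothetical stand-in for Grep: return "path:line:content" for each matching line.
function grepFiles(files: Record<string, string>, pattern: RegExp): string[] {
  const hits: string[] = [];
  for (const path of Object.keys(files)) {
    const content = files[path] ?? "";
    content.split("\n").forEach((line, i) => {
      pattern.lastIndex = 0; // keep global or sticky regexes from carrying state between lines
      if (pattern.test(line)) hits.push(`${path}:${i + 1}:${line}`);
    });
  }
  return hits;
}

const windowed = readWithLineNumbers("alpha\nbeta\ngamma\ndelta", 2, 2);

const matches = grepFiles(
  { "src/a.ts": "foo\nbar", "src/b.ts": "baz\nfoo bar" },
  /foo/,
);
```

Line numbers in the windowed output refer to positions in the whole file rather than in the returned slice, which would let a later Grep hit or Edit refer to the same line.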

Skills

Skills let the agent load curated, on-demand instructions instead of carrying everything in its system prompt. See Skills for the full picture.

| Tool | Description |
| --- | --- |
| skill | Load a specific skill's SKILL.md into the conversation. |
| install_skill | Install a skill from a GitHub repository or URL. |
| create_skill | Author and persist a new skill into the local registry. |

To read a supporting file bundled with a skill (reference docs, scripts), the agent uses the Read tool with the skill's path, e.g. Read({ file_path: "/skills/<name>/references/<file>" }).

Task Planning

For complex, multi-step operations, the agent maintains a structured plan visible in the side panel's Cowork → Progress card.

| Tool | Description |
| --- | --- |
| todoWrite | Create or update the conversation's persistent task list. Status changes (pending → in_progress → completed) drive the Progress UI. |
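As a rough illustration of how status changes could drive a progress display, the sketch below derives a summary from a task list. The three status values come from the description above; the Todo shape (a content string plus a status) and the summarize helper are assumptions, not the tool's documented schema.

```typescript
type TodoStatus = "pending" | "in_progress" | "completed";

// Hypothetical item shape; the real todoWrite payload may carry other fields.
interface Todo {
  content: string;
  status: TodoStatus;
}

// Derive what a progress card might show: a done count and the task in flight.
function summarize(todos: Todo[]): { done: number; total: number; current: string | null } {
  const done = todos.filter((t) => t.status === "completed").length;
  const active = todos.find((t) => t.status === "in_progress");
  return { done, total: todos.length, current: active ? active.content : null };
}

const todos: Todo[] = [
  { content: "Open the pricing page", status: "completed" },
  { content: "Extract plan names and prices", status: "in_progress" },
  { content: "Write results to prices.csv", status: "pending" },
];

const progress = summarize(todos);
```

Each call replaces or updates the list, so the summary can be recomputed from the latest list rather than tracked incrementally.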

Memory

Persistent memory across conversations. See Memory for details.

| Tool | Description |
| --- | --- |
| saveMemory | Store a new piece of information. |
| recallMemory | Search previously saved memories. |
| updateMemory | Update an existing memory entry. |
| deleteMemory | Remove a memory entry. |

Connectors (MCP)

Any Connector you enable adds its own MCP tools to the agent's toolset dynamically. Names are namespaced by connector (e.g. github_search_issues, linear_create_issue).
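The naming scheme in the examples suggests a connector prefix joined to the tool name with an underscore. The sketch below assumes exactly that; the helpers namespacedName and addConnector are hypothetical, and the real registration logic may differ.

```typescript
// Assumption inferred from the examples: "<connector>_<tool>".
function namespacedName(connector: string, tool: string): string {
  return `${connector}_${tool}`;
}

// Add a connector's tools to the agent's toolset under namespaced names.
function addConnector(toolset: Set<string>, connector: string, tools: string[]): void {
  for (const tool of tools) {
    toolset.add(namespacedName(connector, tool));
  }
}

const toolset = new Set<string>(["navigate", "readPage"]);
addConnector(toolset, "github", ["search_issues"]);
addConnector(toolset, "linear", ["create_issue"]);

const hasGithub = toolset.has("github_search_issues");
const hasLinear = toolset.has("linear_create_issue");
```

Prefixing by connector keeps two connectors that expose a tool with the same short name from colliding in the toolset.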
