Browserbase


---
name: browserbase
description: 'Browser automation, Fetch API, Search API, serverless Functions, and platform management for AI agents.'
compatibility: 'Node.js 18+. API key from https://browserbase.com/settings.'
license: MIT
allowed-tools: Bash
---

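The complete guide to using Browserbase with AI agents. It covers browser automation with the browse CLI, the Fetch API, the Search API, serverless Functions, and platform management with the bb CLI.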

Quick Setup

Before running any commands, present the user with a preliminary setup checklist:

Here's what I'll do to get you set up:

- [ ] Install/update prerequisites (Node.js, Browserbase CLI)
- [ ] Configure Browserbase credentials
- [ ] Set up your project
- [ ] Verify everything works

Shall I proceed?

Wait for the user to confirm before continuing.

Step 1 — Install the CLI:

npm install -g @browserbasehq/cli

Step 2 — Install agent skills:

bb skills --install

Step 3 — Set credentials:

Sign up at browserbase.com if you don't have an account. Then get your API key from browserbase.com/settings.

export BROWSERBASE_API_KEY="your_api_key"

Step 4 — Verify setup:

bb projects list

If this returns your project, you're ready. If this fails or BROWSERBASE_API_KEY is not set, direct the user to browserbase.com/settings to copy their API key and project ID, then:

export BROWSERBASE_API_KEY="their_key"

Do not proceed until bb projects list returns successfully.
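The verification gate above can be sketched in Python. This is a minimal sketch, not part of the Browserbase tooling: the function names `missing_prerequisites` and `verify_setup` are hypothetical, and it assumes that `bb projects list` exits with status 0 on success.

```python
import os
import shutil
import subprocess


def missing_prerequisites(env, which=shutil.which):
    """Return a list of problems; an empty list means setup can be verified."""
    problems = []
    if which("bb") is None:
        problems.append("bb CLI not found: run npm install -g @browserbasehq/cli")
    if not env.get("BROWSERBASE_API_KEY"):
        problems.append("BROWSERBASE_API_KEY is not set: copy it from browserbase.com/settings")
    return problems


def verify_setup(env=None):
    """Run `bb projects list` and report whether it succeeded.

    Assumption: a zero exit code means the command returned successfully.
    """
    env = dict(os.environ) if env is None else env
    if missing_prerequisites(env):
        return False
    result = subprocess.run(
        ["bb", "projects", "list"], env=env, capture_output=True, text=True
    )
    return result.returncode == 0
```

An agent could call `verify_setup()` after Step 3 and show the messages from `missing_prerequisites(os.environ)` to the user when it returns `False`.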

Step 5 (optional) — Install the browse CLI for browser automation:

npm install -g @browserbasehq/browse-cli

Choosing the Right Tool

| Task | Tool | Why |
|------|------|-----|
| Browse a website, click, type, scrape JS pages | browse CLI | Full browser with interaction |
| Get HTML/JSON from a static page | Fetch API | Fast, no browser needed |
| Find URLs for a topic | Search API | Structured results, no browsing |
| Run automation on a schedule or webhook | Functions | Serverless cloud execution |
| Manage sessions, projects, contexts | bb CLI | Platform administration |

Browser Automation

Automate browser interactions using the browse CLI.

Setup

which browse || npm install -g @browserbasehq/browse-cli

Environment Selection (Local vs Remote)

The CLI supports explicit per-session environment overrides. If you do nothing, the next session defaults to Browserbase when BROWSERBASE_API_KEY is set and to local otherwise.
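The default rule described above can be modelled as a small function. This is an illustrative sketch only, not the CLI's actual implementation: `resolve_environment` is a hypothetical name, and treating an empty `BROWSERBASE_API_KEY` as unset is an assumption.

```python
import os


def resolve_environment(override=None, env=None):
    """Return "local" or "remote" for the next browse session.

    An explicit override (as set by `browse env local` or `browse env remote`)
    wins; otherwise the presence of BROWSERBASE_API_KEY selects remote.
    """
    if override is not None:
        if override not in ("local", "remote"):
            raise ValueError(f"unknown environment: {override!r}")
        return override
    env = os.environ if env is None else env
    # Assumption: an empty value counts as "not set".
    return "remote" if env.get("BROWSERBASE_API_KEY") else "local"
```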

Local mode

Remote mode (Browserbase)

Core Commands

Navigation

browse open <url>                        # Go to URL
browse open <url> --context-id <id>      # Load Browserbase context (remote only)
browse open <url> --context-id <id> --persist  # Load context + save changes back
browse reload                            # Reload current page
browse back                              # Go back in history
browse forward                           # Go forward in history

Page State

browse snapshot                          # Accessibility tree with element refs (preferred)
browse screenshot [path]                 # Visual screenshot (slower, uses vision tokens)
browse get url                           # Current URL
browse get title                         # Page title
browse get text <selector>              # Text content ("body" for all text)
browse get html <selector>              # HTML content of element
browse get value <selector>             # Form field value

Use browse snapshot as your default for understanding page state. Only use browse screenshot when you need visual context.

Interaction

browse click <ref>                       # Click element by ref from snapshot (e.g., @0-5)
browse type <text>                       # Type into focused element
browse fill <selector> <value>           # Fill input and press Enter
browse select <selector> <values...>     # Select dropdown option(s)
browse press <key>                       # Press key (Enter, Tab, Escape, Cmd+A, etc.)
browse drag <fromX> <fromY> <toX> <toY>  # Drag between points
browse scroll <x> <y> <deltaX> <deltaY> # Scroll at coordinates
browse wait <type> [arg]                 # Wait for: load, selector, timeout
browse is visible <selector>             # Check if element is visible
browse is checked <selector>             # Check if element is checked

Session Management

browse stop                              # Stop the browser daemon (also clears env override)
browse status                            # Check daemon status (includes env)
browse env                               # Show current environment (local or remote)
browse env local                         # Use clean isolated local browser
browse env local --auto-connect          # Reuse existing Chrome, fallback to isolated
browse env local <port|url>              # Attach to a specific CDP target
browse env remote                        # Switch to Browserbase (requires API keys)
browse pages                             # List all open tabs
browse tab_switch <index>                # Switch to tab by index
browse tab_close [index]                 # Close tab

Advanced

browse eval <expression>                 # Evaluate JavaScript in page
browse viewport <width> <height>         # Set viewport size
browse network on                        # Start capturing network requests
browse network off                       # Stop capturing
browse highlight <selector>              # Highlight element for debugging
browse --json <command>                  # Output as JSON
browse --session <name> <command>        # Named sessions for multiple browsers

Typical Workflow

  1. browse open <url> — navigate to the page
  2. browse snapshot — read the accessibility tree and get element refs
  3. browse click <ref> / browse type <text> / browse fill <selector> <value> — interact
  4. browse snapshot — confirm the action worked
  5. Repeat 3-4 as needed
  6. browse stop — close the browser when done
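The workflow above could be driven from a script along these lines. It is a simplified sketch: the helper names are hypothetical, and the element ref is passed in up front, whereas a real agent would read the snapshot output to choose the ref before clicking.

```python
import subprocess


def browse_command(*args, session=None):
    """Build the argument list for one browse CLI call."""
    cmd = ["browse"]
    if session:
        # Named sessions use the documented `browse --session <name> <command>` form.
        cmd += ["--session", session]
    cmd += list(args)
    return cmd


def workflow_commands(url, ref, session=None):
    """Steps 1-4 of the typical workflow: open, snapshot, click, snapshot."""
    return [
        browse_command("open", url, session=session),
        browse_command("snapshot", session=session),
        browse_command("click", ref, session=session),
        browse_command("snapshot", session=session),
    ]


def run_workflow(url, ref, session=None):
    """Run the steps in order and always stop the browser afterwards."""
    try:
        for cmd in workflow_commands(url, ref, session):
            subprocess.run(cmd, check=True)
    finally:
        subprocess.run(browse_command("stop", session=session), check=False)
```

Putting `browse stop` in a `finally` block means the browser is closed even when one of the earlier steps fails.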

Mode Comparison

| Feature | Local | Browserbase |
|---------|-------|-------------|
| Speed | Faster | Slightly slower |
| Setup | Chrome required | API key required |
| Reuse existing local cookies | With browse env local --auto-connect | N/A |
| Stealth mode | No | Yes (custom Chromium, anti-bot fingerprinting) |
| CAPTCHA solving | No | Yes (automatic reCAPTCHA/hCaptcha) |
| Residential proxies | No | Yes (201 countries, geo-targeting) |
| Session persistence | No | Yes (cookies/auth persist via contexts) |
| Best for | Development/simple pages | Protected sites, bot detection, production scraping |

Troubleshooting


Fetch API

Fetch a page and return its content, headers, and metadata — no browser session required.

When to Use

Use Fetch for simple HTTP requests where you don't need JavaScript execution. Use the Browser skill when you need to interact with or render the page.

Using with cURL

curl -X POST "https://api.browserbase.com/v1/fetch" \
  -H "Content-Type: application/json" \
  -H "X-BB-API-Key: $BROWSERBASE_API_KEY" \
  -d '{"url": "https://www.browserbase.com"}'

Using with the bb CLI

bb fetch https://www.browserbase.com
bb fetch https://www.browserbase.com --allow-redirects
bb fetch https://www.browserbase.com --proxies --output page.html

Request Options

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| url | string | required | The URL to fetch |
| allowRedirects | boolean | false | Follow HTTP redirects |
| allowInsecureSsl | boolean | false | Bypass TLS verification |
| proxies | boolean | false | Enable proxy support |

Using with SDKs

Node.js / TypeScript:

npm install @browserbasehq/sdk

import {Browserbase} from '@browserbasehq/sdk'

const bb = new Browserbase({apiKey: process.env.BROWSERBASE_API_KEY})

const response = await bb.fetchAPI.create({
  url: 'https://www.browserbase.com',
  allowRedirects: true,
})

console.log(response.statusCode) // 200
console.log(response.content) // page HTML

Python:

pip install browserbase

from browserbase import Browserbase
import os

bb = Browserbase(api_key=os.environ["BROWSERBASE_API_KEY"])

response = bb.fetch_api.create(
    url="https://www.browserbase.com",
    allow_redirects=True,
)

print(response.status_code)  # 200
print(response.content)      # page HTML

Response

| Field | Type | Description |
|-------|------|-------------|
| id | string | Request identifier |
| statusCode | integer | HTTP status code |
| headers | object | Response headers |
| content | string | Response body |
| contentType | string | MIME type |
| encoding | string | Character encoding |

Error Handling

| Status | Meaning |
|--------|---------|
| 400 | Invalid request body |
| 429 | Concurrent request limit exceeded |
| 502 | Response too large or TLS verification failed |
| 504 | Request timed out (60s default) |
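For callers that prefer not to install the SDK, the request and the error statuses above can be handled with the Python standard library. This is a sketch under assumptions: the helper names are hypothetical, and it assumes the listed statuses are returned as the HTTP status of the API call itself.

```python
import json
import urllib.request

FETCH_URL = "https://api.browserbase.com/v1/fetch"

# Meanings copied from the error table above.
FETCH_ERRORS = {
    400: "Invalid request body",
    429: "Concurrent request limit exceeded",
    502: "Response too large or TLS verification failed",
    504: "Request timed out (60s default)",
}


def build_fetch_request(api_key, url, allow_redirects=False,
                        allow_insecure_ssl=False, proxies=False):
    """Build (but do not send) a POST request for the Fetch API."""
    payload = {
        "url": url,
        "allowRedirects": allow_redirects,
        "allowInsecureSsl": allow_insecure_ssl,
        "proxies": proxies,
    }
    return urllib.request.Request(
        FETCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-BB-API-Key": api_key},
        method="POST",
    )


def explain_fetch_status(status):
    """Map an API HTTP status to a short explanation."""
    if 200 <= status < 300:
        return "ok"
    return FETCH_ERRORS.get(status, f"Unexpected status {status}")
```

To send the request, pass it to `urllib.request.urlopen`. That call raises `urllib.error.HTTPError` for 4xx and 5xx responses, and the exception's `code` attribute can be passed to `explain_fetch_status`.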

Search API

Search the web and return structured results — no browser session required.

When to Use

Use Search to find URLs and metadata. Use Fetch to retrieve content from those URLs. Use Browser when you need to interact with the pages.

Using with cURL

curl -X POST "https://api.browserbase.com/v1/search" \
  -H "Content-Type: application/json" \
  -H "X-BB-API-Key: $BROWSERBASE_API_KEY" \
  -d '{"query": "browser automation", "numResults": 5}'

Using with the bb CLI

bb search "browser automation"
bb search "web scraping" --num-results 5
bb search "AI agents" --output results.json

Request Options

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| query | string | required | The search query |
| numResults | integer | 10 | Number of results (1-25) |

Response

Returns a JSON object containing:

| Field | Type | Description |
|-------|------|-------------|
| requestId | string | Unique identifier for the search request |
| query | string | The search query that was executed |
| results | array | List of search result objects |

Each result object contains:

| Field | Type | Description |
|-------|------|-------------|
| id | string | Result identifier |
| url | string | URL of the result |
| title | string | Title of the result |
| author | string? | Author (if available) |
| publishedDate | string? | Publication date (if available) |
| image | string? | Image URL (if available) |
| favicon | string? | Favicon URL (if available) |

Error Handling

| Status | Meaning |
|--------|---------|
| 400 | Invalid query or parameters |
| 403 | Invalid or missing API key |
| 429 | Rate limit exceeded |
| 500 | Internal server error |
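Validating the request locally avoids a round trip that would end in a 400, and pulling the URLs out of the response prepares them for Fetch. The sketch below assumes the request and response shapes documented above. Both helper names are hypothetical.

```python
def build_search_payload(query, num_results=10):
    """Build the JSON body for the Search API, enforcing the documented limits."""
    if not query or not query.strip():
        raise ValueError("query is required")
    if not 1 <= num_results <= 25:
        raise ValueError("numResults must be between 1 and 25")
    return {"query": query, "numResults": num_results}


def result_urls(response):
    """Extract result URLs from a parsed search response, skipping empty ones."""
    return [item["url"] for item in response.get("results", []) if item.get("url")]
```

The list returned by `result_urls` can then be passed one URL at a time to the Fetch API, following the Search-then-Fetch pattern described under When to Use.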

Functions

Deploy serverless browser automation as cloud functions.

Prerequisites

export BROWSERBASE_API_KEY="your_api_key"

Create a Function

pnpm dlx @browserbasehq/sdk-functions init my-function
cd my-function
pnpm install

Add credentials to .env:

echo "BROWSERBASE_API_KEY=$BROWSERBASE_API_KEY" >> .env

Function Structure

import {defineFn} from '@browserbasehq/sdk-functions'
import {chromium} from 'playwright-core'

defineFn('my-function', async (context) => {
  const {session, params} = context

  const browser = await chromium.connectOverCDP(session.connectUrl)
  const page = browser.contexts()[0]!.pages()[0]!

  await page.goto(params.url || 'https://example.com')
  const title = await page.title()

  return {success: true, title}
})

Development

pnpm bb dev index.ts                    # Start dev server at http://127.0.0.1:14113

Test locally:

curl -X POST http://127.0.0.1:14113/v1/functions/my-function/invoke \
  -H "Content-Type: application/json" \
  -d '{"params": {"url": "https://news.ycombinator.com"}}'

Deploy

pnpm bb publish index.ts                # Deploy to Browserbase

Save the Function ID from the output — you need it to invoke remotely.

Invoke Deployed Functions

# Via bb CLI
bb functions invoke <function_id> --params '{"url":"https://example.com"}'

# Via cURL
curl -X POST "https://api.browserbase.com/v1/functions/<function_id>/invoke" \
  -H "Content-Type: application/json" \
  -H "X-BB-API-Key: $BROWSERBASE_API_KEY" \
  -d '{"params": {"url": "https://example.com"}}'
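The local and deployed invocations differ only in base URL, identifier, and the API key header, so one helper can build both. This is a minimal sketch using the endpoints shown above. `build_invoke_request` is a hypothetical name, and the local dev server is assumed to need no API key, as in the local cURL example.

```python
import json
import urllib.request

LOCAL_BASE = "http://127.0.0.1:14113"        # dev server started by `pnpm bb dev`
REMOTE_BASE = "https://api.browserbase.com"  # deployed functions


def build_invoke_request(name_or_id, params, api_key=None, local=False):
    """Build (but do not send) an invoke request.

    Pass the function name for local dev and the Function ID for a deployed function.
    """
    headers = {"Content-Type": "application/json"}
    if not local:
        if not api_key:
            raise ValueError("api_key is required for deployed functions")
        headers["X-BB-API-Key"] = api_key
    base = LOCAL_BASE if local else REMOTE_BASE
    return urllib.request.Request(
        f"{base}/v1/functions/{name_or_id}/invoke",
        data=json.dumps({"params": params}).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```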

Quick Reference

| Command | Description |
|---------|-------------|
| pnpm dlx @browserbasehq/sdk-functions init <name> | Create new project |
| pnpm bb dev <file> | Start local dev server |
| pnpm bb publish <file> | Deploy to Browserbase |
| bb functions invoke <id> --params '{...}' | Invoke deployed function |
| bb functions invoke --check-status <invocation_id> | Poll invocation status |


Browserbase CLI

The bb CLI for platform management, Functions workflows, and API operations.

Setup

which bb || npm install -g @browserbasehq/cli
bb --help

Platform APIs

Sessions

bb sessions list
bb sessions list --q "user_metadata['userId']:'123'"
bb sessions create --proxies --advanced-stealth --region us-east-1
bb sessions create --solve-captchas --context-id ctx_abc --persist
bb sessions get <session_id>
bb sessions update <session_id> --status REQUEST_RELEASE
bb sessions debug <session_id>
bb sessions logs <session_id>
bb sessions recording <session_id>
bb sessions downloads get <session_id> --output session-artifacts.zip
bb sessions uploads create <session_id> ./file.txt
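When sessions are created from a script, assembling the flags in one place keeps the command consistent. The sketch below uses only the `bb sessions create` flags shown above. `sessions_create_command` is a hypothetical helper, and rejecting `--persist` without `--context-id` is an assumption based on the examples, not documented CLI behavior.

```python
def sessions_create_command(proxies=False, advanced_stealth=False,
                            solve_captchas=False, region=None,
                            context_id=None, persist=False):
    """Build the argument list for `bb sessions create`."""
    cmd = ["bb", "sessions", "create"]
    if proxies:
        cmd.append("--proxies")
    if advanced_stealth:
        cmd.append("--advanced-stealth")
    if solve_captchas:
        cmd.append("--solve-captchas")
    if region:
        cmd += ["--region", region]
    if context_id:
        cmd += ["--context-id", context_id]
    if persist:
        # Assumption: persisting only makes sense when a context is loaded.
        if not context_id:
            raise ValueError("--persist requires --context-id")
        cmd.append("--persist")
    return cmd
```

The resulting list can be passed to `subprocess.run` without shell quoting concerns.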

Projects

bb projects list
bb projects get <project_id>
bb projects usage <project_id>

Contexts

bb contexts create --body '{"region":"us-west-2"}'
bb contexts get <context_id>
bb contexts update <context_id>
bb contexts delete <context_id>

Extensions

bb extensions upload ./my-extension.zip
bb extensions get <extension_id>
bb extensions delete <extension_id>

Templates

bb templates list
bb templates list --language python
bb templates clone form-filling --language typescript
bb templates clone amazon-product-scraping --language python ./my-scraper

Common Flags

Platform API commands (sessions, projects, contexts, extensions, fetch, search):

Functions commands (bb functions ...):

Best Practices

  1. Use bb --help and subcommand --help before guessing flags.
  2. Use --output on fetch and search to save results to a file.
  3. Use environment variables for auth unless you need one-off overrides.
  4. Use --api-url for bb functions, --base-url for other API commands.

Troubleshooting


Safety Notes

Best Practices

  1. Start simple: Use Search to find URLs, Fetch to get content, Browser only when needed.
  2. Use browse snapshot over browse screenshot — it's faster and gives element refs.
  3. Use remote mode for protected sites and local mode for development.
  4. Set credentials via env vars rather than inline flags.
  5. Clean up: Always run browse stop when done with browser sessions.