browser-use-local
# browser-use (local) playbook
## Default constraints in this environment
- Prefer **browser-use** (CLI/Python) over OpenClaw `browser` tool here; OpenClaw `browser` may fail if no supported system browser is present.
- Use **persistent sessions** to do multi-step flows: `--session <name>`.
## Quick CLI workflow (non-agent)
1) Open
```bash
browser-use --session demo open https://example.com
```
2) Inspect (sometimes `state` returns 0 elements on heavy/JS sites)
```bash
browser-use --session demo --json state | jq '.data | {url,title,elements:(.elements|length)}'
```
3) Screenshot (always works; best debugging primitive)
```bash
browser-use --session demo screenshot /home/node/.openclaw/workspace/page.png
```
4) HTML for link discovery (works even when `state` is empty)
```bash
browser-use --session demo --json get html > /tmp/page_html.json
python3 - <<'PY'
import json,re
html=json.load(open('/tmp/page_html.json')).get('data',{}).get('html','')
urls=set(re.findall(r"https?://[^\s\"'<>]+", html))
for u in sorted([u for u in urls if any(k in u for k in ['demo','login','console','qr','qrcode'])])[:200]:
print(u)
PY
```
5) Lightweight DOM queries via JS (useful when `state` is empty)
```bash
browser-use --session demo --json eval "location.href"
browser-use --session demo --json eval "document.title"
```
## Agent workflow with OpenAI-compatible LLM (Moonshot/Kimi)
Use Python for Agent runs when the CLI `run` path requires Browser-Use cloud keys or when you need strict control over LLM parameters.
### Minimal working Kimi example
Create `.env` (or export env vars) with:
- `OPENAI_API_KEY=...`
- `OPENAI_BASE_URL=https://api.moonshot.cn/v1`
Then run the bundled script:
```bash
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python /home/node/.openclaw/workspace/skills/browser-use-local/scripts/run_agent_kimi.py
```
**Kimi/Moonshot quirks observed in practice** (fixes):
- `temperature` must be `1` for `kimi-k2.5`.
- `frequency_penalty` must be `0` for `kimi-k2.5`.
- Moonshot can reject strict JSON Schema used for structured output. Enable:
- `remove_defaults_from_schema=True`
- `remove_min_items_from_schema=True`
If you get a 400 error mentioning `response_format.json_schema ... keyword 'default' is not allowed` or `min_items unsupported`, those two flags are the first thing to set.
## QR code extraction (login/demo pages)
### Preferred order
1) **Screenshot the page** and crop candidate regions (fast, robust).
2) If HTML contains `data:image/png;base64,...`, extract and decode it.
### Crop candidates
Use `scripts/crop_candidates.py` to generate multiple likely QR crops from a screenshot.
```bash
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
python skills/browser-use-local/scripts/crop_candidates.py \
--in /home/node/.openclaw/workspace/login.png \
--outdir /home/node/.openclaw/workspace/qr_crops
```
### Extract base64-embedded images from HTML
```bash
source /home/node/.openclaw/workspace/.venv-browser-use/bin/activate
browser-use --session demo --json get html > /tmp/page_html.json
python skills/browser-use-local/scripts/extract_data_images.py \
--in /tmp/page_html.json \
--outdir /home/node/.openclaw/workspace/data_imgs
```
## Troubleshooting
- **`state` shows `elements: 0`**: use `get html` + regex discovery, plus screenshots; use `eval` to query DOM.
- **Page readiness timeout warnings**: usually harmless; rely on screenshot + HTML.
- **CLI flags order**: global flags go *before* the subcommand:
- ✅ `browser-use --browser chromium --json open https://...`
- ❌ `browser-use open https://... --browser chromium`
标签
skill
ai