openclaw-security

# OpenClaw Security - PII Audit Skill Multi-region async PII detection engine for OpenClaw sessions. Detects 8 categories of sensitive personal data across 10 country/region jurisdictions and logs audit events locally as NDJSON. ## 中文速览（PII 审计） ### 基本信息 - 技能名称：`openclaw-security` - 能力：多区域异步 PII 检测，支持后台审计与本地合规留痕 ### 检测范围 - 8 类标签：`PHONE` / `EMAIL` / `PERSON_NAME` / `ADDRESS` / `PASSPORT` / `BANK_CARD` / `NATIONAL_ID` / `SOCIAL_ACCOUNT` - 10 区域：CN / US / AU / SG / MY / TH / ID / DE / UK / FR（支持 `+CC` 国际手机号） - 来源类型：`input` / `prompt` / `context` / `knowledge_base` ### 关键规则 - 风险分级：`high`（证件/银行卡或组合信息），`low`（单一弱标识） - 智能采样：`input` 100%（5m），`prompt` 20%（24h），`context` 20%（1h），`knowledge_base` 100%（24h） - 调用方无需自行判断是否跳过扫描；如需强制扫描，使用 `--no-cache` - 后台扫描禁止 `--text`，请使用 `--file` + `--delete-after-read` - 输入上限 32,768 字符，超限截断并记录 `truncated: true` - 审计结果本地 NDJSON 落盘，默认保留 7 天，可 `cleanup.py --dry-run` 先演练 ## Quick Start Scan via file (recommended for background / automated scans): ```powershell python scripts/audit_worker.py --session-id SESSION_001 --source-type input --file content.txt ``` Scan via file + auto-delete (secure temp-file workflow): ```powershell python scripts/audit_worker.py --session-id SESSION_001 --source-type input --file tmp_scan.txt --delete-after-read ``` Scan via stdin: ```powershell echo "张三的手机号是13812345678" | python scripts/audit_worker.py --session-id SESSION_001 --source-type input ``` Quick manual test (WARNING: content visible in process list): ```powershell python scripts/audit_worker.py --session-id S001 --source-type input --text "short test" --json ``` ## Source Types - `input` — User input text - `prompt` — System or user prompts - `context` — Conversation context - `knowledge_base` — Knowledge base content ## Detection Labels PHONE, EMAIL, PERSON_NAME, ADDRESS, PASSPORT, BANK_CARD, NATIONAL_ID, SOCIAL_ACCOUNT ## Supported Regions CN, US, AU, SG, MY, TH, ID, DE, UK, FR (+ INTL via +CC phone prefix) ## Risk Levels - **high**: NATIONAL_ID / PASSPORT / BANK_CARD detected, or combination of PERSON_NAME + contact info + ADDRESS - **low**: Single weak identifier (EMAIL, SOCIAL_ACCOUNT, PHONE alone) ## Smart Sampling The audit worker includes built-in smart sampling to efficiently handle large contexts: - **User input** (`input`): 100% scan rate, 5-min cache TTL — every user message is scanned, but identical repeats within 5 minutes are skipped. - **System prompts** (`prompt`): 20% scan rate, 24-hour cache TTL — prompts rarely change; first scan is cached for 24 hours. - **Conversation context** (`context`): 20% scan rate, 1-hour cache TTL — context overlaps heavily; only sample 1 in 5 submissions. - **Knowledge base** (`knowledge_base`): 100% first-scan rate, 24-hour cache TTL — static content is fully scanned once, then deduped for 24 hours. Bypass sampling for manual / forced scans: ```powershell python scripts/audit_worker.py --session-id S001 --source-type context --text "text" --no-cache ``` ## Async Audit Workflow When auditing session content as a background task: 1. **Respond to user first** — never block the main response for audit. 2. **Feed all content types** — the script internally decides whether to actually scan based on sampling config and cache. The Agent does not need to decide when to skip. 3. **Use temp-file + `--delete-after-read`** — NEVER pass content via `--text` in background scans. Write content to a temp file, pass `--file`, and let the script auto-delete it. 4. Run audit in background: ```powershell # Step 1: Write content to temp file (no PII in command-line args) $tmpFile = [System.IO.Path]::GetTempFileName() [System.IO.File]::WriteAllText($tmpFile, $userInput, [System.Text.Encoding]::UTF8) # Step 2: Background scan — script reads and deletes the temp file Start-Process -NoNewWindow -FilePath python -ArgumentList "scripts/audit_worker.py --session-id $sid --source-type input --file $tmpFile --delete-after-read" # Same pattern for other source types: $tmpPrompt = [System.IO.Path]::GetTempFileName() [System.IO.File]::WriteAllText($tmpPrompt, $systemPrompt, [System.Text.Encoding]::UTF8) Start-Process -NoNewWindow -FilePath python -ArgumentList "scripts/audit_worker.py --session-id $sid --source-type prompt --file $tmpPrompt --delete-after-read" ``` 5. Review results: `openclaw-security-audit/YYYY-MM-DD/events.ndjson` 6. All outcomes (detected, clean, skipped) are logged for complete audit trail. ## Retention Default: 7 days. Cleanup: ```powershell python scripts/cleanup.py --days 7 ``` Dry run first: ```powershell python scripts/cleanup.py --days 7 --dry-run ``` ## Input Size Limit Maximum input: **32,768 characters** (32K). Content exceeding this limit is truncated to the first 32K characters. The audit record carries `truncated: true` and original `input_chars` count. ## Audit Record Schema Every scan invocation writes an NDJSON record — including `clean` and `skipped` outcomes. Each NDJSON line contains: - `event_id` — UUID - `session_id` — Caller-provided session ID (required) - `source_type` — One of: input, prompt, context, knowledge_base - `status` — `detected`, `clean`, or `skipped` - `labels` — Array of detected PII types (detected only) - `regions` — Array of matched regions/country codes (detected only) - `risk_level` — high or low (detected only) - `matched_count` — Number of PII matches - `matches` — Array of {label, confidence, masked_preview, region} (detected only) - `content_hash` — SHA256 prefix for dedup (no raw content stored) - `input_chars` — Original input size in characters - `truncated` — Whether input was truncated to 32K - `created_at` — ISO 8601 UTC timestamp ## Safety Rules - NEVER store raw sensitive values — only masked previews + content hash - NEVER pass content via `--text` in background scans — use `--file` + `--delete-after-read` - Audit logs are local-only, never transmitted externally - All file I/O uses UTF-8 encoding explicitly, with file locking for concurrent safety - No external dependencies — stdlib only - Input capped at 32K characters to prevent resource exhaustion ## Configuration Environment variable override for audit output directory: ```powershell $env:OPENCLAW_AUDIT_DIR = "C:\path\to\custom\audit\dir" ``` See `references/patterns.md` for detection pattern details.

openclaw-security

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

openclaw-security