evalpal

# EvalPal Skill Run AI agent evaluations inline. Trigger eval runs, poll for results, and list available evaluation definitions — all from chat. ## Prerequisites Set the following environment variables in your OpenClaw skill configuration: | Variable | Required | Description | | ----------------- | -------- | -------------------------------------------- | | `EVALPAL_API_KEY` | Yes | Your EvalPal API key (starts with `sk_`) | | `EVALPAL_API_URL` | No | Base URL (defaults to `https://evalpal.dev`) | Get your API key from **Settings → API Keys** at [evalpal.dev](https://evalpal.dev). ## Commands ### `/evalpal run --eval-id <ID>` Trigger an evaluation run and wait for results. **Usage:** ```bash bash scripts/run-eval.sh --eval-id <EVAL_DEFINITION_ID> ``` **What it does:** 1. Triggers a new eval run via the EvalPal API 2. Polls for completion with exponential backoff (up to 5 minutes) 3. Fetches and formats results as readable markdown **Example output:** ``` ✅ Episode Quality — PASSED (15/16) ├── Test Case tc_001: ✓ PASS ├── Test Case tc_002: ✓ PASS ├── Test Case tc_003: ✗ FAIL └── 12 more passed... Run ID: run_abc123 · 16 test cases · 47s ``` **Exit codes:** 0 = all passed, 1 = failures or error. ### `/evalpal status --run-id <ID>` Check the current status of a running evaluation. **Usage:** ```bash bash scripts/check-status.sh --run-id <RUN_ID> ``` **Example output:** ``` 📊 Run Status: run_abc123 Status: running Started: 2026-03-26T20:00:00Z ``` ### `/evalpal list` List available evaluation definitions across your projects. **Usage:** ```bash bash scripts/list-evals.sh [--project-id <PROJECT_ID>] ``` If `--project-id` is omitted, lists evals for all projects. **Example output:** ``` 📋 Evaluation Definitions Project: AI Workforce Lab abc123 Episode Quality Check def456 Factual Accuracy Eval Project: Customer Support Bot ghi789 Response Quality ``` ## Error Handling All scripts handle common error cases: | Scenario | Output | Exit Code | | --------------- | -------------------------------------------- | --------- | | No API key set | `Error: EVALPAL_API_KEY is not set` | 1 | | Invalid API key | `Error: Authentication failed (401)` | 1 | | Eval not found | `Error: Eval definition not found (404)` | 1 | | Rate limited | `Error: Rate limited — retry after Xs (429)` | 1 | | Timeout (5 min) | `Error: Evaluation timed out after 300s` | 1 | | Network error | `Error: Could not reach EvalPal API` | 1 | ## Security - The API key is read from `EVALPAL_API_KEY` environment variable only - Scripts never echo or log the API key - All API calls use HTTPS

evalpal

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

evalpal

evalpal

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement