
evalpal

Run AI agent evaluations via EvalPal — trigger eval runs, check results, and

Author: admin | Source: ClawHub
Version: 1.0.1
Security check: Passed
Downloads: 95
Favorites: 0

# EvalPal Skill

Run AI agent evaluations inline. Trigger eval runs, poll for results, and list available evaluation definitions, all from chat.

## Prerequisites

Set the following environment variables in your OpenClaw skill configuration:

| Variable          | Required | Description                                  |
| ----------------- | -------- | -------------------------------------------- |
| `EVALPAL_API_KEY` | Yes      | Your EvalPal API key (starts with `sk_`)     |
| `EVALPAL_API_URL` | No       | Base URL (defaults to `https://evalpal.dev`) |

Get your API key from **Settings → API Keys** at [evalpal.dev](https://evalpal.dev).

## Commands

### `/evalpal run --eval-id <ID>`

Trigger an evaluation run and wait for results.

**Usage:**

```bash
bash scripts/run-eval.sh --eval-id <EVAL_DEFINITION_ID>
```

**What it does:**

1. Triggers a new eval run via the EvalPal API
2. Polls for completion with exponential backoff (up to 5 minutes)
3. Fetches and formats results as readable markdown

**Example output:**

```
✅ Episode Quality — PASSED (15/16)
├── Test Case tc_001: ✓ PASS
├── Test Case tc_002: ✓ PASS
├── Test Case tc_003: ✗ FAIL
└── 12 more passed...

Run ID: run_abc123 · 16 test cases · 47s
```

**Exit codes:** 0 = all passed, 1 = failures or error.

### `/evalpal status --run-id <ID>`

Check the current status of a running evaluation.

**Usage:**

```bash
bash scripts/check-status.sh --run-id <RUN_ID>
```

**Example output:**

```
📊 Run Status: run_abc123
Status: running
Started: 2026-03-26T20:00:00Z
```

### `/evalpal list`

List available evaluation definitions across your projects.

**Usage:**

```bash
bash scripts/list-evals.sh [--project-id <PROJECT_ID>]
```

If `--project-id` is omitted, lists evals for all projects.

**Example output:**

```
📋 Evaluation Definitions

Project: AI Workforce Lab
  abc123  Episode Quality Check
  def456  Factual Accuracy Eval

Project: Customer Support Bot
  ghi789  Response Quality
```

## Error Handling

All scripts handle common error cases:

| Scenario        | Output                                       | Exit Code |
| --------------- | -------------------------------------------- | --------- |
| No API key set  | `Error: EVALPAL_API_KEY is not set`          | 1         |
| Invalid API key | `Error: Authentication failed (401)`         | 1         |
| Eval not found  | `Error: Eval definition not found (404)`     | 1         |
| Rate limited    | `Error: Rate limited — retry after Xs (429)` | 1         |
| Timeout (5 min) | `Error: Evaluation timed out after 300s`     | 1         |
| Network error   | `Error: Could not reach EvalPal API`         | 1         |

## Security

- The API key is read from the `EVALPAL_API_KEY` environment variable only
- Scripts never echo or log the API key
- All API calls use HTTPS
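The polling step of `/evalpal run` (exponential backoff, up to 5 minutes) could look roughly like the sketch below. It is not the contents of `scripts/run-eval.sh`. The function names `next_delay` and `poll_until_done`, the 2-second initial delay, the 30-second cap, and the rule that any status other than `running` counts as finished are all assumptions made for illustration. Only the 300-second limit, the `running` status, and the timeout message come from the text above.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: hypothetical helpers, not the skill's real script.

# next_delay CURRENT MAX
# Doubles the current delay (seconds) and caps it at MAX.
next_delay() {
  local current="$1" max="$2"
  local doubled=$(( current * 2 ))
  if [ "$doubled" -gt "$max" ]; then
    echo "$max"
  else
    echo "$doubled"
  fi
}

# poll_until_done STATUS_CMD [TIMEOUT] [INITIAL_DELAY] [MAX_DELAY]
# STATUS_CMD is the name of any command or function that prints the run
# status on stdout. Assumption: every status other than "running" is final.
poll_until_done() {
  local status_cmd="$1"
  local timeout="${2:-300}" delay="${3:-2}" max_delay="${4:-30}"
  local elapsed=0 state
  while [ "$elapsed" -lt "$timeout" ]; do
    state=$("$status_cmd")
    if [ "$state" != "running" ]; then
      echo "$state"            # report the final status to the caller
      return 0
    fi
    sleep "$delay"
    elapsed=$(( elapsed + delay ))
    delay=$(next_delay "$delay" "$max_delay")
  done
  echo "Error: Evaluation timed out after ${timeout}s" >&2
  return 1
}
```

Because `poll_until_done` only needs a command that prints a status, it can be tried offline with a stub such as `fake_status() { echo completed; }` followed by `poll_until_done fake_status`.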

Tags

skill ai

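Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the evalpal-1775996230 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the evalpal-1775996230 skill

Install via command line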

skillhub install evalpal-1775996230

Download Zip package

evalpal v1.0.1

File size: 7.59 KB | Published: 2026-4-13 10:11

v1.0.1 (latest) 2026-4-13 10:11
Declare EVALPAL_API_KEY env var and curl/jq binary requirements in registry metadata
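
The requirements this release declares (the `EVALPAL_API_KEY` variable plus the `curl` and `jq` binaries) can be verified before invoking the scripts. The following is a minimal sketch rather than code from the package: the function name `check_prereqs`, the wording of the missing-binary message, and the choice to write errors to stderr are assumptions. The missing-key message matches the one listed under Error Handling.

```bash
#!/usr/bin/env bash
# Illustrative sketch only: a hypothetical preflight check, not shipped code.

# check_prereqs [BINARY...]
# Verifies that each named binary is on PATH (defaults to curl and jq)
# and that EVALPAL_API_KEY is set to a non-empty value.
check_prereqs() {
  if [ "$#" -eq 0 ]; then
    set -- curl jq
  fi
  local bin
  for bin in "$@"; do
    if ! command -v "$bin" >/dev/null 2>&1; then
      # Message wording is an assumption; the source does not define it.
      echo "Error: required binary '$bin' not found on PATH" >&2
      return 1
    fi
  done
  if [ -z "${EVALPAL_API_KEY:-}" ]; then
    echo "Error: EVALPAL_API_KEY is not set" >&2
    return 1
  fi
  return 0
}
```

Calling `check_prereqs` with no arguments checks `curl` and `jq`; a non-zero return signals that something is missing, in line with the exit code 1 the scripts use for errors.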
