返回顶部
a

agent-learner

Benchmark and compare agent prompts and evaluation results. Use when tuning strategies, evaluating outputs, or comparing configurations.

作者: admin | 来源: ClawHub
源自
ClawHub
版本
V 2.0.2
安全检测
已通过
374
下载量
0
收藏
概述
安装方式
版本历史

agent-learner

# Agent Learner An AI toolkit for configuring, benchmarking, comparing, and optimizing agent prompts and evaluation results. Agent Learner provides persistent, file-based logging for each command category with timestamped entries, summary statistics, multi-format export, and full-text search across all records. ## Commands | Command | Description | |---------|-------------| | `configure` | Configure agent settings — log configuration entries or view recent ones | | `benchmark` | Benchmark agent performance — log benchmark results or view history | | `compare` | Compare agent outputs — log comparison data or view recent comparisons | | `prompt` | Prompt management — log prompt variations or view recent prompts | | `evaluate` | Evaluate agent outputs — log evaluation results or view history | | `fine-tune` | Fine-tune parameters — log fine-tuning sessions or view recent ones | | `analyze` | Analyze agent behavior — log analysis entries or view recent analyses | | `cost` | Cost tracking — log cost data or view recent cost entries | | `usage` | Usage monitoring — log usage metrics or view recent usage data | | `optimize` | Optimize configurations — log optimization runs or view history | | `test` | Test agent behavior — log test results or view recent tests | | `report` | Report generation — log report entries or view recent reports | | `stats` | Show summary statistics across all log categories (entry counts, data size, first entry date) | | `export <fmt>` | Export all data in json, csv, or txt format to the data directory | | `search <term>` | Full-text search across all log files (case-insensitive) | | `recent` | Show the 20 most recent entries from the activity history log | | `status` | Health check — show version, data directory, total entries, disk usage, and last activity | | `help` | Show the full help message with all available commands | | `version` | Print the current version string | Each data command (configure, benchmark, compare, etc.) works in two modes: - **Without arguments**: displays the 20 most recent entries from that category - **With arguments**: saves the input as a new timestamped entry and reports the total count ## Data Storage All data is stored in plain text files under the data directory: - **Category logs**: `$DATA_DIR/<command>.log` — one file per command (e.g., `configure.log`, `benchmark.log`, `prompt.log`), each entry is `timestamp|value` - **History log**: `$DATA_DIR/history.log` — audit trail of every command executed with timestamps - **Export files**: `$DATA_DIR/export.<fmt>` — generated by the `export` command in json, csv, or txt format Default data directory: `~/.local/share/agent-learner/` ## Requirements - Bash (with `set -euo pipefail` support) - Standard Unix utilities: `grep`, `cat`, `date`, `echo`, `wc`, `du`, `head`, `tail`, `basename` - No external dependencies or API keys required ## When to Use 1. **Benchmarking agent performance** — When you need to track and compare benchmark results across different agent configurations, models, or prompt strategies 2. **Prompt engineering iteration** — When you're testing multiple prompt variations and want to log each version with results for later comparison 3. **Cost and usage tracking** — When you need to monitor API costs and usage metrics over time to optimize spending 4. **Fine-tuning experiments** — When running fine-tuning sessions and you want to log parameters, results, and observations for reproducibility 5. **Cross-category analysis** — When you need to search across all logged data (benchmarks, prompts, evaluations, costs) to find patterns or specific entries ## Examples ```bash # Initialize and check status agent-learner status # Log a benchmark result agent-learner benchmark "GPT-4o on MMLU: 88.7% accuracy, 1.2s avg latency" # Log a prompt variation agent-learner prompt "System: You are a helpful coding assistant. Always explain your reasoning step by step." # Compare two configurations agent-learner compare "GPT-4o vs Claude-3.5: GPT-4o 12% faster, Claude 5% more accurate on code tasks" # Track costs agent-learner cost "March batch: 12,450 tokens input, 3,200 tokens output, $0.47 total" # View all recent benchmarks agent-learner benchmark # Search across all logs for a specific term agent-learner search "accuracy" # Export all data as JSON agent-learner export json # View summary statistics agent-learner stats # Show recent activity agent-learner recent ``` ## Output All commands return output to stdout. Export files are written to the data directory: ```bash agent-learner export json # → ~/.local/share/agent-learner/export.json agent-learner export csv # → ~/.local/share/agent-learner/export.csv agent-learner export txt # → ~/.local/share/agent-learner/export.txt ``` Every command execution is logged to `$DATA_DIR/history.log` for auditing purposes. --- Powered by BytesAgain | bytesagain.com | hello@bytesagain.com

标签

skill ai

通过对话安装

该技能支持在以下平台通过对话安装:

OpenClaw WorkBuddy QClaw Kimi Claude

方式一:安装 SkillHub 和技能

帮我安装 SkillHub 和 agent-learner-1776159915 技能

方式二:设置 SkillHub 为优先技能安装源

设置 SkillHub 为我的优先技能安装源,然后帮我安装 agent-learner-1776159915 技能

通过命令行安装

skillhub install agent-learner-1776159915

下载 Zip 包

⬇ 下载 agent-learner v2.0.2

文件大小: 4.86 KB | 发布时间: 2026-4-15 10:34

v2.0.2 最新 2026-4-15 10:34
update

Archiver·手机版·闲社网·闲社论坛·羊毛社区· 多链控股集团有限公司 · 苏ICP备2025199260号-1

Powered by Discuz! X5.0   © 2024-2025 闲社网·线报更新论坛·羊毛分享社区·http://xianshe.com

p2p_official_large
返回顶部