
multi-model-critique

Run complex prompts through a multi-model deliberation pipeline with structured self-improvement. Use when the user sets a complex flag (e.g., complex=true/complex) or asks for high-stakes, ambiguous, or long-form reasoning where one model is not enough. Produces outputs by: (1) parallel model runs, (2) cross-critique, (3) critique-driven revision, and (4) final synthesized answer with uncertainties and evidence notes.

Author: admin | Source: ClawHub
Version: V 1.0.1
Security check: Passed
Downloads: 338
Favorites: 0

# Multi-Model Critique

## Overview

Use this skill only for complex tasks. Route multiple models through the same 4-step loop (`Plan -> Execute -> Review -> Improve`), then run cross-critique and synthesis to produce a higher-quality final answer than any single-model draft.

## Trigger rule

Enable this skill only when the request explicitly sets `complex` to true (or equivalent wording such as “this is complex/deep”).

If `complex` is false, skip this skill and respond with normal single-model behavior.

## Inputs

Collect or confirm these inputs before execution:

- `complex`: boolean flag (must be true)
- `question`: user request
- `models`: list of ACP `agentId` values (typically 3)
- `constraints`: output format, language, length, deadlines, forbidden assumptions
- `ops`: optional runtime controls (`timeoutSec`, `maxRetries`, `maxRounds`, `budgetUsd`)

## File map (what each file does)

- `SKILL.md` (this file): orchestration policy, trigger conditions, and execution sequence.
- `references/prompt-templates.md`: reusable prompts for draft, critique, revision, and final synthesis (includes scoring rubric usage).
- `references/orchestration-template.md`: practical OpenClaw orchestration flow using `sessions_spawn`, `sessions_send`, and `sessions_history`.
- `references/output-schema.md`: machine-parseable JSON output schema for final result and per-model scoring.
- `scripts/build_round_prompts.py`: utility to generate per-model prompt files for repeated runs.
- `scripts/run_orchestration.py`: local helper that builds a run plan JSON (model mapping, round prompts, runtime settings).

## Workflow

### Step 1) Parallel draft round

Spawn one ACP session per model with the same task and constraints.

Per-model requirements:

- Follow the exact internal sequence: `Plan -> Execute -> Review -> Improve`
- Print all four sections explicitly
- End with `Draft Answer`

Use `sessions_spawn` with `runtime:"acp"` and explicit `agentId`.

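
The draft round in Step 1 can be sketched roughly as follows. This is a minimal illustration, not the skill's actual orchestration code: `send` is a hypothetical stand-in for whatever wraps `sessions_spawn` (its real call shape is not shown in this document), and `missing_sections` is an assumed helper that does only a plain substring check on the required section labels.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Section labels every draft must print, in this order (from Step 1).
REQUIRED_SECTIONS = ["Plan", "Execute", "Review", "Improve", "Draft Answer"]


def missing_sections(draft: str) -> list[str]:
    """Return required labels that are absent or appear out of order.

    Rough check only: labels are matched as case-sensitive substrings.
    """
    missing = []
    cursor = 0
    for label in REQUIRED_SECTIONS:
        found = draft.find(label, cursor)
        if found == -1:
            missing.append(label)
        else:
            cursor = found + len(label)
    return missing


def run_draft_round(
    models: list[str],
    prompt: str,
    send: Callable[[str, str], str],  # hypothetical: (agentId, prompt) -> draft text
) -> dict[str, str]:
    """Send the same prompt to every model in parallel, keyed by agentId."""
    with ThreadPoolExecutor(max_workers=max(1, len(models))) as pool:
        futures = {m: pool.submit(send, m, prompt) for m in models}
        return {m: f.result() for m, f in futures.items()}
```

A draft that fails the shape check could be treated as a failed model turn and retried, though that policy is an assumption rather than something the skill specifies.
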
### Step 2) Cross-critique round

Share peer `Draft Answer` outputs with each model and require structured critique:

- Strengths
- Weaknesses
- Missing assumptions/data
- Hallucination and confidence risks
- Concrete fix suggestions

Also require ranking of peer drafts with rationale.

### Step 3) Revision round

Send critique feedback back to each original model and request revision:

- Keep `Plan -> Execute -> Review -> Improve`
- Include `Changes from Critique`
- End with `Revised Answer`

### Step 4) Final synthesis round

Integrate revised answers into one user-facing output:

- Best final answer
- Why the synthesis is stronger than individual drafts
- Remaining uncertainties
- Optional next actions

## Scoring rubric (required in critique + synthesis)

Score each draft on a 1-5 scale:

- `accuracy`: factual correctness and internal consistency
- `coverage`: completeness against user request and constraints
- `evidence`: quality of assumptions and support
- `actionability`: usefulness for concrete decision/action

Default weighted score:

`0.40 * accuracy + 0.25 * coverage + 0.20 * evidence + 0.15 * actionability`

Use this score to justify rankings and the final selected direction.

## Prompting resources

- Use `references/prompt-templates.md` for canonical prompts.
- Use `scripts/build_round_prompts.py` when you need file-based prompt generation for repeated or batched runs.
- Use `scripts/run_orchestration.py` to generate a deterministic run-plan artifact for reproducible execution.
- Use `references/orchestration-template.md` for concrete OpenClaw tool-call flow.

## Required user-facing output shape

1. `Final Answer`
2. `Key Improvements from Critique`
3. `Uncertainties`
4. `Next Steps` (optional)

When machine consumption is needed, return JSON matching `references/output-schema.md`.

Do not expose private chain-of-thought. Provide concise reasoning summaries only.

## Failure handling

- One model fails: continue with remaining models and note reduced diversity.
- Two or more models fail: ask whether to retry or switch to single-model mode.
- Strong disagreement remains: present competing hypotheses and state what evidence would resolve them.

## Runtime defaults (recommended)

- `timeoutSec`: 180 per round per model
- `maxRetries`: 1 per failed model turn
- `maxRounds`: fixed at 4 (draft, critique, revision, synthesis)
- `budgetUsd`: optional hard stop when cost-sensitive
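
The default weighted score and the failure-handling rules described in this skill can be illustrated with a short sketch. The function names and the returned action strings are hypothetical labels chosen for this example; only the weights, the 1-5 scale, and the one-failure versus two-or-more-failures split come from the text above.

```python
# Default rubric weights from the "Scoring rubric" section.
WEIGHTS = {
    "accuracy": 0.40,
    "coverage": 0.25,
    "evidence": 0.20,
    "actionability": 0.15,
}


def weighted_score(scores: dict[str, int]) -> float:
    """Combine 1-5 rubric scores using the default weights."""
    for name in WEIGHTS:
        value = scores[name]
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def rank_drafts(rubric: dict[str, dict[str, int]]) -> list[tuple[str, float]]:
    """Sort drafts by weighted score, highest first."""
    scored = [(model, round(weighted_score(s), 2)) for model, s in rubric.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def failure_action(failed_models: int) -> str:
    """Map the number of failed models to the documented fallback.

    The returned strings are illustrative labels, not real API values.
    """
    if failed_models == 0:
        return "continue"
    if failed_models == 1:
        return "continue_with_reduced_diversity"
    return "ask_retry_or_single_model"
```

For example, a draft scored `accuracy=4`, `coverage=3`, `evidence=5`, `actionability=2` gets `0.40*4 + 0.25*3 + 0.20*5 + 0.15*2 = 3.65`.
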

Tags

skill ai

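Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw WorkBuddy QClaw Kimi Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the multi-model-critique-1776295297 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the multi-model-critique-1776295297 skill

Install via command line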

skillhub install multi-model-critique-1776295297

Download Zip package

⬇ Download multi-model-critique v1.0.1

File size: 11.3 KB | Published: 2026-4-16 18:09

v1.0.1 (latest) 2026-4-16 18:09
Security patch: validate and sanitize untrusted question/constraints inputs, block prompt-injection control phrases, validate model/agent mapping formats, and add runtime guardrails for orchestration plan generation.
