
ml-ops

Deep MLOps workflow—reproducible training, experiment tracking, packaging, deployment, monitoring (drift, performance), governance, and rollback for ML. Use when shipping models to production or hardening ML pipelines.

Author: admin | Source: ClawHub
Version: V 1.0.0
Security check: Passed
Downloads: 97
Favorites: 0


# MLOps (Deep Workflow)

MLOps connects **research velocity** to **production reliability**: version **data**, **code**, and **artifacts** together; monitor **behavior** after deploy.

## When to Offer This Workflow

**Trigger conditions:**

- First production model; batch or online serving
- Drift, bias, or latency SLO misses
- Compliance needs for lineage and explainability

**Initial offer:**

Use **six stages**: (1) problem & risk class, (2) data & reproducibility, (3) training & evaluation, (4) packaging & deployment, (5) monitoring & feedback, (6) governance & rollback. Confirm batch vs real-time and regulatory tier.

---

## Stage 1: Problem & Risk Class

**Goal:** Align ML to decision risk (credit, health vs recommendation).

**Exit condition:** Offline and online success metrics defined.

---

## Stage 2: Data & Reproducibility

**Goal:** Snapshot training data; deterministic pipelines; PII handling.

### Practices

- Feature stores optional but valuable for consistency
- Secrets not in notebooks; orchestrated jobs

**Exit condition:** Run id reproduces artifact hash within agreed bounds.

---

## Stage 3: Training & Evaluation

**Goal:** Train/val/test without leakage; time-series splits handled with care.

### Practices

- Model card with limits and metrics
- Fairness slices where policy requires

---

## Stage 4: Packaging & Deployment

**Goal:** Immutable artifacts; canary or shadow before full cutover.

### Practices

- Model + preprocessing code version pinned together

**Exit condition:** Rollback to previous artifact id documented.

---

## Stage 5: Monitoring & Feedback

**Goal:** Data drift, concept drift, latency; business KPIs tied to model decisions.

### Practices

- Human review queue for low-confidence predictions when needed

---

## Stage 6: Governance & Rollback

**Goal:** Approvals for retrain/deploy; audit trail; A/B for big changes.
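
The Stage 4 exit condition (rollback to a previous artifact id) and the Stage 6 goal (approvals plus an audit trail) can be pictured together with a small sketch. Everything here is an assumption made for illustration: `ModelRegistry`, its method names, and the artifact ids are hypothetical, and the state lives in memory, whereas a real registry would persist it to durable storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Set, Tuple


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry, for illustration only."""

    approved: Set[str] = field(default_factory=set)
    deployed: List[str] = field(default_factory=list)  # artifact ids, oldest first
    audit_log: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def _record(self, action: str, artifact_id: str, actor: str) -> None:
        # Append-only trail: (UTC timestamp, action, artifact id, actor).
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, action, artifact_id, actor))

    def approve(self, artifact_id: str, approver: str) -> None:
        self.approved.add(artifact_id)
        self._record("approve", artifact_id, approver)

    def deploy(self, artifact_id: str, actor: str) -> None:
        # Governance gate: refuse any artifact without a recorded approval.
        if artifact_id not in self.approved:
            raise PermissionError(f"{artifact_id} has no recorded approval")
        self.deployed.append(artifact_id)
        self._record("deploy", artifact_id, actor)

    @property
    def live(self) -> Optional[str]:
        # The most recently deployed artifact id, or None before any deploy.
        return self.deployed[-1] if self.deployed else None

    def rollback(self, actor: str) -> str:
        # Drop the current artifact and return to the one deployed before it.
        if len(self.deployed) < 2:
            raise RuntimeError("no previous artifact to roll back to")
        self.deployed.pop()
        previous = self.deployed[-1]
        self._record("rollback", previous, actor)
        return previous


registry = ModelRegistry()
for artifact in ("model-v1", "model-v2"):
    registry.approve(artifact, approver="alice")
    registry.deploy(artifact, actor="bob")

registry.rollback(actor="bob")
print(registry.live)  # model-v1
```

Keeping the deploy history as an ordered list of immutable artifact ids is one possible design choice: rollback then only moves a pointer and never rebuilds a model, and every state change leaves an entry in the audit log.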
---

## Final Review Checklist

- [ ] Offline metrics aligned with business risk
- [ ] Data and code reproducibility
- [ ] Packaged artifacts with versioning and rollback
- [ ] Online monitoring and drift strategy
- [ ] Governance and approval path

## Tips for Effective Guidance

- Training-serving skew is a top bug; feature parity tests help.
- Offline accuracy ≠ online business outcome.
- Fairness needs explicit slices, not one headline number.

## Handling Deviations

- LLM-heavy products: lean on eval harnesses and prompt versioning (see **llm-evaluation**).
- Tiny teams: start with artifact registry + dashboards before a full feature store.
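
The feature parity test mentioned under the tips could look roughly like the sketch below. It assumes the training and serving pipelines each expose a function that maps a raw row to a dictionary of named features. The helper `find_feature_skew`, the three feature functions, and the sample rows are all hypothetical names invented for this example.

```python
import math
from typing import Callable, Dict, Iterable, List

Row = Dict[str, float]
FeatureFn = Callable[[Row], Dict[str, float]]


def find_feature_skew(
    rows: Iterable[Row],
    training_fn: FeatureFn,
    serving_fn: FeatureFn,
    rel_tol: float = 1e-6,
) -> List[str]:
    """Run both feature paths on the same rows and describe every mismatch."""
    mismatches: List[str] = []
    for i, row in enumerate(rows):
        train_feats = training_fn(row)
        serve_feats = serving_fn(row)
        for name in sorted(set(train_feats) | set(serve_feats)):
            if name not in train_feats or name not in serve_feats:
                mismatches.append(f"row {i}: '{name}' exists on only one path")
            elif not math.isclose(
                train_feats[name], serve_feats[name], rel_tol=rel_tol, abs_tol=1e-9
            ):
                mismatches.append(
                    f"row {i}: '{name}' {train_feats[name]!r} != {serve_feats[name]!r}"
                )
    return mismatches


# Hypothetical feature code paths, for illustration only.
def training_features(row: Row) -> Dict[str, float]:
    return {"log_amount": math.log1p(row["amount"])}


def serving_features(row: Row) -> Dict[str, float]:
    # Written differently, but numerically equivalent to the training path.
    return {"log_amount": math.log(row["amount"] + 1.0)}


def skewed_serving_features(row: Row) -> Dict[str, float]:
    # Bug: base-10 log instead of natural log.
    return {"log_amount": math.log10(row["amount"] + 1.0)}


sample_rows = [{"amount": 10.0}, {"amount": 250.0}]
assert find_feature_skew(sample_rows, training_features, serving_features) == []
print(len(find_feature_skew(sample_rows, training_features, skewed_serving_features)))  # 2
```

A check like this can run in CI against a fixed sample of rows, so that a change to one path fails the build before the two implementations drift apart.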

Tags: skill, ai

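Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the ml-ops-1776028739 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the ml-ops-1776028739 skill

Install via command line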

skillhub install ml-ops-1776028739

Download Zip package

Download ml-ops v1.0.0

File size: 1.89 KB | Published: 2026-4-13 11:04

v1.0.0 (latest), 2026-4-13 11:04
- Initial release of the "ml-ops" skill featuring a comprehensive MLOps workflow.
- Covers reproducible training, experiment tracking, packaging, deployment, monitoring (drift, performance), governance, and rollback.
- Introduces six workflow stages: problem & risk class, data & reproducibility, training & evaluation, packaging & deployment, monitoring & feedback, governance & rollback.
- Provides practical triggers, stage exit conditions, and a final review checklist.
- Includes tips for preventing common pitfalls and adapting practices for LLM products or small teams.
