data-model

Deep data modeling workflow—grain, facts and dimensions, keys, slowly changing dimensions, normalization trade-offs, and analytics query patterns. Use when designing warehouse/analytics models or reviewing star/snowflake schemas.

Author: admin | Source: ClawHub
Version: V1.0.0 | Security check: Passed | Downloads: 153 | Favorites: 0

# Data Model

Analytics models succeed when **grain** is explicit, **keys** are stable, and **slowly changing dimensions** are chosen deliberately, not "star schema by default."

## When to Offer This Workflow

**Trigger conditions:**

- Designing a warehouse, lakehouse, or BI layer
- Confusion about **one row per what**; duplicate counts in reports
- Refactoring dimensional models for performance or clarity

**Initial offer:**

Use **six stages**: (1) business questions & grain, (2) conformed dimensions, (3) facts & measures, (4) dimensions & SCD types, (5) keys & integrity, (6) performance & evolution. Confirm **tooling** (dbt, dimensional DW, BigQuery, etc.).

---

## Stage 1: Business Questions & Grain

**Goal:** State the **grain**, the atomic row of each fact table: e.g., "one line item per order per day," not "sort of per order."

### Practices

- List the **questions** the model must answer; **derive** the grain from the **smallest** detail those questions need

**Exit condition:** One sentence grain per fact table.

---

## Stage 2: Conformed Dimensions

**Goal:** Use the **same** customer/product definitions across facts, through **shared** dimension tables or an aligned **SCD** policy.

---

## Stage 3: Facts & Measures

**Goal:** Document which measures are **additive**, **semi-additive**, or **non-additive** (e.g., balances are semi-additive; distinct counts are non-additive).

### Practices

- Distinguish **degenerate** dimensions from junk dimensions; **avoid wide fact sprawl without reason**

---

## Stage 4: Dimensions & SCD Types

**Goal:** Choose **SCD1** (overwrite), **SCD2** (history with `valid_from`/`valid_to`), or **SCD3** (limited history) to **match compliance and reporting needs**.

---

## Stage 5: Keys & Integrity

**Goal:** **Surrogate** keys in facts; **natural** keys preserved as attributes; a **referential** integrity strategy in the warehouse layer.

---

## Stage 6: Performance & Evolution

**Goal:** **Partition** and **cluster** keys for large facts; a **late-arriving** facts policy; **versioned dims** when the schema evolves.
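Stages 4 through 6 meet when a late-arriving fact has to be keyed to the dimension version that was valid at event time, not at load time. The following is a minimal Python sketch of that lookup under stated assumptions: the `CustomerVersion` record, the `DIM_CUSTOMER` rows, the `UNKNOWN_SK` placeholder, and the half-open `[valid_from, valid_to)` interval convention are all hypothetical choices for illustration, not something this skill prescribes.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerVersion:
    """One SCD2 row: a single version of a customer (hypothetical schema)."""
    customer_sk: int           # surrogate key, unique per version
    customer_id: str           # natural key, preserved as an attribute
    segment: str               # tracked attribute whose history we keep
    valid_from: date           # inclusive start of validity
    valid_to: Optional[date]   # exclusive end; None marks the current version


# Illustrative dimension rows (made-up data).
DIM_CUSTOMER: List[CustomerVersion] = [
    CustomerVersion(1, "C-100", "SMB", date(2024, 1, 1), date(2024, 6, 1)),
    CustomerVersion(2, "C-100", "Enterprise", date(2024, 6, 1), None),
]

# Assumed policy: facts with no matching version point at a placeholder row.
UNKNOWN_SK = -1


def resolve_customer_sk(
    dim: List[CustomerVersion], customer_id: str, event_date: date
) -> int:
    """Return the surrogate key of the version valid on event_date."""
    for row in dim:
        if row.customer_id != customer_id:
            continue
        starts_ok = row.valid_from <= event_date
        ends_ok = row.valid_to is None or event_date < row.valid_to
        if starts_ok and ends_ok:
            return row.customer_sk
    return UNKNOWN_SK


# A fact dated 2024-03-15 that arrives months late still maps to version 1.
late_fact_sk = resolve_customer_sk(DIM_CUSTOMER, "C-100", date(2024, 3, 15))
```

Keying on the event date keeps historical reports stable when a customer changes segment. Whether unmatched facts go to a placeholder row, a quarantine table, or an inferred dimension member is a policy decision to record alongside the grain statement.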
---

## Final Review Checklist

- [ ] Grain explicit per fact table
- [ ] Conformed dimensions planned
- [ ] Measure additivity documented
- [ ] SCD strategy per critical dimension
- [ ] Keys and late-arriving data handled

## Tips for Effective Guidance

- **Fan traps** and **chasm traps** in BI: flag them when a query joins across facts incorrectly.
- Use **snapshot** fact tables for **point-in-time** balances and **transaction** facts for individual events.

## Handling Deviations

- **Event**-only pipelines: still model **curated dimensions for analysis**, not only raw JSON.
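The fan trap mentioned in the tips can be illustrated with a small Python sketch. It assumes two made-up fact tables at different grains (`orders`, one row per order, and `shipments`, one row per shipment); every name and number here is hypothetical. Joining them directly repeats each order total once per shipment, while aggregating the finer-grained fact to order grain first keeps the totals correct.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical fact rows: orders at order grain, shipments at shipment grain.
orders: List[dict] = [
    {"order_id": "O1", "order_total": 100.0},
    {"order_id": "O2", "order_total": 50.0},
]
shipments: List[dict] = [
    {"shipment_id": "S1", "order_id": "O1", "shipping_cost": 5.0},
    {"shipment_id": "S2", "order_id": "O1", "shipping_cost": 7.0},
    {"shipment_id": "S3", "order_id": "O2", "shipping_cost": 4.0},
]


def naive_join_total(orders: List[dict], shipments: List[dict]) -> float:
    """Sum order_total after a row-level join: the fan trap."""
    total = 0.0
    for o in orders:
        for s in shipments:
            if s["order_id"] == o["order_id"]:
                total += o["order_total"]  # repeated once per shipment
    return total


def shipping_by_order(shipments: List[dict]) -> Dict[str, float]:
    """Roll shipments up to order grain before combining facts."""
    totals: Dict[str, float] = defaultdict(float)
    for s in shipments:
        totals[s["order_id"]] += s["shipping_cost"]
    return dict(totals)


def order_level_rows(orders: List[dict], shipments: List[dict]) -> List[dict]:
    """One row per order, with shipping cost already aggregated."""
    ship = shipping_by_order(shipments)
    return [{**o, "shipping_cost": ship.get(o["order_id"], 0.0)} for o in orders]


inflated = naive_join_total(orders, shipments)   # O1 is counted twice
rows = order_level_rows(orders, shipments)
revenue = sum(r["order_total"] for r in rows)
shipping = sum(r["shipping_cost"] for r in rows)
```

With this data the naive join reports 250.0 in revenue against a true 150.0. The same aggregate-then-join pattern applies in SQL or dbt models: bring each fact to a shared grain before combining measures.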

Tags

skill, ai

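Install via chat

This skill can be installed via chat on the following platforms:

OpenClaw, WorkBuddy, QClaw, Kimi, Claude

Option 1: Install SkillHub and the skill

Help me install SkillHub and the data-model-1775983982 skill

Option 2: Set SkillHub as the preferred skill installation source

Set SkillHub as my preferred skill installation source, then help me install the data-model-1775983982 skill

Install via command line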

skillhub install data-model-1775983982

Download Zip package

Download data-model v1.0.0

File size: 1.94 KB | Published: 2026-4-13 09:58

v1.0.0 (latest) 2026-4-13 09:58
- Initial release of the "data-model" skill for analytics and warehouse design.
- Introduces a six-stage workflow covering grain, conformed dimensions, facts & measures, SCD strategies, key management, and performance considerations.
- Provides checklists and best practices for schema design, additive measures, dimension conformance, and SCD policy selection.
- Offers guidance for handling common pitfalls (fan/chasm traps, late-arriving facts) and adapting to event-based pipelines.
- Designed to support both star and snowflake schema reviews and implementations.
