sentryskills

# SentrySkills - Always-On Security Guard **SentrySkills is designed to run AUTOMATICALLY on every task.** It provides three-stage protection (preflight → runtime → output) with predictive risk analysis. ## ⚠️ Installation ≠ Automatic Activation After installing SentrySkills, you must configure your AGENTS.md to make it run automatically. See [install/first_time_setup.md](install/first_time_setup.md) for details. ## How It Works On every task, SentrySkills: 1. **Preflight** - Analyzes user intent and planned actions 2. **Runtime** - Monitors behavior during execution 3. **Output** - Redacts sensitive data before response 4. **Predictive** - Warns about potential risks ## Quick Enable (One Command) Add this to your `~/.codex/AGENTS.md`: ```bash cat >> ~/.codex/AGENTS.md << 'EOF' # SentrySkills - Security First Before EVERY response, run: python ~/.codex/sentryskills/shared/scripts/self_guard_runtime_hook_template.py \ input.json \ --policy-profile balanced \ --out result.json Then check result.json for final_action and follow it. EOF ``` Restart Codex and you're protected! ## When to Use SentrySkills Use SentrySkills when you need AI agents to operate safely with: - **Sensitive data access** - Agents reading credentials, secrets, or private information - **System modifications** - Agents executing commands, writing files, or changing configurations - **External communications** - Agents making network requests or calling external APIs - **Code generation** - Agents producing code that might contain vulnerabilities - **Production environments** - Any scenario where security cannot be compromised - **Multi-turn conversations** - Detect subtle manipulation across multiple interactions **Examples:** ``` ✅ Use: When an agent needs to read environment variables or config files ✅ Use: When an agent is asked to execute shell commands ✅ Use: When an agent generates database queries or API calls ✅ Use: When an agent modifies system files or configurations ❌ Skip: Simple read-only queries on public documentation ❌ Skip: Basic explanations without system access ``` ## Skill Package Structure This is a **skill package** that orchestrates multiple sub-skills: 1. **using-sentryskills** - User-facing entry point 2. **sentryskills-orchestrator** - Central coordination 3. **sentryskills-preflight** - Pre-execution checks 4. **sentryskills-runtime** - Runtime monitoring 5. **sentryskills-output** - Output validation & redaction Each sub-skill has its own `SKILL.md` with specific requirements. ## Execution Requirements 1. Run guard checks **before each external output** 2. Process sequence: `preflight → runtime → output guard → final decision` 3. **Block**: Prohibit original response, must refuse or redact 4. **Downgrade**: Must downgrade expression and declare uncertainty 5. **Explanatory responses** must also go through output guard ## Recommended Usage ### Default (turn_dir layout) ```bash python shared/scripts/self_guard_runtime_hook_template.py \ shared/references/input_schema.json \ --policy shared/references/runtime_policy.balanced.json \ --policy-profile balanced ``` ### With summary output ```bash python shared/scripts/self_guard_runtime_hook_template.py \ shared/references/input_schema.json \ --out ./sentry_skill_log/sentryskills_summary.json ``` ### Legacy event stream ```bash python shared/scripts/self_guard_runtime_hook_template.py \ shared/references/input_schema.json \ --log-layout legacy \ --events-log ./sentry_skill_log/sentryskills_events.jsonl ``` ## Mandatory Logging Protocol 1. **Text-only judgment is prohibited** - Runtime hook must execute each round 2. Input JSON **must include** `project_path` (absolute path to avoid drift) 3. Final response **must provide**: - `self_guard_final_action` - `self_guard_trace_id` - `self_guard_events_log` (path to index or legacy events) 4. If script execution **fails**, declare "security self-check not completed" and adopt conservative output strategy ## Default Log Layout Log root: `./sentry_skill_log/` Per-turn directories: - `./sentry_skill_log/turns/YYYYMMDD_HHMMSS_<turn_id>/input.json` - `./sentry_skill_log/turns/YYYYMMDD_HHMMSS_<turn_id>/result.json` Global index: - `./sentry_skill_log/index.jsonl` Session state: - `./sentry_skill_log/.self_guard_state/` ## Policy Profiles - **balanced**: Standard security (default) - **strict**: Maximum security - **permissive**: Minimal interference ## Detection Coverage ### Preflight Stage - Prompt injection patterns - Malicious intent detection - Sensitive topic inference - Action classification ### Runtime Stage - Event monitoring - Source tracking - Anomaly detection - Behavioral analysis ### Output Stage - Sensitive data redaction - Source disclosure handling - Confidence assessment - Safe response generation ### Predictive Analysis - Resource exhaustion prediction - Scope creep detection - Privilege escalation warning - Data exfiltration path analysis - Multi-turn grooming detection ## Integration ### As Codex Skill Copy to `skills/sentryskills/` and reference in agent configuration. ## Configuration Files - `shared/references/runtime_policy.*.json` - Security policy profiles - `shared/references/detection_rules.json` - Detection rule definitions - `shared/references/input_schema.json` - Input validation schema ## Testing ```bash # Test predictive analysis python test_predictive_analysis.py # Test integration python test_integration.py ``` ## Event Types The system emits structured events for: - `preflight_result` - Pre-execution check outcome - `runtime_result` - Runtime monitoring outcome - `output_guard_result` - Output validation outcome - `predictive_analysis_result` - Risk prediction (if enabled) - `final_decision` - Overall decision with rationale - `hook_end` - Completion with duration Each event includes: - Trace ID for correlation - Decision (block/downgrade/allow/continue) - Reason codes - Matched rules - Metadata ## Performance - Typical latency: 50-100ms per check - Memory: <50MB baseline - Zero external dependencies (Python stdlib only) ## Security Properties - **No data exfiltration**: All processing is local - **No LLM calls**: Pure rule-based and heuristic - **Audit trail**: Complete event log for compliance - **Transparent**: All decisions include reason codes

sentryskills

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

sentryskills

sentryskills

标签

通过对话安装

方式一：安装 SkillHub 和技能

方式二：设置 SkillHub 为优先技能安装源

通过命令行安装

下载 Zip 包

相关推荐

self-improvement

self-improvement

self-improvement

self-improvement