system-restoration

# System Restoration

Comprehensive guide for restoring Advantage HPE's operational intelligence systems when they fail or go down.

## Investigation Workflow

### 1. System Status Assessment

Before fixing anything, map out what's broken:

**Core Intelligence Systems:**

1. **Zero Revenue Alerts** → #margin-alerts (Every 30 min)
2. **Morning Pulse** → #manager-nudges (Daily 6:35 AM)
3. **Live Nudges** → #manager-nudges (Every 15 min)
4. **Material Truth Report** → #material-intel-systems (Daily 7:00 AM)
5. **Friend-Zone Reformatter** → #live-ops (ServiceTitan email alerts)

**Investigation Commands:**

```bash
# Check LaunchD services
launchctl list | grep ranger

# Check cron jobs
cron list

# Check running processes
ps aux | grep -E "(keel|pulse|margin|nudge)" | grep -v grep

# Find system code
find /Users/stephendobbins/.config/ranger -name "*.py" | grep -E "(pulse|margin|nudge)"
find /Users/stephendobbins/.openclaw/workspace -name "*.py" | grep -E "(zero|revenue)"
```

### 2. Locate Code & Determine Failure Cause

**Common Locations:**

- `/Users/stephendobbins/.config/ranger/scripts/` - Main operational scripts
- `/Users/stephendobbins/.config/ranger/materials/` - Material intelligence
- `/Users/stephendobbins/.openclaw/workspace/` - Recent scripts & fixes
- `/Users/stephendobbins/Library/LaunchAgents/` - LaunchD service definitions

**Common Failure Patterns:**

- **LaunchD services unloaded** - Emergency shutdown or system restart
- **Data source broken** - ServiceTitan API returning wrong data
- **Scheduling missing** - Functions exist but no cron/LaunchD trigger
- **Script errors** - Import failures, credential issues

## System-Specific Restoration

### Zero Revenue Alerts

**Script:** `/Users/stephendobbins/.config/ranger/scripts/margin_alerts.py`
**Channel:** #margin-alerts (C0A5L7MG60P)
**Schedule:** Every 30 minutes

**Restoration Steps:**

1. Verify script exists and posts to Slack
2. Load LaunchD service: `launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.margin-alerts.plist`
3. Test manually: `cd /Users/stephendobbins/.config/ranger/scripts && python3 margin_alerts.py`
4. Check logs: `tail /tmp/margin_alerts.log`

### Morning Pulse

**Script:** `/Users/stephendobbins/.config/ranger/scripts/pulse_os_full.py`
**Channel:** #manager-nudges (C0A5V9JL2KV)
**Schedule:** Daily 6:35 AM CT

**Restoration Steps:**

1. **If broken API data:** Check for `.bak` backup with working data sources
2. **Restore backup:** `cp pulse_os_full.py.bak pulse_os_full.py`
3. **Fix data sources:** Replace API calls with browser automation (see references/browser-data-sources.md)
4. Load LaunchD service: `launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.morning-pulse.plist`
5. Test: `python3 pulse_os_full.py pulse`

### Live Nudges

**Script:** `/Users/stephendobbins/.config/ranger/scripts/pulse_os_full.py nudges`
**Channel:** #manager-nudges
**Schedule:** Every 15 minutes
**Function:** `run_nudges()` on lines 548-617
**Features:** 🚗 dispatched / 📍 arrived / ✅ completed alerts

**Restoration Steps:**

1. Verify function exists: `grep -n "def run_nudges" pulse_os_full.py`
2. Create LaunchD service (see scripts/create-live-nudges-service.py)
3. Load service: `launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.live-nudges.plist`
4. Test: `python3 pulse_os_full.py nudges`

### Material Truth Report

**Script:** `/Users/stephendobbins/.config/ranger/materials/reconciliation_report.py`
**Channel:** #material-intel-systems (C0A5L7RB5EK)
**Schedule:** Daily 7:00 AM CT

**Restoration Steps:**

1. Test script: `cd /Users/stephendobbins/.config/ranger/materials && python3 reconciliation_report.py --no-email`
2. Create cron job with 7:00 AM schedule
3. Verify channel posting

## Data Source Repair

### ServiceTitan API vs UI Data

**Problem:** ServiceTitan API often returns test/historical data instead of real operational data.

**Solution:** Replace API calls with browser automation:

1. **Create browser data source module** (see scripts/browser_data_sources.py)
2. **Import in main script:** Replace parse functions with browser equivalents
3. **Preserve output format** - Same sections, different data source

**Browser Data Functions:**

- `get_browser_low_margin_jobs()`
- `get_browser_stale_estimates()`
- `get_browser_revenue_leaks()`
- `get_browser_driver_incidents()`

### KEEL System Issues

**Script:** `/Users/stephendobbins/.config/ranger/keel/keel_slack_bot.py`

**Safe restart for field tech DM only:**

1. **Disable operational intelligence:** Set `OPERATIONAL_INTELLIGENCE_ENABLED = False`
2. **Restart process:** `cd /Users/stephendobbins/.config/ranger/keel && python3 keel_slack_bot.py &`
3. **Verify running:** `ps aux | grep keel_slack_bot`

## Service Management Commands

### LaunchD Services

```bash
# List services
launchctl list | grep ranger

# Load service
launchctl load /Users/stephendobbins/Library/LaunchAgents/com.ranger.<service>.plist

# Unload service
launchctl unload /Users/stephendobbins/Library/LaunchAgents/com.ranger.<service>.plist

# Start service immediately
launchctl start com.ranger.<service>

# Check service logs
tail /tmp/<service>.log
tail /tmp/<service>.err
```

### Cron Jobs (OpenClaw)

```bash
# List jobs
cron list

# Add job
cron add <job-definition>

# Remove job
cron remove <job-id>
```

## Emergency Shutdown Recovery

When systems are emergency-stopped due to bad data:

1. **Investigate root cause** - Usually ServiceTitan API data issues
2. **Fix data sources** - Switch to browser automation or correct API endpoints
3. **Test manually** - Verify data accuracy before re-enabling
4. **Restore services** - Load LaunchD services and cron jobs
5. **Monitor initially** - Check logs and channel posts for accuracy

## Resources

### scripts/

- `create-live-nudges-service.py` - Generate LaunchD plist for live nudges
- `browser_data_sources.py` - Browser automation replacement for broken APIs

### references/

- `launchd-service-templates.md` - LaunchD plist templates for different schedules
- `channel-ids.md` - Slack channel IDs for all operational intelligence channels
- `troubleshooting-checklist.md` - Step-by-step debugging guide
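
The Live Nudges restoration step that creates a LaunchD service can be sketched in Python with the standard-library `plistlib` module. This is a minimal illustration, not the contents of `create-live-nudges-service.py`, which this guide does not show. The helper names `build_interval_plist` and `write_plist`, the `/usr/bin/python3` interpreter path, and the `/tmp/<service>.log` and `/tmp/<service>.err` naming are assumptions chosen to match the paths and log commands used elsewhere in this guide.

```python
import plistlib
from pathlib import Path

# Paths as they appear in this guide; adjust if your layout differs.
SCRIPTS_DIR = "/Users/stephendobbins/.config/ranger/scripts"
LAUNCH_AGENTS_DIR = "/Users/stephendobbins/Library/LaunchAgents"


def build_interval_plist(service, script_args, interval_seconds):
    """Build a launchd job definition (hypothetical helper).

    Returns a dict describing a job that runs a Python script every
    `interval_seconds` seconds.
    """
    return {
        "Label": f"com.ranger.{service}",
        # Interpreter path is an assumption; check `which python3` first.
        "ProgramArguments": ["/usr/bin/python3", *script_args],
        "WorkingDirectory": SCRIPTS_DIR,
        "StartInterval": interval_seconds,
        # Log locations are assumed to follow the /tmp/<service>.log pattern.
        "StandardOutPath": f"/tmp/{service}.log",
        "StandardErrorPath": f"/tmp/{service}.err",
    }


def write_plist(job, directory=LAUNCH_AGENTS_DIR):
    """Serialize the job as an XML plist named <Label>.plist (hypothetical helper)."""
    path = Path(directory) / f"{job['Label']}.plist"
    path.write_bytes(plistlib.dumps(job))
    return path


if __name__ == "__main__":
    # Live Nudges: run `pulse_os_full.py nudges` every 15 minutes.
    job = build_interval_plist("live-nudges", ["pulse_os_full.py", "nudges"], 15 * 60)
    # Print for review instead of writing, so nothing changes by accident.
    print(plistlib.dumps(job).decode())
```

Once the printed plist looks right, calling `write_plist(job)` would save it under the LaunchAgents directory, after which the `launchctl load` command from the Live Nudges steps applies. A fixed daily time such as the Morning Pulse schedule would use launchd's `StartCalendarInterval` key (with `Hour` and `Minute`) in place of `StartInterval`.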
