Show HN: A 4-tier self-healing system for local AI agents (was silently broken)

  • Posted 4 hours ago by ramsbaby-dev
  • 2 points
I run a local AI assistant (OpenClaw) on a Mac mini. When it crashes, I wanted it to fix itself automatically. So I built a 4-tier recovery system:

Level 1 → launchd auto-restart (seconds) Level 2 → watchdog health check every 60s Level 3 → Claude Code AI diagnosis + repair (fires after 30+ min failure) Level 4 → Discord/Telegram human escalation

The system looked great on paper. But v3.0.0 had a critical bug: Level 2 logged "Escalating to Level 3..." but never actually called the script. The chain was broken at the most important handoff — silently, for weeks.

v3.1.0 fixes this plus adds: - Chain verification in the installer (each level is tested end-to-end) - Graceful .env loading so Level 3 doesn't die silently without API keys - Watchdog Catch-up mode for missed intervals

GitHub: https://github.com/Ramsbaby/openclaw-self-healing

The meta-lesson: "I have monitoring" ≠ "my monitoring chain is actually connected." Testing each component starts is not the same as testing the system works end-to-end.

1 comments

    Loading..