Show HN: Incident management for Slack with AI-generated postmortems

  • Posted 5 hours ago by soyzamudio
  • 1 points
https://www.incidentops.io/
I've been an on-call engineer at startups where "incident management" meant someone panic-creating a Slack channel called #fire-fire-fire, or every incident being handled in the same #incidents channel at once, and the postmortem was whatever someone remembered to write in Notion three days later.

So I built a Slack bot to fix my own workflow. Figured I'd share it.

What it does:

"/incident start sev2 API latency spike" creates a dedicated channel, invites whoever's on-call, pins the details, and starts recording a timeline. When you run "/incident resolve", it uses GPT-4 to analyze the entire channel conversation and generate a postmortem draft: summary, root cause, event timeline, action items.
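A rough sketch of how the text after "/incident" could be parsed. Slack delivers the command name and the remaining text separately, so the parser only sees something like "start sev2 API latency spike". The type and function names here (IncidentCommand, parseIncidentCommand) are hypothetical and not taken from the bot's actual code.

```typescript
// Hypothetical types for illustration; not the bot's real implementation.
type Severity = "sev1" | "sev2" | "sev3" | "sev4";

type IncidentCommand =
  | { kind: "start"; severity: Severity; title: string }
  | { kind: "resolve" }
  | { kind: "invalid"; reason: string };

const SEVERITIES: readonly Severity[] = ["sev1", "sev2", "sev3", "sev4"];

// Accepts "sev2" or "SEV2"; returns undefined for anything else.
function parseSeverity(raw: string | undefined): Severity | undefined {
  const value = raw?.toLowerCase();
  return SEVERITIES.find((s) => s === value);
}

// `text` is everything the user typed after "/incident".
function parseIncidentCommand(text: string): IncidentCommand {
  const [sub, ...rest] = text.trim().split(/\s+/);
  switch (sub) {
    case "start": {
      const severity = parseSeverity(rest[0]);
      const title = rest.slice(1).join(" ");
      if (!severity) {
        return { kind: "invalid", reason: "expected sev1..sev4 after start" };
      }
      if (!title) {
        return { kind: "invalid", reason: "missing incident title" };
      }
      return { kind: "start", severity, title };
    }
    case "resolve":
      return { kind: "resolve" };
    default:
      return { kind: "invalid", reason: `unknown subcommand: ${sub || "(none)"}` };
  }
}
```

Returning an "invalid" variant instead of throwing makes it easy to reply to the user with the reason in an ephemeral message.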

The key insight: the actual diagnosis usually happens in casual messages ("wait, I think the connection pool is exhausted"), not in formal status updates. So the AI reads everything, not just what was labeled important.

Stack:
- TypeScript + Slack Bolt
- Prisma + Postgres
- OpenAI API for postmortem generation
- PagerDuty integration for escalations

Other stuff it handles:
- Updating an incident's severity with "/incident severity <sev1|sev2|sev3|sev4>"
- On-call scheduling with automatic weekly/daily rotation
- Paging with escalation chains (Slack DMs → PagerDuty if configured)
- Jira ticket creation for incidents from within Slack with "/incident ticket <title>"
- Basic analytics (incidents per on-call, MTTR)
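For the weekly/daily rotation, a minimal sketch of the core lookup, assuming fixed-length shifts measured from a single start instant. The Rotation shape and onCallAt name are hypothetical. It deliberately ignores overrides, handoff notifications, and local-time boundaries such as DST shifts.

```typescript
// Hypothetical shape for illustration; not the bot's actual schema.
interface Rotation {
  members: string[];       // Slack user IDs, in rotation order
  startsAt: Date;          // instant the first member's shift begins
  shiftLengthDays: number; // 7 for weekly, 1 for daily
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns who is on call at `at`, or undefined if nobody can be determined.
function onCallAt(rotation: Rotation, at: Date): string | undefined {
  const { members, startsAt, shiftLengthDays } = rotation;
  if (members.length === 0 || shiftLengthDays <= 0) return undefined;

  const elapsedMs = at.getTime() - startsAt.getTime();
  if (elapsedMs < 0) return undefined; // rotation has not started yet

  // Count whole shifts since the start, then wrap around the member list.
  const shiftIndex = Math.floor(elapsedMs / (shiftLengthDays * MS_PER_DAY));
  return members[shiftIndex % members.length];
}
```

Working in absolute milliseconds keeps the arithmetic simple, but it means a shift that should change at 09:00 local time will drift by an hour across a DST change unless the boundary is computed in the team's timezone.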

What I learned building this:
1. Slack's API is actually pretty good now. The Bolt framework handles most of the OAuth/event subscription pain.
2. Getting AI to write useful postmortems required being very explicit about event types. Without context about what's a "status update" vs. a "debug message," it would hallucinate causes.
3. On-call scheduling is surprisingly complex. Timezone handling, rotation boundaries, handoff notifications: each is a rabbit hole.
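One way the event-type labeling from point 2 could look is to tag every line of the transcript before it goes to the model. This is a sketch under assumptions: the EventType values, the TimelineEvent shape, and the prompt wording are all hypothetical, and the actual API call is left out.

```typescript
// Hypothetical event model for illustration; not the bot's actual schema.
type EventType = "status_update" | "debug_message" | "severity_change" | "page";

interface TimelineEvent {
  type: EventType;
  at: Date;
  author: string;
  text: string;
}

// One line per event, oldest first, with the event type spelled out.
function buildTranscript(events: TimelineEvent[]): string {
  return [...events]
    .sort((a, b) => a.at.getTime() - b.at.getTime())
    .map((e) => `[${e.at.toISOString()}] [${e.type}] ${e.author}: ${e.text}`)
    .join("\n");
}

// Prompt text only; sending it to a model is out of scope for this sketch.
function buildPostmortemPrompt(events: TimelineEvent[]): string {
  return [
    "Draft a postmortem with: summary, root cause, event timeline, action items.",
    "Each line is tagged with an event type.",
    "Treat debug_message lines as hypotheses, not confirmed facts.",
    "If the transcript does not establish a root cause, say it is unknown.",
    "",
    buildTranscript(events),
  ].join("\n");
}
```

Telling the model explicitly that debug messages are hypotheses is one plausible guard against the hallucinated causes described above. The resulting draft would still need human review.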

Honest limitations:
- Only works for teams already living in Slack
- AI postmortems need human review, since they can miss context from calls/video chats
- Only a couple of integrations so far (the ones I use), but I can add more, like Linear, GitHub Issues, etc.

Code isn't open source (yet?), but happy to answer architecture questions. Been running this with my own team for about two months.

Landing page: https://incidentops.io

Would appreciate feedback, especially from SREs who've built similar internal tools. What am I missing?
