1. Job calls external API (Stripe, SendGrid, AWS)
2. API call succeeds
3. Job crashes before recording success
4. Job retries → calls API again → duplicate
Example: a job processes a refund, sends an email notification, then crashes. The retry does both again, so the customer gets a duplicate refund email (or worse, a duplicate refund).
I see a few approaches:
Option A: Store processed IDs in database
  Problem: Race between "check DB" and "call API" can still duplicate
Option B: Use API idempotency keys (Stripe supports this)
  Problem: Not all APIs support it (legacy systems, third-party)
Option C: Build deduplication layer that checks external system first
  Problem: Extra latency, extra complexity
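To make Options A and B concrete, here is a rough sketch of how they might be combined: derive a deterministic key per job step, record it in a table before calling out, and pass the same key to the API on every retry. Everything here is an assumption for illustration: the `side_effects` table, the `run_once` and `idempotency_key` helpers, and the `call_api(key)` callback are hypothetical names, and SQLite stands in for whatever database the job system uses. Note that a step found in the `pending` state is called again, so this only avoids duplicates when the API itself honors the key (Option B); without that, the race from Option A is still there.

```python
import sqlite3
import uuid

# Hypothetical fixed namespace; any constant UUID works for deriving keys.
KEY_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def idempotency_key(job_id: str, step: str) -> str:
    """Same job and step always produce the same key, so retries reuse it."""
    return str(uuid.uuid5(KEY_NAMESPACE, f"{job_id}:{step}"))


def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS side_effects ("
        " key TEXT PRIMARY KEY,"
        " status TEXT NOT NULL)"
    )
    conn.commit()


def run_once(conn: sqlite3.Connection, job_id: str, step: str, call_api) -> str:
    """Run call_api(key) unless this step is already recorded as done."""
    key = idempotency_key(job_id, step)

    # The primary key makes this insert atomic: only one attempt creates the row.
    cur = conn.execute(
        "INSERT OR IGNORE INTO side_effects (key, status) VALUES (?, 'pending')",
        (key,),
    )
    conn.commit()

    if cur.rowcount == 0:
        row = conn.execute(
            "SELECT status FROM side_effects WHERE key = ?", (key,)
        ).fetchone()
        if row[0] == "done":
            return "skipped"
        # Status is 'pending': an earlier attempt crashed mid-call, or another
        # worker is in flight. Calling again is only safe if the API
        # deduplicates on the key we pass.

    call_api(key)

    conn.execute("UPDATE side_effects SET status = 'done' WHERE key = ?", (key,))
    conn.commit()
    return "called"
```

With an in-memory database (`sqlite3.connect(":memory:")`), calling `run_once(conn, "job-1", "refund", send)` twice invokes `send` once; if `send` raises on the first attempt, the retry calls it again with the identical key.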
What do you do in production? Accept some duplicates? Only use APIs with idempotency? Something else?
(I built something for Option C, but I'm trying to understand whether this is a common enough problem or whether I'm over-engineering.)
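For anyone unsure what Option C means in practice, the general shape is roughly the sketch below. It is not the thing I built, just an illustration: `ExternalClient`, `find_by_reference`, `create`, and `create_if_absent` are hypothetical names, and it assumes the external system lets you attach your own reference to a record and look it up later. `FakeClient` is an in-memory stand-in so the example runs on its own.

```python
from typing import Optional, Protocol


class ExternalClient(Protocol):
    """Hypothetical interface: the external system must support lookup by our reference."""

    def find_by_reference(self, reference: str) -> Optional[dict]: ...

    def create(self, reference: str, payload: dict) -> dict: ...


class FakeClient:
    """In-memory stand-in for an external API, for illustration only."""

    def __init__(self) -> None:
        self.records: dict = {}
        self.create_calls = 0

    def find_by_reference(self, reference: str) -> Optional[dict]:
        return self.records.get(reference)

    def create(self, reference: str, payload: dict) -> dict:
        self.create_calls += 1
        record = {"reference": reference, **payload}
        self.records[reference] = record
        return record


def create_if_absent(client: ExternalClient, reference: str, payload: dict) -> dict:
    """Ask the external system first; only create when nothing exists yet."""
    existing = client.find_by_reference(reference)  # the extra round trip
    if existing is not None:
        return existing
    return client.create(reference, payload)
```

Calling `create_if_absent(client, "refund-42", {"amount": 500})` twice against the same `FakeClient` results in a single `create` call. The lookup is the extra latency mentioned under Option C, and two workers running at the same moment can still both see "nothing exists", so this narrows the duplicate window rather than closing it.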