Document poisoning in RAG systems: How attackers corrupt AI's sources

  • Posted 1 day ago by aminerj
  • 146 points
https://aminrj.com/posts/rag-document-poisoning/
I'm the author. Repo is here: https://github.com/aminrj-labs/mcp-attack-labs/tree/main/lab...

The lab runs entirely on LM Studio + Qwen2.5-7B-Instruct (Q4_K_M) + ChromaDB — no cloud APIs, no GPU required, no API keys.

From zero to seeing the poisoning succeed: git clone, make setup, make attack1. About 10 minutes.

Two things worth flagging upfront:

- The 95% success rate is against a 5-document corpus (best case for the attacker). In a mature collection you need proportionally more poisoned docs to dominate retrieval — but the mechanism is the same.
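The retrieval-domination mechanism can be sketched in a few lines. This is a hypothetical toy, not the lab's code: 3-d stand-in vectors instead of real embeddings, and a plain cosine top-k instead of ChromaDB — but it shows why a small corpus is the attacker's best case, and why a larger corpus just raises the number of poisoned docs needed.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, corpus, k=3):
    # corpus: {doc_id: embedding}; return the k most similar doc ids.
    ranked = sorted(corpus, key=lambda d: cosine(query, corpus[d]), reverse=True)
    return ranked[:k]

# Toy "embeddings": legitimate docs are scattered; poisoned docs are
# crafted to sit almost on top of the anticipated query vector.
query = [1.0, 0.1, 0.0]
corpus = {
    "legit-1": [0.7, 0.7, 0.1],
    "legit-2": [0.2, 0.9, 0.3],
    "legit-3": [0.5, 0.5, 0.5],
}
corpus.update({
    "poison-1": [0.99, 0.12, 0.01],
    "poison-2": [0.98, 0.09, 0.02],
    "poison-3": [1.0, 0.11, 0.0],
})

print(top_k(query, corpus))  # poisoned docs fill every top-k slot
```

Against 3 legitimate docs, 3 poisoned ones monopolize the retrieved context; against thousands, the attacker needs proportionally more of them in the neighborhood of the target query, which is the point made above.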

- Embedding anomaly detection at ingestion was the biggest surprise: as a standalone control it cut attack success from 95% to 20%, outperforming all three generation-phase defenses combined. It runs on embeddings your pipeline already produces, so it needs no additional model.
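The ingestion-time control lends itself to a sketch too. The heuristic below — flagging documents whose cosine similarity to the corpus centroid is a statistical outlier — is my guess at the general shape of such a check, not the lab's actual detector; names and the z-score threshold are made up for illustration.

```python
import math
import statistics

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def flag_anomalies(embeddings, z_threshold=2.0):
    """Flag docs whose similarity to the corpus centroid is a statistical
    outlier. embeddings: {doc_id: vector}. Uses only the embeddings the
    pipeline already produces; no extra model involved."""
    dim = len(next(iter(embeddings.values())))
    n = len(embeddings)
    centroid = [sum(v[i] for v in embeddings.values()) / n for i in range(dim)]
    sims = {d: cosine(v, centroid) for d, v in embeddings.items()}
    mu = statistics.mean(sims.values())
    sigma = statistics.pstdev(sims.values())
    if sigma == 0:
        return []
    return [d for d, s in sims.items() if abs(s - mu) / sigma > z_threshold]

# Nine near-identical legitimate embeddings plus one off-distribution doc.
corpus = {f"doc-{i}": [1.0, 0.01 * i] for i in range(9)}
corpus["poison"] = [0.0, 1.0]
print(flag_anomalies(corpus))  # ['poison']
```

A centroid z-score is the simplest version of this idea; anything that models the corpus's embedding distribution at ingestion time (per-cluster centroids, density estimates) fits the same slot and keeps the "no additional model" property.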

With all five defensive layers combined, attack success drops to a 10% residual.

Happy to discuss methodology, the PoisonedRAG comparison, or anything that looks off.

32 comments
