Show HN: I wrote a GPU-less billion-vector DB for molecule search (live demo)

  • Posted 17 hours ago by mireklzicar
  • 9 points
https://cheese-new.deepmedchem.com/
Input a SMILES string (or pick a molecule from the examples) and it returns up to 100k molecules closest in 3-D shape or electrostatic similarity, drawn from 10+ billion-scale databases, typically within 5-10 s.

*Why it might interest HN*

* Entire index lives on disk: no GPU at query time, under ~10 GB RAM total.
* Built from scratch (no FAISS index / Milvus / Pinecone).
* Index-build cost: one Nvidia T4 (~300 USD) for one 5.5B-molecule database.
* Open to anyone: predict ADMET, export results as CSV/SDF.
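
To make the "index lives on disk" point concrete, here is a minimal sketch (not the actual CHEESE index, whose design is described in the pre-print) of a CPU-only nearest-neighbour query over a memory-mapped file of vectors. Everything in it is an assumption for illustration: the 4-dimensional embeddings, the flat float32 file layout, the squared-Euclidean metric, and the helper names `write_index` and `search`. It is a plain linear scan, so it would not come close to the quoted latency at billion scale; it only shows how RAM use can stay small when the operating system pages the index in from disk on demand.

```python
import heapq
import mmap
import os
import struct
import tempfile

DIM = 4  # hypothetical embedding width, chosen only for this example


def write_index(path, vectors):
    """Store vectors on disk as packed little-endian float32 rows."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(struct.pack(f"<{DIM}f", *v))


def search(path, query, k):
    """Return the k (squared_distance, row_id) pairs closest to query.

    The file is memory-mapped, so rows are paged in by the OS as they are
    read; only the k best candidates are held in Python memory.
    """
    row_bytes = DIM * 4
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        n_rows = len(mm) // row_bytes

        def scored():
            for i in range(n_rows):
                row = struct.unpack_from(f"<{DIM}f", mm, i * row_bytes)
                yield sum((a - b) ** 2 for a, b in zip(row, query)), i

        # nsmallest consumes the generator while the mapping is still open.
        return heapq.nsmallest(k, scored())


# Toy data standing in for molecule embeddings.
vectors = [
    (0.0, 0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0, 0.0),
    (0.0, 2.0, 0.0, 0.0),
    (5.0, 5.0, 5.0, 5.0),
]
path = os.path.join(tempfile.mkdtemp(), "index.f32")
write_index(path, vectors)

hits = search(path, (0.9, 0.1, 0.0, 0.0), k=2)
hit_ids = [row_id for _, row_id in hits]
print(hit_ids)
```

A real system at this scale would replace the linear scan with some form of partitioned or graph-based approximate search so that only a small fraction of the file is touched per query; which structure the author uses is not stated in this post.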

Full write-up & benchmarks (DUD-E, LIT-PCBA, SVS) in the pre-print: https://chemrxiv.org/engage/chemrxiv/article-details/6725091...

1 comment
