Show HN: DataBridge - An open-source, modular, multi-modal RAG solution

  • Posted 5 days ago by Adityav369
  • 5 points
https://github.com/databridge-org/databridge-core
For the past few weeks, I've been working on DataBridge, an open-source solution for easy data ingestion and querying. We support text, PDFs, and images, and we recently added a video parser that analyzes both frames and audio. We are also adding object tracking to improve video ingestion and context, and plan to do similar work for other data types.

To get started, see the installation section in our docs at https://databridge.gitbook.io/databridge-docs/getting-starte.... The docs also cover a bunch of other useful functions and examples. They aren't 100% caught up with all the new features, so if you're curious about the latest and greatest, the git repo is the source of truth.

We're still shaping DataBridge (we have a skeleton and want to add the meaty parts) to best serve the LLM and RAG developer community, so I'd love your feedback on a few things: what features you're currently missing in RAG pipelines; whether specialized parsing (e.g., for medical docs, legal texts, or multimedia) is something you'd want; what your ideal RAG workflow looks like; and what your must-haves are.

Thanks for checking out DataBridge, and feel free to open issues or PRs on GitHub if you have ideas, requests, or want to help shape the next set of features. If this is helpful, I'd really appreciate it if you could give it a star on GitHub! Looking forward to hearing your thoughts!

Happy building!
