The interesting features are:
1> I used JSON RAG with real-time embeddings, so for a handful of specs and info we don't need to set up a whole pipeline.
I have already built a "Hierarchical Agentic RAG with Hybrid Search" (knowledge graph + vector search); you can view it on my profile.
I am actively trying to share as much as possible about it, but that project is tied to a huge set of files: 693k data points in pgvector + Postgres. Give it a visit and you will get a better idea.
2> I tried every sort of Whisper model: faster-whisper, turbo, anything you can think of, even with a self-built C++ engine, but the model architecture itself is hallucination-prone.
Then I moved to Parakeet TDT with Silero VAD, rather than Parakeet RNN-T, for better speed and optimisation. The repo has further details.
3> I took the Anthropic RLHF dataset, processed it with spaCy and GLiNER, and converted it into a clean training dataset for fine-tuning Llama 3.2 3B.
I will attach the dataset if you need it, or upload it to Hugging Face if you want to use it yourself.
4> I attached phonetic correctors to the output of both Parakeet and Llama so the TTS works better.
5> I used SetFit to route the queries, plus confidence-based semantic search, to keep things as fast and accurate as possible.
6> I am using sherpa-onnx and have queued the TTS, STT and everything else, but as an experiment I have also got Llama generating the response and Kokoro processing it as a batch, with a full nyc working as well, and everything runs on my laptop.
7> Along with these, my frontend relies heavily on Three.js and 3D view files, but I applied optimisations there so that everything works together smoothly on the laptop.
8> I also applied glued interaction to the LLM: I implemented a FIFO buffer of the last 5 interactions and store them for future fine-tuning and for adding phonetic words.
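
For point 1, here is a minimal sketch of the JSON-plus-on-the-fly-embeddings idea. It is not the repo code. The spec entries are made up, and `embed` is a toy bag-of-words stand-in for a real embedding model, used only so the example runs without extra libraries.

```python
import json
import math
import re
from collections import Counter

# Hypothetical spec data; a real setup would load a small JSON file instead.
SPECS_JSON = """
[
  {"id": "battery", "text": "Battery capacity is 5000 mAh with fast charging"},
  {"id": "display", "text": "Display is a 6.5 inch OLED panel at 120 Hz"},
  {"id": "weight",  "text": "The device weight is 190 grams"}
]
"""

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, entries, top_k=1):
    """Embed the query and every entry at request time, then rank by similarity."""
    q = embed(query)
    scored = [(cosine(q, embed(e["text"])), e) for e in entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

entries = json.loads(SPECS_JSON)
best_score, best_entry = search("what is the battery capacity", entries)[0]
print(best_entry["id"], round(best_score, 2))
```

Because nothing is precomputed or stored, editing the JSON is enough to change what gets retrieved, which is the point of skipping a full pipeline for small data.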
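
For point 4, a phonetic corrector can be sketched as a lookup table applied with one regex pass. This is an assumption about the general shape, not the repo's implementation. Both tables and the name `apply_corrections` are hypothetical.

```python
import re

# Hypothetical tables: one fixes STT mis-hearings, one respells words for the TTS.
STT_FIXES = {"para keet": "Parakeet", "pg vector": "pgvector"}
TTS_RESPELL = {"pgvector": "pee gee vector", "SQL": "sequel"}

def apply_corrections(text, table):
    """Replace whole-word matches from `table`, case-insensitively, in one pass."""
    if not table:
        return text
    # Longest keys first so multi-word phrases win over shorter overlaps.
    keys = sorted(table, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in table.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)

print(apply_corrections("i store it in pg vector", STT_FIXES))
print(apply_corrections("Query pgvector with SQL", TTS_RESPELL))
```

Running the STT table on the transcript and the TTS table on the LLM reply keeps the two concerns separate, and new words only need a new table entry.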
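
For point 5, one possible reading of confidence-based routing is: trust the classifier when it is confident, otherwise fall back to semantic search. The sketch below assumes that reading. `fake_classifier` is a hypothetical stand-in for a trained SetFit model, and the labels and the 0.7 threshold are made up for illustration.

```python
from typing import Callable, Dict, Tuple

def route(
    query: str,
    classify: Callable[[str], Dict[str, float]],
    threshold: float = 0.7,
) -> Tuple[str, float]:
    """Return the top label if it is confident enough, else fall back to search."""
    probs = classify(query)
    label, confidence = max(probs.items(), key=lambda kv: kv[1])
    if confidence >= threshold:
        return label, confidence
    return "semantic_search", confidence

def fake_classifier(query: str) -> Dict[str, float]:
    """Hypothetical stand-in that returns class probabilities like a real model would."""
    q = query.lower().strip()
    if "spec" in q:
        return {"json_rag": 0.92, "chitchat": 0.08}
    if q in ("hi", "hello"):
        return {"json_rag": 0.10, "chitchat": 0.90}
    return {"json_rag": 0.55, "chitchat": 0.45}

print(route("show me the specs", fake_classifier))
print(route("tell me about the warranty", fake_classifier))
```

The fast path skips retrieval entirely for clear-cut queries, and only ambiguous ones pay for the slower search.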
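
For point 8, a FIFO window of 5 interactions plus a permanent log can be sketched with `collections.deque`. The class and field names here are hypothetical, not taken from the repo.

```python
from collections import deque

class InteractionMemory:
    """Keeps the last `window` turns for the prompt and every turn for later reuse."""

    def __init__(self, window=5):
        self.recent = deque(maxlen=window)  # oldest turn is dropped automatically
        self.archive = []                   # full history, e.g. for future fine-tuning

    def add(self, user, assistant):
        turn = {"user": user, "assistant": assistant}
        self.recent.append(turn)
        self.archive.append(turn)

    def context(self):
        """Render the recent window as text to prepend to the next LLM prompt."""
        return "\n".join(
            f"User: {t['user']}\nAssistant: {t['assistant']}" for t in self.recent
        )

memory = InteractionMemory()
for i in range(7):
    memory.add(f"question {i}", f"answer {i}")

print(len(memory.recent), len(memory.archive))
```

With `maxlen` set, the deque handles eviction itself, so the prompt stays a fixed size while the archive keeps growing.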
Please give it a visit and let me know if there is something new I should learn.
One kind note: as an enthusiast spending so much energy on these things, I have taken help from AI for the .md files and for expanding or explaining parts of the code, so that it is more helpful for everyone.