https://github.com/bggb7781-collab/lrnnsmdds
Architectures I've experimented with, and my personal notes on their pros and cons:
1. RNNs: like Mamba, RWKV and my attempt above. Likely the best alternative to transformers, but hard to parallelize. Cons: a. I've personally encountered weird logical "bugs": severe bias toward repeated text and toward the end of the text corpus (go figure...). b. Speed similar to transformers.
Pros: a. Very low RAM usage and very good ability to generalize and learn; perplexity reaches extremely low levels (~1.05, versus 2+ for GPT in comparison). b. Relatively easy to understand, as it uses backprop, feed-forward layers and matrices, very similar to transformers.
2. HDC: hyperdimensional computing. For the time being, mostly sci-fi...
3. SNNs: spiking neural networks. I ended up with several vibecoded implementations in C#, F# and C. Ultimately, despite novel ideas, not very successful. Maybe they could be successful, but at the moment mostly a failure... still potential.
4. TCNs: temporal convolutional networks... the best case so far, it seems.
Pros: faster than RNNs/transformers, good generalization, could be the next great reduction of resources in generative AI!
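To make the TCN item a bit more concrete, here is a minimal sketch of the causal dilated 1D convolution that TCNs are usually built from. It is plain Python for illustration only, not the code from the repo above; the function names `causal_dilated_conv1d` and `receptive_field` are hypothetical. Each output depends only on a fixed window of past inputs, so all time steps can be computed independently, which is one common explanation for why TCNs are easier to parallelize than RNNs.

```python
def causal_dilated_conv1d(x, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Inputs before the start of the sequence are treated as zero,
    so y[t] never depends on x[t + 1], x[t + 2], ... (causality).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that reach before the sequence start
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Past steps visible to one output, assuming one conv per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)


signal = [1.0, 2.0, 3.0, 4.0]
out = causal_dilated_conv1d(signal, kernel=[1.0, 1.0], dilation=2)
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

With the dilation doubling per layer, the receptive field grows exponentially with depth while the parameter count grows only linearly, which may be part of the resource savings noted above.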
Screenshot of my last TCN attempts:
https://postimg.cc/R3r71PL2 - ultimately, there is potential!
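For reference on the perplexity numbers in the RNN notes: perplexity is the exponential of the average negative log-likelihood the model assigns to the correct next token. The sketch below uses a hypothetical helper `perplexity` and made-up probabilities, not measurements from any of the models mentioned.

```python
import math


def perplexity(token_probs):
    """exp(mean negative log-likelihood) of the correct tokens.

    token_probs: probability the model gave to each correct next token.
    """
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


# Illustrative values only (assumed, not measured).
coin_flip = perplexity([0.5, 0.5, 0.5])   # model is unsure between 2 options
confident = perplexity([0.95, 0.96, 0.94])  # model is nearly certain
```

A value close to 1 means the model puts almost all probability on the correct token. Note that perplexity depends on the tokenization (character-level vs. subword) and on the dataset, so numbers are only directly comparable between models evaluated on the same tokens.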