🐍Mamba is a new neural network architecture that matches or outperforms Transformers of similar size at language modelling.
💡Mamba's compute scales linearly with sequence length, versus the quadratic cost of Transformer attention, so it uses less compute and supports much longer contexts.
⚡Mamba is an extension of the state-space model (SSM), and because an SSM updates a hidden state one token at a time it can also be understood as a kind of recurrent neural network (a minimal sketch follows).
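To make the recurrent view concrete, here is a minimal NumPy sketch of a discretized linear SSM run step by step; the matrices `A`, `B`, `C`, the shapes, and the function name are illustrative assumptions rather than Mamba's actual parameters (Mamba additionally makes them input-dependent, its "selection" mechanism).

```python
import numpy as np

def ssm_recurrence(x, A, B, C):
    """Run a discretized linear state-space model as an RNN:
    h_t = A @ h_{t-1} + B @ x_t,   y_t = C @ h_t
    x: (seq_len, d_in) input sequence."""
    h = np.zeros(A.shape[0])         # hidden state, carried across time steps
    ys = []
    for x_t in x:                    # one step per token, like an RNN
        h = A @ h + B @ x_t          # state update
        ys.append(C @ h)             # readout
    return np.stack(ys)

# Toy usage with illustrative shapes (not Mamba's real dimensions)
rng = np.random.default_rng(0)
A = np.diag(rng.uniform(0.5, 0.99, size=8))   # stable: eigenvalues inside the unit circle
B = rng.normal(size=(8, 4)) * 0.1
C = rng.normal(size=(2, 8)) * 0.1
y = ssm_recurrence(rng.normal(size=(16, 4)), A, B, C)
print(y.shape)  # (16, 2)
```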
🧠Mamba's state-space layers incorporate long-range information while avoiding the classic RNN problems of slow, strictly sequential computation and difficult training: the linear recurrence can be computed in parallel across the whole sequence (see the convolutional sketch below).
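Here is a hedged sketch of why training can be parallelized: when `A`, `B`, `C` are fixed (the classic, non-selective SSM setting; Mamba's input-dependent version uses a parallel scan instead), unrolling the recurrence turns the whole layer into one causal convolution whose kernel can be precomputed.

```python
import numpy as np

def ssm_as_convolution(x, A, B, C):
    """Equivalent computation of the same linear SSM as a convolution:
    y_t = sum_{k=0..t} (C @ A^k @ B) @ x_{t-k}
    Valid only when A, B, C do not depend on the input (time-invariant)."""
    seq_len = x.shape[0]
    # Precompute the kernel K_k = C A^k B once, then apply it everywhere.
    kernels = []
    Ak = np.eye(A.shape[0])
    for _ in range(seq_len):
        kernels.append(C @ Ak @ B)   # (d_out, d_in)
        Ak = A @ Ak
    K = np.stack(kernels)            # (seq_len, d_out, d_in)
    # Causal convolution over the sequence (naive O(L^2) loop for clarity;
    # S4-style models compute this in O(L log L) with an FFT).
    y = np.zeros((seq_len, K.shape[1]))
    for t in range(seq_len):
        for k in range(t + 1):
            y[t] += K[k] @ x[t - k]
    return y
```

On the same inputs this produces the same outputs as `ssm_recurrence` above (`np.allclose` holds), which is the recurrence/convolution duality this point refers to: recur step by step at inference time, convolve in parallel at training time.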
🚀The state-transition weights are initialized so the recurrence stays stable, meaning the hidden state decays smoothly rather than exploding, and the model performs well on tasks evaluating long-range reasoning (a toy initialization sketch follows).
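As a toy illustration of stability-oriented initialization (an S4D-Real-style scheme; the function name and `dt` value are assumptions, not Mamba's precise recipe), the continuous-time eigenvalues are set to negative reals so the discretized transition matrix has spectral radius below 1:

```python
import numpy as np

def init_stable_diagonal_ssm(state_dim, dt=0.01):
    """Illustrative S4D-Real-style initialization (an assumption, not
    Mamba's exact scheme): continuous-time eigenvalues a_n = -(n + 1)
    are negative, so after discretization |exp(a_n * dt)| < 1 and the
    hidden state decays smoothly instead of exploding."""
    a = -(np.arange(state_dim) + 1.0)      # negative real eigenvalues
    A_discrete = np.diag(np.exp(a * dt))   # zero-order-hold discretization
    return A_discrete

A = init_stable_diagonal_ssm(8)
print(np.max(np.abs(np.diag(A))) < 1.0)   # True: stable recurrence
```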