💡Mixtral 8x7B is an open-source model by Mistral AI with a sparse mixture-of-experts (SMoE) architecture.
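To make the SMoE idea concrete, here is a minimal PyTorch sketch of a layer in this style: a router picks the top-2 of 8 expert feed-forward networks per token and mixes their outputs with softmax-renormalized gate weights. The dimensions and the plain SiLU experts are illustrative assumptions, not Mixtral's actual configuration.

```python
# Minimal sketch of a sparse mixture-of-experts layer with top-2 routing.
# Sizes (d_model=512, d_ff=2048) are illustrative, not Mixtral's real ones.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        logits = self.router(x)                           # (n_tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)    # choose top-2 experts per token
        weights = F.softmax(weights, dim=-1)              # renormalize over the chosen 2
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            tokens, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if tokens.numel():                                # expert runs only on its tokens
                gate = weights[tokens, slot].unsqueeze(1)
                out.index_add_(0, tokens, gate * expert(x[tokens]))
        return out

x = torch.randn(4, 512)            # 4 tokens
print(SparseMoELayer()(x).shape)   # torch.Size([4, 512])
```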
🔎The paper does not disclose the source of the training data, possibly to avoid legal issues.
🚀Because the router activates only 2 of the 8 experts per token, the design gives faster inference at small batch sizes and higher throughput at large batch sizes.
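A back-of-the-envelope calculation shows why: per-token compute scales with the active parameters, not the total. The ~47B/~13B figures below are the rounded counts reported for Mixtral 8x7B, quoted here from memory of the paper rather than from this summary.

```python
# Per-token compute tracks the active parameters (attention plus the 2
# selected expert FFNs), while all weights must still be held in memory.
total_params  = 47e9   # approximate total parameter count
active_params = 13e9   # approximate parameters exercised per token (top-2 of 8 experts)
print(f"params exercised per token: {active_params / total_params:.0%}")  # ~28%
```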
📚Mixtral 8x7B is a decoder-only model with a 32k-token context window, similar to other large language models.
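These architecture claims can be read off the released checkpoint's config via Hugging Face transformers. This sketch assumes the public Hub repo id `mistralai/Mixtral-8x7B-v0.1` and a transformers version recent enough to include Mixtral support (v4.36+).

```python
# Inspect the decoder-only checkpoint's config (assumes transformers >= 4.36).
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("mistralai/Mixtral-8x7B-v0.1")
print(cfg.max_position_embeddings)                      # 32768 -> the 32k context window
print(cfg.num_local_experts, cfg.num_experts_per_tok)   # 8 experts, top-2 routed per token
```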
🎯Despite using far fewer active parameters per token, the model matches or outperforms much larger dense models such as Llama 2 70B on most benchmarks.