DBRX
DBRX is a large language model (LLM) developed by Mosaic under its parent company Databricks, released on March 27, 2024 under the Databricks Open Model License.[3][4][5] It is a mixture-of-experts transformer model with 132 billion parameters in total, of which 36 billion (4 out of 16 experts) are active for each token.[6] The model was released in two variants: a base foundation model and an instruction-tuned version.[7]
| DBRX | |
|---|---|
| *Screenshot of DBRX describing Wikipedia* | |
| Developers | Mosaic ML and Databricks team |
| Initial release | March 27, 2024 |
| License | Databricks Open Model License[1][2] |
| Website | https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm |
| Repository | https://github.com/databricks/dbrx |
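In a mixture-of-experts layer of this kind, a learned router scores each token against every expert and only the top-scoring experts are evaluated for that token, which is why only a fraction of the total parameters are active per token. The sketch below illustrates the idea using DBRX's reported expert counts (16 experts, 4 active per token); the layer dimensions, expert architecture, and class name are illustrative assumptions, not DBRX's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoELayer(nn.Module):
    """Illustrative top-k mixture-of-experts feed-forward layer.

    The expert counts (16 total, 4 active per token) mirror DBRX's
    reported configuration; the layer sizes here are toy values.
    """
    def __init__(self, d_model=64, d_hidden=128, n_experts=16, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                       # (n_tokens, n_experts)
        top_w, top_idx = scores.topk(self.k, dim=-1)  # keep 4 of 16 experts per token
        top_w = F.softmax(top_w, dim=-1)              # normalize the kept scores
        out = torch.zeros_like(x)
        for token, (idxs, ws) in enumerate(zip(top_idx, top_w)):
            for i, w in zip(idxs.tolist(), ws):
                out[token] += w * self.experts[i](x[token])
        return out

layer = TopKMoELayer()
tokens = torch.randn(8, 64)   # 8 tokens of dimension 64
print(layer(tokens).shape)    # torch.Size([8, 64])
```

Because each token only passes through 4 of the 16 experts, the per-token compute scales with the active parameter count (36 billion for DBRX) rather than the full 132 billion.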
At the time of its release, DBRX outperformed other prominent open-source models, such as Meta's LLaMA 2, Mistral AI's Mixtral, and xAI's Grok, on several benchmarks covering language understanding, programming, and mathematics.[6][8][9]
It was trained over 2.5 months[9] on 3,072 Nvidia H100 GPUs connected by 3.2 terabytes per second of InfiniBand bandwidth, at a reported training cost of US$10 million.[3][non-primary source needed]
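Taken together, these figures imply roughly 5.5 million H100 GPU-hours of compute. The short calculation below derives that estimate from the reported numbers; the 30-day month and the derived per-GPU-hour cost are illustrative assumptions, not figures from Databricks.

```python
# Back-of-the-envelope estimate: the 2.5-month duration, 3,072-GPU count,
# and US$10M cost come from the article; the 30-day month is assumed.
gpus = 3072
wall_clock_hours = 2.5 * 30 * 24           # ~2.5 months of training time
gpu_hours = gpus * wall_clock_hours        # ~5.5 million H100 GPU-hours
cost_per_gpu_hour = 10_000_000 / gpu_hours
print(f"{gpu_hours:,.0f} GPU-hours, ~${cost_per_gpu_hour:.2f} per GPU-hour")
# 5,529,600 GPU-hours, ~$1.81 per GPU-hour
```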