Minerva (model)

Italian artificial intelligence project

Minerva is a large language model developed by an Italian research group, Sapienza NLP, at Sapienza University of Rome, led by Roberto Navigli. It is trained from scratch with a primary focus on the Italian language.[1][2][3]

Minerva
Developer: Sapienza NLP research group at Sapienza University of Rome
Initial release: April 2024
Operating system: Web app
Available in: Italian, English
Type: Chatbot; large language model
Website: minerva-llm.org

Minerva is a natural language processing model capable of understanding and generating human-like text. It is based on the Transformer deep learning architecture, was trained on a large corpus of text data, and has been fine-tuned for various language tasks such as machine translation, text summarization, and question answering.[4]
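The Transformer architecture mentioned above is built around self-attention. The following is a minimal illustrative sketch of causal scaled dot-product attention, the core operation of decoder-only language models of this kind; the dimensions and inputs are arbitrary examples, not Minerva's actual configuration.

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """q, k, v: (seq_len, d) arrays; returns a (seq_len, d) array."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)  # pairwise token similarities
    # Causal mask: each token may attend only to itself and earlier tokens.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax turns scores into attention weights summing to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))       # 4 tokens, 8-dimensional embeddings
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

Because of the causal mask, the first token attends only to itself, so its output equals its input; a full model stacks many such attention layers with feed-forward blocks.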

Models

Minerva 7B

With 7 billion parameters, Minerva 7B was trained on approximately 2.5 trillion tokens, evenly split between Italian and English texts, plus an additional 200 billion tokens of code. This extensive training enables the model to handle both Italian and English proficiently, making it a valuable tool for a range of natural language processing tasks.[5][6]
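The reported figures imply the following rough breakdown of the training corpus, sketched here as simple arithmetic; the counts are the approximate values stated above, not exact corpus sizes.

```python
# Approximate Minerva 7B training-data composition (in tokens),
# using the rounded figures reported for the model.
natural_language = 2.5e12                 # ~2.5 trillion tokens (Italian + English)
italian = english = natural_language / 2  # evenly split between the two languages
code = 200e9                              # ~200 billion additional tokens of code
total = natural_language + code

print(f"Italian: {italian:.2e}, English: {english:.2e}, total: {total:.2e}")
```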

The development of Minerva 7B was carried out within the Future Artificial Intelligence Research (FAIR) project, in collaboration with CINECA, which provided the Leonardo supercomputer for training. Additional contributions came from Babelscape and the CREATIVE PRIN Project. The Minerva models are fully open: both the training data and the model weights are publicly accessible.[7][8]

References
