LLMs for all official EU languages on horizon for Finnish startup

A Finnish startup today launched a multilingual AI model that's a “significant milestone” on the path to LLMs for every EU language, the company says.

Helsinki-based Silo AI calls the new large language model Viking 7B. It covers Danish, Finnish, Icelandic, Norwegian, and Swedish, as well as English and programming languages. Evaluations indicate best-in-class performance in all the Nordic languages, without compromising the English outputs.

Peter Sarlin, Silo AI's CEO, told TNW that his company is now “on the right track” towards its ultimate goal.

“This release marks an important step in our ongoing efforts to develop performant language models for all official EU languages,” he said.

“With the Viking model family, we reaffirm our commitment to Europe's digital sovereignty.”

Silo AI's LLM family

Silo specialises in low-resource languages, which lack the linguistic data that's typically needed to train AI models.

Without LLMs in these languages, entire communities will miss out on countless services, from machine translation to personalised healthcare.

To fill the data gap, Silo applies a variety of techniques. One is optimising model architectures for pre-training. Another is incorporating translated pairs of high- and low-resource languages.

Several of the techniques use a cross-lingual signal, which enhances the connections between languages.

“It allows the model to generalise and apply learned patterns across different languages — even those with limited training data,” Sarlin said.

New parameters

The 7 billion-parameter Viking is the first release from a model family announced last month. Silo also plans to launch 13B and 33B versions. Checkpoints for both these LLMs were released today.

As the parameter counts grow, the models will improve their understanding of prompts and their capacity for nuanced outputs. But they will also need greater computational resources, which means higher costs and energy consumption.

To conserve these resources, Silo trained Viking on LUMI, Europe's most powerful supercomputer and the world's third greenest on the Top500 list.

With resources under control and performance proven, Silo now plans to integrate every EU language.

“We consider multilingual LLMs to constitute a part of Europe's digital infrastructure,” Sarlin said.
