Rakuten releases AI 2.0 Mixture of Experts LLM and first Small Language Model (SLM) topping Japanese benchmarks
- Rakuten’s fine-tuned Mixture of Experts LLM and first SLM aim to accelerate Japan’s AI development with best-in-class scores
Tokyo – Rakuten Group, Inc. has
announced the release of both Rakuten AI 2.0, the company’s first
Japanese large language model (LLM) based on a Mixture of Experts (MoE)*1 architecture,
and Rakuten AI 2.0 mini, the company’s first small language model
(SLM). Both models were unveiled in December 2024, and after further
fine-tuning, Rakuten has released Rakuten AI 2.0 foundation*2 and instruct models*3
along with Rakuten AI 2.0 mini foundation and instruct models to
empower companies and professionals developing AI applications.
Rakuten
AI 2.0 is an 8x7B MoE model based on the Rakuten AI 7B model released
in March 2024. The MoE model comprises eight 7-billion-parameter sub-models, each acting as a separate expert. Each token is sent to the
two most relevant experts, as decided by the router. The experts and
router are continually trained together with vast amounts of
high-quality Japanese and English language data.
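For illustration only, the following is a minimal sketch of how top-2 routing in a Mixture of Experts layer can work: a router scores each token, the two highest-scoring experts process it, and their outputs are combined using the normalized router weights. This is a generic PyTorch example, not Rakuten's implementation; all class and parameter names are hypothetical.

```python
# Minimal sketch of top-2 Mixture-of-Experts routing (illustrative only,
# not Rakuten's implementation). A router scores each token, the two
# highest-scoring experts process it, and their outputs are mixed by the
# normalized router weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoELayer(nn.Module):
    def __init__(self, hidden_dim: int, ffn_dim: int, num_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(hidden_dim, num_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_dim, ffn_dim),
                nn.GELU(),
                nn.Linear(ffn_dim, hidden_dim),
            )
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, hidden_dim)
        scores = self.router(x)                          # (num_tokens, num_experts)
        top2_weights, top2_ids = scores.topk(2, dim=-1)  # pick the 2 most relevant experts
        top2_weights = F.softmax(top2_weights, dim=-1)   # normalize the two routing weights
        out = torch.zeros_like(x)
        for slot in range(2):
            for e, expert in enumerate(self.experts):
                mask = top2_ids[:, slot] == e            # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += top2_weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Example: route 4 tokens of width 64 through an 8-expert layer.
layer = Top2MoELayer(hidden_dim=64, ffn_dim=256, num_experts=8)
tokens = torch.randn(4, 64)
print(layer(tokens).shape)  # torch.Size([4, 64])
```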
Rakuten AI 2.0
mini is a 1.5-billion-parameter dense model, trained from scratch on a
mix of English and Japanese data, and developed to enable cost-effective
deployments on edge devices for focused use cases. The instruct versions of both models are produced by applying instruction fine-tuning and preference optimization to the respective foundation models.
All the models are released under the Apache 2.0 license*4 and are available from the official Rakuten Group Hugging Face repository*5.
All models can be used commercially for various text generation tasks
such as summarizing content, answering questions, general text
understanding and building dialogue systems. In addition, the models can
be used as a base for building other models.
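As an illustration of how the openly released checkpoints could be used, the following sketch loads a model with the Hugging Face transformers library. The model ID shown is an assumption; consult the official Rakuten repository for the exact published names.

```python
# Hedged sketch of loading one of the released checkpoints with Hugging Face
# transformers. The exact model ID below is an assumption; check
# https://huggingface.co/Rakuten for the published names.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Rakuten/RakutenAI-2.0-mini-instruct"  # assumed ID; verify on the repository page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

prompt = "日本語で自己紹介してください。"  # "Please introduce yourself in Japanese."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```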
"I am incredibly proud of how our team has combined data, engineering and science to deliver Rakuten AI 2.0," commented Ting Cai, Chief AI & Data Officer of Rakuten Group.
"Our new AI models deliver powerful, cost-effective solutions that
empower businesses to make intelligent tradeoffs that accelerate time to
value and unlock new possibilities. By sharing open models, we aim to
accelerate AI development in Japan. We’re encouraging every business in
Japan to build, experiment and grow with AI, and hope to foster a
collaborative community that drives progress for all."
Innovative technique for LLM human preference optimization
During the Rakuten AI 2.0 foundation model fine-tuning process, the development team leveraged the innovative SimPO*6
(Simple Preference Optimization with a Reference-Free Reward) technique
for preference optimization. Compared with traditional RLHF
(Reinforcement Learning from Human Feedback) or the simplified DPO
(Direct Preference Optimization), SimPO combines the benefits of
simplicity, stability and efficiency, making it a cost-efficient and
practical alternative for fine-tuning AI models to align with human
preferences.
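For reference, here is a minimal sketch of the SimPO objective as described in the paper (arXiv:2405.14734): the implicit reward is the length-normalized log probability of a response under the policy, so no reference model is required. The hyperparameter values below are placeholders, not Rakuten's training settings.

```python
# Minimal sketch of the SimPO objective (arXiv:2405.14734), illustrative only.
# The implicit reward is the length-normalized (average) log-probability of a
# response under the policy; no reference model is needed. beta and the target
# reward margin gamma are hyperparameters; the values here are placeholders.
import torch
import torch.nn.functional as F

def simpo_loss(chosen_logps: torch.Tensor,    # summed token log-probs of preferred responses
               rejected_logps: torch.Tensor,  # summed token log-probs of dispreferred responses
               chosen_lengths: torch.Tensor,
               rejected_lengths: torch.Tensor,
               beta: float = 2.0,
               gamma: float = 1.0) -> torch.Tensor:
    # Length-normalized implicit rewards (average log-probability per token).
    chosen_reward = beta * chosen_logps / chosen_lengths
    rejected_reward = beta * rejected_logps / rejected_lengths
    # Encourage the preferred response to beat the dispreferred one by a margin gamma.
    return -F.logsigmoid(chosen_reward - rejected_reward - gamma).mean()

# Toy example with two preference pairs.
loss = simpo_loss(
    chosen_logps=torch.tensor([-40.0, -55.0]),
    rejected_logps=torch.tensor([-70.0, -80.0]),
    chosen_lengths=torch.tensor([20.0, 25.0]),
    rejected_lengths=torch.tensor([22.0, 30.0]),
)
print(loss)
```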
Best-in-class Japanese performance
After
further fine-tuning of the foundation models for conversational and
instruction-following abilities, Rakuten evaluated the models using average scores on Japanese MT-Bench*7. Japanese MT-Bench is currently the standard benchmark for evaluating fine-tuned instruct models and specifically measures conversational and instruction-following ability.
Comparative scores for the Rakuten AI 2.0 instruct model, the Rakuten AI 2.0 mini instruct model, and other top-performing models from Japanese companies and academic institutions are given in the following table*8:
Rakuten AI 2.0-instruct is the top performer among open models from other Japanese companies and academic institutions with a similar number of active parameters. Rakuten AI 2.0-mini-instruct is the top performer among open models of similar size.
Rakuten is
continuously pushing the boundaries of innovation to develop
best-in-class LLMs for R&D and deliver best-in-class AI services to
its customers. By developing models in-house, Rakuten can build up its
knowledge and expertise, and create models optimized to support the
Rakuten Ecosystem. By making the models open, Rakuten aims to contribute
to the open-source community and accelerate the development of local AI
applications and Japanese language LLMs.
As new breakthroughs in
AI trigger transformations across industries, Rakuten’s AI-nization
initiative aims to implement AI in every aspect of its business to drive
further growth. Rakuten is committed to making AI a force for good that
augments humanity, drives productivity and fosters prosperity.
*1 Mixture of Experts is an AI model architecture in which the model is divided into multiple sub-models, known as experts. During inference and training, only a subset of the experts is activated to process the input.
*2 Foundation
models are models that have been pre-trained on vast amounts of data and
can then be fine-tuned for specific tasks or applications.
*3 An instruct model is a version of a foundation model fine-tuned on instruction-style data, enabling the model to respond to prompted instructions.
*4 About the Apache 2.0 License: https://www.apache.org/licenses/LICENSE-2.0
*5 Rakuten Group Official Hugging Face repository: https://huggingface.co/Rakuten
*6 SimPO
uses the average log probability of the model's output as the implicit reward
instead of relying on a reference model. This method reduces
computational overhead and enables preference optimization of larger
models: https://arxiv.org/abs/2405.14734
*7 Evaluations are carried out on Japanese MT-Bench. Japanese
MT-Bench is a set of 80 challenging open-ended questions for evaluating
chat assistants on eight dimensions: writing, roleplay, reasoning, math,
coding, extraction, STEM, humanities: https://github.com/Stability-AI/FastChat/tree/jp-stable/fastchat/llm_judge
Evaluation of responses is conducted with GPT-4o (gpt-4o-2024-05-13) as a judge, in line with a public leaderboard.
*8 Scores for other models are taken from a public leaderboard maintained by Weights and Biases on January 27, 2025: https://wandb.ai/wandb-japan/llm-leaderboard3/reports/Nejumi-LLM-3--Vmlldzo3OTg2NjM2
*9 The total parameter count of 8x7B models is less than 56B because parameters other than the experts are shared.