Gemma 2: Improving Open Language Models at a Practical Size
Authors: Gemma Team
Published: 2024
Publication: ArXiv
Gemma 2 introduces scalable Transformer-based language models (2B-27B parameters) enhanced with techniques like local-global and group-query attention, achieving state-of-the-art performance for their size and competing with larger models.
Methods: The study applied modifications to the Transformer architecture, including interleaved local-global attention and grouped-query attention (GQA), and trained select model sizes with knowledge distillation rather than standard next-token prediction alone.
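The grouped-query attention mentioned above can be illustrated with a minimal sketch: several query heads share one key/value head, which shrinks the KV projections (and at inference time the KV cache). This is an illustrative NumPy implementation, not the paper's code; the function name and shapes are assumptions.

```python
import numpy as np

def grouped_query_attention(x, Wq, Wk, Wv, num_heads, num_kv_heads):
    """Illustrative grouped-query attention (GQA).

    x:  (T, d) token representations
    Wq: (d, d) query projection -> num_heads heads
    Wk, Wv: (d, num_kv_heads * head_dim) reduced key/value projections
    Each group of num_heads // num_kv_heads query heads shares one KV head.
    """
    T, d = x.shape
    head_dim = d // num_heads
    group_size = num_heads // num_kv_heads

    # Full set of query heads, reduced set of key/value heads.
    q = (x @ Wq).reshape(T, num_heads, head_dim)
    k = (x @ Wk).reshape(T, num_kv_heads, head_dim)
    v = (x @ Wv).reshape(T, num_kv_heads, head_dim)

    outputs = []
    for h in range(num_heads):
        g = h // group_size  # KV head shared by this query head
        scores = q[:, h] @ k[:, g].T / np.sqrt(head_dim)
        # Numerically stable softmax over key positions.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        outputs.append(weights @ v[:, g])
    return np.concatenate(outputs, axis=-1)  # (T, d)
```

With `num_kv_heads == num_heads` this reduces to standard multi-head attention; with `num_kv_heads == 1` it becomes multi-query attention, so GQA interpolates between the two.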
Key Findings: The models deliver the best performance for their size and offer competitive alternatives to models two to three times larger.
Institution: Google DeepMind, Google
Research Area: LLM, Model Efficiency, Architecture
Discipline: Artificial Intelligence
Citations: 1649