Skip to main content

All Articles

Articles

Academic · 1 min

LuxMT Technical Report

arXiv:2602.15506v1 Announce Type: new Abstract: We introduce LuxMT, a machine translation system based on Gemma 3 27B and fine-tuned for translation from Luxembourgish (LB) into …

Nils Rehlinger
3 views
Academic · 1 min

ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns

arXiv:2602.15521v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) effectively scales model capacity while preserving computational efficiency through sparse expert activation. However, training high-quality MoEs from scratch …

Ziyu Zhao, Tong Zhu, Zhi Zhang, Tiantian Fan, Jinluan Yang, Kun Kuang, Zhongyu Wei, Fei Wu, Yu Cheng
7 views
Academic · 1 min

jina-embeddings-v5-text: Task-Targeted Embedding Distillation

arXiv:2602.15547v1 Announce Type: new Abstract: Text embedding models are widely used for semantic similarity tasks, including information retrieval, clustering, and classification. General-purpose models are typically …

Mohammad Kalim Akram, Saba Sturua, Nastia Havriushenko, Quentin Herreros, Michael G\"unther, Maximilian Werk, Han Xiao
3 views