Unveiling Language Routing Isolation in Multilingual MoE Models for Interpretable Subnetwork Adaptation
arXiv:2604.03592v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models exhibit striking performance disparities across languages, yet the internal mechanisms driving these gaps remain poorly understood. In this work, we conduct a systematic analysis of expert routing patterns in MoE models, revealing...
Beyond Retrieval: Modeling Confidence Decay and Deterministic Agentic Platforms in Generative Engine Optimization
arXiv:2604.03656v1 Announce Type: new Abstract: Generative Engine Optimization (GEO) is rapidly reshaping digital marketing paradigms in the era of Large Language Models (LLMs). However, current GEO strategies predominantly rely on Retrieval-Augmented Generation (RAG), which inherently suffers from probabilistic hallucinations and...
VERT: Reliable LLM Judges for Radiology Report Evaluation
arXiv:2604.03376v1 Announce Type: new Abstract: Current literature on radiology report evaluation has focused primarily on designing LLM-based metrics and fine-tuning small models for chest X-rays. However, it remains unclear whether these approaches are robust when applied to reports from other...
Why Attend to Everything? Focus is the Key
arXiv:2604.03260v1 Announce Type: new Abstract: We introduce Focus, a method that learns which token pairs matter rather than approximating all of them. Learnable centroids assign tokens to groups; distant attention is restricted to same-group pairs while local attention operates at...
Diagonal-Tiled Mixed-Precision Attention for Efficient Low-Bit MXFP Inference
arXiv:2604.03950v1 Announce Type: new Abstract: Transformer-based large language models (LLMs) have demonstrated remarkable performance across a wide range of real-world tasks, but their inference cost remains prohibitively high due to the quadratic complexity of attention and the memory bandwidth limitations...
Improving Model Performance by Adapting the KGE Metric to Account for System Non-Stationarity
arXiv:2604.03906v1 Announce Type: new Abstract: Geoscientific systems tend to be characterized by pronounced temporal non-stationarity, arising from seasonal and climatic variability in hydrometeorological drivers, and from natural and anthropogenic changes to land use and cover. As has been pointed out,...
Shorter, but Still Trustworthy? An Empirical Study of Chain-of-Thought Compression
arXiv:2604.04120v1 Announce Type: new Abstract: Long chain-of-thought (Long-CoT) reasoning models have motivated a growing body of work on compressing reasoning traces to reduce inference cost, yet existing evaluations focus almost exclusively on task accuracy and token savings. Trustworthiness properties, whether...
Toward Full Autonomous Laboratory Instrumentation Control with Large Language Models
arXiv:2604.03286v1 Announce Type: new Abstract: The control of complex laboratory instrumentation often requires significant programming expertise, creating a barrier for researchers lacking computational skills. This work explores the potential of large language models (LLMs), such as ChatGPT, and LLM-based artificial...
CAWN: Continuous Acoustic Wave Networks for Autoregressive Language Modeling
arXiv:2604.04250v1 Announce Type: new Abstract: Modern Large Language Models (LLMs) rely on Transformer self-attention, which scales quadratically with sequence length. Recent linear-time alternatives, like State Space Models (SSMs), often suffer from signal degradation over extended contexts. We introduce the Continuous...
CoALFake: Collaborative Active Learning with Human-LLM Co-Annotation for Cross-Domain Fake News Detection
arXiv:2604.04174v1 Announce Type: new Abstract: The proliferation of fake news across diverse domains highlights critical limitations in current detection systems, which often exhibit narrow domain specificity and poor generalization. Existing cross-domain approaches face two key challenges: (1) reliance on labelled...
Towards the AI Historian: Agentic Information Extraction from Primary Sources
arXiv:2604.03553v1 Announce Type: new Abstract: AI is supporting, accelerating, and automating scientific discovery across a diverse set of fields. However, AI adoption in historical research remains limited due to the lack of solutions designed for historians. In this technical progress...
Scaling DPPs for RAG: Density Meets Diversity
arXiv:2604.03240v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by grounding generation in external knowledge, yielding relevance responses that are aligned with factual evidence and evolving corpora. Standard RAG pipelines construct context through relevance ranking, performing...
Adaptive Threshold-Driven Continuous Greedy Method for Scalable Submodular Optimization
arXiv:2604.03419v1 Announce Type: new Abstract: Submodular maximization under matroid constraints is a fundamental problem in combinatorial optimization with applications in sensing, data summarization, active learning, and resource allocation. While the Sequential Greedy (SG) algorithm achieves only a $\frac{1}{2}$-approximation due to...
Researchers waste 80% of LLM annotation costs by classifying one text at a time
arXiv:2604.03684v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly being used for text classification across the social sciences, yet researchers overwhelmingly classify one text per variable per prompt. Coding 100,000 texts on four variables requires 400,000 API calls....
BWTA: Accurate and Efficient Binarized Transformer by Algorithm-Hardware Co-design
arXiv:2604.03957v1 Announce Type: new Abstract: Ultra low-bit quantization brings substantial efficiency for Transformer-based models, but the accuracy degradation and limited GPU support hinder its wide usage. In this paper, we analyze zero-point distortion in binarization and propose a Binary Weights...
Automated Conjecture Resolution with Formal Verification
arXiv:2604.03789v1 Announce Type: new Abstract: Recent advances in large language models have significantly improved their ability to perform mathematical reasoning, extending from elementary problem solving to increasingly capable performance on research-level problems. However, reliably solving and verifying such problems remains...
Querying Structured Data Through Natural Language Using Language Models
arXiv:2604.03057v1 Announce Type: new Abstract: This paper presents an open source methodology for allowing users to query structured non textual datasets through natural language Unlike Retrieval Augmented Generation RAG which struggles with numerical and highly structured information our approach trains...
AdaHOP: Fast and Accurate Low-Precision Training via Outlier-Pattern-Aware Rotation
arXiv:2604.02525v1 Announce Type: new Abstract: Low-precision training (LPT) commonly employs Hadamard transforms to suppress outliers and mitigate quantization error in large language models (LLMs). However, prior methods apply a fixed transform uniformly, despite substantial variation in outlier structures across tensors....
Automatic Textbook Formalization
arXiv:2604.03071v1 Announce Type: new Abstract: We present a case study where an automatic AI system formalizes a textbook with more than 500 pages of graduate-level algebraic combinatorics to Lean. The resulting formalization represents a new milestone in textbook formalization scale...
Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains
arXiv:2604.02343v1 Announce Type: cross Abstract: We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where more compression is possible at the cost of more compute. For lossless compression, domain-adapted LoRA adapters can improve...
SIEVE: Sample-Efficient Parametric Learning from Natural Language
arXiv:2604.02339v1 Announce Type: new Abstract: Natural language context-such as instructions, knowledge, or feedback-contains rich signal for adapting language models. While in-context learning provides adaptation via the prompt, parametric learning persists into model weights and can improve performance further, though is...
SocioEval: A Template-Based Framework for Evaluating Socioeconomic Status Bias in Foundation Models
arXiv:2604.02660v1 Announce Type: new Abstract: As Large Language Models (LLMs) increasingly power decision-making systems across critical domains, understanding and mitigating their biases becomes essential for responsible AI deployment. Although bias assessment frameworks have proliferated for attributes such as race and...
Not All Denoising Steps Are Equal: Model Scheduling for Faster Masked Diffusion Language Models
arXiv:2604.02340v1 Announce Type: new Abstract: Recent advances in masked diffusion language models (MDLMs) narrow the quality gap to autoregressive LMs, but their sampling remains expensive because generation requires many full-sequence denoising passes with a large Transformer and, unlike autoregressive decoding,...
EMS: Multi-Agent Voting via Efficient Majority-then-Stopping
arXiv:2604.02863v1 Announce Type: new Abstract: Majority voting is the standard for aggregating multi-agent responses into a final decision. However, traditional methods typically require all agents to complete their reasoning before aggregation begins, leading to significant computational overhead, as many responses...
Domain-Adapted Retrieval for In-Context Annotation of Pedagogical Dialogue Acts
arXiv:2604.03127v1 Announce Type: new Abstract: Automated annotation of pedagogical dialogue is a high-stakes task where LLMs often fail without sufficient domain grounding. We present a domain-adapted RAG pipeline for tutoring move annotation. Rather than fine-tuning the generative model, we adapt...
An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages
arXiv:2604.02596v1 Announce Type: new Abstract: In-context learning (ICL) allows large language models (LLMs) to adapt to new tasks from a few examples, making it promising for languages underrepresented in pre-training. Recent work on many-shot ICL suggests that modern LLMs can...
Too Polite to Disagree: Understanding Sycophancy Propagation in Multi-Agent Systems
arXiv:2604.02668v1 Announce Type: new Abstract: Large language models (LLMs) often exhibit sycophancy: agreement with user stance even when it conflicts with the model's opinion. While prior work has mostly studied this in single-agent settings, it remains underexplored in collaborative multi-agent...
Pragmatics Meets Culture: Culturally-adapted Artwork Description Generation and Evaluation
arXiv:2604.02557v1 Announce Type: new Abstract: Language models are known to exhibit various forms of cultural bias in decision-making tasks, yet much less is known about their degree of cultural familiarity in open-ended text generation tasks. In this paper, we introduce...
Improving Role Consistency in Multi-Agent Collaboration via Quantitative Role Clarity
arXiv:2604.02770v1 Announce Type: new Abstract: In large language model (LLM)-driven multi-agent systems, disobey role specification (failure to adhere to the defined responsibilities and constraints of an assigned role, potentially leading to an agent behaving like another) is a major failure...
DIGITAL DIPLOMACY AND ARTIFICIAL INTELLIGENCE: REGULATION ASPECTS IN INTERNATIONAL LAW
The article examines the legal aspects of regulating artificial intelligence in the context of digital diplomacy. The author examines the process of transformation of traditional diplomatic institutions under the influence of digitalization and the introduction of artificial intelligence technologies, analyzes...