One Supervisor, Many Modalities: Adaptive Tool Orchestration for Autonomous Queries
arXiv:2603.11545v1 Announce Type: new Abstract: We present an agentic AI framework for autonomous multimodal query processing that coordinates specialized tools across text, image, audio, video, and document modalities. A central Supervisor dynamically decomposes user queries, delegates subtasks to modality-appropriate tools...
A technology-oriented mapping of the language and translation industry: Analysing stakeholder values and their potential implication for translation pedagogy
arXiv:2603.11667v1 Announce Type: new Abstract: This paper examines how value is constructed and negotiated in today's increasingly automated language and translation industry. Drawing on interview data from twenty-nine industry stakeholders collected within the LT-LiDER project, the study analyses how human...
SemBench: A Universal Semantic Framework for LLM Evaluation
arXiv:2603.11687v1 Announce Type: new Abstract: Recent progress in Natural Language Processing (NLP) has been driven by the emergence of Large Language Models (LLMs), which exhibit remarkable generative and reasoning capabilities. However, despite their success, evaluating the true semantic understanding of...
Semi-Synthetic Parallel Data for Translation Quality Estimation: A Case Study of Dataset Building for an Under-Resourced Language Pair
arXiv:2603.11743v1 Announce Type: new Abstract: Quality estimation (QE) plays a crucial role in machine translation (MT) workflows, as it serves to evaluate generated outputs that have no reference translations and to determine whether human post-editing or full retranslation is necessary....
Legal-DC: Benchmarking Retrieval-Augmented Generation for Legal Documents
arXiv:2603.11772v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has emerged as a promising technology for legal document consultation, yet its application in Chinese legal scenarios faces two key limitations: existing benchmarks lack specialized support for joint retriever-generator evaluation, and mainstream...
CHiL(L)Grader: Calibrated Human-in-the-Loop Short-Answer Grading
arXiv:2603.11957v1 Announce Type: new Abstract: Scaling educational assessment with large language models requires not just accuracy, but the ability to recognize when predictions are trustworthy. Instruction-tuned models tend to be overconfident, and their reliability deteriorates as curricula evolve, making fully...
BTZSC: A Benchmark for Zero-Shot Text Classification Across Cross-Encoders, Embedding Models, Rerankers and LLMs
arXiv:2603.11991v1 Announce Type: new Abstract: Zero-shot text classification (ZSC) offers the promise of eliminating costly task-specific annotation by matching texts directly to human-readable label descriptions. While early approaches have predominantly relied on cross-encoder models fine-tuned for natural language inference (NLI),...
Long-Context Encoder Models for Polish Language Understanding
arXiv:2603.12191v1 Announce Type: new Abstract: While decoder-only Large Language Models (LLMs) have recently dominated the NLP landscape, encoder-only architectures remain a cost-effective and parameter-efficient standard for discriminative tasks. However, classic encoders like BERT are limited by a short context window,...
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
arXiv:2603.12201v1 Announce Type: new Abstract: Long-context agentic workflows have emerged as a defining use case for large language models, making attention efficiency critical for both inference speed and serving cost. Sparse attention addresses this challenge effectively, and DeepSeek Sparse Attention...
Sparking Scientific Creativity via LLM-Driven Interdisciplinary Inspiration
arXiv:2603.12226v1 Announce Type: new Abstract: Despite interdisciplinary research leading to larger and longer-term impact, most work remains confined to single-domain academic silos. Recent AI-based approaches to scientific discovery show promise for interdisciplinary research, but many prioritize rapidly designing experiments and...
Fingerprinting Concepts in Data Streams with Supervised and Unsupervised Meta-Information
arXiv:2603.11094v1 Announce Type: new Abstract: Streaming sources of data are becoming more common as the ability to collect data in real-time grows. A major concern in dealing with data streams is concept drift, a change in the distribution of data...
High-resolution weather-guided surrogate modeling for data-efficient cross-location building energy prediction
arXiv:2603.11121v1 Announce Type: new Abstract: Building design optimization often depends on physics-based simulation tools such as EnergyPlus, which, although accurate, are computationally expensive and slow. Surrogate models provide a faster alternative, yet most are location-specific, and even weather-informed variants require...
H2LooP Spark Preview: Continual Pretraining of Large Language Models for Low-Level Embedded Systems Code
arXiv:2603.11139v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate strong code generation abilities in general-purpose programming languages but remain limited in specialized domains such as low-level embedded systems programming. This domain involves hardware register manipulation, vendor-specific SDKs, real-time operating...
Attention Gathers, MLPs Compose: A Causal Analysis of an Action-Outcome Circuit in VideoViT
arXiv:2603.11142v1 Announce Type: new Abstract: The paper explores how video models trained for classification tasks represent nuanced, hidden semantic information that may not affect the final outcome, a key challenge for Trustworthy AI models. Through Explainable and Interpretable AI methods,...
Algorithmic Capture, Computational Complexity, and Inductive Bias of Infinite Transformers
arXiv:2603.11161v1 Announce Type: new Abstract: We formally define Algorithmic Capture (i.e., ``grokking'' an algorithm) as the ability of a neural network to generalize to arbitrary problem sizes ($T$) with controllable error and minimal sample adaptation, distinguishing true algorithmic learning from...
Huntington Disease Automatic Speech Recognition with Biomarker Supervision
arXiv:2603.11168v1 Announce Type: new Abstract: Automatic speech recognition (ASR) for pathological speech remains underexplored, especially for Huntington's disease (HD), where irregular timing, unstable phonation, and articulatory distortion challenge current models. We present a systematic HD-ASR study using a high-fidelity clinical...
Representation Finetuning for Continual Learning
arXiv:2603.11201v1 Announce Type: new Abstract: The world is inherently dynamic, and continual learning aims to enable models to adapt to ever-evolving data streams. While pre-trained models have shown powerful performance in continual learning, they still require finetuning to adapt effectively...
Reference-Guided Machine Unlearning
arXiv:2603.11210v1 Announce Type: new Abstract: Machine unlearning aims to remove the influence of specific data from trained models while preserving general utility. Existing approximate unlearning methods often rely on performance-degradation heuristics, such as loss maximization or random labeling. However, these...
Duration Aware Scheduling for ASR Serving Under Workload Drift
arXiv:2603.11273v1 Announce Type: new Abstract: Scheduling policies in large-scale Automatic Speech Recognition (ASR) serving pipelines play a key role in determining end-to-end (E2E) latency. Yet, widely used serving engines rely on first-come-first-served (FCFS) scheduling, which ignores variability in request duration...
Meta-Reinforcement Learning with Self-Reflection for Agentic Search
arXiv:2603.11327v1 Announce Type: new Abstract: This paper introduces MR-Search, an in-context meta reinforcement learning (RL) formulation for agentic search with self-reflection. Instead of optimizing a policy within a single independent episode with sparse rewards, MR-Search trains a policy that conditions...
Teleodynamic Learning a new Paradigm For Interpretable AI
arXiv:2603.11355v1 Announce Type: new Abstract: We introduce Teleodynamic Learning, a new paradigm for machine learning in which learning is not the minimization of a fixed objective, but the emergence and stabilization of functional organization under constraint. Inspired by living systems,...
Ensuring Safety in Automated Mechanical Ventilation through Offline Reinforcement Learning and Digital Twin Verification
arXiv:2603.11372v1 Announce Type: new Abstract: Mechanical ventilation (MV) is a life-saving intervention for patients with acute respiratory failure (ARF) in the ICU. However, inappropriate ventilator settings could cause ventilator-induced lung injury (VILI). Also, clinicians workload is shown to be directly...
UniHetCO: A Unified Heterogeneous Representation for Multi-Problem Learning in Unsupervised Neural Combinatorial Optimization
arXiv:2603.11456v1 Announce Type: new Abstract: Unsupervised neural combinatorial optimization (NCO) offers an appealing alternative to supervised approaches by training learning-based solvers without ground-truth solutions, directly minimizing instance objectives and constraint violations. Yet for graph node subset-selection problems (e.g., Maximum Clique...
PoultryLeX-Net: Domain-Adaptive Dual-Stream Transformer Architecture for Large-Scale Poultry Stakeholder Modeling
arXiv:2603.09991v1 Announce Type: cross Abstract: The rapid growth of the global poultry industry, driven by rising demand for affordable animal protein, has intensified public discourse surrounding production practices, housing, management, animal welfare, and supply-chain transparency. Social media platforms such as...
TAMUSA-Chat: A Domain-Adapted Large Language Model Conversational System for Research and Responsible Deployment
arXiv:2603.09992v1 Announce Type: cross Abstract: This paper presents TAMUSA-Chat, a research-oriented framework for building domain-adapted large language model conversational systems. The work addresses critical challenges in adapting general-purpose foundation models to institutional contexts through supervised fine-tuning, retrieval-augmented generation, and systematic...
Resource-constrained Amazons chess decision framework integrating large language models and graph attention
arXiv:2603.10512v1 Announce Type: new Abstract: Artificial intelligence has advanced significantly through the development of intelligent game-playing systems, providing rigorous testbeds for decision-making, strategic planning, and adaptive learning. However, resource-constrained environments pose critical challenges, as conventional deep learning methods heavily rely...
AraModernBERT: Transtokenized Initialization and Long-Context Encoder Modeling for Arabic
arXiv:2603.09982v1 Announce Type: cross Abstract: Encoder-only transformer models remain widely used for discriminative NLP tasks, yet recent architectural advances have largely focused on English. In this work, we present AraModernBERT, an adaptation of the ModernBERT encoder architecture to Arabic, and...
Explainable LLM Unlearning Through Reasoning
arXiv:2603.09980v1 Announce Type: cross Abstract: LLM unlearning is essential for mitigating safety, copyright, and privacy concerns in pre-trained large language models (LLMs). Compared to preference alignment, it offers a more explicit way by removing undesirable knowledge characterized by specific unlearning...
Adaptive RAN Slicing Control via Reward-Free Self-Finetuning Agents
arXiv:2603.10564v1 Announce Type: new Abstract: The integration of Generative AI models into AI-native network systems offers a transformative path toward achieving autonomous and adaptive control. However, the application of such models to continuous control tasks is impeded by intrinsic architectural...
CUAAudit: Meta-Evaluation of Vision-Language Models as Auditors of Autonomous Computer-Use Agents
arXiv:2603.10577v1 Announce Type: new Abstract: Computer-Use Agents (CUAs) are emerging as a new paradigm in human-computer interaction, enabling autonomous execution of tasks in desktop environment by perceiving high-level natural-language instructions. As such agents become increasingly capable and are deployed across...