The acquisition of English irregular inflections by Yemeni L1 Arabic learners: A Universal Grammar approach
arXiv:2602.13816v1 Announce Type: new Abstract: This study examines the acquisition of English irregular inflections by Yemeni learners of English as a second language (L2), utilizing a Universal Grammar (UG) approach. Within the UG approach, the study considers Feature Reassembly Hypothesis...
Beyond Words: Evaluating and Bridging Epistemic Divergence in User-Agent Interaction via Theory of Mind
arXiv:2602.13832v1 Announce Type: new Abstract: Large Language Models (LLMs) have developed rapidly and are widely applied to both general-purpose and professional tasks to assist human users. However, they still struggle to comprehend and respond to the true user needs when...
PrivAct: Internalizing Contextual Privacy Preservation via Multi-Agent Preference Training
arXiv:2602.13840v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly deployed in personalized tasks involving sensitive, context-dependent information, where privacy violations may arise in agents' action due to the implicitness of contextual privacy. Existing approaches rely on external,...
Tutoring Large Language Models to be Domain-adaptive, Precise, and Safe
arXiv:2602.13860v1 Announce Type: new Abstract: The overarching research direction of this work is the development of a ''Responsible Intelligence'' framework designed to reconcile the immense generative power of Large Language Models (LLMs) with the stringent requirements of real-world deployment. As...
Bridging the Multilingual Safety Divide: Efficient, Culturally-Aware Alignment for Global South Languages
arXiv:2602.13867v1 Announce Type: new Abstract: Large language models (LLMs) are being deployed across the Global South, where everyday use involves low-resource languages, code-mixing, and culturally specific norms. Yet safety pipelines, benchmarks, and alignment still largely target English and a handful...
ADAB: Arabic Dataset for Automated Politeness Benchmarking -- A Large-Scale Resource for Computational Sociopragmatics
arXiv:2602.13870v1 Announce Type: new Abstract: The growing importance of culturally-aware natural language processing systems has led to an increasing demand for resources that capture sociopragmatic phenomena across diverse languages. Nevertheless, Arabic-language resources for politeness detection remain under-explored, despite the rich...
Evaluating Prompt Engineering Techniques for RAG in Small Language Models: A Multi-Hop QA Approach
arXiv:2602.13890v1 Announce Type: new Abstract: Retrieval Augmented Generation (RAG) is a powerful approach for enhancing the factual grounding of language models by integrating external knowledge. While widely studied for large language models, the optimization of RAG for Small Language Models...
Chain-of-Thought Reasoning with Large Language Models for Clinical Alzheimer's Disease Assessment and Diagnosis
arXiv:2602.13979v1 Announce Type: new Abstract: Alzheimer's disease (AD) has become a prevalent neurodegenerative disease worldwide. Traditional diagnosis still relies heavily on medical imaging and clinical assessment by physicians, which is often time-consuming and resource-intensive in terms of both human expertise...
Geometry-Preserving Aggregation for Mixture-of-Experts Embedding Models
arXiv:2602.14039v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) embedding models combine expert outputs using weighted linear summation, implicitly assuming a linear subspace structure in the embedding space. This assumption is shown to be inconsistent with the geometry of expert representations. Geometric...
Context Shapes LLMs Retrieval-Augmented Fact-Checking Effectiveness
arXiv:2602.14044v1 Announce Type: new Abstract: Large language models (LLMs) show strong reasoning abilities across diverse tasks, yet their performance on extended contexts remains inconsistent. While prior research has emphasized mid-context degradation in question answering, this study examines the impact of...
LogitsCoder: Towards Efficient Chain-of-Thought Path Search via Logits Preference Decoding for Code Generation
arXiv:2602.14054v1 Announce Type: new Abstract: Code generation remains a challenging task that requires precise and structured reasoning. Existing Test Time Scaling (TTS) methods, including structured tree search, have made progress in exploring reasoning paths but still face two major challenges:...
LM-Lexicon: Improving Definition Modeling via Harmonizing Semantic Experts
arXiv:2602.14060v1 Announce Type: new Abstract: We introduce LM-Lexicon, an innovative definition modeling approach that incorporates data clustering, semantic expert learning, and model merging using a sparse mixture-of-experts architecture. By decomposing the definition modeling task into specialized semantic domains, where small...
Attention-gated U-Net model for semantic segmentation of brain tumors and feature extraction for survival prognosis
arXiv:2602.15067v1 Announce Type: new Abstract: Gliomas, among the most common primary brain tumors, vary widely in aggressiveness, prognosis, and histology, making treatment challenging due to complex and time-intensive surgical interventions. This study presents an Attention-Gated Recurrent Residual U-Net (R2U-Net) based...
ResearchGym: Evaluating Language Model Agents on Real-World AI Research
arXiv:2602.15112v1 Announce Type: new Abstract: We introduce ResearchGym, a benchmark and execution environment for evaluating AI agents on end-to-end research. To instantiate this, we repurpose five oral and spotlight papers from ICML, ICLR, and ACL. From each paper's repository, we...
Panini: Continual Learning in Token Space via Structured Memory
arXiv:2602.15156v1 Announce Type: new Abstract: Language models are increasingly used to reason over content they were not trained on, such as new documents, evolving knowledge, and user-specific data. A common approach is retrieval-augmented generation (RAG), which stores verbatim documents externally...
da Costa and Tarski meet Goguen and Carnap: a novel approach for ontological heterogeneity based on consequence systems
arXiv:2602.15158v1 Announce Type: new Abstract: This paper presents a novel approach for ontological heterogeneity that draws heavily from Carnapian-Goguenism, as presented by Kutz, Mossakowski and L\"ucke (2010). The approach is provisionally designated da Costian-Tarskianism, named after da Costa's Principle of...
Predicting Invoice Dilution in Supply Chain Finance with Leakage Free Two Stage XGBoost, KAN (Kolmogorov Arnold Networks), and Ensemble Models
arXiv:2602.15248v1 Announce Type: new Abstract: Invoice or payment dilution is the gap between the approved invoice amount and the actual collection is a significant source of non credit risk and margin loss in supply chain finance. Traditionally, this risk is...
Epistemic Traps: Rational Misalignment Driven by Model Misspecification
arXiv:2602.17676v1 Announce Type: new Abstract: The rapid deployment of Large Language Models and AI agents across critical societal and technical domains is hindered by persistent behavioral pathologies including sycophancy, hallucination, and strategic deception that resist mitigation via reinforcement learning. Current...
Cross-Embodiment Offline Reinforcement Learning for Heterogeneous Robot Datasets
arXiv:2602.18025v1 Announce Type: new Abstract: Scalable robot policy pre-training has been hindered by the high cost of collecting high-quality demonstrations for each platform. In this study, we address this issue by uniting offline reinforcement learning (offline RL) with cross-embodiment learning....
SOMtime the World Ain$'$t Fair: Violating Fairness Using Self-Organizing Maps
arXiv:2602.18201v1 Announce Type: new Abstract: Unsupervised representations are widely assumed to be neutral with respect to sensitive attributes when those attributes are withheld from training. We show that this assumption is false. Using SOMtime, a topology-preserving representation method based on...
Diffusing to Coordinate: Efficient Online Multi-Agent Diffusion Policies
arXiv:2602.18291v1 Announce Type: new Abstract: Online Multi-Agent Reinforcement Learning (MARL) is a prominent framework for efficient agent coordination. Crucially, enhancing policy expressiveness is pivotal for achieving superior performance. Diffusion-based generative models are well-positioned to meet this demand, having demonstrated remarkable...
AI Hallucination from Students' Perspective: A Thematic Analysis
arXiv:2602.17671v1 Announce Type: cross Abstract: As students increasingly rely on large language models, hallucinations pose a growing threat to learning. To mitigate this, AI literacy must expand beyond prompt engineering to address how students should detect and respond to LLM...
Mind the Boundary: Stabilizing Gemini Enterprise A2A via a Cloud Run Hub Across Projects and Accounts
arXiv:2602.17675v1 Announce Type: cross Abstract: Enterprise conversational UIs increasingly need to orchestrate heterogeneous backend agents and tools across project and account boundaries in a secure and reproducible way. Starting from Gemini Enterprise Agent-to-Agent (A2A) invocation, we implement an A2A Hub...
CodeScaler: Scaling Code LLM Training and Test-Time Inference via Execution-Free Reward Models
arXiv:2602.17684v1 Announce Type: cross Abstract: Reinforcement Learning from Verifiable Rewards (RLVR) has driven recent progress in code large language models by leveraging execution-based feedback from unit tests, but its scalability is fundamentally constrained by the availability and reliability of high-quality...
Curriculum Learning for Efficient Chain-of-Thought Distillation via Structure-Aware Masking and GRPO
arXiv:2602.17686v1 Announce Type: cross Abstract: Distilling Chain-of-Thought (CoT) reasoning from large language models into compact student models presents a fundamental challenge: teacher rationales are often too verbose for smaller models to faithfully reproduce. Existing approaches either compress reasoning into single-step,...
IRPAPERS: A Visual Document Benchmark for Scientific Retrieval and Question Answering
arXiv:2602.17687v1 Announce Type: cross Abstract: AI systems have achieved remarkable success in processing text and relational data, yet visual document processing remains relatively underexplored. Whereas traditional systems require OCR transcriptions to convert these visual documents into text and metadata, recent...
Robust Pre-Training of Medical Vision-and-Language Models with Domain-Invariant Multi-Modal Masked Reconstruction
arXiv:2602.17689v1 Announce Type: cross Abstract: Medical vision-language models show strong potential for joint reasoning over medical images and clinical text, but their performance often degrades under domain shift caused by variations in imaging devices, acquisition protocols, and reporting styles. Existing...
Agentic Unlearning: When LLM Agent Meets Machine Unlearning
arXiv:2602.17692v1 Announce Type: cross Abstract: In this paper, we introduce \textbf{agentic unlearning} which removes specified information from both model parameters and persistent memory in agents with closed-loop interaction. Existing unlearning methods target parameters alone, leaving two critical gaps: (i) parameter-memory...
AsynDBT: Asynchronous Distributed Bilevel Tuning for efficient In-Context Learning with Large Language Models
arXiv:2602.17694v1 Announce Type: cross Abstract: With the rapid development of large language models (LLMs), an increasing number of applications leverage cloud-based LLM APIs to reduce usage costs. However, since cloud-based models' parameters and gradients are agnostic, users have to manually...
ScaleBITS: Scalable Bitwidth Search for Hardware-Aligned Mixed-Precision LLMs
arXiv:2602.17698v1 Announce Type: cross Abstract: Post-training weight quantization is crucial for reducing the memory and inference cost of large language models (LLMs), yet pushing the average precision below 4 bits remains challenging due to highly non-uniform weight sensitivity and the...