Proceedings of the 2nd Workshop on Advancing Artificial Intelligence through Theory of Mind
arXiv:2603.18786v1 Announce Type: new Abstract: This volume includes a selection of papers presented at the 2nd Workshop on Advancing Artificial Intelligence through Theory of Mind held at AAAI 2026 in Singapore on 26th January 2026. The purpose of this volume...
Retrieval-Augmented LLM Agents: Learning to Learn from Experience
arXiv:2603.18272v1 Announce Type: new Abstract: While large language models (LLMs) have advanced the development of general-purpose agents, achieving robust generalization to unseen tasks remains a significant challenge. Current approaches typically rely on either fine-tuning or training-free memory-augmented generation using retrieved...
A Computationally Efficient Learning of Artificial Intelligence System Reliability Considering Error Propagation
arXiv:2603.18201v1 Announce Type: new Abstract: Artificial Intelligence (AI) systems are increasingly prominent in emerging smart cities, yet their reliability remains a critical concern. These systems typically operate through a sequence of interconnected functional stages, where upstream errors may propagate to...
Analysis Of Linguistic Stereotypes in Single and Multi-Agent Generative AI Architectures
arXiv:2603.18729v1 Announce Type: new Abstract: Many works in the literature show that LLM outputs exhibit discriminatory behaviour, triggering stereotype-based inferences based on the dialect in which the inputs are written. This bias has been shown to be particularly pronounced when...
How LLMs Distort Our Written Language
arXiv:2603.18161v1 Announce Type: new Abstract: Large language models (LLMs) are used by over a billion people globally, most often to assist with writing. In this work, we demonstrate that LLMs not only alter the voice and tone of human writing,...
CWoMP: Morpheme Representation Learning for Interlinear Glossing
arXiv:2603.18184v1 Announce Type: new Abstract: Interlinear glossed text (IGT) is a standard notation for language documentation which is linguistically rich but laborious to produce manually. Recent automated IGT methods treat glosses as character sequences, neglecting their compositional structure. We propose...
How Psychological Learning Paradigms Shaped and Constrained Artificial Intelligence
arXiv:2603.18203v1 Announce Type: new Abstract: The dominant paradigms of artificial intelligence were shaped by learning theories from psychology: behaviorism inspired reinforcement learning, cognitivism gave rise to deep learning and memory-augmented architectures, and constructivism influenced curriculum learning and compositional approaches. This...
From Noise to Signal: When Outliers Seed New Topics
arXiv:2603.18358v1 Announce Type: new Abstract: Outliers in dynamic topic modeling are typically treated as noise, yet we show that some can serve as early signals of emerging topics. We introduce a temporal taxonomy of news-document trajectories that defines how documents...
Synthetic Data Generation for Training Diversified Commonsense Reasoning Models
arXiv:2603.18361v1 Announce Type: new Abstract: Conversational agents are required to respond to their users not only with high quality (i.e. commonsense bearing) responses, but also considering multiple plausible alternative scenarios, reflecting the diversity in their responses. Despite the growing need...
PowerFlow: Unlocking the Dual Nature of LLMs via Principled Distribution Matching
arXiv:2603.18363v1 Announce Type: new Abstract: Unsupervised Reinforcement Learning from Internal Feedback (RLIF) has emerged as a promising paradigm for eliciting the latent capabilities of Large Language Models (LLMs) without external supervision. However, current methods rely on heuristic intrinsic rewards, which...
AutoScreen-FW: An LLM-based Framework for Resume Screening
arXiv:2603.18390v1 Announce Type: new Abstract: Corporate recruiters often need to screen many resumes within a limited time, which increases their burden and may cause suitable candidates to be overlooked. To address these challenges, prior work has explored LLM-based automated resume...
TopoChunker: Topology-Aware Agentic Document Chunking Framework
arXiv:2603.18409v1 Announce Type: new Abstract: Current document chunking methods for Retrieval-Augmented Generation (RAG) typically linearize text. This forced linearization strips away intrinsic topological hierarchies, creating ``semantic fragmentation'' that degrades downstream retrieval quality. In this paper, we propose TopoChunker, an agentic...
TARo: Token-level Adaptive Routing for LLM Test-time Alignment
arXiv:2603.18411v1 Announce Type: new Abstract: Large language models (LLMs) exhibit strong reasoning capabilities but typically require expensive post-training to reach high performance. Recent test-time alignment methods offer a lightweight alternative, but have been explored mainly for preference alignment rather than...
Adaptive Decoding via Test-Time Policy Learning for Self-Improving Generation
arXiv:2603.18428v1 Announce Type: new Abstract: Decoding strategies largely determine the quality of Large Language Model (LLM) outputs, yet widely used heuristics such as greedy or fixed temperature/top-p decoding are static and often task-agnostic, leading to suboptimal or inconsistent generation quality...
UT-ACA: Uncertainty-Triggered Adaptive Context Allocation for Long-Context Inference
arXiv:2603.18446v1 Announce Type: new Abstract: Long-context inference remains challenging for large language models due to attention dilution and out-of-distribution degradation. Context selection mitigates this limitation by attending to a subset of key-value cache entries, yet most methods allocate a fixed...
WASD: Locating Critical Neurons as Sufficient Conditions for Explaining and Controlling LLM Behavior
arXiv:2603.18474v1 Announce Type: new Abstract: Precise behavioral control of large language models (LLMs) is critical for complex applications. However, existing methods often incur high training costs, lack natural language controllability, or compromise semantic coherence. To bridge this gap, we propose...
The Truncation Blind Spot: How Decoding Strategies Systematically Exclude Human-Like Token Choices
arXiv:2603.18482v1 Announce Type: new Abstract: Standard decoding strategies for text generation, including top-k, nucleus sampling, and contrastive search, select tokens based on likelihood, restricting selection to high-probability regions. Human language production operates differently: tokens are chosen for communicative appropriateness rather...
Learning to Self-Evolve
arXiv:2603.18620v1 Announce Type: new Abstract: We introduce Learning to Self-Evolve (LSE), a reinforcement learning framework that trains large language models (LLMs) to improve their own contexts at test time. We situate LSE in the setting of test-time self-evolution, where a...
A Comparative Empirical Study of Catastrophic Forgetting Mitigation in Sequential Task Adaptation for Continual Natural Language Processing Systems
arXiv:2603.18641v1 Announce Type: new Abstract: Neural language models deployed in real-world applications must continually adapt to new tasks and domains without forgetting previously acquired knowledge. This work presents a comparative empirical study of catastrophic forgetting mitigation in continual intent classification....
Automatic detection of Gen-AI texts: A comparative framework of neural models
arXiv:2603.18750v1 Announce Type: new Abstract: The rapid proliferation of Large Language Models has significantly increased the difficulty of distinguishing between human-written and AI generated texts, raising critical issues across academic, editorial, and social domains. This paper investigates the problem of...
Mi:dm K 2.5 Pro
arXiv:2603.18788v1 Announce Type: new Abstract: The evolving LLM landscape requires capabilities beyond simple text generation, prioritizing multi-step reasoning, long-context understanding, and agentic workflows. This shift challenges existing models in enterprise environments, especially in Korean-language and domain-specific scenarios where scaling is...
Detecting Basic Values in A Noisy Russian Social Media Text Data: A Multi-Stage Classification Framework
arXiv:2603.18822v1 Announce Type: new Abstract: This study presents a multi-stage classification framework for detecting human values in noisy Russian language social media, validated on a random sample of 7.5 million public text posts. Drawing on Schwartz's theory of basic human...
Evaluating LLM-Generated Lessons from the Language Learning Students' Perspective: A Short Case Study on Duolingo
arXiv:2603.18873v1 Announce Type: new Abstract: Popular language learning applications such as Duolingo use large language models (LLMs) to generate lessons for its users. Most lessons focus on general real-world scenarios such as greetings, ordering food, or asking directions, with limited...
Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs
arXiv:2603.18911v1 Announce Type: new Abstract: Knowledge-grounded dialogue systems aim to generate informative, contextually relevant responses by conditioning on external knowledge sources. However, most existing approaches focus exclusively on English, lack explicit citation mechanisms for verifying factual claims, and offer limited...
Frayed RoPE and Long Inputs: A Geometric Perspective
arXiv:2603.18017v1 Announce Type: new Abstract: Rotary Positional Embedding (RoPE) is a widely adopted technique for encoding position in language models, which, while effective, causes performance breakdown when input length exceeds training length. Prior analyses assert (rightly) that long inputs cause...
Engineering Verifiable Modularity in Transformers via Per-Layer Supervision
arXiv:2603.18029v1 Announce Type: new Abstract: Transformers resist surgical control. Ablating an attention head identified as critical for capitalization produces minimal behavioral change because distributed redundancy compensates for damage. This Hydra effect renders interpretability illusory: we may identify components through correlation,...
InfoMamba: An Attention-Free Hybrid Mamba-Transformer Model
arXiv:2603.18031v1 Announce Type: new Abstract: Balancing fine-grained local modeling with long-range dependency capture under computational constraints remains a central challenge in sequence modeling. While Transformers provide strong token mixing, they suffer from quadratic complexity, whereas Mamba-style selective state-space models (SSMs)...
Taming Epilepsy: Mean Field Control of Whole-Brain Dynamics
arXiv:2603.18035v1 Announce Type: new Abstract: Controlling the high-dimensional neural dynamics during epileptic seizures remains a significant challenge due to the nonlinear characteristics and complex connectivity of the brain. In this paper, we propose a novel framework, namely Graph-Regularized Koopman Mean-Field...
MST-Direct: Matching via Sinkhorn Transport for Multivariate Geostatistical Simulation with Complex Non-Linear Dependencies
arXiv:2603.18036v1 Announce Type: new Abstract: Multivariate geostatistical simulation requires the faithful reproduction of complex non-linear dependencies among geological variables, including bimodal distributions, step functions, and heteroscedastic relationships. Traditional methods such as the Gaussian Copula and LU Decomposition assume linear correlation...
Adapting Methods for Domain-Specific Japanese Small LMs: Scale, Architecture, and Quantization
arXiv:2603.18037v1 Announce Type: new Abstract: This paper presents a systematic methodology for building domain-specific Japanese small language models using QLoRA fine-tuning. We address three core questions: optimal training scale, base-model selection, and architecture-aware quantization. Stage 1 (Training scale): Scale-learning experiments...