TheraAgent: Multi-Agent Framework with Self-Evolving Memory and Evidence-Calibrated Reasoning for PET Theranostics
arXiv:2603.13676v1 Announce Type: new Abstract: PET theranostics is transforming precision oncology, yet treatment response varies substantially; many patients receiving 177Lu-PSMA radioligand therapy (RLT) for metastatic castration-resistant prostate cancer (mCRPC) fail to respond, demanding reliable pre-therapy prediction. While LLM-based agents have...
StatePlane: A Cognitive State Plane for Long-Horizon AI Systems Under Bounded Context
arXiv:2603.13644v1 Announce Type: new Abstract: Large language models (LLMs) and small language models (SLMs) operate under strict context window and key-value (KV) cache constraints, fundamentally limiting their ability to reason coherently over long interaction horizons. Existing approaches -- extended context...
Automating Document Intelligence in Statutory City Planning
arXiv:2603.13245v1 Announce Type: new Abstract: UK planning authorities face a legislative conflict between the Planning Act, which mandates public access to application documents, and the Data Protection Act, which requires protection of personal information. This situation creates a manually intensive...
The ARC of Progress towards AGI: A Living Survey of Abstraction and Reasoning
arXiv:2603.13372v1 Announce Type: new Abstract: The Abstraction and Reasoning Corpus (ARC-AGI) has become a key benchmark for fluid intelligence in AI. This survey presents the first cross-generation analysis of 82 approaches across three benchmark versions and the ARC Prize 2024-2025...
Benchmarking Large Language Models on Reference Extraction and Parsing in the Social Sciences and Humanities
arXiv:2603.13651v1 Announce Type: new Abstract: Bibliographic reference extraction and parsing are foundational for citation indexing, linking, and downstream scholarly knowledge-graph construction. However, most established evaluations focus on clean, English, end-of-document bibliographies, and therefore underrepresent the Social Sciences and Humanities (SSH),...
Projection-Free Evolution Strategies for Continuous Prompt Search
arXiv:2603.13786v1 Announce Type: new Abstract: Continuous prompt search offers a computationally efficient alternative to conventional parameter tuning in natural language processing tasks. Nevertheless, its practical effectiveness can be significantly hindered by the black-box nature and the inherent high-dimensionality of the...
PA-Net: Precipitation-Adaptive Mixture-of-Experts for Long-Tail Rainfall Nowcasting
arXiv:2603.13818v1 Announce Type: new Abstract: Precipitation nowcasting is vital for flood warning, agricultural management, and emergency response, yet two bottlenecks persist: the prohibitive cost of modeling million-scale spatiotemporal tokens from multi-variate atmospheric fields, and the extreme long-tailed rainfall distribution where...
Multimodal Emotion Regression with Multi-Objective Optimization and VAD-Aware Audio Modeling for the 10th ABAW EMI Track
arXiv:2603.13760v1 Announce Type: new Abstract: We participated in the 10th ABAW Challenge, focusing on the Emotional Mimicry Intensity (EMI) Estimation track on the Hume-Vidmimic2 dataset. This task aims to predict six continuous emotion dimensions: Admiration, Amusement, Determination, Empathic Pain, Excitement,...
Think First, Diffuse Fast: Improving Diffusion Language Model Reasoning via Autoregressive Plan Conditioning
arXiv:2603.13243v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) generate text via iterative denoising but consistently underperform on multi-step reasoning. We hypothesize this gap stems from a coordination problem: AR models build coherence token-by-token, while diffusion models must coordinate...
Why Grokking Takes So Long: A First-Principles Theory of Representational Phase Transitions
arXiv:2603.13331v1 Announce Type: new Abstract: Grokking is the sudden generalization that appears long after a model has perfectly memorized its training data. Although this phenomenon has been widely observed, there is still no quantitative theory explaining the length of the...
GhanaNLP Parallel Corpora: Comprehensive Multilingual Resources for Low-Resource Ghanaian Languages
arXiv:2603.13793v1 Announce Type: new Abstract: Low-resource languages present unique challenges for natural language processing due to the limited availability of digitized and well-structured linguistic data. To address this gap, the GhanaNLP initiative has developed and curated 41,513 parallel...
Preconditioned Test-Time Adaptation for Out-of-Distribution Debiasing in Narrative Generation
arXiv:2603.13683v1 Announce Type: new Abstract: Although debiased LLMs perform well on known bias patterns, they often fail to generalize to unfamiliar bias prompts, producing toxic outputs. We first validate that such high-bias prompts constitute a distribution shift via OOD detection,...
DeceptGuard: A Constitutional Oversight Framework For Detecting Deception in LLM Agents
arXiv:2603.13791v1 Announce Type: new Abstract: Reliable detection of deceptive behavior in Large Language Model (LLM) agents is an essential prerequisite for safe deployment in high-stakes agentic contexts. Prior work on scheming detection has focused exclusively on black-box monitors that observe...
Artificial intelligence-driven improvement of hospital logistics management resilience: a practical exploration based on H Hospital
arXiv:2603.13816v1 Announce Type: new Abstract: Hospital logistics management faces growing pressure from internal operations and external emergencies, with artificial intelligence (AI) holding untapped potential to boost its resilience. This study explores AI's role in enhancing logistics resilience via a mixed-methods...
APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution
arXiv:2603.13853v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG), based on large language models (LLMs), serves as a vital approach to retrieving and leveraging external knowledge in various domain applications. When confronted with complex multi-hop questions, single-round retrieval is often insufficient...
ToolFlood: Beyond Selection -- Hiding Valid Tools from LLM Agents via Semantic Covering
arXiv:2603.13950v1 Announce Type: new Abstract: Large Language Model (LLM) agents increasingly use external tools for complex tasks and rely on embedding-based retrieval to select a small top-k subset for reasoning. As these systems scale, the robustness of this retrieval stage...
Selective Fine-Tuning of GPT Architectures for Parameter-Efficient Clinical Text Classification
arXiv:2603.14183v1 Announce Type: new Abstract: The rapid expansion of electronic health record (EHR) systems has generated large volumes of unstructured clinical narratives that contain valuable information for disease identification, patient cohort discovery, and clinical decision support. Extracting structured knowledge from...
Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring
arXiv:2603.14251v1 Announce Type: new Abstract: Large Reasoning Language Models (LRLMs) demonstrate impressive capabilities on complex tasks by utilizing long Chain-of-Thought reasoning. However, they are prone to overthinking, which generates redundant reasoning steps that degrade both performance and efficiency. Recently, early-exit...
SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
arXiv:2603.14303v1 Announce Type: new Abstract: Existing KV cache compression methods generally operate on discrete tokens or non-semantic chunks. However, such approaches often lead to semantic fragmentation, where linguistically coherent units are disrupted, causing irreversible information loss and degradation in model...
Translational Gaps in Graph Transformers for Longitudinal EHR Prediction: A Critical Appraisal of GT-BEHRT
arXiv:2603.13231v1 Announce Type: new Abstract: Transformer-based models have improved predictive modeling on longitudinal electronic health records through large-scale self-supervised pretraining. However, most EHR transformer architectures treat each clinical encounter as an unordered collection of codes, which limits their ability to...
Continual Fine-Tuning with Provably Accurate and Parameter-Free Task Retrieval
arXiv:2603.13235v1 Announce Type: new Abstract: Continual fine-tuning aims to adapt a pre-trained backbone to new tasks sequentially while preserving performance on earlier tasks whose data are no longer available. Existing approaches fall into two categories: input adaptation and parameter adaptation....
Your Code Agent Can Grow Alongside You with Structured Memory
arXiv:2603.13258v1 Announce Type: new Abstract: While "Intent-oriented programming" (or "Vibe Coding") redefines software engineering, existing code agents remain tethered to static code snapshots. Consequently, they struggle to model the critical information embedded in the temporal evolution of projects, failing to...
Beyond Attention: True Adaptive World Models via Spherical Kernel Operator
arXiv:2603.13263v1 Announce Type: new Abstract: The pursuit of world model based artificial intelligence has predominantly relied on projecting high-dimensional observations into parameterized latent spaces, wherein transition dynamics are subsequently learned. However, this conventional paradigm is mathematically flawed: it merely displaces...
Federated Personal Knowledge Graph Completion with Lightweight Large Language Models for Personalized Recommendations
arXiv:2603.13264v1 Announce Type: new Abstract: Personalized recommendation increasingly relies on private user data, motivating approaches that can adapt to individuals without centralizing their information. We present Federated Targeted Recommendations with Evolving Knowledge graphs and Language Models (FedTREK-LM), a framework that...
FastODT: A tree-based framework for efficient continual learning
arXiv:2603.13276v1 Announce Type: new Abstract: Machine learning models deployed in real-world settings must operate under evolving data distributions and constrained computational resources. This challenge is particularly acute in non-stationary domains such as energy time series, weather monitoring, and environmental sensing....
ICaRus: Identical Cache Reuse for Efficient Multi Model Inference
arXiv:2603.13281v1 Announce Type: new Abstract: Multi model inference has recently emerged as a prominent paradigm, particularly in the development of agentic AI systems. However, in such scenarios, each model must maintain its own Key-Value (KV) cache for the identical prompt,...
FedTreeLoRA: Reconciling Statistical and Functional Heterogeneity in Federated LoRA Fine-Tuning
arXiv:2603.13282v1 Announce Type: new Abstract: Federated Learning (FL) with Low-Rank Adaptation (LoRA) has become a standard for privacy-preserving LLM fine-tuning. However, existing personalized methods predominantly operated under a restrictive Flat-Model Assumption: they addressed client-side statistical heterogeneity but treated the model...
From Stochastic Answers to Verifiable Reasoning: Interpretable Decision-Making with LLM-Generated Code
arXiv:2603.13287v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used for high-stakes decision-making, yet existing approaches struggle to reconcile scalability, interpretability, and reproducibility. Black-box models obscure their reasoning, while recent LLM-based rule systems rely on per-sample evaluation, causing...
ICPRL: Acquiring Physical Intuition from Interactive Control
arXiv:2603.13295v1 Announce Type: new Abstract: VLMs excel at static perception but falter in interactive reasoning in dynamic physical environments, which demands planning and adaptation to dynamic outcomes. Existing physical reasoning methods often depend on abstract symbolic inputs or lack the...
FusionCast: Enhancing Precipitation Nowcasting with Asymmetric Cross-Modal Fusion and Future Radar Priors
arXiv:2603.13298v1 Announce Type: new Abstract: Deep learning has significantly improved the accuracy of precipitation nowcasting. However, most existing multimodal models typically use simple channel concatenation or interpolation methods for data fusion, which often overlook the feature differences between different modalities....