Cross-Lingual Transfer and Parameter-Efficient Adaptation in the Turkic Language Family: A Theoretical Framework for Low-Resource Language Models
arXiv:2604.06202v1 Announce Type: new Abstract: Large language models (LLMs) have transformed natural language processing, yet their capabilities remain uneven across languages. Most multilingual models are trained primarily on high-resource languages, leaving many languages with large speaker populations underrepresented in both...
ART: Attention Replacement Technique to Improve Factuality in LLMs
arXiv:2604.06393v1 Announce Type: new Abstract: Hallucination in large language models (LLMs) continues to be a significant issue, particularly in tasks like question answering, where models often generate plausible yet incorrect or irrelevant information. Although various methods have been proposed to...
DualDiffusion: A Speculative Decoding Strategy for Masked Diffusion Models
arXiv:2604.05250v1 Announce Type: new Abstract: Masked Diffusion Models (MDMs) offer a promising alternative to autoregressive language models by enabling parallel token generation and bidirectional context modeling. However, their inference speed is significantly limited by the inability to cache key-value pairs...
ETR: Entropy Trend Reward for Efficient Chain-of-Thought Reasoning
arXiv:2604.05355v1 Announce Type: new Abstract: Chain-of-thought (CoT) reasoning improves large language model performance on complex tasks, but often produces excessively long and inefficient reasoning traces. Existing methods shorten CoTs using length penalties or global entropy reduction, implicitly assuming that low...
Dynamic Agentic AI Expert Profiler System Architecture for Multidomain Intelligence Modeling
arXiv:2604.05345v1 Announce Type: new Abstract: In today's artificial intelligence driven world, modern systems communicate with people from diverse backgrounds and skill levels. For human-machine interaction to be meaningful, systems must be aware of context and user expertise. This study proposes...
Optimal-Transport-Guided Functional Flow Matching for Turbulent Field Generation in Hilbert Space
arXiv:2604.05700v1 Announce Type: new Abstract: High-fidelity modeling of turbulent flows requires capturing complex spatiotemporal dynamics and multi-scale intermittency, posing a fundamental challenge for traditional knowledge-based systems. While deep generative models, such as diffusion models and Flow Matching, have shown promising...
TDA-RC: Task-Driven Alignment for Knowledge-Based Reasoning Chains in Large Language Models
arXiv:2604.04942v1 Announce Type: new Abstract: Enhancing the reasoning capability of large language models (LLMs) remains a core challenge in natural language processing. The Chain-of-Thought (CoT) paradigm dominates practical applications for its single-round efficiency, yet its reasoning chains often exhibit logical...
Extending Tabular Denoising Diffusion Probabilistic Models for Time-Series Data Generation
arXiv:2604.05257v1 Announce Type: new Abstract: Diffusion models are increasingly being utilised to create synthetic tabular and time series data for privacy-preserving augmentation. Tabular Denoising Diffusion Probabilistic Models (TabDDPM) generate high-quality synthetic data from heterogeneous tabular datasets but assume independence between...
Dynamic Linear Coregionalization for Realistic Synthetic Multivariate Time Series
arXiv:2604.05064v1 Announce Type: new Abstract: Synthetic data is essential for training foundation models for time series (FMTS), but most generators assume static correlations, and are typically missing realistic inter-channel dependencies. We introduce DynLMC, a Dynamic Linear Model of Coregionalization, that...
Cross-Machine Anomaly Detection Leveraging Pre-trained Time-series Model
arXiv:2604.05335v1 Announce Type: new Abstract: Achieving resilient and high-quality manufacturing requires reliable data-driven anomaly detection methods that can handle behavioral differences among individual machines that are nominally identical and execute the same processes. To...
Pramana: Fine-Tuning Large Language Models for Epistemic Reasoning through Navya-Nyaya
arXiv:2604.04937v1 Announce Type: new Abstract: Large language models produce fluent text but struggle with systematic reasoning, often hallucinating confident but unfounded claims. When Apple researchers added irrelevant context to mathematical problems, LLM performance degraded by 65% (Apple Machine Learning Research,...)
Multi-Agent Pathfinding with Non-Unit Integer Edge Costs via Enhanced Conflict-Based Search and Graph Discretization
arXiv:2604.05416v1 Announce Type: new Abstract: Multi-Agent Pathfinding (MAPF) plays a critical role in various domains. Traditional MAPF methods typically assume unit edge costs and single-timestep actions, which limit their applicability to real-world scenarios. MAPFR extends MAPF to handle non-unit costs...
UniCreative: Unifying Long-form Logic and Short-form Sparkle via Reference-Free Reinforcement Learning
arXiv:2604.05517v1 Announce Type: new Abstract: A fundamental challenge in creative writing lies in reconciling the inherent tension between maintaining global coherence in long-form narratives and preserving local expressiveness in short-form texts. While long-context generation necessitates explicit macroscopic planning, short-form creativity...
SenseAI: A Human-in-the-Loop Dataset for RLHF-Aligned Financial Sentiment Reasoning
arXiv:2604.05135v1 Announce Type: new Abstract: We introduce SenseAI, a human-in-the-loop (HITL) validated financial sentiment dataset designed to capture not only model outputs but the full reasoning process behind them. Unlike existing resources, SenseAI incorporates reasoning chains, confidence scores, human correction...
Learning-Based Multi-Criteria Decision Making Model for Sawmill Location Problems
arXiv:2604.04996v1 Announce Type: new Abstract: Strategically locating a sawmill is vital for enhancing the efficiency, profitability, and sustainability of timber supply chains. Our study proposes a Learning-Based Multi-Criteria Decision-Making (LB-MCDM) framework that integrates machine learning (ML) with GIS-based spatial location...
A Theory-guided Weighted $L^2$ Loss for solving the BGK model via Physics-informed neural networks
arXiv:2604.04971v1 Announce Type: new Abstract: While Physics-Informed Neural Networks offer a promising framework for solving partial differential equations, the standard $L^2$ loss formulation is fundamentally insufficient when applied to the Bhatnagar-Gross-Krook (BGK) model. Specifically, simply minimizing the standard loss does...
Enhancing sample efficiency in reinforcement-learning-based flow control: replacing the critic with an adaptive reduced-order model
arXiv:2604.04986v1 Announce Type: new Abstract: Model-free deep reinforcement learning (DRL) methods suffer from poor sample efficiency. To overcome this limitation, this work introduces an adaptive reduced-order-model (ROM)-based reinforcement learning framework for active flow control. In contrast to conventional actor-critic architectures,...
Learning Stable Predictors from Weak Supervision under Distribution Shift
arXiv:2604.05002v1 Announce Type: new Abstract: Learning from weak or proxy supervision is common when ground-truth labels are unavailable, yet robustness under distribution shift remains poorly understood, especially when the supervision mechanism itself changes. We formalize this as supervision drift, defined...
EvolveRouter: Co-Evolving Routing and Prompt for Multi-Agent Question Answering
arXiv:2604.05149v1 Announce Type: new Abstract: Large language model agents often exhibit complementary strengths, making routing a promising approach for multi-agent question answering. However, existing routing methods remain limited in two important ways: they typically optimize over a fixed pool of...
Channel-wise Retrieval for Multivariate Time Series Forecasting
arXiv:2604.05543v1 Announce Type: new Abstract: Multivariate time series forecasting often struggles to capture long-range dependencies due to fixed lookback windows. Retrieval-augmented forecasting addresses this by retrieving historical segments from memory, but existing approaches rely on a channel-agnostic strategy that applies...
Hidden in the Multiplicative Interaction: Uncovering Fragility in Multimodal Contrastive Learning
arXiv:2604.05834v1 Announce Type: new Abstract: Multimodal contrastive learning is increasingly enriched by going beyond image-text pairs. Among recent contrastive methods, Symile is a strong approach for this challenge because its multiplicative interaction objective captures higher-order cross-modal dependence. Yet, we find...
ALTO: Adaptive LoRA Tuning and Orchestration for Heterogeneous LoRA Training Workloads
arXiv:2604.05426v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) is now the dominant method for parameter-efficient fine-tuning of large language models, but achieving a high-quality adapter often requires systematic hyperparameter tuning because LoRA performance is highly sensitive to configuration choices. In...
Neural Assistive Impulses: Synthesizing Exaggerated Motions for Physics-based Characters
arXiv:2604.05394v1 Announce Type: new Abstract: Physics-based character animation has become a fundamental approach for synthesizing realistic, physically plausible motions. While current data-driven deep reinforcement learning (DRL) methods can synthesize complex skills, they struggle to reproduce exaggerated, stylized motions, such as...
ActivityEditor: Learning to Synthesize Physically Valid Human Mobility
arXiv:2604.05529v1 Announce Type: new Abstract: Human mobility modeling is indispensable for diverse urban applications. However, existing data-driven methods often suffer from data scarcity, limiting their applicability in regions where historical trajectories are unavailable or restricted. To bridge this gap, we...
Automated Auditing of Hospital Discharge Summaries for Care Transitions
arXiv:2604.05435v1 Announce Type: new Abstract: Incomplete or inconsistent discharge documentation is a primary driver of care fragmentation and avoidable readmissions. Despite its critical role in patient safety, auditing discharge summaries relies heavily on manual review and is difficult to scale....
PRISM-MCTS: Learning from Reasoning Trajectories with Metacognitive Reflection
arXiv:2604.05424v1 Announce Type: new Abstract: Siyuan Cheng, Bozhong Tian, Yanchao Hao, Zheng Wei. Published: 06 Apr 2026, Last Modified: 06 Apr 2026. ACL 2026 Findings...
Instruction-Tuned LLMs for Parsing and Mining Unstructured Logs on Leadership HPC Systems
arXiv:2604.05168v1 Announce Type: new Abstract: Leadership-class HPC systems generate massive volumes of heterogeneous, largely unstructured system logs. Because these logs originate from diverse software, hardware, and runtime layers, they exhibit inconsistent formats, making structure extraction and pattern discovery extremely challenging....
AutoSOTA: An End-to-End Automated Research System for State-of-the-Art AI Model Discovery
arXiv:2604.05550v1 Announce Type: new Abstract: Artificial intelligence research increasingly depends on prolonged cycles of reproduction, debugging, and iterative refinement to achieve State-Of-The-Art (SOTA) performance, creating a growing need for systems that can accelerate the full pipeline of empirical model optimization....
DQA: Diagnostic Question Answering for IT Support
arXiv:2604.05350v1 Announce Type: new Abstract: Enterprise IT support interactions are fundamentally diagnostic: effective resolution requires iterative evidence gathering from ambiguous user reports to identify an underlying root cause. While retrieval-augmented generation (RAG) provides grounding through historical cases, standard multi-turn RAG...
Cross-Modal Coreference Alignment: Enabling Reliable Information Transfer in Omni-LLMs
arXiv:2604.05522v1 Announce Type: new Abstract: Omni Large Language Models (Omni-LLMs) have demonstrated impressive capabilities in holistic multi-modal perception, yet they consistently falter in complex scenarios requiring synergistic omni-modal reasoning. Beyond understanding global multimodal context, effective reasoning also hinges on fine-grained...