The Validity of Coreference-based Evaluations of Natural Language Understanding
arXiv:2602.16200v1 Announce Type: new Abstract: In this thesis, I refine our understanding as to what conclusions we can reach from coreference-based evaluations by expanding existing evaluation practices and considering the extent to which evaluation results are either converging or conflicting....
Long-Tail Knowledge in Large Language Models: Taxonomy, Mechanisms, Interventions and Implications
arXiv:2602.16201v1 Announce Type: new Abstract: Large language models (LLMs) are trained on web-scale corpora that exhibit steep power-law distributions, in which the distribution of knowledge is highly long-tailed, with most appearing infrequently. While scaling has improved average-case performance, persistent failures...
Aladdin-FTI @ AMIYA Three Wishes for Arabic NLP: Fidelity, Diglossia, and Multidialectal Generation
arXiv:2602.16290v1 Announce Type: new Abstract: Arabic dialects have long been under-represented in Natural Language Processing (NLP) research due to their non-standardization and high variability, which pose challenges for computational modeling. Recent advances in the field, such as Large Language Models...
MemoryArena: Benchmarking Agent Memory in Interdependent Multi-Session Agentic Tasks
arXiv:2602.16313v1 Announce Type: new Abstract: Existing evaluations of agents with memory typically assess memorization and action in isolation. One class of benchmarks evaluates memorization by testing recall of past conversations or text but fails to capture how memory is used...
A Koopman-Bayesian Framework for High-Fidelity, Perceptually Optimized Haptic Surgical Simulation
arXiv:2602.15834v1 Announce Type: new Abstract: We introduce a unified framework that combines nonlinear dynamics, perceptual psychophysics and high frequency haptic rendering to enhance realism in surgical simulation. The interaction of the surgical device with soft tissue is elevated to an...
Memes-as-Replies: Can Models Select Humorous Manga Panel Responses?
arXiv:2602.15842v1 Announce Type: new Abstract: Memes are a popular element of modern web communication, used not only as static artifacts but also as interactive replies within conversations. While computational research has focused on analyzing the intrinsic properties of memes, the...
BamaER: A Behavior-Aware Memory-Augmented Model for Exercise Recommendation
arXiv:2602.15879v1 Announce Type: new Abstract: Exercise recommendation focuses on personalized exercise selection conditioned on students' learning history, personal interests, and other individualized characteristics. Despite notable progress, most existing methods represent student learning solely as exercise sequences, overlooking rich behavioral interaction...
Distributed physics-informed neural networks via domain decomposition for fast flow reconstruction
arXiv:2602.15883v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) offer a powerful paradigm for flow reconstruction, seamlessly integrating sparse velocity measurements with the governing Navier-Stokes equations to recover complete velocity and latent pressure fields. However, scaling such models to large...
Adaptive Semi-Supervised Training of P300 ERP-BCI Speller System with Minimum Calibration Effort
arXiv:2602.15955v1 Announce Type: new Abstract: A P300 ERP-based Brain-Computer Interface (BCI) speller is an assistive communication tool. It searches for the P300 event-related potential (ERP) elicited by target stimuli, distinguishing it from the neural responses to non-target stimuli embedded in...
R$^2$Energy: A Large-Scale Benchmark for Robust Renewable Energy Forecasting under Diverse and Extreme Conditions
arXiv:2602.15961v1 Announce Type: new Abstract: The rapid expansion of renewable energy, particularly wind and solar power, has made reliable forecasting critical for power system operations. While recent deep learning models have achieved strong average accuracy, the increasing frequency and intensity...
B-DENSE: Branching For Dense Ensemble Network Learning
arXiv:2602.15971v1 Announce Type: new Abstract: Inspired by non-equilibrium thermodynamics, diffusion models have achieved state-of-the-art performance in generative modeling. However, their iterative sampling nature results in high inference latency. While recent distillation techniques accelerate sampling, they discard intermediate trajectory steps. This...
Fast Online Learning with Gaussian Prior-Driven Hierarchical Unimodal Thompson Sampling
arXiv:2602.15972v1 Announce Type: new Abstract: We study a type of Multi-Armed Bandit (MAB) problems in which arms with a Gaussian reward feedback are clustered. Such an arm setting finds applications in many real-world problems, for example, mmWave communications and portfolio...
Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganization in Neural Networks
arXiv:2602.15997v1 Announce Type: new Abstract: Capability emergence during neural network training remains mechanistically opaque. We track five geometric measures across five model scales (405K-85M parameters), 120+ emergence events in eight algorithmic tasks, and three Pythia language models (160M-2.8B). We find:...
MolCrystalFlow: Molecular Crystal Structure Prediction via Flow Matching
arXiv:2602.16020v1 Announce Type: new Abstract: Molecular crystal structure prediction represents a grand challenge in computational chemistry due to large sizes of constituent molecules and complex intra- and intermolecular interactions. While generative modeling has revolutionized structure discovery for molecules, inorganic solids,...
AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models
arXiv:2602.16042v1 Announce Type: new Abstract: As machine learning (ML) continues its rapid expansion, the environmental cost of model training and inference has become a critical societal concern. Existing benchmarks overwhelmingly focus on standard performance metrics such as accuracy, BLEU, or...
Extracting and Analyzing Rail Crossing Behavior Signatures from Videos using Tensor Methods
arXiv:2602.16057v1 Announce Type: new Abstract: Railway crossings present complex safety challenges where driver behavior varies by location, time, and conditions. Traditional approaches analyze crossings individually, limiting the ability to identify shared behavioral patterns across locations. We propose a multi-view tensor...
Can Generative Artificial Intelligence Survive Data Contamination? Theoretical Guarantees under Contaminated Recursive Training
arXiv:2602.16065v1 Announce Type: new Abstract: Generative Artificial Intelligence (AI), such as large language models (LLMs), has become a transformative force across science, industry, and society. As these systems grow in popularity, web data becomes increasingly interwoven with this AI-generated material...
Omni-iEEG: A Large-Scale, Comprehensive iEEG Dataset and Benchmark for Epilepsy Research
arXiv:2602.16072v1 Announce Type: new Abstract: Epilepsy affects over 50 million people worldwide, and one-third of patients suffer drug-resistant seizures where surgery offers the best chance of seizure freedom. Accurate localization of the epileptogenic zone (EZ) relies on intracranial EEG (iEEG)....
Axle Sensor Fusion for Online Continual Wheel Fault Detection in Wayside Railway Monitoring
arXiv:2602.16101v1 Announce Type: new Abstract: Reliable and cost-effective maintenance is essential for railway safety, particularly at the wheel-rail interface, which is prone to wear and failure. Predictive maintenance frameworks increasingly leverage sensor-generated time-series data, yet traditional methods require manual feature...
On the Power of Source Screening for Learning Shared Feature Extractors
arXiv:2602.16125v1 Announce Type: new Abstract: Learning with shared representation is widely recognized as an effective way to separate commonalities from heterogeneity across various heterogeneous sources. Most existing work includes all related data sources via simultaneously training a common feature extractor...
HiPER: Hierarchical Reinforcement Learning with Explicit Credit Assignment for Large Language Model Agents
arXiv:2602.16165v1 Announce Type: new Abstract: Training LLMs as interactive agents for multi-turn decision-making remains challenging, particularly in long-horizon tasks with sparse and delayed rewards, where agents must execute extended sequences of actions before receiving meaningful feedback. Most existing reinforcement learning...
Muon with Spectral Guidance: Efficient Optimization for Scientific Machine Learning
arXiv:2602.16167v1 Announce Type: new Abstract: Physics-informed neural networks and neural operators often suffer from severe optimization difficulties caused by ill-conditioned gradients, multi-scale spectral behavior, and stiffness induced by physical constraints. Recently, the Muon optimizer has shown promise by performing orthogonalized...
Towards Secure and Scalable Energy Theft Detection: A Federated Learning Approach for Resource-Constrained Smart Meters
arXiv:2602.16181v1 Announce Type: new Abstract: Energy theft poses a significant threat to the stability and efficiency of smart grids, leading to substantial economic losses and operational challenges. Traditional centralized machine learning approaches for theft detection require aggregating user data, raising...
Deep TPC: Temporal-Prior Conditioning for Time Series Forecasting
arXiv:2602.16188v1 Announce Type: new Abstract: LLM-for-time series (TS) methods typically treat time shallowly, injecting positional or prompt-based cues once at the input of a largely frozen decoder, which limits temporal reasoning as this information degrades through the layers. We introduce...
Graphon Mean-Field Subsampling for Cooperative Heterogeneous Multi-Agent Reinforcement Learning
arXiv:2602.16196v1 Announce Type: new Abstract: Coordinating large populations of interacting agents is a central challenge in multi-agent reinforcement learning (MARL), where the size of the joint state-action space scales exponentially with the number of agents. Mean-field methods alleviate this burden...
ModalImmune: Immunity Driven Unlearning via Self Destructive Training
arXiv:2602.16197v1 Announce Type: new Abstract: Multimodal systems are vulnerable to partial or complete loss of input channels at deployment, which undermines reliability in real-world settings. This paper presents ModalImmune, a training framework that enforces modality immunity by intentionally and controllably...
Linked Data Classification using Neurochaos Learning
arXiv:2602.16204v1 Announce Type: new Abstract: Neurochaos Learning (NL) has shown promise in recent times over traditional deep learning due to its two key features: ability to learn from small sized training samples, and low compute requirements. In prior work, NL...
Geometric Neural Operators via Lie Group-Constrained Latent Dynamics
arXiv:2602.16209v1 Announce Type: new Abstract: Neural operators offer an effective framework for learning solutions of partial differential equations for many physical systems in a resolution-invariant and data-driven manner. Existing neural operators, however, often suffer from instability in multi-layer iteration and...
Graph neural network for colliding particles with an application to sea ice floe modeling
arXiv:2602.16213v1 Announce Type: new Abstract: This paper introduces a novel approach to sea ice modeling using Graph Neural Networks (GNNs), utilizing the natural graph structure of sea ice, where nodes represent individual ice pieces, and edges model the physical interactions,...
UCTECG-Net: Uncertainty-aware Convolution Transformer ECG Network for Arrhythmia Detection
arXiv:2602.16216v1 Announce Type: new Abstract: Deep learning has improved automated electrocardiogram (ECG) classification, but limited insight into prediction reliability hinders its use in safety-critical settings. This paper proposes UCTECG-Net, an uncertainty-aware hybrid architecture that combines one-dimensional convolutions and Transformer encoders...