A Dataset for Named Entity Recognition and Relation Extraction from Art-historical Image Descriptions
arXiv:2602.19133v1 Announce Type: new Abstract: This paper introduces FRAME (Fine-grained Recognition of Art-historical Metadata and Entities), a manually annotated dataset of art-historical image descriptions for Named Entity Recognition (NER) and Relation Extraction (RE). Descriptions were collected from museum catalogs, auction...
PerSoMed: A Large-Scale Balanced Dataset for Persian Social Media Text Classification
arXiv:2602.19333v1 Announce Type: new Abstract: This research introduces the first large-scale, well-balanced Persian social media text classification dataset, specifically designed to address the lack of comprehensive resources in this domain. The dataset comprises 36,000 posts across nine categories (Economic, Artistic,...
Non-Interfering Weight Fields: Treating Model Parameters as a Continuously Extensible Function
arXiv:2602.18628v1 Announce Type: new Abstract: Large language models store all learned knowledge in a single, fixed weight vector. Teaching a model new capabilities requires modifying those same weights, inevitably degrading previously acquired knowledge. This fundamental limitation, known as catastrophic forgetting,...
Information-Guided Noise Allocation for Efficient Diffusion Training
arXiv:2602.18647v1 Announce Type: new Abstract: Training diffusion models typically relies on manually tuned noise schedules, which can waste computation on weakly informative noise regions and limit transfer across datasets, resolutions, and representations. We revisit noise schedule allocation through an information-theoretic...
Communication-Efficient Personalized Adaptation via Federated-Local Model Merging
arXiv:2602.18658v1 Announce Type: new Abstract: Parameter-efficient fine-tuning methods, such as LoRA, offer a practical way to adapt large vision and language models to client tasks. However, this becomes particularly challenging under task-level heterogeneity in federated deployments. In this regime, personalization...
HEHRGNN: A Unified Embedding Model for Knowledge Graphs with Hyperedges and Hyper-Relational Edges
arXiv:2602.18897v1 Announce Type: new Abstract: Knowledge Graph(KG) has gained traction as a machine-readable organization of real-world knowledge for analytics using artificial intelligence systems. Graph Neural Network(GNN), is proven to be an effective KG embedding technique that enables various downstream tasks...
Neural Prior Estimation: Learning Class Priors from Latent Representations
arXiv:2602.17853v1 Announce Type: new Abstract: Class imbalance induces systematic bias in deep neural networks by imposing a skewed effective class prior. This work introduces the Neural Prior Estimator (NPE), a framework that learns feature-conditioned log-prior estimates from latent representations. NPE...
Causal Neighbourhood Learning for Invariant Graph Representations
arXiv:2602.17934v1 Announce Type: new Abstract: Graph data often contain noisy and spurious correlations that mask the true causal relationships, which are essential for enabling graph models to make predictions based on the underlying causal structure of the data. Dependence on...
GRAFNet: Multiscale Retinal Processing via Guided Cortical Attention Feedback for Enhancing Medical Image Polyp Segmentation
arXiv:2602.15072v1 Announce Type: cross Abstract: Accurate polyp segmentation in colonoscopy is essential for cancer prevention but remains challenging due to: (1) high morphological variability (from flat to protruding lesions), (2) strong visual similarity to normal structures such as folds and...
ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns
arXiv:2602.15521v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) effectively scales model capacity while preserving computational efficiency through sparse expert activation. However, training high-quality MoEs from scratch is prohibitively expensive. A promising alternative is to convert pretrained dense models into sparse MoEs....
Beyond Static Pipelines: Learning Dynamic Workflows for Text-to-SQL
arXiv:2602.15564v1 Announce Type: new Abstract: Text-to-SQL has recently achieved impressive progress, yet remains difficult to apply effectively in real-world scenarios. This gap stems from the reliance on single static workflows, fundamentally limiting scalability to out-of-distribution and long-tail scenarios. Instead of...
NeuroSleep: Neuromorphic Event-Driven Single-Channel EEG Sleep Staging for Edge-Efficient Sensing
arXiv:2602.15888v1 Announce Type: cross Abstract: Reliable, continuous neural sensing on wearable edge platforms is fundamental to long-term health monitoring; however, for electroencephalography (EEG)-based sleep monitoring, dense high-frequency processing is often computationally prohibitive under tight energy budgets. To address this bottleneck,...
Contextuality from Single-State Representations: An Information-Theoretic Principle for Adaptive Intelligence
arXiv:2602.16716v1 Announce Type: new Abstract: Adaptive systems often operate across multiple contexts while reusing a fixed internal state space due to constraints on memory, representation, or physical resources. Such single-state reuse is ubiquitous in natural and artificial intelligence, yet its...
Mind the GAP: Text Safety Does Not Transfer to Tool-Call Safety in LLM Agents
arXiv:2602.16943v1 Announce Type: new Abstract: Large language models deployed as agents increasingly interact with external systems through tool calls--actions with real-world consequences that text outputs alone do not carry. Safety evaluations, however, overwhelmingly measure text-level refusal behavior, leaving a critical...
TopoFlow: Physics-guided Neural Networks for high-resolution air quality prediction
arXiv:2602.16821v1 Announce Type: new Abstract: We propose TopoFlow (Topography-aware pollutant Flow learning), a physics-guided neural network for efficient, high-resolution air quality prediction. To explicitly embed physical processes into the learning framework, we identify two critical factors governing pollutant dynamics: topography...
What is the Value of Censored Data? An Exact Analysis for the Data-driven Newsvendor
arXiv:2602.16842v1 Announce Type: new Abstract: We study the offline data-driven newsvendor problem with censored demand data. In contrast to prior works where demand is fully observed, we consider the setting where demand is censored at the inventory level and only...
AdvSynGNN: Structure-Adaptive Graph Neural Nets via Adversarial Synthesis and Self-Corrective Propagation
arXiv:2602.17071v1 Announce Type: new Abstract: Graph neural networks frequently encounter significant performance degradation when confronted with structural noise or non-homophilous topologies. To address these systemic vulnerabilities, we present AdvSynGNN, a comprehensive architecture designed for resilient node-level representation learning. The proposed...
Adam Improves Muon: Adaptive Moment Estimation with Orthogonalized Momentum
arXiv:2602.17080v1 Announce Type: new Abstract: Efficient stochastic optimization typically integrates an update direction that performs well in the deterministic regime with a mechanism adapting to stochastic perturbations. While Adam uses adaptive moment estimates to promote stability, Muon utilizes the weight...
A Koopman-Bayesian Framework for High-Fidelity, Perceptually Optimized Haptic Surgical Simulation
arXiv:2602.15834v1 Announce Type: new Abstract: We introduce a unified framework that combines nonlinear dynamics, perceptual psychophysics and high frequency haptic rendering to enhance realism in surgical simulation. The interaction of the surgical device with soft tissue is elevated to an...
Adaptive Semi-Supervised Training of P300 ERP-BCI Speller System with Minimum Calibration Effort
arXiv:2602.15955v1 Announce Type: new Abstract: A P300 ERP-based Brain-Computer Interface (BCI) speller is an assistive communication tool. It searches for the P300 event-related potential (ERP) elicited by target stimuli, distinguishing it from the neural responses to non-target stimuli embedded in...
Geometry-Aware Uncertainty Quantification via Conformal Prediction on Manifolds
arXiv:2602.16015v1 Announce Type: new Abstract: Conformal prediction provides distribution-free coverage guaranties for regression; yet existing methods assume Euclidean output spaces and produce prediction regions that are poorly calibrated when responses lie on Riemannian manifolds. We propose \emph{adaptive geodesic conformal prediction},...
Muon with Spectral Guidance: Efficient Optimization for Scientific Machine Learning
arXiv:2602.16167v1 Announce Type: new Abstract: Physics-informed neural networks and neural operators often suffer from severe optimization difficulties caused by ill-conditioned gradients, multi-scale spectral behavior, and stiffness induced by physical constraints. Recently, the Muon optimizer has shown promise by performing orthogonalized...
ModalImmune: Immunity Driven Unlearning via Self Destructive Training
arXiv:2602.16197v1 Announce Type: new Abstract: Multimodal systems are vulnerable to partial or complete loss of input channels at deployment, which undermines reliability in real-world settings. This paper presents ModalImmune, a training framework that enforces modality immunity by intentionally and controllably...
Multi-Class Boundary Extraction from Implicit Representations
arXiv:2602.16217v1 Announce Type: new Abstract: Surface extraction from implicit neural representations modelling a single class surface is a well-known task. However, there exist no surface extraction methods from an implicit representation of multiple classes that guarantee topological correctness and no...
Avey-B
arXiv:2602.15814v1 Announce Type: new Abstract: Compact pretrained bidirectional encoders remain the backbone of industrial NLP under tight compute and memory budgets. Their effectiveness stems from self-attention's ability to deliver high-quality bidirectional contextualization with sequence-level parallelism, as popularized by BERT-style architectures....
Fractional-Order Federated Learning
arXiv:2602.15380v1 Announce Type: new Abstract: Federated learning (FL) allows remote clients to train a global model collaboratively while protecting client privacy. Despite its privacy-preserving benefits, FL has significant drawbacks, including slow convergence, high communication cost, and non-independent-and-identically-distributed (non-IID) data. In...
On the Geometric Coherence of Global Aggregation in Federated GNN
arXiv:2602.15510v1 Announce Type: new Abstract: Federated Learning (FL) enables distributed training across multiple clients without centralized data sharing, while Graph Neural Networks (GNNs) model relational data through message passing. In federated GNN settings, client graphs often exhibit heterogeneous structural and...
From Scarcity to Scale: A Release-Level Analysis of the Pashto Common Voice Dataset
arXiv:2602.14062v1 Announce Type: new Abstract: Large, openly licensed speech datasets are essential for building automatic speech recognition (ASR) systems, yet many widely spoken languages remain underrepresented in public resources. Pashto, spoken by more than 60 million people, has historically lacked...
Exploring the Performance of ML/DL Architectures on the MNIST-1D Dataset
arXiv:2602.13348v1 Announce Type: new Abstract: Small datasets like MNIST have historically been instrumental in advancing machine learning research by providing a controlled environment for rapid experimentation and model evaluation. However, their simplicity often limits their utility for distinguishing between advanced...