Scalable Gaussian process modeling of parametrized spatio-temporal fields
arXiv:2603.00290v1 Announce Type: new Abstract: We introduce a scalable Gaussian process (GP) framework with deep product kernels for data-driven learning of parametrized spatio-temporal fields over fixed or parameter-dependent domains. The proposed framework learns a continuous representation, enabling predictions at arbitrary...
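As a rough illustration of the product-kernel idea, the sketch below builds a separable kernel over space, time, and a scalar parameter and uses it for plain GP regression; the RBF factors, length-scales, and the 2D-space + time + parameter input split are illustrative assumptions, not the paper's deep product kernels or scalability machinery.

```python
import numpy as np

def rbf(a, b, ell):
    """Squared-exponential kernel between row vectors of a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def product_kernel(X1, X2):
    """Separable kernel k = k_space * k_time * k_param.
    Assumed column layout: [x, y] spatial, [t] time, [mu] parameter."""
    ks = rbf(X1[:, :2], X2[:, :2], ell=0.5)    # spatial factor
    kt = rbf(X1[:, 2:3], X2[:, 2:3], ell=1.0)  # temporal factor
    kp = rbf(X1[:, 3:4], X2[:, 3:4], ell=2.0)  # parameter factor
    return ks * kt * kp

# Toy data: a field u(x, y, t; mu) observed at scattered points.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 4))
y = np.sin(4 * X[:, 0]) * np.cos(3 * X[:, 2]) * X[:, 3] + 0.01 * rng.normal(size=200)

# Standard GP regression with the product kernel.
K = product_kernel(X, X) + 1e-4 * np.eye(len(X))
alpha = np.linalg.solve(K, y)

Xstar = rng.uniform(size=(5, 4))               # arbitrary space-time-parameter queries
mean = product_kernel(Xstar, X) @ alpha
print(mean)
```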
Polynomial Surrogate Training for Differentiable Ternary Logic Gate Networks
arXiv:2603.00302v1 Announce Type: new Abstract: Differentiable logic gate networks (DLGNs) learn compact, interpretable Boolean circuits via gradient-based training, but all existing variants are restricted to the 16 two-input binary gates. Extending DLGNs to Ternary Kleene $K_3$ logic and training DTLGNs...
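The following sketch shows one common way a differentiable logic-gate node is relaxed, here adapted to Kleene K3 connectives encoded on [0, 1] (F=0, U=0.5, T=1); the candidate gate set, the softmax-mixture relaxation, and the encoding are assumptions for illustration rather than the paper's DTLGN or surrogate-training construction.

```python
import torch

# Kleene K3 connectives on the unit interval encoding F=0, U=0.5, T=1.
# min / max / 1-x reproduce the K3 truth tables exactly on {0, 0.5, 1}
# and remain (sub)differentiable in between.
def k3_and(a, b): return torch.minimum(a, b)
def k3_or(a, b):  return torch.maximum(a, b)
def k3_not(a):    return 1.0 - a

CANDIDATE_GATES = [
    lambda a, b: k3_and(a, b),
    lambda a, b: k3_or(a, b),
    lambda a, b: k3_and(a, k3_not(b)),
    lambda a, b: k3_not(k3_or(a, b)),
]  # illustrative subset, not the paper's full gate set

class SoftTernaryGate(torch.nn.Module):
    """One node: a learned categorical mixture over candidate K3 gates."""
    def __init__(self):
        super().__init__()
        self.logits = torch.nn.Parameter(torch.zeros(len(CANDIDATE_GATES)))

    def forward(self, a, b):
        w = torch.softmax(self.logits, dim=0)
        outs = torch.stack([g(a, b) for g in CANDIDATE_GATES], dim=0)
        return (w[:, None] * outs).sum(dim=0)   # soft mixture; harden at inference

gate = SoftTernaryGate()
a = torch.tensor([0.0, 0.5, 1.0], requires_grad=True)
b = torch.tensor([1.0, 0.5, 0.0])
out = gate(a, b)
out.sum().backward()        # gradients flow to both inputs and gate logits
print(out, gate.logits.grad)
```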
Neural Operators Can Discover Functional Clusters
arXiv:2602.23528v1 Announce Type: new Abstract: Operator learning is reshaping scientific computing by amortizing inference across infinite families of problems. While neural operators (NOs) are increasingly well understood for regression, far less is known for classification and its unsupervised analogue: clustering....
Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents
arXiv:2602.23556v1 Announce Type: new Abstract: Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed, training requires frequent irregular communication that stalls forward progress. Moreover, fetched data...
Normalisation and Initialisation Strategies for Graph Neural Networks in Blockchain Anomaly Detection
arXiv:2602.23599v1 Announce Type: new Abstract: Graph neural networks (GNNs) offer a principled approach to financial fraud detection by jointly learning from node features and transaction graph topology. However, their effectiveness on real-world anti-money laundering (AML) benchmarks depends critically on training...
Intrinsic Lorentz Neural Network
arXiv:2602.23981v1 Announce Type: new Abstract: Real-world data frequently exhibit latent hierarchical structures, which can be naturally represented by hyperbolic geometry. Although recent hyperbolic neural networks have demonstrated promising results, many existing architectures remain partially intrinsic, mixing Euclidean operations with hyperbolic...
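For context, a minimal sketch of fully intrinsic hyperboloid (Lorentz-model) primitives: the Minkowski inner product, the exponential map at the origin, and the geodesic distance. Curvature -1 and the origin-based exponential map are simplifying assumptions; the paper's layer definitions are not reproduced here.

```python
import numpy as np

def minkowski_dot(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + sum_i xi*yi."""
    return -x[..., 0] * y[..., 0] + (x[..., 1:] * y[..., 1:]).sum(-1)

def exp_map_origin(v_spatial):
    """Map a tangent vector at the hyperboloid origin o = (1, 0, ..., 0)
    (tangent vectors at o have zero time component) onto the manifold."""
    norm = np.maximum(np.linalg.norm(v_spatial, axis=-1, keepdims=True), 1e-12)
    time = np.cosh(norm)
    space = np.sinh(norm) * v_spatial / norm
    return np.concatenate([time, space], axis=-1)

def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid (curvature -1)."""
    inner = np.clip(-minkowski_dot(x, y), 1.0, None)
    return np.arccosh(inner)

v = np.array([[0.3, -0.2], [1.0, 0.5]])     # tangent vectors at the origin
pts = exp_map_origin(v)
print(minkowski_dot(pts, pts))               # ~ -1: points lie on the hyperboloid
print(lorentz_distance(pts[0], pts[1]))
```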
RepSPD: Enhancing SPD Manifold Representation in EEGs via Dynamic Graphs
arXiv:2602.22981v1 Announce Type: new Abstract: Decoding brain activity from electroencephalography (EEG) is crucial for neuroscience and clinical applications. Among recent advances in deep learning for EEG, geometric learning stands out, as its theoretical underpinnings on symmetric positive definite (SPD) manifolds allow...
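As background, the sketch below shows the standard Riemannian EEG pipeline of forming per-trial SPD channel covariances and mapping them to a tangent space with the matrix logarithm; the shrinkage value, channel/sample counts, and log-Euclidean vectorization are illustrative and unrelated to the paper's dynamic-graph enhancement.

```python
import numpy as np
from scipy.linalg import logm

def spd_covariance(trial, shrinkage=1e-3):
    """Channel covariance of one EEG trial (channels x samples), regularized to stay SPD."""
    X = trial - trial.mean(axis=1, keepdims=True)
    C = X @ X.T / X.shape[1]
    return C + shrinkage * np.trace(C) / len(C) * np.eye(len(C))

def log_euclidean_vector(C):
    """Tangent-space (log-Euclidean) vectorization: upper triangle of the matrix log."""
    L = logm(C).real
    iu = np.triu_indices(len(C))
    return L[iu]

rng = np.random.default_rng(0)
trials = rng.normal(size=(10, 8, 256))            # 10 trials, 8 channels, 256 samples
feats = np.stack([log_euclidean_vector(spd_covariance(t)) for t in trials])
print(feats.shape)                                # (10, 36) features for a downstream classifier
```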
Orthogonal Weight Modification Enhances Learning Scalability and Convergence Efficiency without Gradient Backpropagation
arXiv:2602.22259v1 Announce Type: new Abstract: Owing to the substantial computational cost of backpropagation (BP), non-BP methods have emerged as attractive alternatives for efficient learning on emerging neuromorphic systems. However, existing non-BP approaches still face critical challenges in efficiency and scalability. Inspired...
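For orientation, a sketch of the classical orthogonal-weight-modification projector update (the recursive RLS-style form), which constrains weight changes to directions orthogonal to previously processed inputs; the regularizer value and the way the projector is applied to a weight update are assumptions, and the paper's non-BP training rule is not reproduced.

```python
import numpy as np

def owm_update_projector(P, x, alpha=1e-3):
    """Recursive orthogonal-projector update: after consolidating input x,
    P approximately projects onto the subspace orthogonal to all inputs seen
    so far, so later weight changes barely disturb earlier responses."""
    Px = P @ x
    return P - np.outer(Px, Px) / (alpha + x @ Px)

rng = np.random.default_rng(0)
d = 16
P = np.eye(d)                          # start with no constraints
seen = rng.normal(size=(5, d))         # inputs already consolidated
for x in seen:
    P = owm_update_projector(P, x)

raw_delta_w = rng.normal(size=(4, d))  # some proposed weight change (rows = output units)
delta_w = raw_delta_w @ P              # modified update, ~orthogonal to seen inputs
print(np.abs(raw_delta_w @ seen.T).max(), "->", np.abs(delta_w @ seen.T).max())
```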
X-REFINE: XAI-based RElevance input-Filtering and archItecture fiNe-tuning for channel Estimation
arXiv:2602.22277v1 Announce Type: new Abstract: AI-native architectures are vital for 6G wireless communications. The black-box nature and high complexity of deep learning models employed in critical applications, such as channel estimation, limit their practical deployment. While perturbation-based XAI solutions offer...
Global River Forecasting with a Topology-Informed AI Foundation Model
arXiv:2602.22293v1 Announce Type: new Abstract: River systems operate as inherently interconnected continuous networks, meaning river hydrodynamic simulation ought to be a systemic process. However, widespread hydrology data scarcity often restricts data-driven forecasting to isolated predictions. To achieve systemic simulation and...
Multi-dimensional Assessment and Explainable Feedback for Counselor Responses to Client Resistance in Text-based Counseling with LLMs
arXiv:2602.21638v1 Announce Type: new Abstract: Effectively addressing client resistance is a sophisticated clinical skill in psychological counseling, yet practitioners often lack timely and scalable supervisory feedback to refine their approaches. Although current NLP research has examined overall counseling quality and...
When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training
arXiv:2602.21454v1 Announce Type: new Abstract: Recurrent neural networks (RNNs) can be interpreted as discrete-time state-space models, where the state evolution corresponds to an infinite-impulse-response (IIR) filtering operation governed by both feedforward weights and recurrent poles. While, in principle, all parameters...
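A minimal sketch of the fixed-pole idea, assuming a diagonal linear state-space (an IIR filter bank) whose poles are frozen while only the feedforward input/output weights are trained; the pole placement, nonlinearity, and the paper's real-time online training procedure are illustrative choices.

```python
import torch

class FixedPoleRNN(torch.nn.Module):
    """Diagonal recurrence h_t = a * h_{t-1} + W_in x_t, y_t = W_out tanh(h_t).
    The recurrent poles `a` are fixed (a buffer, not a parameter); only the
    feedforward weights W_in and W_out receive gradients."""
    def __init__(self, d_in, d_state, d_out):
        super().__init__()
        # Fixed, stable poles inside the unit circle (illustrative spacing).
        self.register_buffer("a", torch.linspace(0.5, 0.99, d_state))
        self.w_in = torch.nn.Linear(d_in, d_state, bias=False)
        self.w_out = torch.nn.Linear(d_state, d_out)

    def forward(self, x):                       # x: (T, d_in)
        h = torch.zeros(self.a.shape[0])
        ys = []
        for x_t in x:
            h = self.a * h + self.w_in(x_t)     # IIR filter bank with fixed poles
            ys.append(self.w_out(torch.tanh(h)))
        return torch.stack(ys)

model = FixedPoleRNN(d_in=3, d_state=16, d_out=1)
y = model(torch.randn(20, 3))
(y ** 2).mean().backward()                      # gradients reach w_in / w_out, not the poles
print(y.shape, model.a.requires_grad)
```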
From Basis to Basis: Gaussian Particle Representation for Interpretable PDE Operators
arXiv:2602.21551v1 Announce Type: new Abstract: Learning PDE dynamics for fluids increasingly relies on neural operators and Transformer-based models, yet these approaches often lack interpretability and struggle with localized, high-frequency structures while incurring quadratic cost in spatial samples. We propose representing...
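The sketch below illustrates a field represented as a weighted sum of Gaussian particles, each an interpretable localized basis function; isotropic particles and the 2D domain are simplifying assumptions, and the operator-learning machinery of the paper is omitted.

```python
import numpy as np

def gaussian_particle_field(x, centers, scales, weights):
    """Evaluate u(x) = sum_i w_i * exp(-||x - c_i||^2 / (2 s_i^2)).
    Moving centers, scales, and weights over time moves the represented field."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)   # (n_query, n_particles)
    basis = np.exp(-0.5 * d2 / scales[None, :] ** 2)
    return basis @ weights

rng = np.random.default_rng(0)
centers = rng.uniform(size=(32, 2))       # particle positions in the 2D domain
scales = np.full(32, 0.08)                # isotropic particle widths
weights = rng.normal(size=32)             # particle amplitudes

grid = np.stack(np.meshgrid(np.linspace(0, 1, 64), np.linspace(0, 1, 64)), -1).reshape(-1, 2)
u = gaussian_particle_field(grid, centers, scales, weights)
print(u.reshape(64, 64).shape)
```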
Enhancing Hate Speech Detection on Social Media: A Comparative Analysis of Machine Learning Models and Text Transformation Approaches
arXiv:2602.20634v1 Announce Type: new Abstract: The proliferation of hate speech on social media platforms has necessitated the development of effective detection and moderation tools. This study evaluates the efficacy of various machine learning models in identifying hate speech and offensive...
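A minimal sketch of the kind of comparison such studies run, assuming a TF-IDF representation and two linear classifiers; the toy corpus, labels, and model choices are placeholders rather than the paper's experimental setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Tiny illustrative corpus; a real study would use a labeled hate-speech dataset.
texts = ["I hate you and your kind", "have a wonderful day", "you people are the worst",
         "thanks for the helpful answer", "get out of this country", "see you at the game"]
labels = [1, 0, 1, 0, 1, 0]

models = {
    "logreg": LogisticRegression(max_iter=1000),
    "linear_svm": LinearSVC(),
}

for name, clf in models.items():
    pipe = make_pipeline(TfidfVectorizer(lowercase=True, ngram_range=(1, 2)), clf)
    scores = cross_val_score(pipe, texts, labels, cv=3)   # small cv just for illustration
    print(name, scores.mean())
```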
GeoPT: Scaling Physics Simulation via Lifted Geometric Pre-Training
arXiv:2602.20399v1 Announce Type: new Abstract: Neural simulators promise efficient surrogates for physics simulation, but scaling them is bottlenecked by the prohibitive cost of generating high-fidelity training data. Pre-training on abundant off-the-shelf geometries offers a natural alternative, yet faces a fundamental...
CITED: A Decision Boundary-Aware Signature for GNNs Towards Model Extraction Defense
arXiv:2602.20418v1 Announce Type: new Abstract: Graph neural networks (GNNs) have demonstrated superior performance in various applications, such as recommendation systems and financial risk management. However, deploying large-scale GNN models locally is particularly challenging for users, as it requires significant computational...
PerSoMed: A Large-Scale Balanced Dataset for Persian Social Media Text Classification
arXiv:2602.19333v1 Announce Type: new Abstract: This research introduces the first large-scale, well-balanced Persian social media text classification dataset, specifically designed to address the lack of comprehensive resources in this domain. The dataset comprises 36,000 posts across nine categories (Economic, Artistic,...
Information-Guided Noise Allocation for Efficient Diffusion Training
arXiv:2602.18647v1 Announce Type: new Abstract: Training diffusion models typically relies on manually tuned noise schedules, which can waste computation on weakly informative noise regions and limit transfer across datasets, resolutions, and representations. We revisit noise schedule allocation through an information-theoretic...
L2G-Net: Local to Global Spectral Graph Neural Networks via Cauchy Factorizations
arXiv:2602.18837v1 Announce Type: new Abstract: Despite their theoretical advantages, spectral methods based on the graph Fourier transform (GFT) are seldom used in graph neural networks (GNNs) due to the cost of computing the eigenbasis and the lack of vertex-domain locality...
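As a rough illustration of eigendecomposition-free spectral filtering, the sketch below applies a rational (resolvent-style) filter sum_k w_k (L + s_k I)^{-1} X by solving sparse linear systems on a toy graph; the shifts, weights, and ring graph are assumptions, and the paper's Cauchy factorization and learned coefficients are not reproduced.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def normalized_laplacian(A):
    """Symmetric normalized Laplacian I - D^{-1/2} A D^{-1/2}."""
    d = np.asarray(A.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    return sp.eye(A.shape[0]) - d_inv_sqrt @ A @ d_inv_sqrt

def resolvent_filter(L, X, shifts, weights):
    """Apply sum_k w_k (L + s_k I)^{-1} X via sparse LU solves, giving a
    global receptive field without ever forming the eigenbasis."""
    out = np.zeros_like(X)
    n = L.shape[0]
    for s, w in zip(shifts, weights):
        lu = splu((L + s * sp.eye(n)).tocsc())
        out += w * lu.solve(X)
    return out

# Toy ring graph with 100 nodes and random node features.
n = 100
rows, cols = np.arange(n), (np.arange(n) + 1) % n
A = sp.coo_matrix((np.ones(n), (rows, cols)), shape=(n, n))
A = ((A + A.T) > 0).astype(float)
L = normalized_laplacian(A.tocsr())
X = np.random.default_rng(0).normal(size=(n, 8))
Y = resolvent_filter(L, X, shifts=[0.5, 2.0], weights=[1.0, -0.3])
print(Y.shape)
```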
SPQ: An Ensemble Technique for Large Language Model Compression
arXiv:2602.18420v1 Announce Type: new Abstract: This study presents an ensemble technique, SPQ (SVD-Pruning-Quantization), for large language model (LLM) compression that combines variance-retained singular value decomposition (SVD), activation-based pruning, and post-training linear quantization. Each component targets a different source of inefficiency:...
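The sketch below strings the three named ingredients together on a single random weight matrix: energy-thresholded SVD, magnitude pruning (standing in for activation-based pruning, which would additionally use calibration activations), and symmetric post-training linear quantization; the thresholds, bit-width, and the decision to prune and quantize the SVD factors are illustrative assumptions.

```python
import numpy as np

def low_rank_svd(W, energy=0.95):
    """Keep the smallest rank whose singular values retain `energy` of total variance."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), energy) + 1
    return U[:, :k] @ np.diag(s[:k]), Vt[:k]        # W ~ A @ B with rank k

def magnitude_prune(W, sparsity=0.5):
    """Zero out the smallest-magnitude entries."""
    thresh = np.quantile(np.abs(W), sparsity)
    return np.where(np.abs(W) >= thresh, W, 0.0)

def linear_quantize(W, bits=8):
    """Symmetric per-tensor linear quantization to signed integers plus a scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(W)) / qmax
    return np.round(W / scale).astype(np.int8), scale

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512))
A, B = low_rank_svd(W)                                        # 1) low-rank factorization
A, B = magnitude_prune(A), magnitude_prune(B)                 # 2) pruning of the factors
(Aq, sa), (Bq, sb) = linear_quantize(A), linear_quantize(B)   # 3) post-training quantization
W_hat = (Aq * sa) @ (Bq * sb)
print(np.linalg.norm(W - W_hat) / np.linalg.norm(W))          # relative reconstruction error
```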
On the "Induction Bias" in Sequence Models
arXiv:2602.18333v1 Announce Type: cross Abstract: Despite the remarkable practical success of transformer-based language models, recent work has raised concerns about their ability to perform state tracking. In particular, a growing body of literature has shown this limitation primarily through failures...
S-PRESSO: Ultra Low Bitrate Sound Effect Compression With Diffusion Autoencoders And Offline Quantization
arXiv:2602.15082v1 Announce Type: cross Abstract: Neural audio compression models have recently achieved extreme compression rates, enabling efficient latent generative modeling. Conversely, latent generative models have been applied to compression, pushing the limits of continuous and discrete approaches. However, existing methods...
ExpertWeaver: Unlocking the Inherent MoE in Dense LLMs with GLU Activation Patterns
arXiv:2602.15521v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) effectively scales model capacity while preserving computational efficiency through sparse expert activation. However, training high-quality MoEs from scratch is prohibitively expensive. A promising alternative is to convert pretrained dense models into sparse MoEs....
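A speculative sketch of the general idea of reading experts out of GLU activation patterns: record gate activations on a calibration set, cluster neurons by their activation profiles, and route tokens to the clusters they excite most; the clustering method, expert count, and routing score are assumptions and may differ substantially from ExpertWeaver's actual construction.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_tokens, d_ff = 2048, 512

# Stand-in for recorded GLU gate activations of one dense FFN layer over a
# calibration set: rows are tokens, columns are intermediate neurons.
gate_acts = np.abs(rng.normal(size=(n_tokens, d_ff)))

# Group neurons whose activation patterns co-fire into candidate "experts".
n_experts = 8
neuron_profiles = gate_acts.T                       # one activation profile per neuron
labels = KMeans(n_clusters=n_experts, n_init=10, random_state=0).fit_predict(neuron_profiles)
experts = [np.where(labels == e)[0] for e in range(n_experts)]
print([len(e) for e in experts])                    # neuron index sets per expert

# A router could then activate only the experts whose neurons a token excites most.
token = gate_acts[0]
expert_scores = np.array([token[idx].mean() for idx in experts])
print("top-2 experts for token 0:", np.argsort(expert_scores)[::-1][:2])
```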
NeuroSleep: Neuromorphic Event-Driven Single-Channel EEG Sleep Staging for Edge-Efficient Sensing
arXiv:2602.15888v1 Announce Type: cross Abstract: Reliable, continuous neural sensing on wearable edge platforms is fundamental to long-term health monitoring; however, for electroencephalography (EEG)-based sleep monitoring, dense high-frequency processing is often computationally prohibitive under tight energy budgets. To address this bottleneck,...
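For context, a sketch of send-on-delta event encoding, one standard way a dense EEG stream becomes a sparse event stream for neuromorphic processing; the threshold and toy signal are illustrative, and the paper's staging network is not shown.

```python
import numpy as np

def delta_modulation_events(signal, threshold):
    """Send-on-delta encoding: emit a +1/-1 event whenever the signal moves
    more than `threshold` away from the level at the last emitted event.
    Between events nothing is transmitted or processed, which is where the
    energy savings on edge hardware come from."""
    events = []
    ref = signal[0]
    for t, x in enumerate(signal):
        while x - ref >= threshold:
            ref += threshold
            events.append((t, +1))
        while ref - x >= threshold:
            ref -= threshold
            events.append((t, -1))
    return events

rng = np.random.default_rng(0)
t = np.linspace(0, 4, 1024)
eeg = np.sin(2 * np.pi * 1.5 * t) + 0.05 * rng.normal(size=t.size)   # toy single-channel EEG
ev = delta_modulation_events(eeg, threshold=0.1)
print(len(ev), "events vs", eeg.size, "dense samples")
```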
Representation Collapse in Machine Translation Through the Lens of Angular Dispersion
arXiv:2602.17287v1 Announce Type: new Abstract: Modern neural translation models based on the Transformer architecture are known for their high performance, particularly when trained on high-resource datasets. A standard next-token prediction training strategy, while widely adopted in practice, may lead to...
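A minimal sketch of one way to quantify angular dispersion, here as the mean pairwise angle between hidden vectors; the paper's exact statistic and the layers it is measured on may differ.

```python
import numpy as np

def angular_dispersion(H):
    """Mean pairwise angle (radians) between row vectors of H.
    Low dispersion (vectors pointing the same way) is one symptom of
    representation collapse; high dispersion means directions are spread out."""
    Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
    cos = np.clip(Hn @ Hn.T, -1.0, 1.0)
    iu = np.triu_indices(len(H), k=1)
    return np.arccos(cos[iu]).mean()

rng = np.random.default_rng(0)
spread = rng.normal(size=(128, 64))                                   # well-dispersed
collapsed = np.ones((128, 64)) + 0.01 * rng.normal(size=(128, 64))    # nearly one direction
print(angular_dispersion(spread), angular_dispersion(collapsed))
```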
Entropy-Based Data Selection for Language Models
arXiv:2602.17465v1 Announce Type: new Abstract: Modern language models (LMs) increasingly require two critical resources: compute and data. Data selection techniques can effectively reduce the amount of training data required for fine-tuning LMs. However, their effectiveness is closely related...
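The sketch below shows the generic entropy-based selection step, assuming predictive entropy computed from a model's class probabilities and a fixed selection budget; whether high- or low-entropy examples are kept is exposed as a flag, since that choice is part of what such methods study.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy of each example's predicted class distribution."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def select_by_entropy(probs, budget, mode="high"):
    """Keep `budget` examples with the highest (or lowest) predictive entropy."""
    H = predictive_entropy(probs)
    order = np.argsort(H)
    return order[-budget:] if mode == "high" else order[:budget]

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 5))                      # stand-in for a model's scores
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
keep = select_by_entropy(probs, budget=100)
print(keep.shape, predictive_entropy(probs)[keep].mean())
```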
Sink-Aware Pruning for Diffusion Language Models
arXiv:2602.17664v1 Announce Type: new Abstract: Diffusion Language Models (DLMs) incur high inference cost due to iterative denoising, motivating efficient pruning. Existing pruning heuristics, largely inherited from autoregressive (AR) LLMs, typically preserve attention sink tokens because AR sinks serve as stable...
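For orientation, a sketch of how attention-sink positions are often identified: keys receiving a disproportionate share of attention mass, averaged over heads and queries. The synthetic attention tensor and the top-k cut are illustrative, and the paper's sink-aware pruning criterion for DLMs is not reproduced.

```python
import numpy as np

def sink_scores(attn):
    """Average attention mass each key position receives over heads and queries.
    Positions with unusually high mass behave as 'sinks'."""
    # attn: (heads, queries, keys), rows sum to 1
    return attn.mean(axis=(0, 1))

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 64, 64))
logits[:, :, 0] += 4.0                       # make position 0 behave like a sink
attn = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
scores = sink_scores(attn)
sinks = np.argsort(scores)[::-1][:2]
print("candidate sink positions:", sinks, "mass:", scores[sinks])
```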
Machine Learning Argument of Latitude Error Model for LEO Satellite Orbit and Covariance Correction
arXiv:2602.16764v1 Announce Type: new Abstract: Low Earth orbit (LEO) satellites are leveraged to support new position, navigation, and timing (PNT) service alternatives to GNSS. These alternatives require accurate propagation of satellite position and velocity with a realistic quantification of uncertainty....
Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming
arXiv:2602.16944v1 Announce Type: new Abstract: This work introduces a verification framework that provides both sound and complete guarantees for data poisoning attacks during neural network training. We formulate adversarial data manipulation, model training, and test-time evaluation in a single mixed-integer...