Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents
arXiv:2602.23556v1 Announce Type: new Abstract: Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed, training requires frequent irregular communication that stalls forward progress. Moreover, fetched data...
Dynamics of Learning under User Choice: Overspecialization and Peer-Model Probing
arXiv:2602.23565v1 Announce Type: new Abstract: In many economically relevant contexts where machine learning is deployed, multiple platforms obtain data from the same pool of users, each of whom selects the platform that best serves them. Prior work in this setting...
SDMixer: Sparse Dual-Mixer for Time Series Forecasting
arXiv:2602.23581v1 Announce Type: new Abstract: Multivariate time series forecasting is widely applied in fields such as transportation, energy, and finance. However, the data commonly suffers from issues of multi-scale characteristics, weak correlations, and noise interference, which limit the predictive performance...
Normalisation and Initialisation Strategies for Graph Neural Networks in Blockchain Anomaly Detection
arXiv:2602.23599v1 Announce Type: new Abstract: Graph neural networks (GNNs) offer a principled approach to financial fraud detection by jointly learning from node features and transaction graph topology. However, their effectiveness on real-world anti-money laundering (AML) benchmarks depends critically on training...
When Does Multimodal Learning Help in Healthcare? A Benchmark on EHR and Chest X-Ray Fusion
arXiv:2602.23614v1 Announce Type: new Abstract: Machine learning holds promise for advancing clinical decision support, yet it remains unclear when multimodal learning truly helps in practice, particularly under modality missingness and fairness constraints. In this work, we conduct a systematic benchmark...
FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation
arXiv:2602.23636v1 Announce Type: new Abstract: Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a fixed definition of harmfulness. In practice, enforcement strictness -...
Disentangled Mode-Specific Representations for Tensor Time Series via Contrastive Learning
arXiv:2602.23663v1 Announce Type: new Abstract: Multi-mode tensor time series (TTS) can be found in many domains, such as search engines and environmental monitoring systems. Learning representations of a TTS benefits various applications, but it is also challenging since the complexities...
MAGE: Multi-scale Autoregressive Generation for Offline Reinforcement Learning
arXiv:2602.23770v1 Announce Type: new Abstract: Generative models have gained significant traction in offline reinforcement learning (RL) due to their ability to model complex trajectory distributions. However, existing generation-based approaches still struggle with long-horizon tasks characterized by sparse rewards. Some hierarchical...
TradeFM: A Generative Foundation Model for Trade-flow and Market Microstructure
arXiv:2602.23784v1 Announce Type: new Abstract: Foundation models have transformed domains from language to genomics by learning general-purpose representations from large-scale, heterogeneous data. We introduce TradeFM, a 524M-parameter generative Transformer that brings this paradigm to market microstructure, learning directly from billions...
GRAIL: Post-hoc Compensation by Linear Reconstruction for Compressed Networks
arXiv:2602.23795v1 Announce Type: new Abstract: Structured deep model compression methods are hardware-friendly and substantially reduce memory and inference costs. However, under aggressive compression, the resulting accuracy degradation often necessitates post-compression finetuning, which can be impractical due to missing labeled data...
MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models
arXiv:2602.23798v1 Announce Type: new Abstract: Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set. To address this dual non-disclosure constraint, we propose MPU,...
Actor-Critic Pretraining for Proximal Policy Optimization
arXiv:2602.23804v1 Announce Type: new Abstract: Reinforcement learning (RL) actor-critic algorithms enable autonomous learning but often require a large number of environment interactions, which limits their applicability in robotics. Leveraging expert data can reduce the number of required environment interactions. A...
Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies
arXiv:2602.23811v1 Announce Type: new Abstract: We investigate the theoretical aspects of offline reinforcement learning (RL) under general function approximation. While prior works (e.g., Xie et al., 2021) have established the theoretical foundations of learning a good policy from offline data...
Inferring Chronic Treatment Onset from ePrescription Data: A Renewal Process Approach
arXiv:2602.23824v1 Announce Type: new Abstract: Longitudinal electronic health record (EHR) data are often left-censored, making diagnosis records incomplete and unreliable for determining disease onset. In contrast, outpatient prescriptions form renewal-based trajectories that provide a continuous signal of disease management. We...
ULW-SleepNet: An Ultra-Lightweight Network for Multimodal Sleep Stage Scoring
arXiv:2602.23852v1 Announce Type: new Abstract: Automatic sleep stage scoring is crucial for the diagnosis and treatment of sleep disorders. Although deep learning models have advanced the field, many existing models are computationally demanding and designed for single-channel electroencephalography (EEG), limiting...
Hierarchical Concept-based Interpretable Models
arXiv:2602.23947v1 Announce Type: new Abstract: Modern deep neural networks remain challenging to interpret due to the opacity of their latent representations, impeding model understanding, debugging, and debiasing. Concept Embedding Models (CEMs) address this by mapping inputs to human-interpretable concept representations...
Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference
arXiv:2602.23968v1 Announce Type: new Abstract: Masked discrete diffusion models (MDMs) are a promising new approach to generative modelling, offering the ability for parallel token generation and therefore greater efficiency than autoregressive counterparts. However, achieving an optimal balance between parallel generation...
Intrinsic Lorentz Neural Network
arXiv:2602.23981v1 Announce Type: new Abstract: Real-world data frequently exhibit latent hierarchical structures, which can be naturally represented by hyperbolic geometry. Although recent hyperbolic neural networks have demonstrated promising results, many existing architectures remain partially intrinsic, mixing Euclidean operations with hyperbolic...
MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening
arXiv:2602.23994v1 Announce Type: new Abstract: Alzheimer's disease is a progressive neurodegenerative disorder in which mild cognitive impairment (MCI) marks a critical transition between aging and dementia. Neuroimaging modalities, such as structural MRI, provide biomarkers of this transition; however, their high...
Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments
arXiv:2602.23997v1 Announce Type: new Abstract: The next generation of autonomous agents must not only learn efficiently but also act reliably and adapt their behavior in open worlds. Standard approaches typically assume fixed tasks and environments with little or no novelty,...
pathsig: A GPU-Accelerated Library for Truncated and Projected Path Signatures
arXiv:2602.24066v1 Announce Type: new Abstract: Path signatures provide a rich representation of sequential data, with strong theoretical guarantees and good performance in a variety of machine-learning tasks. While signatures have progressed from fixed feature extractors to trainable components of machine-learning...
Court sides with parents in dispute over California policies on transgender students
The Supreme Court on Monday night granted a request from a group of California parents to reinstate a ruling by a federal district court that prohibits schools in that state […]
Supreme Court grants Republicans’ request to pause order to redraw New York congressional map
The Supreme Court on Monday night cleared the way for New York to go forward with the 2026 elections using the state’s existing congressional map. Over the objections of the […]
Court turns down several cases, including on filing fees for indigent prisoners and ability of felons to possess guns
Over the objections of the court’s three Democratic appointees, the Supreme Court on Monday morning declined to hear a case involving the payment of filing fees by indigent prisoners. The […]
Birthright citizenship: A note on foundlings and comments on four complementary amicus briefs
Foundlings – babies born of unknown parentage – loomed large in the imagination of mid-19th century Americans, who dutifully read their Bibles and thought about baby Moses in a basket. […]
Supreme Court skeptical of law banning drug users from possessing firearms
The Supreme Court on Monday was skeptical that the indictment of a Texas man on charges that he violated a federal law prohibiting the possession of a gun by the […]
Justices to consider breadth of a federal defendant’s waiver of appeal
In Hunter v. United States, to be argued on Tuesday, March 3, the Supreme Court will address how broad federal defendants’ waivers of their right to appeal can be and […]
Trump FCC's equal-time crackdown doesn't apply equally—or at all—to talk radio
FCC Chairman Brendan Carr's unequal enforcement of the equal-time rule.
No one has a good plan for how AI companies should work with the government
As OpenAI transitions from a wildly successful consumer startup into a piece of national security infrastructure, the company seems unequipped to manage its new responsibilities.
Anthropic’s Claude reports widespread outage
Anthropic's AI chatbot Claude experienced widespread service disruptions on Monday morning, with thousands of users reporting issues accessing the bot.