Distributed physics-informed neural networks via domain decomposition for fast flow reconstruction
arXiv:2602.15883v1 Announce Type: new Abstract: Physics-Informed Neural Networks (PINNs) offer a powerful paradigm for flow reconstruction, seamlessly integrating sparse velocity measurements with the governing Navier-Stokes equations to recover complete velocity and latent pressure fields. However, scaling such models to large...
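Since the teaser names the standard PINN recipe (a data-fidelity term on sparse velocity samples plus a Navier-Stokes residual penalty), here is a minimal sketch of that composite loss in PyTorch. The steady 2D incompressible form, the network signature net: (x, y) -> (u, v, p), and the unit weighting of the two terms are my assumptions, not the paper's distributed implementation.

```python
import torch

def grad(out, x):
    # d(out)/d(x), keeping the graph for higher-order derivatives.
    return torch.autograd.grad(out, x, torch.ones_like(out), create_graph=True)[0]

def pinn_loss(net, xy_data, uv_data, xy_col, nu=0.01):
    # Data term: match sparse velocity measurements.
    loss_data = ((net(xy_data)[:, :2] - uv_data) ** 2).mean()

    # Physics term: steady incompressible Navier-Stokes residuals at collocation points.
    xy = xy_col.clone().requires_grad_(True)
    u, v, p = net(xy).split(1, dim=1)
    du, dv, dp = grad(u, xy), grad(v, xy), grad(p, xy)
    u_x, u_y, v_x, v_y = du[:, :1], du[:, 1:], dv[:, :1], dv[:, 1:]
    lap_u = grad(u_x, xy)[:, :1] + grad(u_y, xy)[:, 1:]
    lap_v = grad(v_x, xy)[:, :1] + grad(v_y, xy)[:, 1:]
    res_u = u * u_x + v * u_y + dp[:, :1] - nu * lap_u
    res_v = u * v_x + v * v_y + dp[:, 1:] - nu * lap_v
    res_c = u_x + v_y                      # incompressibility constraint
    loss_pde = (res_u ** 2 + res_v ** 2 + res_c ** 2).mean()
    return loss_data + loss_pde
```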
B-DENSE: Branching For Dense Ensemble Network Learning
arXiv:2602.15971v1 Announce Type: new Abstract: Inspired by non-equilibrium thermodynamics, diffusion models have achieved state-of-the-art performance in generative modeling. However, their iterative sampling nature results in high inference latency. While recent distillation techniques accelerate sampling, they discard intermediate trajectory steps. This...
Anatomy of Capability Emergence: Scale-Invariant Representation Collapse and Top-Down Reorganization in Neural Networks
arXiv:2602.15997v1 Announce Type: new Abstract: Capability emergence during neural network training remains mechanistically opaque. We track five geometric measures across five model scales (405K-85M parameters), 120+ emergence events in eight algorithmic tasks, and three Pythia language models (160M-2.8B). We find:...
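The truncation hides which five geometric measures are tracked; as a hedged illustration, one widely used measure of representation collapse is the entropy-based effective rank. Whether it is among the paper's five is an assumption.

```python
import numpy as np

def effective_rank(H):
    """Effective rank (Roy & Vetterli, 2007) of a (samples x dims) activation
    matrix; it falls toward 1 as representations collapse onto few directions."""
    s = np.linalg.svd(H - H.mean(axis=0), compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-(p * np.log(p)).sum()))
```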
AI-CARE: Carbon-Aware Reporting Evaluation Metric for AI Models
arXiv:2602.16042v1 Announce Type: new Abstract: As machine learning (ML) continues its rapid expansion, the environmental cost of model training and inference has become a critical societal concern. Existing benchmarks overwhelmingly focus on standard performance metrics such as accuracy, BLEU, or...
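The truncation stops before AI-CARE's definition, so purely as a hypothetical illustration of a carbon-aware report, the sketch below pairs a task metric with estimated emissions; the field names and the grid-intensity constant are invented for the example.

```python
def carbon_report(task_score, energy_kwh, grid_kg_co2e_per_kwh=0.4):
    """Hypothetical carbon-aware summary: quality alongside, and normalized by,
    estimated training/inference emissions."""
    co2e_kg = energy_kwh * grid_kg_co2e_per_kwh
    return {
        "task_score": task_score,
        "co2e_kg": co2e_kg,
        "score_per_kg_co2e": task_score / co2e_kg if co2e_kg else float("inf"),
    }
```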
MoE-Spec: Expert Budgeting for Efficient Speculative Decoding
arXiv:2602.16052v1 Announce Type: new Abstract: Speculative decoding accelerates Large Language Model (LLM) inference by verifying multiple drafted tokens in parallel. However, for Mixture-of-Experts (MoE) models, this parallelism introduces a severe bottleneck: large draft trees activate many unique experts, significantly increasing...
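The bottleneck the teaser describes is easy to make concrete: the union of experts routed to by a draft tree grows with its size. Below is a hedged sketch of counting that union and greedily truncating the draft to an expert budget; the greedy prefix rule is my assumption, not MoE-Spec's algorithm.

```python
def unique_experts(draft_routes):
    """draft_routes: one set of routed expert ids per drafted token."""
    return set().union(*draft_routes) if draft_routes else set()

def prune_to_budget(draft_routes, budget):
    # Keep the longest draft prefix whose activated-expert union fits the budget.
    kept, active = [], set()
    for route in draft_routes:
        if len(active | route) > budget:
            break
        active |= route
        kept.append(route)
    return kept, active
```

With top-2 routing, for instance, a 16-token draft can touch up to 32 distinct experts, which is exactly the quantity a budget of this kind caps.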
Can Generative Artificial Intelligence Survive Data Contamination? Theoretical Guarantees under Contaminated Recursive Training
arXiv:2602.16065v1 Announce Type: new Abstract: Generative Artificial Intelligence (AI), such as large language models (LLMs), has become a transformative force across science, industry, and society. As these systems grow in popularity, web data becomes increasingly interwoven with this AI-generated material...
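As a toy picture (my construction, not the paper's setting) of contaminated recursive training, each generation below fits a Gaussian to a mix of real data and samples from the previous generation; heavy contamination lets estimation noise accumulate across generations, the "model collapse" effect such guarantees speak to.

```python
import numpy as np

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, 10_000)       # ground-truth data distribution
mu, sigma, frac_synth = 0.0, 1.0, 0.8     # 80% contamination per generation

for gen in range(1, 11):
    n_s = int(frac_synth * real.size)
    synth = rng.normal(mu, sigma, n_s)    # samples from the previous model
    mix = np.concatenate([synth, real[: real.size - n_s]])
    mu, sigma = mix.mean(), mix.std()     # "retrain" on the contaminated mix
    print(f"gen {gen:2d}: mu={mu:+.3f}  sigma={sigma:.3f}")
```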
On the Power of Source Screening for Learning Shared Feature Extractors
arXiv:2602.16125v1 Announce Type: new Abstract: Learning with a shared representation is widely recognized as an effective way to separate commonalities from heterogeneity across heterogeneous sources. Most existing work includes all related data sources, simultaneously training a common feature extractor...
HiPER: Hierarchical Reinforcement Learning with Explicit Credit Assignment for Large Language Model Agents
arXiv:2602.16165v1 Announce Type: new Abstract: Training LLMs as interactive agents for multi-turn decision-making remains challenging, particularly in long-horizon tasks with sparse and delayed rewards, where agents must execute extended sequences of actions before receiving meaningful feedback. Most existing reinforcement learning...
Muon with Spectral Guidance: Efficient Optimization for Scientific Machine Learning
arXiv:2602.16167v1 Announce Type: new Abstract: Physics-informed neural networks and neural operators often suffer from severe optimization difficulties caused by ill-conditioned gradients, multi-scale spectral behavior, and stiffness induced by physical constraints. Recently, the Muon optimizer has shown promise by performing orthogonalized...
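For readers who have not met it, the orthogonalized update the teaser refers to is computed in the public Muon implementation with a quintic Newton-Schulz iteration; a sketch follows. The coefficients are taken from that open-source code, and the paper's spectral-guidance modification is not shown.

```python
import torch

@torch.no_grad()
def orthogonalize(G, steps=5, eps=1e-7):
    """Approximate the nearest semi-orthogonal matrix (the U V^T of G's SVD)
    via the quintic Newton-Schulz iteration used by Muon."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)               # bring the spectral norm into range
    transposed = G.size(0) > G.size(1)
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * (A @ A)) @ X
    return X.T if transposed else X
```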
Towards Secure and Scalable Energy Theft Detection: A Federated Learning Approach for Resource-Constrained Smart Meters
arXiv:2602.16181v1 Announce Type: new Abstract: Energy theft poses a significant threat to the stability and efficiency of smart grids, leading to substantial economic losses and operational challenges. Traditional centralized machine learning approaches for theft detection require aggregating user data, raising...
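The privacy mechanism the teaser alludes to rests on exchanging model updates instead of consumption data; a minimal FedAvg aggregation sketch is below. The state-dict representation and sample-count weighting are standard FedAvg, not necessarily the paper's exact protocol.

```python
def fedavg(client_states, client_sizes):
    """Weighted average of client model state dicts (e.g., torch tensors),
    weights proportional to each meter's local sample count."""
    total = float(sum(client_sizes))
    return {
        key: sum(state[key] * (n / total)
                 for state, n in zip(client_states, client_sizes))
        for key in client_states[0]
    }
```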
ModalImmune: Immunity-Driven Unlearning via Self-Destructive Training
arXiv:2602.16197v1 Announce Type: new Abstract: Multimodal systems are vulnerable to partial or complete loss of input channels at deployment, which undermines reliability in real-world settings. This paper presents ModalImmune, a training framework that enforces modality immunity by intentionally and controllably...
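The simplest concrete instance of "intentionally and controllably" destroying input channels during training is modality dropout; the sketch below zeroes whole modalities per sample. The dropout rate and zero-masking are my assumptions about the general mechanism, not ModalImmune's scheme.

```python
import torch

def modality_dropout(feats, p_drop=0.3, training=True):
    """feats: dict mapping modality name -> (batch, dim) tensor. Randomly zeroes
    entire modalities per sample so the model learns to survive their loss."""
    if not training:
        return feats
    return {
        name: x * (torch.rand(x.size(0), 1, device=x.device) > p_drop).to(x.dtype)
        for name, x in feats.items()
    }
```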
Training-Free Adaptation of Diffusion Models via Doob's $h$-Transform
arXiv:2602.16198v1 Announce Type: new Abstract: Adaptation methods have been a workhorse for unlocking the transformative power of pre-trained diffusion models in diverse applications. Existing approaches often abstract adaptation objectives as a reward function and steer diffusion models to generate high-reward...
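For orientation, the generic Doob h-transform that training-free adaptation of this kind builds on can be stated as follows (my notation; the paper's exact construction is behind the truncation):

```latex
\[
  \mathrm{d}X_t = b(X_t,t)\,\mathrm{d}t + \sigma(t)\,\mathrm{d}W_t
  \quad\Longrightarrow\quad
  \mathrm{d}X_t^{h} = \Bigl[\, b(X_t^{h},t)
    + \sigma(t)\sigma(t)^{\top} \nabla_x \log h(X_t^{h},t) \Bigr]\mathrm{d}t
    + \sigma(t)\,\mathrm{d}W_t,
  \qquad
  h(x,t) = \mathbb{E}\bigl[\, r(X_T) \mid X_t = x \,\bigr].
\]
```

The reward r enters only through the extra drift term involving the gradient of log h, so the pre-trained score network itself never needs retraining; the practical question is how to approximate h cheaply.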
Bayesian Quadrature: Gaussian Processes for Integration
arXiv:2602.16218v1 Announce Type: new Abstract: Bayesian quadrature is a probabilistic, model-based approach to numerical integration: the estimation of intractable integrals and expectations. Although Bayesian quadrature was popularised as early as the 1980s, no systematic and comprehensive treatment has been published. The...
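The core identity behind Bayesian quadrature is short enough to state here (standard material, not specific to this monograph): under a GP prior $f \sim \mathcal{GP}(0, k)$, the integral $Z = \int f \,\mathrm{d}\pi$ is Gaussian with closed-form posterior moments given evaluations $y$ at nodes $x_1, \ldots, x_n$.

```latex
\[
  \mathbb{E}[Z \mid y] = z^{\top} K^{-1} y,
  \qquad
  \operatorname{Var}[Z \mid y]
    = \iint k(x, x')\,\mathrm{d}\pi(x)\,\mathrm{d}\pi(x') - z^{\top} K^{-1} z,
\]
% where K_{ij} = k(x_i, x_j) and z_i = \int k(x, x_i) d\pi(x) is the kernel
% mean embedding evaluated at node x_i.
```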
SEMixer: Semantics Enhanced MLP-Mixer for Multiscale Mixing and Long-term Time Series Forecasting
arXiv:2602.16220v1 Announce Type: new Abstract: Modeling multiscale patterns is crucial for long-term time series forecasting (TSF). However, redundancy and noise in time series, together with semantic gaps between non-adjacent scales, make the efficient alignment and integration of multi-scale temporal dependencies...
Fast KV Compaction via Attention Matching
arXiv:2602.16284v1 Announce Type: new Abstract: Scaling language models to long contexts is often bottlenecked by the size of the key-value (KV) cache. In deployed settings, long contexts are typically managed through compaction in token space via summarization. However, summarization can...
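The teaser cuts off before describing attention matching itself; as a hedged point of reference in the same design space, the sketch below evicts KV entries by accumulated attention mass (in the spirit of heavy-hitter methods). It is not the paper's algorithm.

```python
import torch

def evict_by_attention(keys, values, attn, keep):
    """keys/values: (T, d) cache tensors; attn: (num_queries, T) softmax weights.
    Keep the `keep` cache entries that received the most total attention."""
    score = attn.sum(dim=0)                        # attention mass per token
    idx = score.topk(keep).indices.sort().values   # preserve sequence order
    return keys[idx], values[idx]
```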
Can courts excuse late removals to federal court?
As many law students learn in their civil procedure course, when a plaintiff files suit in state court asserting a claim over which a federal district court would have jurisdiction, […]
SCOTUStoday for Thursday, February 19
Updated on Feb. 19 at 9:50 a.m. President Franklin D. Roosevelt issued Executive Order 9066 on this day in 1942, authorizing the removal of Japanese Americans to internment camps. In […]
Why these startup CEOs don’t think AI will replace human roles
The CEOs of Read AI and Lucidya told TechCrunch at Web Summit Qatar that they see AI tools replacing tasks, rather than workers.
FrameRef: A Framing Dataset and Simulation Testbed for Modeling Bounded Rational Information Health
arXiv:2602.15273v1 Announce Type: cross Abstract: Information ecosystems increasingly shape how people internalize exposure to adverse digital experiences, raising concerns about the long-term consequences for information health. In modern search and recommendation systems, ranking and personalization policies play a central role...
Proactive Conversational Assistant for a Procedural Manual Task based on Audio and IMU
arXiv:2602.15707v1 Announce Type: cross Abstract: Real-time conversational assistants for procedural tasks often depend on video input, which can be computationally expensive and compromise user privacy. For the first time, we propose a real-time conversational assistant that provides comprehensive guidance for...
Hybrid Feature Learning with Time Series Embeddings for Equipment Anomaly Prediction
arXiv:2602.15089v1 Announce Type: new Abstract: In predictive maintenance of equipment, deep learning-based time series anomaly detection has garnered significant attention; however, pure deep learning approaches often fail to achieve sufficient accuracy on real-world data. This study proposes a hybrid approach...
Learning Data-Efficient and Generalizable Neural Operators via Fundamental Physics Knowledge
arXiv:2602.15184v1 Announce Type: new Abstract: Recent advances in scientific machine learning (SciML) have enabled neural operators (NOs) to serve as powerful surrogates for modeling the dynamic evolution of physical systems governed by partial differential equations (PDEs). While existing approaches focus...
COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression
arXiv:2602.15200v1 Announce Type: new Abstract: Post-training compression of Transformer models commonly relies on truncated singular value decomposition (SVD). However, enforcing a single shared subspace can degrade accuracy even at moderate compression. Sparse dictionary learning provides a more flexible union-of-subspaces representation,...
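The truncated-SVD baseline the abstract contrasts with is worth seeing concretely: a weight matrix is replaced by two thin factors. This sketch shows the baseline only; COMPOT's union-of-subspaces dictionary approach is not reproduced here.

```python
import numpy as np

def svd_compress(W, r):
    """Rank-r factorization W ~= A @ B, storing r*(m+n) values instead of m*n."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]   # (m, r), singular values folded into the left factor
    B = Vt[:r]             # (r, n)
    return A, B
```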
Automatically Finding Reward Model Biases
arXiv:2602.15222v1 Announce Type: new Abstract: Reward models are central to large language model (LLM) post-training. However, past work has shown that they can reward spurious or undesirable attributes such as length, format, hallucinations, and sycophancy. In this work, we introduce...
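The best-known bias on the abstract's list, length, admits a one-line probe: correlate reward scores with response length. This simple check (my illustration) is the kind of pattern an automatic discovery procedure would presumably search for at scale.

```python
import numpy as np

def length_bias(rewards, responses):
    """Pearson correlation between reward-model scores and response word count;
    a strongly positive value suggests a length bias."""
    lengths = np.array([len(r.split()) for r in responses], dtype=float)
    return float(np.corrcoef(lengths, np.asarray(rewards, dtype=float))[0, 1])
```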
Complex-Valued Unitary Representations as Classification Heads for Improved Uncertainty Quantification in Deep Neural Networks
arXiv:2602.15283v1 Announce Type: new Abstract: Modern deep neural networks achieve high predictive accuracy but remain poorly calibrated: their confidence scores do not reliably reflect the true probability of correctness. We propose a quantum-inspired classification head architecture that projects backbone features...
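One hedged reading of a "quantum-inspired" head is Born-rule scoring: project features to complex class amplitudes and take squared magnitudes as probabilities. The parameterization below is my guess at the general shape; the paper's unitary constraint is not enforced here.

```python
import torch
import torch.nn as nn

class BornHead(nn.Module):
    """Complex-amplitude classification head: p(c|x) proportional to the
    squared magnitude of the class-c complex amplitude."""
    def __init__(self, dim, n_classes):
        super().__init__()
        self.re = nn.Linear(dim, n_classes)
        self.im = nn.Linear(dim, n_classes)

    def forward(self, h):
        z = torch.complex(self.re(h), self.im(h))  # (batch, classes) amplitudes
        probs = z.abs() ** 2                       # Born rule
        return probs / probs.sum(dim=-1, keepdim=True)
```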
Hybrid Federated and Split Learning for Privacy Preserving Clinical Prediction and Treatment Optimization
arXiv:2602.15304v1 Announce Type: new Abstract: Collaborative clinical decision support is often constrained by governance and privacy rules that prevent pooling patient-level records across institutions. We present a hybrid privacy-preserving framework that combines Federated Learning (FL) and Split Learning (SL) to...
On the Surprising Effectiveness of Masking Updates in Adaptive Optimizers
arXiv:2602.15322v1 Announce Type: new Abstract: Training large language models (LLMs) relies almost exclusively on dense adaptive optimizers with increasingly sophisticated preconditioners. We challenge this by showing that randomly masking parameter updates can be highly effective, with a masked variant of...
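The masked variant the teaser hints at can be sketched in a few lines: compute the usual adaptive update, then apply a random binary mask before the step. The mask density, per-step resampling, and the 1/density rescaling are my assumptions; bias correction is omitted for brevity.

```python
import torch

@torch.no_grad()
def masked_adam_step(p, m, v, grad, lr=1e-3, betas=(0.9, 0.999),
                     eps=1e-8, density=0.1):
    """One Adam-style step where only a random `density` fraction of coordinates
    is updated (bias correction omitted to keep the sketch short)."""
    m.mul_(betas[0]).add_(grad, alpha=1 - betas[0])
    v.mul_(betas[1]).addcmul_(grad, grad, value=1 - betas[1])
    update = m / (v.sqrt() + eps)
    mask = (torch.rand_like(update) < density).to(update.dtype)
    p.add_(update * mask, alpha=-lr / density)  # rescale to keep expected step size
```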
A Scalable Curiosity-Driven Game-Theoretic Framework for Long-Tail Multi-Label Learning in Data Mining
arXiv:2602.15330v1 Announce Type: new Abstract: The long-tail distribution, where a few head labels dominate while rare tail labels abound, poses a persistent challenge for large-scale Multi-Label Classification (MLC) in real-world data mining applications. Existing resampling and reweighting strategies often disrupt...
Doubly Stochastic Mean-Shift Clustering
arXiv:2602.15393v1 Announce Type: new Abstract: Standard Mean-Shift algorithms are notoriously sensitive to the bandwidth hyperparameter, particularly in data-scarce regimes where fixed-scale density estimation leads to fragmentation and spurious modes. In this paper, we propose Doubly Stochastic Mean-Shift (DSMS), a novel...
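For contrast with the proposed method, here is the standard fixed-bandwidth Gaussian mean-shift step whose bandwidth sensitivity the abstract critiques; DSMS itself is not reproduced.

```python
import numpy as np

def mean_shift_step(x, X, bandwidth):
    """One mean-shift update of query point x (shape (d,)) over data X (n, d)."""
    w = np.exp(-np.sum((X - x) ** 2, axis=1) / (2.0 * bandwidth ** 2))
    return (w[:, None] * X).sum(axis=0) / w.sum()
```

Too small a bandwidth fragments the data into spurious modes; too large a bandwidth merges true clusters. That trade-off is the failure mode the doubly stochastic variant targets.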
Logit Distance Bounds Representational Similarity
arXiv:2602.15438v1 Announce Type: new Abstract: For a broad family of discriminative models that includes autoregressive language models, identifiability results imply that if two models induce the same conditional distributions, then their internal representations agree up to an invertible linear transformation....
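In symbols, the identifiability premise the abstract invokes reads roughly as follows (my notation, for softmax models $f_i(x) = \mathrm{softmax}(W_i\, g_i(x))$ under suitable diversity conditions):

```latex
\[
  p_1(y \mid x) = p_2(y \mid x) \;\; \forall x, y
  \quad\Longrightarrow\quad
  g_2(x) = A\,g_1(x) + b \;\; \text{for some invertible } A,
\]
```

so a quantitative bound on logit distance can plausibly be converted into a bound on how far the two representations are from agreeing up to such an affine map, which is what the title asserts.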