Transit Network Design with Two-Level Demand Uncertainties: A Machine Learning and Contextual Stochastic Optimization Framework
arXiv:2603.00010v1 Announce Type: new Abstract: Transit Network Design is a well-studied problem in the field of transportation, typically addressed by solving optimization models under fixed demand assumptions. Considering the limitations of these assumptions, this paper proposes a new framework, namely...
StaTS: Spectral Trajectory Schedule Learning for Adaptive Time Series Forecasting with Frequency Guided Denoiser
arXiv:2603.00037v1 Announce Type: new Abstract: Diffusion models have been used for probabilistic time series forecasting and show strong potential. However, fixed noise schedules often produce intermediate states that are hard to invert and a terminal state that deviates from the...
Econometric vs. Causal Structure-Learning for Time-Series Policy Decisions: Evidence from the UK COVID-19 Policies
arXiv:2603.00041v1 Announce Type: new Abstract: Causal machine learning (ML) recovers graphical structures that inform us about potential cause-and-effect relationships. Most progress has focused on cross-sectional data with no explicit time order, whereas recovering causal structures from time series data remains...
Reinforcement Learning for Control with Probabilistic Stability Guarantee: A Finite-Sample Approach
arXiv:2603.00043v1 Announce Type: new Abstract: This paper presents a novel approach to reinforcement learning (RL) for control systems that provides probabilistic stability guarantees using finite data. Leveraging Lyapunov's method, we propose a probabilistic stability theorem that ensures mean square stability...
Property-Driven Evaluation of GNN Expressiveness at Scale: Datasets, Framework, and Study
arXiv:2603.00044v1 Announce Type: new Abstract: Advancing trustworthy AI requires principled software engineering approaches to model evaluation. Graph Neural Networks (GNNs) have achieved remarkable success in processing graph-structured data, however, their expressiveness in capturing fundamental graph properties remains an open challenge....
Breaking the Factorization Barrier in Diffusion Language Models
arXiv:2603.00045v1 Announce Type: new Abstract: Diffusion language models theoretically allow for efficient parallel generation but are practically hindered by the "factorization barrier": the assumption that simultaneously predicted tokens are independent. This limitation forces a trade-off: models must either sacrifice speed...
REMIND: Rethinking Medical High-Modality Learning under Missingness--A Long-Tailed Distribution Perspective
arXiv:2603.00046v1 Announce Type: new Abstract: Medical multi-modal learning is critical for integrating information from a large set of diverse modalities. However, when leveraging a high number of modalities in real clinical applications, it is often impractical to obtain full-modality observations...
BiJEPA: Bi-directional Joint Embedding Predictive Architecture for Symmetric Representation Learning
arXiv:2603.00049v1 Announce Type: new Abstract: Self-Supervised Learning (SSL) has shifted from pixel-level reconstruction to latent space prediction, spearheaded by the Joint Embedding Predictive Architecture (JEPA). While effective, standard JEPA models typically rely on a uni-directional prediction mechanism (e.g. Context $\to$...
Expert Divergence Learning for MoE-based Language Models
arXiv:2603.00054v1 Announce Type: new Abstract: The Mixture-of-Experts (MoE) architecture is a powerful technique for scaling language models, yet it often suffers from expert homogenization, where experts learn redundant functionalities, thereby limiting MoE's full potential. To address this, we introduce Expert...
M3-AD: Reflection-aware Multi-modal, Multi-category, and Multi-dimensional Benchmark and Framework for Industrial Anomaly Detection
arXiv:2603.00055v1 Announce Type: new Abstract: Although multimodal large language models (MLLMs) have advanced industrial anomaly detection toward a zero-shot paradigm, they still tend to produce high-confidence yet unreliable decisions in fine-grained and structurally complex industrial scenarios, and lack effective self-corrective...
Certainty-Validity: A Diagnostic Framework for Discrete Commitment Systems
arXiv:2603.00070v1 Announce Type: new Abstract: Standard evaluation metrics for machine learning -- accuracy, precision, recall, and AUROC -- assume that all errors are equivalent: a confident incorrect prediction is penalized identically to an uncertain one. For discrete commitment systems (architectures...
SEval-NAS: A Search-Agnostic Evaluation for Neural Architecture Search
arXiv:2603.00099v1 Announce Type: new Abstract: Neural architecture search (NAS) automates the discovery of neural networks that meet specified criteria, yet its evaluation procedures are often hardcoded, limiting the ability to introduce new metrics. This issue is especially pronounced in hardware-aware...
Wideband Power Amplifier Behavioral Modeling Using an Amplitude Conditioned LSTM
arXiv:2603.00101v1 Announce Type: new Abstract: Wideband power amplifiers exhibit complex nonlinear and memory effects that challenge traditional behavioral modeling approaches. This paper proposes a novel amplitude conditioned long short-term memory (AC-LSTM) network that introduces explicit amplitude-dependent gating to enhance the...
LIDS: LLM Summary Inference Under the Layered Lens
arXiv:2603.00105v1 Announce Type: new Abstract: Large language models (LLMs) have gained significant attention by many researchers and practitioners in natural language processing (NLP) since the introduction of ChatGPT in 2022. One notable feature of ChatGPT is its ability to generate...
MAML-KT: Addressing Cold Start Problem in Knowledge Tracing for New Students via Few-Shot Model-Agnostic Meta Learning
arXiv:2603.00137v1 Announce Type: new Abstract: Knowledge tracing (KT) models are commonly evaluated by training on early interactions from all students and testing on later responses. While effective for measuring average predictive performance, this evaluation design obscures a cold start scenario...
Bridging Policy and Real-World Dynamics: LLM-Augmented Rebalancing for Shared Micromobility Systems
arXiv:2603.00176v1 Announce Type: new Abstract: Shared micromobility services such as e-scooters and bikes have become an integral part of urban transportation, yet their efficiency critically depends on effective vehicle rebalancing. Existing methods either optimize for average demand patterns or employ...
Engineering FAIR Privacy-preserving Applications that Learn Histories of Disease
arXiv:2603.00181v1 Announce Type: new Abstract: A recent report on "Learning the natural history of human disease with generative transformers" created an opportunity to assess the engineering challenge of delivering user-facing Generative AI applications in privacy-sensitive domains. The application of these...
OSF: On Pre-training and Scaling of Sleep Foundation Models
arXiv:2603.00190v1 Announce Type: new Abstract: Polysomnography (PSG) provides the gold standard for sleep assessment but suffers from substantial heterogeneity across recording devices and cohorts. There have been growing efforts to build general-purpose foundation models (FMs) for sleep physiology, but lack...
Task-Driven Subspace Decomposition for Knowledge Sharing and Isolation in LoRA-based Continual Learning
arXiv:2603.00191v1 Announce Type: new Abstract: Continual Learning (CL) requires models to sequentially adapt to new tasks without forgetting old knowledge. Recently, Low-Rank Adaptation (LoRA), a representative Parameter-Efficient Fine-Tuning (PEFT) method, has gained increasing attention in CL. Several LoRA-based CL methods...
Diagnostics for Individual-Level Prediction Instability in Machine Learning for Healthcare
arXiv:2603.00192v1 Announce Type: new Abstract: In healthcare, predictive models increasingly inform patient-level decisions, yet little attention is paid to the variability in individual risk estimates and its impact on treatment decisions. For overparameterized models, now standard in machine learning, a...
A medical coding language model trained on clinical narratives from a population-wide cohort of 1.8 million patients
arXiv:2603.00221v1 Announce Type: new Abstract: Medical coding translates clinical documentation into standardized codes for billing, research, and public health, but manual coding is time-consuming and error-prone. Existing automation efforts rely on small datasets that poorly represent real-world patient heterogeneity. We...
CoPeP: Benchmarking Continual Pretraining for Protein Language Models
arXiv:2603.00253v1 Announce Type: new Abstract: Protein language models (pLMs) have recently gained significant attention for their ability to uncover relationships between sequence, structure, and function from evolutionary statistics, thereby accelerating therapeutic drug discovery. These models learn from large protein databases...
Scalable Gaussian process modeling of parametrized spatio-temporal fields
arXiv:2603.00290v1 Announce Type: new Abstract: We introduce a scalable Gaussian process (GP) framework with deep product kernels for data-driven learning of parametrized spatio-temporal fields over fixed or parameter-dependent domains. The proposed framework learns a continuous representation, enabling predictions at arbitrary...
Polynomial Surrogate Training for Differentiable Ternary Logic Gate Networks
arXiv:2603.00302v1 Announce Type: new Abstract: Differentiable logic gate networks (DLGNs) learn compact, interpretable Boolean circuits via gradient-based training, but all existing variants are restricted to the 16 two-input binary gates. Extending DLGNs to Ternary Kleene $K_3$ logic and training DTLGNs...
Vectorized Adaptive Histograms for Sparse Oblique Forests
arXiv:2603.00326v1 Announce Type: new Abstract: Classification using sparse oblique random forests provides guarantees on uncertainty and confidence while controlling for specific error types. However, they use more data and more compute than other tree ensembles because they create deep trees...
Detecting Transportation Mode Using Dense Smartphone GPS Trajectories and Transformer Models
arXiv:2603.00340v1 Announce Type: new Abstract: Transportation mode detection is an important topic within GeoAI and transportation research. In this study, we introduce SpeedTransformer, a novel Transformer-based model that relies solely on speed inputs to infer transportation modes from dense smartphone...
StethoLM: Audio Language Model for Cardiopulmonary Analysis Across Clinical Tasks
arXiv:2603.00355v1 Announce Type: new Abstract: Listening to heart and lung sounds - auscultation - is one of the first and most fundamental steps in a clinical examination. Despite being fast and non-invasive, it demands years of experience to interpret subtle...
Quantifying Catastrophic Forgetting in IoT Intrusion Detection Systems
arXiv:2603.00363v1 Announce Type: new Abstract: Distribution shifts in attack patterns within RPL-based IoT networks pose a critical threat to the reliability and security of large-scale connected systems. Intrusion Detection Systems (IDS) trained on static datasets often fail to generalize to...
Deep Learning-Based Meat Freshness Detection with Segmentation and OOD-Aware Classification
arXiv:2603.00368v1 Announce Type: new Abstract: In this study, we present a meat freshness classification framework from Red-Green-Blue (RGB) images that supports both packaged and unpackaged meat datasets. The system classifies four in-distribution (ID) meat classes and uses an out-of-distribution (OOD)-aware...
Improving Full Waveform Inversion in Large Model Era
arXiv:2603.00377v1 Announce Type: new Abstract: Full Waveform Inversion (FWI) is a highly nonlinear and ill-posed problem that aims to recover subsurface velocity maps from surface-recorded seismic waveforms data. Existing data-driven FWI typically uses small models, as available datasets have limited...