Logit Distance Bounds Representational Similarity
arXiv:2602.15438v1 Announce Type: new Abstract: For a broad family of discriminative models that includes autoregressive language models, identifiability results imply that if two models induce …
All Articles
arXiv:2602.15438v1 Announce Type: new Abstract: For a broad family of discriminative models that includes autoregressive language models, identifiability results imply that if two models induce …
arXiv:2602.15457v1 Announce Type: new Abstract: Anomaly detection (AD) for safety-critical IoT time series should be judged at the event level: reliability and earliness under realistic …
arXiv:2602.15460v1 Announce Type: new Abstract: Integrating reasoning in large language models and large vision-language models has recently led to significant improvement of their capabilities. However, …
arXiv:2602.15473v1 Announce Type: new Abstract: Optimization refers to the task of finding extrema of an objective function. Classical gradient-based optimizers are highly sensitive to hyperparameter …
arXiv:2602.15478v1 Announce Type: new Abstract: Mood instability is a key behavioral indicator of mental health, yet traditional assessments rely on infrequent and retrospective reports that …
arXiv:2602.15481v1 Announce Type: new Abstract: LLM-as-a-judge has emerged as a cornerstone technique for evaluating large language models by leveraging LLM reasoning to score prompt-response pairs. …
arXiv:2602.15499v1 Announce Type: new Abstract: It has been shown that a neural network's Lipschitz constant can be leveraged to derive robustness guarantees, to improve generalizability …
arXiv:2602.15503v1 Announce Type: new Abstract: Stability and robustness are critical for deploying Transformers in safety-sensitive settings. A principled way to enforce such behavior is to …
arXiv:2602.15510v1 Announce Type: new Abstract: Federated Learning (FL) enables distributed training across multiple clients without centralized data sharing, while Graph Neural Networks (GNNs) model relational …
arXiv:2602.15515v1 Announce Type: new Abstract: Training against white-box deception detectors has been proposed as a way to make AI systems honest. However, such training risks …
arXiv:2602.15546v1 Announce Type: new Abstract: The ability to accurately perform counterfactual inference on time series is crucial for decision-making in fields like finance, healthcare, and …
arXiv:2602.15563v1 Announce Type: new Abstract: Quantization-aware training (QAT) is an effective method to drastically reduce the memory footprint of LLMs while keeping performance degradation at …