Multi-Agent Causal Reasoning for Suicide Ideation Detection Through Online Conversations
arXiv:2602.23577v1 Announce Type: new Abstract: Suicide remains a pressing global public health concern. While social media platforms offer opportunities for early risk detection through online conversation trees, existing approaches face two major limitations: (1) They rely on predefined rules (e.g.,...
BRIDGE the Gap: Mitigating Bias Amplification in Automated Scoring of English Language Learners via Inter-group Data Augmentation
arXiv:2602.23580v1 Announce Type: new Abstract: In the field of educational assessment, automated scoring systems increasingly rely on deep learning and large language models (LLMs). However, these systems face significant risks of bias amplification, where model prediction gaps between student groups...
Divide and Conquer: Accelerating Diffusion-Based Large Language Models via Adaptive Parallel Decoding
arXiv:2602.23792v1 Announce Type: new Abstract: Diffusion-based large language models (dLLMs) have shown promising performance across various reasoning tasks, establishing themselves as an alternative to autoregressive large language models (LLMs). Unlike autoregressive LLMs that generate one token per step based on...
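The contrast drawn above — one token per step for autoregressive LLMs versus multiple positions per step for dLLMs — can be illustrated with a toy confidence-thresholded parallel decoder. This is a generic sketch of threshold-based parallel decoding, not the paper's adaptive method; `toy_denoiser` is a hypothetical stand-in for a real denoising pass.

```python
import numpy as np

def toy_denoiser(tokens, rng):
    """Stand-in for a dLLM denoising pass: for every position, return a
    (predicted token, confidence) pair. Purely illustrative."""
    n = len(tokens)
    preds = rng.integers(0, 1000, size=n)
    conf = rng.random(n)
    return preds, conf

def parallel_decode(seq_len, threshold=0.7, seed=0):
    """Threshold-based parallel decoding: each step commits every masked
    position whose confidence clears `threshold` (and at least one per
    step, so decoding always terminates in <= seq_len steps)."""
    rng = np.random.default_rng(seed)
    tokens = np.full(seq_len, -1)          # -1 marks a masked slot
    steps = 0
    while (tokens == -1).any():
        steps += 1
        preds, conf = toy_denoiser(tokens, rng)
        masked = tokens == -1
        commit = masked & (conf >= threshold)
        if not commit.any():               # fallback: commit the single most confident slot
            best = np.flatnonzero(masked)[conf[masked].argmax()]
            commit = np.zeros(seq_len, dtype=bool)
            commit[best] = True
        tokens[commit] = preds[commit]
    return tokens, steps
```

With a high threshold the loop degenerates toward one commit per step (autoregressive-like cost); lowering it trades steps for parallel commits, which is the dial adaptive schemes tune.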
CLFEC: A New Task for Unified Linguistic and Factual Error Correction in Paragraph-Level Chinese Professional Writing
arXiv:2602.23845v1 Announce Type: new Abstract: Chinese text correction has traditionally focused on spelling and grammar, while factual error correction is usually treated separately. However, in paragraph-level Chinese professional writing, linguistic (word/grammar/punctuation) and factual errors frequently co-occur and interact, making unified...
Task Complexity Matters: An Empirical Study of Reasoning in LLMs for Sentiment Analysis
arXiv:2602.24060v1 Announce Type: new Abstract: Large language models (LLMs) with reasoning capabilities have fueled a compelling narrative that reasoning universally improves performance across language tasks. We test this claim through a comprehensive evaluation of 504 configurations across seven model families--including...
Terminology Rarity Predicts Catastrophic Failure in LLM Translation of Low-Resource Ancient Languages: Evidence from Ancient Greek
arXiv:2602.24119v1 Announce Type: new Abstract: This study presents the first systematic, reference-free human evaluation of large language model (LLM) machine translation (MT) for Ancient Greek (AG) technical prose. We evaluate translations by three commercial LLMs (Claude, Gemini, ChatGPT) of twenty...
ArgLLM-App: An Interactive System for Argumentative Reasoning with Large Language Models
arXiv:2602.24172v1 Announce Type: new Abstract: Argumentative LLMs (ArgLLMs) are an existing approach leveraging Large Language Models (LLMs) and computational argumentation for decision-making, with the aim of making the resulting decisions faithfully explainable to and contestable by humans. Here we propose...
MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games
arXiv:2602.24188v1 Announce Type: new Abstract: We present a scalable methodology for evaluating language models in multi-turn interactions, using a suite of collaborative games that require effective communication about private information. This enables an interactive scaling analysis, in which a fixed...
Controllable Reasoning Models Are Private Thinkers
arXiv:2602.24210v1 Announce Type: new Abstract: AI agents powered by reasoning models require access to sensitive user data. However, their reasoning traces are difficult to control, which can result in the unintended leakage of private information to external parties. We propose...
NAU-QMUL: Utilizing BERT and CLIP for Multi-modal AI-Generated Image Detection
arXiv:2602.23863v1 Announce Type: cross Abstract: With the aim of detecting AI-generated images and identifying the specific models responsible for their generation, we propose a multi-modal multi-task model. The model leverages pre-trained BERT and CLIP Vision encoders for text and image...
LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding
arXiv:2602.23881v1 Announce Type: cross Abstract: Speculative decoding accelerates autoregressive large language model (LLM) inference by using a lightweight draft model to propose candidate tokens that are then verified in parallel by the target model. The speedup is largely determined by...
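The draft-propose/target-verify loop described in this abstract is the standard speculative sampling scheme, which can be sketched with context-free toy distributions (an assumption for brevity; real models condition on the prefix). The acceptance rule `min(1, p[x]/q[x])` with residual resampling keeps the output exactly distributed as the target `p`; the expected per-token acceptance rate is `sum(min(p, q))`, the quantity draft training would aim to raise.

```python
import numpy as np

rng = np.random.default_rng(0)
V = 8
p = rng.dirichlet(np.ones(V))   # target model's next-token distribution (toy, context-free)
q = rng.dirichlet(np.ones(V))   # draft model's next-token distribution

def speculative_step(k=4):
    """One round of speculative decoding: the draft proposes up to k tokens
    from q; the target accepts token x with prob min(1, p[x]/q[x]); on the
    first rejection we resample from the residual max(p - q, 0), which
    makes each emitted token exactly distributed as p."""
    out = []
    for _ in range(k):
        x = rng.choice(V, p=q)
        if rng.random() < min(1.0, p[x] / q[x]):
            out.append(x)                      # accepted draft token
        else:
            resid = np.maximum(p - q, 0.0)     # residual distribution
            resid /= resid.sum()
            out.append(rng.choice(V, p=resid)) # corrected token, then stop the round
            break
    return out
```

Each round costs one target-model forward pass but can emit several tokens, which is where the speedup comes from.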
RewardUQ: A Unified Framework for Uncertainty-Aware Reward Models
arXiv:2602.24040v1 Announce Type: cross Abstract: Reward models are central to aligning large language models (LLMs) with human preferences. Yet most approaches rely on pointwise reward estimates that overlook the epistemic uncertainty in reward models arising from limited human feedback. Recent...
U-CAN: Utility-Aware Contrastive Attenuation for Efficient Unlearning in Generative Recommendation
arXiv:2602.23400v1 Announce Type: new Abstract: Generative Recommendation (GenRec) typically leverages Large Language Models (LLMs) to redefine personalization as an instruction-driven sequence generation task. However, fine-tuning on user logs inadvertently encodes sensitive attributes into model parameters, raising critical privacy concerns. Existing...
Sample Size Calculations for Developing Clinical Prediction Models: Overview and pmsims R package
arXiv:2602.23507v1 Announce Type: new Abstract: Background: Clinical prediction models are increasingly used to inform healthcare decisions, but determining the minimum sample size for their development remains a critical and unresolved challenge. Inadequate sample sizes can lead to overfitting, poor generalisability,...
Neural Operators Can Discover Functional Clusters
arXiv:2602.23528v1 Announce Type: new Abstract: Operator learning is reshaping scientific computing by amortizing inference across infinite families of problems. While neural operators (NOs) are increasingly well understood for regression, far less is known for classification and its unsupervised analogue: clustering....
Rudder: Steering Prefetching in Distributed GNN Training using LLM Agents
arXiv:2602.23556v1 Announce Type: new Abstract: Large-scale Graph Neural Networks (GNNs) are typically trained by sampling a vertex's neighbors to a fixed distance. Because large input graphs are distributed, training requires frequent irregular communication that stalls forward progress. Moreover, fetched data...
Dynamics of Learning under User Choice: Overspecialization and Peer-Model Probing
arXiv:2602.23565v1 Announce Type: new Abstract: In many economically relevant contexts where machine learning is deployed, multiple platforms obtain data from the same pool of users, each of whom selects the platform that best serves them. Prior work in this setting...
When Does Multimodal Learning Help in Healthcare? A Benchmark on EHR and Chest X-Ray Fusion
arXiv:2602.23614v1 Announce Type: new Abstract: Machine learning holds promise for advancing clinical decision support, yet it remains unclear when multimodal learning truly helps in practice, particularly under modality missingness and fairness constraints. In this work, we conduct a systematic benchmark...
BTTackler: A Diagnosis-based Framework for Efficient Deep Learning Hyperparameter Optimization
arXiv:2602.23630v1 Announce Type: new Abstract: Hyperparameter optimization (HPO) is known to be costly in deep learning, especially when leveraging automated approaches. Most of the existing automated HPO methods are accuracy-based, i.e., accuracy metrics are used to guide the trials of...
Disentangled Mode-Specific Representations for Tensor Time Series via Contrastive Learning
arXiv:2602.23663v1 Announce Type: new Abstract: Multi-mode tensor time series (TTS) can be found in many domains, such as search engines and environmental monitoring systems. Learning representations of a TTS benefits various applications, but it is also challenging since the complexities...
Provable Subspace Identification of Nonlinear Multi-view CCA
arXiv:2602.23785v1 Announce Type: new Abstract: We investigate the identifiability of nonlinear Canonical Correlation Analysis (CCA) in a multi-view setup, where each view is generated by an unknown nonlinear map applied to a linear mixture of shared latents and view-private noise....
MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models
arXiv:2602.23798v1 Announce Type: new Abstract: Machine unlearning for large language models often faces a privacy dilemma in which strict constraints prohibit sharing either the server's parameters or the client's forget set. To address this dual non-disclosure constraint, we propose MPU,...
Beyond State-Wise Mirror Descent: Offline Policy Optimization with Parametric Policies
arXiv:2602.23811v1 Announce Type: new Abstract: We investigate the theoretical aspects of offline reinforcement learning (RL) under general function approximation. While prior works (e.g., Xie et al., 2021) have established the theoretical foundations of learning a good policy from offline data...
Hierarchical Concept-based Interpretable Models
arXiv:2602.23947v1 Announce Type: new Abstract: Modern deep neural networks remain challenging to interpret due to the opacity of their latent representations, impeding model understanding, debugging, and debiasing. Concept Embedding Models (CEMs) address this by mapping inputs to human-interpretable concept representations...
Intrinsic Lorentz Neural Network
arXiv:2602.23981v1 Announce Type: new Abstract: Real-world data frequently exhibit latent hierarchical structures, which can be naturally represented by hyperbolic geometry. Although recent hyperbolic neural networks have demonstrated promising results, many existing architectures remain partially intrinsic, mixing Euclidean operations with hyperbolic...
MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening
arXiv:2602.23994v1 Announce Type: new Abstract: Alzheimer's disease is a progressive neurodegenerative disorder in which mild cognitive impairment (MCI) marks a critical transition between aging and dementia. Neuroimaging modalities, such as structural MRI, provide biomarkers of this transition; however, their high...
InfoNCE Induces Gaussian Distribution
arXiv:2602.24012v1 Announce Type: new Abstract: Contrastive learning has become a cornerstone of modern representation learning, allowing training with massive unlabeled data for both task-specific and general (foundation) models. A prototypical loss in contrastive training is InfoNCE and its variants. In...
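For reference, the InfoNCE loss named in this abstract can be written in a few lines. This is the standard formulation (cross-entropy over cosine similarities, positives on the diagonal), not this paper's contribution; the temperature `tau=0.1` and embedding shapes are illustrative assumptions.

```python
import numpy as np

def info_nce(z1, z2, tau=0.1):
    """InfoNCE on a batch of paired embeddings: z1[i] and z2[i] are two
    views of the same sample; the other rows of z2 serve as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                   # cosine similarities / temperature
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob.diagonal().mean()         # cross-entropy, positives on the diagonal
```

Minimizing this pulls paired embeddings together and pushes in-batch negatives apart; analyses like the one above ask what distribution the learned embeddings converge to under this objective.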
Justices to consider breadth of a federal defendant’s waiver of appeal
In Hunter v. United States, to be argued on Tuesday, March 3, the Supreme Court will address how broad federal defendants' waivers of their right to appeal can be and […]
Users are ditching ChatGPT for Claude — here’s how to make the switch
Following controversies surrounding ChatGPT, many users are ditching the AI chatbot for Claude instead. Here's how to make the switch.
Tech workers urge DOD, Congress to withdraw "supply-chain risk" label from Anthropic
Tech workers have signed an open letter urging the Department of Defense to withdraw its designation of Anthropic as a "supply chain risk" and instead to settle the matter quietly.