ExpLang: Improved Exploration and Exploitation in LLM Reasoning with On-Policy Thinking Language Selection
arXiv:2602.21887v1 Announce Type: new Abstract: Current large reasoning models (LRMs) have shown strong ability on challenging tasks after reinforcement learning (RL) based post-training. However, previous work mainly focuses on English reasoning in expectation of the strongest performance, despite the demonstrated...
MERRY: Semantically Decoupled Evaluation of Multimodal Emotional and Role Consistencies of Role-Playing Agents
arXiv:2602.21941v1 Announce Type: new Abstract: Multimodal Role-Playing Agents (MRPAs) are attracting increasing attention due to their ability to deliver more immersive multimodal emotional interactions. However, existing studies still rely on pure textual benchmarks to evaluate the text responses of MRPAs,...
Large Language Models are Algorithmically Blind
arXiv:2602.21947v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate remarkable breadth of knowledge, yet their ability to reason about computational processes remains poorly understood. Closing this gap matters for practitioners who rely on LLMs to guide algorithm selection and...
Neural network optimization strategies and the topography of the loss landscape
arXiv:2602.21276v1 Announce Type: new Abstract: Neural networks are trained by optimizing multi-dimensional sets of fitting parameters on non-convex loss landscapes. Low-loss regions of the landscapes correspond to the parameter sets that perform well on the training data. A key issue...
Robust AI Evaluation through Maximal Lotteries
arXiv:2602.21297v1 Announce Type: new Abstract: The standard way to evaluate language models on subjective tasks is through pairwise comparisons: an annotator chooses the "better" of two responses to a prompt. Leaderboards aggregate these comparisons into a single Bradley-Terry (BT) ranking,...
SymTorch: A Framework for Symbolic Distillation of Deep Neural Networks
arXiv:2602.21307v1 Announce Type: new Abstract: Symbolic distillation replaces neural networks, or components thereof, with interpretable, closed-form mathematical expressions. This approach has shown promise in discovering physical laws and mathematical relationships directly from trained deep learning models, yet adoption remains limited...
Interleaved Head Attention
arXiv:2602.21371v1 Announce Type: new Abstract: Multi-Head Attention (MHA) is the core computational primitive underlying modern Large Language Models (LLMs). However, MHA suffers from a fundamental linear scaling limitation: $H$ attention heads produce exactly $H$ independent attention matrices, with no communication...
Proximal-IMH: Proximal Posterior Proposals for Independent Metropolis-Hastings with Approximate Operators
arXiv:2602.21426v1 Announce Type: new Abstract: We consider the problem of sampling from a posterior distribution arising in Bayesian inverse problems in science, engineering, and imaging. Our method belongs to the family of independence Metropolis-Hastings (IMH) sampling algorithms, which are common...
MINAR: Mechanistic Interpretability for Neural Algorithmic Reasoning
arXiv:2602.21442v1 Announce Type: new Abstract: The recent field of neural algorithmic reasoning (NAR) studies the ability of graph neural networks (GNNs) to emulate classical algorithms like Bellman-Ford, a phenomenon known as algorithmic alignment. At the same time, recent advances in...
When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training
arXiv:2602.21454v1 Announce Type: new Abstract: Recurrent neural networks (RNNs) can be interpreted as discrete-time state-space models, where the state evolution corresponds to an infinite-impulse-response (IIR) filtering operation governed by both feedforward weights and recurrent poles. While, in principle, all parameters...
Effects of Training Data Quality on Classifier Performance
arXiv:2602.21462v1 Announce Type: new Abstract: We describe extensive numerical experiments assessing and quantifying how classifier performance depends on the quality of the training data, a frequently neglected component of the analysis of classifiers. More specifically, in the scientific context of...
Asymptotically Fast Clebsch-Gordan Tensor Products with Vector Spherical Harmonics
arXiv:2602.21466v1 Announce Type: new Abstract: $E(3)$-equivariant neural networks have proven to be effective in a wide range of 3D modeling tasks. A fundamental operation of such networks is the tensor product, which allows interaction between different feature types. Because this...
GradAlign: Gradient-Aligned Data Selection for LLM Reinforcement Learning
arXiv:2602.21492v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a central post-training paradigm for large language models (LLMs), but its performance is highly sensitive to the quality of training problems. This sensitivity stems from the non-stationarity of RL: rollouts...
WaterVIB: Learning Minimal Sufficient Watermark Representations via Variational Information Bottleneck
arXiv:2602.21508v1 Announce Type: new Abstract: Robust watermarking is critical for intellectual property protection, whereas existing methods face a severe vulnerability against regeneration-based AIGC attacks. We identify that existing methods fail because they entangle the watermark with high-frequency cover texture, which...
Muon+: Towards Better Muon via One Additional Normalization Step
arXiv:2602.21545v1 Announce Type: new Abstract: The Muon optimizer has demonstrated promising performance in pre-training large language models through gradient (or momentum) orthogonalization. In this work, we propose a simple yet effective enhancement to Muon, namely Muon+, which introduces an additional...
Training-free Composition of Pre-trained GFlowNets for Multi-Objective Generation
arXiv:2602.21565v1 Announce Type: new Abstract: Generative Flow Networks (GFlowNets) learn to sample diverse candidates in proportion to a reward function, making them well-suited for scientific discovery, where exploring multiple promising solutions is crucial. Further extending GFlowNets to multi-objective settings has...
ABM-UDE: Developing Surrogates for Epidemic Agent-Based Models via Scientific Machine Learning
arXiv:2602.21588v1 Announce Type: new Abstract: Agent-based epidemic models (ABMs) encode behavioral and policy heterogeneity but are too slow for nightly hospital planning. We develop county-ready surrogates that learn directly from exascale ABM trajectories using Universal Differential Equations (UDEs): mechanistic SEIR-family...
Deep Clustering based Boundary-Decoder Net for Inter and Intra Layer Stress Prediction of Heterogeneous Integrated IC Chip
arXiv:2602.21601v1 Announce Type: new Abstract: High stress occurs when 3D heterogeneous IC packages are subjected to thermal cycling at extreme temperatures. Stress mainly occurs at the interface between different materials. We investigate stress image using latent space representation which is...
How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain. In recent years, LegalAI has rapidly drawn increasing attention from both AI researchers and legal professionals, as...
How can the Supreme Court protect electoral integrity?
Justice, Democracy, and Law is a recurring series by Edward B. Foley that focuses on election law and the relationship of law and democracy. The court has already confronted cases […] The post "How can the Supreme Court protect electoral integrity?" appeared first...
SCOTUStoday for Thursday, February 26
A new Economist/YouGov poll found that 57% of Americans strongly or somewhat approve of the tariffs ruling and 23% disapprove. For more on the survey, see the Morning Reads section […] The post "SCOTUStoday for Thursday, February 26" appeared first on SCOTUSblog.
Anthropic CEO stands firm as Pentagon deadline looms
Anthropic CEO Dario Amodei said Thursday that he "cannot in good conscience accede" to the Pentagon's demands to give the military unrestricted access to its AI systems.
Read AI launches an email-based ‘digital twin’ to help you with schedules and answers
Read AI is launching Ada, which can reply with your availability and extract answers from the company knowledge base and the web.
CARE: An Explainable Computational Framework for Assessing Client-Perceived Therapeutic Alliance Using Large Language Models
arXiv:2602.20648v1 Announce Type: new Abstract: Client perceptions of the therapeutic alliance are critical for counseling effectiveness. Accurately capturing these perceptions remains challenging, as traditional post-session questionnaires are burdensome and often delayed, while existing computational approaches produce coarse scores, lack interpretable...
CAMEL: Confidence-Gated Reflection for Reward Modeling
arXiv:2602.20670v1 Announce Type: new Abstract: Reward models play a fundamental role in aligning large language models with human preferences. Existing methods predominantly follow two paradigms: scalar discriminative preference models, which are efficient but lack interpretability, and generative judging models, which...
ID-LoRA: Efficient Low-Rank Adaptation Inspired by Matrix Interpolative Decomposition
arXiv:2602.20727v1 Announce Type: new Abstract: LoRA has become a universal Parameter-Efficient Fine-Tuning (PEFT) technique that equips Large Language Models (LLMs) to adapt quickly to new tasks. However, when these models are scaled up, even the latest LoRA variants still introduce...
The Art of Efficient Reasoning: Data, Reward, and Optimization
arXiv:2602.20945v1 Announce Type: new Abstract: Large Language Models (LLMs) consistently benefit from scaled Chain-of-Thought (CoT) reasoning, but also suffer from heavy computational overhead. To address this issue, efficient reasoning aims to incentivize short yet accurate thinking trajectories, typically through reward...
Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving
arXiv:2602.20973v1 Announce Type: new Abstract: To comprehensively evaluate the mathematical reasoning capabilities of Large Language Models (LLMs), researchers have introduced abundant mathematical reasoning datasets. However, most existing datasets primarily focus on linear reasoning, neglecting other parts such as proof by...
Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning
arXiv:2602.21103v1 Announce Type: new Abstract: Advanced reasoning typically requires Chain-of-Thought prompting, which is accurate but incurs prohibitive latency and substantial test-time inference costs. The standard alternative, fine-tuning smaller models, often sacrifices interpretability while introducing significant resource and operational overhead. To...
Protein Language Models Diverge from Natural Language: Comparative Analysis and Improved Inference
arXiv:2602.20449v1 Announce Type: cross Abstract: Modern Protein Language Models (PLMs) apply transformer-based model architectures from natural language processing to biological sequences, predicting a variety of protein functions and properties. However, protein language has key differences from natural language, such as...