Large Language Models are Algorithmically Blind
arXiv:2602.21947v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate remarkable breadth of knowledge, yet their ability to reason about computational processes remains poorly understood. Closing this gap matters for practitioners who rely on LLMs to guide algorithm selection and...
CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models
arXiv:2602.21978v1 Announce Type: new Abstract: Recent work has examined language models from a linguistic perspective to better understand how they acquire language. Most existing benchmarks focus on judging grammatical acceptability, whereas the ability to interpret meanings conveyed by grammatical forms...
A Diversity Diet for a Healthier Model: A Case Study of French ModernBERT
arXiv:2602.22014v1 Announce Type: new Abstract: Diversity has been gaining interest in the NLP community in recent years. At the same time, state-of-the-art transformer models such as ModernBERT use very large pre-training datasets, which are driven by size rather than by...
Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data
arXiv:2602.21320v1 Announce Type: new Abstract: Large language models (LLMs) are becoming the foundation for autonomous agents that can use tools to solve complex tasks. Reinforcement learning (RL) has emerged as a common approach for injecting such agentic capabilities, but typically...
Efficient Opportunistic Approachability
arXiv:2602.21328v1 Announce Type: new Abstract: We study the problem of opportunistic approachability: a generalization of Blackwell approachability where the learner would like to obtain stronger guarantees (i.e., approach a smaller set) when their adversary limits themselves to a subset of...
HiPPO Zoo: Explicit Memory Mechanisms for Interpretable State Space Models
arXiv:2602.21340v1 Announce Type: new Abstract: Representing the past in a compressed, efficient, and informative manner is a central problem for systems trained on sequential data. The HiPPO framework, originally proposed by Gu, Dao et al., provides a principled approach...
Archetypal Graph Generative Models: Explainable and Identifiable Communities via Anchor-Dominant Convex Hulls
arXiv:2602.21342v1 Announce Type: new Abstract: Representation learning has been essential for graph machine learning tasks such as link prediction, community detection, and network visualization. Despite recent advances in achieving high performance on these downstream tasks, little progress has been made...
Defensive Generation
arXiv:2602.21390v1 Announce Type: new Abstract: We study the problem of efficiently producing, in an online fashion, generative models of scalar, multiclass, and vector-valued outcomes that cannot be falsified on the basis of the observed data and a pre-specified collection of...
Generative Bayesian Computation as a Scalable Alternative to Gaussian Process Surrogates
arXiv:2602.21408v1 Announce Type: new Abstract: Gaussian process (GP) surrogates are the default tool for emulating expensive computer experiments, but cubic cost, stationarity assumptions, and Gaussian predictive distributions limit their reach. We propose Generative Bayesian Computation (GBC) via Implicit Quantile Networks...
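Implicit Quantile Networks, which the abstract names as the basis of GBC, are conventionally trained with the quantile (pinball) loss. As background only — the paper's full construction is not given in this teaser — a minimal sketch of that loss at level `tau` is:

```python
def pinball_loss(tau, y_true, y_pred):
    # Quantile (pinball) loss at level tau in (0, 1): under-prediction
    # is penalized with weight tau, over-prediction with weight 1 - tau.
    # Minimizing it in expectation recovers the tau-quantile of y_true.
    u = y_true - y_pred
    return max(tau * u, (tau - 1.0) * u)
```

At `tau = 0.5` this reduces to half the absolute error, i.e. median regression; averaging over many sampled `tau` values is what lets a single network represent a full predictive distribution rather than a Gaussian one.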
Proximal-IMH: Proximal Posterior Proposals for Independent Metropolis-Hastings with Approximate Operators
arXiv:2602.21426v1 Announce Type: new Abstract: We consider the problem of sampling from a posterior distribution arising in Bayesian inverse problems in science, engineering, and imaging. Our method belongs to the family of independence Metropolis-Hastings (IMH) sampling algorithms, which are common...
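The abstract places the method in the independence Metropolis-Hastings (IMH) family, where proposals are drawn independently of the current state. A textbook IMH chain — not the paper's proximal posterior proposal, which is not described in this excerpt — can be sketched as:

```python
import math
import random

def imh_chain(log_target, propose, log_proposal, x0, n_steps, rng):
    # Independence Metropolis-Hastings: each proposal y ~ q(.) ignores
    # the current state x, and is accepted with probability
    # min(1, [pi(y) q(x)] / [pi(x) q(y)]), tracked here in log space.
    x, lx = x0, log_target(x0) - log_proposal(x0)
    chain = [x]
    for _ in range(n_steps):
        y = propose(rng)
        ly = log_target(y) - log_proposal(y)
        if math.log(rng.random()) < ly - lx:  # accept/reject step
            x, lx = y, ly
        chain.append(x)
    return chain
```

If the proposal exactly matches the target (a hypothetical best case), the acceptance ratio is always one and the chain yields i.i.d. samples; the closer the proposal is to the posterior, the better IMH mixes, which is presumably why posterior-shaped proposals are attractive here.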
Provably Safe Generative Sampling with Constricting Barrier Functions
arXiv:2602.21429v1 Announce Type: new Abstract: Flow-based generative models, such as diffusion models and flow matching models, have achieved remarkable success in learning complex data distributions. However, a critical gap remains for their deployment in safety-critical domains: the lack of formal...
D-Flow SGLD: Source-Space Posterior Sampling for Scientific Inverse Problems with Flow Matching
arXiv:2602.21469v1 Announce Type: new Abstract: Data assimilation and scientific inverse problems require reconstructing high-dimensional physical states from sparse and noisy observations, ideally with uncertainty-aware posterior samples that remain faithful to learned priors and governing physics. While training-free conditional generation is...
WaterVIB: Learning Minimal Sufficient Watermark Representations via Variational Information Bottleneck
arXiv:2602.21508v1 Announce Type: new Abstract: Robust watermarking is critical for intellectual property protection, yet existing methods remain severely vulnerable to regeneration-based AIGC attacks. We identify that existing methods fail because they entangle the watermark with high-frequency cover texture, which...
Training Generalizable Collaborative Agents via Strategic Risk Aversion
arXiv:2602.21515v1 Announce Type: new Abstract: Many emerging agentic paradigms require agents to collaborate with one another (or people) to achieve shared goals. Unfortunately, existing approaches to learning policies for such collaborative problems produce brittle solutions that fail when paired with...
Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction
arXiv:2602.21550v1 Announce Type: new Abstract: Gene expression prediction, which predicts mRNA expression levels from DNA sequences, presents significant challenges. Previous works often focus on extending input sequence length to locate distal enhancers, which may influence target genes from hundreds of...
Duel-Evolve: Reward-Free Test-Time Scaling via LLM Self-Preferences
arXiv:2602.21585v1 Announce Type: new Abstract: Many applications seek to optimize LLM outputs at test time by iteratively proposing, scoring, and refining candidates over a discrete output space. Existing methods use a calibrated scalar evaluator for the target objective to guide...
How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence
Legal Artificial Intelligence (LegalAI) focuses on applying the technology of artificial intelligence, especially natural language processing, to benefit tasks in the legal domain. In recent years, LegalAI has rapidly drawn increasing attention from both AI researchers and legal professionals, as...
AI’s Future May Be Quantum
Stephanie Seoyun Hwang, J.D. Class of 2028 While most people recognize AI as a transformative force, fewer are aware of one of the key technologies fueling its progress: quantum computing. In fact, many governments and tech industry actors see it...
Anthropic CEO stands firm as Pentagon deadline looms
Anthropic CEO Dario Amodei said Thursday that he "cannot in good conscience accede" to the Pentagon's demands to give the military unrestricted access to its AI systems.
Semantic Novelty at Scale: Narrative Shape Taxonomy and Readership Prediction in 28,606 Books
arXiv:2602.20647v1 Announce Type: new Abstract: I introduce semantic novelty--cosine distance between each paragraph's sentence embedding and the running centroid of all preceding paragraphs--as an information-theoretic measure of narrative structure at corpus scale. Applying it to 28,606 books in PG19 (pre-1920...
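The metric itself is fully specified in the abstract: cosine distance between each paragraph's sentence embedding and the running centroid of all preceding paragraphs. Assuming embeddings are already computed (the embedding model is not named here), a minimal sketch is:

```python
import numpy as np

def semantic_novelty(embeddings):
    # Semantic novelty per the abstract: 1 - cosine similarity between
    # each paragraph embedding and the running centroid (mean) of all
    # preceding paragraph embeddings. The first paragraph gets 0.0 by
    # convention, since it has no predecessors.
    novelty = [0.0]
    centroid = np.asarray(embeddings[0], dtype=float)
    for i, e in enumerate(embeddings[1:], start=1):
        e = np.asarray(e, dtype=float)
        cos = e @ centroid / (np.linalg.norm(e) * np.linalg.norm(centroid))
        novelty.append(1.0 - cos)
        # incremental update of the mean over the first i + 1 paragraphs
        centroid = (centroid * i + e) / (i + 1)
    return novelty
```

A paragraph aligned with everything before it scores near 0; a paragraph orthogonal to the running centroid scores near 1, so the sequence of scores traces the book's narrative shape.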
CAMEL: Confidence-Gated Reflection for Reward Modeling
arXiv:2602.20670v1 Announce Type: new Abstract: Reward models play a fundamental role in aligning large language models with human preferences. Existing methods predominantly follow two paradigms: scalar discriminative preference models, which are efficient but lack interpretability, and generative judging models, which...
Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization
arXiv:2602.20743v1 Announce Type: new Abstract: Anonymizing textual documents is a highly context-sensitive problem: the appropriate balance between privacy protection and utility preservation varies with the data domain, privacy objectives, and downstream application. However, existing anonymization methods rely on static, manually...
Explicit Grammar Semantic Feature Fusion for Robust Text Classification
arXiv:2602.20749v1 Announce Type: new Abstract: Natural Language Processing enables computers to understand human language by efficiently analysing and classifying text using deep grammatical and semantic features. Existing models capture features by learning from large corpora with transformer models, which are...
Don't Ignore the Tail: Decoupling top-K Probabilities for Efficient Language Model Distillation
arXiv:2602.20816v1 Announce Type: new Abstract: The core learning signal used in language model distillation is the standard Kullback-Leibler (KL) divergence between the student and teacher distributions. Traditional KL divergence tends to be dominated by the next tokens with the highest...
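The baseline the abstract critiques is the standard forward KL divergence between teacher and student next-token distributions. As a hedged sketch of that baseline only — the paper's decoupled top-K variant is not detailed in this excerpt — using plain numpy over raw logits:

```python
import numpy as np

def kl_distill_loss(teacher_logits, student_logits):
    # Standard forward KL(teacher || student) over the vocabulary:
    # sum_v p_v * (log p_v - log q_v). High-probability tokens carry
    # large p_v and therefore dominate this sum, which is the issue
    # the abstract points to.
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(np.asarray(teacher_logits, dtype=float))
    q = softmax(np.asarray(student_logits, dtype=float))
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

Because each term is weighted by the teacher probability `p_v`, the long tail of low-probability tokens contributes almost nothing to the gradient, which motivates treating the top-K head and the tail separately.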
FinAnchor: Aligned Multi-Model Representations for Financial Prediction
arXiv:2602.20859v1 Announce Type: new Abstract: Financial prediction from long documents involves significant challenges, as actionable signals are often sparse and obscured by noise, and the optimal LLM for generating embeddings varies across tasks and time periods. In this paper, we...
The Art of Efficient Reasoning: Data, Reward, and Optimization
arXiv:2602.20945v1 Announce Type: new Abstract: Large Language Models (LLMs) consistently benefit from scaled Chain-of-Thought (CoT) reasoning, but also suffer from heavy computational overhead. To address this issue, efficient reasoning aims to incentivize short yet accurate thinking trajectories, typically through reward...
Blackbird Language Matrices: A Framework to Investigate the Linguistic Competence of Language Models
arXiv:2602.20966v1 Announce Type: new Abstract: This article describes a novel language task, the Blackbird Language Matrices (BLM) task, inspired by intelligence tests, and illustrates the BLM datasets, their construction and benchmarking, and targeted experiments on chunking and systematicity. BLMs are...
Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving
arXiv:2602.20973v1 Announce Type: new Abstract: To comprehensively evaluate the mathematical reasoning capabilities of Large Language Models (LLMs), researchers have introduced abundant mathematical reasoning datasets. However, most existing datasets primarily focus on linear reasoning, neglecting other reasoning patterns such as proof by...
On Data Engineering for Scaling LLM Terminal Capabilities
arXiv:2602.21193v1 Announce Type: new Abstract: Despite rapid recent progress in the terminal capabilities of large language models, the training data strategies behind state-of-the-art terminal agents remain largely undisclosed. We address this gap through a systematic study of data engineering practices...
MedCLIPSeg: Probabilistic Vision-Language Adaptation for Data-Efficient and Generalizable Medical Image Segmentation
arXiv:2602.20423v1 Announce Type: cross Abstract: Medical image segmentation remains challenging due to limited annotations for training, ambiguous anatomical features, and domain shifts. While vision-language models such as CLIP offer strong cross-modal representations, their potential for dense, text-guided medical image segmentation...