Web Verbs: Typed Abstractions for Reliable Task Composition on the Agentic Web
arXiv:2602.17245v1 Announce Type: new Abstract: The Web is evolving from a medium that humans browse to an environment where software agents act on behalf of users. Advances in large language models (LLMs) make natural language a practical interface for goal-directed...
Meenz bleibt Meenz, but Large Language Models Do Not Speak Its Dialect
arXiv:2602.16852v1 Announce Type: new Abstract: Meenzerisch, the dialect spoken in the German city of Mainz, is also the traditional language of the Mainz carnival, a yearly celebration well known throughout Germany. However, Meenzerisch is on the verge of dying out-a...
Persona2Web: Benchmarking Personalized Web Agents for Contextual Reasoning with User History
arXiv:2602.17003v1 Announce Type: new Abstract: Large language models have advanced web agents, yet current agents lack personalization capabilities. Since users rarely specify every detail of their intent, practical web agents must be able to interpret ambiguous queries by inferring user...
ReIn: Conversational Error Recovery with Reasoning Inception
arXiv:2602.17022v1 Announce Type: new Abstract: Conversational agents powered by large language models (LLMs) with tool integration achieve strong performance on fixed task-oriented dialogue datasets but remain vulnerable to unanticipated, user-induced errors. Rather than focusing on error prevention, this work focuses...
Large Language Models Persuade Without Planning Theory of Mind
arXiv:2602.17045v1 Announce Type: new Abstract: A growing body of work attempts to evaluate the theory of mind (ToM) abilities of humans and large language models (LLMs) using static, non-interactive question-and-answer benchmarks. However, theoretical work in the field suggests that first-personal...
ALPS: A Diagnostic Challenge Set for Arabic Linguistic & Pragmatic Reasoning
arXiv:2602.17054v1 Announce Type: new Abstract: While recent Arabic NLP benchmarks focus on scale, they often rely on synthetic or translated data which may benefit from deeper linguistic verification. We introduce ALPS (Arabic Linguistic & Pragmatic Suite), a native, expert-curated diagnostic...
The Emergence of Lab-Driven Alignment Signatures: A Psychometric Framework for Auditing Latent Bias and Compounding Risk in Generative AI
arXiv:2602.17127v1 Announce Type: new Abstract: As Large Language Models (LLMs) transition from standalone chat interfaces to foundational reasoning layers in multi-agent systems and recursive evaluation loops (LLM-as-a-judge), the detection of durable, provider-level behavioral signatures becomes a critical requirement for safety...
Quantifying and Mitigating Socially Desirable Responding in LLMs: A Desirability-Matched Graded Forced-Choice Psychometric Study
arXiv:2602.17262v1 Announce Type: new Abstract: Human self-report questionnaires are increasingly used in NLP to benchmark and audit large language models (LLMs), from persona consistency to safety and bias assessments. Yet these instruments presume honest responding; in evaluative contexts, LLMs can...
OpenAI debated calling police about suspected Canadian shooter’s chats
Jesse Van Rootselaar's descriptions of gun violence were flagged by tools that monitor ChatGPT for misuse.
RPDR: A Round-trip Prediction-Based Data Augmentation Framework for Long-Tail Question Answering
arXiv:2602.17366v1 Announce Type: new Abstract: Long-tail question answering presents significant challenges for large language models (LLMs) due to their limited ability to acquire and accurately recall less common knowledge. Retrieval-augmented generation (RAG) systems have shown great promise in mitigating this...
Diverse Word Choices, Same Reference: Annotating Lexically-Rich Cross-Document Coreference
arXiv:2602.17424v1 Announce Type: new Abstract: Cross-document coreference resolution (CDCR) identifies and links mentions of the same entities and events across related documents, enabling content analysis that aggregates information at the level of discourse participants. However, existing datasets primarily focus on...
Fine-Grained Uncertainty Quantification for Long-Form Language Model Outputs: A Comparative Study
arXiv:2602.17431v1 Announce Type: new Abstract: Uncertainty quantification has emerged as an effective approach to closed-book hallucination detection for LLMs, but existing methods are largely designed for short-form outputs and do not generalize well to long-form generation. We introduce a taxonomy...
PEACE 2.0: Grounded Explanations and Counter-Speech for Combating Hate Expressions
arXiv:2602.17467v1 Announce Type: new Abstract: The increasing volume of hate speech on online platforms poses significant societal challenges. While the Natural Language Processing community has developed effective methods to automatically detect the presence of hate speech, responses to it, called...
Modeling Distinct Human Interaction in Web Agents
arXiv:2602.17588v1 Announce Type: new Abstract: Despite rapid progress in autonomous web agents, human involvement remains essential for shaping preferences and correcting agent behavior as tasks unfold. However, current agentic systems lack a principled understanding of when and why humans intervene,...
The Cascade Equivalence Hypothesis: When Do Speech LLMs Behave Like ASR$\rightarrow$LLM Pipelines?
arXiv:2602.17598v1 Announce Type: new Abstract: Current speech LLMs largely perform implicit ASR: on tasks solvable from a transcript, they are behaviorally and mechanistically equivalent to simple Whisper$\to$LLM cascades. We show this through matched-backbone testing across four speech LLMs and six...
Differences in Typological Alignment in Language Models' Treatment of Differential Argument Marking
arXiv:2602.17653v1 Announce Type: new Abstract: Recent work has shown that language models (LMs) trained on synthetic corpora can exhibit typological preferences that resemble cross-linguistic regularities in human languages, particularly for syntactic phenomena such as word order. In this paper, we...
Better Think Thrice: Learning to Reason Causally with Double Counterfactual Consistency
arXiv:2602.16787v1 Announce Type: cross Abstract: Despite their strong performance on reasoning benchmarks, large language models (LLMs) have proven brittle when presented with counterfactual questions, suggesting weaknesses in their causal reasoning ability. While recent work has demonstrated that labeled counterfactual tasks...
Hybrid-Gym: Training Coding Agents to Generalize Across Tasks
arXiv:2602.16819v1 Announce Type: cross Abstract: When assessing the quality of coding agents, predominant benchmarks focus on solving single issues on GitHub, such as SWE-Bench. In contrast, in real use, these agents solve more various and complex tasks that involve other...
Quantifying LLM Attention-Head Stability: Implications for Circuit Universality
arXiv:2602.16740v1 Announce Type: new Abstract: In mechanistic interpretability, recent work scrutinizes transformer "circuits" - sparse, mono or multi layer sub computations, that may reflect human understandable functions. Yet, these network circuits are rarely acid-tested for their stability across different instances...
PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency
arXiv:2602.16745v1 Announce Type: new Abstract: Test-time scaling can improve model performance by aggregating stochastic reasoning trajectories. However, achieving sample-efficient test-time self-consistency under a limited budget remains an open challenge. We introduce PETS (Principled and Efficient Test-TimeSelf-Consistency), which initiates a principled...
Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking
arXiv:2602.16746v1 Announce Type: new Abstract: Grokking -- the delayed transition from memorization to generalization in small algorithmic tasks -- remains poorly understood. We present a geometric analysis of optimization dynamics in transformers trained on modular arithmetic. PCA of attention weight...
Escaping the Cognitive Well: Efficient Competition Math with Off-the-Shelf Models
arXiv:2602.16793v1 Announce Type: new Abstract: In the past year, custom and unreleased math reasoning models reached gold medal performance on the International Mathematical Olympiad (IMO). Similar performance was then reported using large-scale inference on publicly available models but at prohibitive...
HiVAE: Hierarchical Latent Variables for Scalable Theory of Mind
arXiv:2602.16826v1 Announce Type: new Abstract: Theory of mind (ToM) enables AI systems to infer agents' hidden goals and mental states, but existing approaches focus mainly on small human understandable gridworld spaces. We introduce HiVAE, a hierarchical variational architecture that scales...
VAM: Verbalized Action Masking for Controllable Exploration in RL Post-Training -- A Chess Case Study
arXiv:2602.16833v1 Announce Type: new Abstract: Exploration remains a key bottleneck for reinforcement learning (RL) post-training of large language models (LLMs), where sparse feedback and large action spaces can lead to premature collapse into repetitive behaviors. We propose Verbalized Action Masking...
Multi-Agent Lipschitz Bandits
arXiv:2602.16965v1 Announce Type: new Abstract: We study the decentralized multi-player stochastic bandit problem over a continuous, Lipschitz-structured action space where hard collisions yield zero reward. Our objective is to design a communication-free policy that maximizes collective reward, with coordination costs...
A Unified Framework for Locality in Scalable MARL
arXiv:2602.16966v1 Announce Type: new Abstract: Scalable Multi-Agent Reinforcement Learning (MARL) is fundamentally challenged by the curse of dimensionality. A common solution is to exploit locality, which hinges on an Exponential Decay Property (EDP) of the value function. However, existing conditions...
Discovering Universal Activation Directions for PII Leakage in Language Models
arXiv:2602.16980v1 Announce Type: new Abstract: Modern language models exhibit rich internal structure, yet little is known about how privacy-sensitive behaviors, such as personally identifiable information (PII) leakage, are represented and modulated within their hidden states. We present UniLeak, a mechanistic-interpretability...
Action-Graph Policies: Learning Action Co-dependencies in Multi-Agent Reinforcement Learning
arXiv:2602.17009v1 Announce Type: new Abstract: Coordinating actions is the most fundamental form of cooperation in multi-agent reinforcement learning (MARL). Successful decentralized decision-making often depends not only on good individual actions, but on selecting compatible actions across agents to synchronize behavior,...
WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization for Rollout-Efficient Reasoning
arXiv:2602.17025v1 Announce Type: new Abstract: Group Relative Policy Optimization (GRPO) is effective for training language models on complex reasoning. However, since the objective is defined relative to a group of sampled trajectories, extended deliberation can create more chances to realize...
MeGU: Machine-Guided Unlearning with Target Feature Disentanglement
arXiv:2602.17088v1 Announce Type: new Abstract: The growing concern over training data privacy has elevated the "Right to be Forgotten" into a critical requirement, thereby raising the demand for effective Machine Unlearning. However, existing unlearning approaches commonly suffer from a fundamental...