Analyzing LLM Instruction Optimization for Tabular Fact Verification
arXiv:2602.17937v1 Announce Type: new Abstract: Instruction optimization provides a lightweight, model-agnostic approach to enhancing the reasoning performance of large language models (LLMs). This paper presents the first systematic comparison of instruction optimization, based on the DSPy optimization framework, for tabular...
Decomposing Retrieval Failures in RAG for Long-Document Financial Question Answering
arXiv:2602.17981v1 Announce Type: new Abstract: Retrieval-augmented generation is increasingly used for financial question answering over long regulatory filings, yet reliability depends on retrieving the exact context needed to justify answers in high stakes settings. We study a frequent failure mode...
Perceived Political Bias in LLMs Reduces Persuasive Abilities
arXiv:2602.18092v1 Announce Type: new Abstract: Conversational AI has been proposed as a scalable way to correct public misconceptions and spread misinformation. Yet its effectiveness may depend on perceptions of its political neutrality. As LLMs enter partisan conflict, elites increasingly portray...
Agentic Adversarial QA for Improving Domain-Specific LLMs
arXiv:2602.18137v1 Announce Type: new Abstract: Large Language Models (LLMs), despite extensive pretraining on broad internet corpora, often struggle to adapt effectively to specialized domains. There is growing interest in fine-tuning these models for such domains; however, progress is constrained by...
Information-Theoretic Storage Cost in Sentence Comprehension
arXiv:2602.18217v1 Announce Type: new Abstract: Real-time sentence comprehension imposes a significant load on working memory, as comprehenders must maintain contextual information to anticipate future input. While measures of such load have played an important role in psycholinguistic theories, they have...
Thinking by Subtraction: Confidence-Driven Contrastive Decoding for LLM Reasoning
arXiv:2602.18232v1 Announce Type: new Abstract: Recent work on test-time scaling for large language model (LLM) reasoning typically assumes that allocating more inference-time computation uniformly improves correctness. However, prior studies show that reasoning uncertainty is highly localized: a small subset of...
PsihoRo: Depression and Anxiety Romanian Text Corpus
arXiv:2602.18324v1 Announce Type: new Abstract: Psychological corpora in NLP are collections of texts used to analyze human psychology, emotions, and mental health. These texts allow researchers to study psychological constructs, detect mental health issues and analyze emotional language. However, mental...
SPQ: An Ensemble Technique for Large Language Model Compression
arXiv:2602.18420v1 Announce Type: new Abstract: This study presents an ensemble technique, SPQ (SVD-Pruning-Quantization), for large language model (LLM) compression that combines variance-retained singular value decomposition (SVD), activation-based pruning, and post-training linear quantization. Each component targets a different source of inefficiency:...
Lost Before Translation: Social Information Transmission and Survival in AI-AI Communication
arXiv:2602.17674v1 Announce Type: cross Abstract: When AI systems summarize and relay information, they inevitably transform it. But how? We introduce an experimental paradigm based on the telephone game to study what happens when AI talks to AI. Across five studies...
Tethered Reasoning: Decoupling Entropy from Hallucination in Quantized LLMs via Manifold Steering
arXiv:2602.17691v1 Announce Type: cross Abstract: Quantized language models face a fundamental dilemma: low sampling temperatures yield repetitive, mode-collapsed outputs, while high temperatures (T > 2.0) cause trajectory divergence and semantic incoherence. We present HELIX, a geometric framework that decouples output...
Bayesian Optimality of In-Context Learning with Selective State Spaces
arXiv:2602.17744v1 Announce Type: cross Abstract: We propose Bayesian optimal sequential prediction as a new principle for understanding in-context learning (ICL). Unlike interpretations framing Transformers as performing implicit gradient descent, we formalize ICL as meta-learning over latent sequence tasks. For tasks...
TFL: Targeted Bit-Flip Attack on Large Language Model
arXiv:2602.17837v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly deployed in safety and security critical applications, raising concerns about their robustness to model parameter fault injection attacks. Recent studies have shown that bit-flip attacks (BFAs), which exploit computer...
NIMMGen: Learning Neural-Integrated Mechanistic Digital Twins with LLMs
arXiv:2602.18008v1 Announce Type: cross Abstract: Mechanistic models encode scientific knowledge about dynamical systems and are widely used in downstream scientific and policy applications. Recent work has explored LLM-based agentic frameworks to automatically construct mechanistic models from data; however, existing problem...
Analyzing and Improving Chain-of-Thought Monitorability Through Information Theory
arXiv:2602.18297v1 Announce Type: cross Abstract: Chain-of-thought (CoT) monitors are LLM-based systems that analyze reasoning traces to detect when outputs may exhibit attributes of interest, such as test-hacking behavior during code generation. In this paper, we use information-theoretic analysis to show...
On the Semantic and Syntactic Information Encoded in Proto-Tokens for One-Step Text Reconstruction
arXiv:2602.18301v1 Announce Type: cross Abstract: Autoregressive large language models (LLMs) generate text token-by-token, requiring n forward passes to produce a sequence of length n. Recent work, Exploring the Latent Capacity of LLMs for One-Step Text Reconstruction (Mezentsev and Oseledets), shows...
Joint Parameter and State-Space Bayesian Optimization: Using Process Expertise to Accelerate Manufacturing Optimization
arXiv:2602.17679v1 Announce Type: new Abstract: Bayesian optimization (BO) is a powerful method for optimizing black-box manufacturing processes, but its performance is often limited when dealing with high-dimensional multi-stage systems, where we can observe intermediate outputs. Standard BO models the process...
BioBridge: Bridging Proteins and Language for Enhanced Biological Reasoning with LLMs
arXiv:2602.17680v1 Announce Type: new Abstract: Existing Protein Language Models (PLMs) often suffer from limited adaptability to multiple tasks and exhibit poor generalization across diverse biological contexts. In contrast, general-purpose Large Language Models (LLMs) lack the capability to interpret protein sequences...
Optimal Multi-Debris Mission Planning in LEO: A Deep Reinforcement Learning Approach with Co-Elliptic Transfers and Refueling
arXiv:2602.17685v1 Announce Type: new Abstract: This paper addresses the challenge of multi target active debris removal (ADR) in Low Earth Orbit (LEO) by introducing a unified coelliptic maneuver framework that combines Hohmann transfers, safety ellipse proximity operations, and explicit refueling...
Parallel Complex Diffusion for Scalable Time Series Generation
arXiv:2602.17706v1 Announce Type: new Abstract: Modeling long-range dependencies in time series generation poses a fundamental trade-off between representational capacity and computational efficiency. Traditional temporal diffusion models suffer from local entanglement and the $\mathcal{O}(L^2)$ cost of attention mechanisms. We address these...
Provable Adversarial Robustness in In-Context Learning
arXiv:2602.17743v1 Announce Type: new Abstract: Large language models adapt to new tasks through in-context learning (ICL) without parameter updates. Current theoretical explanations for this capability assume test tasks are drawn from a distribution similar to that seen during pretraining. This...
Asking Forever: Universal Activations Behind Turn Amplification in Conversational LLMs
arXiv:2602.17778v1 Announce Type: new Abstract: Multi-turn interaction length is a dominant factor in the operational costs of conversational LLMs. In this work, we present a new failure mode in conversational LLMs: turn amplification, in which a model consistently prolongs multi-turn...
Calibrated Adaptation: Bayesian Stiefel Manifold Priors for Reliable Parameter-Efficient Fine-Tuning
arXiv:2602.17809v1 Announce Type: new Abstract: Parameter-efficient fine-tuning methods such as LoRA enable practical adaptation of large language models but provide no principled uncertainty estimates, leading to poorly calibrated predictions and unreliable behavior under domain shift. We introduce Stiefel-Bayes Adapters (SBA),...
Avoid What You Know: Divergent Trajectory Balance for GFlowNets
arXiv:2602.17827v1 Announce Type: new Abstract: Generative Flow Networks (GFlowNets) are a flexible family of amortized samplers trained to generate discrete and compositional objects with probability proportional to a reward function. However, learning efficiency is constrained by the model's ability to...
Influence-Preserving Proxies for Gradient-Based Data Selection in LLM Fine-tuning
arXiv:2602.17835v1 Announce Type: new Abstract: Supervised fine-tuning (SFT) relies critically on selecting training data that most benefits a model's downstream performance. Gradient-based data selection methods such as TracIn and Influence Functions leverage influence to identify useful samples, but their computational...
Two Calm Ends and the Wild Middle: A Geometric Picture of Memorization in Diffusion Models
arXiv:2602.17846v1 Announce Type: new Abstract: Diffusion models generate high-quality samples but can also memorize training data, raising serious privacy concerns. Understanding the mechanisms governing when memorization versus generalization occurs remains an active area of research. In particular, it is unclear...
Neural Prior Estimation: Learning Class Priors from Latent Representations
arXiv:2602.17853v1 Announce Type: new Abstract: Class imbalance induces systematic bias in deep neural networks by imposing a skewed effective class prior. This work introduces the Neural Prior Estimator (NPE), a framework that learns feature-conditioned log-prior estimates from latent representations. NPE...
JAX-Privacy: A library for differentially private machine learning
arXiv:2602.17861v1 Announce Type: new Abstract: JAX-Privacy is a library designed to simplify the deployment of robust and performant mechanisms for differentially private machine learning. Guided by design principles of usability, flexibility, and efficiency, JAX-Privacy serves both researchers requiring deep customization...
Breaking the Correlation Plateau: On the Optimization and Capacity Limits of Attention-Based Regressors
arXiv:2602.17898v1 Announce Type: new Abstract: Attention-based regression models are often trained by jointly optimizing Mean Squared Error (MSE) loss and Pearson correlation coefficient (PCC) loss, emphasizing the magnitude of errors and the order or shape of targets, respectively. A common...
Distribution-Free Sequential Prediction with Abstentions
arXiv:2602.17918v1 Announce Type: new Abstract: We study a sequential prediction problem in which an adversary is allowed to inject arbitrarily many adversarial instances in a stream of i.i.d.\ instances, but at each round, the learner may also \emph{abstain} from making...
Causal Neighbourhood Learning for Invariant Graph Representations
arXiv:2602.17934v1 Announce Type: new Abstract: Graph data often contain noisy and spurious correlations that mask the true causal relationships, which are essential for enabling graph models to make predictions based on the underlying causal structure of the data. Dependence on...