Language Model Representations for Efficient Few-Shot Tabular Classification
arXiv:2602.15844v1 Announce Type: cross Abstract: The Web is a rich source of structured data in the form of tables, from product catalogs and knowledge bases to scientific datasets. However, the heterogeneity of the structure and semantics of these tables makes...
Artificial Intelligence and Justice in Family Law: Addressing Bias and Promoting Fairness
Artificial Intelligence (AI) plays a crucial role in the legal field today, carrying out processes such as predictive analysis, data interpretation, and decision making. AI is valued for its efficiency and accuracy along with its affordability. However, one problem that...
Preference Optimization for Review Question Generation Improves Writing Quality
arXiv:2602.15849v1 Announce Type: cross Abstract: Peer review relies on substantive, evidence-based questions, yet existing LLM-based approaches often generate surface-level queries, drawing over 50\% of their question tokens from a paper's first page. To bridge this gap, we develop IntelliReward, a...
Narrative Theory-Driven LLM Methods for Automatic Story Generation and Understanding: A Survey
arXiv:2602.15851v1 Announce Type: cross Abstract: Applications of narrative theories using large language models (LLMs) deliver promising use-cases in automatic story generation and understanding tasks. Our survey examines how natural language processing (NLP) research engages with fields of narrative studies, and...
Kalman-Inspired Runtime Stability and Recovery in Hybrid Reasoning Systems
arXiv:2602.15855v1 Announce Type: cross Abstract: Hybrid reasoning systems that combine learned components with model-based inference are increasingly deployed in tool-augmented decision loops, yet their runtime behavior under partial observability and sustained evidence mismatch remains poorly understood. In practice, failures often...
NLP Privacy Risk Identification in Social Media (NLP-PRISM): A Survey
arXiv:2602.15866v1 Announce Type: cross Abstract: Natural Language Processing (NLP) is integral to social media analytics but often processes content containing Personally Identifiable Information (PII), behavioral cues, and metadata raising privacy risks such as surveillance, profiling, and targeted advertising. To systematically...
Fly0: Decoupling Semantic Grounding from Geometric Planning for Zero-Shot Aerial Navigation
arXiv:2602.15875v1 Announce Type: cross Abstract: Current Visual-Language Navigation (VLN) methodologies face a trade-off between semantic understanding and control precision. While Multimodal Large Language Models (MLLMs) offer superior reasoning, deploying them as low-level controllers leads to high latency, trajectory oscillations, and...
IT-OSE: Exploring Optimal Sample Size for Industrial Data Augmentation
arXiv:2602.15878v1 Announce Type: cross Abstract: In industrial scenarios, data augmentation is an effective approach to improve model performance. However, its benefits are not unidirectionally beneficial. There is no theoretical research or established estimation for the optimal sample size (OSS) in...
Evidence for Daily and Weekly Periodic Variability in GPT-4o Performance
arXiv:2602.15889v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used in research both as tools and as objects of investigation. Much of this work implicitly assumes that LLM performance under fixed conditions (identical model snapshot, hyperparameters, and prompt)...
Understand Then Memory: A Cognitive Gist-Driven RAG Framework with Global Semantic Diffusion
arXiv:2602.15895v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) effectively mitigates hallucinations in LLMs by incorporating external knowledge. However, the inherent discrete representation of text in existing frameworks often results in a loss of semantic integrity, leading to retrieval deviations. Inspired...
Doc-to-LoRA: Learning to Instantly Internalize Contexts
arXiv:2602.15902v1 Announce Type: cross Abstract: Long input sequences are central to in-context learning, document understanding, and multi-step reasoning of Large Language Models (LLMs). However, the quadratic attention cost of Transformers makes inference memory-intensive and slow. While context distillation (CD) can...
SourceBench: Can AI Answers Reference Quality Web Sources?
arXiv:2602.16942v1 Announce Type: new Abstract: Large language models (LLMs) increasingly answer queries by citing web sources, but existing evaluations emphasize answer correctness rather than evidence quality. We introduce SourceBench, a benchmark for measuring the quality of cited web sources across...
LLM4Cov: Execution-Aware Agentic Learning for High-coverage Testbench Generation
arXiv:2602.16953v1 Announce Type: new Abstract: Execution-aware LLM agents offer a promising paradigm for learning from tool feedback, but such feedback is often expensive and slow to obtain, making online reinforcement learning (RL) impractical. High-coverage hardware verification exemplifies this challenge due...
Sales Research Agent and Sales Research Bench
arXiv:2602.17017v1 Announce Type: new Abstract: Enterprises increasingly need AI systems that can answer sales-leader questions over live, customized CRM data, but most available models do not expose transparent, repeatable evidence of quality. This paper describes the Sales Research Agent in...
Agentic Wireless Communication for 6G: Intent-Aware and Continuously Evolving Physical-Layer Intelligence
arXiv:2602.17096v1 Announce Type: new Abstract: As 6G wireless systems evolve, growing functional complexity and diverse service demands are driving a shift from rule-based control to intent-driven autonomous intelligence. User requirements are no longer captured by a single metric (e.g., throughput...
Instructor-Aligned Knowledge Graphs for Personalized Learning
arXiv:2602.17111v1 Announce Type: new Abstract: Mastering educational concepts requires understanding both their prerequisites (e.g., recursion before merge sort) and sub-concepts (e.g., merge sort as part of sorting algorithms). Capturing these dependencies is critical for identifying students' knowledge gaps and enabling...
Epistemology of Generative AI: The Geometry of Knowing
arXiv:2602.17116v1 Announce Type: new Abstract: Generative AI presents an unprecedented challenge to our understanding of knowledge and its production. Unlike previous technological transformations, where engineering understanding preceded or accompanied deployment, generative AI operates through mechanisms whose epistemic character remains obscure,...
Mechanistic Interpretability of Cognitive Complexity in LLMs via Linear Probing using Bloom's Taxonomy
arXiv:2602.17229v1 Announce Type: new Abstract: The black-box nature of Large Language Models necessitates novel evaluation frameworks that transcend surface-level performance metrics. This study investigates the internal neural representations of cognitive complexity using Bloom's Taxonomy as a hierarchical lens. By analyzing...
Claim Automation using Large Language Model
arXiv:2602.16836v1 Announce Type: new Abstract: While Large Language Models (LLMs) have achieved strong performance on general-purpose language tasks, their deployment in regulated and data-sensitive domains, including insurance, remains limited. Leveraging millions of historical warranty claims, we propose a locally deployed...
When Semantic Overlap Is Not Enough: Cross-Lingual Euphemism Transfer Between Turkish and English
arXiv:2602.16957v1 Announce Type: new Abstract: Euphemisms substitute socially sensitive expressions, often softening or reframing meaning, and their reliance on cultural and pragmatic context complicates modeling across languages. In this study, we investigate how cross-lingual equivalence influences transfer in multilingual euphemism...
Eigenmood Space: Uncertainty-Aware Spectral Graph Analysis of Psychological Patterns in Classical Persian Poetry
arXiv:2602.16959v1 Announce Type: new Abstract: Classical Persian poetry is a historically sustained archive in which affective life is expressed through metaphor, intertextual convention, and rhetorical indirection. These properties make close reading indispensable while limiting reproducible comparison at scale. We present...
Evaluating Cross-Lingual Classification Approaches Enabling Topic Discovery for Multilingual Social Media Data
arXiv:2602.17051v1 Announce Type: new Abstract: Analysing multilingual social media discourse remains a major challenge in natural language processing, particularly when large-scale public debates span across diverse languages. This study investigates how different approaches for cross-lingual text classification can support reliable...
ALPS: A Diagnostic Challenge Set for Arabic Linguistic & Pragmatic Reasoning
arXiv:2602.17054v1 Announce Type: new Abstract: While recent Arabic NLP benchmarks focus on scale, they often rely on synthetic or translated data which may benefit from deeper linguistic verification. We introduce ALPS (Arabic Linguistic & Pragmatic Suite), a native, expert-curated diagnostic...
BankMathBench: A Benchmark for Numerical Reasoning in Banking Scenarios
arXiv:2602.17072v1 Announce Type: new Abstract: Large language models (LLMs)-based chatbots are increasingly being adopted in the financial domain, particularly in digital banking, to handle customer inquiries about products such as deposits, savings, and loans. However, these models still exhibit low...
Entropy-Based Data Selection for Language Models
arXiv:2602.17465v1 Announce Type: new Abstract: Modern language models (LMs) increasingly require two critical resources: computational resources and data resources. Data selection techniques can effectively reduce the amount of training data required for fine-tuning LMs. However, their effectiveness is closely related...
PEACE 2.0: Grounded Explanations and Counter-Speech for Combating Hate Expressions
arXiv:2602.17467v1 Announce Type: new Abstract: The increasing volume of hate speech on online platforms poses significant societal challenges. While the Natural Language Processing community has developed effective methods to automatically detect the presence of hate speech, responses to it, called...
Auditing Reciprocal Sentiment Alignment: Inversion Risk, Dialect Representation and Intent Misalignment in Transformers
arXiv:2602.17469v1 Announce Type: new Abstract: The core theme of bidirectional alignment is ensuring that AI systems accurately understand human intent and that humans can trust AI behavior. However, this loop fractures significantly across language barriers. Our research addresses Cross-Lingual Sentiment...
Modeling Distinct Human Interaction in Web Agents
arXiv:2602.17588v1 Announce Type: new Abstract: Despite rapid progress in autonomous web agents, human involvement remains essential for shaping preferences and correcting agent behavior as tasks unfold. However, current agentic systems lack a principled understanding of when and why humans intervene,...
Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking
arXiv:2602.16746v1 Announce Type: new Abstract: Grokking -- the delayed transition from memorization to generalization in small algorithmic tasks -- remains poorly understood. We present a geometric analysis of optimization dynamics in transformers trained on modular arithmetic. PCA of attention weight...
Efficient Tail-Aware Generative Optimization via Flow Model Fine-Tuning
arXiv:2602.16796v1 Announce Type: new Abstract: Fine-tuning pre-trained diffusion and flow models to optimize downstream utilities is central to real-world deployment. Existing entropy-regularized methods primarily maximize expected reward, providing no mechanism to shape tail behavior. However, tail control is often essential:...