Arbitration

LOW Academic International

FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models

arXiv:2604.02967v1 Announce Type: new Abstract: Recent Large Reasoning Models (LRMs) like DeepSeek-R1 have demonstrated remarkable success in complex reasoning tasks, exhibiting human-like patterns in exploring multiple alternative solutions. Upon closer inspection, however, we uncover a surprising phenomenon: The First is...

1 min 1 week, 4 days ago

bit

LOW Academic International

Internalized Reasoning for Long-Context Visual Document Understanding

arXiv:2604.02371v1 Announce Type: cross Abstract: Visual long-document understanding is critical for enterprise, legal, and scientific applications, yet the best performing open recipes have not explored reasoning, a capability which has driven leaps in math and code performance. We introduce a...

1 min 1 week, 4 days ago

bit

LOW Academic International

SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy

arXiv:2604.02423v1 Announce Type: new Abstract: Large language models exhibit sycophancy: the tendency to shift outputs toward user-expressed stances, regardless of correctness or consistency. While prior work has studied this issue and its impacts, rigorous computational linguistic metrics are needed to...

1 min 1 week, 4 days ago

bit

LOW Academic United States

Social Meaning in Large Language Models: Structure, Magnitude, and Pragmatic Prompting

arXiv:2604.02512v1 Announce Type: new Abstract: Large language models (LLMs) increasingly exhibit human-like patterns of pragmatic and social reasoning. This paper addresses two related questions: do LLMs approximate human social meaning not only qualitatively but also quantitatively, and can prompting strategies...

1 min 1 week, 4 days ago

bit

LOW Academic International

Pragmatics Meets Culture: Culturally-adapted Artwork Description Generation and Evaluation

arXiv:2604.02557v1 Announce Type: new Abstract: Language models are known to exhibit various forms of cultural bias in decision-making tasks, yet much less is known about their degree of cultural familiarity in open-ended text generation tasks. In this paper, we introduce...

1 min 1 week, 4 days ago

bit

LOW Academic International

Council Mode: Mitigating Hallucination and Bias in LLMs via Multi-Agent Consensus

arXiv:2604.02923v1 Announce Type: new Abstract: Large Language Models (LLMs), particularly those employing Mixture-of-Experts (MoE) architectures, have achieved remarkable capabilities across diverse natural language processing tasks. However, these models frequently suffer from hallucinations -- generating plausible but factually incorrect content --...

1 min 1 week, 4 days ago

bit

LOW Academic International

AgentHazard: A Benchmark for Evaluating Harmful Behavior in Computer-Use Agents

arXiv:2604.02947v1 Announce Type: new Abstract: Computer-use agents extend language models from text generation to persistent action over tools, files, and execution environments. Unlike chat systems, they maintain state across interactions and translate intermediate outputs into concrete actions. This creates a...

1 min 1 week, 4 days ago

bit

LOW Academic International

Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts

arXiv:2604.02713v1 Announce Type: new Abstract: Conversational AI is increasingly deployed in emotionally charged and ethically sensitive interactions. Previous research has primarily concentrated on emotional benchmarks or static safety checks, overlooking how alignment unfolds in evolving conversation. We explore the research...

1 min 1 week, 4 days ago

bit

LOW Academic United States

Ambig-IaC: Multi-level Disambiguation for Interactive Cloud Infrastructure-as-Code Synthesis

arXiv:2604.02382v1 Announce Type: cross Abstract: The scale and complexity of modern cloud infrastructure have made Infrastructure-as-Code (IaC) essential for managing deployments. While large language models (LLMs) are increasingly being used to generate IaC configurations from natural language, user requests are...

1 min 1 week, 4 days ago

bit

LOW Academic International

Too Polite to Disagree: Understanding Sycophancy Propagation in Multi-Agent Systems

arXiv:2604.02668v1 Announce Type: new Abstract: Large language models (LLMs) often exhibit sycophancy: agreement with user stance even when it conflicts with the model's opinion. While prior work has mostly studied this in single-agent settings, it remains underexplored in collaborative multi-agent...

1 min 1 week, 4 days ago

bit

LOW Academic International

Fast NF4 Dequantization Kernels for Large Language Model Inference

arXiv:2604.02556v1 Announce Type: new Abstract: Large language models (LLMs) have grown beyond the memory capacity of single GPU devices, necessitating quantization techniques for practical deployment. While NF4 (4-bit NormalFloat) quantization enables 4$\times$ memory reduction, inference on current NVIDIA GPUs (e.g.,...

1 min 1 week, 4 days ago

bit

LOW Academic International

BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence

arXiv:2604.03216v1 Announce Type: new Abstract: Large language models (LLMs) often produce confident but incorrect answers in settings where abstention would be safer. Standard evaluation protocols, however, require a response and do not account for how confidence should guide decisions under...

1 min 1 week, 4 days ago

bit

LOW Academic International

DeltaLogic: Minimal Premise Edits Reveal Belief-Revision Failures in Logical Reasoning Models

arXiv:2604.02733v1 Announce Type: new Abstract: Reasoning benchmarks typically evaluate whether a model derives the correct answer from a fixed premise set, but they under-measure a closely related capability that matters in dynamic environments: belief revision under minimal evidence change. We...

1 min 1 week, 4 days ago

bit

LOW Academic International

AXELRAM: Quantize Once, Never Dequantize

arXiv:2604.02638v1 Announce Type: new Abstract: We propose AXELRAM, a smart SRAM macro architecture that computes attention scores directly from quantized KV cache indices without dequantization. The key enabler is a design-time fixed codebook: orthogonal-transform-based quantization concentrates each coordinate's distribution to...

1 min 1 week, 4 days ago

bit

LOW Academic United States

Mitigating LLM biases toward spurious social contexts using direct preference optimization

arXiv:2604.02585v1 Announce Type: new Abstract: LLMs are increasingly used for high-stakes decision-making, yet their sensitivity to spurious contextual information can introduce harmful biases. This is a critical concern when models are deployed for tasks like evaluating teachers' instructional quality, where...

1 min 1 week, 4 days ago

bit

LOW Academic International

Failing to Falsify: Evaluating and Mitigating Confirmation Bias in Language Models

arXiv:2604.02485v1 Announce Type: new Abstract: Confirmation bias, the tendency to seek evidence that supports rather than challenges one's belief, hinders one's reasoning ability. We examine whether large language models (LLMs) exhibit confirmation bias by adapting the rule-discovery study from human...

1 min 1 week, 4 days ago

bit

LOW Academic European Union

Differentiable Symbolic Planning: A Neural Architecture for Constraint Reasoning with Learned Feasibility

arXiv:2604.02350v1 Announce Type: cross Abstract: Neural networks excel at pattern recognition but struggle with constraint reasoning -- determining whether configurations satisfy logical or physical constraints. We introduce Differentiable Symbolic Planning (DSP), a neural architecture that performs discrete symbolic reasoning while...

1 min 1 week, 4 days ago

bit

LOW Academic International

Principled and Scalable Diversity-Aware Retrieval via Cardinality-Constrained Binary Quadratic Programming

arXiv:2604.02554v1 Announce Type: new Abstract: Diversity-aware retrieval is essential for Retrieval-Augmented Generation (RAG), yet existing methods lack theoretical guarantees and face scalability issues as the number of retrieved passages $k$ increases. We propose a principled formulation of diversity retrieval as...

1 min 1 week, 4 days ago

adr

LOW Academic International

Querying Structured Data Through Natural Language Using Language Models

arXiv:2604.03057v1 Announce Type: new Abstract: This paper presents an open-source methodology for allowing users to query structured, non-textual datasets through natural language. Unlike Retrieval-Augmented Generation (RAG), which struggles with numerical and highly structured information, our approach trains...

1 min 1 week, 4 days ago

bit

LOW Academic International

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

arXiv:2604.03147v1 Announce Type: new Abstract: We present a method to identify a valence-arousal (VA) subspace within large language model representations. From 211k emotion-labeled texts, we derive emotion steering vectors, then learn VA axes as linear combinations of their top PCA...

1 min 1 week, 4 days ago

bit

LOW Academic International

Multiple-Debias: A Full-process Debiasing Method for Multilingual Pre-trained Language Models

arXiv:2604.02772v1 Announce Type: new Abstract: Multilingual Pre-trained Language Models (MPLMs) have become essential tools for natural language processing. However, they often exhibit biases related to sensitive attributes such as gender, race, and religion. In this paper, we introduce a comprehensive...

1 min 1 week, 4 days ago

bit

LOW Academic United States

StoryScope: Investigating idiosyncrasies in AI fiction

arXiv:2604.03136v1 Announce Type: new Abstract: As AI-generated fiction becomes increasingly prevalent, questions of authorship and originality are becoming central to how written work is evaluated. While most existing work in this space focuses on identifying surface-level signatures of AI writing,...

1 min 1 week, 4 days ago

bit

LOW Academic International

FTimeXer: Frequency-aware Time-series Transformer with Exogenous variables for Robust Carbon Footprint Forecasting

arXiv:2604.02347v1 Announce Type: new Abstract: Accurate and up-to-date forecasting of the power grid's carbon footprint is crucial for effective product carbon footprint (PCF) accounting and informed decarbonization decisions. However, the carbon intensity of the grid exhibits high non-stationarity, and existing...

1 min 1 week, 4 days ago

bit

LOW Academic International

An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages

arXiv:2604.02596v1 Announce Type: new Abstract: In-context learning (ICL) allows large language models (LLMs) to adapt to new tasks from a few examples, making it promising for languages underrepresented in pre-training. Recent work on many-shot ICL suggests that modern LLMs can...

1 min 1 week, 4 days ago

bit

LOW Academic United States

On the Geometric Structure of Layer Updates in Deep Language Models

arXiv:2604.02459v1 Announce Type: new Abstract: We study the geometric structure of layer updates in deep language models. Rather than analyzing what information is encoded in intermediate representations, we ask how representations change from one layer to the next. We show...

1 min 1 week, 4 days ago

bit

LOW Academic European Union

Learning the Signature of Memorization in Autoregressive Language Models

arXiv:2604.03199v1 Announce Type: new Abstract: All prior membership inference attacks for fine-tuned language models use hand-crafted heuristics (e.g., loss thresholding, Min-K\%, reference calibration), each bounded by the designer's intuition. We introduce the first transferable learned attack, enabled by the observation...

1 min 1 week, 4 days ago

bit

LOW Academic European Union

Analytic Drift Resister for Non-Exemplar Continual Graph Learning

arXiv:2604.02633v1 Announce Type: new Abstract: Non-Exemplar Continual Graph Learning (NECGL) seeks to eliminate the privacy risks intrinsic to rehearsal-based paradigms by retaining solely class-level prototype representations rather than raw graph examples for mitigating catastrophic forgetting. However, this design choice inevitably...

1 min 1 week, 4 days ago

adr

LOW Academic International

Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

arXiv:2604.02343v1 Announce Type: cross Abstract: We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where more compression is possible at the cost of more compute. For lossless compression, domain-adapted LoRA adapters can improve...

1 min 1 week, 4 days ago

bit

LOW News International

Can orbital data centers help justify a massive valuation for SpaceX?

On the latest episode of TechCrunch’s Equity podcast, we debated Elon Musk's vision for data centers in space.

1 min 1 week, 4 days ago

bit

LOW Academic International

MSA-Thinker: Discrimination-Calibration Reasoning with Hint-Guided Reinforcement Learning for Multimodal Sentiment Analysis

arXiv:2604.00013v1 Announce Type: cross Abstract: Multimodal sentiment analysis aims to understand human emotions by integrating textual, auditory, and visual modalities. Although Multimodal Large Language Models (MLLMs) have achieved state-of-the-art performance via supervised fine-tuning (SFT), their end-to-end "black-box" nature limits interpretability....

1 min 2 weeks ago

bit
