
Arbitration

LOW · Academic · International

An Empirical Study of Many-Shot In-Context Learning for Machine Translation of Low-Resource Languages

arXiv:2604.02596v1 Announce Type: new Abstract: In-context learning (ICL) allows large language models (LLMs) to adapt to new tasks from a few examples, making it promising for languages underrepresented in pre-training. Recent work on many-shot ICL suggests that modern LLMs can...
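
The many-shot recipe the abstract studies amounts to packing a large number of translation pairs into a single prompt. A minimal sketch of that prompt assembly (language names, labels, and formatting are illustrative, not the paper's):

```python
def build_many_shot_prompt(pairs, source_text, src="English", tgt="Cherokee"):
    """Assemble a many-shot translation prompt from (source, target) pairs."""
    lines = [f"Translate from {src} to {tgt}."]
    for s, t in pairs:
        lines.append(f"{src}: {s}\n{tgt}: {t}")
    lines.append(f"{src}: {source_text}\n{tgt}:")
    return "\n\n".join(lines)

# Scale from few- to many-shot simply by growing `pairs`.
prompt = build_many_shot_prompt([("hello", "osiyo")] * 100, "thank you")
```

With long-context models the interesting question is how quality scales as `pairs` grows into the hundreds or thousands for a low-resource language.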

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models

arXiv:2604.02967v1 Announce Type: new Abstract: Recent Large Reasoning Models (LRMs) like DeepSeek-R1 have demonstrated remarkable success in complex reasoning tasks, exhibiting human-like patterns in exploring multiple alternative solutions. Upon closer inspection, however, we uncover a surprising phenomenon: The First is...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Failing to Falsify: Evaluating and Mitigating Confirmation Bias in Language Models

arXiv:2604.02485v1 Announce Type: new Abstract: Confirmation bias, the tendency to seek evidence that supports rather than challenges one's belief, hinders one's reasoning ability. We examine whether large language models (LLMs) exhibit confirmation bias by adapting the rule-discovery study from human...
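
Rule-discovery studies of this kind (in the spirit of Wason's 2-4-6 task) can be scored by how often the tester proposes probes that would falsify, rather than confirm, its current hypothesis. A toy sketch with a hypothetical rule and probes, not the paper's setup:

```python
def disconfirmation_rate(hypothesis, probes):
    """Fraction of probes the current hypothesis rejects; persistently
    low values indicate confirmation-biased testing."""
    return sum(not hypothesis(p) for p in probes) / len(probes)

# Hypothetical working hypothesis: "three distinct increasing numbers".
hypothesis = lambda t: list(t) == sorted(t) and len(set(t)) == 3
probes = [(2, 4, 6), (1, 2, 3), (6, 4, 2)]   # only the last one disconfirms
rate = disconfirmation_rate(hypothesis, probes)
```

A falsification-seeking tester drives this rate up; a confirmation-biased one keeps proposing triples its hypothesis already accepts.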

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Querying Structured Data Through Natural Language Using Language Models

arXiv:2604.03057v1 Announce Type: new Abstract: This paper presents an open-source methodology for allowing users to query structured, non-textual datasets through natural language. Unlike Retrieval-Augmented Generation (RAG), which struggles with numerical and highly structured information, our approach trains...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Council Mode: Mitigating Hallucination and Bias in LLMs via Multi-Agent Consensus

arXiv:2604.02923v1 Announce Type: new Abstract: Large Language Models (LLMs), particularly those employing Mixture-of-Experts (MoE) architectures, have achieved remarkable capabilities across diverse natural language processing tasks. However, these models frequently suffer from hallucinations -- generating plausible but factually incorrect content --...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Breakdowns in Conversational AI: Interactional Failures in Emotionally and Ethically Sensitive Contexts

arXiv:2604.02713v1 Announce Type: new Abstract: Conversational AI is increasingly deployed in emotionally charged and ethically sensitive interactions. Previous research has primarily concentrated on emotional benchmarks or static safety checks, overlooking how alignment unfolds in evolving conversation. We explore the research...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

BAS: A Decision-Theoretic Approach to Evaluating Large Language Model Confidence

arXiv:2604.03216v1 Announce Type: new Abstract: Large language models (LLMs) often produce confident but incorrect answers in settings where abstention would be safer. Standard evaluation protocols, however, require a response and do not account for how confidence should guide decisions under...
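
Decision-theoretic evaluation of confidence typically asks whether answering maximizes expected utility relative to abstaining. A generic sketch of that rule (the payoffs are illustrative, and this is not a reproduction of the paper's BAS score):

```python
def should_answer(p_correct, r_correct=1.0, r_wrong=-4.0, r_abstain=0.0):
    """Answer only when the expected utility of answering beats abstention."""
    eu_answer = p_correct * r_correct + (1 - p_correct) * r_wrong
    return eu_answer > r_abstain

# With a 4:1 wrong-answer penalty the break-even confidence is 0.8.
print(should_answer(0.9), should_answer(0.5))  # True False
```

Standard accuracy-only protocols implicitly set `r_wrong = 0`, which is exactly why they reward confident guessing over safe abstention.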

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Pragmatics Meets Culture: Culturally-adapted Artwork Description Generation and Evaluation

arXiv:2604.02557v1 Announce Type: new Abstract: Language models are known to exhibit various forms of cultural bias in decision-making tasks, yet much less is known about their degree of cultural familiarity in open-ended text generation tasks. In this paper, we introduce...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Multiple-Debias: A Full-process Debiasing Method for Multilingual Pre-trained Language Models

arXiv:2604.02772v1 Announce Type: new Abstract: Multilingual Pre-trained Language Models (MPLMs) have become essential tools for natural language processing. However, they often exhibit biases related to sensitive attributes such as gender, race, and religion. In this paper, we introduce a comprehensive...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Valence-Arousal Subspace in LLMs: Circular Emotion Geometry and Multi-Behavioral Control

arXiv:2604.03147v1 Announce Type: new Abstract: We present a method to identify a valence-arousal (VA) subspace within large language model representations. From 211k emotion-labeled texts, we derive emotion steering vectors, then learn VA axes as linear combinations of their top PCA...
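
The pipeline the abstract outlines — per-emotion steering vectors, PCA over them, then valence/arousal axes as linear combinations of the top components — can be mocked end to end in a few lines. Everything below is synthetic (dimensions, steering vectors, and combination weights are placeholders, not the paper's fitted values):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_emotions = 64, 8
# Synthetic stand-ins for per-emotion steering vectors (e.g. the mean
# activation over texts with that emotion, minus the global mean).
steering = rng.normal(size=(n_emotions, d))

# Top principal components of the steering vectors.
centered = steering - steering.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
top_pcs = vt[:4]                       # (4, d) PCA basis

# Valence/arousal axes as linear combinations of those components;
# random weights stand in for the learned ones.
w = rng.normal(size=(2, 4))
va_axes = w @ top_pcs                  # (2, d)

# Project a hidden state onto the VA subspace.
h = rng.normal(size=d)
va_coords = va_axes @ h                # (valence, arousal) coordinates
```

Steering then amounts to adding multiples of the two axis vectors back into the hidden state.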

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy

arXiv:2604.02423v1 Announce Type: new Abstract: Large language models exhibit sycophancy: the tendency to shift outputs toward user-expressed stances, regardless of correctness or consistency. While prior work has studied this issue and its impacts, rigorous computational linguistic metrics are needed to...
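
A crude baseline for the kind of metric the paper argues for is the flip rate: how often the model's answer changes once the user states a contrary stance. This is a stand-in for illustration, not the SWAY measure itself:

```python
def sycophancy_flip_rate(baseline, after_pushback):
    """Fraction of items where the answer changes once the user pushes
    back with a contrary stance (a crude proxy, not the SWAY metric)."""
    assert len(baseline) == len(after_pushback)
    flips = sum(b != a for b, a in zip(baseline, after_pushback))
    return flips / len(baseline)

rate = sycophancy_flip_rate(["A", "B", "A"], ["A", "A", "B"])
```

The abstract's point is that flip counting alone ignores correctness and consistency, which is what finer-grained linguistic metrics are meant to capture.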

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

AXELRAM: Quantize Once, Never Dequantize

arXiv:2604.02638v1 Announce Type: new Abstract: We propose AXELRAM, a smart SRAM macro architecture that computes attention scores directly from quantized KV cache indices without dequantization. The key enabler is a design-time fixed codebook: orthogonal-transform-based quantization concentrates each coordinate's distribution to...
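
The "no dequantization" trick works because attention scores are linear in K: with a fixed per-coordinate codebook, q·k collapses into table lookups over the stored indices. A NumPy sketch with an illustrative codebook (the paper's orthogonal-transform-based codebook and SRAM macro are not modeled here):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_tokens, n_codes = 8, 16, 16
codebook = np.linspace(-1, 1, n_codes)                 # illustrative fixed codebook
k_idx = rng.integers(0, n_codes, size=(n_tokens, d))   # quantized K cache: indices only
q = rng.normal(size=d)

# Precompute q_j * c for every code c, once per query ...
lut = np.outer(q, codebook)                            # (d, n_codes)
# ... then each attention score is pure table lookups plus adds,
# without ever materializing a dequantized K matrix.
scores_lut = lut[np.arange(d), k_idx].sum(axis=1)      # (n_tokens,)

# Reference: explicit dequantize-then-dot gives identical scores.
scores_ref = codebook[k_idx] @ q
assert np.allclose(scores_lut, scores_ref)
```

The lookup table costs O(d · n_codes) per query, versus O(d · n_tokens) multiply-accumulates saved across the whole cache.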

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Haiku to Opus in Just 10 bits: LLMs Unlock Massive Compression Gains

arXiv:2604.02343v1 Announce Type: cross Abstract: We study the compression of LLM-generated text across lossless and lossy regimes, characterizing a compression-compute frontier where more compression is possible at the cost of more compute. For lossless compression, domain-adapted LoRA adapters can improve...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Revealing the Learning Dynamics of Long-Context Continual Pre-training

arXiv:2604.02650v1 Announce Type: new Abstract: Existing studies on Long-Context Continual Pre-training (LCCP) mainly focus on small-scale models and limited data regimes (tens of billions of tokens). We argue that directly migrating these small-scale settings to industrial-grade models risks insufficient adaptation...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

FTimeXer: Frequency-aware Time-series Transformer with Exogenous variables for Robust Carbon Footprint Forecasting

arXiv:2604.02347v1 Announce Type: new Abstract: Accurate and up-to-date forecasting of the power grid's carbon footprint is crucial for effective product carbon footprint (PCF) accounting and informed decarbonization decisions. However, the carbon intensity of the grid exhibits high non-stationarity, and existing...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

DeltaLogic: Minimal Premise Edits Reveal Belief-Revision Failures in Logical Reasoning Models

arXiv:2604.02733v1 Announce Type: new Abstract: Reasoning benchmarks typically evaluate whether a model derives the correct answer from a fixed premise set, but they under-measure a closely related capability that matters in dynamic environments: belief revision under minimal evidence change. We...

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Fast NF4 Dequantization Kernels for Large Language Model Inference

arXiv:2604.02556v1 Announce Type: new Abstract: Large language models (LLMs) have grown beyond the memory capacity of single GPU devices, necessitating quantization techniques for practical deployment. While NF4 (4-bit NormalFloat) quantization enables 4$\times$ memory reduction, inference on current NVIDIA GPUs (e.g.,...
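
NF4 dequantization itself is just a nibble unpack, a 16-entry codebook lookup, and a per-block rescale by the stored absmax; the problem the paper targets is doing this fast on GPU. A pure-NumPy reference of the arithmetic — note the codebook below is evenly spaced for illustration only, whereas real NF4 uses quantiles of a standard normal distribution:

```python
import numpy as np

# Illustrative 16-entry codebook (NOT the real NF4 table, which holds
# normal-distribution quantiles in [-1, 1]).
CODE = np.linspace(-1.0, 1.0, 16)

def dequant_nf4(packed, absmax, block=64):
    """Unpack two 4-bit indices per byte, look them up in the codebook,
    and rescale each block by its absmax (reference code, not a kernel)."""
    hi = packed >> 4
    lo = packed & 0x0F
    idx = np.stack([hi, lo], axis=1).reshape(-1)   # interleave nibbles
    vals = CODE[idx].reshape(-1, block)
    return (vals * absmax[:, None]).reshape(-1)

packed = np.array([0x0F, 0xF0], dtype=np.uint8)    # indices 0, 15, 15, 0
out = dequant_nf4(packed, np.array([2.0]), block=4)
print(out)  # [-2.  2.  2. -2.]
```

A fast GPU kernel fuses exactly these three steps so the weights never round-trip through global memory in full precision.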

1 min read · 1 week, 4 days ago · bit

LOW · News · International

Can orbital data centers help justify a massive valuation for SpaceX?

On the latest episode of TechCrunch’s Equity podcast, we debated Elon Musk's vision for data centers in space.

1 min read · 1 week, 4 days ago · bit

LOW · Academic · International

Malliavin Calculus for Counterfactual Gradient Estimation in Adaptive Inverse Reinforcement Learning

arXiv:2604.01345v1 Announce Type: new Abstract: Inverse reinforcement learning (IRL) recovers the loss function of a forward learner from its observed responses; adaptive IRL aims to reconstruct that loss function by passively observing the learner's gradients as it...

1 min read · 2 weeks ago · bit

LOW · Academic · International

Improving Latent Generalization Using Test-time Compute

arXiv:2604.01430v1 Announce Type: new Abstract: Language Models (LMs) exhibit two distinct mechanisms for knowledge acquisition: in-weights learning (i.e., encoding information within the model weights) and in-context learning (ICL). Although these two modes offer complementary strengths, in-weights learning frequently struggles to...

1 min read · 2 weeks ago · bit

LOW · Academic · International

Dual-Attention Based 3D Channel Estimation

arXiv:2604.01769v1 Announce Type: new Abstract: For multi-input and multi-output (MIMO) channels, the optimal channel estimation (CE) based on linear minimum mean square error (LMMSE) requires three-dimensional (3D) filtering. However, the complexity is often prohibitive due to large matrix dimensions. Suboptimal...

1 min read · 2 weeks ago · bit

LOW · Academic · International

Finding and Reactivating Post-Trained LLMs' Hidden Safety Mechanisms

arXiv:2604.00012v1 Announce Type: cross Abstract: Despite the impressive performance of general-purpose large language models (LLMs), they often require fine-tuning or post-training to excel at specific tasks. For instance, large reasoning models (LRMs), such as the DeepSeek-R1 series, demonstrate strong reasoning...

1 min read · 2 weeks ago · bit

LOW · Academic · International

Quantifying Gender Bias in Large Language Models: When ChatGPT Becomes a Hiring Manager

arXiv:2604.00011v1 Announce Type: cross Abstract: The growing prominence of large language models (LLMs) in daily life has heightened concerns that LLMs exhibit many of the same gender-related biases as their creators. In the context of hiring decisions, we quantify the...
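
Quantifying this kind of bias usually starts from group-wise positive-decision rates over otherwise-identical resumes. A minimal demographic-parity-style sketch (not necessarily the paper's metric):

```python
from collections import defaultdict

def selection_rates(decisions, groups):
    """Positive-decision rate per group, for matched candidate profiles
    that differ only in a gender-signaling attribute."""
    hits, totals = defaultdict(int), defaultdict(int)
    for d, g in zip(decisions, groups):
        totals[g] += 1
        hits[g] += int(d)
    return {g: hits[g] / totals[g] for g in totals}

rates = selection_rates([1, 0, 1, 1], ["f", "f", "m", "m"])
```

The gap between the two rates, across profiles that are identical except for the gendered signal, is the headline bias number.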

1 min read · 2 weeks ago · bit

LOW · Academic · International

LLM Essay Scoring Under Holistic and Analytic Rubrics: Prompt Effects and Bias

arXiv:2604.00259v1 Announce Type: new Abstract: Despite growing interest in using Large Language Models (LLMs) for educational assessment, it remains unclear how closely they align with human scoring. We present a systematic evaluation of instruction-tuned LLMs across three open essay-scoring datasets...

1 min read · 2 weeks ago · adr

LOW · Academic · International

Do Language Models Know When They'll Refuse? Probing Introspective Awareness of Safety Boundaries

arXiv:2604.00228v1 Announce Type: new Abstract: Large language models are trained to refuse harmful requests, but can they accurately predict when they will refuse before responding? We investigate this question through a systematic study where models first predict their refusal behavior,...

1 min read · 2 weeks ago · bit

LOW · Academic · International

Logarithmic Scores, Power-Law Discoveries: Disentangling Measurement from Coverage in Agent-Based Evaluation

arXiv:2604.00477v1 Announce Type: new Abstract: LLM-based agent judges are an emerging approach to evaluating conversational AI, yet a fundamental uncertainty remains: can we trust their assessments, and if so, how many are needed? Through 960 sessions with two model pairs...

1 min read · 2 weeks ago · bit

LOW · Academic · International

Training In-Context and In-Weights Mixtures Via Contrastive Context Sampling

arXiv:2604.01601v1 Announce Type: new Abstract: We investigate training strategies that co-develop in-context learning (ICL) and in-weights learning (IWL), and the ability to switch between them based on context relevance. Although current LLMs exhibit both modes, standard task-specific fine-tuning often erodes...

1 min read · 2 weeks ago · bit

LOW · Academic · International

FourierMoE: Fourier Mixture-of-Experts Adaptation of Large Language Models

arXiv:2604.01762v1 Announce Type: new Abstract: Parameter-efficient fine-tuning (PEFT) has emerged as a crucial paradigm for adapting large language models (LLMs) under constrained computational budgets. However, standard PEFT methods often struggle in multi-task fine-tuning settings, where diverse optimization objectives induce task...

1 min read · 2 weeks ago · bit

LOW · Academic · International

Locally Confident, Globally Stuck: The Quality-Exploration Dilemma in Diffusion Language Models

arXiv:2604.00375v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) theoretically permit token decoding in arbitrary order, a flexibility that could enable richer exploration of reasoning paths than autoregressive (AR) LLMs. In practice, however, random-order decoding often hurts generation quality....
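
"Locally confident" decoding is the usual remedy for random-order degradation: at each step, unmask the position where the denoiser is most confident. A toy loop with random stand-in distributions in place of a real denoiser:

```python
import numpy as np

rng = np.random.default_rng(2)
seq_len, vocab = 6, 10
mask = np.ones(seq_len, dtype=bool)       # all positions start masked
decoded = np.full(seq_len, -1)

while mask.any():
    # Stand-in for the denoiser's per-position token distributions.
    probs = rng.dirichlet(np.ones(vocab), size=seq_len)
    conf = probs.max(axis=1)
    conf[~mask] = -np.inf                 # only masked positions compete
    pos = int(conf.argmax())              # greedy: most confident first
    decoded[pos] = int(probs[pos].argmax())
    mask[pos] = False
```

This greedy order tends to improve local quality, which is exactly the dilemma in the title: committing to the most confident token first narrows the set of reasoning paths the model ever explores.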

1 min read · 2 weeks ago · bit

LOW · Academic · International

Expert-Choice Routing Enables Adaptive Computation in Diffusion Language Models

arXiv:2604.01622v1 Announce Type: new Abstract: Diffusion language models (DLMs) enable parallel, non-autoregressive text generation, yet existing DLM mixture-of-experts (MoE) models inherit token-choice (TC) routing from autoregressive systems, leading to load imbalance and rigid computation allocation. We show that expert-choice (EC)...
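
The token-choice vs expert-choice contrast fits in a few lines: under EC routing each expert takes a fixed-size top-k of tokens, so load balance holds by construction. A NumPy sketch with random router scores:

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, n_experts, capacity = 12, 4, 3
logits = rng.normal(size=(n_tokens, n_experts))       # router affinities

# Expert-choice: each expert selects its top-`capacity` tokens, so every
# expert processes exactly `capacity` tokens -- balanced by construction.
ec_assign = np.argsort(-logits, axis=0)[:capacity]    # (capacity, n_experts)
ec_loads = np.full(n_experts, capacity)

# Token-choice (top-1) for contrast: each token picks its best expert,
# so per-expert loads can be arbitrarily skewed.
tc_loads = np.bincount(logits.argmax(axis=1), minlength=n_experts)
```

The flip side, which makes EC a natural fit for diffusion LMs, is that some tokens may be picked by several experts and others by none, i.e. computation is allocated adaptively per token.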

1 min read · 2 weeks ago · bit
Page 3 of 17

Impact Distribution

Critical 0 · High 0 · Medium 3 · Low 912