Arbitration

LOW Academic International

SemBench: A Universal Semantic Framework for LLM Evaluation

arXiv:2603.11687v1 Announce Type: new Abstract: Recent progress in Natural Language Processing (NLP) has been driven by the emergence of Large Language Models (LLMs), which exhibit remarkable generative and reasoning capabilities. However, despite their success, evaluating the true semantic understanding of...

1 min 1 month ago

LOW Academic International

On the Robustness of Langevin Dynamics to Score Function Error

arXiv:2603.11319v1 Announce Type: new Abstract: We consider the robustness of score-based generative modeling to errors in the estimate of the score function. In particular, we show that Langevin dynamics is not robust to the L^2 errors (more generally L^p errors)...

1 min 1 month ago

LOW Academic International

Multilingual Financial Fraud Detection Using Machine Learning and Transformer Models: A Bangla-English Study

arXiv:2603.11358v1 Announce Type: new Abstract: Financial fraud detection has emerged as a critical research challenge amid the rapid expansion of digital financial platforms. Although machine learning approaches have demonstrated strong performance in identifying fraudulent activities, most existing research focuses exclusively...

1 min 1 month ago

LOW Academic International

Explainable LLM Unlearning Through Reasoning

arXiv:2603.09980v1 Announce Type: cross Abstract: LLM unlearning is essential for mitigating safety, copyright, and privacy concerns in pre-trained large language models (LLMs). Compared to preference alignment, it offers a more explicit way by removing undesirable knowledge characterized by specific unlearning...

1 min 1 month ago

LOW Academic International

A Two-Stage Architecture for NDA Analysis: LLM-based Segmentation and Transformer-based Clause Classification

arXiv:2603.09990v1 Announce Type: cross Abstract: In business-to-business relations, it is common to establish NonDisclosure Agreements (NDAs). However, these documents exhibit significant variation in format, structure, and writing style, making manual analysis slow and error-prone. We propose an architecture based on...

1 min 1 month ago

LOW Academic International

The Dunning-Kruger Effect in Large Language Models: An Empirical Study of Confidence Calibration

arXiv:2603.09985v1 Announce Type: cross Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across diverse tasks, yet their ability to accurately assess their own confidence remains poorly understood. We present an empirical study investigating whether LLMs exhibit patterns reminiscent of...

1 min 1 month ago

LOW Academic International

One Model, Many Skills: Parameter-Efficient Fine-Tuning for Multitask Code Analysis

arXiv:2603.09978v1 Announce Type: cross Abstract: Large language models have recently surpassed specialized systems on code generation, yet their effectiveness on other code-analysis tasks remains less clear. At the same time, multi-task learning offers a way to unify diverse objectives within...

1 min 1 month ago

LOW Academic International

Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning

arXiv:2603.10588v1 Announce Type: new Abstract: Reinforcement learning with verifiable rewards (RLVR) has achieved remarkable success in logical reasoning tasks, yet whether large language model (LLM) alignment requires fundamentally different approaches remains unclear. Given the apparent tolerance for multiple valid responses...

1 min 1 month ago

LOW Academic International

CUAAudit: Meta-Evaluation of Vision-Language Models as Auditors of Autonomous Computer-Use Agents

arXiv:2603.10577v1 Announce Type: new Abstract: Computer-Use Agents (CUAs) are emerging as a new paradigm in human-computer interaction, enabling autonomous execution of tasks in desktop environments by perceiving high-level natural-language instructions. As such agents become increasingly capable and are deployed across...

1 min 1 month ago

LOW Academic International

Beyond the Prompt in Large Language Models: Comprehension, In-Context Learning, and Chain-of-Thought

arXiv:2603.10000v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency across diverse tasks, exhibiting emergent properties such as semantic prompt comprehension, In-Context Learning (ICL), and Chain-of-Thought (CoT) reasoning. Despite their empirical success, the theoretical mechanisms driving these...

1 min 1 month ago

LOW Academic International

Mitigating Translationese Bias in Multilingual LLM-as-a-Judge via Disentangled Information Bottleneck

arXiv:2603.10351v1 Announce Type: new Abstract: Large language models (LLMs) have become a standard for multilingual evaluation, yet they exhibit a severe systematic translationese bias. In this paper, translationese bias is characterized as LLMs systematically favoring machine-translated text over human-authored references,...

1 min 1 month ago

LOW Academic International

Gated Adaptation for Continual Learning in Human Activity Recognition

arXiv:2603.10046v1 Announce Type: new Abstract: Wearable sensors in Internet of Things (IoT) ecosystems increasingly support applications such as remote health monitoring, elderly care, and smart home automation, all of which rely on robust human activity recognition (HAR). Continual learning systems...

1 min 1 month ago

LOW Academic International

Hardware Efficient Approximate Convolution with Tunable Error Tolerance for CNNs

arXiv:2603.10100v1 Announce Type: new Abstract: Modern CNNs' high computational demands hinder edge deployment, as traditional "hard" sparsity (skipping mathematical zeros) loses effectiveness in deep layers or with smooth activations like Tanh. We propose a "soft sparsity" paradigm using a hardware...

1 min 1 month ago

LOW Academic International

Lost in the Middle at Birth: An Exact Theory of Transformer Position Bias

arXiv:2603.10123v1 Announce Type: new Abstract: The "Lost in the Middle" phenomenon -- a U-shaped performance curve where LLMs retrieve well from the beginning and end of a context but fail in the middle -- is widely attributed to learned Softmax...

1 min 1 month ago

LOW Academic International

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

arXiv:2603.09095v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) can process text presented as images, yet they often perform worse than when the same content is provided as textual tokens. We systematically diagnose this "modality gap" by evaluating seven...

1 min 1 month, 1 week ago

LOW Academic International

Bioalignment: Measuring and Improving LLM Disposition Toward Biological Systems for AI Safety

arXiv:2603.09154v1 Announce Type: new Abstract: Large language models (LLMs) trained on internet-scale corpora can exhibit systematic biases that increase the probability of unwanted behavior. In this study, we examined potential biases towards synthetic vs. biological technological solutions across four domains...

1 min 1 month, 1 week ago

LOW Academic International

One Language, Two Scripts: Probing Script-Invariance in LLM Concept Representations

arXiv:2603.08869v1 Announce Type: new Abstract: Do the features learned by Sparse Autoencoders (SAEs) represent abstract meaning, or are they tied to how text is written? We investigate this question using Serbian digraphia as a controlled testbed: Serbian is written interchangeably...

1 min 1 month, 1 week ago

LOW Academic International

Logos: An evolvable reasoning engine for rational molecular design

arXiv:2603.09268v1 Announce Type: new Abstract: The discovery and design of functional molecules remain central challenges across chemistry, biology, and materials science. While recent advances in machine learning have accelerated molecular property prediction and candidate generation, existing models tend to excel either...

1 min 1 month, 1 week ago

LOW Academic International

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

arXiv:2603.09909v1 Announce Type: new Abstract: While Multi-Agent Systems (MAS) show potential for complex clinical decision support, the field remains hindered by architectural fragmentation and the lack of standardized multimodal integration. Current medical MAS research suffers from non-uniform data ingestion pipelines,...

1 min 1 month, 1 week ago

LOW Academic International

Robust Regularized Policy Iteration under Transition Uncertainty

arXiv:2603.09344v1 Announce Type: new Abstract: Offline reinforcement learning (RL) enables data-efficient and safe policy learning without online exploration, but its performance often degrades under distribution shift. The learned policy may visit out-of-distribution state-action pairs where value estimates and learned dynamics...

1 min 1 month, 1 week ago

LOW Academic International

MEMO: Memory-Augmented Model Context Optimization for Robust Multi-Turn Multi-Agent LLM Games

arXiv:2603.09022v1 Announce Type: new Abstract: Multi-turn, multi-agent LLM game evaluations often exhibit substantial run-to-run variance. In long-horizon interactions, small early deviations compound across turns and are amplified by multi-agent coupling. This biases win rate estimates and makes rankings unreliable across...

1 min 1 month, 1 week ago

LOW Academic International

Modelling the Diachronic Emergence of Phoneme Frequency Distributions

arXiv:2603.09503v1 Announce Type: new Abstract: Phoneme frequency distributions exhibit robust statistical regularities across languages, including exponential-tailed rank-frequency patterns and a negative relationship between phonemic inventory size and the relative entropy of the distribution. The origin of these patterns remains largely...

1 min 1 month, 1 week ago

LOW Academic International

ESAinsTOD: A Unified End-to-End Schema-Aware Instruction-Tuning Framework for Task-Oriented Dialog Modeling

arXiv:2603.09691v1 Announce Type: new Abstract: Existing end-to-end modeling methods for modular task-oriented dialog systems are typically tailored to specific datasets, making it challenging to adapt to new dialog scenarios. In this work, we propose ESAinsTOD, a unified End-to-end Schema-Aware Instruction-tuning...

1 min 1 month, 1 week ago

LOW Academic International

Benchmarking Political Persuasion Risks Across Frontier Large Language Models

arXiv:2603.09884v1 Announce Type: new Abstract: Concerns persist regarding the capacity of Large Language Models (LLMs) to sway political views. Although prior research has claimed that LLMs are not more persuasive than standard political campaign practices, the recent rise of frontier...

1 min 1 month, 1 week ago

LOW Academic International

Expressivity-Efficiency Tradeoffs for Hybrid Sequence Models

arXiv:2603.08859v1 Announce Type: new Abstract: Hybrid sequence models--combining Transformer and state-space model layers--seek to gain the expressive versatility of attention as well as the computational efficiency of state-space model layers. Despite burgeoning interest in hybrid models, we lack a basic...

1 min 1 month, 1 week ago

LOW Academic International

Quantifying Memorization and Privacy Risks in Genomic Language Models

arXiv:2603.08913v1 Announce Type: new Abstract: Genomic language models (GLMs) have emerged as powerful tools for learning representations of DNA sequences, enabling advances in variant prediction, regulatory element identification, and cross-task transfer learning. However, as these models are increasingly trained or...

1 min 1 month, 1 week ago

LOW Academic International

Two Teachers Better Than One: Hardware-Physics Co-Guided Distributed Scientific Machine Learning

arXiv:2603.09032v1 Announce Type: new Abstract: Scientific machine learning (SciML) is increasingly applied to in-field processing, controlling, and monitoring; however, wide-area sensing, real-time demands, and strict energy and reliability constraints make centralized SciML implementation impractical. Most SciML models assume raw data...

1 min 1 month, 1 week ago

LOW Academic International

Overcoming Valid Action Suppression in Unmasked Policy Gradient Algorithms

arXiv:2603.09090v1 Announce Type: new Abstract: In reinforcement learning environments with state-dependent action validity, action masking consistently outperforms penalty-based handling of invalid actions, yet existing theory only shows that masking preserves the policy gradient theorem. We identify a distinct failure mode...

1 min 1 month, 1 week ago

LOW Academic International

Better Bounds for the Distributed Experts Problem

arXiv:2603.09168v1 Announce Type: new Abstract: In this paper, we study the distributed experts problem, where $n$ experts are distributed across $s$ servers for $T$ timesteps. The loss of each expert at each time $t$ is the $\ell_p$ norm of the...

1 min 1 month, 1 week ago

LOW Academic International

TA-GGAD: Testing-time Adaptive Graph Model for Generalist Graph Anomaly Detection

arXiv:2603.09349v1 Announce Type: new Abstract: A significant number of anomalous nodes in the real world, such as fake news, noncompliant users, malicious transactions, and malicious posts, severely compromises the health of the graph data ecosystem and urgently requires effective identification...

1 min 1 month, 1 week ago
