PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding
arXiv:2602.20696v1 Announce Type: new Abstract: Reliable AI systems require large language models (LLMs) to exhibit behaviors aligned with human preferences and values. However, most existing alignment approaches operate at training time and rely on additional high-quality data, incurring significant computational...
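The snippet ends before the decoding rule, but a minimal sketch of polarity-prompt contrastive decoding is easy to give under the standard contrastive-decoding formulation: score next tokens under a positively framed system prompt, contrast against a negatively framed one, and boost the difference. Everything below (the function name, the alpha weight, the Hugging Face-style model(...).logits interface) is an illustrative assumption, not the paper's method:

    import torch

    def polarity_contrastive_logits(model, pos_ids, neg_ids, alpha=1.0):
        # pos_ids / neg_ids: the same query prefixed with opposite-polarity
        # system prompts (hypothetical setup, not from the paper).
        with torch.no_grad():
            pos_logits = model(pos_ids).logits[:, -1, :]  # desired behavior
            neg_logits = model(neg_ids).logits[:, -1, :]  # undesired behavior
        # Boost tokens the positive prompt prefers over the negative one.
        return pos_logits + alpha * (pos_logits - neg_logits)

Setting alpha to 0 recovers ordinary decoding under the positive prompt; larger values push harder away from the negative-prompt distribution.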
Pressure Reveals Character: Behavioural Alignment Evaluation at Depth
arXiv:2602.20813v1 Announce Type: new Abstract: Evaluating alignment in language models requires testing how they behave under realistic pressure, not just what they claim they would do. While alignment failures increasingly cause real-world harm, comprehensive evaluation frameworks with realistic multi-turn scenarios...
Tool Building as a Path to "Superintelligence"
arXiv:2602.21061v1 Announce Type: new Abstract: The Diligent Learner framework suggests LLMs can achieve superintelligence via test-time search, provided a sufficient step-success probability $\gamma$. In this work, we design a benchmark to measure $\gamma$ on logical out-of-distribution inference. We construct a...
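The abstract truncates before defining the benchmark, but the role of $\gamma$ in test-time search admits a standard back-of-the-envelope treatment (ours, not the paper's): if an inference chain requires $d$ steps, each succeeding independently with probability $\gamma$, then

    P(\text{one rollout succeeds}) = \gamma^{d}, \qquad
    P(\text{at least one of } k \text{ rollouts succeeds}) = 1 - (1 - \gamma^{d})^{k}.

For instance, at $\gamma = 0.9$ and depth $d = 10$, a single rollout succeeds with probability $0.9^{10} \approx 0.35$, while $k = 8$ independent rollouts succeed with probability about $0.97$, which is why a sufficiently large step-success probability makes search-based scaling viable.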
Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Under Constrained Sensing
arXiv:2602.20168v1 Announce Type: cross Abstract: Emergency triage decisions are made under severe information constraints, yet most data-driven deterioration models are evaluated using signals unavailable during initial assessment. We present a leakage-aware benchmarking framework for early deterioration prediction that evaluates model...
Autonomous AI and Ownership Rules
arXiv:2602.20169v1 Announce Type: cross Abstract: This Article examines the circumstances in which AI-generated outputs remain linked to their creators and the points at which they lose that connection, whether through accident, deliberate design, or emergent behavior. In cases where AI...
Disentangling Geometry, Performance, and Training in Language Models
arXiv:2602.20433v1 Announce Type: new Abstract: Geometric properties of Transformer weights, particularly the unembedding matrix, have been widely useful in language model interpretability research. Yet, their utility for estimating downstream performance remains unclear. In this work, we systematically investigate the relationship...
The ASIR Courage Model: A Phase-Dynamic Framework for Truth Transitions in Human and AI Systems
arXiv:2602.21745v1 Announce Type: new Abstract: We introduce the ASIR (Awakened Shared Intelligence Relationship) Courage Model, a phase-dynamic framework that formalizes truth-disclosure as a state transition rather than a personality trait. The model characterizes the shift from suppression (S0) to expression...
EPSVec: Efficient and Private Synthetic Data Generation via Dataset Vectors
arXiv:2602.21218v1 Announce Type: cross Abstract: High-quality data is essential for modern machine learning, yet many valuable corpora are sensitive and cannot be freely shared. Synthetic data offers a practical substitute for downstream development, and large language models (LLMs) have emerged...
Latent Context Compilation: Distilling Long Context into Compact Portable Memory
arXiv:2602.21221v1 Announce Type: cross Abstract: Efficient long-context LLM deployment is stalled by a dichotomy between amortized compression, which struggles with out-of-distribution generalization, and Test-Time Training, which incurs prohibitive synthetic data costs and requires modifying model weights, creating stateful parameters that...
AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression
arXiv:2602.21233v1 Announce Type: cross Abstract: This technical report introduces AngelSlim, a comprehensive and versatile toolkit for large model compression developed by the Tencent Hunyuan team. By consolidating cutting-edge algorithms, including quantization, speculative decoding, token pruning, and distillation, AngelSlim provides a...
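The report spans several compression families; as a point of reference only, the simplest of them, symmetric per-tensor int8 weight quantization, fits in a few lines (generic NumPy, not AngelSlim's API):

    import numpy as np

    def quantize_int8(w: np.ndarray):
        # Symmetric per-tensor quantization: w ~= scale * q, q in [-127, 127].
        scale = max(np.abs(w).max() / 127.0, 1e-12)
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
        return q.astype(np.float32) * scale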
Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents
arXiv:2602.22302v1 Announce Type: new Abstract: Traditional software relies on contracts -- APIs, type systems, assertions -- to specify and enforce correct behavior. AI agents, by contrast, operate on prompts and natural language instructions with no formal behavioral specification. This gap...
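The abstract cuts off before the formalism, but the software-contract analogy suggests runtime pre- and postcondition checks wrapped around each agent action. A minimal sketch under that assumption (with_contract, ContractViolation, and read_file are illustrative names, not the paper's specification language):

    class ContractViolation(Exception):
        pass

    def with_contract(pre, post):
        # Enforce a behavioral contract around an agent action at runtime.
        def decorate(action):
            def wrapped(*args, **kwargs):
                if not pre(*args, **kwargs):
                    raise ContractViolation(f"precondition failed: {action.__name__}")
                result = action(*args, **kwargs)
                if not post(result):
                    raise ContractViolation(f"postcondition failed: {action.__name__}")
                return result
            return wrapped
        return decorate

    @with_contract(pre=lambda path: not path.startswith("/etc"),
                   post=lambda text: text is not None)
    def read_file(path: str) -> str:
        with open(path) as f:
            return f.read()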
Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus
arXiv:2602.22408v1 Announce Type: new Abstract: Humans exhibit remarkable flexibility in abstract reasoning and can rapidly learn and apply rules from sparse examples. To investigate the cognitive strategies underlying this ability, we introduce the Cognitive Abstraction and Reasoning Corpus (CogARC), a...
Mirroring the Mind: Distilling Human-Like Metacognitive Strategies into Large Language Models
arXiv:2602.22508v1 Announce Type: new Abstract: Large Reasoning Models (LRMs) often exhibit structural fragility in complex reasoning tasks, failing to produce correct answers even after successfully deriving valid intermediate steps. Through systematic analysis, we observe that these failures frequently stem not...
AHBid: An Adaptable Hierarchical Bidding Framework for Cross-Channel Advertising
arXiv:2602.22650v1 Announce Type: new Abstract: In online advertising, the inherent complexity and dynamic nature of advertising environments necessitate the use of auto-bidding services to assist advertisers in bid optimization. This complexity is further compounded in multi-channel scenarios, where effective allocation...
AMA-Bench: Evaluating Long-Horizon Memory for Agentic Applications
arXiv:2602.22769v1 Announce Type: new Abstract: Large Language Models (LLMs) are deployed as autonomous agents in increasingly complex applications, where enabling long-horizon memory is critical for achieving strong performance. However, a significant gap exists between practical applications and current evaluation standards...
SPM-Bench: Benchmarking Large Language Models for Scanning Probe Microscopy
arXiv:2602.22971v1 Announce Type: new Abstract: As LLMs achieve breakthroughs in general reasoning, assessing their proficiency in specialized scientific domains reveals pronounced gaps in existing benchmarks, owing to data contamination, insufficient complexity, and prohibitive human labor costs. Here we present SPM-Bench, an...
TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought
arXiv:2602.22828v1 Announce Type: new Abstract: Background: Retrieval-augmented generation (RAG) technology can empower large language models (LLMs) to generate more accurate, professional, and timely responses without fine-tuning. However, due to the complex reasoning processes and substantial individual differences involved...
Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models
arXiv:2602.23197v1 Announce Type: new Abstract: Transformer-based large language models exhibit in-context learning, enabling adaptation to downstream tasks via few-shot prompting with demonstrations. In practice, such models are often fine-tuned to improve zero-shot performance on downstream tasks, allowing them to solve...
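For context, "linear attention" here refers to the kernelized variant in which the softmax is dropped and attention factors through a feature map, so the key-value summary has size independent of sequence length. A minimal sketch with the identity feature map (generic linear attention, not necessarily the paper's exact parameterization):

    import numpy as np

    def linear_attention(Q, K, V):
        # Q, K: (T, d); V: (T, d_v). The summary K.T @ V is (d, d_v),
        # so cost is linear rather than quadratic in sequence length T.
        return Q @ (K.T @ V)

A causal version replaces the single summary with a running prefix sum of outer products of each key and value, which is what gives linear attention its recurrent, in-context-learning-friendly form.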
SPARTA: Scalable and Principled Benchmark of Tree-Structured Multi-hop QA over Text and Tables
arXiv:2602.23286v1 Announce Type: new Abstract: Real-world Table-Text question answering (QA) tasks require models that can reason across long text and source tables, traversing multiple hops and executing complex operations such as aggregation. Yet existing benchmarks are small, manually curated -...
To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning
arXiv:2602.22227v1 Announce Type: new Abstract: Despite their impressive capabilities, Multimodal Large Language Models (MLLMs) exhibit perceptual fragility when confronted with visually complex scenes. This weakness stems from a reliance on finite training datasets, which are prohibitively expensive to scale and...
WaveSSM: Multiscale State-Space Models for Non-stationary Signal Attention
arXiv:2602.22266v1 Announce Type: new Abstract: State-space models (SSMs) have emerged as a powerful foundation for long-range sequence modeling, with the HiPPO framework showing that continuous-time projection operators can be used to derive stable, memory-efficient dynamical systems that encode the past...
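For readers new to the HiPPO line of work, the basic object is a discretized linear state-space recurrence $x_{k+1} = \bar{A} x_k + \bar{B} u_k$, $y_k = C x_k$; a minimal scan is sketched below (a generic SSM, not WaveSSM's multiscale construction):

    import numpy as np

    def ssm_scan(A_bar, B_bar, C, u):
        # A_bar: (n, n); B_bar, C: (n,); u: (T,) scalar input sequence.
        x = np.zeros(A_bar.shape[0])
        ys = []
        for u_k in u:
            x = A_bar @ x + B_bar * u_k   # state carries a compressed history
            ys.append(C @ x)              # readout at each step
        return np.array(ys)

HiPPO's contribution is the principled choice of A and B so that the state x optimally approximates the input's past under a chosen measure.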
Early Risk Stratification of Dosing Errors in Clinical Trials Using Machine Learning
arXiv:2602.22285v1 Announce Type: new Abstract: Objective: To develop a machine learning (ML)-based framework for early risk stratification of clinical trials (CTs) according to their likelihood of exhibiting a high rate of dosing errors, using...
Manifold of Failure: Behavioral Attraction Basins in Language Models
arXiv:2602.22291v1 Announce Type: new Abstract: While prior work has focused on projecting adversarial examples back onto the manifold of natural data to restore safety, we argue that a comprehensive understanding of AI safety requires characterizing the unsafe regions themselves. This...
Revisiting Chebyshev Polynomial and Anisotropic RBF Models for Tabular Regression
arXiv:2602.22422v1 Announce Type: new Abstract: Smooth-basis models such as Chebyshev polynomial regressors and radial basis function (RBF) networks are well established in numerical analysis. Their continuously differentiable prediction surfaces suit surrogate optimisation, sensitivity analysis, and other settings where the response...
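As a reminder of the model class being revisited, a Chebyshev regressor on a scalar feature is ordinary least squares in a Chebyshev basis after rescaling inputs to [-1, 1]; NumPy ships the needed utilities (a minimal univariate sketch, not the paper's anisotropic or multivariate setup):

    import numpy as np
    from numpy.polynomial import chebyshev as cheb

    def fit_chebyshev(x, y, degree=8):
        # Least-squares fit in the Chebyshev basis on x rescaled to [-1, 1].
        lo, hi = x.min(), x.max()
        t = 2.0 * (x - lo) / (hi - lo) - 1.0
        coefs = cheb.chebfit(t, y, degree)
        return lambda xq: cheb.chebval(2.0 * (xq - lo) / (hi - lo) - 1.0, coefs)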
The legal protection of artificial intelligence-generated work: The argument for sui generis over copyright
Artificial intelligence (AI) is the simulation of human intelligence processes by machines, especially computer systems. Like other sectors of society, the modern economy has become increasingly reliant on AI, pointing to its potentially great influence on innovation. Many...
Enhancing Multilingual Embeddings via Multi-Way Parallel Text Alignment
arXiv:2602.21543v1 Announce Type: new Abstract: Multilingual pretraining typically lacks explicit alignment signals, leading to suboptimal cross-lingual alignment in the representation space. In this work, we show that training standard pretrained models for cross-lingual alignment with a multi-way parallel corpus in...
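The abstract truncates before the objective, but a common way to exploit a multi-way parallel corpus is a symmetric contrastive (InfoNCE-style) loss that pulls embeddings of mutual translations together, applied over each language pair in the tuple. The sketch below assumes that formulation and a batch where row i of emb_a and emb_b are translations of each other (the names and the temperature value are illustrative):

    import torch
    import torch.nn.functional as F

    def alignment_loss(emb_a, emb_b, temperature=0.05):
        # In-batch negatives: the (B, B) similarity matrix should be
        # largest on its diagonal, i.e., on true translation pairs.
        a = F.normalize(emb_a, dim=-1)
        b = F.normalize(emb_b, dim=-1)
        logits = a @ b.T / temperature
        labels = torch.arange(a.size(0), device=a.device)
        return (F.cross_entropy(logits, labels) +
                F.cross_entropy(logits.T, labels)) / 2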
MixSarc: A Bangla-English Code-Mixed Corpus for Implicit Meaning Identification
arXiv:2602.21608v1 Announce Type: new Abstract: Bangla-English code-mixing is widespread across South Asian social media, yet resources for implicit meaning identification in this setting remain scarce. Existing sentiment and sarcasm models largely focus on monolingual English or high-resource languages and struggle...
FewMMBench: A Benchmark for Multimodal Few-Shot Learning
arXiv:2602.21854v1 Announce Type: new Abstract: As multimodal large language models (MLLMs) advance in handling interleaved image-text data, assessing their few-shot learning capabilities remains an open challenge. In this paper, we introduce FewMMBench, a comprehensive benchmark designed to evaluate MLLMs under...
MERRY: Semantically Decoupled Evaluation of Multimodal Emotional and Role Consistencies of Role-Playing Agents
arXiv:2602.21941v1 Announce Type: new Abstract: Multimodal Role-Playing Agents (MRPAs) are attracting increasing attention due to their ability to deliver more immersive multimodal emotional interactions. However, existing studies still rely on pure textual benchmarks to evaluate the text responses of MRPAs,...
MEDSYN: Benchmarking Multi-EviDence SYNthesis in Complex Clinical Cases for Multimodal Large Language Models
arXiv:2602.21950v1 Announce Type: new Abstract: Multimodal large language models (MLLMs) have shown great potential in medical applications, yet existing benchmarks inadequately capture real-world clinical complexity. We introduce MEDSYN, a multilingual, multimodal benchmark of highly complex clinical cases with up to...