Labor & Employment

LOW Academic International

MT-PingEval: Evaluating Multi-Turn Collaboration with Private Information Games

arXiv:2602.24188v1 Announce Type: new Abstract: We present a scalable methodology for evaluating language models in multi-turn interactions, using a suite of collaborative games that require effective communication about private information. This enables an interactive scaling analysis, in which a fixed...

1 min 1 month, 2 weeks ago

labor

LOW Academic International

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

arXiv:2602.23866v1 Announce Type: cross Abstract: Software engineering agents (SWE) are improving rapidly, with recent gains largely driven by reinforcement learning (RL). However, RL training is constrained by the scarcity of large-scale task collections with reproducible execution environments and reliable test...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Uncertainty-aware Language Guidance for Concept Bottleneck Models

arXiv:2602.23495v1 Announce Type: new Abstract: Concept Bottleneck Models (CBMs) provide inherent interpretability by first mapping input samples to high-level semantic concepts, followed by a combination of these concepts for the final classification. However, the annotation of human-understandable concepts requires extensive...

1 min 1 month, 2 weeks ago

labor

LOW Academic International

FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation

arXiv:2602.23636v1 Announce Type: new Abstract: Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a fixed definition of harmfulness. In practice, enforcement strictness -...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Optimizer-Induced Low-Dimensional Drift and Transverse Dynamics in Transformer Training

arXiv:2602.23696v1 Announce Type: new Abstract: We study the geometry of training trajectories in small transformer models and find that parameter updates organize into a dominant drift direction with transverse residual dynamics. Using uncentered, row-normalized trajectory PCA, we show that a...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Bridging Dynamics Gaps via Diffusion Schrödinger Bridge for Cross-Domain Reinforcement Learning

arXiv:2602.23737v1 Announce Type: new Abstract: Cross-domain reinforcement learning (RL) aims to learn transferable policies under dynamics shifts between source and target domains. A key challenge lies in the lack of target-domain environment interaction and reward supervision, which prevents direct policy...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

TradeFM: A Generative Foundation Model for Trade-flow and Market Microstructure

arXiv:2602.23784v1 Announce Type: new Abstract: Foundation models have transformed domains from language to genomics by learning general-purpose representations from large-scale, heterogeneous data. We introduce TradeFM, a 524M-parameter generative Transformer that brings this paradigm to market microstructure, learning directly from billions...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

GRAIL: Post-hoc Compensation by Linear Reconstruction for Compressed Networks

arXiv:2602.23795v1 Announce Type: new Abstract: Structured deep model compression methods are hardware-friendly and substantially reduce memory and inference costs. However, under aggressive compression, the resulting accuracy degradation often necessitates post-compression finetuning, which can be impractical due to missing labeled data...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Foundation World Models for Agents that Learn, Verify, and Adapt Reliably Beyond Static Environments

arXiv:2602.23997v1 Announce Type: new Abstract: The next generation of autonomous agents must not only learn efficiently but also act reliably and adapt their behavior in open worlds. Standard approaches typically assume fixed tasks and environments with little or no novelty,...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models

arXiv:2602.20324v1 Announce Type: new Abstract: Phenotyping is fundamental to rare disease diagnosis, but manual curation of structured phenotypes from clinical notes is labor-intensive and difficult to scale. Existing artificial intelligence approaches typically optimize individual components of phenotyping but do not...

1 min 1 month, 2 weeks ago

labor

LOW Academic International

Implicit Intelligence -- Evaluating Agents on What Users Don't Say

arXiv:2602.20424v1 Announce Type: new Abstract: Real-world requests to AI agents are fundamentally underspecified. Natural human communication relies on shared context and unstated constraints that speakers expect listeners to infer. Current agentic benchmarks test explicit instruction-following but fail to evaluate whether...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production

arXiv:2602.20558v1 Announce Type: new Abstract: Large language models (LLMs) are promising backbones for generative recommender systems, yet a key challenge remains underexplored: verbalization, i.e., converting structured user interaction logs into effective natural language inputs. Existing methods rely on rigid templates...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset

arXiv:2602.20812v1 Announce Type: new Abstract: As the construction industry advances toward digital transformation, BIM (Building Information Modeling)-based design has become a key driver supporting intelligent construction. Although Large Language Models (LLMs) have shown potential in promoting BIM-based design, the lack...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Multimodal Multi-Agent Empowered Legal Judgment Prediction

arXiv:2601.12815v5 Announce Type: cross Abstract: Legal Judgment Prediction (LJP) aims to predict the outcomes of legal cases based on factual descriptions, serving as a fundamental task to advance the development of legal systems. Traditional methods often rely on statistical analyses...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Talking to Yourself: Defying Forgetting in Large Language Models

arXiv:2602.20162v1 Announce Type: cross Abstract: Catastrophic forgetting remains a major challenge when fine-tuning large language models (LLMs) on narrow, task-specific data, often degrading their general knowledge and reasoning abilities. We propose SA-SFT, a lightweight self-augmentation routine in which an LLM...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

ConceptRM: The Quest to Mitigate Alert Fatigue through Consensus-Based Purity-Driven Data Cleaning for Reflection Modelling

arXiv:2602.20166v1 Announce Type: cross Abstract: In many applications involving intelligent agents, the overwhelming volume of alerts (mostly false) generated by the agents may desensitize users and cause them to overlook critical issues, leading to the so-called "alert fatigue". A common...

1 min 1 month, 2 weeks ago

labor

LOW Academic International

Benchmarking Early Deterioration Prediction Across Hospital-Rich and MCI-Like Emergency Triage Under Constrained Sensing

arXiv:2602.20168v1 Announce Type: cross Abstract: Emergency triage decisions are made under severe information constraints, yet most data-driven deterioration models are evaluated using signals unavailable during initial assessment. We present a leakage-aware benchmarking framework for early deterioration prediction that evaluates model...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

No One Size Fits All: QueryBandits for Hallucination Mitigation

arXiv:2602.20332v1 Announce Type: new Abstract: Advanced reasoning capabilities in Large Language Models (LLMs) have led to more frequent hallucinations; yet most mitigation work focuses on open-source models for post-hoc detection and parameter editing. The dearth of studies focusing on hallucinations...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Disentangling Geometry, Performance, and Training in Language Models

arXiv:2602.20433v1 Announce Type: new Abstract: Geometric properties of Transformer weights, particularly the unembedding matrix, have been widely useful in language model interpretability research. Yet, their utility for estimating downstream performance remains unclear. In this work, we systematically investigate the relationship...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Latent Context Compilation: Distilling Long Context into Compact Portable Memory

arXiv:2602.21221v1 Announce Type: cross Abstract: Efficient long-context LLM deployment is stalled by a dichotomy between amortized compression, which struggles with out-of-distribution generalization, and Test-Time Training, which incurs prohibitive synthetic data costs and requires modifying model weights, creating stateful parameters that...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

arXiv:2602.21233v1 Announce Type: cross Abstract: This technical report introduces AngelSlim, a comprehensive and versatile toolkit for large model compression developed by the Tencent Hunyuan team. By consolidating cutting-edge algorithms, including quantization, speculative decoding, token pruning, and distillation, AngelSlim provides a...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Equitable Evaluation via Elicitation

arXiv:2602.21327v1 Announce Type: cross Abstract: Individuals with similar qualifications and skills may vary in their demeanor, or outward manner: some tend toward self-promotion while others are modest to the point of omitting crucial information. Comparing the self-descriptions of equally qualified...

1 min 1 month, 2 weeks ago

termination

LOW Academic International

MrBERT: Modern Multilingual Encoders via Vocabulary, Domain, and Dimensional Adaptation

arXiv:2602.21379v1 Announce Type: cross Abstract: We introduce MrBERT, a family of 150M-300M parameter encoders built on the ModernBERT architecture and pre-trained on 35 languages and code. Through targeted adaptation, this model family achieves state-of-the-art results on Catalan- and Spanish-specific tasks,...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

The Headless Firm: How AI Reshapes Enterprise Boundaries

arXiv:2602.21401v1 Announce Type: cross Abstract: The boundary of the firm is determined by coordination cost. We argue that agentic AI induces a structural change in how coordination costs scale: in prior modular systems, integration cost grew with interaction topology (O(n^2)...

1 min 1 month, 2 weeks ago

labor

LOW Academic International

Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents

arXiv:2602.22302v1 Announce Type: new Abstract: Traditional software relies on contracts -- APIs, type systems, assertions -- to specify and enforce correct behavior. AI agents, by contrast, operate on prompts and natural language instructions with no formal behavioral specification. This gap...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus

arXiv:2602.22408v1 Announce Type: new Abstract: Humans exhibit remarkable flexibility in abstract reasoning, and can rapidly learn and apply rules from sparse examples. To investigate the cognitive strategies underlying this ability, we introduce the Cognitive Abstraction and Reasoning Corpus (CogARC), a...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

A Mathematical Theory of Agency and Intelligence

arXiv:2602.22519v1 Announce Type: new Abstract: To operate reliably under changing conditions, complex systems require feedback on how effectively they use resources, not just whether objectives are met. Current AI systems process vast information to produce sophisticated predictions, yet predictions can...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

Requesting Expert Reasoning: Augmenting LLM Agents with Learned Collaborative Intervention

arXiv:2602.22546v1 Announce Type: new Abstract: Large Language Model (LLM) based agents excel at general reasoning but often fail in specialized domains where success hinges on long-tail knowledge absent from their training data. While human experts can provide this missing knowledge,...

1 min 1 month, 2 weeks ago

labor

LOW Academic International

CourtGuard: A Model-Agnostic Framework for Zero-Shot Policy Adaptation in LLM Safety

arXiv:2602.22557v1 Announce Type: new Abstract: Current safety mechanisms for Large Language Models (LLMs) rely heavily on static, fine-tuned classifiers that suffer from adaptation rigidity, the inability to enforce new governance rules without expensive retraining. To address this, we introduce CourtGuard,...

1 min 1 month, 2 weeks ago

ada

LOW Academic International

AHBid: An Adaptable Hierarchical Bidding Framework for Cross-Channel Advertising

arXiv:2602.22650v1 Announce Type: new Abstract: In online advertising, the inherent complexity and dynamic nature of advertising environments necessitate the use of auto-bidding services to assist advertisers in bid optimization. This complexity is further compounded in multi-channel scenarios, where effective allocation...

1 min 1 month, 2 weeks ago

ada

