Steering at the Source: Style Modulation Heads for Robust Persona Control
arXiv:2603.13249v1 Announce Type: new Abstract: Activation steering offers a computationally efficient mechanism for controlling Large Language Models (LLMs) without fine-tuning. While it effectively controls target traits (e.g., persona), coherence degradation remains a major obstacle to safety and practical deployment. We hypothesize...
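As background for this entry: activation steering, in its generic form, adds a steering vector to a model's hidden state at inference time. A minimal numpy sketch of that generic pattern only (toy dimensions; `alpha` and the random vectors are illustrative, and this is not the paper's style-modulation heads):

```python
import numpy as np

def steer(hidden, direction, alpha=4.0):
    """Add a scaled, unit-norm steering vector to a hidden state."""
    unit = direction / np.linalg.norm(direction)
    return hidden + alpha * unit

# Toy 8-dim hidden state and a hypothetical "persona" direction.
rng = np.random.default_rng(0)
hidden = rng.normal(size=8)
persona_dir = rng.normal(size=8)

steered = steer(hidden, persona_dir, alpha=4.0)

# The steered state moves toward the persona direction by exactly alpha
# in the direction's unit coordinate.
before = np.dot(hidden, persona_dir) / np.linalg.norm(persona_dir)
after = np.dot(steered, persona_dir) / np.linalg.norm(persona_dir)
assert after > before
```

The coherence problem the abstract raises is visible even in this sketch: the addition perturbs every coordinate of the hidden state, not just the trait-relevant ones.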
EnterpriseOps-Gym: Environments and Evaluations for Stateful Agentic Planning and Tool Use in Enterprise Settings
arXiv:2603.13594v1 Announce Type: new Abstract: Large language models are shifting from passive information providers to active agents intended for complex workflows. However, their deployment as reliable AI workers in enterprise is stalled by benchmarks that fail to capture the intricacies...
AutoTool: Automatic Scaling of Tool-Use Capabilities in RL via Decoupled Entropy Constraints
arXiv:2603.13348v1 Announce Type: new Abstract: Tool use represents a critical capability for AI agents, with recent advances focusing on leveraging reinforcement learning (RL) to scale up the explicit reasoning process to achieve better performance. However, there are some key challenges...
InterventionLens: A Multi-Agent Framework for Detecting ASD Intervention Strategies in Parent-Child Shared Reading
arXiv:2603.13710v1 Announce Type: new Abstract: Home-based interventions like parent-child shared reading provide a cost-effective approach for supporting children with autism spectrum disorder (ASD). However, analyzing caregiver intervention strategies in naturalistic home interactions typically relies on expert annotation, which is costly,...
Prompt Complexity Dilutes Structured Reasoning: A Follow-Up Study on the Car Wash Problem
arXiv:2603.13351v1 Announce Type: new Abstract: In a previous study [Jo, 2026], STAR reasoning (Situation, Task, Action, Result) raised car wash problem accuracy from 0% to 85% on Claude Sonnet 4.5, and to 100% with additional prompt layers. This follow-up asks:...
The ARC of Progress towards AGI: A Living Survey of Abstraction and Reasoning
arXiv:2603.13372v1 Announce Type: new Abstract: The Abstraction and Reasoning Corpus (ARC-AGI) has become a key benchmark for fluid intelligence in AI. This survey presents the first cross-generation analysis of 82 approaches across three benchmark versions and the ARC Prize 2024-2025...
Projection-Free Evolution Strategies for Continuous Prompt Search
arXiv:2603.13786v1 Announce Type: new Abstract: Continuous prompt search offers a computationally efficient alternative to conventional parameter tuning in natural language processing tasks. Nevertheless, its practical effectiveness can be significantly hindered by the black-box nature and the inherent high-dimensionality of the...
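The search setting in this entry can be illustrated with the simplest member of the evolution-strategy family, a (1+1)-ES over a continuous vector. Everything below is a generic sketch (the quadratic "loss" is a stand-in for a black-box task loss; nothing here is the paper's algorithm):

```python
import numpy as np

def es_minimize(loss, x0, sigma=0.3, iters=200, seed=0):
    """(1+1) evolution strategy: keep a Gaussian mutation if it improves."""
    rng = np.random.default_rng(seed)
    x, fx = x0, loss(x0)
    for _ in range(iters):
        cand = x + sigma * rng.normal(size=x.shape)
        fc = loss(cand)
        if fc < fx:  # greedy acceptance; no projection step is needed
            x, fx = cand, fc
    return x, fx

# Toy black-box objective: squared distance to a hidden optimum.
target = np.array([1.0, -2.0, 0.5])
loss = lambda v: float(np.sum((v - target) ** 2))

x, fx = es_minimize(loss, np.zeros(3))
assert fx < loss(np.zeros(3))  # improved over the initial vector
```

Note the acceptance rule keeps candidates wherever the mutation lands them, which is why no projection back onto a feasible set appears in the loop.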
vla-eval: A Unified Evaluation Harness for Vision-Language-Action Models
arXiv:2603.13966v1 Announce Type: new Abstract: Vision-Language-Action (VLA) models are typically evaluated using per-benchmark scripts maintained independently by each model repository, leading to duplicated code, dependency conflicts, and underspecified protocols. We present vla-eval, an open-source evaluation...
Do Large Language Models Get Caught in Hofstadter-Mobius Loops?
arXiv:2603.13378v1 Announce Type: new Abstract: In Arthur C. Clarke's 2010: Odyssey Two, HAL 9000's homicidal breakdown is diagnosed as a "Hofstadter-Mobius loop": a failure mode in which an autonomous system receives contradictory directives and, unable to reconcile them, defaults to...
GroupGuard: A Framework for Modeling and Defending Collusive Attacks in Multi-Agent Systems
arXiv:2603.13940v1 Announce Type: new Abstract: While large language model-based agents demonstrate great potential in collaborative tasks, their interactivity also introduces security vulnerabilities. In this paper, we propose and model group collusive attacks, a highly destructive threat in which multiple agents...
DeceptGuard: A Constitutional Oversight Framework for Detecting Deception in LLM Agents
arXiv:2603.13791v1 Announce Type: new Abstract: Reliable detection of deceptive behavior in Large Language Model (LLM) agents is an essential prerequisite for safe deployment in high-stakes agentic contexts. Prior work on scheming detection has focused exclusively on black-box monitors that observe...
Early Rug Pull Warning for BSC Meme Tokens via Multi-Granularity Wash-Trading Pattern Profiling
arXiv:2603.13830v1 Announce Type: new Abstract: The high-frequency issuance and short-cycle speculation of meme tokens in decentralized finance (DeFi) have significantly amplified rug-pull risk. Existing approaches still struggle to provide stable early warning under scarce anomalies, incomplete labels, and limited interpretability....
Explain in Your Own Words: Improving Reasoning via Token-Selective Dual Knowledge Distillation
arXiv:2603.13260v1 Announce Type: new Abstract: Knowledge Distillation (KD) can transfer the reasoning abilities of large models to smaller ones, reducing the cost of generating Chain-of-Thought traces for reasoning tasks. KD methods typically ask the student to mimic the teacher's...
ManiBench: A Benchmark for Testing Visual-Logic Drift and Syntactic Hallucinations in Manim Code Generation
arXiv:2603.13251v1 Announce Type: new Abstract: Traditional benchmarks like HumanEval and MBPP test logic and syntax effectively, but fail when code must produce dynamic, pedagogical visuals. We introduce ManiBench, a specialized benchmark evaluating LLM performance in generating Manim CE code, where...
Multimodal Emotion Regression with Multi-Objective Optimization and VAD-Aware Audio Modeling for the 10th ABAW EMI Track
arXiv:2603.13760v1 Announce Type: new Abstract: We participated in the 10th ABAW Challenge, focusing on the Emotional Mimicry Intensity (EMI) Estimation track on the Hume-Vidmimic2 dataset. This task aims to predict six continuous emotion dimensions: Admiration, Amusement, Determination, Empathic Pain, Excitement,...
APEX-Searcher: Augmenting LLMs' Search Capabilities through Agentic Planning and Execution
arXiv:2603.13853v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG), based on large language models (LLMs), serves as a vital approach to retrieving and leveraging external knowledge in various domain applications. When confronted with complex multi-hop questions, single-round retrieval is often insufficient...
OmniCompliance-100K: A Multi-Domain, Rule-Grounded, Real-World Safety Compliance Dataset
arXiv:2603.13933v1 Announce Type: new Abstract: Ensuring the safety and compliance of large language models (LLMs) is of paramount importance. However, existing LLM safety datasets often rely on ad-hoc taxonomies for data generation and suffer from a significant shortage of rule-grounded,...
ToolFlood: Beyond Selection -- Hiding Valid Tools from LLM Agents via Semantic Covering
arXiv:2603.13950v1 Announce Type: new Abstract: Large Language Model (LLM) agents increasingly use external tools for complex tasks and rely on embedding-based retrieval to select a small top-k subset for reasoning. As these systems scale, the robustness of this retrieval stage...
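The retrieval stage this entry targets is commonly cosine-similarity top-k over tool-description embeddings. A toy sketch of how flooding the index with near-duplicates of a query can displace a valid tool from the top-k (random vectors stand in for real embeddings; the decoy count and k are illustrative, not the paper's attack):

```python
import numpy as np

def top_k(query, index, k):
    """Return the row indices of the k most cosine-similar entries."""
    index_n = index / np.linalg.norm(index, axis=1, keepdims=True)
    query_n = query / np.linalg.norm(query)
    sims = index_n @ query_n
    return set(np.argsort(sims)[::-1][:k])

rng = np.random.default_rng(1)
query = rng.normal(size=16)
valid_tool = query + 0.1 * rng.normal(size=16)  # close to the query
others = rng.normal(size=(20, 16))              # unrelated tools

clean_index = np.vstack([valid_tool, others])
assert 0 in top_k(query, clean_index, k=3)      # valid tool is retrieved

# "Flood" the index with decoys even closer to the query.
decoys = query + 0.01 * rng.normal(size=(5, 16))
flooded_index = np.vstack([valid_tool, others, decoys])
assert 0 not in top_k(query, flooded_index, k=3)  # valid tool is covered
```

The point of the sketch: the valid tool's embedding never changes, yet it vanishes from the reasoning context because top-k is a zero-sum budget.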
CMHL: Contrastive Multi-Head Learning for Emotionally Consistent Text Classification
arXiv:2603.14078v1 Announce Type: new Abstract: Textual Emotion Classification (TEC) is one of the most difficult NLP tasks. State-of-the-art approaches rely on large language models (LLMs) and multi-model ensembles. In this study, we challenge the assumption that larger...
OasisSimp: An Open-source Asian-English Sentence Simplification Dataset
arXiv:2603.14111v1 Announce Type: new Abstract: Sentence simplification aims to make complex text more accessible by reducing linguistic complexity while preserving the original meaning. However, progress in this area remains limited for mid-resource and low-resource languages due to the scarcity of...
Selective Fine-Tuning of GPT Architectures for Parameter-Efficient Clinical Text Classification
arXiv:2603.14183v1 Announce Type: new Abstract: The rapid expansion of electronic health record (EHR) systems has generated large volumes of unstructured clinical narratives that contain valuable information for disease identification, patient cohort discovery, and clinical decision support. Extracting structured knowledge from...
Mitigating Overthinking in Large Reasoning Language Models via Reasoning Path Deviation Monitoring
arXiv:2603.14251v1 Announce Type: new Abstract: Large Reasoning Language Models (LRLMs) demonstrate impressive capabilities on complex tasks by utilizing long Chain-of-Thought reasoning. However, they are prone to overthinking, which generates redundant reasoning steps that degrade both performance and efficiency. Recently, early-exit...
Automatic Inter-document Multi-hop Scientific QA Generation
arXiv:2603.14257v1 Announce Type: new Abstract: Existing automatic scientific question generation studies mainly focus on single-document factoid QA, overlooking the inter-document reasoning crucial for scientific understanding. We present AIM-SciQA, an automated framework for generating multi-document, multi-hop scientific QA datasets. AIM-SciQA extracts...
SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
arXiv:2603.14303v1 Announce Type: new Abstract: Existing KV cache compression methods generally operate on discrete tokens or non-semantic chunks. However, such approaches often lead to semantic fragmentation, where linguistically coherent units are disrupted, causing irreversible information loss and degradation in model...
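The generic chunk-then-merge compression pattern behind entries like this one is easy to state: group token KV vectors into chunks and replace each chunk with a single merged vector. A minimal sketch with mean-merging (real methods, including presumably this paper's, choose boundaries semantically and merge more carefully):

```python
import numpy as np

def merge_chunks(kv, boundaries):
    """Compress a (tokens, dim) KV matrix by averaging each chunk.

    `boundaries` lists chunk start indices. This shows only the generic
    chunk-and-merge pattern; boundary selection is the interesting part.
    """
    starts = list(boundaries)
    ends = starts[1:] + [len(kv)]
    return np.stack([kv[s:e].mean(axis=0) for s, e in zip(starts, ends)])

kv = np.arange(12.0).reshape(6, 2)        # 6 token KV vectors, dim 2
compressed = merge_chunks(kv, [0, 2, 5])  # chunks: [0:2], [2:5], [5:6]
assert compressed.shape == (3, 2)         # 6 entries compressed to 3
```

The semantic-fragmentation concern in the abstract corresponds to placing a boundary mid-phrase, so that a coherent unit is split across two centroids.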
Creative Convergence or Imitation? Genre-Specific Homogeneity in LLM-Generated Chinese Literature
arXiv:2603.14430v1 Announce Type: new Abstract: Large Language Models (LLMs) have demonstrated remarkable capabilities in narrative generation. However, they often produce structurally homogenized stories, frequently following repetitive arrangements and combinations of plot events along with stereotypical resolutions. In this paper, we...
PARSA-Bench: A Comprehensive Persian Audio-Language Model Benchmark
arXiv:2603.14456v1 Announce Type: new Abstract: Persian poses unique audio understanding challenges through its classical poetry, traditional music, and pervasive code-switching, none of which are captured by existing benchmarks. We introduce PARSA-Bench (Persian Audio Reasoning and Speech Assessment Benchmark), the first benchmark for...
Continual Fine-Tuning with Provably Accurate and Parameter-Free Task Retrieval
arXiv:2603.13235v1 Announce Type: new Abstract: Continual fine-tuning aims to adapt a pre-trained backbone to new tasks sequentially while preserving performance on earlier tasks whose data are no longer available. Existing approaches fall into two categories: input-adaptation and parameter-adaptation....
ICaRus: Identical Cache Reuse for Efficient Multi-Model Inference
arXiv:2603.13281v1 Announce Type: new Abstract: Multi-model inference has recently emerged as a prominent paradigm, particularly in the development of agentic AI systems. However, in such scenarios, each model must maintain its own Key-Value (KV) cache for the identical prompt,...
From Stochastic Answers to Verifiable Reasoning: Interpretable Decision-Making with LLM-Generated Code
arXiv:2603.13287v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used for high-stakes decision-making, yet existing approaches struggle to reconcile scalability, interpretability, and reproducibility. Black-box models obscure their reasoning, while recent LLM-based rule systems rely on per-sample evaluation, causing...
RelayCaching: Accelerating LLM Collaboration via Decoding KV Cache Reuse
arXiv:2603.13289v1 Announce Type: new Abstract: The increasing complexity of AI tasks has shifted the paradigm from monolithic models toward multi-agent large language model (LLM) systems. However, these collaborative architectures introduce a critical bottleneck: redundant prefill computation for shared content generated...
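The redundant-prefill bottleneck named in this entry is the motivation for prefix caching generally: when two requests share a prompt prefix, the KV entries for the shared tokens can be computed once and reused. A toy cache keyed by token prefix (the `compute_kv` stub stands in for a real forward pass; this is the generic pattern, not the paper's decoding-stage mechanism):

```python
import numpy as np

CACHE = {}
COMPUTE_CALLS = 0

def compute_kv(tokens):
    """Stub for a real prefill forward pass (counted to show reuse)."""
    global COMPUTE_CALLS
    COMPUTE_CALLS += 1
    return np.array([[hash(t) % 97] for t in tokens], dtype=float)

def prefill(tokens):
    """Reuse the longest cached prefix; compute only the suffix."""
    kv, start = np.empty((0, 1)), 0
    for cut in range(len(tokens), 0, -1):
        prefix = tuple(tokens[:cut])
        if prefix in CACHE:
            kv, start = CACHE[prefix], cut
            break
    if start < len(tokens):
        kv = np.vstack([kv, compute_kv(tokens[start:])])
    # Cache every new prefix so later requests can branch anywhere.
    for cut in range(start + 1, len(tokens) + 1):
        CACHE[tuple(tokens[:cut])] = kv[:cut]
    return kv

a = prefill(["sys", "ctx", "q1"])  # full prefill: 1 compute call
b = prefill(["sys", "ctx", "q2"])  # reuses ["sys", "ctx"]; computes "q2" only
assert COMPUTE_CALLS == 2
```

The second request triggers only one additional `compute_kv` call, for its unshared suffix; the shared two-token prefix is served from the cache.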