International Law

LOW Academic International

Beyond Final Answers: CRYSTAL Benchmark for Transparent Multimodal Reasoning Evaluation

arXiv:2603.13099v1 Announce Type: new Abstract: We introduce CRYSTAL (Clear Reasoning via Yielded Steps, Traceability and Logic), a diagnostic benchmark with 6,372 instances that evaluates multimodal reasoning through verifiable intermediate steps. We propose two complementary metrics: Match F1, which scores step-level...

1 min 1 month ago

LOW Academic International

ToolTree: Efficient LLM Agent Tool Planning via Dual-Feedback Monte Carlo Tree Search and Bidirectional Pruning

arXiv:2603.12740v1 Announce Type: new Abstract: Large Language Model (LLM) agents are increasingly applied to complex, multi-step tasks that require interaction with diverse external tools across various domains. However, current LLM agent tool planning methods typically rely on greedy, reactive tool...

1 min 1 month ago

LOW Academic European Union

DART: Input-Difficulty-AwaRe Adaptive Threshold for Early-Exit DNNs

arXiv:2603.12269v1 Announce Type: cross Abstract: Early-exit deep neural networks enable adaptive inference by terminating computation when sufficient confidence is achieved, reducing cost for edge AI accelerators in resource-constrained settings. Existing methods, however, rely on suboptimal exit policies, ignore input difficulty,...

1 min 1 month ago

LOW Academic International

ODRL Policy Comparison Through Normalisation

arXiv:2603.12926v1 Announce Type: new Abstract: The ODRL language has become the standard for representing policies and regulations for digital rights. However, its complexity is a barrier to its usage, which has caused many related theoretical and practical works to focus...

1 min 1 month ago

LOW Academic International

On Using Machine Learning to Early Detect Catastrophic Failures in Marine Diesel Engines

arXiv:2603.12733v1 Announce Type: new Abstract: Catastrophic failures of marine engines imply severe loss of functionality and destroy or damage the systems irreversibly. Being sudden and often unpredictable events, they pose a severe threat to navigation, crew, and passengers. The abrupt...

1 min 1 month ago

LOW Academic United States

Prompt Injection as Role Confusion

arXiv:2603.12277v1 Announce Type: cross Abstract: Language models remain vulnerable to prompt injection attacks despite extensive safety training. We trace this failure to role confusion: models infer roles from how text is written, not where it comes from. We design novel...

1 min 1 month ago

LOW Academic United States

Efficient and Interpretable Multi-Agent LLM Routing via Ant Colony Optimization

arXiv:2603.12933v1 Announce Type: new Abstract: Large Language Model (LLM)-driven Multi-Agent Systems (MAS) have demonstrated strong capability in complex reasoning and tool use, and heterogeneous agent pools further broaden the quality-cost trade-off space. Despite these advances, real-world deployment is often constrained...

1 min 1 month ago

LOW Academic European Union

Diagnosing Retrieval Bias Under Multiple In-Context Knowledge Updates in Large Language Models

arXiv:2603.12271v1 Announce Type: cross Abstract: LLMs are widely used in knowledge-intensive tasks where the same fact may be revised multiple times within context. Unlike prior work focusing on one-shot updates or single conflicts, multi-update scenarios contain multiple historically valid versions...

1 min 1 month ago

LOW Academic European Union

Synthetic Data Generation for Brain-Computer Interfaces: Overview, Benchmarking, and Future Directions

arXiv:2603.12296v1 Announce Type: cross Abstract: Deep learning has achieved transformative performance across diverse domains, largely driven by large-scale, high-quality training data. In contrast, the development of brain-computer interfaces (BCIs) is fundamentally constrained by the limited, heterogeneous, and privacy-sensitive neural...

1 min 1 month ago

LOW Academic International

Predictive Analytics for Foot Ulcers Using Time-Series Temperature and Pressure Data

arXiv:2603.12278v1 Announce Type: cross Abstract: Diabetic foot ulcers (DFUs) are a severe complication of diabetes, often resulting in significant morbidity. This paper presents a predictive analytics framework utilizing time-series data captured by wearable foot sensors -- specifically NTC thin-film thermocouples...

1 min 1 month ago

LOW Academic European Union

Detecting Miscitation on the Scholarly Web through LLM-Augmented Text-Rich Graph Learning

arXiv:2603.12290v1 Announce Type: cross Abstract: The scholarly web is a vast network of knowledge connected by citations. However, this system is increasingly compromised by miscitation, where references do not support or even contradict the claims they are cited for. Current miscitation...

1 min 1 month ago

LOW Academic International

Maximum Entropy Exploration Without the Rollouts

arXiv:2603.12325v1 Announce Type: cross Abstract: Efficient exploration remains a central challenge in reinforcement learning, serving as a useful pretraining objective for data collection, particularly when an external reward function is unavailable. A principled formulation of the exploration problem is to...

1 min 1 month, 1 week ago

LOW Academic International

Optimizing Task Completion Time Updates Using POMDPs

arXiv:2603.12340v1 Announce Type: cross Abstract: Managing announced task completion times is a fundamental control problem in project management. While extensive research exists on estimating task durations and task scheduling, the problem of when and how to update completion times communicated...

1 min 1 month, 1 week ago

LOW Academic International

SPARROW: Learning Spatial Precision and Temporal Referential Consistency in Pixel-Grounded Video MLLMs

arXiv:2603.12382v1 Announce Type: cross Abstract: Multimodal large language models (MLLMs) have advanced from image-level reasoning to pixel-level grounding, but extending these capabilities to videos remains challenging as models must achieve spatial precision and temporally consistent reference tracking. Existing video MLLMs...

1 min 1 month, 1 week ago

LOW Academic International

Test-Time Strategies for More Efficient and Accurate Agentic RAG

arXiv:2603.12396v1 Announce Type: cross Abstract: Retrieval-Augmented Generation (RAG) systems face challenges with complex, multihop questions, and agentic frameworks such as Search-R1 (Jin et al., 2025), which operates iteratively, have been proposed to address these complexities. However, such approaches can introduce...

1 min 1 month, 1 week ago

LOW Academic International

Revisiting Model Stitching In the Foundation Model Era

arXiv:2603.12433v1 Announce Type: cross Abstract: Model stitching, connecting early layers of one model (source) to later layers of another (target) via a light stitch layer, has served as a probe of representational compatibility. Prior work finds that models trained on...

1 min 1 month, 1 week ago

LOW Academic European Union

Unmasking Biases and Reliability Concerns in Convolutional Neural Networks Analysis of Cancer Pathology Images

arXiv:2603.12445v1 Announce Type: cross Abstract: Convolutional Neural Networks have shown promising effectiveness in identifying different types of cancer from radiographs. However, the opaque nature of CNNs makes it difficult to fully understand the way they operate, limiting their assessment to...

1 min 1 month, 1 week ago

LOW Academic United States

Operationalising Cyber Risk Management Using AI: Connecting Cyber Incidents to MITRE ATT&CK Techniques, Security Controls, and Metrics

arXiv:2603.12455v1 Announce Type: cross Abstract: The escalating frequency of cyber-attacks poses significant challenges for organisations, particularly small enterprises constrained by limited in-house expertise, insufficient knowledge, and financial resources. This research presents a novel framework that leverages Natural Language Processing to...

1 min 1 month, 1 week ago

LOW Academic International

Shattering the Shortcut: A Topology-Regularized Benchmark for Multi-hop Medical Reasoning in LLMs

arXiv:2603.12458v1 Announce Type: cross Abstract: While Large Language Models (LLMs) achieve expert-level performance on standard medical benchmarks through single-hop factual recall, they severely struggle with the complex, multi-hop diagnostic reasoning required in real-world clinical settings. A primary obstacle is "shortcut...

1 min 1 month, 1 week ago

LOW Academic International

CLARE: Classification-based Regression for Electron Temperature Prediction

arXiv:2603.12470v1 Announce Type: cross Abstract: Electron temperature (Te) is an important parameter governing space weather in the upper atmosphere, but has historically been underexplored in the space weather machine learning literature. We present CLARE, a machine learning model for predicting...

1 min 1 month, 1 week ago

LOW Academic International

TRACE: Temporal Rule-Anchored Chain-of-Evidence on Knowledge Graphs for Interpretable Stock Movement Prediction

arXiv:2603.12500v1 Announce Type: cross Abstract: We present a Temporal Rule-Anchored Chain-of-Evidence (TRACE) on knowledge graphs for interpretable stock movement prediction that unifies symbolic relational priors, dynamic graph exploration, and LLM-guided decision making in a single end-to-end pipeline. The approach performs...

1 min 1 month, 1 week ago

LOW Academic International

ELLA: Generative AI-Powered Social Robots for Early Language Development at Home

arXiv:2603.12508v1 Announce Type: cross Abstract: Early language development shapes children's later literacy and learning, yet many families have limited access to scalable, high-quality support at home. Recent advances in generative AI make it possible for social robots to move beyond...

1 min 1 month, 1 week ago

LOW Academic International

LLM BiasScope: A Real-Time Bias Analysis Platform for Comparative LLM Evaluation

arXiv:2603.12522v1 Announce Type: cross Abstract: As large language models (LLMs) are deployed widely, detecting and understanding bias in their outputs is critical. We present LLM BiasScope, a web application for side-by-side comparison of LLM outputs with real-time bias analysis. The...

1 min 1 month, 1 week ago

LOW Academic International

TERMINATOR: Learning Optimal Exit Points for Early Stopping in Chain-of-Thought Reasoning

arXiv:2603.12529v1 Announce Type: cross Abstract: Large Reasoning Models (LRMs) achieve impressive performance on complex reasoning tasks via Chain-of-Thought (CoT) reasoning, which enables them to generate intermediate thinking tokens before arriving at the final answer. However, LRMs often suffer from significant...

1 min 1 month, 1 week ago

LOW Academic International

GONE: Structural Knowledge Unlearning via Neighborhood-Expanded Distribution Shaping

arXiv:2603.12275v1 Announce Type: new Abstract: Unlearning knowledge is a pressing and challenging task in Large Language Models (LLMs) because of their unprecedented capability to memorize and digest training data at scale, raising more significant issues regarding safety, privacy, and intellectual...

1 min 1 month, 1 week ago

LOW Academic European Union

LLM-Augmented Therapy Normalization and Aspect-Based Sentiment Analysis for Treatment-Resistant Depression on Reddit

arXiv:2603.12343v1 Announce Type: new Abstract: Treatment-resistant depression (TRD) is a severe form of major depressive disorder in which patients do not achieve remission despite multiple adequate treatment trials. Evidence across pharmacologic options for TRD remains limited, and trials often do...

1 min 1 month, 1 week ago

LOW Academic United States

CSE-UOI at SemEval-2026 Task 6: A Two-Stage Heterogeneous Ensemble with Deliberative Complexity Gating for Political Evasion Detection

arXiv:2603.12453v1 Announce Type: new Abstract: This paper describes our system for SemEval-2026 Task 6, which classifies clarity of responses in political interviews into three categories: Clear Reply, Ambivalent, and Clear Non-Reply. We propose a heterogeneous dual large language model (LLM)...

1 min 1 month, 1 week ago

LOW Academic European Union

Marked Pedagogies: Examining Linguistic Biases in Personalized Automated Writing Feedback

arXiv:2603.12471v1 Announce Type: new Abstract: Effective personalized feedback is critical to students' literacy development. Though LLM-powered tools now promise to automate such feedback at scale, LLMs are not language-neutral: they privilege standard academic English and reproduce social stereotypes, raising concerns...

1 min 1 month, 1 week ago

LOW Academic International

AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents

arXiv:2603.12564v1 Announce Type: new Abstract: Tool-augmented LLM agents increasingly serve as multi-turn advisors in high-stakes domains, yet their evaluation relies on ranking-quality metrics that measure what is recommended but not whether it is safe for the user. We introduce a...

1 min 1 month, 1 week ago

LOW Academic International

Expert Pyramid Tuning: Efficient Parameter Fine-Tuning for Expertise-Driven Task Allocation

arXiv:2603.12577v1 Announce Type: new Abstract: Parameter-Efficient Fine-Tuning (PEFT) has become a dominant paradigm for deploying LLMs in multi-task scenarios due to its extreme parameter efficiency. While Mixture-of-Experts (MoE) based LoRA variants have achieved promising results by dynamically routing tokens to...

1 min 1 month, 1 week ago
