Immigration Law

LOW Academic International

The 2025 AI Agent Index: Documenting Technical and Safety Features of Deployed Agentic AI Systems

arXiv:2602.17753v1 Announce Type: cross Abstract: Agentic AI systems are increasingly capable of performing professional and personal tasks with limited human involvement. However, tracking these developments is difficult because the AI agent ecosystem is complex, rapidly evolving, and inconsistently documented, posing...

1 min 1 month, 1 week ago

LOW Academic International

Feedback-based Automated Verification in Vibe Coding of CAS Adaptation Built on Constraint Logic

arXiv:2602.18607v1 Announce Type: new Abstract: In CAS adaptation, a challenge is to define the dynamic architecture of the system and changes in its behavior. Implementation-wise, this is projected into an adaptation mechanism, typically realized as an Adaptation Manager (AM). With...

1 min 1 month, 1 week ago

LOW Academic International

TPRU: Advancing Temporal and Procedural Understanding in Large Multimodal Models

arXiv:2602.18884v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs), particularly smaller, deployable variants, exhibit a critical deficiency in understanding temporal and procedural visual data, a bottleneck hindering their application in real-world embodied AI. This gap is largely caused by...

1 min 1 month, 1 week ago

LOW Academic International

(Perlin) Noise as AI coordinator

arXiv:2602.18947v1 Announce Type: new Abstract: Large scale control of nonplayer agents is central to modern games, while production systems still struggle to balance several competing goals: locally smooth, natural behavior, and globally coordinated variety across space and time. Prior approaches...

1 min 1 month, 1 week ago

LOW Academic International

Agentic Problem Frames: A Systematic Approach to Engineering Reliable Domain Agents

arXiv:2602.19065v1 Announce Type: new Abstract: Large Language Models (LLMs) are evolving into autonomous agents, yet current "frameless" development--relying on ambiguous natural language without engineering blueprints--leads to critical risks such as scope creep and open-loop failures. To ensure industrial-grade reliability, this...

1 min 1 month, 1 week ago

LOW Academic International

Defining Explainable AI for Requirements Analysis

arXiv:2602.19071v1 Announce Type: new Abstract: Explainable Artificial Intelligence (XAI) has become popular in the last few years. The Artificial Intelligence (AI) community in general, and the Machine Learning (ML) community in particular, is coming to the realisation that in many...

1 min 1 month, 1 week ago

LOW Academic International

Post-Routing Arithmetic in Llama-3: Last-Token Result Writing and Rotation-Structured Digit Directions

arXiv:2602.19109v1 Announce Type: new Abstract: We study three-digit addition in Meta-Llama-3-8B (base) under a one-token readout to characterize how arithmetic answers are finalized after cross-token routing becomes causally irrelevant. Causal residual patching and cumulative attention ablations localize a sharp boundary...

1 min 1 month, 1 week ago

LOW Academic International

Limited Reasoning Space: The cage of long-horizon reasoning in LLMs

arXiv:2602.19281v1 Announce Type: new Abstract: The test-time compute strategy, such as Chain-of-Thought (CoT), has significantly enhanced the ability of large language models to solve complex tasks like logical reasoning. However, empirical studies indicate that simply increasing the compute budget can...

1 min 1 month, 1 week ago

LOW Academic International

Time Series, Vision, and Language: Exploring the Limits of Alignment in Contrastive Representation Spaces

arXiv:2602.19367v1 Announce Type: new Abstract: The Platonic Representation Hypothesis posits that learned representations from models trained on different modalities converge to a shared latent structure of the world. However, this hypothesis has largely been examined in vision and language, and...

1 min 1 month, 1 week ago

LOW Academic International

ReportLogic: Evaluating Logical Quality in Deep Research Reports

arXiv:2602.18446v1 Announce Type: new Abstract: Users increasingly rely on Large Language Models (LLMs) for Deep Research, using them to synthesize diverse sources into structured reports that support understanding and action. In this context, the practical reliability of such reports hinges...

1 min 1 month, 1 week ago

LOW Academic International

The Million-Label NER: Breaking Scale Barriers with GLiNER bi-encoder

arXiv:2602.18487v1 Announce Type: new Abstract: This paper introduces GLiNER-bi-Encoder, a novel architecture for Named Entity Recognition (NER) that harmonizes zero-shot flexibility with industrial-scale efficiency. While the original GLiNER framework offers strong generalization, its joint-encoding approach suffers from quadratic complexity as...

1 min 1 month, 1 week ago

LOW Academic International

Luna-2: Scalable Single-Token Evaluation with Small Language Models

arXiv:2602.18583v1 Announce Type: new Abstract: Real-time guardrails require evaluation that is accurate, cheap, and fast - yet today's default, LLM-as-a-judge (LLMAJ), is slow, expensive, and operationally non-deterministic due to multi-token generation. We present Luna-2, a novel architecture that leverages decoder-only...

1 min 1 month, 1 week ago

LOW Academic International

Contradiction to Consensus: Dual Perspective, Multi Source Retrieval Based Claim Verification with Source Level Disagreement using LLM

arXiv:2602.18693v1 Announce Type: new Abstract: The spread of misinformation across digital platforms can pose significant societal risks. Claim verification, a.k.a. fact-checking, systems can help identify potential misinformation. However, their efficacy is limited by the knowledge sources that they rely on....

1 min 1 month, 1 week ago

LOW Academic International

Rethinking Retrieval-Augmented Generation as a Cooperative Decision-Making Problem

arXiv:2602.18734v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has demonstrated strong effectiveness in knowledge-intensive tasks by grounding language generation in external evidence. Despite its success, many existing RAG systems are built based on a ranking-centric, asymmetric dependency paradigm, where the...

1 min 1 month, 1 week ago

LOW Academic International

ArabicNumBench: Evaluating Arabic Number Reading in Large Language Models

arXiv:2602.18776v1 Announce Type: new Abstract: We present ArabicNumBench, a comprehensive benchmark for evaluating large language models on Arabic number reading tasks across Eastern Arabic-Indic numerals (0-9 in Arabic script) and Western Arabic numerals (0-9). We evaluate 71 models from 10...

1 min 1 month, 1 week ago

LOW Academic International

Yor-Sarc: A gold-standard dataset for sarcasm detection in a low-resource African language

arXiv:2602.18964v1 Announce Type: new Abstract: Sarcasm detection poses a fundamental challenge in computational semantics, requiring models to resolve disparities between literal and intended meaning. The challenge is amplified in low-resource languages where annotated datasets are scarce or nonexistent. We present...

1 min 1 month, 1 week ago

LOW Academic International

HumanMCP: A Human-Like Query Dataset for Evaluating MCP Tool Retrieval Performance

arXiv:2602.23367v1 Announce Type: new Abstract: Model Context Protocol (MCP) servers contain a collection of thousands of open-source standardized tools, linking LLMs to external systems; however, existing datasets and benchmarks lack realistic, human-like user queries, remaining a critical gap in evaluating...

1 min 1 month, 1 week ago

LOW Academic International

MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

arXiv:2602.23632v1 Announce Type: new Abstract: Synthesizing high-quality training data is crucial for enhancing domain models' reasoning abilities. Existing methods face limitations in long-tail knowledge coverage, effectiveness verification, and interpretability. Knowledge-graph-based approaches still fall short in functionality, granularity, customizability, and evaluation....

1 min 1 month, 1 week ago

LOW Academic International

From Flat Logs to Causal Graphs: Hierarchical Failure Attribution for LLM-based Multi-Agent Systems

arXiv:2602.23701v1 Announce Type: new Abstract: LLM-powered Multi-Agent Systems (MAS) have demonstrated remarkable capabilities in complex domains but suffer from inherent fragility and opaque failure mechanisms. Existing failure attribution methods, whether relying on direct prompting, costly replays, or supervised fine-tuning, typically...

1 min 1 month, 1 week ago

LOW Academic International

ProductResearch: Training E-Commerce Deep Research Agents via Multi-Agent Synthetic Trajectory Distillation

arXiv:2602.23716v1 Announce Type: new Abstract: Large Language Model (LLM)-based agents show promise for e-commerce conversational shopping, yet existing implementations lack the interaction depth and contextual breadth required for complex product research. Meanwhile, the Deep Research paradigm, despite advancing information synthesis...

1 min 1 month, 1 week ago

LOW Academic International

Reasoning-Driven Multimodal LLM for Domain Generalization

arXiv:2602.23777v1 Announce Type: new Abstract: This paper addresses the domain generalization (DG) problem in deep learning. While most DG methods focus on enforcing visual feature invariance, we leverage the reasoning capability of multimodal large language models (MLLMs) and explore the...

1 min 1 month, 1 week ago

LOW Academic International

RF-Agent: Automated Reward Function Design via Language Agent Tree Search

arXiv:2602.23876v1 Announce Type: new Abstract: Designing efficient reward functions for low-level control tasks is a challenging problem. Recent research aims to reduce reliance on expert experience by using Large Language Models (LLMs) with task information to generate dense reward functions....

1 min 1 month, 1 week ago

LOW Academic International

Pessimistic Auxiliary Policy for Offline Reinforcement Learning

arXiv:2602.23974v1 Announce Type: new Abstract: Offline reinforcement learning aims to learn an agent from pre-collected datasets, avoiding unsafe and inefficient real-time interaction. However, inevitable access to out-of-distribution actions during the learning process introduces approximation errors, causing the error accumulation and...

1 min 1 month, 1 week ago

LOW Academic International

Recycling Failures: Salvaging Exploration in RLVR via Fine-Grained Off-Policy Guidance

arXiv:2602.24110v1 Announce Type: new Abstract: Reinforcement Learning from Verifiable Rewards (RLVR) has emerged as a powerful paradigm for enhancing the complex reasoning capabilities of Large Reasoning Models. However, standard outcome-based supervision suffers from a critical limitation that penalizes trajectories that...

1 min 1 month, 1 week ago

LOW Academic International

LemmaBench: A Live, Research-Level Benchmark to Evaluate LLM Capabilities in Mathematics

arXiv:2602.24173v1 Announce Type: new Abstract: We present a new approach for benchmarking Large Language Model (LLM) capabilities on research-level mathematics. Existing benchmarks largely rely on static, hand-curated sets of contest or textbook-style problems as proxies for mathematical research. Instead, we...

1 min 1 month, 1 week ago

LOW Academic International

Reason to Contrast: A Cascaded Multimodal Retrieval Framework

arXiv:2602.23369v1 Announce Type: cross Abstract: Traditional multimodal retrieval systems rely primarily on bi-encoder architectures, where performance is closely tied to embedding dimensionality. Recent work, Think-Then-Embed (TTE), shows that incorporating multimodal reasoning to elicit additional informative tokens before embedding can further...

1 min 1 month, 1 week ago

LOW Academic International

Toward General Semantic Chunking: A Discriminative Framework for Ultra-Long Documents

arXiv:2602.23370v1 Announce Type: cross Abstract: Long-document topic segmentation plays an important role in information retrieval and document understanding, yet existing methods still show clear shortcomings in ultra-long text settings. Traditional discriminative models are constrained by fixed windows and cannot model...

1 min 1 month, 1 week ago

LOW Academic International

Hello-Chat: Towards Realistic Social Audio Interactions

arXiv:2602.23387v1 Announce Type: cross Abstract: Recent advancements in Large Audio Language Models (LALMs) have demonstrated exceptional performance in speech recognition and translation. However, existing models often suffer from a disconnect between perception and expression, resulting in a robotic "read-speech" style...

1 min 1 month, 1 week ago

LOW Academic International

Task-Lens: Cross-Task Utility Based Speech Dataset Profiling for Low-Resource Indian Languages

arXiv:2602.23388v1 Announce Type: cross Abstract: The rising demand for inclusive speech technologies amplifies the need for multilingual datasets for Natural Language Processing (NLP) research. However, limited awareness of existing task-specific resources in low-resource languages hinders research. This challenge is especially...

1 min 1 month, 1 week ago

LOW Academic International

TaCarla: A comprehensive benchmarking dataset for end-to-end autonomous driving

arXiv:2602.23499v1 Announce Type: cross Abstract: Collecting a high-quality dataset is a critical task that demands meticulous attention to detail, as overlooking certain aspects can render the entire dataset unusable. Autonomous driving challenges remain a prominent area of research, requiring further...

1 min 1 month, 1 week ago
