Labor & Employment

LOW Academic International

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

arXiv:2603.02239v1 Announce Type: new Abstract: The Engineering Reasoning and Instruction (ERI) benchmark is a taxonomy-driven instruction dataset designed to train and evaluate engineering-capable large language models (LLMs) and agents. This dataset spans nine engineering fields (namely: civil, mechanical, electrical, chemical,...

1 min 1 month, 1 week ago

ada

LOW Academic United States

SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory Poisoning

arXiv:2603.02240v1 Announce Type: new Abstract: We present SuperLocalMemory, a local-first memory system for multi-agent AI that defends against OWASP ASI06 memory poisoning through architectural isolation and Bayesian trust scoring, while personalizing retrieval through adaptive learning-to-rank -- all without cloud dependencies...

1 min 1 month, 1 week ago

ada

LOW Academic European Union

A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities

arXiv:2603.02540v1 Announce Type: new Abstract: Large language models (LLMs) exhibit a unified "general factor" of capability across 10 benchmarks, a finding confirmed by our factor analysis of 156 models, yet they still struggle with simple, trivial tasks for humans. This...

1 min 1 month, 1 week ago

ada

LOW Academic International

AgentAssay: Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows

arXiv:2603.02601v1 Announce Type: new Abstract: Autonomous AI agents are deployed at unprecedented scale, yet no principled methodology exists for verifying that an agent has not regressed after changes to its prompts, tools, models, or orchestration logic. We present AgentAssay, the...

1 min 1 month, 1 week ago

ada

LOW Academic International

See and Remember: A Multimodal Agent for Web Traversal

arXiv:2603.02626v1 Announce Type: new Abstract: Autonomous web navigation requires agents to perceive complex visual environments and maintain long-term context, yet current Large Language Model (LLM) based agents often struggle with spatial disorientation and navigation loops. In this paper, we propose...

1 min 1 month, 1 week ago

ada

LOW Academic International

A Natural Language Agentic Approach to Study Affective Polarization

arXiv:2603.02711v1 Announce Type: new Abstract: Affective polarization has been central to political and social studies, with growing focus on social media, where partisan divisions are often exacerbated. Real-world studies tend to have limited scope, while simulated studies suffer from insufficient...

1 min 1 month, 1 week ago

labor

LOW Academic International

Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

arXiv:2603.02798v1 Announce Type: new Abstract: As LLM-powered agents have been used for high-stakes decision-making, such as clinical diagnosis, it becomes critical to develop reliable verification of their decisions to facilitate trustworthy deployment. Yet, existing verifiers usually underperform owing to a...

1 min 1 month, 1 week ago

discrimination

LOW Academic International

SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training

arXiv:2603.02908v1 Announce Type: new Abstract: In recent years, pre-trained large language models have achieved remarkable success across diverse tasks. Besides the pivotal role of self-supervised pre-training, their effectiveness in downstream applications also depends critically on the post-training process, which adapts...

1 min 1 month, 1 week ago

ada

LOW Academic International

ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization

arXiv:2603.02939v1 Announce Type: new Abstract: Recent advancements in reinforcement fine-tuning have significantly improved the reasoning ability of large language models (LLMs). In particular, methods such as group relative policy optimization (GRPO) have demonstrated strong capabilities across various fields. However, applying...

1 min 1 month, 1 week ago

ada

LOW Academic International

Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification

arXiv:2603.03175v1 Announce Type: new Abstract: Saarthi is an agentic AI framework that uses multi-agent collaboration to perform end-to-end formal verification. Even though the framework provides a complete flow from specification to coverage closure, with around 40% efficacy, there are several...

1 min 1 month, 1 week ago

labor

LOW Academic International

No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models

arXiv:2603.03203v1 Announce Type: new Abstract: CDD, or Contamination Detection via output Distribution, identifies data contamination by measuring the peakedness of a model's sampled outputs. We study the conditions under which this approach succeeds and fails on small language models ranging...

1 min 1 month, 1 week ago

ada

LOW Academic International

Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

arXiv:2603.03242v1 Announce Type: new Abstract: Language models deployed in online communities must adapt to norms that vary across social, cultural, and domain-specific contexts. Prior alignment approaches rely on explicit preference supervision or predefined principles, which are effective for well-resourced settings...

1 min 1 month, 1 week ago

ada

LOW Academic International

RO-N3WS: Enhancing Generalization in Low-Resource ASR with Diverse Romanian Speech Benchmarks

arXiv:2603.02368v1 Announce Type: new Abstract: We introduce RO-N3WS, a benchmark Romanian speech dataset designed to improve generalization in automatic speech recognition (ASR), particularly in low-resource and out-of-distribution (OOD) conditions. RO-N3WS comprises over 126 hours of transcribed audio collected from broadcast...

1 min 1 month, 1 week ago

ada

LOW Academic International

GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR

arXiv:2603.02464v1 Announce Type: new Abstract: Automatic Speech Recognition (ASR) in dialect-heavy settings remains challenging due to strong regional variation and limited labeled data. We propose GLoRIA, a parameter-efficient adaptation framework that leverages metadata (e.g., coordinates) to modulate low-rank updates in...

1 min 1 month, 1 week ago

ada

LOW Academic United States

ExpGuard: LLM Content Moderation in Specialized Domains

arXiv:2603.02588v1 Announce Type: new Abstract: With the growing deployment of large language models (LLMs) in real-world applications, establishing robust safety guardrails to moderate their inputs and outputs has become essential to ensure adherence to safety policies. Current guardrail models predominantly...

1 min 1 month, 1 week ago

ada

LOW Academic International

MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

arXiv:2603.03680v1 Announce Type: new Abstract: Large Language Model (LLM) agents have demonstrated remarkable proficiency in learned tasks, yet they often struggle to adapt to non-stationary environments with feedback. While In-Context Learning and external memory offer some flexibility, they fail to...

1 min 1 month, 1 week ago

ada

LOW Academic International

RAGNav: A Retrieval-Augmented Topological Reasoning Framework for Multi-Goal Visual-Language Navigation

arXiv:2603.03745v1 Announce Type: new Abstract: Vision-Language Navigation (VLN) is evolving from single-point pathfinding toward the more challenging Multi-Goal VLN. This task requires agents to accurately identify multiple entities while collaboratively reasoning over their spatial-physical constraints and sequential execution order. However,...

1 min 1 month, 1 week ago

labor

LOW Academic United States

Specification-Driven Generation and Evaluation of Discrete-Event World Models via the DEVS Formalism

arXiv:2603.03784v1 Announce Type: new Abstract: World models are essential for planning and evaluation in agentic systems, yet existing approaches lie at two extremes: hand-engineered simulators that offer consistency and reproducibility but are costly to adapt, and implicit neural models that...

1 min 1 month, 1 week ago

ada

LOW Academic International

In-Context Environments Induce Evaluation-Awareness in Language Models

arXiv:2603.03824v1 Announce Type: new Abstract: Humans often become more self-aware under threat, yet can lose self-awareness when absorbed in a task; we hypothesize that language models exhibit environment-dependent \textit{evaluation awareness}. This raises concerns that models could strategically underperform, or \textit{sandbag},...

1 min 1 month, 1 week ago

ada

LOW Academic International

Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

arXiv:2603.04191v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly serving as personal assistants, where users share complex and diverse preferences over extended interactions. However, assessing how well LLMs can follow these preferences in realistic, long-term situations remains underexplored....

1 min 1 month, 1 week ago

ada

LOW Academic International

Capability Thresholds and Manufacturing Topology: How Embodied Intelligence Triggers Phase Transitions in Economic Geography

arXiv:2603.04457v1 Announce Type: new Abstract: The fundamental topology of manufacturing has not undergone a paradigm-level transformation since Henry Ford's moving assembly line in 1913. Every major innovation of the past century, from the Toyota Production System to Industry 4.0, has...

1 min 1 month, 1 week ago

labor

LOW Academic International

Adaptive Memory Admission Control for LLM Agents

arXiv:2603.04549v1 Announce Type: new Abstract: LLM-based agents increasingly rely on long-term memory to support multi-session reasoning and interaction, yet current systems provide little control over what information is retained. In practice, agents either accumulate large volumes of conversational content, including...

1 min 1 month, 1 week ago

ada

LOW Academic United States

From Offline to Periodic Adaptation for Pose-Based Shoplifting Detection in Real-world Retail Security

arXiv:2603.04723v1 Announce Type: new Abstract: Shoplifting is a growing operational and economic challenge for retailers, with incidents rising and losses increasing despite extensive video surveillance. Continuous human monitoring is infeasible, motivating automated, privacy-preserving, and resource-aware detection solutions. In this paper,...

1 min 1 month, 1 week ago

ada

LOW Academic United States

Visioning Human-Agentic AI Teaming: Continuity, Tension, and Future Research

arXiv:2603.04746v1 Announce Type: new Abstract: Artificial intelligence is undergoing a structural transformation marked by the rise of agentic systems capable of open-ended action trajectories, generative representations and outputs, and evolving objectives. These properties introduce structural uncertainty into human-AI teaming (HAT),...

1 min 1 month, 1 week ago

ada

LOW Academic International

VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

arXiv:2603.04822v1 Announce Type: new Abstract: Aligning Large Language Models (LLMs) with nuanced human values remains a critical challenge, as existing methods like Reinforcement Learning from Human Feedback (RLHF) often handle only coarse-grained attributes. In practice, fine-tuning LLMs on task-specific datasets...

1 min 1 month, 1 week ago

ada

LOW Academic European Union

Design Behaviour Codes (DBCs): A Taxonomy-Driven Layered Governance Benchmark for Large Language Models

arXiv:2603.04837v1 Announce Type: new Abstract: We introduce the Dynamic Behavioral Constraint (DBC) benchmark, the first empirical framework for evaluating the efficacy of a structured, 150-control behavioral governance layer, the MDBC (Madan DBC) system, applied at inference time to large language...

1 min 1 month, 1 week ago

ada

LOW Academic International

SEA-TS: Self-Evolving Agent for Autonomous Code Generation of Time Series Forecasting Algorithms

arXiv:2603.04873v1 Announce Type: new Abstract: Accurate time series forecasting underpins decision-making across domains, yet conventional ML development suffers from data scarcity in new deployments, poor adaptability under distribution shift, and diminishing returns from manual iteration. We propose Self-Evolving Agent for...

1 min 1 month, 1 week ago

ada

LOW Academic International

Bounded State in an Infinite Horizon: Proactive Hierarchical Memory for Ad-Hoc Recall over Streaming Dialogues

arXiv:2603.04885v1 Announce Type: new Abstract: Real-world dialogue usually unfolds as an infinite stream. It thus requires bounded-state memory mechanisms to operate within an infinite horizon. However, existing read-then-think memory is fundamentally misaligned with this setting, as it cannot support ad-hoc...

1 min 1 month, 1 week ago

ada

LOW Academic International

Authorize-on-Demand: Dynamic Authorization with Legality-Aware Intellectual Property Protection for VLMs

arXiv:2603.04896v1 Announce Type: new Abstract: The rapid adoption of vision-language models (VLMs) has heightened the demand for robust intellectual property (IP) protection of these high-value pretrained models. Effective IP protection should proactively confine model deployment within authorized domains and prevent...

1 min 1 month, 1 week ago

ada

LOW Academic International

Knowledge-informed Bidding with Dual-process Control for Online Advertising

arXiv:2603.04920v1 Announce Type: new Abstract: Bid optimization in online advertising relies on black-box machine-learning models that learn bidding decisions from historical data. However, these approaches fail to replicate human experts' adaptive, experience-driven, and globally coherent decisions. Specifically, they generalize poorly...

1 min 1 month, 1 week ago

ada

Engineering Reasoning and Instruction (ERI) Benchmark: A Large Taxonomy-driven Dataset for Foundation Models and Agents

SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory Poisoning

A Neuropsychologically Grounded Evaluation of LLM Cognitive Abilities

AgentAssay: Token-Efficient Regression Testing for Non-Deterministic AI Agent Workflows

See and Remember: A Multimodal Agent for Web Traversal

A Natural Language Agentic Approach to Study Affective Polarization

Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

SAE as a Crystal Ball: Interpretable Features Predict Cross-domain Transferability of LLMs without Training

ShipTraj-R1: Reinforcing Ship Trajectory Prediction in Large Language Models via Group Relative Policy Optimization

Saarthi for AGI: Towards Domain-Specific General Intelligence for Formal Verification

No Memorization, No Detection: Output Distribution-Based Contamination Detection in Small Language Models

Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

RO-N3WS: Enhancing Generalization in Low-Resource ASR with Diverse Romanian Speech Benchmarks

GLoRIA: Gated Low-Rank Interpretable Adaptation for Dialectal ASR

ExpGuard: LLM Content Moderation in Specialized Domains

MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation

RAGNav: A Retrieval-Augmented Topological Reasoning Framework for Multi-Goal Visual-Language Navigation

Specification-Driven Generation and Evaluation of Discrete-Event World Models via the DEVS Formalism

In-Context Environments Induce Evaluation-Awareness in Language Models

Towards Realistic Personalization: Evaluating Long-Horizon Preference Following in Personalized User-LLM Interactions

Capability Thresholds and Manufacturing Topology: How Embodied Intelligence Triggers Phase Transitions in Economic Geography

Adaptive Memory Admission Control for LLM Agents

From Offline to Periodic Adaptation for Pose-Based Shoplifting Detection in Real-world Retail Security

Visioning Human-Agentic AI Teaming: Continuity, Tension, and Future Research

VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

Design Behaviour Codes (DBCs): A Taxonomy-Driven Layered Governance Benchmark for Large Language Models

SEA-TS: Self-Evolving Agent for Autonomous Code Generation of Time Series Forecasting Algorithms

Bounded State in an Infinite Horizon: Proactive Hierarchical Memory for Ad-Hoc Recall over Streaming Dialogues

Authorize-on-Demand: Dynamic Authorization with Legality-Aware Intellectual Property Protection for VLMs

Knowledge-informed Bidding with Dual-process Control for Online Advertising

Impact Distribution

Related Practice Areas

JCG, PC

HSOLLC Co., Ltd.