Academic

Latest First Most Viewed Alphabetical

All Conference (266) Law Review (314) Academic (4957) Think Tank (60) News (791) Journal (139) Technology & AI (4) Business & Strategy (1) Finance & Economics (2) Legal & Compliance (1) Innovation & Research (0) International Affairs (2) Cybersecurity (2) Healthcare & Biotech (2)

Academic · 1 min

On the Ability of Transformers to Verify Plans

arXiv:2603.19954v1 Announce Type: new Abstract: Transformers have shown inconsistent success in AI planning tasks, and theoretical understanding of when generalization should be expected has been …

Yash Sarrof, Yupei Du, Katharina Stein, Alexander Koller, Sylvie Thi\'ebaux, Michael Hahn

32 views Mar 23

Academic · 1 min

A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

arXiv:2603.19685v1 Announce Type: new Abstract: Large language model (LLM)-based agents have emerged as powerful autonomous controllers for digital environments, including mobile interfaces, operating systems, and …

Taiyi Wang, Sian Gooding, Florian Hartmann, Oriana Riva, Edward Grefenstette

101 views Mar 23

Academic · 1 min

LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

arXiv:2603.19255v1 Announce Type: cross Abstract: Despite the strong performance of Large Language Models (LLMs) on complex instruction-following tasks, precise control of output length remains a …

Wei Zhang, Lintong Du, Yuanhe Zhang, Zhenhong Zhou, Kun Wang, Li Sun, Sen Su

33 views Mar 23

Academic · 1 min

Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs

arXiv:2603.20046v1 Announce Type: new Abstract: Reinforcement Learning (RL) with rubric-based rewards has recently shown remarkable progress in enhancing general reasoning capabilities of Large Language Models …

Wenjian Zhang, Kongcheng Zhang, Jiaxin Qi, Baisheng Lai, Jianqiang Huang

28 views Mar 23

Academic · 1 min

Pitfalls in Evaluating Interpretability Agents

arXiv:2603.20101v1 Announce Type: new Abstract: Automated interpretability systems aim to reduce the need for human labor and scale analysis to increasingly large models and diverse …

Tal Haklay, Nikhil Prakash, Sana Pandey, Antonio Torralba, Aaron Mueller, Jacob Andreas, Tamar Rott Shaham, Yonatan Belinkov

31 views Mar 23

Academic · 1 min

PowerLens: Taming LLM Agents for Safe and Personalized Mobile Power Management

arXiv:2603.19584v1 Announce Type: new Abstract: Battery life remains a critical challenge for mobile devices, yet existing power management mechanisms rely on static rules or coarse-grained …

Xingyu Feng, Chang Sun, Yuzhu Wang, Zhangbing Zhou, Chengwen Luo, Zhuangzhuang Chen, Xiaomin Ouyang, Huanqi Yang

120 views Mar 23

Academic · 1 min

Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification

arXiv:2603.19715v1 Announce Type: new Abstract: Formal verification via interactive theorem proving is increasingly used to ensure the correctness of critical systems, yet constructing large proof …

Baoding He, Zenan Li, Wei Sun, Yuan Yao, Taolue Chen, Xiaoxing Ma, Zhendong Su

39 views Mar 23

Academic · 1 min

Hyperagents

arXiv:2603.19461v1 Announce Type: new Abstract: Self-improving AI systems aim to reduce reliance on human engineering by learning to improve their own learning and problem-solving processes. …

Jenny Zhang, Bingchen Zhao, Wannan Yang, Jakob Foerster, Jeff Clune, Minqi Jiang, Sam Devlin, Tatiana Shavrina

28 views Mar 23

Academic · 1 min

Breeze Taigi: Benchmarks and Models for Taiwanese Hokkien Speech Recognition and Synthesis

arXiv:2603.19259v1 Announce Type: cross Abstract: Taiwanese Hokkien (Taigi) presents unique opportunities for advancing speech technology methodologies that can generalize to diverse linguistic contexts. We introduce …

Yu-Siang Lan, Chia-Sheng Liu, Yi-Chang Chen, Po-Chun Hsu, Allyson Chiu, Shun-Wen Lin, Da-shan Shiu, Yuan-Fu Liao

49 views Mar 23

Academic · 1 min

GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

arXiv:2603.19252v1 Announce Type: cross Abstract: Evaluating the symbolic reasoning of large language models (LLMs) calls for geometry benchmarks that require multi-step proofs grounded in both …

Yushun Zhang, Weiping Fu, Zesheng Yang, Bo Zhao, Lingling Zhang, Jian Zhang, Yumeng Fu, Jiaxing Huang, Jun Liu

34 views Mar 23

Academic · 1 min

PA2D-MORL: Pareto Ascent Directional Decomposition based Multi-Objective Reinforcement Learning

arXiv:2603.19579v1 Announce Type: new Abstract: Multi-objective reinforcement learning (MORL) provides an effective solution for decision-making problems involving conflicting objectives. However, achieving high-quality approximations to the …

Tianmeng Hu, Biao Luo

30 views Mar 23

Academic · 1 min

L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

arXiv:2603.19236v1 Announce Type: cross Abstract: The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework provides a rigorous foundation for evidence synthesis, yet the …

Samar Shailendra, Rajan Kadel, Aakanksha Sharma, Islam Mohammad Tahidul, Urvashi Rahul Saxena

77 views Mar 23

← Previous

89 90 91 92 93

Academic

On the Ability of Transformers to Verify Plans

A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

LARFT: Closing the Cognition-Action Gap for Length Instruction Following in Large Language Models

Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs

Pitfalls in Evaluating Interpretability Agents

PowerLens: Taming LLM Agents for Safe and Personalized Mobile Power Management

Stepwise: Neuro-Symbolic Proof Search for Automated Systems Verification

Hyperagents

Breeze Taigi: Benchmarks and Models for Taiwanese Hokkien Speech Recognition and Synthesis

GeoChallenge: A Multi-Answer Multiple-Choice Benchmark for Geometric Reasoning with Diagrams

PA2D-MORL: Pareto Ascent Directional Decomposition based Multi-Objective Reinforcement Learning

L-PRISMA: An Extension of PRISMA in the Era of Generative Artificial Intelligence (GenAI)

JCG, PC

HSOLLC Co., Ltd.