TemporalBench: A Benchmark for Evaluating LLM-Based Agents on Contextual and Event-Informed Time Series Tasks
arXiv:2602.13272v1 Announce Type: new Abstract: It is unclear whether strong forecasting performance reflects genuine temporal understanding or the ability to reason under contextual and event-driven …
Muyan Weng, Defu Cao, Wei Yang, Yashaswi Sharma, Yan Liu
3 views