A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset

Tashreef Muhammad, Tahsin Ahmed, Meherun Farzana, Md. Mahmudul Hasan, Abrar Eyasir, Md. Emon Khan, Mahafuzul Islam Shawon, Ferdous Mondol, Mahmudul Hasan, Muhammad Ibrahim · April 9, 2026

#cs.LG #econ.EM

arXiv:2604.06227v1

Abstract: Accurate short-term forecasting of agricultural commodity prices is critical for food security planning and smallholder income stabilisation in developing economies, yet machine-learning-ready datasets for this purpose remain scarce in South Asia. This paper makes two contributions. First, we introduce AgriPriceBD, a benchmark dataset of 1,779 daily retail mid-prices for five Bangladeshi commodities - garlic, chickpea, green chilli, cucumber, and sweet pumpkin - spanning July 2020 to June 2025, extracted from government reports via an LLM-assisted digitisation pipeline. Second, we evaluate seven forecasting approaches spanning classical models - naïve persistence, SARIMA, and Prophet - and deep learning architectures - BiLSTM, Transformer, Time2Vec-enhanced Transformer, and Informer - with Diebold-Mariano statistical significance tests. Commodity price forecastability is fundamentally heterogeneous: naïve persistence dominates on near-random-walk commodities. Time2Vec temporal encoding provides no statistically significant advantage over fixed sinusoidal encoding and causes catastrophic degradation on green chilli (+146.1% MAE, p<0.001). Prophet fails systematically, attributable to discrete step-function price dynamics incompatible with its smooth decomposition assumptions. Informer produces erratic predictions (variance up to 50x ground-truth), confirming sparse-attention Transformers require substantially larger training sets than small agricultural datasets provide. All code, models, and data are released publicly to support replication and future forecasting research on agricultural commodity markets in Bangladesh and similar developing economies.

Executive Summary

This paper introduces AgriPriceBD, a dataset of 1,779 daily retail mid-prices for five Bangladeshi agricultural commodities (garlic, chickpea, green chilli, cucumber, and sweet pumpkin) spanning July 2020 to June 2025, extracted from government reports with an LLM-assisted digitisation pipeline. The authors benchmark seven classical and deep learning models for short-term price forecasting and compare them using Diebold-Mariano significance tests. Forecastability turns out to be heterogeneous across commodities: naïve persistence dominates on near-random-walk commodities, where more complex deep learning architectures add little. Prophet fails systematically because its smooth decomposition assumptions clash with discrete, step-like price dynamics, and Informer produces erratic predictions (variance up to 50x that of the ground truth) on a dataset this small. The study highlights the difficulty of applying sophisticated models to small, volatile agricultural datasets and releases its code, models, and data for future research.

Key Points

▸ Introduction of AgriPriceBD, a new daily retail price dataset for five Bangladeshi agricultural commodities (July 2020 - June 2025).
▸ LLM-assisted digitisation pipeline employed for dataset creation from government reports.
▸ Benchmarking of seven forecasting models: naïve persistence, SARIMA, Prophet, BiLSTM, Transformer, Time2Vec-enhanced Transformer, and Informer.
▸ Significant heterogeneity in commodity price forecastability, with naïve persistence dominating on near-random-walk commodities.
▸ Prophet's systematic failure attributed to incompatibility with discrete step-function price dynamics.
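▸ Informer's erratic predictions (variance up to 50x that of the ground truth) linked to insufficient training data for sparse-attention Transformers.
▸ Time2Vec temporal encoding provided no statistically significant advantage over fixed sinusoidal encoding and caused severe degradation on green chilli (+146.1% MAE, p<0.001).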
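
As a rough illustration of the baseline and metric behind these points, the sketch below implements naïve persistence and MAE in plain Python. The price series is hypothetical (it is not drawn from AgriPriceBD), and the function names are illustrative assumptions rather than the authors' code. The helper `relative_mae_change` shows how a figure such as "+146.1% MAE" would be derived from two MAE values.

```python
def persistence_forecast(prices, horizon=1):
    """Predict each price as the value observed `horizon` steps earlier."""
    if horizon < 1 or horizon >= len(prices):
        raise ValueError("horizon must be between 1 and len(prices) - 1")
    return list(prices[:-horizon])


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_mae_change(mae_model, mae_reference):
    """Percentage change in MAE of a model relative to a reference model."""
    return 100.0 * (mae_model - mae_reference) / mae_reference


# Hypothetical step-like retail prices (illustrative only).
prices = [100.0, 100.0, 100.0, 110.0, 110.0, 110.0, 105.0, 105.0]

predicted = persistence_forecast(prices, horizon=1)
actual = prices[1:]  # targets aligned with the one-step-ahead forecasts
baseline_mae = mean_absolute_error(actual, predicted)

print(f"Persistence MAE: {baseline_mae:.3f}")
```

On a series that stays flat between occasional jumps, persistence is wrong only on the days a jump occurs, which is one intuition for why it is hard to beat on near-random-walk commodities.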

Merits

Novel Dataset Contribution

The creation and public release of AgriPriceBD addresses a critical data scarcity issue in South Asia for agricultural commodity price forecasting, enabling future research and practical applications.

Rigorous Benchmarking Methodology

The comprehensive evaluation across classical and deep learning models, coupled with Diebold-Mariano statistical significance tests, provides robust and reliable comparative insights into model performance.
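
For readers unfamiliar with the Diebold-Mariano test, the following is a minimal sketch of the statistic under stated assumptions: absolute-error loss, a normal approximation for the p-value, and no small-sample correction. The abstract does not say which variant the authors used, so this should not be read as their implementation. The data are synthetic and the function name is hypothetical.

```python
import math
import random


def diebold_mariano(actual, pred_a, pred_b, horizon=1):
    """Two-sided DM test on absolute-error loss differentials.

    A negative statistic means forecast A has the lower average loss.
    """
    d = [abs(y - a) - abs(y - b) for y, a, b in zip(actual, pred_a, pred_b)]
    n = len(d)
    d_mean = sum(d) / n
    centered = [x - d_mean for x in d]

    # Long-run variance: lag-0 autocovariance plus lags 1..horizon-1.
    long_run_var = sum(c * c for c in centered) / n
    for lag in range(1, horizon):
        cov = sum(centered[t] * centered[t - lag] for t in range(lag, n)) / n
        long_run_var += 2.0 * cov
    if long_run_var <= 0.0:
        raise ValueError("non-positive variance estimate; test is undefined")

    stat = d_mean / math.sqrt(long_run_var / n)
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Synthetic example: forecast A has smaller errors than forecast B.
rng = random.Random(0)
actual = [100.0 + rng.gauss(0.0, 5.0) for _ in range(200)]
pred_a = [y + rng.gauss(0.0, 1.0) for y in actual]
pred_b = [y + rng.gauss(0.0, 3.0) for y in actual]

stat, p_value = diebold_mariano(actual, pred_a, pred_b)
print(f"DM statistic: {stat:.2f}, p-value: {p_value:.4g}")
```

The test compares loss differentials rather than raw errors, so a reported p-value speaks to whether one model's accuracy advantage over another is distinguishable from sampling noise.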

Transparency and Reproducibility

The public release of all code, models, and data is exemplary, fostering transparency, reproducibility, and collaborative research within the academic community.

Practical Insights into Model Limitations

The detailed analysis of why specific models (e.g., Prophet, Informer) fail under certain conditions offers crucial guidance for practitioners and researchers in selecting appropriate forecasting tools.

Demerits

Limited Scope of Economic Factors

The analysis focuses on technical forecasting and does not incorporate external economic or climatic variables that significantly influence agricultural prices, which may limit both model accuracy and explanatory power.

Absence of Feature Engineering Exploration

The study primarily compares raw model architectures without exploring the impact of feature engineering (e.g., lagged variables, moving averages, volatility indicators), which could enhance forecasting performance.
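
To make the kind of feature engineering meant here concrete, the sketch below builds lagged prices, a moving average, and a rolling standard deviation (a simple volatility indicator) from a price list. The lag set, the 7-day window, and the function name are illustrative assumptions, and the input series is hypothetical. Only values from before the target day are used, so the features do not leak the value being predicted.

```python
from statistics import mean, stdev


def build_features(prices, lags=(1, 7), window=7):
    """Return one feature dict per day that has enough history."""
    rows = []
    start = max(max(lags), window)
    for t in range(start, len(prices)):
        history = prices[t - window:t]  # strictly before day t
        row = {"target": prices[t]}
        for lag in lags:
            row[f"lag_{lag}"] = prices[t - lag]
        row[f"roll_mean_{window}"] = mean(history)
        row[f"roll_std_{window}"] = stdev(history)
        rows.append(row)
    return rows


# Hypothetical daily prices (illustrative only).
prices = [float(p) for p in range(100, 120)]
rows = build_features(prices)

print(len(rows), "rows; first row:", rows[0])
```

Rows like these could be fed to any tabular regressor; whether they would help on the step-like series described in the abstract is an open question that the paper, as summarised here, does not address.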

Forecasting Horizon Specificity

The abstract specifies 'short-term forecasting,' but a more precise definition of the forecast horizon (e.g., 1-day, 7-day) and its implications for model choice would strengthen the analysis.
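
One way to make the horizon explicit is to report the same metric at several horizons. The sketch below does this for naïve persistence on a hypothetical trending series. The horizons (1, 3, and 7 days) are illustrative assumptions, since the abstract does not state which horizon the benchmark uses.

```python
def persistence_mae_by_horizon(prices, horizons=(1, 3, 7)):
    """MAE of the persistence forecast at each horizon (in steps)."""
    results = {}
    for h in horizons:
        if h < 1 or h >= len(prices):
            raise ValueError(f"horizon {h} is not valid for this series")
        # The forecast for day t + h is simply the price on day t.
        errors = [abs(prices[t + h] - prices[t]) for t in range(len(prices) - h)]
        results[h] = sum(errors) / len(errors)
    return results


# Hypothetical series rising by 2 units per day (illustrative only).
prices = [100.0 + 2.0 * t for t in range(30)]

for h, err in persistence_mae_by_horizon(prices).items():
    print(f"horizon={h} days  MAE={err:.1f}")
```

On this trending series the persistence error grows in proportion to the horizon, which shows why a ranking of models at one horizon need not carry over to another.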

Expert Commentary

This paper is a commendable contribution to agricultural commodity price forecasting in developing economies, an area still held back by scarce data. The introduction of AgriPriceBD is particularly significant for that reason, since data scarcity remains a formidable barrier to applying advanced analytical techniques in such contexts. The benchmarking, which covers both classical and deep learning models and uses statistical significance tests, provides a solid foundation for future work. The findings on the heterogeneous forecastability of commodities and on the systematic failures of certain models (Prophet, Informer) due to data characteristics or model assumptions are the most useful results. They underscore a critical lesson: model complexity does not inherently equate to superior performance, especially with limited and volatile real-world data. The transparent release of all resources is laudable and sets a high standard for research in this domain. This work is likely to serve as a reference point for researchers and policymakers working to improve food security and economic stability in similar markets.

Recommendations

✓ Future research should explore the integration of exogenous variables (e.g., weather patterns, global market prices, macroeconomic indicators, supply chain disruptions) into the forecasting models to enhance accuracy and provide richer explanatory power.
✓ Investigate advanced feature engineering techniques and ensemble methods that combine the strengths of different models, particularly for commodities exhibiting mixed price dynamics.
✓ Conduct a sensitivity analysis on the forecasting horizon, evaluating model performance across various short-term periods (e.g., 1-day, 3-day, 7-day, 30-day) to provide more granular insights for different decision-making needs.
✓ Explore alternative deep learning architectures or adaptations specifically designed for small, high-volatility time series datasets, potentially incorporating transfer learning or meta-learning approaches.
✓ Consider incorporating uncertainty quantification (e.g., prediction intervals) into the forecasts, as point estimates alone may not provide sufficient information for risk-averse decision-making in agricultural markets.
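
As one simple illustration of the uncertainty quantification suggested in the last recommendation, the sketch below wraps a persistence forecast in an empirical 90% interval built from quantiles of past price changes. This is an assumption-laden example on a hypothetical series, not a method from the paper. It treats past changes as representative of future ones, which may not hold for volatile commodities.

```python
from statistics import quantiles


def persistence_interval_90(prices, horizon=1):
    """Persistence point forecast with an empirical 90% interval.

    Returns (lower, point, upper) for the price `horizon` steps ahead.
    """
    changes = [prices[t + horizon] - prices[t] for t in range(len(prices) - horizon)]
    cuts = quantiles(changes, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    point = prices[-1]
    return point + cuts[0], point, point + cuts[-1]


# Hypothetical step-like daily prices (illustrative only).
prices = [50.0, 50.0, 52.0, 52.0, 49.0, 49.0, 49.0, 55.0, 55.0, 53.0, 53.0,
          53.0, 50.0, 50.0, 51.0, 51.0, 51.0, 54.0, 54.0, 52.0, 52.0]

lower, point, upper = persistence_interval_90(prices, horizon=1)
print(f"next-day forecast: {point:.1f} (90% interval: {lower:.1f} to {upper:.1f})")
```

An interval like this gives a risk-averse user a sense of how far the next price could plausibly move, even when the point forecast is just the last observed price.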

Sources

Original: arXiv - cs.LG
