Benchmarking State Space Models, Transformers, and Recurrent Networks for US Grid Forecasting
arXiv:2602.21415v1
Abstract: Selecting the right deep learning model for power grid forecasting is challenging, as performance heavily depends on the data available to the operator. This paper presents a comprehensive benchmark of five modern neural architectures: two state space models (PowerMamba, S-Mamba), two Transformers (iTransformer, PatchTST), and a traditional LSTM. We evaluate these models on hourly electricity demand across six diverse US power grids for forecast windows between 24 and 168 hours. To ensure a fair comparison, we adapt each model with specialized temporal processing and a modular layer that cleanly integrates weather covariates. Our results reveal that there is no single best model for all situations. When forecasting using only historical load, PatchTST and the state space models provide the highest accuracy. However, when explicit weather data is added to the inputs, the rankings reverse: iTransformer improves its accuracy three times more efficiently than PatchTST. By controlling for model size, we confirm that this advantage stems from the architecture's inherent ability to mix information across different variables. Extending our evaluation to solar generation, wind power, and wholesale prices further demonstrates that model rankings depend on the forecast task: PatchTST excels on highly rhythmic signals like solar, while state space models are better suited for the chaotic fluctuations of wind and price. Ultimately, this benchmark provides grid operators with actionable guidelines for selecting the optimal forecasting architecture based on their specific data environments.
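To make the evaluation protocol concrete, the sketch below shows a toy walk-forward benchmark over the paper's forecast horizons (24 to 168 hours) using mean absolute error. The seasonal-naive baseline, the window sizes, and the function name are illustrative assumptions, not the authors' code; in the actual study each slot would be filled by a trained model such as PatchTST or S-Mamba.

```python
import numpy as np

def evaluate_horizons(series, horizons=(24, 72, 168), context=336):
    """Toy walk-forward evaluation: for each horizon H, forecast the next H
    hours with a seasonal-naive baseline (repeat the last 24 hours) and
    report the average MAE. A trained model would replace the baseline."""
    results = {}
    for h in horizons:
        errs = []
        # slide a forecast origin over the series in 24-hour steps
        for start in range(context, len(series) - h, 24):
            history = series[start - context:start]
            forecast = np.tile(history[-24:], h // 24 + 1)[:h]  # seasonal naive
            errs.append(np.mean(np.abs(forecast - series[start:start + h])))
        results[h] = float(np.mean(errs))
    return results

# synthetic hourly "load": a daily cycle plus noise, standing in for real demand
rng = np.random.default_rng(0)
t = np.arange(24 * 60)
load = 100 + 20 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)
print(evaluate_horizons(load))
```

Ranking several such models by per-horizon MAE is the comparison the paper performs at scale across six grids.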
Executive Summary
This article presents a comprehensive benchmark of five modern neural architectures for US grid forecasting, evaluating their performance on hourly electricity demand across six diverse power grids. The results reveal that no single model is best for all situations: PatchTST and the state space models excel when only historical load is available, while iTransformer pulls ahead once explicit weather data is added. By controlling for model size, the study traces this advantage to iTransformer's ability to mix information across variables, and by extending the evaluation to solar, wind, and price forecasting it shows that model rankings also depend on the forecast task. The benchmark provides grid operators with actionable guidelines for selecting the optimal forecasting architecture based on their data environments.
Key Points
- ▸ The study evaluates five modern neural architectures for US grid forecasting: PowerMamba, S-Mamba, iTransformer, PatchTST, and LSTM.
- ▸ The results reveal that no single model is best for all situations, with performance depending on the availability of explicit weather data.
- ▸ PatchTST and state space models excel on historical load forecasting, while iTransformer outperforms with the addition of explicit weather data.
- ▸ The study demonstrates that rankings depend on the forecast task, with PatchTST excelling on highly rhythmic signals and state space models better suited for chaotic fluctuations.
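The "explicit weather data" distinction in the points above comes down to how inputs are assembled. The sketch below is a hypothetical stand-in for the paper's modular covariate layer (the function name and normalization choice are assumptions): weather traces are stacked with the load series along a variable axis, so a channel-mixing model like iTransformer can attend across variables, while a channel-independent model like PatchTST processes each row separately.

```python
import numpy as np

def build_inputs(load, weather=None):
    """Illustrative covariate layer: stack the load series with optional
    weather channels into shape (num_vars, context_len), then normalize
    each channel. Channel-mixing models consume all rows jointly;
    channel-independent models treat each row on its own."""
    channels = [load]
    if weather is not None:
        channels.extend(weather)          # e.g. temperature, wind speed
    x = np.stack(channels)                # (num_vars, context_len)
    return (x - x.mean(axis=1, keepdims=True)) / (x.std(axis=1, keepdims=True) + 1e-8)

load = np.sin(np.linspace(0, 12 * np.pi, 336))           # 14 days of hourly "load"
temp = 15 + 5 * np.sin(np.linspace(0, 12 * np.pi, 336))  # matching temperature trace
x_univariate = build_inputs(load)            # load-only setting: shape (1, 336)
x_covariate = build_inputs(load, [temp])     # weather-aware setting: shape (2, 336)
```

Because the extra rows only help a model that can exchange information between them, this input construction is what lets the benchmark attribute iTransformer's weather-driven gains to its cross-variable mixing.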
Merits
Comprehensive Benchmark
The study provides a comprehensive benchmark of five modern neural architectures, offering a detailed comparison of their performance across load, solar, wind, and wholesale-price forecasting tasks.
Control for Model Size
The study controls for model size, allowing for a fair comparison of the architectures' performance and providing insight into their inherent abilities.
Actionable Guidelines
The study provides grid operators with actionable guidelines for selecting the optimal forecasting architecture based on their specific data environments.
Demerits
Limited Generalizability
The study focuses on US power grids, and its findings may not be directly applicable to other regions or grid systems.
Lack of Real-World Deployment
The study does not provide any information on the real-world deployment of the models, making it difficult to assess their practical applicability.
Expert Commentary
The study provides a comprehensive and well-designed benchmark of modern neural architectures for US grid forecasting. The results are insightful and contribute to the growing body of research on deep learning in energy forecasting. However, the study's limitations, such as the lack of real-world deployment and limited generalizability, should be considered when interpreting the findings. The study's implications for grid resilience and reliability are significant, and its findings have the potential to inform the development of more accurate and efficient forecasting systems.
Recommendations
- ✓ Future studies should investigate the application of modern neural architectures to other types of grid systems and regions.
- ✓ Researchers should prioritize the development of more accurate and efficient forecasting systems, leveraging the insights gained from this study to inform their work.