SymTorch: A Framework for Symbolic Distillation of Deep Neural Networks
arXiv:2602.21307v1 (Announce Type: new)

Abstract: Symbolic distillation replaces neural networks, or components thereof, with interpretable, closed-form mathematical expressions. This approach has shown promise in discovering physical laws and mathematical relationships directly from trained deep learning models, yet adoption remains limited due to the engineering barrier of integrating symbolic regression into deep learning workflows. We introduce SymTorch, a library that automates this distillation by wrapping neural network components, collecting their input-output behavior, and approximating them with human-readable equations via PySR. SymTorch handles the engineering challenges that have hindered adoption: GPU-CPU data transfer, input-output caching, model serialization, and seamless switching between neural and symbolic forward passes. We demonstrate SymTorch across diverse architectures including GNNs, PINNs, and transformer models. Finally, we present a proof-of-concept for accelerating LLM inference by replacing MLP layers with symbolic surrogates, achieving an 8.3% throughput improvement with moderate performance degradation.
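The workflow the abstract describes — wrap a component, record its input-output behavior, fit a closed-form expression, then serve that expression in place of the component — can be sketched in a few lines. The sketch below is illustrative only: `SymbolicWrapper` and `distill` are hypothetical names, not SymTorch's API, and the "regressor" is a stand-in for a symbolic-regression backend such as PySR (which searches expression space over the cached data).

```python
# Illustrative sketch of the wrap -> cache -> fit -> swap pattern.
# SymbolicWrapper / distill are hypothetical names, not the SymTorch API.

class SymbolicWrapper:
    def __init__(self, component):
        self.component = component   # the original "neural" callable
        self.surrogate = None        # closed-form replacement, once fitted
        self.cache = []              # recorded (input, output) pairs

    def __call__(self, x):
        if self.surrogate is not None:
            return self.surrogate(x)      # symbolic forward pass
        y = self.component(x)
        self.cache.append((x, y))         # collect behavior for regression
        return y

    def distill(self, fit):
        # `fit` stands in for a symbolic-regression backend (e.g. PySR):
        # it maps the cached (x, y) pairs to a closed-form function.
        self.surrogate = fit(self.cache)

# Toy "network component": it happens to compute 2x^2 + 1 exactly.
wrapped = SymbolicWrapper(lambda x: 2.0 * x * x + 1.0)

# Neural forward passes populate the cache.
outputs = [wrapped(x) for x in (0.0, 1.0, 2.0, 3.0)]  # [1.0, 3.0, 9.0, 19.0]

# Stand-in regressor returning the known closed form; a real backend
# would recover an expression like this from the cached pairs.
wrapped.distill(lambda cache: (lambda x: 2.0 * x * x + 1.0))

surrogate_out = wrapped(5.0)  # 51.0, now served by the symbolic surrogate
```

In a PyTorch setting, the caching step would additionally move activations from GPU to CPU (symbolic regression runs on the CPU), which is one of the engineering burdens the abstract says SymTorch automates.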
Executive Summary
This article presents SymTorch, a novel framework for symbolic distillation of deep neural networks, which automates the process of replacing neural network components with interpretable, closed-form mathematical expressions. By leveraging PySR, SymTorch overcomes engineering challenges associated with integrating symbolic regression into deep learning workflows, enabling seamless switching between neural and symbolic forward passes. The framework is demonstrated across various architectures, including GNNs, PINNs, and transformer models, with a proof-of-concept showing an 8.3% throughput improvement in LLM inference by replacing MLP layers with symbolic surrogates. This development has significant implications for accelerating neural network computations and facilitating the discovery of physical laws and mathematical relationships.
Key Points
- ▸ SymTorch automates symbolic distillation of deep neural networks
- ▸ Overcomes the engineering challenges of integrating symbolic regression into deep learning workflows (GPU-CPU data transfer, input-output caching, model serialization)
- ▸ Demonstrated across various architectures, including GNNs, PINNs, and transformer models
Merits
Strength in Symbolic Representation
SymTorch enables the conversion of complex neural network components into human-readable equations, facilitating interpretability and understanding of deep learning models.
Scalability and Efficiency
Seamless switching between neural and symbolic forward passes, together with the proof-of-concept 8.3% throughput improvement in LLM inference, demonstrates the framework's potential for accelerating neural network computations.
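The "seamless switching" merit amounts to keeping one module interface that can serve either path, so callers need no changes and the swap is revertible if the surrogate degrades accuracy too much. A minimal sketch of that pattern (illustrative names, not SymTorch's API):

```python
# Sketch of a revertible neural/symbolic switch behind one interface.
# SwitchableModule and its fields are hypothetical, not the SymTorch API.

class SwitchableModule:
    def __init__(self, neural_fn, symbolic_fn):
        self.neural_fn = neural_fn
        self.symbolic_fn = symbolic_fn
        self.use_symbolic = False          # flipped after distillation

    def forward(self, x):
        fn = self.symbolic_fn if self.use_symbolic else self.neural_fn
        return fn(x)

mod = SwitchableModule(
    neural_fn=lambda x: x * 3 + 0.0001,    # "trained" layer, tiny residual
    symbolic_fn=lambda x: x * 3,           # distilled closed form
)

neural_out = mod.forward(2.0)      # ~6.0001 via the neural path
mod.use_symbolic = True
symbolic_out = mod.forward(2.0)    # 6.0 via the symbolic surrogate
mod.use_symbolic = False           # revert if degradation is unacceptable
```

The 8.3% throughput gain reported for LLM inference would come from the symbolic path being cheaper than the MLP it replaces, at the cost of the surrogate's approximation error.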
Demerits
Data Transfer and Caching Complexity
The distillation workflow depends on GPU-CPU data transfer and input-output caching; although SymTorch automates both, this added machinery may introduce overhead and complexity that hinder adoption in some applications.
Performance Degradation
The proof-of-concept shows moderate performance degradation when replacing MLP layers with symbolic surrogates, which may limit the framework's applicability in certain scenarios.
Expert Commentary
SymTorch represents a significant step forward for symbolic distillation of deep neural networks. By automating the replacement of network components with interpretable, closed-form mathematical expressions, it removes the engineering barriers that have kept symbolic regression out of mainstream deep learning workflows. Seamless switching between neural and symbolic forward passes, demonstrated across diverse architectures including GNNs, PINNs, and transformer models, underscores its potential for accelerating neural network computations. The added complexity of data transfer and caching, and the performance degradation observed when MLP layers are replaced with symbolic surrogates, remain open concerns. Even so, the framework could drive meaningful advances in science and engineering, and its implications merit careful study.
Recommendations
- ✓ Further research should focus on addressing the data transfer and caching complexity associated with SymTorch, as well as exploring strategies to mitigate performance degradation.
- ✓ The development of SymTorch-based frameworks for specific applications, such as scientific computing and engineering, should be prioritized to realize the framework's full potential.