
pathsig: A GPU-Accelerated Library for Truncated and Projected Path Signatures


Tobias Nygaard

arXiv:2602.24066v1 (Announce Type: new). Abstract: Path signatures provide a rich representation of sequential data, with strong theoretical guarantees and good performance in a variety of machine-learning tasks. While signatures have progressed from fixed feature extractors to trainable components of machine-learning models, existing libraries often lack the required scalability for large-scale, gradient-based learning. To address this gap, this paper introduces pathsig, a PyTorch-native library that computes path signatures directly in the word basis. By using CUDA kernels to update signature coefficients in parallel over prefix-closed word sets, pathsig achieves high GPU throughput and near-minimal peak memory. Compared with other libraries, pathsig achieves 10-30x speedups for the computation of truncated signatures and up to 4-10x speedups in training tasks that require backpropagation through the signature. Beyond regular truncation, pathsig supports projections of the (infinite-dimensional) signature onto user-specified sets of words and anisotropic truncation motivated by inhomogeneous path regularity, enabling more compact representations that can reduce dimensionality, redundancy, and computational cost.
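The abstract's core idea, signature coefficients indexed by words and updated segment by segment, can be illustrated with a small pure-Python sketch. This is an illustrative reference computation via Chen's identity, not pathsig's API; all function names here are our own, and the dict-of-words representation is chosen for clarity rather than speed:

```python
from itertools import product
from math import factorial

def seg_sig(v, depth):
    """Signature of a single linear segment with increment v (a list of
    floats): the coefficient of word (i1, ..., ik) is v[i1]*...*v[ik] / k!."""
    d = len(v)
    sig = {(): 1.0}
    for k in range(1, depth + 1):
        for w in product(range(d), repeat=k):
            c = 1.0
            for i in w:
                c *= v[i]
            sig[w] = c / factorial(k)
    return sig

def chen(a, b):
    """Chen's identity: the signature of a concatenated path is the
    truncated tensor product (a*b)[w] = sum over splits w = u+v of a[u]*b[v].
    Assumes a and b are defined on the same word set."""
    return {w: sum(a[w[:j]] * b[w[j:]] for j in range(len(w) + 1)) for w in a}

def truncated_sig(increments, depth):
    """Truncated signature of a piecewise-linear path, given its segment
    increments, in the word basis."""
    d = len(increments[0])
    # The identity of the tensor algebra: 1 on the empty word, 0 elsewhere.
    sig = {(): 1.0}
    for k in range(1, depth + 1):
        for w in product(range(d), repeat=k):
            sig[w] = 0.0
    for v in increments:
        sig = chen(sig, seg_sig(v, depth))
    return sig
```

As a sanity check, concatenating two collinear segments must give the same signature as the single segment with the summed increment, since the signature is invariant under reparameterization of the same straight-line image.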

Executive Summary

The article introduces pathsig, a PyTorch-native library designed to compute path signatures directly in the word basis, leveraging CUDA kernels for parallel updates of signature coefficients. This approach enables high GPU throughput and near-minimal peak memory, resulting in significant speedups compared to existing libraries. The library supports various truncation methods, including regular truncation, projections onto user-specified word sets, and anisotropic truncation, allowing for more compact representations and reduced dimensionality.
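Why the abstract stresses prefix-closed word sets can be shown with a short sketch: in the per-segment Chen update, the new coefficient of a word w only consumes old coefficients of prefixes of w, so a prefix-closed set can be updated without ever materializing coefficients outside it. This is a hypothetical pure-Python illustration of that property, not pathsig's interface:

```python
from math import factorial

def project_sig(increments, words):
    """Signature coefficients of a piecewise-linear path, restricted to a
    prefix-closed set of words (tuples of channel indices).

    Prefix-closedness means every prefix of a kept word is also kept, so
    the Chen-style update below never reads a coefficient outside the set.
    """
    words = set(words) | {()}
    sig = {w: (1.0 if w == () else 0.0) for w in words}  # tensor-algebra identity
    for v in increments:
        new = {}
        for w in words:
            total = 0.0
            for j in range(len(w) + 1):
                tail = w[j:]
                c = 1.0
                for i in tail:
                    c *= v[i]
                # Single-segment coefficient of `tail` times the old
                # coefficient of the prefix w[:j], summed over all splits.
                total += sig[w[:j]] * c / factorial(len(tail))
            new[w] = total
        sig = new
    return sig
```

For the two-segment axis path (one unit step in channel 0, then one in channel 1), the coefficient of the ordered word (0, 1) is 1 while that of (1, 0) is 0, reflecting the signature's sensitivity to the order of increments.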

Key Points

  • Introduction of pathsig library
  • GPU-accelerated computation of path signatures
  • Support for various truncation methods
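The third bullet's anisotropic truncation can be sketched as a word-set generator: each channel gets a positive weight, and a word is kept while its summed weight stays within a budget, so rougher (heavier) channels are truncated at lower order. The resulting set is automatically prefix-closed because weights are positive. This generator is our own illustration, assuming this reading of the abstract, not code from the paper:

```python
def anisotropic_words(weights, budget):
    """All words over channels 0..d-1 whose summed per-channel weight is
    at most `budget`.  Prefix-closed, since dropping a letter from a word
    can only lower its weight."""
    d = len(weights)
    words, frontier = [], [()]
    while frontier:
        nxt = []
        for w in frontier:
            base = sum(weights[j] for j in w)
            for i in range(d):
                if base + weights[i] <= budget:
                    nxt.append(w + (i,))
        words.extend(nxt)
        frontier = nxt
    return words
```

With equal weights this reduces to ordinary truncation by word length; with weights (1, 2) and budget 2, channel 1 may appear only once and never alongside channel 0.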

Merits

Scalability

According to the paper, pathsig achieves 10-30x speedups for computing truncated signatures and up to 4-10x speedups in training tasks that require backpropagation through the signature.

Demerits

Limited Context

The article primarily focuses on the technical aspects of the pathsig library, with limited discussion on its potential applications and real-world implications.

Expert Commentary

The introduction of pathsig marks a significant advancement in the computation of path signatures, offering a scalable and efficient solution for large-scale machine learning tasks. The library's support for various truncation methods and GPU acceleration enables researchers to explore more complex models and applications. However, further research is needed to fully realize the potential of pathsig and its implications for the broader field of artificial intelligence.

Recommendations

  • Further research on the applications of pathsig in real-world machine learning tasks
  • Investigation into the potential integration of pathsig with other machine learning libraries and frameworks
