
IntSeqBERT: Learning Arithmetic Structure in OEIS via Modulo-Spectrum Embeddings


Kazuhisa Nakasho

arXiv:2603.05556v1 Announce Type: new Abstract: Integer sequences in the OEIS span values from single-digit constants to astronomical factorials and exponentials, making prediction challenging for standard tokenised models that cannot handle out-of-vocabulary values or exploit periodic arithmetic structure. We present IntSeqBERT, a dual-stream Transformer encoder for masked integer-sequence modelling on OEIS. Each sequence element is encoded along two complementary axes: a continuous log-scale magnitude embedding and sin/cos modulo embeddings for 100 residues (moduli $2$--$101$), fused via FiLM. Three prediction heads (magnitude regression, sign classification, and modulo prediction for 100 moduli) are trained jointly on 274,705 OEIS sequences. At the Large scale (91.5M parameters), IntSeqBERT achieves 95.85% magnitude accuracy and 50.38% Mean Modulo Accuracy (MMA) on the test set, outperforming a standard tokenised Transformer baseline by $+8.9$ pt and $+4.5$ pt, respectively. An ablation removing the modulo stream confirms it accounts for $+15.2$ pt of the MMA gain and contributes an additional $+6.2$ pt to magnitude accuracy. A probabilistic Chinese Remainder Theorem (CRT)-based Solver converts the model's predictions into concrete integers, yielding a 7.4-fold improvement in next-term prediction over the tokenised-Transformer baseline (Top-1: 19.09% vs. 2.59%). Modulo spectrum analysis reveals a strong negative correlation between Normalised Information Gain (NIG) and Euler's totient ratio $\varphi(m)/m$ ($r = -0.851$, $p < 10^{-28}$), providing empirical evidence that composite moduli capture OEIS arithmetic structure more efficiently via CRT aggregation.
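The dual-stream encoding described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the paper fuses the two streams via FiLM inside the Transformer, whereas this toy version simply concatenates a signed log-magnitude feature with sin/cos residue features for the 100 moduli $2$--$101$.

```python
import numpy as np

# Moduli 2..101 as stated in the abstract (100 moduli in total).
MODULI = list(range(2, 102))

def encode_element(n: int) -> np.ndarray:
    """Encode one integer as [sign, log-magnitude, sin/cos residues].

    Illustrative only: the paper learns embeddings and fuses the two
    streams with FiLM; here we just build the raw feature vector.
    """
    sign = 0.0 if n == 0 else (1.0 if n > 0 else -1.0)
    magnitude = np.log1p(abs(n))        # continuous log-scale stream
    residues = []
    for m in MODULI:                    # periodic modulo stream
        r = n % m
        angle = 2.0 * np.pi * r / m     # residue as a point on the unit circle
        residues.extend([np.sin(angle), np.cos(angle)])
    return np.concatenate(([sign, magnitude], residues))  # 2 + 200 dims

vec = encode_element(120)  # e.g. 5! = 120
```

Representing each residue as a point on the unit circle keeps the feature continuous and periodic, so residues $m-1$ and $0$ are adjacent rather than maximally distant.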

Executive Summary

The article introduces IntSeqBERT, a dual-stream Transformer encoder for predicting integer sequences in the OEIS database. By combining log-scale magnitude embeddings with modulo-spectrum embeddings and decoding predictions through a probabilistic Chinese Remainder Theorem-based solver, the model substantially outperforms a standard tokenised Transformer baseline, including a 7.4-fold improvement in next-term Top-1 accuracy. The results suggest that explicitly modelling arithmetic structure is key to predicting complex integer sequences.

Key Points

  • IntSeqBERT uses a dual-stream Transformer encoder for masked integer-sequence modelling, trained on 274,705 OEIS sequences
  • The model leverages modulo-spectrum embeddings (moduli 2–101) and a probabilistic Chinese Remainder Theorem-based solver
  • IntSeqBERT outperforms a standard tokenised Transformer baseline by +8.9 pt magnitude accuracy and +4.5 pt Mean Modulo Accuracy, and by 7.4× on next-term Top-1 prediction
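The CRT-based decoding step above can be illustrated with a toy solver. The paper's solver is probabilistic over the full modulo spectrum; this hedged sketch assumes exact residue predictions for a few pairwise-coprime moduli and uses the magnitude head's estimate to pick the right CRT representative.

```python
from math import gcd

def crt_pair(r1: int, m1: int, r2: int, m2: int) -> tuple[int, int]:
    """Combine x ≡ r1 (mod m1) and x ≡ r2 (mod m2) for coprime m1, m2."""
    assert gcd(m1, m2) == 1, "moduli must be coprime"
    # Solve m1*k ≡ r2 - r1 (mod m2) via the modular inverse of m1 mod m2.
    inv = pow(m1, -1, m2)
    k = ((r2 - r1) * inv) % m2
    return (r1 + m1 * k) % (m1 * m2), m1 * m2

def solve(residues: dict[int, int], magnitude_estimate: float) -> int:
    """residues: {modulus: predicted residue}; returns the integer
    satisfying all congruences that lies closest to the magnitude estimate."""
    x, m = 0, 1
    for mod, r in residues.items():
        x, m = crt_pair(x, m, r, mod)
    # Shift by multiples of m to land nearest the magnitude estimate.
    k = round((magnitude_estimate - x) / m)
    return x + k * m

# Toy example: the next Fibonacci term after 1,1,2,3,5,8,13 is 21.
# Suppose the modulo heads predict 21's residues and the magnitude
# head estimates a value near 20:
n = solve({2: 1, 3: 0, 5: 1, 7: 0}, magnitude_estimate=20.0)
```

The magnitude stream is what makes the decoding well-posed: the congruences alone only determine the answer modulo the product of the moduli (here 210), and the magnitude estimate selects which representative to return.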

Merits

Improved Prediction Accuracy

IntSeqBERT achieves 95.85% magnitude accuracy and 50.38% Mean Modulo Accuracy, outperforming the tokenised baseline by +8.9 pt and +4.5 pt, respectively

Efficient Capture of Arithmetic Structure

The model's use of modulo-spectrum embeddings allows it to efficiently capture the arithmetic structure of integer sequences
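The abstract's empirical support for this claim is the strong negative correlation ($r = -0.851$) between Normalised Information Gain and Euler's totient ratio $\varphi(m)/m$: composite moduli with many small prime factors (low ratio) capture more structure per residue than primes (ratio near 1). A small illustrative computation of the ratio over the paper's moduli range:

```python
from math import gcd

def phi(m: int) -> int:
    """Euler's totient: count of 1..m coprime to m (naive, fine for m <= 101)."""
    return sum(1 for k in range(1, m + 1) if gcd(k, m) == 1)

# Totient ratios for the paper's 100 moduli, 2..101.
ratios = {m: phi(m) / m for m in range(2, 102)}

# A prime such as 101 has ratio 100/101 ≈ 0.990, while a highly
# composite modulus such as 60 = 2^2 * 3 * 5 has ratio 16/60 ≈ 0.267.
```

Intuitively, a residue mod 60 simultaneously constrains the residues mod 2, 3, 4, 5, etc., which is consistent with the paper's CRT-aggregation interpretation.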

Demerits

Computational Complexity

The dual-stream encoder with 100 modulo prediction heads and the probabilistic Chinese Remainder Theorem-based solver add computational cost at both training and inference time compared with a single-stream tokenised model

Limited Interpretability

The model's reliance on learned embeddings over many arithmetic transformations may make its predictions harder to interpret and audit

Expert Commentary

The article presents a notable advance in machine learning over mathematical data, demonstrating that integer-sequence prediction benefits from inductive biases tailored to arithmetic. The modulo-spectrum embeddings and the probabilistic Chinese Remainder Theorem-based solver let the model capture periodic structure that tokenised models miss, yielding large gains over the baseline, most strikingly the 7.4-fold improvement in next-term Top-1 prediction. However, the model's added complexity and limited interpretability may pose challenges for future research and practical applications.

Recommendations

  • Further research on the interpretability and explainability of the model's predictions
  • Exploration of the model's applications in various fields, such as cryptography and coding theory
