IntSeqBERT: Learning Arithmetic Structure in OEIS via Modulo-Spectrum Embeddings
arXiv:2603.05556v1 Abstract: Integer sequences in the OEIS span values from single-digit constants to astronomical factorials and exponentials, making prediction challenging for standard tokenised models that cannot handle out-of-vocabulary values or exploit periodic arithmetic structure. We present IntSeqBERT, a dual-stream Transformer encoder for masked integer-sequence modelling on OEIS. Each sequence element is encoded along two complementary axes: a continuous log-scale magnitude embedding and sin/cos modulo embeddings for 100 residues (moduli $2$--$101$), fused via FiLM. Three prediction heads (magnitude regression, sign classification, and modulo prediction for 100 moduli) are trained jointly on 274,705 OEIS sequences. At the Large scale (91.5M parameters), IntSeqBERT achieves 95.85% magnitude accuracy and 50.38% Mean Modulo Accuracy (MMA) on the test set, outperforming a standard tokenised Transformer baseline by $+8.9$ pt and $+4.5$ pt, respectively. An ablation removing the modulo stream confirms it accounts for $+15.2$ pt of the MMA gain and contributes an additional $+6.2$ pt to magnitude accuracy. A probabilistic Chinese Remainder Theorem (CRT)-based Solver converts the model's predictions into concrete integers, yielding a 7.4-fold improvement in next-term prediction over the tokenised-Transformer baseline (Top-1: 19.09% vs. 2.59%). Modulo spectrum analysis reveals a strong negative correlation between Normalised Information Gain (NIG) and Euler's totient ratio $\varphi(m)/m$ ($r = -0.851$, $p < 10^{-28}$), providing empirical evidence that composite moduli capture OEIS arithmetic structure more efficiently via CRT aggregation.
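The dual-stream encoding described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the names `encode_element` and `film` are hypothetical, the moduli are truncated to 2..11 for brevity (the paper uses the 100 moduli 2..101), and the FiLM parameters are plain lists rather than projections learned from the other stream.

```python
import math

def encode_element(x, moduli=range(2, 12)):
    """Toy dual-stream encoding of one sequence element.

    Stream 1: a signed log-scale magnitude, which stays bounded even for
    astronomically large values. Stream 2: per-modulus sin/cos features
    that place the residue x mod m on the unit circle, so periodic
    arithmetic structure becomes easy to represent."""
    sign = -1.0 if x < 0 else 1.0
    magnitude = sign * math.log1p(abs(x))
    feats = []
    for m in moduli:
        theta = 2.0 * math.pi * (x % m) / m  # residue as an angle
        feats.extend([math.sin(theta), math.cos(theta)])
    return magnitude, feats

def film(h, gamma, beta):
    """FiLM fusion: feature-wise affine modulation gamma * h + beta.
    In the model, gamma and beta would be produced by a learned network
    conditioned on the other stream; here they are plain lists."""
    return [g * v + b for g, v, b in zip(gamma, h, beta)]
```

The sin/cos pair is a standard trick for cyclic features: two values per modulus, and residues that are close on the cycle stay close in feature space.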
Executive Summary
The article introduces IntSeqBERT, a dual-stream Transformer encoder for masked modelling of integer sequences from the OEIS database. Each element is represented by a continuous log-scale magnitude embedding and sin/cos modulo embeddings for 100 moduli, fused via FiLM, and a probabilistic Chinese Remainder Theorem (CRT)-based solver converts the model's predictions into concrete integers. IntSeqBERT outperforms a standard tokenised Transformer baseline on magnitude, modulo, and next-term prediction, demonstrating the value of explicitly encoding arithmetic structure in integer sequences and in mathematical data more broadly.
Key Points
- ▸ IntSeqBERT encodes each sequence element along two streams: a continuous log-scale magnitude embedding and sin/cos modulo embeddings for moduli 2 through 101, fused via FiLM
- ▸ A probabilistic Chinese Remainder Theorem (CRT)-based solver converts per-modulus predictions into concrete integers, yielding a 7.4-fold Top-1 improvement in next-term prediction (19.09% vs. 2.59%)
- ▸ At the Large scale (91.5M parameters), IntSeqBERT beats a standard tokenised Transformer baseline by +8.9 pt in magnitude accuracy and +4.5 pt in Mean Modulo Accuracy
Merits
Improved Prediction Accuracy
At the Large scale (91.5M parameters), IntSeqBERT achieves 95.85% magnitude accuracy and 50.38% Mean Modulo Accuracy on the test set, +8.9 pt and +4.5 pt over the tokenised Transformer baseline
Efficient Capture of Arithmetic Structure
Sin/cos modulo embeddings expose periodic arithmetic structure that a fixed token vocabulary cannot represent; an ablation attributes +15.2 pt of the Mean Modulo Accuracy gain and a further +6.2 pt of magnitude accuracy to the modulo stream
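The abstract's evidence for this efficiency is the strong negative correlation between Normalised Information Gain and Euler's totient ratio φ(m)/m, the fraction of residues in 1..m coprime to m. A small sketch can make the quantity concrete; the `totient_ratio` helper below is hypothetical and uses brute force purely for illustration (Euler's product formula would be used in practice):

```python
import math

def totient_ratio(m):
    """phi(m)/m: the fraction of residues in 1..m coprime to m.
    Primes have ratio (m-1)/m, close to 1; highly composite moduli
    have much smaller ratios, which the paper links to higher
    information gain per modulus under CRT aggregation."""
    phi = sum(1 for k in range(1, m + 1) if math.gcd(k, m) == 1)
    return phi / m
```

For example, `totient_ratio(59)` is 58/59 ≈ 0.983 (prime), while `totient_ratio(60)` is 16/60 ≈ 0.267 (highly composite), illustrating the axis along which the reported correlation runs.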
Demerits
Computational Complexity
Computing 100 modulo channels per element and running a dual-stream encoder with three joint prediction heads and a probabilistic CRT-based solver may add memory and compute overhead relative to a single-stream tokenised model
Limited Interpretability
The model's reliance on log-magnitude regression, per-modulus classification, and CRT aggregation may make individual predictions harder to interpret than a direct token-level forecast
Expert Commentary
The article presents a notable advance in machine learning for mathematical data, demonstrating that explicitly encoding arithmetic structure lets a Transformer predict complex integer sequences far more accurately than token-level modelling. The modulo-spectrum embeddings expose periodic structure that out-of-vocabulary tokens would otherwise hide, and the probabilistic Chinese Remainder Theorem-based solver reassembles concrete integers from residue predictions, yielding large gains over the tokenised baseline. However, the model's complexity and limited interpretability may pose challenges for future research and practical applications.
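The deterministic core that any such CRT-based solver builds on can be sketched briefly. This is a hedged illustration, not the paper's method: `crt` is a hypothetical helper implementing the textbook constructive Chinese Remainder Theorem for pairwise-coprime moduli with exact residues, whereas the paper's probabilistic solver would additionally weight candidate residues by the model's per-modulus confidence.

```python
from math import prod

def crt(residues, moduli):
    """Reconstruct the unique x modulo prod(moduli) satisfying
    x ≡ r_i (mod m_i) for pairwise-coprime moduli m_i, via the
    constructive CRT: sum r_i * M_i * (M_i^-1 mod m_i), reduced mod M."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)  # pow(., -1, m): modular inverse
    return x % M
```

For instance, `crt([2, 3, 2], [3, 5, 7])` recovers 23, the unique value below 105 with those residues; a confidence-weighted variant would instead score many candidate integers against soft residue distributions.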
Recommendations
- ✓ Further research on the interpretability and explainability of the model's predictions
- ✓ Exploration of the model's applications in various fields, such as cryptography and coding theory