From Uniform to Learned Knots: A Study of Spline-Based Numerical Encodings for Tabular Deep Learning

arXiv:2604.05635v1 Announce Type: new Abstract: Numerical preprocessing remains an important component of tabular deep learning, where the representation of continuous features can strongly affect downstream performance. Although its importance is well established for classical statistical and machine learning models, the role of explicit numerical preprocessing in tabular deep learning remains less well understood. In this work, we study this question with a focus on spline-based numerical encodings. We investigate three spline families for encoding numerical features, namely B-splines, M-splines, and integrated splines (I-splines), under uniform, quantile-based, target-aware, and learnable-knot placement. For the learnable-knot variants, we use a differentiable knot parameterization that enables stable end-to-end optimization of knot locations jointly with the backbone. We evaluate these encodings on a diverse collection of public regression and classification datasets using MLP, ResNet, and FT-Transformer backbones, and compare them against common numerical preprocessing baselines. Our results show that the effect of numerical encodings depends strongly on the task, output size, and backbone. For classification, piecewise-linear encoding (PLE) is the most robust choice overall, while spline-based encodings remain competitive. For regression, no single encoding dominates uniformly. Instead, performance depends on the spline family, knot-placement strategy, and output size, with larger gains typically observed for MLP and ResNet than for FT-Transformer. We further find that learnable-knot variants can be optimized stably under the proposed parameterization, but may substantially increase training cost, especially for M-spline and I-spline expansions. Overall, the results show that numerical encodings should be assessed not only in terms of predictive performance, but also in terms of computational overhead.
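To make the spline-expansion idea concrete: encoding a numerical feature with B-splines replaces each scalar with its vector of basis-function activations. The sketch below is an illustrative NumPy implementation of the standard Cox-de Boor recursion with uniform, clamped knots (the paper's uniform-placement setting), not code from the paper itself:

```python
import numpy as np

def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions of a given degree at points x
    via the Cox-de Boor recursion. `knots` is a non-decreasing knot vector;
    the result has shape (len(x), len(knots) - degree - 1)."""
    x = np.asarray(x, dtype=float)
    # Degree-0 bases: indicator of each half-open knot span.
    B = np.array([(knots[i] <= x) & (x < knots[i + 1])
                  for i in range(len(knots) - 1)], dtype=float).T
    # Close the rightmost non-empty span so x == knots[-1] is covered.
    last = max(i for i in range(len(knots) - 1) if knots[i] < knots[i + 1])
    B[x == knots[-1], last] = 1.0
    # Raise the degree one step at a time (0/0 terms are dropped by convention).
    for d in range(1, degree + 1):
        Bn = np.zeros((len(x), len(knots) - 1 - d))
        for i in range(len(knots) - 1 - d):
            left = knots[i + d] - knots[i]
            if left > 0:
                Bn[:, i] += (x - knots[i]) / left * B[:, i]
            right = knots[i + d + 1] - knots[i + 1]
            if right > 0:
                Bn[:, i] += (knots[i + d + 1] - x) / right * B[:, i + 1]
        B = Bn
    return B

# Encode one feature with cubic B-splines on uniform interior knots
# (clamped ends): each scalar becomes a sparse, non-negative vector.
knots = np.concatenate([[0.0] * 3, np.linspace(0.0, 1.0, 5), [1.0] * 3])
enc = bspline_basis(np.array([0.1, 0.5, 0.9]), knots, degree=3)
```

Each row of `enc` is non-negative and sums to one (partition of unity), which is what makes the expansion a well-behaved, localized representation of the raw feature.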

Executive Summary

The study systematically evaluates spline-based numerical encodings (B-splines, M-splines, I-splines) for tabular deep learning, comparing uniform, quantile-based, target-aware, and learnable-knot placement strategies. Using MLP, ResNet, and FT-Transformer backbones across diverse datasets, it finds that encoding performance varies by task (classification vs. regression), output size, and backbone architecture. While piecewise-linear encoding (PLE) emerges as the most robust choice for classification, no single spline encoding dominates regression tasks. Learnable-knot variants optimize stably but are computationally expensive and deliver no consistently superior performance. The work underscores the need to balance predictive performance with computational overhead, advocating for task-specific and architecture-aware encoding selection in tabular deep learning pipelines.

Key Points

  • Spline-based encodings (B-splines, M-splines, I-splines) are systematically evaluated for tabular deep learning, with a focus on knot placement strategies (uniform, quantile-based, target-aware, learnable).
  • Performance of numerical encodings is highly task-dependent: PLE excels in classification, while regression performance varies by spline family, knot strategy, and backbone (MLP/ResNet/FT-Transformer).
  • Learnable-knot variants enable end-to-end optimization but incur significant training costs, particularly for M-spline and I-spline expansions, without guaranteeing uniform performance improvements.
  • The study highlights the importance of assessing numerical encodings not only for predictive power but also for computational efficiency, especially in large-scale applications.
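The piecewise-linear encoding (PLE) that the key points single out as the robust classification baseline admits a compact sketch. This is an illustrative implementation of the usual formulation (with bin edges in practice computed from training-set quantiles), not the paper's code:

```python
import numpy as np

def ple_encode(x, edges):
    """Piecewise-linear encoding: one component per bin. Component t is
    0 below bin t, 1 above it, and linear inside it -- a soft,
    order-preserving indicator of how far x has progressed through the bin."""
    x = np.asarray(x, dtype=float)[:, None]
    lo, hi = edges[:-1], edges[1:]
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)

# Quantile-based edges would come from training data, e.g.
# edges = np.quantile(train_column, np.linspace(0, 1, n_bins + 1));
# fixed edges here keep the example deterministic.
edges = np.linspace(0.0, 1.0, 5)             # 4 equal bins on [0, 1]
enc = ple_encode(np.array([0.625]), edges)   # -> [[1.0, 1.0, 0.5, 0.0]]
```

Unlike one-hot binning, PLE varies continuously with the input, so nearby values get nearby encodings.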

Merits

Rigorous Empirical Validation

The study employs a diverse set of regression and classification datasets, three backbone architectures (MLP, ResNet, FT-Transformer), and multiple spline families with varied knot placement strategies, ensuring comprehensive and robust comparisons.

Methodological Innovation

Introduction of differentiable knot parameterization for learnable-knot splines enables stable end-to-end optimization, addressing a critical gap in the literature on spline-based encodings for deep learning.
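The abstract does not spell out the parameterization, but one common construction with the stated properties maps unconstrained parameters to ordered knots via softmax gaps and a cumulative sum. The sketch below is an assumed, illustrative version of this idea (the names and the softmax-cumsum choice are ours, not the paper's); every step is smooth, so the same transform written in an autodiff framework lets knot locations train jointly with the backbone:

```python
import numpy as np

def knots_from_logits(logits, lo=0.0, hi=1.0):
    """Map n+1 unconstrained logits to n strictly increasing knots in (lo, hi).

    softmax -> positive gap fractions summing to 1; cumsum -> ordered
    positions. Because both operations are differentiable, gradients can
    flow from the loss back into the knot locations during training.
    """
    logits = np.asarray(logits, dtype=float)
    gaps = np.exp(logits - logits.max())
    gaps = gaps / gaps.sum()               # softmax: positive, sums to 1
    inner = np.cumsum(gaps)[:-1]           # drop the final point at exactly 1
    return lo + (hi - lo) * inner

knots = knots_from_logits(np.zeros(4))     # -> [0.25, 0.5, 0.75]
```

The ordering constraint is enforced by construction rather than by projection or clipping, which is what makes joint optimization with the backbone stable.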

Task-Specific Insights

The findings provide nuanced guidance on encoding selection by task (classification vs. regression) and backbone architecture, offering actionable recommendations for practitioners in tabular deep learning.

Demerits

Computational Overhead of Learnable Knots

While learnable-knot variants offer flexibility, their training costs are prohibitively high for some spline families (e.g., M-splines, I-splines), limiting scalability and practical deployment in resource-constrained environments.

Limited Generalizability of Results

The study’s conclusions are based on a specific set of datasets and backbone architectures, which may not fully capture the diversity of real-world tabular data or emerging deep learning models, necessitating broader validation.

Focus on Spline Encodings Only

The analysis excludes other advanced numerical encoding techniques (e.g., Fourier features, wavelets), which may offer competitive or superior performance in certain scenarios, limiting the scope of the comparative study.
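For context on what such an excluded baseline looks like, a random Fourier feature encoding of a scalar column can be sketched in a few lines. This is a generic illustration of the technique (frequencies drawn once and then frozen), not an encoding evaluated in the paper:

```python
import numpy as np

def fourier_features(x, n_frequencies=8, sigma=1.0, seed=0):
    """Random Fourier feature encoding of a scalar column: project x onto
    randomly drawn frequencies and return the cos/sin pair for each, giving
    a 2 * n_frequencies dimensional, bounded representation."""
    rng = np.random.default_rng(seed)
    freqs = rng.normal(scale=sigma, size=n_frequencies)
    angles = 2 * np.pi * np.asarray(x, dtype=float)[:, None] * freqs
    return np.concatenate([np.cos(angles), np.sin(angles)], axis=1)

enc = fourier_features(np.array([0.0, 0.4, 1.0]))
```

Unlike spline expansions, these features are global (every component responds to every input value), which is precisely why a head-to-head comparison would be informative.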

Expert Commentary

This study makes a significant contribution to the understudied yet critical domain of numerical preprocessing in tabular deep learning. By systematically dissecting the impact of spline-based encodings across diverse tasks and architectures, the authors provide a nuanced understanding of how feature representations influence model performance. The methodological innovation of differentiable knot parameterization is particularly noteworthy, as it bridges the gap between traditional feature engineering and modern end-to-end deep learning pipelines. However, the study also highlights a key tension in contemporary ML: the trade-off between predictive power and computational efficiency. While learnable knots offer flexibility, their training overhead may limit adoption in practice. The findings underscore that no one-size-fits-all solution exists, and the choice of encoding must be context-dependent. This work should inspire further research into hybrid encoding strategies and the development of more efficient differentiable parameterizations. For practitioners, the takeaway is clear: the era of treating numerical preprocessing as an afterthought in tabular deep learning is over. A more sophisticated, task-aware approach is now essential.

Recommendations

  • Conduct additional studies to explore hybrid encoding strategies that combine spline-based methods with other advanced techniques (e.g., Fourier features, neural splines) to leverage their respective strengths while mitigating computational costs.
  • Develop and benchmark more efficient differentiable parameterizations for learnable-knot splines, particularly for M-splines and I-splines, to reduce training overhead without sacrificing performance gains.
  • Create a standardized framework for evaluating numerical encodings in tabular deep learning, incorporating metrics for predictive performance, computational efficiency, interpretability, and robustness across diverse datasets and architectures.
  • Investigate the interpretability benefits of spline-based encodings, particularly in high-stakes applications, by developing methods to visualize and explain the learned transformations of numerical features.
  • Expand the scope of empirical validation to include emerging tabular deep learning architectures (e.g., TabNet, TabTransformer) and real-world datasets from domains like healthcare and finance to validate the generalizability of the findings.

Sources

Original: arXiv - cs.LG