
Revisiting Chebyshev Polynomial and Anisotropic RBF Models for Tabular Regression


Luciano Gerber, Huw Lloyd

arXiv:2602.22422v1 (Announce Type: new). Abstract: Smooth-basis models such as Chebyshev polynomial regressors and radial basis function (RBF) networks are well established in numerical analysis. Their continuously differentiable prediction surfaces suit surrogate optimisation, sensitivity analysis, and other settings where the response varies gradually with inputs. Despite these properties, smooth models seldom appear in tabular regression, where tree ensembles dominate. We ask whether they can compete, benchmarking models across 55 regression datasets organised by application domain. We develop an anisotropic RBF network with data-driven centre placement and gradient-based width optimisation, a ridge-regularised Chebyshev polynomial regressor, and a smooth-tree hybrid (Chebyshev model tree); all three are released as scikit-learn-compatible packages. We benchmark these against tree ensembles, a pre-trained transformer, and standard baselines, evaluating accuracy alongside generalisation behaviour. The transformer ranks first on accuracy across a majority of datasets, but its GPU dependence, inference latency, and dataset-size limits constrain deployment in the CPU-based settings common across applied science and industry. Among CPU-viable models, smooth models and tree ensembles are statistically tied on accuracy, but the former tend to exhibit tighter generalisation gaps. We recommend routinely including smooth-basis models in the candidate pool, particularly when downstream use benefits from tighter generalisation and gradually varying predictions.

Executive Summary

This article presents a comparative study of smooth-basis models, specifically Chebyshev polynomial regressors and anisotropic radial basis function (RBF) networks, against tree ensembles in tabular regression. The authors develop an anisotropic RBF network with data-driven centre placement and gradient-based width optimisation, a ridge-regularised Chebyshev polynomial regressor, and a smooth-tree hybrid (Chebyshev model tree), and benchmark them across 55 datasets organised by application domain. A pre-trained transformer ranks first on accuracy, but its GPU dependence, inference latency, and dataset-size limits constrain deployment in CPU-based settings; among CPU-viable models, smooth models are statistically tied with tree ensembles on accuracy while tending to show tighter generalisation gaps. The study recommends routinely including smooth-basis models in the candidate pool, particularly when downstream use benefits from tighter generalisation and gradually varying predictions, and underscores the trade-offs between accuracy, computational resources, and deployment setting.

Key Points

  • Smooth-basis models (Chebyshev polynomial regressors and anisotropic RBF networks) can compete with tree ensembles in tabular regression.
  • The authors develop an anisotropic RBF network, a ridge-regularised Chebyshev polynomial regressor, and a Chebyshev model tree hybrid, all released as scikit-learn-compatible packages.
  • A pre-trained transformer ranks first on accuracy, but GPU dependence, inference latency, and dataset-size limits constrain its deployment in CPU-based settings.
  • Smooth models tend to exhibit tighter generalisation gaps than tree ensembles.
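The ridge-regularised Chebyshev regressor mentioned above can be sketched with standard tools. This is not the paper's released package: the per-feature basis construction, the degree, and the `alpha` value are illustrative assumptions, and inputs are assumed to be scaled to [-1, 1] (the natural domain of Chebyshev polynomials).

```python
import numpy as np
from sklearn.linear_model import Ridge

def chebyshev_features(X, degree=4):
    """Stack per-feature Chebyshev bases T_0..T_degree; X assumed in [-1, 1]."""
    return np.hstack([
        np.polynomial.chebyshev.chebvander(X[:, j], degree)  # (n, degree+1) columns
        for j in range(X.shape[1])
    ])

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))        # toy tabular inputs
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] ** 2   # smooth additive target

model = Ridge(alpha=1.0)                          # ridge penalty on basis weights
model.fit(chebyshev_features(X), y)
pred = model.predict(chebyshev_features(X))       # smooth, differentiable surface
```

The prediction surface is a low-degree polynomial in each feature, so it varies gradually with inputs, which is the property the paper highlights for surrogate optimisation and sensitivity analysis.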

Merits

Novel Contributions

The article presents new models and benchmarks their performance across 55 datasets, providing a comprehensive evaluation of smooth-basis models in tabular regression.
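The anisotropic RBF idea can be illustrated as follows. This is a simplification, not the paper's implementation: the paper places centres data-adaptively and optimises the per-dimension widths by gradient descent, whereas here centres come from k-means and widths are simply fixed; all names and hyperparameters are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import Ridge

def anisotropic_rbf_features(X, centres, widths):
    """Gaussian RBF activations with a separate width per input dimension.

    centres, widths: shape (n_centres, n_features). Anisotropy means each
    centre's bump stretches differently along each feature axis.
    """
    d2 = ((X[:, None, :] - centres[None, :, :]) / widths[None, :, :]) ** 2
    return np.exp(-0.5 * d2.sum(axis=2))  # (n_samples, n_centres)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2))
y = np.exp(-X[:, 0] ** 2) + 0.3 * X[:, 1]   # smooth toy target

# Data-driven centre placement via k-means; widths fixed here (tuned in the paper).
centres = KMeans(n_clusters=20, n_init=10, random_state=0).fit(X).cluster_centers_
widths = np.full_like(centres, 1.0)
head = Ridge(alpha=1e-3).fit(anisotropic_rbf_features(X, centres, widths), y)
rbf_pred = head.predict(anisotropic_rbf_features(X, centres, widths))
```

Making the widths per-dimension (rather than one scalar per centre) lets each basis function adapt to features with different scales and relevances, which is the motivation for anisotropy.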

Comparative Study

The study compares smooth-basis models with traditional tree ensembles, highlighting their strengths and weaknesses in different deployment settings.

Demerits

Limited Generalisability

The results may not generalise to all domains and deployment settings, particularly those with high-dimensional data or complex relationships between inputs and outputs.

Computational Costs

Smooth-basis models can be computationally expensive to train and deploy, particularly for large datasets or complex models.

Expert Commentary

The article offers a rigorous evaluation of smooth-basis models in tabular regression, and its central finding is credible: among CPU-viable models, smooth-basis regressors match tree ensembles on accuracy while tending to generalise more tightly. The recommendation to include them routinely in the candidate pool is well supported by the 55-dataset benchmark, especially for applications such as surrogate optimisation and sensitivity analysis, where gradually varying predictions matter. Equally important is the pragmatic framing of the transformer result: first on accuracy across a majority of datasets, but constrained by GPU dependence, inference latency, and dataset-size limits in the CPU-based settings common across applied science and industry. Overall, the article is a valuable contribution that should inform model selection practice for tabular regression.

Recommendations

  • Researchers and practitioners should include smooth-basis models in the candidate pool for tabular regression, particularly when downstream use benefits from tighter generalisation and gradually varying predictions.
  • Model developers should prioritise computationally efficient, scalable implementations of smooth-basis models for large datasets and high-dimensional inputs.
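The generalisation gap invoked above is simply the spread between training error and held-out error. A minimal sketch of how one might measure it (the dataset and model here are illustrative stand-ins, not the paper's benchmark):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 3))
y = np.sin(2.0 * X[:, 0]) + X[:, 1] * X[:, 2] + rng.normal(scale=0.1, size=400)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

train_mse = mean_squared_error(y_tr, model.predict(X_tr))
test_mse = mean_squared_error(y_te, model.predict(X_te))
gap = test_mse - train_mse   # positive gap: held-out error exceeds training error
```

The paper's claim is comparative: for a given accuracy level, smooth-basis models tend to show a smaller such gap than tree ensembles, which matters when the model will be deployed on data drawn beyond the training sample.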
