Polynomial Surrogate Training for Differentiable Ternary Logic Gate Networks

Abstract (arXiv:2603.00302v1): Differentiable logic gate networks (DLGNs) learn compact, interpretable Boolean circuits via gradient-based training, but all existing variants are restricted to the 16 two-input binary gates. Extending DLGNs to ternary Kleene $K_3$ logic, so that differentiable ternary logic gate networks (DTLGNs) can use the UNKNOWN state for principled abstention under uncertainty, is desirable. However, the support set of potential gates per neuron explodes to $19{,}683$, making the established softmax-over-gates training approach intractable. We introduce Polynomial Surrogate Training (PST), which represents each ternary neuron as a degree-$(2,2)$ polynomial with 9 learnable coefficients (a $2{,}187\times$ parameter reduction), and we prove that the gap between the trained network and its discretized logic circuit is bounded by a data-independent commitment loss that vanishes at convergence. Scaling experiments from 48K to 512K neurons on CIFAR-10 demonstrate that this hardening gap contracts with overparameterization. Ternary networks train $2$-$3\times$ faster than binary DLGNs and discover true ternary gates that are functionally diverse. On synthetic and tabular tasks we find that the UNKNOWN output acts as a Bayes-optimal uncertainty proxy, enabling selective prediction in which ternary circuits surpass binary accuracy once low-confidence predictions are filtered. More broadly, PST establishes a general polynomial-surrogate methodology whose parameterization cost grows only quadratically with logic valence, opening the door to many-valued differentiable logic.
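To make the mechanism concrete, here is a minimal NumPy sketch of a polynomial-surrogate ternary neuron. The $\{-1, 0, +1\}$ encoding of Kleene's FALSE/UNKNOWN/TRUE, the monomial ordering, and the nearest-value hardening rule are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

# Assumed K3 encoding (not from the paper): FALSE = -1, UNKNOWN = 0, TRUE = +1.
K3 = np.array([-1.0, 0.0, 1.0])

def monomials(x, y):
    """Degree-(2,2) basis: x^i * y^j for i, j in {0, 1, 2} -- 9 terms."""
    return np.array([x**i * y**j for i in range(3) for j in range(3)])

class TernaryNeuron:
    """One neuron = 9 learnable coefficients of a degree-(2,2) polynomial.

    Any map {-1,0,1}^2 -> {-1,0,1} is exactly representable: the 9
    coefficients can interpolate the 9 entries of a ternary truth table.
    """
    def __init__(self, rng):
        self.c = rng.normal(scale=0.1, size=9)  # trained by gradient descent

    def forward(self, x, y):
        # Real-valued, differentiable surrogate output used during training.
        return self.c @ monomials(x, y)

    def harden(self):
        """Discretize to a logic gate: snap the output on each of the 9
        ternary input pairs to the nearest K3 value (one of 3^9 gates)."""
        return {(x, y): K3[np.argmin(np.abs(K3 - self.forward(x, y)))]
                for x in K3 for y in K3}
```

For example, `TernaryNeuron(np.random.default_rng(0)).harden()` returns a 9-entry truth table, i.e., one concrete $K_3$ gate out of the $19{,}683$ possibilities.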

Executive Summary

This article introduces Polynomial Surrogate Training (PST), a method for training differentiable ternary logic gate networks (DTLGNs) that overcomes a key limitation of existing binary DLGNs: the intractable gate space of ternary logic. By representing each ternary neuron as a degree-$(2,2)$ polynomial with 9 learnable coefficients, PST replaces a softmax over $19{,}683$ candidate gates while keeping the gap between the trained network and its discretized logic circuit bounded by a commitment loss that vanishes at convergence. Scaling experiments on CIFAR-10 (48K to 512K neurons) show that this hardening gap contracts with overparameterization, that ternary networks train $2$-$3\times$ faster than binary DLGNs, and that the learned gates are functionally diverse true ternary gates. The UNKNOWN output further acts as a Bayes-optimal uncertainty proxy, enabling selective prediction in which ternary circuits surpass binary accuracy once low-confidence predictions are filtered. Because PST's parameterization cost grows only quadratically with logic valence, it opens the door to many-valued differentiable logic more broadly.
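The selective-prediction protocol the summary refers to can be sketched as follows. `selective_accuracy` is a hypothetical evaluation helper, not from the paper: it abstains whenever the circuit outputs UNKNOWN and reports accuracy on the remaining covered samples, assuming the same $\{-1, 0, +1\}$ encoding as above.

```python
import numpy as np

def selective_accuracy(preds, labels, unknown=0):
    """Accuracy after abstaining on UNKNOWN outputs.

    preds, labels: arrays over {-1, 0, +1}; an output equal to `unknown`
    triggers abstention. Returns (accuracy on covered samples, coverage).
    """
    covered = preds != unknown
    coverage = float(covered.mean())
    if coverage == 0.0:
        return float("nan"), 0.0
    accuracy = float((preds[covered] == labels[covered]).mean())
    return accuracy, coverage

# Toy illustration: filtering UNKNOWNs raises accuracy on what remains.
preds  = np.array([ 1, -1,  0,  1,  0, -1])
labels = np.array([ 1,  1, -1,  1,  1, -1])
print(selective_accuracy(preds, labels))  # (0.75, 0.666...)
```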

Key Points

  • PST enables gradient-based training of DTLGNs over ternary Kleene $K_3$ logic, where the UNKNOWN state supports principled abstention under uncertainty
  • PST cuts per-neuron parameters from $19{,}683$ gate logits under softmax-over-gates training to 9 polynomial coefficients, a $2{,}187\times$ reduction (see the arithmetic sketch after this list)
  • Ternary networks trained with PST train $2$-$3\times$ faster than binary DLGNs and discover functionally diverse true ternary gates
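The counting behind the $2{,}187\times$ figure is straightforward: a two-input ternary gate is a truth table mapping $3^2 = 9$ input pairs to one of 3 outputs, giving $3^9 = 19{,}683$ gates, whereas a degree-$(2,2)$ polynomial has $3 \times 3 = 9$ coefficients.

```python
# Gate count: each of the 3^2 = 9 ternary input pairs maps to one of 3
# outputs, so there are 3**(3**2) = 19,683 possible two-input gates.
num_gates  = 3 ** (3 ** 2)   # logits needed per neuron by softmax-over-gates
num_coeffs = 3 * 3           # degree-(2,2) polynomial: (2+1)*(2+1) monomials
print(num_gates, num_coeffs, num_gates // num_coeffs)  # 19683 9 2187
```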

Merits

Strength

The PST approach provides a practical solution to the gate-explosion problem that makes softmax-over-gates training intractable for ternary logic, enabling the training of ternary logic gate networks with a provable bound on the hardening gap.

Demerits

Limitation

The empirical evaluation is limited to CIFAR-10 and synthetic/tabular tasks; broader real-world benchmarks are needed to demonstrate the practical applicability of PST.

Expert Commentary

The article presents a well-structured and theoretically sound approach to training DTLGNs. PST is an innovative answer to the gate-explosion problem that restricts existing binary DLGNs, and the scaling experiments, in which the hardening gap contracts with overparameterization, are promising. As noted above, however, evaluation beyond CIFAR-10 and synthetic/tabular tasks is needed before practical applicability is settled. The fact that PST's parameterization cost grows only quadratically with logic valence also raises interesting questions about many-valued differentiable logic beyond $K_3$ and its implications for machine learning more broadly.

Recommendations

  • Evaluate PST on larger and more diverse real-world benchmarks to establish practical applicability beyond CIFAR-10 and synthetic/tabular tasks.
  • Investigate many-valued logics beyond $K_3$, where PST's quadratically growing parameterization cost suggests the approach could remain tractable.

Sources

  • arXiv:2603.00302v1 (https://arxiv.org/abs/2603.00302)