
Multi-objective Genetic Programming with Multi-view Multi-level Feature for Enhanced Protein Secondary Structure Prediction

Yining Qian, Lijie Su, Meiling Xu, Xianpeng Wang

arXiv:2603.12293v1

Abstract: Predicting protein secondary structure is essential for understanding protein function and advancing drug discovery. However, the intricate sequence-structure relationship poses significant challenges for accurate modeling. To address these challenges, we propose MOGP-MMF, a multi-objective genetic programming framework that reformulates protein secondary structure prediction (PSSP) as an automated optimization task focused on feature selection and fusion. Specifically, MOGP-MMF introduces a multi-view multi-level representation strategy that integrates evolutionary, semantic, and newly introduced structural views to capture the comprehensive protein folding logic. Leveraging an enriched operator set, the framework evolves both linear and nonlinear fusion functions, effectively capturing high-order feature interactions while reducing fusion complexity. To resolve the accuracy-complexity trade-off, an improved multi-objective GP algorithm is developed, incorporating a knowledge transfer mechanism that utilizes prior evolutionary experience to guide the population toward global optima. Extensive experiments across seven benchmark datasets demonstrate that MOGP-MMF surpasses state-of-the-art methods, particularly in Q8 accuracy and structural integrity. Furthermore, MOGP-MMF generates a diverse set of non-dominated solutions, offering flexible model selection schemes for various practical application scenarios. The source code is available on GitHub: https://github.com/qian-ann/MOGP-MMF/tree/main.

Executive Summary

The paper proposes MOGP-MMF, a multi-objective genetic programming (GP) framework that treats protein secondary structure prediction (PSSP) as an automated search over feature selection and fusion. It combines evolutionary, semantic, and newly introduced structural views in a multi-view, multi-level representation, and uses an enriched operator set to evolve both linear and nonlinear fusion functions that capture high-order feature interactions while reducing fusion complexity. An improved multi-objective GP algorithm with a knowledge transfer mechanism addresses the accuracy-complexity trade-off. Across seven benchmark datasets, the authors report that MOGP-MMF surpasses state-of-the-art methods, particularly in Q8 accuracy and structural integrity, and that it produces a diverse set of non-dominated solutions from which a model can be chosen to suit the application. The source code is available on GitHub.

Key Points

  • MOGP-MMF is a multi-objective genetic programming framework for enhanced protein secondary structure prediction.
  • MOGP-MMF integrates evolutionary, semantic, and structural views to capture the comprehensive protein folding logic.
  • MOGP-MMF leverages an enriched operator set to evolve linear and nonlinear fusion functions.
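
To make the idea of an evolved fusion function concrete, here is a minimal sketch of how a GP individual might be represented as an expression tree over feature views. It is an illustration only, not the authors' implementation: each view is reduced to a single number for simplicity, and the operator names, the view names used as leaves, and the helper functions evaluate and size are all hypothetical. The abstract does not list the actual operator set.

```python
import math
from typing import Callable, Dict, Union

# A tree is either a view name (leaf) or a tuple: (operator name, child, ...).
Tree = Union[str, tuple]

# Hypothetical operator set; the paper's real operators are not given in the abstract.
OPERATORS: Dict[str, Callable[..., float]] = {
    "add": lambda a, b: a + b,     # linear fusion
    "mul": lambda a, b: a * b,     # pairwise interaction between views
    "tanh": lambda a: math.tanh(a) # nonlinear squashing
}


def evaluate(tree: Tree, views: Dict[str, float]) -> float:
    """Recursively evaluate a fusion tree on one set of view values."""
    if isinstance(tree, str):
        return views[tree]
    op, *children = tree
    return OPERATORS[op](*(evaluate(child, views) for child in children))


def size(tree: Tree) -> int:
    """Node count, used here as a simple stand-in for fusion complexity."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(size(child) for child in tree[1:])


# Toy view values for a single residue (made up for illustration).
views = {"evolutionary": 0.5, "semantic": 2.0, "structural": 1.0}

linear = ("add", "evolutionary", "semantic")
nonlinear = ("add", ("mul", "evolutionary", "semantic"), ("tanh", "structural"))

print(evaluate(linear, views), size(linear))
print(evaluate(nonlinear, views), size(nonlinear))
```

In a multi-objective setting, a search procedure would score each tree on both predictive error and a complexity measure such as the node count above, rather than on accuracy alone.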

Merits

Comprehensive protein folding logic

By combining evolutionary, semantic, and newly introduced structural views, MOGP-MMF draws on complementary sources of information about protein folding rather than relying on a single feature type.

Improved accuracy and structural integrity

According to the authors, MOGP-MMF surpasses state-of-the-art methods across seven benchmark datasets, particularly in Q8 accuracy and structural integrity.
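
For context, Q8 accuracy is the fraction of residues whose eight-state secondary structure label is predicted correctly. The sketch below assumes predictions and reference labels are equal-length strings over the eight DSSP classes, with C standing for coil (some datasets use L or a dash instead). The function name q8_accuracy is illustrative, and benchmark-specific conventions such as masking unresolved residues are not covered.

```python
# Eight DSSP states: helices (H, G, I), strands (E, B), turn (T), bend (S), coil (C).
DSSP_STATES = frozenset("HGIEBTSC")


def q8_accuracy(predicted: str, actual: str) -> float:
    """Per-residue accuracy over eight-state labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("sequences must be non-empty")
    unknown = (set(predicted) | set(actual)) - DSSP_STATES
    if unknown:
        raise ValueError(f"unexpected state labels: {sorted(unknown)}")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Toy example: one of six residues is mislabeled (H predicted where G is the reference).
score = q8_accuracy("HHHEEC", "HHGEEC")
print(score)
```

Q8 is stricter than the three-state Q3 measure because confusing closely related classes, such as H and G, counts as an error.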

Flexible model selection schemes

MOGP-MMF generates diverse non-dominated solutions, offering flexible model selection schemes for various practical applications.
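
A minimal sketch of what selecting from non-dominated solutions can look like, assuming each candidate model is summarized by two objectives to minimize: prediction error and complexity. The candidate names and numbers are made up, and the functions dominates, non_dominated, and pick_under_budget are hypothetical helpers, not part of the authors' code.

```python
from typing import List, Optional, Tuple

# (name, error, complexity); both objectives are minimized.
Candidate = Tuple[str, float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def non_dominated(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


def pick_under_budget(front: List[Candidate], max_complexity: int) -> Optional[Candidate]:
    """Choose the lowest-error model whose complexity fits the budget."""
    feasible = [c for c in front if c[2] <= max_complexity]
    return min(feasible, key=lambda c: c[1]) if feasible else None


# Made-up candidates for illustration.
candidates = [
    ("A", 0.30, 5),
    ("B", 0.25, 12),
    ("C", 0.28, 20),  # dominated by B: higher error and higher complexity
    ("D", 0.35, 4),
    ("E", 0.30, 9),   # dominated by A: same error, higher complexity
]

front = non_dominated(candidates)
print([name for name, _, _ in front])
print(pick_under_budget(front, max_complexity=10))
```

The practical benefit is that a user with a tight compute budget and a user who wants maximum accuracy can each pick a different model from the same set, without rerunning the search.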

Demerits

Complexity of the framework

Extracting three feature views at multiple levels and running an evolutionary search over fusion functions is likely to be computationally demanding, and the abstract does not report the cost of training or feature extraction.

Limited generalizability to other protein-related tasks

The abstract reports results only for protein secondary structure prediction, so it is unclear whether the approach transfers to other protein-related tasks.

Expert Commentary

MOGP-MMF is a notable contribution to protein secondary structure prediction: it integrates evolutionary, semantic, and structural views within a multi-objective genetic programming framework, and the reported gains over state-of-the-art methods in Q8 accuracy and structural integrity are impressive. The framework's complexity may be a barrier for some researchers, however, and its generalizability to other protein-related tasks is unclear. Even so, the results matter for protein-related research and development, and they highlight the potential of genetic programming and multi-objective optimization in this field.

Recommendations

  • Future research should develop more efficient and scalable algorithms for MOGP-MMF and apply the framework to other protein-related tasks.
  • MOGP-MMF's performance on protein secondary structure prediction should be further validated on additional benchmark datasets and under additional experimental protocols.
