Advances in Protein Structure Prediction through Deep Learning Models
DOI: https://doi.org/10.61173/rsjk7x31
Keywords: Protein Structure Prediction, Deep Learning, AlphaFold, Neural Network Architectures, CASP
Abstract
Accurate protein structure prediction (PSP) is essential for understanding biological function. However, experimental determination of protein structures remains costly and limited in scope. In recent years, advances in deep learning (DL) have substantially improved PSP. These methods help close the gap between the vast number of known protein sequences and the limited number of experimentally resolved structures. This article reviews five representative DL architectures: convolutional neural networks (CNNs), recurrent neural networks (RNNs), transformers, graph neural networks (GNNs), and diffusion-based models. Key protein databases used for training and validation are also introduced, including the Protein Data Bank (PDB), UniProt, Pfam, RefSeq, and the Big Fantastic Database (BFD). Performance evaluations based on the Critical Assessment of Protein Structure Prediction (CASP) benchmarks show that models such as AlphaFold2 and its successors achieve near-experimental accuracy. Nevertheless, challenges remain for low-homology sequences, protein–protein interactions, and dynamic folding pathways. Current limitations of DL in PSP also include data bias, restricted interpretability, and an inability to fully capture protein dynamics. To overcome these barriers, future directions may include explainable AI, debiased datasets, and integration with molecular dynamics simulations.