NSF III: Medium: Collaborative Research: Guiding Exploration of Protein Structure Spaces with Deep Learning

Research

Journal Publications

1. J. Hou, T. Wu, R. Cao, J. Cheng. Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins, accepted. [at Proteins]

2. A. Al-Azzawi, A. Ouadou, J.J. Tanner, J. Cheng. AutoCryoPicker: an unsupervised learning approach for fully automated single particle picking in cryo-EM images. BMC Bioinformatics, accepted.

3. J. Cheng, M. Choe, A. Elofsson, K. Han, J. Hou, A. Maghrabi, L.J. McGuffin, D. Menendez-Hurtado, K. Olechnovič, T. Schwede, G. Studer, K. Uziela, Č. Venclovas, B. Wallner. Estimation of model quality accuracy in CASP13. Proteins, under review, 2019. (CASP13 invited paper)

4. S. Dong, S.A. Moritz, J. Pfab, J. Hou, R. Cao, L. Wang, T. Wu, J. Cheng. Deep learning to predict protein backbone structure from high-resolution cryo-EM density maps. Scientific Reports, 10(1):1-22, 2020.

5. N. Zhou, Y. Jiang, T.R. Bergquist, A.J. Lee, B.Z. Kacsoh, A.W. Crocker, ... & L. Davis. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biology, 2019.

6. M.F. Lensink, G. Brysbaert, N. Nadzirin, S. Velankar, R.A. Chaleil, T. Gerguri, ... & R. Kong. Blind prediction of homo‐ and hetero‐protein complexes: The CASP13‐CAPRI experiment. Proteins: Structure, Function, and Bioinformatics, 2019.

7. J. Cheng, M. Choe, A. Elofsson, K. Han, J. Hou, A. Maghrabi, L.J. McGuffin, D. Menéndez-Hurtado, K. Olechnovič, T. Schwede, G. Studer, K. Uziela, Č. Venclovas, B. Wallner. Estimation of model accuracy in CASP13. Proteins, accepted, 2019.

8. J. Hou, B. Adhikari, J. Tanner, J. Cheng*. SAXSDom: Modeling multidomain protein structures using small-angle X-ray scattering data. Proteins, accepted.

9. Guo, Z., Hou, J., & Cheng, J. DNSS2: improved ab initio protein secondary structure prediction using advanced deep learning architectures. Proteins, 89(2), 207-217, 2021.

10. Lawson, C. L., Kryshtafovych, A., Adams, P. D., Afonine, P., Baker, M. L., Barad, B. A., ... & Chojnowski, G. Outcomes of the 2019 EMDataResource model challenge: validation of cryo-EM models at near-atomic resolution. Nature Methods, accepted, 2020.

11. T. Wu, Z. Guo, J. Hou, J. Cheng. DeepDist: real-value inter-residue distance prediction with deep residual convolutional network. BMC Bioinformatics, 22:30, 2021.

12. M. Necci et al. Critical Assessment of Protein Intrinsic Disorder Prediction. Nature Methods, accepted, 2021.

13. C. Chen, T. Wu, Z. Guo, J. Cheng. Combination of deep neural network with attention mechanism enhances the explainability of protein contact prediction. Proteins, accepted, 2021.

14. Guo, Z., Wu, T., Liu, J., Hou, J., & Cheng, J. Improving deep learning-based protein distance prediction in CASP14. Bioinformatics, 2021.

15. Chen, X., Liu, J., Guo, Z., Wu, T., Hou, J., & Cheng, J. Protein model accuracy estimation empowered by deep learning and inter-residue distance prediction in CASP14. Scientific Reports, 2021.

16. Al-Azzawi, A., Ouadou, A., Duan, Y., & Cheng, J. (2020). Auto3DCryoMap: an automated particle alignment approach for 3D cryo-EM density map reconstruction. BMC Bioinformatics, 21(S21).

17. Kryshtafovych, A., Moult, J., Billings, W. M., Della Corte, D., Fidelis, K., Kwon, S., ... & CASP‐COVID participants. (2021). Modeling SARS‐CoV‐2 proteins in the CASP‐commons experiment. Proteins: Structure, Function, and Bioinformatics, 89(12), 1987-1996.

18. Liu, J., Wu, T., Guo, Z., Hou, J., & Cheng, J. (2022). Improving protein tertiary structure prediction by deep learning and distance prediction in CASP14. Proteins: Structure, Function, and Bioinformatics, 90(1), 58-72.

19. Wu, T., Liu, J., Guo, Z., Hou, J., & Cheng, J. (2021). MULTICOM2 open-source protein structure prediction system powered by deep learning and distance prediction. Scientific Reports, 11(1), 1-9.

20. Chen, X., & Cheng, J. (2022). DISTEMA: distance map-based estimation of single protein model accuracy with attentive 2D convolutional neural network. BMC Bioinformatics, 23(3), 1-14.

21. Guo, Z., Liu, J., Wang, Y., Chen, M., Wang, D., Xu, D., Cheng, J. (2023). Diffusion models in bioinformatics and computational biology. Nature Reviews Bioengineering, accepted.

22. Liu, J., Guo, Z., Wu, T., Roy, R.S., Chen, C., Cheng, J. (2023) Improving AlphaFold2-based Protein Tertiary Structure Prediction with MULTICOM in CASP15. Communications Chemistry, accepted.

23. Roy, R., Liu, J., Giri, N., Guo, Z., Cheng, J. (2023). Combining Pairwise Structural Similarity and Deep Learning Interface Contact Prediction to Estimate Protein Complex Model Accuracy in CASP15. Proteins, accepted.

24. Chen, X., Morehead, A., Liu, J., Cheng, J. (2023) A Gated Graph Transformer for Protein Complex Structure Quality Assessment and its Performance in CASP15. Bioinformatics, accepted.

25. Chen, C., Chen, X., Morehead, A., Wu, T., Cheng, J. (2023) 3D-equivariant graph neural networks for protein model quality assessment. Bioinformatics, accepted.

26. Lensink et al. (2023). Impact of AlphaFold on Structure Prediction of Protein Complexes: The CASP15-CAPRI Experiment. Proteins, accepted.

27. Giri, N., R.S. Roy, and J. Cheng, Deep learning for reconstructing protein structures from cryo-EM density maps: Recent advances and future directions. Current Opinion in Structural Biology, 2023. 79: p. 102536.

28. Giri, N. and J. Cheng, Improving Protein–Ligand Interaction Modeling with cryo-EM Data, Templates, and Deep Learning in 2021 Ligand Model Challenge. Biomolecules, 2023. 13(1): p. 132.

Manuscript Preprint in bioRxiv

1. J. Hou, R. Cao, J. Cheng. Deep convolutional neural networks for predicting the quality of single protein structural models. bioRxiv, 590620, 2019. [at BioRxiv]

Book Chapter

1. J. Hou, T. Wu, Z. Guo, F. Quadir, J. Cheng. The MULTICOM protein structure prediction server empowered by deep learning and contact distance prediction. Methods in Mol. Biol., under review, 2019.

Conference Proceedings

1. X. Chen, N. Akhter, Z. Guo, T. Wu, J. Hou, A. Shehu, J. Cheng. Deep ranking in template-free protein structure prediction. The 11th ACM Conference on Bioinformatics and Computational Biology, 2020.

2. Gao, M., Lund-Andersen, P., Morehead, A., Mahmud, S., Chen, C., Chen, X., ... Cheng, J., & Sedova, A. (2021, November). High-Performance Deep Learning Toolbox for Genome-Scale Prediction of Protein Structure and Function. In 2021 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (MLHPC) (pp. 46-57). IEEE.

3. Elham Soltanikazemi, Raj Roy, Farhan Quadir, Nabin Giri, Alex Morehead and Jianlin Cheng. DRLComplex: Reconstruction of Protein Quaternary Structures Using Deep Reinforcement Learning. The International Conference on Intelligent Biology and Medicine (ICIBM), Tampa, Florida, 2023.

4. Chen, X., Morehead, A., Liu, J., Cheng, J. (2023) A Gated Graph Transformer for Protein Complex Structure Quality Assessment and its Performance in CASP15. The International Conference on Intelligent Systems for Molecular Biology, Lyon, France, 2023.

Conference Abstracts, Posters, and Presentations

1. J. Hou, J. Cheng. Deep convolutional neural networks for predicting the quality of single protein structural model. 13th Critical Assessment of Techniques of Protein Structure Prediction (CASP13), Cancun, Mexico, 2018. (abstract)

2. J. Hou, T. Wu, J. Cheng. Improving protein tertiary structure prediction by deep learning, contact prediction, and domain recognition. 13th Critical Assessment of Techniques of Protein Structure Prediction (CASP13), Cancun, Mexico, 2018. (abstract)

3. J. Hou, T. Wu, R. Cao, J. Cheng. CASP13 tertiary structure prediction by the MULTICOM human group. 13th Critical Assessment of Techniques of Protein Structure Prediction (CASP13), Cancun, Mexico, 2018. (abstract and poster)

4. J. Cheng. Protein Structure Modeling Driven by Deep Learning and Contact Prediction. Invited Talk. The CASP13 Conference, Cancun, Mexico, Dec. 1-4, 2018. (invited talk)

5. J. Cheng. Protein Structure Modeling Driven by Deep Learning and Contact Distance Prediction. Invited Talk. Workshop on Bioinformatics and Health Informatics, IEEE Conference on Big Data, 2018. (invited talk)

6. J. Hou. Protein model quality assessment by deep learning. The CASP13 Conference, Cancun, Mexico, 2018. (invited presentation)

7. J. Cheng. Talk. Protein Tertiary Structure Modeling Driven by Deep Learning and Contact Distance Prediction. University of Alabama, 2020.

8. J. Cheng. Talk. Protein Tertiary Structure Modeling Driven by Deep Learning and Contact Distance Prediction in CASP13. International Conference on Advanced Bioinformatics and Biomedical Engineering (ICABB), Seoul, South Korea, 2020.

9. J. Cheng. Talk. Protein Tertiary Structure Modeling Driven by Deep Learning and Contact Distance Prediction in CASP13. ACM-BCB Conference, Niagara Falls, New York, 2019.

10. J. Cheng. Talk. Deep Learning for Protein Structure Prediction, Stowers Institute, Kansas City, 2019.

11. Chloe Jones. Estimating Model Accuracy for Protein Structures Using Artificial Intelligence. Undergraduate Research Forum, University of Missouri, Columbia, 2019. (Abstract).

12. J. Cheng. Introduction to 2019 ACM-BCB Highlights Session. ACM Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB), Niagara Falls, NY, 2019. (Abstract)

13. J. Liu, J. Hou, T. Wu, Z. Guo, J. Cheng. CASP14 Tertiary Structure Prediction by MULTICOM Human Group. The 14th Critical Assessment of Techniques of Protein Structure Prediction (CASP14), Virtual Meeting, 2020.

14. Jian Liu, Xiao Chen, Jie Hou, Tianqi Wu, Zhiye Guo, J. Cheng. Protein model quality assessment with deep learning and residue-residue contact and distance predictions. The 14th Critical Assessment of Techniques of Protein Structure Prediction (CASP14), Virtual Meeting, 2020.

15. Tianqi Wu, Jian Liu, Zhiye Guo, Jie Hou, J. Cheng. CASP14 Protein Tertiary Structure Prediction by MULTICOM Server Predictors. The 14th Critical Assessment of Techniques of Protein Structure Prediction (CASP14), Virtual Meeting, 2020.

16. Jie Hou, Zhiye Guo, Jian Liu, Tianqi Wu and Jianlin Cheng. Improving protein single-model and consensus quality assessment using inter-residue distance prediction and deep learning. The 14th Critical Assessment of Techniques of Protein Structure Prediction (CASP14), Virtual Meeting, 2020.

17. J. Cheng. Talk. Deep Learning Prediction of Protein Structure and Interaction. NIH, 2021.

18. J. Cheng. Keynote talk. Large-Scale Machine Learning and Optimization for Bioinformatics Data Analysis, 2nd International Workshop on High Performance Computing, Big Data Analytics and Integration for Multi-Omics Biomedical Data, ACM-BCB, 2020.

19. J. Cheng. Talk. Data Assisted Ab Initio Protein Structure Modeling, Satellite session on data assisted protein structure prediction in 14th Critical Assessment of Techniques for Protein Structure Prediction (CASP14), 2020.

20. J. Cheng. Talk. Round-table Discussion on Deep Learning in Protein Structure Prediction, the 14th Critical Assessment of Techniques for Protein Structure Prediction (CASP14), 2020.

21. J. Cheng. Combining Pairwise Similarity and Interface Contact Prediction to Evaluate Protein Assembly Models, The 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15), Antalya, Turkey. (Talk)

22. J. Cheng. Deep Learning Techniques for Protein Structure Refinement and Evaluation, Southern California AI Symposium, University of California, Irvine. (Talk)

23. J. Cheng. Deep Learning Techniques for Protein Structure Refinement and Evaluation, Stockholm University, Sweden, 2022. (Talk)

24. Jian Liu, Zhiye Guo, Xiao Chen, Alex Morehead, Raj S. Roy, Tianqi Wu, Nabin Giri, Farhan Quadir, Chen Chen, Jianlin Cheng. Improving Multimer Structure Prediction by Sensitive Alignment Sampling, Template Identification, Model Ranking, Iterative Refinement in CASP15-CAPRI. CASP15-CAPRI conference, 2022. (abstract)

25. Xiao Chen†, Alex Morehead†, Raj S. Roy†, Zhiye Guo, Jian Liu, Nabin Giri, Tianqi Wu, Chen Chen, Jianlin Cheng. Multimer Model Scoring Based on Gated-Graph Transformer and Steerable Equivariant Graph Neural Networks in CASP15-CAPRI. CASP15-CAPRI conference, 2022. (abstract)

26. Xiao Chen†, Alex Morehead†, Raj S. Roy†, Zhiye Guo, Jian Liu, Nabin Giri, Tianqi Wu, Chen Chen, Jianlin Cheng. Multimer Model Quality Assessment Using Gated-Graph Transformer, Steerable Equivariant Graph Neural Networks, and Pairwise Model Similarity. CASP15 conference, 2022.

27. Jian Liu, Zhiye Guo, Tianqi Wu, Raj S. Roy, Farhan Quadir, Chen Chen, Jianlin Cheng. Improving Assembly Structure Prediction by Sensitive Alignment Sampling, Template Identification, Model Ranking, and Iterative Refinement. CASP15 conference, 2022. (abstract)

28. Jian Liu, Zhiye Guo, Tianqi Wu, Raj S. Roy, Farhan Quadir, Chen Chen, Jianlin Cheng. Improving Tertiary Structure Prediction by Alignment Sampling, Template Identification, Model Ranking, Iterative Refinement, and Protein Interaction-Aware Modeling. CASP15 conference, 2022. (abstract)

29. A. Morehead. A Gated Graph Transformer for Protein Complex Structure Quality Assessment and its Performance in CASP15. The International Conference on Intelligent Systems for Molecular Biology, Lyon, France, 2023. (Talk)

30. J. Cheng. Deep learning protein structure prediction in CASP15, KAUST, Saudi Arabia, 2023. (Talk)

Theses & Dissertations

1. J. Hou. Improving Protein Structure Prediction by Deep Learning and Computational Optimization. PhD Dissertation. University of Missouri – Columbia, 2019.

2. Chen Chen. Protein-DNA interaction prediction and protein structure modeling by machine learning. University of Missouri - Columbia, 2022.

Software and Data

1. MULTICOM: the open source comprehensive protein structure prediction system at GitHub

2. CNNQA: deep convolutional neural networks for predicting the quality of single protein structural models

3. MULTICOM protein structure prediction server with model quality assessment service

4. DeepRank (version 1): deep learning ranking of protein structural models

5. DeepRank3 (version 3): deep learning ranking of protein structural models

6. MULTICOM2 protein tertiary structure prediction system

7. DISTEMA: distance-based estimation of accuracy of protein structure models

8. EnQA: 3D equivariant graph neural networks for evaluating the quality of protein tertiary structures

9. DProQ: 3D equivariant graph neural networks for evaluating the quality of single protein quaternary structures

10. MULTICOM_qa: combining pairwise model similarity and deep learning to evaluate multimer model quality

11. DProQA: a gated graph transformer for evaluating the quality of protein complex structures

12. MULTICOM3 protein monomer and multimer structure prediction

Education

Courses and Seminar

1. University of Missouri Deep Learning Seminar

2. Machine Learning for Biomedical Informatics

The community-wide Critical Assessment of Techniques for Protein Structure Prediction (CASP)

1. Our method was ranked no. 1 in the 13th CASP competition (CASP13) in 2018. [Official Ranking]

2. Our method was ranked no. 1 in ranking protein structure models in terms of GDT-TS loss in the 14th CASP competition (CASP14) in 2020.

3. MULTICOM_qa ranked no. 1 in estimating the global fold accuracy of multimer models in the CASP15 competition in 2022.

Outreach

1. MULTICOM was ranked among the top 3 in the worldwide protein tertiary structure prediction experiment (CASP13) (News)

2. Dr. Cheng gave a TED-like talk on the application of deep learning to protein folding for the general public at the Research Day Conference, College of Engineering, University of Missouri - Columbia (News)

3. MULTICOM was ranked among the top methods in the worldwide protein tertiary structure prediction experiment (CASP14)

4. MULTICOM was ranked among the top methods in CASP15. Mizzou News and MU Engineering News

5. Raj's success in CASP15 was reported in the news in Bangladesh. Bangladesh News

People

Principal Investigator

Dr. Jianlin Cheng

Graduate Students

Jie Hou, Zhiye Guo, Xiao Chen, Raj Roy, Jian Liu, Alex Morehead, Chen Chen

Undergraduate Students

Chloe Jones, Elaine Liu

Contact

Prof. Jianlin Jack Cheng

Director of Bioinformatics and Machine Learning Laboratory (BML)

Professor
Department of Electrical Engineering and Computer Science
Informatics Institute
College of Engineering
University of Missouri, Columbia, MO 65211-2060

Primary Office: EBW 109
Lab: EBN 226, 304, 305, 307
Phone: 573-882-7306
Fax: 573-882-8318
Email: chengji@missouri.edu