NSF ABI Project: Deep Learning Methods for Protein Bioinformatics

[Research] [Software and Data] [Education] [Outreach] [People]

Research

Journal Publications

1. J. Hou, B. Adhikari, J. Cheng. DeepSF: deep convolutional neural network for mapping protein sequences to folds. Bioinformatics, 34(8):1295–1303, 2018. [at Bioinformatics]

2. J. Hou, T. Wu, R. Cao, J. Cheng. Protein tertiary structure modeling driven by deep learning and contact distance prediction in CASP13. Proteins, accepted. [at Proteins]

3. A. Al-Azzawi, A. Ouadou, J. Cheng. Super Clustering Approach for Fully Automated Single Particle Picking in Cryo-EM. Genes, in press.

4. A. Al-Azzawi, A. Ouadou, J.J. Tanner, J. Cheng. AutoCryoPicker: an unsupervised learning approach for fully automated single particle picking in cryo-EM images. BMC Bioinformatics, accepted.

5. T. Wu, J. Hou, B. Adhikari and J. Cheng. Elucidating key determinants of deep learning-based inter-residue contact distance prediction. Bioinformatics, 36(4), 1091-1098, 2020.

6. Yan, J., Cheng, J., Kurgan, L., Uversky, V. N. (2019). Structural and functional analysis of “non-smelly” proteins. Cellular and Molecular Life Sciences, 1-18.

7. Lensink, M. F., Brysbaert, G., Nadzirin, N., Velankar, S., Chaleil, R. A., Gerguri, T., ..., Kong, R. (2019). Blind prediction of homo‐ and hetero‐protein complexes: The CASP13‐CAPRI experiment. Proteins: Structure, Function, and Bioinformatics, 87(12), 1200-1221.

8. Zhou, N., Jiang, Y., Bergquist, T. R., Lee, A. J., Kacsoh, B. Z., Crocker, A. W., ..., Davis, L. (2019). The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome biology, 20(1), 1-23.

9. Hou, J., Adhikari, B., Tanner, J. J., Cheng, J. (2020). SAXSDom: Modeling multidomain protein structures using small‐angle X‐ray scattering data. Proteins: Structure, Function, and Bioinformatics, 88(6), 775-787.

10. S. Dong, S.A. Moritz, J. Pfab, J. Hou, R. Cao, L. Wang, T. Wu, J. Cheng. Deep learning to predict protein backbone structure from high-resolution cryo-EM density maps. Scientific Reports, 10(1):1-22, 2020.

11. Chen, C., Hou, J., Tanner, J.J. and Cheng, J., 2020. Bioinformatics methods for mass spectrometry-based proteomics data analysis. International Journal of Molecular Sciences, 21(8), p.2873.

12. J. Hou, Z. Guo, and J. Cheng. DNSS2: improved ab initio protein secondary structure prediction using advanced deep learning architectures. BioRxiv, 2019.

13. Al-Azzawi, A., Ouadou, A., Max, H., Duan, Y., Tanner, J. J., & Cheng, J. DeepCryoPicker: fully automated deep neural network for single protein particle picking in cryo-EM. BMC Bioinformatics, 21(1), 1-38, 2020.

14. Lawson, C. L., Kryshtafovych, A., Adams, P. D., Afonine, P., Baker, M. L., Barad, B. A., ... & Chojnowski, G. Outcomes of the 2019 EMDataResource model challenge: validation of cryo-EM models at near-atomic resolution. Nature Methods, accepted, 2020.

15. Adil Al-Azzawi, Anes Ouadou, Ye Duan, and Jianlin Cheng. Auto3DCryoMap: An Automated Particle Alignment Approach for 3D cryo-EM Density Map Reconstruction. BMC Bioinformatics, 21(21), 1-26, 2020.

16. T. Wu, Z. Guo, J. Hou, J. Cheng. DeepDist: real-value inter-residue distance prediction with deep residual convolutional network. BMC Bioinformatics, 22:30, 2021.

17. M. Necci et al. Critical Assessment of Protein Intrinsic Disorder Prediction. Nature Methods, accepted, 2021.

18. C. Chen, T. Wu, Z. Guo, J. Cheng. Combination of deep neural network with attention mechanism enhances the explainability of protein contact prediction. Proteins, accepted, 2021.

19. Guo, Z., Wu, T., Liu, J., Hou, J., & Cheng, J. Improving deep learning-based protein distance prediction in CASP14. Bioinformatics, 2021.

20. Chen, X., Liu, J., Guo, Z., Wu, T., Hou, J., & Cheng, J. Protein model accuracy estimation empowered by deep learning and inter-residue distance prediction in CASP14. Scientific Reports, 2021.

Book Chapter

1. J. Hou, T. Wu, Z. Guo, F. Quadir, J. Cheng. The MULTICOM protein structure prediction server empowered by deep learning and contact distance prediction. Methods in Mol. Biol., 2020.

BioRxiv Preprint

1. Hong, Y., Deng, Y., Cui, H., Segert, J., Cheng, J. (2020). Classifying protein structures into folds by convolutional neural networks, distance maps, and persistent homology. BioRxiv.

2. Liu, J., Wu, T., Guo, Z., Hou, J., & Cheng, J. (2021). Improving protein tertiary structure prediction by deep learning and distance prediction in CASP14. bioRxiv.

Conference Proceedings

1. A. Al-Azzawi, A. Ouadou, J. Cheng. Super Clustering Approach for Fully Automated Single Particle Picking in Cryo-EM. International Conference on Intelligent Biology and Medicine (ICIBM), Columbus, OH, 2019.

2. Adil Al-Azzawi, Anes Ouadou, Ye Duan, and Jianlin Cheng. Auto3DCryoMap: An Automated Particle Alignment Approach for 3D cryo-EM Density Map Reconstruction. ICIBM 2020.

Conference Abstracts, Posters, and Presentations

1. B. Adhikari, J. Hou, J. Cheng. Improved protein contact prediction using two-level deep convolutional neural networks. 26th Conference on Intelligent Systems for Molecular Biology, Chicago, IL, USA, 2018. (Talk)

2. J. Hou, B. Adhikari, and J. Cheng. DeepSF: deep convolutional neural network for mapping protein sequences to folds. In Proceedings of the 9th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, Boston, MA, USA, 2018. (Highlight Talk)

3. T. Wu, J. Hou, B. Adhikari, J. Cheng. Deep convolutional neural networks for improving protein contact prediction. 13th Critical Assessment of Techniques of Protein Structure Prediction (CASP13), Cancun, Mexico, 2018. (abstract)

4. J. Hou, T. Wu, J. Cheng. Improving protein tertiary structure prediction by deep learning, contact prediction, and domain recognition. 13th Critical Assessment of Techniques of Protein Structure Prediction (CASP13), Cancun, Mexico, 2018. (abstract)

5. J. Hou, T. Wu, R. Cao, J. Cheng. CASP13 tertiary structure prediction by the MULTICOM human group. 13th Critical Assessment of Techniques of Protein Structure Prediction (CASP13), Cancun, Mexico, 2018. (abstract and poster)

6. J. Hou, R. Cao, J. Cheng. CASP13 tertiary structure prediction by wfAll-Cheng of the WeFold collaborative. 13th Critical Assessment of Techniques of Protein Structure Prediction (CASP13), Cancun, Mexico, 2018. (abstract)

7. J. Cheng. Protein Structure Modeling Driven by Deep Learning and Contact Prediction. The CASP13 Conference, Cancun, Mexico, Dec. 1-4, 2018. (Invited talk)

8. J. Cheng. Deep Learning for Protein Contact/Distance Prediction, Roundtable discussion, The CASP13 Conference, Cancun, Mexico, Dec. 1-4, 2018. (Round table discussion and presentation)

9. J. Cheng. Protein Structure Modeling Driven by Deep Learning and Contact Distance Prediction. Invited Talk. Workshop on Health Informatics, IEEE Conference on Big Data, 2018. (Invited talk)

10. J. Cheng. Distance-based ab initio protein structure prediction driven by deep learning. The workshop on deep learning algorithms and applications, Copenhagen, Denmark, 2019. (Invited talk)

11. J. Cheng. Protein Tertiary Structure Modeling Driven by Deep Learning and Contact Distance Prediction in CASP13. ACM Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB), Niagara Falls, NY, 2019. (abstract)

12. J. Cheng. Introduction to 2019 ACM-BCB Highlights Session. ACM Conference on Bioinformatics, Computational Biology and Health Informatics (ACM-BCB), Niagara Falls, NY, 2019. (abstract)

13. Deep Learning Prediction of Protein Contact and Distance, International Society for Computational Biology (ISCB), 2020. (talk)

14. Protein Tertiary Structure Modeling Driven by Deep Learning and Contact Distance Prediction. University of Alabama, 2020. (talk).

15. Protein Tertiary Structure Modeling Driven by Deep Learning and Contact Distance Prediction in CASP13. ACM-BCB Conference, Niagara Falls, New York, 2019. (talk).

16. Deep Learning for Protein Structure Prediction, Stowers Institute, Kansas City, 2019.

17. J. Liu, J. Hou, T. Wu, Z. Guo, J. Cheng. CASP14 Tertiary Structure Prediction by MULTICOM Human Group. The 14th Critical Assessment of Techniques of Protein Structure Prediction (CASP14), Virtual Meeting, 2020.

18. Tianqi Wu, Jian Liu, Zhiye Guo, Jie Hou, J. Cheng. CASP14 Protein Tertiary Structure Prediction by MULTICOM Server Predictors. The 14th Critical Assessment of Techniques of Protein Structure Prediction (CASP14), Virtual Meeting, 2020.

19. Zhiye Guo, Tianqi Wu, Jian Liu, Jie Hou, Jianlin Cheng. Prediction of protein inter-residue distance and contacts with deep learning. The 14th Critical Assessment of Techniques of Protein Structure Prediction (CASP14), Virtual Meeting, 2020.

Theses & Dissertations

1. Y. Hong. PRO3DCNN: Convolutional Neural Networks for Mapping Protein Structure into Folds. Master’s Thesis. University of Missouri – Columbia, 2019.

2. J. Hou. Improving Protein Structure Prediction by Deep Learning and Computational Optimization. PhD Dissertation. University of Missouri – Columbia, 2019.

3. A. Al-Azzawi. Fully Automated Deep Supervised and Unsupervised Learning Approaches for 3D Protein Cryo-EM Density Map Reconstruction. PhD Dissertation. University of Missouri – Columbia, 2019.

4. Xiangyu Li. CONFOLD New Version: Contact-Guided Ab Initio Protein Folding with New Features. Master’s Thesis. University of Missouri – Columbia, 2019.

Software and Data

1. DNSS2: Deep learning tools for protein secondary structure prediction at GitHub

2. MULTICOM: the open source comprehensive protein structure prediction system at GitHub

3. DeepSF: deep convolutional neural networks for mapping protein sequences to folds at GitHub

4. DNCON2: deep learning prediction of protein residue-residue contacts

5. PRO3DCNN: convolutional neural networks for mapping protein structures to folds

6. MULTICOM protein structure prediction server with model quality assessment service

7. SAXSDom: protein domain assembly using SAXS data

8. DNCON4: protein contact map prediction

9. DeepCryoPicker: deep learning picking of protein particles in cryo-EM images

10. DeepDist: deep learning protein distance prediction

11. DFold: distance-based protein structure modeling with simulated annealing

12. Auto3DCryoMap: automatically reconstructing 3D protein density maps from cryo-EM protein particle images

13. GFOLD: gradient descent-based modeling of protein structure

14. MULTICOM2: the second version of MULTICOM protein structure prediction system

15. Deep learning interpretation of protein contact prediction and folding

Education

Courses and Seminar

1. Machine Learning for Bioinformatics

2. Computational Modeling of Molecular Structures

3. University of Missouri Deep Learning Seminar

The community-wide Critical Assessment of Techniques for Protein Structure Prediction (CASP)

1. Our MULTICOM method was ranked 3rd in the 13th CASP competition (CASP13) in 2018. [Official Ranking]

2. Our MULTICOM method was ranked 3rd in inter-domain protein structure prediction and 7th in protein tertiary structure prediction in the 14th CASP competition (CASP14) in 2020. [Official Ranking]

Outreach

1. MULTICOM was ranked among the top 3 in worldwide protein tertiary structure prediction (CASP13) (News)

2. Dr. Cheng gave a TED-like talk on the application of deep learning to protein folding for the general public at the Research Day Conference, College of Engineering, University of Missouri - Columbia (News)

3. MULTICOM was ranked among the top groups in worldwide protein tertiary structure prediction (CASP14) (News)

People

Principal Investigator

Dr. Jianlin Cheng

Graduate Students

Jie Hou, Tianqi Wu, Zhiye Guo, Yechan Hong, Farhan Quadir, Carlos Martinez Villar, Chen Chen, Adil Al-Azzawi, Xiangyu Li, Alex Morehead, Jian Liu

Undergraduate Students

Yongyu Deng, Haofan Cui, Royal Sanders

Contact

Prof. Jianlin Jack Cheng

Director of Bioinformatics, Data Mining and Machine Learning Laboratory

Professor
Department of Electrical Engineering and Computer Science
Informatics Institute
College of Engineering
University of Missouri, Columbia, MO 65211-2060

Primary Office: EBW 109
Lab: E1425 Lafferre Hall and 250 Naka Hall
Phone: 573-882-7306
Fax: 573-882-8318
Email: chengji@missouri.edu