Bioinformatics and Machine Learning Laboratory (BML)

Overview:

The research in my lab focuses on developing machine learning, deep learning, and artificial intelligence (AI) methods to analyze biological and medical data and address fundamental problems in the biological and medical sciences. Currently, we are developing bioinformatics algorithms and tools for protein structure, interaction, and function prediction, 3D genomics, biological network modeling, and omics data analysis. We have active projects in protein structure, interaction, and function prediction, protein and drug design, cryo-electron microscopy (cryo-EM) data analysis, 3D genome structure modeling, inference and simulation of biological networks and systems, and multi-omics (transcriptomics, genomics, epigenomics, and proteomics) data analysis. These projects are funded by the National Institutes of Health (NIH), the National Science Foundation (NSF), the Department of Energy (DOE), and the US Department of Agriculture (USDA).

The main techniques that we are developing include deep learning, artificial intelligence (AI), machine learning, data mining, optimization, and high-performance computing (cloud computing and GPU). Our AI and bioinformatics tools, web services, and datasets are freely available. Our MULTICOM suite for the prediction of protein structure and structural features was ranked among the best methods in the last ten community-wide biennial Critical Assessments of Techniques for Protein Structure Prediction (CASP7, 8, 9, 10, 11, 12, 13, 14, 15, and 16, held in 2006, 2008, 2010, 2012, 2014, 2016, 2018, 2020, 2022, and 2024, respectively).

The citations to our research papers according to Google Scholar

Highlights:

In 2024, during the latest CASP16 competition, our MULTICOM predictors were ranked among the top in five major categories: (a) Phase 0 protein complex structure prediction without stoichiometry information (no. 1), (b) Phase 1 protein complex structure prediction with stoichiometry information (no. 3), (c) tertiary structure prediction (no. 2), (d) protein model accuracy estimation (no. 2 in global fold accuracy estimation and no. 1 in ranking homo-multimer structures), and (e) protein-ligand structure (pose) and binding affinity prediction (no. 5).