Bioinformatics, Data Mining, Machine Learning Laboratory (BDML)

The research in my lab focuses on developing and applying computational and statistical methods (e.g. machine learning and data mining techniques) to address problems in biomedical sciences. Currently we are developing bioinformatics algorithms and tools for protein structure and function prediction, systems biology, genomics, and epigenomics. We have active projects in protein structure prediction, genome structure modeling, inference and simulation of biological networks, protein interaction and docking, protein function prediction, biological sequence alignment, RNA-seq and microarray gene expression data analysis, cancer genomics and epigenomics, and plant and animal bioinformatics. These projects are funded by the National Institutes of Health (NIH), the National Science Foundation (NSF), and the Department of Energy (DOE).

The main techniques we use include computational optimization methods, neural networks, deep learning networks, support vector machines, hidden Markov models, graphical models, kernel methods, clustering methods, graph algorithms, dynamic programming, differential equations, information theory, data mining methods, and (Bayesian) statistical methods. The bioinformatics tools, web services, and datasets produced by our research are freely available. Our automated tools for the prediction of protein tertiary structure, domain boundaries, disordered regions, and contact maps were ranked among the best methods in the last four community-wide biennial Critical Assessment of Techniques for Protein Structure Prediction experiments (CASP7, 8, 9, and 10), held in 2006, 2008, 2010, and 2012, respectively. Our protein function prediction method (MULTICOM-PDCN) was ranked among the best methods during the 2010-2011 Critical Assessment of Protein Function Annotation (CAFA).

The citations to our research papers according to Google Scholar

Highlights:

In 2012 our MULTICOM protein structure prediction system was ranked among the best methods in protein tertiary structure modeling, protein model refinement, protein model quality assessment, and protein contact map prediction during the 10th Critical Assessment of Techniques for Protein Structure Prediction (CASP10).

From 2010 to 2011, our protein function prediction tool MULTICOM-PDCN was ranked among the top methods during the Critical Assessment of Protein Function Annotation (CAFA).

In 2010 our MULTICOM protein structure prediction system was ranked among the best methods in template-based structure modeling, template-free structure modeling, protein model quality assessment, protein disorder region prediction, and protein contact map prediction during the 9th Critical Assessment of Techniques for Protein Structure Prediction (CASP9).

In 2008 our MULTICOM protein structure prediction methods were ranked among the best methods in template-based structure modeling, template-free structure modeling, protein model quality assessment, protein disorder region prediction, protein domain boundary prediction, and protein contact map prediction during the 8th Critical Assessment of Techniques for Protein Structure Prediction (CASP8). Dr. Jianlin Cheng was invited to give four talks at the CASP8 meeting in Cagliari, Sardinia, Italy, Dec 3-7, 2008. [CASP8 template-free modeling talk]; [CASP8 template-based modeling talk]; [CASP8 model quality assessment talk]; [CASP8 disorder prediction talk].