Machine Learning Methods for Biomedical Informatics
Instructor: Prof. Jianlin Cheng
Location: Physics Building 120, Time: TuTh 11:00 am - 12:15 pm, Office Hours: TuTh 1:30 - 2:00, Semester: Fall 2018
Lecture Notes
1. HMM Theory
2. HMM Application in Bioinformatics [PDF], [PPT]
3. Neural Network and Deep Learning Theory
4. Neural Network and Deep Learning Applications in Bioinformatics
5. Support Vector Machine Theory
6. Support Vector Machine Applications in Bioinformatics
7. Bayesian Network Theory (Introduction)
Reading Assignments
(1) Hidden Markov Model Theory and its Application in Bioinformatics (e.g. sequence and profile alignment)
L. Page, S. Brin, R. Motwani, T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. Technical Report. Stanford InfoLab, 1999.
L. R. Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE. 1989. (pages 257-266)
A. Krogh, M. Brown, I. S. Mian, K. Sjolander, and D. Haussler. Hidden Markov Models in Computational Biology (Applications to Protein Modeling). Journal of Molecular Biology. 1993.
S. R. Eddy. Profile Hidden Markov Models. Bioinformatics. 1998.
J. Söding. Protein Homology Detection by HMM-HMM Comparison. Bioinformatics. 2005.
M. Remmert, A. Biegert, A. Hauser, J. Söding. HHblits: Lightning-fast Iterative Protein Sequence Searching by HMM-HMM Alignment. Nature Methods. 9(2):173-175, 2011.
(2) Neural Network Theory and its Application in Bioinformatics (e.g. protein secondary structure prediction)
Free Online Book on Deep Learning by Goodfellow, Bengio and Courville
D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning Representations by Back-propagating Errors. Nature. 1986.
Y. LeCun, Y. Bengio, G. E. Hinton. Deep Learning. Nature. 521:436-444, 2015.
(3) Support Vector Machine Theory and its Application in Bioinformatics (e.g. protein fold recognition)
C. J. C. Burges. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery. 1998.
J. Cheng and P. Baldi. A Machine Learning Information Retrieval Approach to Protein Fold Recognition. Bioinformatics. 2006.
Projects
There is one comprehensive group project. Each group selects one project from the following options:
(1) Multiple sequence alignment using HMM
(2) Secondary structure prediction or fold classification using deep learning
(3) Protein residue-residue contact prediction using deep learning
(4) Cancer classification using support vector machine
Presentation
Each group or person has 25 minutes to present the selected project (about 20 minutes for the presentation and 5 minutes for questions).