CMP SC 8370: Data Mining and Knowledge Discovery
Instructor: Dr. Jianlin Cheng
Location: Engineering Building West 353, Time: MoWe 4:00 pm - 5:15 pm,
Office Hours: MoWe 2:30 pm - 3:30 pm, Semester: Spring 2016

Syllabus
Lecture Slides  
Acknowledgements: these slides are largely adapted and customized from the textbook's slides.
1. Data Mining Concepts and Process 
2. Data Preprocessing 
3. Frequent Pattern Mining
4. Classification and Prediction
5. Cluster Analysis
6. Network Mining
Textbook
Han and Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann.
Reading Materials and Other Resources 
1. A portal web site of the data mining community (news, tools, data, jobs, trends)
2. Chapters of the textbook covered in class (self-reading, not graded)
3. R Statistics Computing Software
4. Weka open source data mining software
5. The vote data set used to demo both classification and clustering with Weka
6. RapidMiner open source data mining software
7. Cloud computing
8. Data Mining Theory
9. Data Science Portal
10. LinkedIn Business Data Mining Group
Assignments 
All the assignments should be submitted to mudatamining@gmail.com.
Assignment 1 (15 points), due by the end of Feb. 8.
Assignment 2 (20 points), due by Feb. 18. (Here are a couple of examples of how to draw plots in R, which may be useful.)
Assignment 3 (35 points), due by March 1.
Assignment 4 (10 points), due by March 15. Your group members' names are required; other information about the project you choose is optional. Each group may have up to four students. Only one member of each group should send the information to mudatamining@gmail.com on behalf of all members.
Projects 
The description of project 1 - customer relation prediction
The description of project 2 - new customer recognition
The description of project 3 - internet query classification
Project 4 - image data mining: face vs. other object recognition
Project 5 - social network mining
The final report of your project is due on May 14. The report must be at least three pages long and should include a title, author names, an abstract, an introduction to the problem, methods, results, a conclusion, and, optionally, references. Please submit your report to mudatamining@gmail.com.
Related Courses taught by Prof. Jianlin Cheng
Supervised Machine Learning
Computational Modeling of Molecular Structures
Data Mining and Knowledge Discovery
Machine Learning for Bioinformatics
Problem Solving in Bioinformatics
Computational Optimization Methods