CS 5083: Knowledge Discovery and Data Mining

The following is a preliminary schedule for CS 5083 Knowledge Discovery and Data Mining. This schedule will be updated as the semester progresses. Also, although it is not listed on each day, there will be project discussions every class period.

Date Topic Due
Jan 20 (Week 1) What is data mining? Seminar style classes, project discussion
Jan 22 Is data mining just statistics? Project discussion. Readings: Barn-Raising paper; Statistics and Data Mining: Intersecting Disciplines. Project ideas
Jan 27 (Week 2) Snow day!
Jan 29 Fast Algorithms for Mining Association Rules (if link does not work, paper is also on D2L) Summary 1, Vote on Project
Feb 3 (Week 3) Statewide Monitoring of the Mesoscale Environment: A Technical Update on the Oklahoma Mesonet (paper on D2L) Summary 2
Feb 5 A Geographic Information Systems-Based Analysis of Supercells across Oklahoma from 1994 to 2003 (paper on D2L) Summary 3
Feb 10 (Week 4) Catch up on the GIS paper and project discussion  
Feb 12 Very Simple Classification Rules Perform Well on Most Commonly Used Datasets Summary 4
Feb 17 (Week 5) Random Forests Summary 5
Feb 19 Exploiting Relational Structure to Understand Publication Patterns in High-Energy Physics Summary 6
Feb 24 (Week 6) Handling Missing Features when Applying Classification Models Summary 7
Feb 26 Constrained K-means Clustering with Background Knowledge Summary 8
Mar 3 (Week 7) Clustering Distributed Time Series in Sensor Networks Summary 9
Mar 5 Pick a Keogh paper Summary 10
Mar 10 (Week 8) iSAX: Indexing and Mining Terabyte Sized Time Series. Additional information is here. Summary 11
Mar 12    
Mar 14-22 Spring Break!
Mar 24 (Week 9) A tutorial on Principal Component Analysis Summary 12
Mar 26 Can Complex Network Metrics Predict the Behavior of NBA Teams? (if link does not work, paper is also on D2L) Summary 13
Mar 31 (Week 10) Predicting Future Decision Trees from Evolving Data (paper on D2L) Summary 14
Apr 2 The NFL Coaching Network: Analysis of the Social Network Among Professional Football Coaches Summary 15
Apr 7 (Week 11) Project discussion (Dr. Basara leading class)
Apr 9 Temporal-Relational Classifiers for Prediction in Evolving Domains Summary 16
Apr 14 (Week 12) Learning Relational Probability Trees Summary 17
Apr 16 Spatiotemporal Relational Probability Trees Summary 18
Apr 21 (Week 13) Learn all you can about PageRank. Summary 19
Apr 23 Joke Retrieval: Recognizing the Same Joke Told Differently Summary 20
Apr 28 (Week 14) Finding Text Reuse on the Web Summary 21
Apr 30 C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling Summary 22
May 5 (Week 15) Project work  
May 7 Presentations at NWC 3902
May 15 No final!