The following is a preliminary schedule for CS 5083 Knowledge Discovery and Data Mining for Spring 2012. This schedule will be updated as the semester progresses.
|Jan 17 (Week 1: Introduction)||Introduction, what is data mining, how the class works, project discussion|
|Jan 19||What is data mining? Project discussion||
|Jan 24 (Week 2: )||
|Jan 26||What is data mining? Ethics and data mining, Is data mining just statistics? Project discussion||
Finish chapter 1 from the book
|Find and read a paper on ethics and data mining||Project vote|
|Jan 31 (Week 3: )||Association rules||Fast Algorithms for Mining Association Rules||Summary 1|
|Feb 2||Algorithm overview||Chapter 4||Summary 2|
|Feb 7 (Week 4: )||Evaluation||Chapter 5||Summary 3|
|Feb 9||Real algorithms||Chapter 6: 1st half||Summary 4|
|Feb 14 (Week 5: )||Real algorithms||Chapter 6: through section 2.4||Project updates|
|Feb 16||Real algorithms||Chapter 6: 2nd half||Summary 5|
|Feb 21 (Week 6: )||Finish real algorithms||Project updates|
|Feb 23||Visit from our data source representative|
|Feb 28 (Week 7: )||Logical Shapelets||Summary 6|
|Mar 1||Indexing time series efficiently||iSAX 2.0: Indexing and Mining One Billion Time Series||Summary 7|
|Mar 6 (Week 8: )||Time series paper presentations||
Tim: Qiang Zhu and Eamonn Keogh (2010) Using CAPTCHAs to Index Cultural Artifacts. The Ninth International Symposium on Intelligent Data Analysis.
Chris: Li Wei and Eamonn Keogh (2006) Semi-Supervised Time Series Classification. SIGKDD 2006.
Scott: Lin, J., Keogh, E., Lonardi, S. & Chiu, B. (2003) A Symbolic Representation of Time Series, with Implications for Streaming Algorithms.
Caleb: Eamonn Keogh, Li Wei, Xiaopeng Xi, Stefano Lonardi, Jin Shieh, Scott Sirowy (2006). Intelligent Icons: Integrating Lite-Weight Data Mining and Visualization into GUI Operating Systems.
Carlos: Jin Shieh and Eamonn Keogh (2008) iSAX: Indexing and Mining Terabyte Sized Time Series. SIGKDD 2008.
Allen: Bing Hu, Thanawin Rakthanmanon, Yuan Hao, Scott Evans, Stefano Lonardi, and Eamonn Keogh (2011). Discovering the Intrinsic Cardinality and Dimensionality of Time Series using MDL.
Wayne: Gustavo Batista, Xiaoyue Wang and Eamonn J. Keogh (2011) A Complexity-Invariant Distance Measure for Time Series.
James: E. Keogh, J. Lin and A. Fu (2005). HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence.
Sonya: Thanawin Rakthanmanon, Eamonn Keogh, Stefano Lonardi, and Scott Evans (2011). Time Series Epenthesis: Clustering Time Series Streams Requires Ignoring Some Data.
Nathan: Abdullah Mueen, Eamonn Keogh, Qiang Zhu, Sydney Cash, Brandon Westover (2009). Exact Discovery of Time Series Motifs.
|Mar 8||Finish up presentations from Tuesday|
|Mar 13 (Week 9: )||Multi-dimensional time series mining||Identifying Predictive Multi-Dimensional Time Series Motifs: An application to severe weather prediction||Summary 8, pick a good paper to help with the project|
|Mar 15||Multi-dimensional time series mining||Summary 9|
|Mar 27 (Week 10: )||Data transformations and SVMs||Chapter 7 of the book, Support Vector Machines: Hype or Hallelujah?||Summaries on both papers|
|Mar 29||SVMs on real data||David Goldberg's thesis||Summary|
|Apr 3 (Week 11: )||Random Forests||Summary 13|
|Apr 10 (Week 12: )||
|Apr 17 (Week 13: )||Student papers||Fast and Flexible Multivariate Time Series Subsequence Search|
Cancelled: Dr. McGovern sick
|Apr 24 (Week 14: )||Student papers||Clustering Very Large Multi-dimensional Datasets with MapReduce and|
|Apr 26||Student papers||Finish discussion of Causality Quantification and Its Applications: Structuring and Modeling of Multivariate Time Series; then Mining periodic behaviors for moving objects; then Large Linear Classification When Data Cannot Fit In Memory|
|May 1 (Week 15: )|
|May 3 (Course wrapup)||Final presentations||Final presentations|