Data Mining Techniques (464.506) : Spring 2012

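Notice: The undergraduate data mining course is scheduled to be offered this fall semester.
Notice: All course materials are encrypted with the password announced in class.
Notice: The room for the final project presentation has changed from Room 309 to Room 327 in Building 39.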
  1. Instructor
  2. Prof. Sungzoon Cho  (zoonsnu.ac.kr) 
  3. Time and Place
  4. Texts
  5. Evaluation
  6. Teaching Assistant (TA)
  7. Eunjeong Lucy Park  (ejpark04snu.ac.kr)
  8. Course schedule 
  9. Week Topic Reading Assignment* Homework**
    1 3/6 (Tue) - Introduction
    3/8 (Thu) - Introduction HAN 1.1~1.7
    2 3/13 (Tue) - Business Problem: Marketing
    • Introductory Paper Submission: Hard-copy to class
    ***
    BER p.461-471
    SHK p.317-349
    BLA p.13-15
    Self Introduction Paper
    3/15 (Thu) - Business Problem: Manufacturing ***
    3 3/20 (Tue) - Business Problem: Risk Management ***
    SHK p.317-349
    3/22 (Thu) - Data exploration
    • Visualization tool introduction: Spotfire
    HAN 2.1~2.3
    4 3/27 (Tue) - Data exploration, Preprocessing HAN 2.4~3.3
    3/29 (Thu) - Preprocessing
    HAN 3.4~3.5
    5 4/3 (Tue) - Mining Frequent Patterns, Associations and Correlations HAN 6.1~6.3
    4/5 (Thu) - Intro: Classification and Prediction, Regression (1) HAN 8.1
    SHU 6.1~6.3, 10.1~10.2
    6 4/10 (Tue) - Intro: Classification and Prediction, Regression (2)
    4/12 (Thu) - Intro: Cluster Analysis HAN 10.1~10.2, 10.6
    7 4/17 (Tue) - Lab: Matlab Experiments
    4/19 (Thu) - Midterm Exam
    • Location: 43-1-402
    8 4/24 (Tue) - No class (Spring BI Data Mining Conference)
    4/26 (Thu) - Classification and Prediction: Decision Trees + Term Project Proposal Presentation (extended class)
    • Report Format: Presentation (max 10 slides)
    • Required Contents:
      • Business background
      • Business problem
      • Data mining process
      • How to obtain the data
      • Models to use
      • Expected data mining results
      • Expected business implications
    • File name example: DMT_Team11_20120425_Data-segmentation-for-semiconductor-manufacturing-data_Proposal.pptx
    • Submission:
      • Soft-copy: Due 4/25(Wed) 23:59 via email to TA
      • Hard-copy: Due 4/26(Thu) 16:00 bring to class
    HAN 8.2
    Proposal Report
    9 5/1 (Tue) - Classification and Prediction: Bayes, k-NN + Evaluation (extended class)
    HAN 8.3, 8.5, 9.5
    5/3 (Thu) - Classification and Prediction: Applications
    10 5/8 (Tue) - Cluster Analysis: Hierarchical Methods
    HAN 10.3
    5/10 (Thu) - No class (Spring Industrial Engineering Conference)
    11 5/15 (Tue) - Cluster Analysis: SOM (1) HAY 9.1~9.5
    5/17 (Thu) - Cluster Analysis: SOM (2) HAN 12.1~12.4
    12 5/22 (Tue) - Term Project Progress Presentation (extended class + pizza party)
    • Report Format: Presentation (max 20 slides)
    • Required Contents:
      • Proposal
      • Data exploration, Model used, Data mining results
    • File name example: DMT_Team11_20120521_Data-segmentation-for-semiconductor-manufacturing-data_Progress.pptx
    • Submission:
      • Soft-copy: Due 5/21(Mon) 23:59 via email to TA
      • Hard-copy: Due 5/22(Tue) 16:00 bring to class
    Progress Report
    5/24 (Thu) - Outlier Detection + Classification and Prediction: MLP HAN 9.2
    13 5/29 (Tue) - Classification and Prediction: SVM HAN 9.3
    5/31 (Thu) - Cluster Analysis: Density, Grid, Model-based Methods + Clustering Applications
    HAN 10.4~10.5, 11.1
    14 6/5 (Tue) - Text Mining and its Applications
    6/7 (Thu) - Big Data Analysis
    15 6/12 (Tue) - Image Processing
    • Invited lecture by Prof. Kim Dong-yun (김동윤) of Ajou University
    6/14 (Thu) - Final Exam
    • Coverage: all course material
    • Location: 43-1-402
    16 6/19 (Tue) - No class
    6/21 (Thu) - Term Project Final Presentation (extended class)
    • Time: 15:30-18:00
    • Location: 39-327
    • Report Format: Presentation (max 30 slides)
    • Required Contents:
      • Progress report
      • Enhanced results, Business implications, Future work
    • File name example: DMT_Team11_20120618_Data-segmentation-for-semiconductor-manufacturing-data_Final.pptx
    • Submission:
      • Soft-copy: Due 6/20(Wed) 23:59 via email to TA
      • Hard-copy: Due 6/21(Thu) 16:00 bring to class
    Final Report
    * Quizzes cover the reading assignment. Questions are drawn mainly from italic and bold-face text, figures, and equations.
    ** Late penalty: 10% off per hour after the soft-copy deadline.
    *** For papers, read only the introduction; you do not need to read the full text. Also, NO quiz today!

  10. Board