Data Mining

Hello, I am the module leader for the Data Mining module, which is part of the MSc Artificial Intelligence course. I will be delivering the module's lectures and labs. This page shows the tentative schedule for the module.

Most of the content is based on the book "Introduction to Data Mining" (Second Edition) by Tan, Steinbach, Karpatne and Kumar.

Tentative Schedule

Day 1
  • Lesson 1: Introduction to Data Mining
  • Lesson 2: Introduction to data and its characteristics
  • Lesson 3: Types of data
  • Lesson 4: Data quality
  • Lesson 5: Measures of similarity and dissimilarity
  • Lesson 6: Data preparation and preprocessing
  • Lesson 7: Information Theory basics

Day 2: Classification
  • Lesson 8: Overview of conventional and deep learning classifiers
  • Lesson 9: Ensemble learning
  • Lesson 10: Model overfitting
  • Lesson 11: Model evaluation and imbalanced classes

Day 3: Feature Selection and Extraction
  • Lesson 12: Feature selection
  • Lesson 13: Filter approaches
  • Lesson 14: Wrapper approaches
  • Lesson 15: Embedded approaches
  • Lesson 16: Hybrid approaches
  • Lesson 17: Feature Extraction (SVD, PCA)

Day 4: Clustering Basics and Beyond
  • Lesson 18: Characteristics of data and clusters
  • Lesson 19: Clustering algorithms
  • Lesson 20: Cluster validity measures
  • Lesson 21: Soft clustering
  • Lesson 22: Density-based clustering
  • Lesson 23: Graph-based clustering

Day 5: Developing Good Solutions
  • Lesson 24: Outlier and anomaly detection
  • Lesson 25: False discoveries

Days 6-10: Applications of Data Mining
  • Day 6, Lesson 26: Information Retrieval
  • Day 7, Lesson 27: Natural Language Processing
  • Day 8, Lesson 28: Biomedical Predictive Modelling
  • Day 9, Lesson 29: Image Retrieval
  • Day 10, Lesson 30: Coursework - project showcase videos from industry

Module Delivery

  • Recommended textbook: Introduction to Data Mining
  • Slides: PowerPoint slides will be available
  • Slides with recordings: Short versions of the slides with recordings for every lesson
  • Labs: In MATLAB and/or Python
  • Weekly lab activities: In your preferred programming language
  • Live discussion sessions