In CMPT 726 and CMPT 732, students learned machine learning algorithms and big data programming tools. However, when facing a real-world data problem, we find that there is still a gap between what machine learning and data engineering address and what we actually need to do in practice.
The goal of this course is to fill this gap, enabling students to apply what they have learned to solve real-world problems. To achieve this goal, the course covers a set of important topics that a data scientist should know and teaches state-of-the-art approaches. After taking this course, students should feel confident when asked to extract value from real-world data sets, and should know how to ask interesting questions about data, how to choose proper tools, how to design data-processing pipelines, and how to present final data products.
Week  Date  Event Type  Description  Course Materials 
Week 1  Thursday Jan 5  Lecture 1  Course Introduction: What/Why Data Science?, Data Science Lifecycle, Questions that data scientists can answer, Course Logistics  [slides]
Thursday Jan 12/16  A1 due  Assignment #1-1 and #1-2 due  [Assignment #1 (Part 1)] [Assignment #1 (Part 2)]

Week 2  Thursday Jan 12  Lecture 2  Data Preparation: Data Collection, Data Transformation, Data Cleaning, Data Integration  [slides]
Thursday Jan 23  A2 due  Assignment #2 due  [Assignment #2 (Part 1)] [Assignment #2 (Part 2)]

Week 3  Thursday Jan 19  Lecture 3  Data Visualization (Part I): Introduction to Visualization, Exploratory Data Analysis, Visualization Principles  [slides]
Tuesday Jan 31  A3 due  Assignment #3 due  [Assignment #3]
Week 4  Thursday Jan 26  Lecture 4  Statistics (Part I): Statistical Thinking, EDA  [slides]
Monday Feb 6  A4 due  Assignment #4 due  [Assignment #4]
Friday Feb 10  Blog Post due  Blog Post due  [Blog Post Task]
Week 5  Thursday Feb 2  Lecture 5  Practical Machine Learning (Part I)  [slides]
Tuesday Feb 14  A5 due  Assignment #5 due  [Assignment #5]
Week 6  Thursday Feb 9  Lecture 6  Deep Learning (Part I)  [slides]
Wednesday Feb 22  A6 due  Assignment #6 due  [Assignment #6]
Week 7  Thursday Feb 16  Lecture 7  Data Visualization (Part II): Design Principles, Graph Drawing  [slides 1][slides 2]
Friday Feb 17  Proposal due  Course Project Proposal due
Tuesday Mar 7  A7 due  Assignment #7 due  [Assignment #7]
Week 8  Reading Break (Feb 21-26)  Suggestion: Start final project

Week 9  Thursday Mar 2  Lecture 8  Practical Machine Learning (Part II)  [slides 1][slides 2]
Monday Mar 20  A8 due  Assignment #8 due  [Assignment #8 (Part 1)] [Assignment #8 (Part 2)]

Week 10  Thursday Mar 9  Milestone Presentation

Week 10  Thursday Mar 16  Lecture 9  Statistics (Part II)  [slides 1][slides 2]
Monday Mar 27  A9 due  Assignment #9 due  [Assignment #9 (Part 1)] [Assignment #9 (Part 2)]

Week 11  Thursday Mar 23  Lecture 10  Deep Learning (Part II), Natural Language Processing  [slides DL] [slides NLP] [NLP notebook]
Monday Apr 3  A10 due  Assignment #10 due  [Assignment #10]
Week 12  Thursday Mar 30  Lecture 11  Responsible Data Science  [slides 1] [slides 2]
Thursday Mar 30 Monday Apr 17  A11 due  Assignment #11 due  [Assignment #11 (Part 1)] [Assignment #11 (Part 2)]

Week 13+  Tuesday Apr 11  Final Project Presentation & Code  Course Project Presentation
Wednesday Apr 12  Report & Video due  Course Project Report due
© Jiannan Wang & Steven Bergner 2023