In CMPT 726 and CMPT 732, students learned machine learning algorithms and big data programming tools. However, when facing a real-world data problem, we find that there is still a gap between what machine learning or data engineering addresses and what we actually have to do in practice.
The goal of this course is to fill this gap, enabling students to apply what they have learned to solve real-world problems. To achieve this goal, the course covers a set of important topics that a data scientist should know and teaches state-of-the-art approaches. After taking this course, students should feel confident when asked to extract value from real-world data sets, and should know how to ask interesting questions about data, choose proper tools, design data-processing pipelines, and present final data products.
| Week | Date | Event Type | Description | Course Materials |
| Week 1 | Thursday Jan 5 | Lecture 1 | Course Introduction: What/Why Data Science?; Data Science Lifecycle; Questions that data scientists can answer; Course Logistics | [slides] |
| | Thursday Jan 12/16 | A1 due | Assignment #1-1 and #1-2 due | [Assignment #1 (Part 1)] [Assignment #1 (Part 2)] |
| Week 2 | Thursday Jan 12 | Lecture 2 | Data Preparation: Data Collection; Data Transformation; Data Cleaning; Data Integration | [slides] |
| | Thursday Jan 23 | A2 due | Assignment #2 due | [Assignment #2 (Part 1)] [Assignment #2 (Part 2)] |
| Week 3 | Thursday Jan 19 | Lecture 3 | Data Visualization (Part I): Introduction to Visualization; Exploratory Data Analysis; Visualization Principles | [slides] |
| | Tuesday Jan 31 | A3 due | Assignment #3 due | [Assignment #3] |
| Week 4 | Thursday Jan 26 | Lecture 4 | Statistics (Part I): Statistical Thinking; EDA | [slides] |
| | Monday Feb 6 | A4 due | Assignment #4 due | [Assignment #4] |
| | Friday Feb 10 | Blog Post due | Blog Post due | [Blog Post Task] |
| Week 5 | Thursday Feb 2 | Lecture 5 | Practical Machine Learning (Part I) | [slides] |
| | Tuesday Feb 14 | A5 due | Assignment #5 due | [Assignment #5] |
| Week 6 | Thursday Feb 9 | Lecture 6 | Deep Learning (Part I) | [slides] |
| | Wednesday Feb 22 | A6 due | Assignment #6 due | [Assignment #6] |
| Week 7 | Thursday Feb 16 | Lecture 7 | Data Visualization (Part II): Design Principles; Graph Drawing | [slides 1] [slides 2] |
| | Friday Feb 17 | Proposal due | Course Project Proposal due | |
| | Tuesday Mar 7 | A7 due | Assignment #7 due | [Assignment #7] |
| Week 8 | Reading Break (Feb 21 - 26) | Suggestion: Start final project | | |
| Week 9 | Thursday Mar 2 | Lecture 8 | Practical Machine Learning (Part II) | [slides 1] [slides 2] |
| | Monday Mar 20 | A8 due | Assignment #8 due | [Assignment #8 (Part 1)] [Assignment #8 (Part 2)] |
| Week 10 | Thursday Mar 9 | Milestone Presentation | | |
| Week 10 | Thursday Mar 16 | Lecture 9 | Statistics (Part II) | [slides 1] [slides 2] |
| | Monday Mar 27 | A9 due | Assignment #9 due | [Assignment #9 (Part 1)] [Assignment #9 (Part 2)] |
| Week 11 | Thursday Mar 23 | Lecture 10 | Deep Learning (Part II), Natural Language Processing | [slides DL] [slides NLP] [NLP notebook] |
| | Monday Apr 3 | A10 due | Assignment #10 due | [Assignment #10] |
| Week 12 | Thursday Mar 30 | Lecture 11 | Responsible Data Science | [slides 1] [slides 2] |
| | Thursday Mar 30 Monday Apr 17 | A11 due | Assignment #11 due | [Assignment #11 (Part 1)] [Assignment #11 (Part 2)] |
| Week 13+ | Tuesday Apr 11 | Final Project Presentation & Code | Course Project Presentation | |
| | Wednesday Apr 12 | Report & Video due | Course Project Report due | |
© Jiannan Wang & Steven Bergner 2023