Course Information
- Instructor: Prof. Jin Sun
- Time: Monday 3:00 PM - 3:50 PM, Tuesday/Thursday 2:20 PM - 3:35 PM
- Location: Miller Plant Science 2102
- Office Hours: Thursdays 4:00 PM - 5:00 PM or by appointment
Description
This class is an introductory study of the theory and practice of data science. Topics covered include fundamentals of data science, practical libraries to handle data, data collection and cleaning, data visualization and analysis, learning algorithms for classification and regression, unsupervised learning, validation metrics, and applications in computer vision, natural language processing, and recommendation systems.
Learning Outcomes
- Demonstrate understanding of data science pipeline fundamentals.
- Become familiar with relevant data science libraries and software packages.
- Formulate a learning problem from data.
- Evaluate a learning model.
- Gain experience deploying learning models in computer vision, natural language processing, and other application domains.
(FREE) Textbooks
- An Introduction to Statistical Learning
  by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
  ISBN: 9781461471370
  https://www.statlearning.com/
- Python Data Science Handbook
  by Jake VanderPlas
  ISBN: 9781491912157
  https://jakevdp.github.io/PythonDataScienceHandbook/
Syllabus
Course Schedule
Introduction and basics
Week 1 Jan 6,7,9
Topic: Introduction and overview
Week 2 Jan 13,14,16
Topic: Python and data science basics
Readings:
- Whirlwind Tour of Python
- Python Data Science Handbook, Chapter 2
Martin Luther King Jr. Day Jan 20
No class
Data
Week 3 Jan 21,23
Topic: Data collection and processing
Readings:
- Python Data Science Handbook, Chapter 3
Week 4 Jan 27,28,30
Topic: Data visualization
Readings:
- Python Data Science Handbook, Chapter 4
Learning
Week 5 Feb 3,4,6
Topic: Probabilistic learning foundation
Readings:
- Introduction to Statistical Learning, Chapter 2
Week 6 Feb 10,11,13
Topic: Regression
Readings:
- Introduction to Statistical Learning, Chapter 3
Week 7 Feb 17,18,20
Topic: Classification
Readings:
- Introduction to Statistical Learning, Chapter 4
Week 8 Feb 24,25,27
Topic: Unsupervised learning
Readings:
- Introduction to Statistical Learning, Chapter 12
Spring Break Mar 3-7
No class
Week 9 Mar 10,11,13
Topic: Trees and boosting
Readings:
- Introduction to Statistical Learning, Chapter 8
Week 10 Mar 17,18,20
Topic: Validation
Readings:
- Introduction to Statistical Learning, Chapter 5
Applications
Week 11 Mar 24,25,27
Topic: Vision
Week 12 Mar 31, Apr 1,3
Topic: Language
Week 13 Apr 7,8,10
Topic: Recommendation systems
Week 14 Apr 14,15,17
Topic: Real-world-ready development
Week 15 Apr 21,22,24
Topic: Project presentation