CSCI 3360: Data Science I

Spring 2025

Course Information

  • Instructor: Prof. Jin Sun
  • Time: Monday 3:00 PM - 3:50 PM, Tuesday/Thursday 2:20 PM - 3:35 PM
  • Location: Miller Plant Science 2102
  • Office Hours: Thursdays 4:00 PM - 5:00 PM or by appointment

Description

This class is an introductory study of the theory and practice of data science. Topics include the fundamentals of data science, practical libraries for handling data, data collection and cleaning, data visualization and analysis, learning algorithms for classification and regression, unsupervised learning, validation metrics, and applications in computer vision, natural language processing, and recommendation systems.

Learning Outcomes

  • Demonstrate understanding of data science pipeline fundamentals.
  • Become familiar with relevant data science libraries and software packages.
  • Formulate a learning problem from data.
  • Evaluate a learning model.
  • Gain experience deploying learning models in computer vision, natural language processing, and other application domains.

(FREE) Textbooks

  • An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani
    ISBN: 9781461471370
    https://www.statlearning.com/
  • Python Data Science Handbook by Jake VanderPlas
    ISBN: 9781491912157
    https://jakevdp.github.io/PythonDataScienceHandbook/

Course Schedule

Introduction and basics

Week 1 Jan 6,7,9

Topic: Introduction and overview

Week 2 Jan 13,14,16

Topic: Python and data science basics

Readings:
  • Whirlwind Tour of Python
  • Python Data Science Handbook, Chapter 2

Martin Luther King Jr. Day Jan 20

No class

Data

Week 3 Jan 21,23

Topic: Data collection and processing

Readings:
  • Python Data Science Handbook, Chapter 3

Week 4 Jan 27,28,30

Topic: Data visualization

Readings:
  • Python Data Science Handbook, Chapter 4

Learning

Week 5 Feb 3,4,6

Topic: Probabilistic learning foundation

Readings:
  • Introduction to Statistical Learning, Chapter 2

Week 6 Feb 10,11,13

Topic: Regression

Readings:
  • Introduction to Statistical Learning, Chapter 3

Week 7 Feb 17,18,20

Topic: Classification

Readings:
  • Introduction to Statistical Learning, Chapter 4

Week 8 Feb 24,25,27

Topic: Unsupervised learning

Readings:
  • Introduction to Statistical Learning, Chapter 12

Spring Break Mar 3-7

No class

Week 9 Mar 10,11,13

Topic: Trees and boosting

Readings:
  • Introduction to Statistical Learning, Chapter 8

Week 10 Mar 17,18,20

Topic: Validation

Readings:
  • Introduction to Statistical Learning, Chapter 5

Applications

Week 11 Mar 24,25,27

Topic: Vision

Week 12 Mar 31, Apr 1,3

Topic: Language

Week 13 Apr 7,8,10

Topic: Recommendation systems

Week 14 Apr 14,15,17

Topic: Real-world-ready development

Week 15 Apr 21,22,24

Topic: Project presentation

Final exam TBD (Apr 30-May 6)