CSCI 8945 Advanced Representation Learning

Fall 2023

Instructor: Prof. Jin Sun

4 Credit Hours

Catalog Description: Advanced Representation Learning delves deeper into the fundamental concepts of representation learning and its applications. Students will explore representation learning techniques, including both classical and deep learning methods, and learn how to apply them to complex problems in computer vision, natural language processing, audio, and other areas. Through the research project component of the course, students will develop novel methods and theories about representation learning and prepare manuscripts describing their findings. By the end of the course, students will have a solid understanding of the state of the art in representation learning and be able to apply these techniques to real-world problems.

Prerequisites: Students should have a solid understanding of machine learning basics and relevant math concepts.

Class Location and Times:
Tue & Thu 12:45 pm - 2:00 pm 222 Boyd
Wed 12:40 pm - 1:30 pm 222 Boyd

Reading Materials:

Student Outcomes:
  1. Demonstrate understanding of machine learning and deep neural network fundamentals.
  2. Gain experience deploying deep learning models in computer vision, natural language processing, and audio domains.

Instructor Contact:
Prof. Jin Sun
Office Hours: Thursdays 4-5 pm or by appointment
Office: 804 Boyd
Email: jinsun@uga.edu

Evaluation and Grading: The final course grade will be weighted as follows:
Paper presentations 20%
Homework 20%
Midterm exam 20%
Course project and presentation 40%

Homework assignments: Your homework submission should be in PDF format. You are encouraged to use LaTeX (for online editing, use Overleaf). Cite any references used (including books, webpages, and code). Complete the homework individually, not as a group.
The PDF file should contain all essential text, equations, figures, code, and program outputs. Attach your code as an appendix.

Midterm exam: You will take an in-class exam that tests your knowledge of the essential concepts covered in class, including general representation learning techniques and domain-specific representations such as word embeddings.

Paper presentations: You will choose one research paper to present in the later part of the semester. Make sure you cover all the essential components and main messages of the paper and lead an insightful discussion with the class.

Team Project: You will work in a team on a course project. Each team should have 2-3 members. You are encouraged to design the project around a real-world application of deep learning and computer vision. Feel free to use any programming language or software package of your choice. The schedule for the project is as follows:
  1. Project Proposal: The project proposal should clearly state what your team plans to do. It should be four pages long (not including references). It should contain a timeline. You should list the questions the project will address and that will be discussed in the report. You should list what software you will be using or will build upon. Describe the datasets you will use and how you will know if the project is successful. Describe the hypotheses you will test and the related work. You should be able to reuse much of the text for the final report.

  2. Project Milestone: You can reuse the project proposal for this report but expand it with additional content. You should discuss preliminary results and/or other measurable items listed in the proposal.

  3. Project Report and Presentation: The final report contains a complete description of the project: what you have done and what the results look like. It should be about six to eight pages long (not including references). You are encouraged to format it in CVPR format. We will have a presentation session for all projects on the last day of class. Make sure every member of your team participates in the presentation.

Class Schedule

Date | Topic | Day Schedule | Homeworks
8/16 | Introduction and background | Introduction and overview |
8/17 | | Data and dimensionality |
8/22 | Data representation space and structures | Dimension reduction, metric learning, PCA, MDS |
8/23 | | Deep learning workflow and useful programming tools |
8/24 | | Structures in data spaces, manifolds, subspaces, sparse coding |
8/29 | Visual representations | Pixels, 3D points, and cameras |
8/30 | | Image operations |
8/31 | | Semantics | HW1 assignment
9/5 | | Videos |
9/6 | | Project and research discussion, how to do research? |
9/7 | | Image subspaces and manipulations |
9/12 | Language representations | Representing words and sentences |
9/13 | | Project and research discussion, how to read a paper? |
9/14 | | Language model pretraining |
9/19 | | NLP tasks |
9/20 | | Project and research discussion, how to write a paper? |
9/21 | | Zero-shot and in-context learning | HW1 due
9/26 | Audio representations | Representing sound |
9/27 | | Project and research discussion |
9/28 | | Audio generation and editing |
10/3 | Graphs | Graphs and neural networks |
10/4 | | Project and research discussion |
10/5 | | GNN applications | HW2 assignment
10/10 | Multi-modal representations | Midterm | Midterm
10/11 | | Project and research discussion |
10/12 | | Overview of multi-modal learning: Foundations & Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions |
10/17 | | Multimodal representation alignment: reference |
10/18 | | Project and research discussion |
10/19 | | Multimodal reasoning: reference |
10/24 | Advanced Topics - Implicit representation | Implicit Neural Representations with Periodic Activation Functions |
10/25 | | Project and research discussion |
10/26 | | Neural Ordinary Differential Equations |
10/31 | Advanced Topics - Meta-learning and multi-domain learning | Meta-Learning in Neural Networks: A Survey |
11/1 | | Project and research discussion |
11/2 | | Domain Generalization: A Survey |
11/7 | Advanced Topics - Adapter approach | LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention |
11/8 | | Project and research discussion |
11/9 | | LoRA: Low-Rank Adaptation of Large Language Models |
11/14 | Advanced Topics - Beyond perception | PaLM-E: An Embodied Multimodal Language Model |
11/15 | | Project and research discussion |
11/16 | | Representation Learning for Autonomous Robots | HW2 due (11/19)
11/21 | Class review and summary | Class wrap-up and discussion |
--Thanksgiving--
--Thanksgiving--
11/28 | Project | Project presentation and discussion |
11/29 | | Project and research discussion |
11/30 | | Project and research discussion |
12/5 | No class | No class (Friday Schedule) |

School of Computing | University of Georgia | 2023