| Day | Time |
|---|---|
| Tue & Thu | 9:35 am - 10:50 am |
| Wed | 10:20 am - 11:10 am |

| Component | Weight |
|---|---|
| Paper presentations | 40% |
| Paper readings | 20% |
| Course project | 40% |

| Date | Topic | Required Readings | Presenter(s) | Background and Additional Readings |
|---|---|---|---|---|
| Jan 10 (Tue) | Introduction and Background | | Prof. Sun | |
| Jan 11 (Wed) | Deep Learning Review | | Prof. Sun | |
| Jan 12 (Thu) | Discussion | | | |
| Jan 17 (Tue) | Attention and Transformers | Attention Is All You Need (Transformer) | Pradeep Kumar Ragu Chanthar, Srinivasa Sai Deepak Varukol, [slides] | The Illustrated Transformer; Attention? Attention!; Why multi-head self attention works? |
| Jan 18 (Wed) | Discussion | Deep learning development basics; attention and transformer playground (see the minimal attention sketch after the schedule) | | |
| Jan 19 (Thu) | Vision Transformer | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) | Sixiang Zhang, Spencer King, [slides] | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows |
| Jan 24 (Tue) | Transformers and Foundation Models | Language Models are Few-Shot Learners (GPT-3) | Vaishnavi Thesma, Akhila Devabhaktuni, Zihao Wu, [slides] | |
| Jan 25 (Wed) | Discussion | Vision Transformer playground (d2l and UvA tutorials) | | |
| Jan 26 (Thu) | Transformers and Foundation Models | Finetuned Language Models Are Zero-Shot Learners | Hemanth Reddy Jakkannapally, Wen Zhang, [slides] | PaLM: Scaling Language Modeling with Pathways; Transformers learn in-context by gradient descent |
| Jan 31 (Tue) | Transformers and Foundation Models | Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks | Yuchen Zhang, Kriti Ghosh, [slides] | |
| Feb 1 (Wed) | Discussion | Data processing | ||
| Feb 2 (Thu) | Transformers and Foundation Models | A ConvNet for the 2020s | Jashwanthreddy Katamreddy, Chenqian Xu, [slides] | |
| Feb 7 (Tue) | Image Generation | Denoising Diffusion Probabilistic Models | Xuansheng Wu, Daniel Redder, [slides] | |
| Feb 8 (Wed) | Discussion | Diffusion model playground (see the forward-noising sketch after the schedule) | | |
| Feb 9 (Thu) | Image Generation | High-Resolution Image Synthesis with Latent Diffusion Models (Stable Diffusion) | Jacobi Coleman, Dongliang Guo, [slides] | |
| Feb 14 (Tue) | Image Generation and Edits | DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | Ehsan Latif, Chetan Dhamane, [slides] | |
| Feb 15 (Wed) | Discussion | Stable diffusion and inversion playground | ||
| Feb 16 (Thu) | Image Generation and Edits | An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion | Venkatesh Morpoju, Padmaja Saraf, [slides] | Prompt-to-Prompt Image Editing with Cross Attention Control |
| Feb 21 (Tue) | Understanding Neural Networks | Emergent Abilities of Large Language Models | Mohammed Aldosari, Rutuja Talekar, [slides] | |
| Feb 22 (Wed) | Discussion | Understanding backpropagation, gradient flows, and the optimization process | ||
| Feb 23 (Thu) | Understanding Neural Networks | What do Vision Transformers Learn? A Visual Exploration | Krishna Paladugu, Keerthana Garimella, [slides] | |
| Feb 28 (Tue) | Understanding Neural Networks | Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective | Maansi Reddy Jakkidi, Nasid Habib Barna, [slides] | |
| Mar 1 (Wed) | Discussion | Neural network training dynamics | ||
| Mar 2 (Thu) - Midterm | Understanding Neural Networks | Understanding deep learning (still) requires rethinking generalization | Afsaneh Shams, Subas Rana, [slides] | Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation |
| Mar 7-9 | Spring Break | |||
| Mar 14-16 | Project Milestone Presentation | |||
| Mar 21 (Tue) | Neural Scene Representation and Reconstruction | NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis | Vatsal Thakkar, Sheung Hang Sean Kan, [slides] | |
| Mar 22 (Wed) | Discussion | Neural reconstruction playground | ||
| Mar 23 (Thu) | Neural Scene Representation and Reconstruction | Plenoxels: Radiance Fields without Neural Networks | Likitha Karnati, Sakshi Seth, Ratish Jha, [slides] | |
| Mar 28 (Tue) | Training Large-Scale Neural Networks | Training Compute-Optimal Large Language Models (Chinchilla) | Krushi Karukala, Rezwan Mahmud, [slides] | |
| Mar 29 (Wed) | Discussion | Working with GPUs | ||
| Mar 30 (Thu) | Training Large-Scale Neural Networks | Reinforcement Learning from Human Feedback (RLHF) | Swarali Gujarathi, Shivam Yadav, [slides] | Training language models to follow instructions with human feedback |
| Apr 4 (Tue) | Training Large-Scale Neural Networks | LAION-5B: An open large-scale dataset for training next generation image-text models; Scaling Language-Image Pre-training via Masking | Noyon Dey, Vijay Iyengar | Scaling Vision Transformers |
| Apr 5 (Wed) | Discussion | Mixed precision, parallelism, hardware | ||
| Apr 6 (Thu) | Training Large-Scale Neural Networks | How to train really large models on many GPUs? | Rajat Rajesh Mhetre, Yucheng Shi | |
| Apr 11 (Tue) | Self-supervised Learning | Masked Autoencoders Are Scalable Vision Learners | Pranavpalreddy Pingili, Zhengliang Liu | |
| Apr 12 (Wed) | Discussion | Visualizing neural networks | ||
| Apr 13 (Thu) | Self-supervised Learning | Barlow Twins: Self-Supervised Learning via Redundancy Reduction | Aishwary Nigam, Pranathi Vankineni | |
| Apr 18 (Tue) | Self-supervised Learning | Emerging Properties in Self-Supervised Vision Transformers | Yousef Fekri, Vaibhav Goyal | |
| Apr 19 (Wed) | Discussion | Explore self-supervision signals | ||
| Apr 20 (Thu) | Self-supervised Learning | Bootstrap your own latent: A new approach to self-supervised Learning | Hao Zhen, Shanmukha Sai Jasti | |
| Apr 25 (Tue) | Project Presentation | |||
| Apr 26 (Wed) | Project Presentation | |||
| Apr 27 (Thu) | Project Presentation | |||
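
For the hands-on discussion sessions (e.g., the Jan 18 attention and transformer playground), the following is a minimal scaled dot-product attention sketch. It assumes PyTorch is installed; the function name and toy tensor shapes are illustrative, not part of any assigned starter code.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V, as in the Transformer paper."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5      # (batch, seq, seq) similarities
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)                # one distribution per query position
    return weights @ v, weights

# Toy self-attention: q = k = v over 2 sequences of length 4, embedding dim 8.
x = torch.randn(2, 4, 8)
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape, attn.shape)  # torch.Size([2, 4, 8]) torch.Size([2, 4, 4])
```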
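Similarly, for the Feb 8 diffusion model playground, here is a minimal sketch of the DDPM forward (noising) process, assuming the linear beta schedule from the Denoising Diffusion Probabilistic Models reading; the schedule constants and tensor shapes are illustrative.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # assumed linear noise schedule
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative products of (1 - beta_t)

def q_sample(x0, t, noise):
    """Closed-form forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    a_bar = alphas_bar[t].view(-1, 1, 1, 1)     # broadcast over channel and image dims
    return a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

# Toy usage: noise a batch of 3x8x8 "images" at random timesteps.
x0 = torch.randn(4, 3, 8, 8)
t = torch.randint(0, T, (4,))
x_t = q_sample(x0, t, torch.randn_like(x0))
print(x_t.shape)  # torch.Size([4, 3, 8, 8])
```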