Day | Time |
---|---|
Tue & Thu | 9:35 am - 10:50 am |
Wed | 10:20 am - 11:10 am |
Component | Weight |
---|---|
Paper presentations | 40% |
Paper readings | 20% |
Course project | 40% |
Date | Topic | Required Readings | Presenter(s) | Background and Additional Readings |
---|---|---|---|---|
Jan 10 (Tue) | Introduction and Background | | Prof. Sun | |
Jan 11 (Wed) | Deep Learning Review | | Prof. Sun | |
Jan 12 (Thu) | Discussion | |||
Jan 17 (Tue) | Attention and Transformers | Attention Is All You Need (Transformer) | Pradeep Kumar Ragu Chanthar, Srinivasa Sai Deepak Varukol, [slides] | Illustrated Transformer; Attention? Attention!; Why multi-head self attention works? |
Jan 18 (Wed) | Discussion | Deep learning development basics; attention and transformer playground | | |
Jan 19 (Thu) | Vision Transformer | An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT) | Sixiang Zhang, Spencer King, [slides] | Swin Transformer: Hierarchical Vision Transformer using Shifted Windows |
Jan 24 (Tue) | Transformers and Foundation Models | Language Models are Few-Shot Learners (GPT-3) | Vaishnavi Thesma, Akhila Devabhaktuni, Zihao Wu, [slides] | |
Jan 25 (Wed) | Discussion | Vision Transformer playground (d2l, UvA) | | |
Jan 26 (Thu) | Transformers and Foundation Models | Finetuned language models are zero-shot learners | Hemanth Reddy Jakkannapally, Wen Zhang, [slides] | PaLM: Scaling Language Modeling with Pathways; Transformers learn in-context by gradient descent |
Jan 31 (Tue) | Transformers and Foundation Models | Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks | Yuchen Zhang, Kriti Ghosh, [slides] | |
Feb 1 (Wed) | Discussion | Data processing | ||
Feb 2 (Thu) | Transformers and Foundation Models | A ConvNet for the 2020s | Jashwanthreddy Katamreddy, Chenqian Xu, [slides] | |
Feb 7 (Tue) | Image Generation | Denoising Diffusion Probabilistic Models | Xuansheng Wu, Daniel Redder, [slides] | |
Feb 8 (Wed) | Discussion | Diffusion model playground | ||
Feb 9 (Thu) | Image Generation | High-Resolution Image Synthesis with Latent Diffusion Models (Stable Diffusion) | Jacobi Coleman, Dongliang Guo, [slides] | |
Feb 14 (Tue) | Image Generation and Edits | DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation | Ehsan Latif, Chetan Dhamane, [slides] | |
Feb 15 (Wed) | Discussion | Stable diffusion and inversion playground | ||
Feb 16 (Thu) | Image Generation and Edits | An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion | Venkatesh Morpoju, Padmaja Saraf, [slides] | Prompt-to-Prompt Image Editing with Cross Attention Control |
Feb 21 (Tue) | Understanding Neural Networks | Emergent Abilities of Large Language Models | Mohammed Aldosari, Rutuja Talekar, [slides] | |
Feb 22 (Wed) | Discussion | Understanding backpropagation, gradient flows, and the optimization process | ||
Feb 23 (Thu) | Understanding Neural Networks | What do Vision Transformers Learn? A Visual Exploration | Krishna Paladugu, Keerthana Garimella, [slides] | |
Feb 28 (Tue) | Understanding Neural Networks | Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective | Maansi Reddy Jakkidi, Nasid Habib Barna, [slides] | |
Mar 1 (Wed) | Discussion | Neural network training dynamics | ||
Mar 2 (Thu) - Midterm | Understanding Neural Networks | Understanding deep learning (still) requires rethinking generalization | Afsaneh Shams, Subas Rana, [slides] | Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation |
Mar 7-9 | Spring Break | |||
Mar 14-16 | Project Milestone Presentation | |||
Mar 21 (Tue) | Neural Scene Representation and Reconstruction | NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis | Vatsal Thakkar, Sheung Hang Sean Kan, [slides] | |
Mar 22 (Wed) | Discussion | Neural reconstruction playground | ||
Mar 23 (Thu) | Neural Scene Representation and Reconstruction | Plenoxels: Radiance Fields without Neural Networks | Likitha Karnati, Sakshi Seth, Ratish Jha, [slides] | |
Mar 28 (Tue) | Training Large-Scale Neural Networks | Training Compute-Optimal Large Language Models (Chinchilla) | Krushi Karukala, Rezwan Mahmud, [slides] | |
Mar 29 (Wed) | Discussion | Working with GPUs | ||
Mar 30 (Thu) | Training Large-Scale Neural Networks | Reinforcement Learning from Human Feedback (RLHF) | Swarali Gujarathi, Shivam Yadav, [slides] | Training language models to follow instructions with human feedback |
Apr 4 (Tue) | Training Large-Scale Neural Networks | LAION-5B: An open large-scale dataset for training next generation image-text models; Scaling Language-Image Pre-training via Masking | Noyon Dey, Vijay Iyengar | Scaling Vision Transformers |
Apr 5 (Wed) | Discussion | Mixed precision, parallelism, hardware | ||
Apr 6 (Thu) | Training Large-Scale Neural Networks | How to train really large models on many GPUs? | Rajat Rajesh Mhetre, Yucheng Shi | |
Apr 11 (Tue) | Self-supervised Learning | Masked Autoencoders Are Scalable Vision Learners | Pranavpalreddy Pingili, Zhengliang Liu | |
Apr 12 (Wed) | Discussion | Visualizing neural networks | ||
Apr 13 (Thu) | Self-supervised Learning | Barlow Twins: Self-Supervised Learning via Redundancy Reduction | Aishwary Nigam, Pranathi Vankineni | |
Apr 18 (Tue) | Self-supervised Learning | Emerging Properties in Self-Supervised Vision Transformers | Yousef Fekri, Vaibhav Goyal | |
Apr 19 (Wed) | Discussion | Explore self-supervision signals | ||
Apr 20 (Thu) | Self-supervised Learning | Bootstrap your own latent: A new approach to self-supervised Learning | Hao Zhen, Shanmukha Sai Jasti | |
Apr 25 (Tue) | Project Presentation | |||
Apr 26 (Wed) | Project Presentation | |||
Apr 27 (Thu) | Project Presentation |