CS7870: Algorithms for Machine Learning


The schedule is tentative and subject to change (e.g., for snow days).
Date Topic Notes Additional readings/references
Class 1 (Sep 4) Course Overview, logistics
Class 2 (Sep 8) Architectures, transformers, pipeline
Class 3 (Sep 11) Attention, intro to GPU, flash attention
Class 4 (Sep 15) Variants of attention for time/memory optimization
Class 5 (Sep 18) Optimization, backpropagation for FFN, attention
Class 6 (Sep 22) Stochastic gradient descent, AdaGrad, Adam
Class 7 (Sep 25) Recent variants of optimizers (Muon, low-rank)
Class 8 (Sep 29) Parallel algorithms on GPU
Class 9 (Oct 2) DeepSpeed/ZeRO, FSDP
Class 10 (Oct 6) Locality-sensitive hashing
Class 11 (Oct 9) Kernel density estimation
Oct 13 Indigenous Peoples' Day, no classes
Class 12 (Oct 16) Graph-based nearest neighbor search, RAG
Class 13 (Oct 20) Hashing-based attention approximation
Class 14 (Oct 23) Mixture of experts
Class 15 (Oct 27) State space models
Class 16 (Oct 30) Fine-tuning, PEFT. Project proposal due.
Class 17 (Nov 3) Fast inference
Class 18 (Nov 6) Quantization (clustering, hashing, e.g., RaBitQ)
Class 19 (Nov 10) Quantization-aware training, low-bit quantization
Class 20 (Nov 13)
Class 21 (Nov 17) Project progress update due
Class 22 (Nov 20)
Class 23 (Nov 24)
Nov 27 Thanksgiving, no classes
Class 24 (Dec 1)
Class 25 (Dec 4)
Dec 8 Time for project, no classes. Final project due.
Dec 11 no classes