CS 531: Advanced Optimization Algorithms


Course Description

This is a graduate-level course on the design and analysis of iterative algorithms for convex and non-convex optimization. Iterative methods based on gradient descent have had a profound impact on many areas of computer science and engineering, including algorithms and theoretical computer science, machine learning, data mining, finance, and operations research. The main topics covered are the design and analysis of gradient descent methods for convex problems; adaptive, stochastic, and non-convex optimization; linear programming; linear and convex duality; and online learning.

Syllabus

Collaboration and Honesty Policy

Piazza

Gradescope for submitting homeworks. Sign up using the entry code 4PJR4X.


Course Staff

Prof. Alina Ene
Homepage: cs-people.bu.edu/aene/
Email: aene @ bu . edu
Office Hours: Mondays 9:30am - 11am, on Zoom (please see Piazza for the Zoom link).

Teaching Fellow: Andrew Suh
Email: asuh9 @ bu . edu
Office Hours: Fridays 11:15am - 12:45pm, on Zoom (please see Piazza for the Zoom link).

Lectures

Mon/Wed 4:30pm - 5:45pm, in KCB 103

The schedule is tentative and subject to change (e.g., due to snow days).

You can try the Jupyter notebooks in your browser here. You can edit and run the code directly in the browser, but your changes are not saved: they will be lost when you refresh the page.
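For a flavor of what the notebooks contain, here is a minimal gradient descent sketch in Python. It is illustrative only and not taken from the course notebooks; the function names, step size, and test problem are my own choices.

    import numpy as np

    def gradient_descent(grad_f, x0, step_size=0.1, num_iters=100):
        """Minimize a differentiable function via fixed-step gradient descent.

        grad_f: function returning the gradient of f at a point.
        x0: starting point (NumPy array).
        """
        x = np.asarray(x0, dtype=float)
        for _ in range(num_iters):
            x = x - step_size * grad_f(x)  # step in the direction of steepest descent
        return x

    # Example: minimize f(x) = ||x - b||^2, whose gradient is 2(x - b).
    b = np.array([1.0, -2.0])
    x_star = gradient_descent(lambda x: 2 * (x - b), x0=np.zeros(2))
    print(x_star)  # approaches the minimizer b = [1, -2]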

Lecture | Topic | Annotated Slides | Further reading

Mathematical background and introduction to optimization
1/25 | Course overview and introduction. Linear classification and the Perceptron algorithm. | Slides
1/27 | Review of concepts from linear algebra. | Slides
2/1 | Review of concepts from multivariate calculus. | Slides | Jupyter notebook
2/3, 2/8 | Convex functions and sets. | Slides
2/10 | Introduction to optimization. Examples of optimization problems. | Slides | Jupyter notebook
2/16 | Optimality conditions for general and convex problems. | Slides

Iterative methods for convex optimization
2/17 | Oracle models, iterative methods, and gradient descent. | Slides
2/22 | Analysis of gradient descent. Smoothness and strong convexity. | Slides
2/24 | Gradient descent for smooth and strongly convex functions. | Slides
3/1 | Accelerated gradient descent for smooth convex functions. | Slides
3/3 | Accelerated gradient descent for well-conditioned functions. Frank-Wolfe algorithm. | Slides
3/8 | Midterm exam I (take-home, lecture cancelled).

Adaptive methods, stochastic and non-convex optimization
3/10 | Adaptive methods. | Slides
3/15 | Stochastic gradient descent. | Slides
3/17 | Supervised learning, neural networks. | Slides
3/22 | Neural networks continued: backpropagation algorithm, ADAM algorithm. | Slides

Linear programming, duality
3/24 | Introduction to linear programming. Modeling using LPs. | Slides
3/29, 4/5 | LP duality. Applications of LP duality. | Slides

Learning from experts, mirror descent
4/7 | Prediction using expert advice, majority algorithms. | Slides
4/12 | Multiplicative weights update algorithm, online convex optimization. | Slides
4/14 | Fenchel duality. | Slides
4/21 | Bregman divergences, mirror descent. | Slides
4/26 | Course recap. | Slides
4/28 | Midterm exam II (take-home, lecture cancelled).
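The first lecture (1/25) introduces the Perceptron algorithm for linear classification. The sketch below shows the standard update rule in Python (the function names and toy data are illustrative, not drawn from the lecture slides): on a misclassified example, add the label-weighted feature vector to the weights.

    import numpy as np

    def perceptron(X, y, max_passes=100):
        """Standard Perceptron: X is (n, d), labels y have entries in {-1, +1}.

        Returns a weight vector w with sign(X @ w) matching y,
        assuming the data is linearly separable.
        """
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(max_passes):
            mistakes = 0
            for i in range(n):
                if y[i] * (X[i] @ w) <= 0:    # misclassified (or on the boundary)
                    w += y[i] * X[i]          # Perceptron update
                    mistakes += 1
            if mistakes == 0:                 # a full pass with no mistakes: done
                break
        return w

    # Toy separable data: the label is the sign of the first coordinate.
    X = np.array([[2.0, 1.0], [1.0, -1.0], [-1.5, 0.5], [-2.0, -1.0]])
    y = np.array([1, 1, -1, -1])
    w = perceptron(X, y)
    print(np.sign(X @ w))  # matches y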

Acknowledgments: Many pictures used in the lecture slides are courtesy of Google Images and their respective authors. I am indebted to my colleagues at other institutions for some of the material in the lectures, including Amir Ali Ahmadi's course at Princeton, Yaron Singer's course at Harvard, and Nick Harvey's course at UBC. The specific references/credits are on the References slide at the end of each lecture.



Homeworks

Homeworks are released on Wednesday before class and are due two weeks later, on Wednesday at midnight. Expect a homework every two weeks, except during the midterm exam weeks.