**Lectures:** Tuesdays and Thursdays, 5pm-6:30pm, MCS B31.

**Instructor:** Prof. Alina Ene, MCS 291.

**Office hours:** Fridays, 10:30am-noon, MCS 291.

**Course Overview**

At their core, many real-world problems are complex and challenging optimization problems that require principled mathematical and algorithmic solutions. Companies such as Google and Facebook use powerful algorithmic primitives such as gradient descent to solve a host of challenging problems. These primitives in turn are based on a beautiful mathematical theory developed in the context of convex and discrete optimization.
In this class, we will explore several of these optimization primitives that have found significant applications and we will develop a mathematical toolkit that can be used to tackle real-world problems.

**Tentative Syllabus**

The main topics covered include:

**Convex optimization methods and their applications:** gradient descent, mirror descent, the multiplicative weights update method, and the Frank-Wolfe algorithm.

**Discrete and combinatorial optimization methods and their applications:** submodular functions, submodular minimization via the ellipsoid method and gradient descent, and submodular maximization via the Greedy algorithm.

**Algorithms for linear algebra and their applications:** the power method, singular value decompositions, principal component analysis, spectral clustering, and regression.

**Prerequisites**

The class will require familiarity with fundamental algorithmic problems and techniques at the level of CS330/CS530 or equivalent, and a working knowledge of (multivariate) calculus and linear algebra.

**Evaluation**

The course evaluation will be based on homework assignments (30%), a final project (60%), and lecture notes scribing (10%).

**Reference materials:** The reading material includes the following textbooks and lecture notes.

**[BV]** Boyd and Vandenberghe. Convex Optimization.

**[Bubeck]** Bubeck. Convex Optimization: Algorithms and Complexity.

**[BHK]** Blum, Hopcroft, Kannan. Foundations of Data Science.

**[Shalev-Shwartz]** Shalev-Shwartz. Online Learning and Online Convex Optimization.

**Lecture Scribing**
Please read the scribing guidelines carefully and sign up to scribe a lecture. The lecture notes will be posted on the class website.

**Scribing Guidelines**

- The lecture notes should be typeset in LaTeX using this template
- Start by briefly summarizing the main topics discussed in the lecture
- Try to use full sentences/prose, avoid bullet points
- Make sure to also include the motivation, illustrative examples/figures, and intuitions that Alina provided in lecture
- Include any figures that you feel would be helpful in understanding the concepts, and especially those that Alina provided in lecture. You can use a tool such as LatexDraw to draw figures.
- Make sure that all the formal arguments are correct and do not have any gaps in reasoning. Make sure to include the simple calculations we skipped during lecture. If something in the argument is unclear to you, just email Alina for clarification.
- Please spell check your notes and polish the grammar/language.

**Sign-up Instructions**

Please sign up to scribe one lecture on this spreadsheet.

Homework 1 due on Thursday, October 6.

Homework 2 due on Tuesday, November 1.

Homework 3 due on Thursday, November 17.

**Final Project information**
Please read the final project guidelines carefully and make a note of the deadlines for each component.

**Project Guidelines**
The final project is an important component of this course, and the subject matter of the project should have a strong connection to the topics covered in the course. The final project can take several forms:

**The project requires submitting three components:** a project proposal (due Nov. 17), an interim report (due Dec. 1), and a final report (due Dec. 15). For each component, typeset your submission using LaTeX and email a pdf to Alina by the due date.

**Project Proposal (due Thursday, November 17)**. Write a 1-2 page document that describes the plan for your project. The proposal should discuss all of the following points that are applicable:

**Interim Report (due Thursday, December 1)**. Write a 3-5 page document that describes your progress so far. The report should discuss all of the following points that are applicable:

**Final Report (due Thursday, December 15)**. The final report should be a document of up to 10 pages. Keep in mind that the quality of the presentation is as important as the report's content. You can structure your report to (roughly) contain the following sections: abstract, introduction, related work, problem, results, experiments/evaluations, conclusions/discussion.

**Lecture 1 (9/6/16)** Course overview and introduction. *Prelude:* linear classification via the Perceptron algorithm.

**Reading:** [BHK] Chapter 6.3 for the Perceptron algorithm.
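The Perceptron update is simple enough to preview in a few lines. Below is a minimal sketch in plain Python (the toy data and names are illustrative, not from [BHK]): whenever an example is misclassified, the weight vector is nudged toward that example.

```python
def perceptron(points, labels, max_passes=100):
    """Learn a linear separator w with sign(<w, x>) = label.
    Assumes the data is linearly separable through the origin."""
    d = len(points[0])
    w = [0.0] * d
    for _ in range(max_passes):
        mistakes = 0
        for x, y in zip(points, labels):
            if y * sum(wi * xi for wi, xi in zip(w, x)) <= 0:
                # Misclassified: move w toward y * x
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:  # separates all examples
            break
    return w

# Toy data: points with positive first coordinate are labeled +1
points = [(2.0, 1.0), (1.5, -0.5), (-1.0, 0.5), (-2.0, -1.0)]
labels = [1, 1, -1, -1]
w = perceptron(points, labels)
assert all(y * sum(wi * xi for wi, xi in zip(w, x)) > 0
           for x, y in zip(points, labels))
```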

**Lecture 2 (9/8/16)** Introduction to convex optimization. Basics of convex sets and functions.

**Reading:** [BV] Chapters 2 and 3. [BV] Appendix A is a good reference/refresher on definitions and notations from calculus and linear algebra that we will be using throughout the course.

**Lecture 3 (9/13/16)** Introduction to gradient descent. Projected subgradient descent algorithm.

**Reading:** [Bubeck] Chapter 3.1.
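As a small illustration of the projected subgradient method, here is a sketch in plain Python (the objective, step sizes, and names are illustrative, not from [Bubeck]): take a subgradient step, project back onto the feasible set, and return the average iterate, which is the quantity the standard analysis bounds.

```python
import math

def projected_subgradient(subgrad, project, x0, steps=1000):
    """Projected subgradient descent with step size 1/sqrt(t),
    returning the running average of the iterates."""
    x = x0
    avg = 0.0
    for t in range(1, steps + 1):
        x = project(x - subgrad(x) / math.sqrt(t))
        avg += (x - avg) / t  # running average of iterates
    return avg

# Minimize f(x) = |x - 0.8| over [0, 0.5]; the constrained optimum is x = 0.5
subgrad = lambda x: 1.0 if x > 0.8 else -1.0  # a subgradient of |x - 0.8|
project = lambda x: min(max(x, 0.0), 0.5)     # projection onto [0, 0.5]
x_bar = projected_subgradient(subgrad, project, x0=0.0)
assert abs(x_bar - 0.5) < 0.05
```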

**Lectures 4, 5, 6 (9/15/16, 9/20/16, 9/22/16)** Gradient descent for smooth convex functions.

**Reading:** [Bubeck] Chapter 3.2.
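The basic method from these lectures can be sketched in a few lines; the example below (plain Python, illustrative objective and names, not from [Bubeck]) runs gradient descent with the standard step size 1/beta for a beta-smooth function.

```python
def gradient_descent(grad, x0, beta, steps=100):
    """Gradient descent for a beta-smooth function:
    x_{t+1} = x_t - (1/beta) * grad(x_t)."""
    x = x0[:]
    for _ in range(steps):
        g = grad(x)
        x = [xi - gi / beta for xi, gi in zip(x, g)]
    return x

# f(x) = x1^2 + 2*x2^2 is beta-smooth with beta = 4 (largest Hessian eigenvalue)
grad = lambda x: [2 * x[0], 4 * x[1]]
x = gradient_descent(grad, x0=[5.0, -3.0], beta=4.0)
assert all(abs(xi) < 1e-6 for xi in x)  # converges to the minimizer (0, 0)
```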

**Lecture 7 (9/27/16)** Frank-Wolfe algorithm for constrained minimization of smooth convex functions.

**Reading:** [Bubeck] Chapter 3.3.
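A minimal sketch of the Frank-Wolfe method may help fix ideas (plain Python; the objective, step-size schedule, and names are illustrative, not from [Bubeck]): each iteration calls a linear minimization oracle over the feasible set and moves toward the vertex it returns, so the iterates stay feasible without any projection.

```python
def frank_wolfe(grad, lmo, x0, steps=2000):
    """Frank-Wolfe: move toward the vertex returned by the linear
    minimization oracle (lmo), with the usual step size 2/(t+2)."""
    x = x0[:]
    for t in range(steps):
        g = grad(x)
        s = lmo(g)  # argmin over the feasible set of <g, s>
        gamma = 2.0 / (t + 2)
        # Convex combination keeps x inside the feasible set
        x = [(1 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x

# Minimize f(x) = ||x - c||^2 over the probability simplex in R^3
c = [0.2, 0.3, 0.5]  # c lies in the simplex, so the optimum is x = c
grad = lambda x: [2 * (xi - ci) for xi, ci in zip(x, c)]

def simplex_lmo(g):
    # Over the simplex, <g, s> is minimized at the vertex e_i with i = argmin g_i
    i = min(range(len(g)), key=lambda j: g[j])
    return [1.0 if j == i else 0.0 for j in range(len(g))]

x = frank_wolfe(grad, simplex_lmo, x0=[1.0, 0.0, 0.0])
assert sum((xi - ci) ** 2 for xi, ci in zip(x, c)) < 0.01
```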

**Lecture 8 (9/29/16)** Gradient descent for smooth and strongly convex functions.

**Reading:** [Bubeck] Chapter 3.4, [BV] Chapters 9.1--9.3.

**Lecture 9 (10/4/16)** Lower bounds in the black-box model, Nesterov's accelerated gradient descent algorithm.

**Reading:** [Bubeck] Chapters 3.5 -- 3.7.

**Lecture 10 (10/6/16)** Analysis of Nesterov's accelerated gradient descent algorithm.

**(10/11/16)** No class (Monday schedule).

**Lecture 11 (10/13/16)** Online learning with expert advice: weighted majority and randomized weighted majority algorithms.

**Reading:** Section 1 in the Arora-Hazan-Kale survey.

**Lecture 12 (10/18/16)** Online learning with expert advice: generalization to losses in [-1, 1], multiplicative weights update algorithm.

**Reading:** Section 2 in the Arora-Hazan-Kale survey.
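The multiplicative weights update itself fits in a few lines. Below is a minimal sketch in plain Python (the toy loss sequence and names are illustrative, not from the survey): maintain one weight per expert, play the normalized weights as a distribution, and penalize each expert multiplicatively by its loss.

```python
import math

def mwu(losses, eta=0.1):
    """Multiplicative weights update for losses in [-1, 1].
    Returns the algorithm's total expected loss over all rounds."""
    n = len(losses[0])
    w = [1.0] * n
    total_loss = 0.0
    for round_losses in losses:
        W = sum(w)
        p = [wi / W for wi in w]  # distribution over experts
        total_loss += sum(pi * li for pi, li in zip(p, round_losses))
        # Penalize each expert multiplicatively by its loss
        w = [wi * (1 - eta * li) for wi, li in zip(w, round_losses)]
    return total_loss

# Two experts over T rounds: expert 0 always incurs loss 0, expert 1 loss 1
T = 200
losses = [[0.0, 1.0]] * T
alg_loss = mwu(losses)
best_expert_loss = 0.0  # expert 0
# Standard MWU-style guarantee: alg_loss <= best + eta*T + ln(n)/eta
assert alg_loss <= best_expert_loss + 0.1 * T + math.log(2) / 0.1
```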

**Lecture 13 (10/20/16)** Applications of the multiplicative weights update algorithm: the Winnow algorithm for linear classification, the Plotkin-Shmoys-Tardos framework for solving packing/covering linear programs.

**Reading:** Section 3 in the Arora-Hazan-Kale survey.

**Lecture 14 (10/25/16)** Online convex optimization, Follow-the-Leader algorithm, analysis of FTL for quadratic loss functions.

**Reading:** Chapters 1, 2.1, 2.2 in [Shalev-Shwartz].

**Lecture 15 (10/27/16)** Online learning in the OCO model with linear loss functions, Follow-the-Regularized-Leader algorithm, Online Gradient Descent algorithm.

**Reading:** Chapters 2.3, 2.4 in [Shalev-Shwartz].

**Lecture 16 (11/1/16)** Follow-the-Regularized-Leader with strongly convex regularizers.

**Reading:** Chapter 2.5 in [Shalev-Shwartz].

**Lecture 17 (11/3/16)** Regret bounds for expert learning via FoReL with strongly convex regularizers, Online Mirror Descent.

**Reading:** Chapters 2.5, 2.6 in [Shalev-Shwartz].

**Lecture 18 (11/8/16)** Online Mirror Descent: derived algorithms and analysis via duality.

**Reading:** Chapters 2.6, 2.7 in [Shalev-Shwartz].

**Lectures 19, 20 (11/10/16, 11/15/16)** The Mirror Descent framework for convex optimization.

**Reading:** Chapters 4.1 -- 4.3 in [Bubeck].

**Lecture 21 (11/17/16)** Introduction to discrete optimization and submodular functions.

**Reading:** Notes.

**Lectures 22, 23 (11/22/16, 11/29/16)** Submodular function minimization via gradient descent.

**Reading:** Notes.

**Lectures 24, 25, 26 (12/1/16, 12/6/16, 12/8/16)** Submodular function minimization continued. Regularization and duality, minimum norm point formulation. Frank-Wolfe algorithm and its connection to the subgradient descent algorithm.