May 20 – June 27
Lectures in PSY B53, Mon-Thur, 1-3pm.
Steve Homer, homer@cs.bu.edu
Office Hours: Wed: 3-4:30
Evimaria Terzi, evimaria@cs.bu.edu
Office Hours: Tues: 3-4:30
Dora Erdos, edori@cs.bu.edu
Office Hourse: Thur: 3-4:30
Maryam Ghasemi, ghasemi@cs.bu.edu
This course centers on approximation and randomized algorithms that are used to provide approximate solutions to NP-hard problems; problems known to have polynomial-time solutions, and which nevertheless are impractical for large data sets. The course will be built using material from two textbooks:
‘‘Approximation Algorithms", by Vijay V. Vazirani, Chapters: 1-15,
24-26.
This part will focus on problems like: set cover, steiner tree, cuts on
graphs, k-center, feedback vertex set, knpasack, binpacking, TSP as well as
generic techniques like linear programming approximations and semidefinite
programming.
‘‘Probability and Computing", by Michael Mitzenmacher and Eli Upfal,
Chapters: 3,4 and 7 and some sections of 5, 6, and 10.
This part of the course will review some basic material on probability,
Chernoff
bounds, Markov chains and random walks and will focus on facts about random
graphs, as well as Monte Carlo methods for counting and sampling.
Although the majority of the course will be based on the material of the above two textbooks, towards the end we will discuss some classical papers that apply some of the techniques covered in the class in practical scenarios, and will test their limits. While we will focus on the theory of approximation algorithms, we will stress those algorithms most useful for large data sets and will in each case, consider the efficiency of modern techniques for carrying out large data-centric computation.
4-5 homework problems (40% of the grade)
A take-home exam (60% of the grade)
No late homework will be accepted. You can collaborate to solve the problems (but not in the take home exam), use the web and other available material BUT you should list your collaborators and your references.
In general you need to comply to the academic conduct code.
The course assumes basic knowledge of Probability (CS 237) and Algorithms (CS 330), or equivalent. Some of the material depends as well on fundamentals of Theory of Computation (CS 332) or equivalent. Acquaintance with this material is helpful as well, though not required for this summer course.
Introduction to approximation algorithms: Set Cover |
Introduction to randomized algorithms: Min cuts |
Knapsack and Bin Packing |
k-Cuts and Multicuts |
k-median and k-means |
k-center and hierarchical clustering |
Steiner Trees and TSPs |
Graphs as electrical networks |
Linear Programming |
DNF counting |
Counting matchings |
Fingerprinting |
Semi-definite programming |
Homework 1 is now available. The datasets required for the programming assignement can be downloaded from the lines below: 10K graph and 100K graph.
Homework 2 is now available.
Homework 3 is now available.
Homework 4 is now available.