AMSC 607 / CMSC 764 Advanced Numerical Optimization

Fall 2008

oleary@cs.umd.edu

New: 12-19-08: Final Grades and Announcements

When and Where: TuTh......11:00am-12:15pm (CSI 3120) (CSI is the Computer Science Classroom building, attached to A.V. Williams and behind the Wind Tunnel.)

Office Hours: Tuesday 1:00-2:15, Thursday 8:00-9:15, Friday 9-10, and by appointment, in AVW 3271.

Textbooks:

Sections III and IV of Linear and Nonlinear Programming by Stephen G. Nash and Ariela Sofer, McGraw-Hill.
ISBN-007-046065-5 or ISBN: 0071145370. (The book is currently under revision.) You can look for a used copy on the internet if you prefer.

"Lecture Notes on Interior Point Polynomial Time Methods in Convex Programming" by Arkadi Nemirovski, Spring 2004.

Prerequisite: A course in linear algebra and a course in numerical analysis. (A previous course in optimization is NOT expected.)

Topics:

Unconstrained Optimization (approx. 4 weeks)

Characterization of solutions to minimization problems.

Selection of appropriate algorithms.

Global optimization.

Constrained Optimization (approx. 8 weeks)

Feasible set methods and projection methods.

Construction of interior point methods for problems with linear and nonlinear constraints.

Grading: Homework plus a project. No in-class quizzes or exams.

Course Information and Syllabus

Background notes on error analysis

UMCP Code of Academic Integrity

Survival Guide for Optimization

Information about computer accounts. I have requested accounts on GRACE for each of you. For your assignments, you may use GRACE or any other machine with Matlab access.

Information from the last time the course was offered.

Lecture notes:

Introduction

Unconstrained Optimization: Basics

An example of a good linesearch: cvsrch.m and cstep.m

Unconstrained Optimization: Part 2

For more detail on linear conjugate gradients:

Some notes on conjugate gradients

cg1.m

Constrained Optimization: Basics

Constrained Optimization: Feasible Direction Methods

Constrained Optimization: Penalty and Barrier Methods

Constrained Optimization: Interior Point Methods

A. Nemirovski lecture notes

Data fitting and least squares problems

Global Optimization for discussion on December 9

Information on the term project 12-04-08.

Homework:

Homework 1:

The homework

If a sequence converges at rate p (as defined on p. 13 of the Unconstrained Optimization: Basics notes and p. 29 of the textbook), it also converges at rates less than p. In problem 1, try to find the largest value of p.

In Problem 3c and 3d, if there is more than one such interval, the right-most one is the most desirable.

The subscript denotes the 2-norm, or Euclidean norm. So || p ||_2 = sqrt (p^T p) = sqrt(sum of squares of elements of the vector p).

Partial solution to the homework,

Homework 2:

The homework

Part (iii) of the question posed in Nash-Sopher says to output the step length alpha at each iteration. Output the trust-region radius instead.

Answers to some questions posed in office hours:

(10-02-08) You do not need to do a Cholesky factorization or a modified Cholesky factorization. You are doing the trust region method instead.

(10-02-08) If the Newton step falls inside the trust region and is downhill, your algorithm should take that step, since the Newton step is then the solution to the trust region problem. Only consider a nonzero gamma if the Newton step is too big or the Newton step is not downhill.

(10-02-08) Your Newton-trust region program should be general: no matter what f,g,H your user provides, it should work. But your users are required to write their own f,g,H; you do not need to compute g and H for them. In particular, don't use a finite difference approx. for g and H. The assignment is meant to use exact g and H, programmed by the user or computed using automatic differentiation.

(10-06-08) To estimate the convergence rate, you might assume that e_{k+1} = c e_k^p and then determine c and p by a least squares fit.

Homework 3:

The homework

10-20-08 Make sure that you allow constraints to be dropped at every iteration, not just when you are at a vertex (with n constraints active).

10-23-08 1. Can we collect the testing problems from internet and use them? Yes, as long as you give a reference in your testing program. And only one problem is required.

10-23-08 Should the LP problems be small size? middle size? or large size? Your choice.

10-23-08 What kind of testing results are you expected? Just validate the accuracy of the solutions? or efficiency? Verify that your program computes the correct solution. For full credit, your program should not use an order of magnitude more operations than necessary, but it does not need to be perfectly efficient.

10-23-08 Should we also submit a matlab output diary? No, I'll run your script myself, along with other test problems.

10-26-08 Submission instructions:

Send lpfeasdir.m and your script that tests it as 2 plain-text email attachments to oleary@cs.umd.edu.

If your "group" contained more than 1 person, hand in (hardcopy) of a statement signed by each of you stating the work done by each person.

10-27-08 If you used a datafile, send it as an email attachment, too.

11-04-2008. In accordance with the vote taken in class, you have until 2pm Tuesday Nov 11 to resubmit your program, if desired, without late penalty. Submissions after that will not be accepted. The correctness test that I will use (worth 7 points) is "mytester". You will need these files to run the code:

mytester.m (Reposted 3pm 11-04-08)

testoneproblem.m

check_lp_opt.m

myproblem.mat

``At the test problem 6, it assumes that the correct solution is [Inf, Inf]. However, x2 is still bound in [1,3] in that case." You are correct. I will change the correctness test to check that x1 has been set to INF.

11-17-08 lpfeasdir.m Sample solution code by Vincent Nguyen. Check your email for your grade on this homework.

Homework 4: Due 2pm November 25

The homework

Homework 5, due Tuesday Dec 9, 2pm.

(10 points) Solve Nemirovski Exercise 3.3.1, p. 50. Hint: Figure out what "#+" means, but write the answer in your own words.

(30 points) Solve Nemirovski Exercise 4.7.4, p. 66, but omit the last piece, "Evaluate the arithmetic cost of a Newton step of the method." Part (1) is worth 10 points and part (2) is worth 20.

Comments on the solution:

Nemirovski Exercise 3.3.1, p. 50.
For the sketch of the solution, see p. 199 of the notes. I looked for demonstrations of these pieces:

The function must have a continuous 3rd derivative and the barrier property.

The 1st derivative must be bounded in terms of the square root of the 2nd.

The 3rd derivative must be bounded in terms of the 2nd raised to the 3/2 power. (Note that Newton's generalized binomial theorem is not the best way to get the r,s property.)

To put the pieces together, use Prop. 3.1.1.

(These statements are rather loose, but somehow preferable to having you try to read pseudo-latex math formulas.)

Nemirovski Exercise 4.7.4, p. 66
Nemirovski gives the algorithm is given on pp. 59-60. I was looking for enough detail so that someone could write a good computer code from your description. In particular, I looked for the following features:

Carefully define the function you are minimizing.

Make sure that you update all variables, including the extra one that you added to form the standard form objective function.

Specify a way to calculate the derivatives, and make sure that they involve all of the variables. Either grind out the formulas, or specify that automatic differentiation should be used.

NEVER compute a matrix inverse! It is inefficient (doubles the work for a dense matrix and even worse for a sparse one) and introduces unnecessary rounding error. In Matlab, use backslash to solve the linear system. In other languages, use an appropriate factorization of the matrix to solve the linear system (which is what Matlab's backslash does).

The most expensive piece of each iteration is the solution of the Newton equation to get the search direction. This is O(n^3), where x has dimension n x 1. If you redo this computation in forming lambda, you have doubled the work! And if you then recalculate lambda to test its value, you triple the work! Calculate the solution to the linear system once and then store it for use in multiple statements within the algorithm.