% optimtips.m
%{
Regression and optimization tips & tricks of the trade - using
matlab and the optimization toolbox in ways that you may never have
considered. Think of this text as a Matlab work book. I've included
the tricks I see asked about many times on the comp.soft-sys.matlab
newsgroup. If I've left out your favorite trick, just drop me a line
and I'll try to add it in. This article is not intended as a FAQ,
however.

Much of what I'll say in here has to do with curve fitting and
parameter estimation. It is, after all, a fertile source of
optimization problems. I'll do my best to look at as many of the
different optimizers as possible. A novice may find the solutions
in this text useful as a set of templates for their own problems.

Is the order of the topics I've chosen important? I've tried to make
it all flow as well as possible, but I won't always succeed. Feel
free to skip to the section that holds your own interest.

Apologies are due to my readers who live with a release of Matlab
older than release 14. Almost all of the examples here are written
using anonymous functions. Luckily, most of these examples will
still be quite readable, so all may not be lost. If I hear enough
requests I may be willing to expand these examples, including
versions which are accessible to users of older releases.

Using this workbook

This text is written with matlab's cell mode in mind. Each section
has blocks of executable code that are delimited with a %% before
and after, which allows you to execute that block directly in
matlab. Anything that I want to say inside a block will be in the
form of a matlab comment. Users of matlab releases that do not
support cell mode can always use copy and paste to execute these
blocks of matlab commands.

I've also used the block comment form, %{ and %}, which became an
option in release 14 of matlab. In effect, every line of this
article is either an executable line of matlab code or a valid
matlab comment. Users of older releases will just have to forgive
me once more.

If you do have release 14 or beyond, then try "Publish to HTML".
It's an option on the editor's file menu, or a button on the editor
task bar. Give Matlab a minute or two to run through all of my
examples, and matlab produces a nice document that can be read
through at your leisure.
%}

%% 1. Linear regression basics in matlab
%{
I'll start with some linear regression basics. While polyfit does
a lot, a basic understanding of the process is useful.

Let's assume that you have some data in the form y = f(x) + noise.
We'll make some up and plot it.
%}

%%
% Execute this cell.
x = sort(rand(20,1));
y = 2 + 3*x + randn(size(x));
plot(x,y,'o')
title 'A linear relationship with added noise'

%%
% We'd like to estimate the coefficients of this model from the data.
% Many books show a solution using the normal equations.
M = [ones(length(x),1),x];
% These are the normal equations.
coef = inv(M'*M)*M'*y
% coef contains regression estimates of the parameters
yhat = M*coef;
plot(x,y,'o',x,yhat,'-')
title 'Linear regression model'

%%
% A better solution uses \. Why? Because \ is more numerically
% stable than inv. It's something you will appreciate one day
% when your data is nasty. In this case, the different methods
% will be indistinguishable. Use \ anyway.
disp 'Use of \'
coef2 = M\y
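%%
% (An added sketch, not part of the original text.) One way to see why \
% is preferred over the explicit normal equations is an ill-conditioned
% design matrix. The condition number of M'*M is the square of that of M,
% so forming inv(M'*M) throws away roughly twice as many digits. The
% matrix below is invented purely for illustration; the exact numbers
% will vary from run to run, but the normal equations residual is
% usually the worse of the two.
xc = linspace(0,1,20)';
Mc = [ones(20,1),xc,xc.^2,xc.^3,xc.^4,xc.^5,xc.^6,xc.^7];
yc = Mc*ones(8,1);
disp 'Condition numbers of Mc and Mc''*Mc'
[cond(Mc),cond(Mc'*Mc)]
disp 'Residual norms: normal equations, then \'
[norm(Mc*(inv(Mc'*Mc)*(Mc'*yc)) - yc), norm(Mc*(Mc\yc) - yc)]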
%%
% Pinv is also an option. It too is numerically stable, but it
% will yield subtly different results when your matrix is singular
% or nearly so. Is pinv better? There are arguments for both \ and
% pinv. The difference really lies in what happens on singular or
% nearly singular matrices. See the sidebar below.
% Pinv will not work on sparse problems, and since pinv relies on
% the singular value decomposition, it may be slower for large
% problems.
disp 'Use of pinv'
coef3 = pinv(M)*y

% Large-scale problems where M is sparse may sometimes benefit
% from a sparse iterative solution. An iterative solver is overkill
% on this small problem, but ...
disp 'Use of lsqr'
coef4 = lsqr(M,y,1.e-13,10)

% There is another option, lscov. lscov is designed to handle
% problems where the data covariance matrix is known. It can also
% solve a weighted regression problem (see section 3.)
disp 'Use of lscov'
coef5 = lscov(M,y)

%%
% Directly related to the \ solution is one based on the QR
% factorization. If our over-determined system of equations to
% solve is M*coef = y, then a quick look at the normal equations,
%
%   coef = inv(M'*M)*M'*y
%
% combined with the qr factorization of M,
%
%   M = Q*R
%
% yields
%
%   coef = inv(R'*Q'*Q*R)*R'*Q'*y
%
% Of course, we know that Q is an orthogonal matrix, so Q'*Q is
% an identity matrix.
%
%   coef = inv(R'*R)*R'*Q'*y
%
% If R is non-singular, then inv(R'*R) = inv(R)*inv(R'), so
% we can further reduce to
%
%   coef = inv(R)*Q'*y
%
% Finally, recognize that this is best written in matlab
% (especially for upper triangular R) as
%
%   coef = R\(Q'*y)
%
% Why show this solution at all? Because later on, when we discuss
% confidence intervals on the parameters, this will prove useful.
disp 'Use of an explicit qr factorization'
[Q,R] = qr(M,0);
coef6 = R\(Q'*y)

%%
% Note that when we generated our data above, we added random noise
% using the function randn. Randn generates uncorrelated Gaussian
% (normally distributed) noise. In fact, the model that we chose
% was the correct model for our data. In some cases the choice of
% model will be only a guess.
x2 = sort(rand(50,1));
y2 = 1 + 2*x2 - 4*x2.^2 + randn(size(x2))/10;

% Let's fit this data with our same linear model.
M2 = [ones(length(x2),1),x2];
coef = M2\y2

% and plot the results
yhat2 = M2*coef;
plot(x2,y2,'o',x2,yhat2,'-')
title 'Linear model through quadratic data'

%%
% Plotting the residuals shows the clear lack of fit in our model.
% I'll leave any more discussion of basic regression analysis to a
% good text on the subject. Draper and Smith, "Applied Regression
% Analysis" was always a favorite of mine.
res2 = y2 - yhat2;
plot(x2,res2,'o')
title 'Residuals for a linear model through quadratic data'

%%
% Sidebar: Pinv uses a singular value decomposition, whereas \
% uses a qr factorization for non-square matrices. The difference?
% Let's try out the alternative solutions on a singular problem,
% with no noise in the data.
M = rand(10,2);
M = M(:,[1 2 2]);
y = sum(M,2);
disp 'Singular matrix: pinv'
coef1 = pinv(M)*y
disp 'Singular matrix: \'
coef2 = M\y
disp 'Singular matrix: lsqr'
coef3 = lsqr(M,y)
disp 'Singular matrix: lscov'
coef4 = lscov(M,y)
% Lsqr produces a solution with pinv-like characteristics, while
% lscov is clearly similar to \.

%%
% Note that \ gave a warning of rank deficiency, and that since
% the second and third columns of M were replicates, the two
% solutions are really equivalent, except that \ resulted in a
% zero coefficient for the third column. Pinv has the property
% that in the case of singularity, it will produce the minimum
% norm solution.
[norm(coef1),norm(coef2)]
% Either solution, [1 1 1]' or [1 2 0]', was equally valid, but the
% pinv solution had the lower norm.
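%%
% (An added sketch, not part of the original text.) For the singular
% problem above, every vector of the form coef2 + t*v, where v spans the
% null space of M, fits the data equally well. The pinv result is simply
% the member of that family with the smallest norm. A quick check, using
% M, y, coef1 and coef2 left in the workspace by the sidebar cells:
v = null(M);      % one null space direction, roughly [0 1 -1]'/sqrt(2)
disp 'Both solutions fit the data equally well'
[norm(M*coef1 - y), norm(M*coef2 - y)]
% Minimizing norm(coef2 + t*v) over the scalar t lands us back on the
% pinv solution (to within rounding).
t = -v'*coef2;
disp 'Minimum norm member of the family, next to the pinv solution'
[coef2 + t*v, coef1]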
%% 2. Polynomial regression models
%{
Arguably the most common linear regression model is the polynomial
model. In a simple case, we may wish to estimate the coefficients
(a and b) of the model

  y = a*x + b

As I showed in the previous section, this is easily done using \,
or any of a variety of other tools in matlab. We could also have
done the regression estimation using polyfit. Note that polyfit
returns its polynomial with terms in order from the highest power
down.

You can also build more general polynomial models, with your choice
of terms, or in multiple dimensions, using polyfitn. It's here on
the file exchange:

http://www.mathworks.com/matlabcentral/fileexchange/loadFile.do?objectId=10065&objectType=FILE
%}

%%
% A linear model estimated using polyfit
x = sort(rand(20,1));
y = 3*x + 2 + randn(size(x));
p = polyfit(x,y,1)

%% 3. Weighted regression models
%{
What do you do when you have weights? How should we interpret
regression weights anyway?

Suppose we knew one particular data point had much lower error
than the rest. We might just choose to replicate that data point
multiple times. That replicated point will drag the sums of
squares of errors around. Let's try it out.
%}

%%
x = sort(rand(20,1));
y = 2 + 3*x + randn(size(x));

% replicating the point (1,5) 20 times
nreps = 20;
x2 = [x;ones(nreps,1)];
y2 = [y;repmat(5,nreps,1)];

% and solve
M2 = [ones(size(x2)),x2];
coef = M2\y2
yhat = M2*coef;
close
plot(x2,y2,'o',x2,yhat,'-')
title 'A weighted regression, weighting by replication'
% Note that the error in the replicated point was presumably pretty
% small, much lower than in the rest of the data.

%%
% We can emulate this point replication using a weighted regression.
% Note the sqrt(nreps) in this approach to the weighted regression.
x3 = [x;1];
y3 = [y;5];
nreps = 20;
weights = [ones(size(y));sqrt(nreps)];
M3 = [ones(size(x3)),x3];
% Just multiply each row of M and the corresponding y by its weight.
coef = (repmat(weights,1,2).*M3)\(weights.*y3)

%%
% Weighted regression is one of the abilities of lscov.
weights = [ones(size(y));nreps];
coef = lscov(M3,y3,weights)

%%
% Are regression weights really used as a description of the known
% variance of your data? Clearly the examples above show that
% weights are interpreted as a relative replication factor. Thus
% a weight of k for a point is equivalent to having replicated the
% given point k times.
% Does this mean that a weighted regression with all its weights
% equal to some large value will yield a different result?
n = 20;
x = sort(rand(n,1));
y = 2 + 3*x + randn(size(x));
M = [ones(n,1),x];
% The unweighted result
coef0 = lscov(M,y)

%%
% With weights all equal to 10
w = 10;
weights = w*ones(n,1);
coef1 = lscov(M,y,weights)
% The coefficients are unchanged from the unweighted fit. Likewise,
% any confidence limits derived for the model will also be
% unchanged. Thus weights in this context are PURELY relative
% weights. Doubling all of the weights will not reflect any overall
% belief on your part that the data is more accurate.
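%%
% (An added sketch, not part of the original text.) A common practical
% use of relative weights is heteroscedastic data, where some
% observations are noisier than others. Weighting each point by the
% reciprocal of its noise variance (the usual convention for lscov)
% tends to pull the fit toward the quieter points. The noise levels
% below are invented for illustration.
n = 30;
x = sort(rand(n,1));
sigma = [0.05*ones(n/2,1); 0.5*ones(n/2,1)];  % first half quiet, second half noisy
y = 2 + 3*x + sigma.*randn(n,1);
M = [ones(n,1),x];
disp 'Unweighted fit'
coefu = M\y
disp 'Weighted fit, weights = 1./sigma.^2'
coefw = lscov(M,y,1./sigma.^2)
% Over many repetitions coefw will usually be closer to [2;3] than
% coefu, though any single run can go either way.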
%% 4. Robust estimation
%{
Outliers are the source of many difficulties for estimation problems.
Least squares estimation, linear or nonlinear, will be dragged around
by points with large residuals. If these large residual points do not
arise from the expected normal distribution, but actually arise from
some other distribution mixed in, then the least squares estimates
may well be wildly off.

In this case, some sort of trimming or iterative re-weighting scheme
may be appropriate. Iterative re-weighting simply means to compute
a regression model, then generate weights which are somehow inversely
related to the residual magnitude. Very often this relationship will
be highly nonlinear; perhaps the 5% of the points with the largest
residuals will be assigned a zero weight, and the rest of the points
their normal weight. Then redo the regression model as a weighted
regression.
%}

%%
n = 50;
m = 5;
x = rand(n,1);
y = 2 + 3*x + randn(size(x))/10;

% Now mix in a few outliers. Since the data is in random order,
% just add some noise to the first few data points.
y(1:m) = y(1:m) + exp(rand(m,1)*3);
close
plot(x,y,'o')
title 'Outliers in the data'

%%
% Fit with a simple first order linear model
M = [ones(n,1),x];
coef0 = M\y

%%
% Compute the residuals for the current model, then make up some
% weights based on the residuals, then fit. Iterate a few times.
for i = 1:3
  res = M*coef0 - y;
  weights = exp(-3*abs(res)/max(abs(res)));
  % compute the weighted estimate using these weights
  coef1 = lscov(M,y,weights)
  coef0 = coef1;
end

%{
This final estimate of coef1 will usually be closer to the known
coefficients than the first (unweighted) estimate.

I chose a fairly arbitrary weight transformation. For those who are
interested, potentially better choices may be found in one of these
texts:

"Robust Statistics", P.J. Huber

"Data Analysis and Regression: A Second Course in Statistics",
F. Mosteller, J.W. Tukey

"Understanding Robust and Exploratory Data Analysis", D.C. Hoaglin
%}
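%%
% (An added sketch, not part of the original text.) Another common
% re-weighting scheme pushes the fit toward a least absolute deviations
% (L1) solution: weight each point by 1/|residual|, guarded against
% division by zero, and iterate. Users with the Statistics Toolbox may
% prefer robustfit, which packages a more carefully tuned version of
% this re-weighting idea. The bare loop below just shows the mechanics,
% reusing the outlier-contaminated x, y and M from the cells above.
coefL1 = M\y;                  % start from the ordinary least squares fit
for i = 1:10
  res = M*coefL1 - y;
  w = 1./max(abs(res),1e-6);   % a small floor keeps the weights finite
  coefL1 = lscov(M,y,w);
end
disp 'Iteratively re-weighted (approximate L1) estimate'
coefL1
% With the outliers mixed in above, coefL1 will usually land closer to
% the true coefficients [2;3] than the plain least squares fit did.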