?? rafisher2cda.m
字號:
function [RAFisher2cda] = RAFisher2cda(X,pp,c,alpha)
% RAFISHER2CDA Canonical Discriminant Analysis.
%
%[As well as the previous RAFisher1, this file was named RAFisher2 in honor to
%Sir Ronald Aylmer Fisher (17 Feb.,1890-29 Jul.,1962), one of the most insightful
%and influential statisticians of all times. His pioneering work on discriminant
%functions was made in 1936 (a pdf file of this paper is attached in this zip),
%presenting an analysis on data collected by Edgar Anderson on three species of
%iris flowers (a classical set of multivariate data and also attached in this zip).]
%
%-Again, we encourage to all the users to initiate a chain of petitions to make
%possible to carry to the big or small screen the life of this notable genius of our
%time. A possible title of the film would be 'Fisher's geometric mind'.-His work shaped
%the world of statistics, he made it as we know it today.-
%
%While RAFisher1 is a procedure that produces very different functions for classification
%that are also called linear discriminant analysis, RAFisher2cda is a dimension-reduction
%technique related to principal component analysis and canonical correlation called
%canonical discriminant analysis. It derives the canonical coefficients parallels that
%of one-way MANOVA and it finds linear combinations of the quantitative variables that
%provide maximal separation between the classes or groups in much the same way that
%principal components summarize total variation. The output produced are the canonical
%coefficients and the scored canonical variables. The canonical coefficients are rotated.
%Also, it proceeds with a Bartlett's approximate chi-squared statistic for testing the
%canonical correlation coefficients.
%
%In summary, the canonical discriminant analysis:
% - Transform the variables so that the pooled within-group covariance matrix is
% an identity matrix.
% - Compute group means on the transformed variables.
% - Perform a principal component analysis on the means, weighting each mean by
% the number of observations in the group. The eigenvalues are equal to the ratio
% of between-group variation to the within-group variation in the direction of
% each principal component. Here, the principal component analysis is runned by
% the singular value decomposition.
% - Back-transform the principal components into the space of the original variables,
% obtaining the canonical variables.
%
%File gives you the option to get an unbiased or maximum-likelihood parameter estimation.
%
% Syntax: function [RAFisher2cda] = RAFisher2cda(X,pp,c,alpha)
% % Inputs:% X - multivariate data matrix.
% pp - vector of prior probabilities (unknown,pp = 1;
% known [default], pp = 2 [you must to give it]).
% c - asking for ellipse confidence bounds (yes:c = 1;
% [default], no:c = 2).
% alpha - significance level (default = 0.05). Used for the
% Chi-square tests and for the ellipse confidence bounds.
% % Output:% - Canonical discriminant functions. It includes:
% constant, variates, eigenvalues, percentages and
% cumulative percentages. Also the Bartlett's approximate
% chi-squared statistic for testing the canonical correlation
% coefficients. Optionally, it also can show you the pair-wise
% canonical scores with its ellipses confidence bounds.
%
% Example: From the Table 11.5 (Iris data) of Johnson and Wichern (1992, p. 562),
% with 150 observations (n = 150), four variables (p = 4) [all = cm] and
% three groups (g = 3; sizes: 50,50,50). We are interested to apply a
% Canonical Discriminant Analysis with an unbiased parameter estimation.
% It is considering equal prior probabilities (pp = 1) and the groups
% ellipse bound with a probability of 0.95 (alpha value of 0.05).
%
% ------------------------------------------------------------------------------
% 1 2 3
% ------------------------------------------------------------------------------
% x1 x2 x3 x4 x1 x2 x3 x4 x1 x2 x3 x4
% ------------------------------------------------------------------------------
% 5.1 3.5 1.4 0.2 7.0 3.2 4.7 1.4 6.3 3.3 6.0 2.5
% 4.9 3.0 1.4 0.2 6.4 3.2 4.5 1.5 5.8 2.7 5.1 1.9
% 4.7 3.2 1.3 0.2 6.9 3.1 4.9 1.5 7.1 3.0 5.9 2.1
% 4.6 3.1 1.5 0.2 5.5 2.3 4.0 1.3 6.3 2.9 5.6 1.8
% 5.0 3.6 1.4 0.2 6.5 2.8 4.6 1.5 6.5 3.0 5.8 2.2
% 5.4 3.9 1.7 0.4 5.7 2.8 4.5 1.3 7.6 3.0 6.6 2.1
% . . . . . . . . . . . .
% . . . . . . . . . . . .
% . . . . . . . . . . . .
% 5.1 3.8 1.6 0.2 5.7 2.9 4.2 1.3 6.3 2.5 5.0 1.9
% 4.6 3.2 1.4 0.2 6.2 2.9 4.3 1.3 6.5 3.0 5.2 2.0
% 5.3 3.7 1.5 0.2 5.1 2.5 3.0 1.1 6.2 3.4 5.4 2.3
% 5.0 3.3 1.4 0.2 5.7 2.8 4.1 1.3 5.9 3.0 5.1 1.8
% ------------------------------------------------------------------------------
%
% Group 1 = Iris setosa (n1 = 50); Group 2 = Iris versicolor (n2 = 50);
% Group 3 = Iris virginica (n3 = 50).
% Var1 (x1) = sepal length; Var2 (x2) = sepal with;
% Var3 (x3) = petal length; Var4 (x4) = petal with.
%
% Total data matrix must be:
% You can get the X-matrix by calling to iris data file provided in
% the zip as
% load path-drive:irisdata
%
% Calling on Matlab the function:
% RAFisher2cda(X,1)
%
% Answer is:
%
% It was asking for the ellipses confidence bounds.
% Do you want an unbiased (1) or maximum-likelihood parameter estimation? (2):1
%
% Canonical Discriminant Functions.
% --------------------------------------------------------------------------------
% Constant =
% -2.1051 6.6615
%
% Variates =
% 0.8294 0.0241
% 1.5345 2.1645
% -2.2012 -0.9319
% -2.8105 2.8392
%
% Eigenvalue =
% 32.1919 0.2854
%
% Percentage =
% 99.1213 0.8787
%
% CumulativePercentage =
% 99.1213 100.0000
% --------------------------------------------------------------------------------
% Functions = columns. On variates, Variate1 = first row and so forth to 4
%
% Chi-square Tests with Successive Roots Removed.
% -------------------------------------------------------------------------
% Removed Eigenvalue CanCor LW Chi-sqr. df P
% -------------------------------------------------------------------------
% 32.1919 0.9848 0.0234 546.1153 8 0.0000
% 1 0.2854 0.4712 0.7780 36.7807 3 0.0000
% -------------------------------------------------------------------------
% With a given significance of: 0.05
% If P-value >= alpha, results not significative. Else, it is significative.
%
% The pair-wise plots you can get are: 1
% --------------
% plots =
%
% 1 2
%
% --------------
% Give me the pair of factors to plot [a b]: [1 2]
% Do you need another plot? (y/n): n
%
%
% Created by A. Trujillo-Ortiz, R. Hernandez-Walls and S. Perez-Osuna
% Facultad de Ciencias Marinas
% Universidad Autonoma de Baja California
% Apdo. Postal 453
% Ensenada, Baja California
% Mexico.
% atrujo@uabc.mx
%
% Copyright (C) April 15, 2004.
%
% To cite this file, this would be an appropriate format:
% Trujillo-Ortiz, A., R. Hernandez-Walls and S. Perez-Osuna. (2004).
% RAFisher2cda:Canonical Discriminant Analysis. A MATLAB file.
% [WWW document]. URL http://www.mathworks.com/matlabcentral/fileexchange/
% loadFile.do?objectId=4836&objectType=FILE
%
% References:% % Johnson, R. A. and Wichern, D. W. (1992), Applied Multivariate Statistical Analysis.
% 3rd. ed. New-Jersey:Prentice Hall. Chapter 11, pp. 493-572.
% Fisher, R. A. (1936), The use of multiple measurements in taxonomic problems. Annals
% of Eugenics, 7: 179-188.
%
if nargin < 4,
alpha = 0.05; %(default)
end;
if (alpha <= 0 | alpha >= 1)
fprintf('Warning: significance level must be between 0 and 1\n');
return;
end;
if nargin < 3,
c = 1; %(default)
end;
disp(' ')
if c == 1;
fprintf('It was asking for the ellipses confidence bounds.');
else c == 2;
fprintf('It was not asking for the ellipses confidence bounds.');
end;
disp(' ')
if nargin < 2,
pp = 1; %(default)
end;
if nargin < 1,
error('Requires at least one input arguments.');
return;
end;
Y = X(:,2:end);
G = X(:,1);
g = max(G); %number of groups
%Vector of sample sizes per group.
n = [];
indice = X(:,1);
for i = 1:g
Xe = find(indice==i);
eval(['X' num2str(i) '= X(Xe,2);']);
eval(['n' num2str(i) '= length(X' num2str(i) ') ;'])
eval(['xn = n' num2str(i) ';'])
n = [n,xn];
end;
r = 1;
r1 = n(1);
%Partition of the group mean and covariance matrices.
M = [];
for k = 1:g
eval(['M' num2str(k) '= mean(Y(r:r1,:));']);
eval(['m = M' num2str(k) ';'])
if k < g
r = r+n(k);
r1 = r1+n(k+1);
end;
M = [M;m];
end;
?? 快捷鍵說明
復制代碼
Ctrl + C
搜索代碼
Ctrl + F
全屏模式
F11
切換主題
Ctrl + Shift + D
顯示快捷鍵
?
增大字號
Ctrl + =
減小字號
Ctrl + -