knndd.m
%KNNDD K-Nearest neighbour data description method.
%
%       W = KNNDD(A,FRACREJ,K,METHOD)
%
% Calculates the K-Nearest neighbour data description on dataset A.
% Three methods are defined to compute a distance to the dataset using
% the k-nearest neighbours:
%
%   METHOD    does:
%   'kappa'   use distance to the k-th nearest neighbour
%   'delta'   distance to the average of the k-nn's
%   'gamma'   average distance to the k-nn's
%
% When K is empty or not given, it is optimized using knn_optk; when
% K <= 0, round(sqrt(n)) is used, where n is the number of training
% objects.

% Copyright: D. Tax, davidt@ph.tn.tudelft.nl
% Faculty of Applied Physics, Delft University of Technology
% P.O. Box 5046, 2600 GA Delft, The Netherlands

function W = knndd(a,fracrej,k,method)

if nargin < 4, method = 'kappa'; end
if nargin < 3, k = []; end
if nargin < 2 | isempty(fracrej), fracrej = 0.05; end
if nargin < 1 | isempty(a) % empty knndd
	W = mapping(mfilename,{fracrej,k,method});
	W = setname(W,'K-Nearest neighbour data description');
	return
end

if ~ismapping(fracrej)           %training

	% some checking of datatypes and sizes:
	a = +target_class(a);     % make sure we have a OneClass dataset
	[m,d] = size(a);
	if (m<2)
		warning([mfilename ': Dataset contains less than 2 objects']);
	end
	if (k>=m)
		error(['More neighbors than training samples are requested! (max=',...
			num2str(m-1),')']);
	end
	if isa(k,'char')
		error('Argument k should define the number of neighbors');
	end

	% the most important thing:
	distmat = sqeucldistm(a,a);

	% if k is not defined, find the optimal k by optimizing the loglikelihood:
	if isempty(k)
		k = knn_optk(distmat,d);
	else
		%tricky, when k<=0 we use the default sqrt(n) solution...
		if (k<=0)
			k = round(sqrt(m));
		end
	end
	if (k<1)
		warning([mfilename ': K must be positive (>0)']);
	end

	% column 1 of sD is the zero distance of each object to itself,
	% so the k-th neighbour sits in column k+1:
	[sD,I] = sort(distmat,2);

	% different treatment by different methods:
	switch method
	case 'kappa'
		fit = sD(:,k+1);
	case 'delta'
		nn = zeros(m,d);
		for i=2:k+1
			nn = nn + a(I(:,i),:);
		end
		nn = (+a - (nn/(k)));
		fit = sum(nn.*nn,2);
	case 'gamma'
		fit = mean(sD(:,(2:(k+1))),2);
	otherwise
		error([mfilename,': Unknown method']);
	end

	%now obtain the threshold:
	thresh = dd_threshold(fit,1-fracrej);

	%and save all useful data:
	W.x = +a;
	W.k = k;
	W.method = method;
	W.threshold = thresh;
	W.scale = mean(fit);
	W = mapping(mfilename,'trained',W,str2mat('target','outlier'),d,2);
	W = setname(W,'K-Nearest neighbour data description');

else                               %testing

	W = getdata(fracrej);  % unpack
	[m,d] = size(a);

	%compute:
	distmat = sqeucldistm(+a,W.x); %dist between test and train
	[sD,I] = sort(distmat,2);

	% different treatment by different methods
	% (no self-distance column here, so the k-th neighbour is column k):
	switch W.method
	case 'kappa'
		ind = sD(:,W.k);
		%ind = sD(:,W.k+1);
	case 'delta'
		nn = zeros(m,d);
		%for i=1:W.k+1
		for i=1:W.k
			nn = nn + W.x(I(:,i),:);
		end
		nn = (+a - (nn/(W.k)));
		ind = sum(nn.*nn,2);
	case 'gamma'
		ind = mean(sD(:,(1:(W.k))),2);
	otherwise
		error([mfilename,': Unknown method']);
	end

	% store the results in the final dataset:
	out = -[ind repmat(W.threshold,[m,1])];
	W = setdat(a,out,fracrej);

end
return
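
% Illustrative sketch (not part of the original file): the three distance
% definitions from the help text ('kappa', 'delta', 'gamma'), worked out
% with base MATLAB only for one test object and a made-up training set.
% All names below (X, z, diffs, D, nn_mean, gamma_d) are assumptions
% chosen for this example. Distances are squared Euclidean, which is what
% the name sqeucldistm suggests the function above works with. The block
% is kept inside a block comment so that it does not alter the function;
% paste it into the command window to try it.
%{
X = [0 0; 1 0; 0 2; 3 3];           % toy training objects, one per row
z = [2 2];                          % toy test object
k = 2;                              % number of neighbours to use

diffs = bsxfun(@minus, X, z);       % difference of every training object to z
D = sum(diffs.^2, 2)';              % 1-by-4 squared distances: [8 5 4 2]
[sD, I] = sort(D, 2);               % ascending, as in the testing branch

kappa   = sD(k);                    % squared distance to the k-th neighbour: 4
nn_mean = mean(X(I(1:k), :), 1);    % average of the k nearest neighbours: [1.5 2.5]
delta   = sum((z - nn_mean).^2);    % squared distance to that average: 0.5
gamma_d = mean(sD(1:k));            % mean squared distance to the k-nn's: 3
%}
% In the training branch the same quantities are read one column further
% along (k+1 instead of k), because every training object is its own
% nearest neighbour at distance zero.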