
From: GzLi (笑梨), Board: DataMining
Subject: [Forwarded] Re: What is a validation set? (forwarded)
Site: Nanjing University Lily BBS (Tue Dec 24 23:13:58 2002)

[The following text was forwarded from GzLi's mailbox]
[Originally posted by <GzLi@smth.edu.cn>]

Source: 211.68.16.32

From: IKM (IKM), Board: AI

Subject: Re: What is a validation set?

Site: SMTH BBS (Tue Dec 24 13:06:24 2002)


From the ANN FAQ:


Subject: What are the population, sample, training set, design set, validation set, and test set?

It is rarely useful to have a NN simply memorize a set of data, since memorization can be done much more efficiently by numerous algorithms for table look-up. Typically, you want the NN to be able to perform accurately on new data, that is, to generalize.

There seems to be no term in the NN literature for the set of all cases that you want to be able to generalize to. Statisticians call this set the "population". Tsypkin (1971) called it the "grand truth distribution," but this term has never caught on.


Neither is there a consistent term in the NN literature for the set of cases that are available for training and evaluating an NN. Statisticians call this set the "sample". The sample is usually a subset of the population.


(Neurobiologists mean something entirely different by "population," apparently some collection of neurons, but I have never found out the exact meaning. I am going to continue to use "population" in the statistical sense until NN researchers reach a consensus on some other terms for "population" and "sample"; I suspect this will never happen.)


In NN methodology, the sample is often subdivided into "training", "validation", and "test" sets. The distinctions among these subsets are crucial, but the terms "validation" and "test" sets are often confused. Bishop (1995), an indispensable reference on neural networks, provides the following explanation (p. 372):


Since our goal is to find the network having the best performance on new data, the simplest approach to the comparison of different networks is to evaluate the error function using data which is independent of that used for training. Various networks are trained by minimization of an appropriate error function defined with respect to a training data set. The performance of the networks is then compared by evaluating the error function using an independent validation set, and the network having the smallest error with respect to the validation set is selected. This approach is called the hold out method. Since this procedure can itself lead to some overfitting to the validation set, the performance of the selected network should be confirmed by measuring its performance on a third independent set of data called a test set.
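
In Python/NumPy terms, the hold-out procedure might look like the sketch below. The synthetic data and the polynomial fits standing in for "various networks" are illustrative assumptions, not part of the FAQ:

import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression sample (hypothetical stand-in for real data).
x = rng.uniform(-1, 1, 300)
y = np.sin(3 * x) + rng.normal(0, 0.2, 300)

# Hold-out split: training / validation / test.
x_tr, y_tr = x[:200], y[:200]
x_va, y_va = x[200:250], y[200:250]
x_te, y_te = x[250:], y[250:]

def mse(coef, xs, ys):
    # Error function: mean squared error of a polynomial fit.
    return np.mean((np.polyval(coef, xs) - ys) ** 2)

# "Various networks": polynomials of different degree stand in for
# networks of different architecture, all fit on the training set.
candidates = {d: np.polyfit(x_tr, y_tr, d) for d in (1, 3, 5, 9)}

# Select the model with the smallest validation error...
best = min(candidates, key=lambda d: mse(candidates[d], x_va, y_va))

# ...and confirm its performance on the independent test set.
print(f"chosen degree: {best}")
print(f"validation MSE: {mse(candidates[best], x_va, y_va):.4f}")
print(f"test MSE:       {mse(candidates[best], x_te, y_te):.4f}")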

And there is no book in the NN literature more authoritative than Ripley (1996), from which the following definitions are taken (p. 354):

Training set:
A set of examples used for learning, that is to fit the parameters [i.e., weights] of the classifier.

Validation set:
A set of examples used to tune the parameters [i.e., architecture, not weights] of a classifier, for example to choose the number of hidden units in a neural network.

Test set:
A set of examples used only to assess the performance [generalization] of a fully-specified classifier.
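
These three roles correspond to a three-way partition of the sample. A minimal sketch of such a partition; the 50/25/25 proportions are an arbitrary illustration, not something the FAQ prescribes:

import numpy as np

def split_sample(n_cases, frac_train=0.5, frac_val=0.25, seed=0):
    """Partition case indices into training / validation / test sets.

    Training set:   fits the weights.
    Validation set: tunes the architecture (e.g. number of hidden units).
    Test set:       assesses the fully-specified classifier, once.
    """
    idx = np.random.default_rng(seed).permutation(n_cases)
    n_tr = int(frac_train * n_cases)
    n_va = int(frac_val * n_cases)
    return idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]

train_idx, val_idx, test_idx = split_sample(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 500 250 250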

The literature on machine learning often reverses the meaning of "validation" and "test" sets. This is the most blatant example of the terminological confusion that pervades artificial intelligence research.

The crucial point is that a test set, by the standard definition in the NN literature, is never used to choose among two or more networks, so that the error on the test set provides an unbiased estimate of the generalization error (assuming that the test set is representative of the population, etc.). Any data set that is used to choose the best of two or more networks is, by definition, a validation set, and the error of the chosen network on the validation set is optimistically biased.
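
This optimistic bias is easy to demonstrate numerically: if several candidate models all have the same true error, the minimum of their noisy validation errors lies, on average, below the truth, while an independent test error does not. A small simulation under purely artificial assumptions (each "network" is a fixed classifier with a known 30% true error rate):

import numpy as np

rng = np.random.default_rng(1)
true_error = 0.30          # every candidate truly errs 30% of the time
n_val, n_test = 200, 200   # validation and test set sizes
n_models, n_trials = 10, 2000

chosen_val, chosen_test = [], []
for _ in range(n_trials):
    # Observed error of each model on the validation set (binomial noise).
    val_err = rng.binomial(n_val, true_error, n_models) / n_val
    best = np.argmin(val_err)                            # choose on validation
    test_err = rng.binomial(n_test, true_error) / n_test # independent test
    chosen_val.append(val_err[best])
    chosen_test.append(test_err)

print(f"true error:                            {true_error:.3f}")
print(f"mean validation error of chosen model: {np.mean(chosen_val):.3f}")  # biased low
print(f"mean test error of chosen model:       {np.mean(chosen_test):.3f}") # ~unbiased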


There is a problem with the usual distinction between training and validation sets. Some training approaches, such as early stopping, require a validation set, so in a sense, the validation set is used for training. Other approaches, such as maximum likelihood, do not inherently require a validation set. So the "training" set for maximum likelihood might encompass both the "training" and "validation" sets for early stopping. Greg Heath has suggested the term "design" set be used for cases that are used solely to adjust the weights in a network, while "training" set be used to encompass both design and validation sets. There is considerable merit to this suggestion, but it has not yet been widely adopted.


But things can get more complicated. Suppose you want to train nets with 5, 10, and 20 hidden units using maximum likelihood, and you want to train nets with 20 and 50 hidden units using early stopping. You also want to use a validation set to choose the best of these various networks. Should you use the same validation set for early stopping that you use for the final network choice, or should you use two separate validation sets? That is, you could divide the sample into 3 subsets, say A, B, C, and proceed as follows:


Do maximum likelihood using A.
Do early stopping with A to adjust the weights and B to decide when to stop (this makes B a validation set).
Choose among all 3 nets trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on B (the validation set).
Estimate the generalization error of the chosen network using C (the test set).
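
A runnable sketch of this three-subset plan, using a tiny NumPy multilayer perceptron as a stand-in for the actual networks; the data, step counts, and learning rate are all illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D regression sample, divided into subsets A, B, C.
x = rng.uniform(-1.0, 1.0, (300, 1))
y = np.sin(3 * x) + rng.normal(0.0, 0.1, (300, 1))
A, B, C = slice(0, 150), slice(150, 225), slice(225, 300)

def init(h):
    # One-hidden-layer tanh net with h hidden units.
    return [rng.normal(0, 0.5, (1, h)), np.zeros(h),
            rng.normal(0, 0.5, (h, 1)), np.zeros(1)]

def forward(w, xs):
    W1, b1, W2, b2 = w
    return np.tanh(xs @ W1 + b1) @ W2 + b2

def err(w, s):
    # Mean squared error on subset s.
    return float(np.mean((forward(w, x[s]) - y[s]) ** 2))

def step(w, s, lr=0.05):
    # One full-batch gradient-descent step on subset s.
    W1, b1, W2, b2 = w
    hid = np.tanh(x[s] @ W1 + b1)
    d = 2 * (hid @ W2 + b2 - y[s]) / len(y[s])
    dpre = (d @ W2.T) * (1 - hid ** 2)
    return [W1 - lr * x[s].T @ dpre, b1 - lr * dpre.sum(0),
            W2 - lr * hid.T @ d, b2 - lr * d.sum(0)]

# 1. "Maximum likelihood": train to (approximate) convergence on A.
nets = {}
for h in (5, 10, 20):
    w = init(h)
    for _ in range(3000):
        w = step(w, A)
    nets[f"ml-{h}"] = w

# 2. Early stopping: adjust weights on A, use B to decide when to stop.
for h in (20, 50):
    w, best_w, best_e = init(h), None, np.inf
    for _ in range(3000):
        w = step(w, A)
        e = err(w, B)
        if e < best_e:
            best_e, best_w = e, [p.copy() for p in w]
    nets[f"es-{h}"] = best_w

# 3. Choose among all five nets by their error on B (the validation set).
chosen = min(nets, key=lambda k: err(nets[k], B))

# 4. Estimate generalization error of the chosen net on C (the test set).
print(chosen, err(nets[chosen], B), err(nets[chosen], C))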

Or you could divide the sample into 4 subsets, say A, B, C, and D, and proceed as follows:

Do maximum likelihood using A and B combined.
Do early stopping with A to adjust the weights and B to decide when to stop (this makes B a validation set with respect to early stopping).
Choose among all 3 nets trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on C (this makes C a second validation set).
Estimate the generalization error of the chosen network using D (the test set).

Or, with the same 4 subsets, you could take a third approach:

Do maximum likelihood using A.
Choose among the 3 nets trained by maximum likelihood based on the error computed on B (the first validation set).
Do early stopping with A to adjust the weights and B (the first validation set) to decide when to stop.
Choose among the best net trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on C (the second validation set).
Estimate the generalization error of the chosen network using D (the test set).
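
The three plans differ only in which subset feeds which stage, which can be summarized compactly as data (subset names as in the FAQ; the stage labels are mine):

# Which subset(s) each stage uses under the three plans described above.
plans = {
    "3-subset (A,B,C)": {
        "maximum likelihood training":          "A",
        "early-stopping weights / stop signal": "A / B",
        "final model choice":                   "B",
        "generalization estimate":              "C",
    },
    "4-subset, plan 1 (A,B,C,D)": {
        "maximum likelihood training":          "A+B",
        "early-stopping weights / stop signal": "A / B",
        "final model choice":                   "C",
        "generalization estimate":              "D",
    },
    "4-subset, plan 2 (A,B,C,D)": {
        "maximum likelihood training":          "A",
        "ML model choice, then stop signal":    "B",
        "final model choice":                   "C",
        "generalization estimate":              "D",
    },
}
for name, stages in plans.items():
    print(name)
    for stage, subsets in stages.items():
        print(f"  {stage}: {subsets}")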

You could argue that the first approach is biased towards choosing a net trained by early stopping. Early stopping involves a choice among a potentially large number of networks, and therefore provides more opportunity for overfitting the validation set than does the choice among only 3 networks trained by maximum likelihood. Hence if you make the final choice of networks using the same validation set (B) that was used for early stopping, you give an unfair advantage to early stopping. If you are writing an article to compare various training methods, this bias could be a serious flaw. But if you are using NNs for some practical application, this bias might not matter at all, since you obtain an honest estimate of generalization error using C.

You could also argue that the second and third approaches are too wasteful in their use of data. This objection could be important if your sample contains 100 cases, but will probably be of little concern if your sample contains 100,000,000 cases. For small samples, there are other methods that make more efficient use of data; see "What are cross-validation and bootstrapping?"


References:

Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford: Oxford University Press.

Ripley, B.D. (1996), Pattern Recognition and Neural Networks, Cambridge: Cambridge University Press.

Tsypkin, Y. (1971), Adaptation and Learning in Automatic Systems, NY: Academic Press.


------------------------------------------------------------------------



[In xinjian's post, it was mentioned:]
: They are not the same thing.
: I have seen systems that need three data sets: a training set, a validation set, and a test set.
: I can't tell them apart clearly either.
: My feeling is that the training set is for training the parameters of the learning process,
: while the validation set, I think, is a simple test; but whether it plays a selection role (picking among several learning algorithms...
: or a second-training role, I'm not sure. Perhaps that is up to the user.
: And the test set is the final judgment.
: [In Karpov's post, it was mentioned:]
: : Is it the validation set, and is it the same thing as the test set?


--


※ Source: SMTH BBS smth.org [FROM: 61.149.31.217]


--


※ Source: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 211.68.16.32]
--
※ Crossposted: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 211.80.38.17]
