
From: GzLi (笑梨), Board: DataMining
Subject: [Forwarded] Re: What is a validation set? (forwarded)
Site: Nanjing University Lily BBS (Tue Dec 24 23:13:58 2002)

[The following text was forwarded from GzLi's mailbox]
[Originally posted by <GzLi@smth.edu.cn>]

Source: 211.68.16.32

From: IKM (IKM), Board: AI

Subject: Re: What is a validation set?

Site: SMTH BBS (Tue Dec 24 13:06:24 2002)


From the ANN FAQ:


Subject: What are the population, sample, training set, design set, validation set, and test set?

It is rarely useful to have a NN simply memorize a set of data, since memorization can be done much more efficiently by numerous algorithms for table look-up. Typically, you want the NN to be able to perform accurately on new data, that is, to generalize.

There seems to be no term in the NN literature for the set of all cases that you want to be able to generalize to. Statisticians call this set the "population". Tsypkin (1971) called it the "grand truth distribution," but this term has never caught on.


Neither is there a consistent term in the NN literature for the set of cases that are available for training and evaluating an NN. Statisticians call this set the "sample". The sample is usually a subset of the population.


(Neurobiologists mean something entirely different by "population," apparently some collection of neurons, but I have never found out the exact meaning. I am going to continue to use "population" in the statistical sense until NN researchers reach a consensus on some other terms for "population" and "sample"; I suspect this will never happen.)


In NN methodology, the sample is often subdivided into "training", "validation", and "test" sets. The distinctions among these subsets are crucial, but the terms "validation" and "test" sets are often confused. Bishop (1995), an indispensable reference on neural networks, provides the following explanation (p. 372):


Since our goal is to find the network having the best performance on new data, the simplest approach to the comparison of different networks is to evaluate the error function using data which is independent of that used for training. Various networks are trained by minimization of an appropriate error function defined with respect to a training data set. The performance of the networks is then compared by evaluating the error function using an independent validation set, and the network having the smallest error with respect to the validation set is selected. This approach is called the hold out method. Since this procedure can itself lead to some overfitting to the validation set, the performance of the selected network should be confirmed by measuring its performance on a third independent set of data called a test set.
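
In Python/NumPy terms, the hold-out procedure might look like the sketch below. The synthetic data and the polynomial fits standing in for "various networks" are illustrative assumptions, not part of the FAQ:

import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression sample (hypothetical stand-in for real data).
x = rng.uniform(-1, 1, 300)
y = np.sin(3 * x) + rng.normal(0, 0.2, 300)

# Hold-out split: training / validation / test.
x_tr, y_tr = x[:200], y[:200]
x_va, y_va = x[200:250], y[200:250]
x_te, y_te = x[250:], y[250:]

def mse(coef, xs, ys):
    # Error function: mean squared error of a polynomial fit.
    return np.mean((np.polyval(coef, xs) - ys) ** 2)

# "Various networks": polynomials of different degree stand in for
# networks of different architecture, all fit on the training set.
candidates = {d: np.polyfit(x_tr, y_tr, d) for d in (1, 3, 5, 9)}

# Select the model with the smallest validation error...
best = min(candidates, key=lambda d: mse(candidates[d], x_va, y_va))

# ...and confirm its performance on the independent test set.
print(f"chosen degree: {best}")
print(f"validation MSE: {mse(candidates[best], x_va, y_va):.4f}")
print(f"test MSE:       {mse(candidates[best], x_te, y_te):.4f}")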

And there is no book in the NN literature more authoritative than Ripley (1996), from which the following definitions are taken (p. 354):

Training set:
A set of examples used for learning, that is to fit the parameters [i.e., weights] of the classifier.

Validation set:
A set of examples used to tune the parameters [i.e., architecture, not weights] of a classifier, for example to choose the number of hidden units in a neural network.

Test set:
A set of examples used only to assess the performance [generalization] of a fully-specified classifier.
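
These three roles correspond to a three-way partition of the sample. A minimal sketch of such a partition; the 50/25/25 proportions are an arbitrary illustration, not something the FAQ prescribes:

import numpy as np

def split_sample(n_cases, frac_train=0.5, frac_val=0.25, seed=0):
    """Partition case indices into training / validation / test sets.

    Training set:   fits the weights.
    Validation set: tunes the architecture (e.g. number of hidden units).
    Test set:       assesses the fully-specified classifier, once.
    """
    idx = np.random.default_rng(seed).permutation(n_cases)
    n_tr = int(frac_train * n_cases)
    n_va = int(frac_val * n_cases)
    return idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]

train_idx, val_idx, test_idx = split_sample(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 500 250 250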

The literature on machine learning often reverses the meaning of "validation" and "test" sets. This is the most blatant example of the terminological confusion that pervades artificial intelligence research.

The crucial point is that a test set, by the standard definition in the NN literature, is never used to choose among two or more networks, so that the error on the test set provides an unbiased estimate of the generalization error (assuming that the test set is representative of the population, etc.). Any data set that is used to choose the best of two or more networks is, by definition, a validation set, and the error of the chosen network on the validation set is optimistically biased.
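
This optimistic bias is easy to demonstrate numerically: if several candidate models all have the same true error, the minimum of their noisy validation errors lies, on average, below the truth, while an independent test error does not. A small simulation under purely artificial assumptions (each "network" is a fixed classifier with a known 30% true error rate):

import numpy as np

rng = np.random.default_rng(1)
true_error = 0.30          # every candidate truly errs 30% of the time
n_val, n_test = 200, 200   # validation and test set sizes
n_models, n_trials = 10, 2000

chosen_val, chosen_test = [], []
for _ in range(n_trials):
    # Observed error of each model on the validation set (binomial noise).
    val_err = rng.binomial(n_val, true_error, n_models) / n_val
    best = np.argmin(val_err)                            # choose on validation
    test_err = rng.binomial(n_test, true_error) / n_test # independent test
    chosen_val.append(val_err[best])
    chosen_test.append(test_err)

print(f"true error:                            {true_error:.3f}")
print(f"mean validation error of chosen model: {np.mean(chosen_val):.3f}")  # biased low
print(f"mean test error of chosen model:       {np.mean(chosen_test):.3f}") # ~unbiased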


There is a problem with the usual distinction between training and validation sets. Some training approaches, such as early stopping, require a validation set, so in a sense, the validation set is used for training. Other approaches, such as maximum likelihood, do not inherently require a validation set. So the "training" set for maximum likelihood might encompass both the "training" and "validation" sets for early stopping. Greg Heath has suggested the term "design" set be used for cases that are used solely to adjust the weights in a network, while "training" set be used to encompass both design and validation sets. There is considerable merit to this suggestion, but it has not yet been widely adopted.


But things can get more complicated. Suppose you want to train nets with 5, 10, and 20 hidden units using maximum likelihood, and you want to train nets with 20 and 50 hidden units using early stopping. You also want to use a validation set to choose the best of these various networks. Should you use the same validation set for early stopping that you use for the final network choice, or should you use two separate validation sets? That is, you could divide the sample into 3 subsets, say A, B, C, and proceed as follows:


Do maximum likelihood using A.
Do early stopping with A to adjust the weights and B to decide when to stop (this makes B a validation set).
Choose among all 3 nets trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on B (the validation set).
Estimate the generalization error of the chosen network using C (the test set).
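
A runnable sketch of this three-subset plan, using a tiny NumPy multilayer perceptron as a stand-in for the actual networks; the data, step counts, and learning rate are all illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D regression sample, divided into subsets A, B, C.
x = rng.uniform(-1.0, 1.0, (300, 1))
y = np.sin(3 * x) + rng.normal(0.0, 0.1, (300, 1))
A, B, C = slice(0, 150), slice(150, 225), slice(225, 300)

def init(h):
    # One-hidden-layer tanh net with h hidden units.
    return [rng.normal(0, 0.5, (1, h)), np.zeros(h),
            rng.normal(0, 0.5, (h, 1)), np.zeros(1)]

def forward(w, xs):
    W1, b1, W2, b2 = w
    return np.tanh(xs @ W1 + b1) @ W2 + b2

def err(w, s):
    # Mean squared error on subset s.
    return float(np.mean((forward(w, x[s]) - y[s]) ** 2))

def step(w, s, lr=0.05):
    # One full-batch gradient-descent step on subset s.
    W1, b1, W2, b2 = w
    hid = np.tanh(x[s] @ W1 + b1)
    d = 2 * (hid @ W2 + b2 - y[s]) / len(y[s])
    dpre = (d @ W2.T) * (1 - hid ** 2)
    return [W1 - lr * x[s].T @ dpre, b1 - lr * dpre.sum(0),
            W2 - lr * hid.T @ d, b2 - lr * d.sum(0)]

# 1. "Maximum likelihood": train to (approximate) convergence on A.
nets = {}
for h in (5, 10, 20):
    w = init(h)
    for _ in range(3000):
        w = step(w, A)
    nets[f"ml-{h}"] = w

# 2. Early stopping: adjust weights on A, use B to decide when to stop.
for h in (20, 50):
    w, best_w, best_e = init(h), None, np.inf
    for _ in range(3000):
        w = step(w, A)
        e = err(w, B)
        if e < best_e:
            best_e, best_w = e, [p.copy() for p in w]
    nets[f"es-{h}"] = best_w

# 3. Choose among all five nets by their error on B (the validation set).
chosen = min(nets, key=lambda k: err(nets[k], B))

# 4. Estimate generalization error of the chosen net on C (the test set).
print(chosen, err(nets[chosen], B), err(nets[chosen], C))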

Or you could divide the sample into 4 subsets, say A, B, C, and D, and proceed as follows:

Do maximum likelihood using A and B combined.
Do early stopping with A to adjust the weights and B to decide when to stop (this makes B a validation set with respect to early stopping).
Choose among all 3 nets trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on C (this makes C a second validation set).
Estimate the generalization error of the chosen network using D (the test set).

Or, with the same 4 subsets, you could take a third approach:

Do maximum likelihood using A.
Choose among the 3 nets trained by maximum likelihood based on the error computed on B (the first validation set).
Do early stopping with A to adjust the weights and B (the first validation set) to decide when to stop.
Choose among the best net trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on C (the second validation set).
Estimate the generalization error of the chosen network using D (the test set).
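
The three plans differ only in which subset feeds which stage, which can be summarized compactly as data (subset names as in the FAQ; the stage labels are mine):

# Which subset(s) each stage uses under the three plans described above.
plans = {
    "3-subset (A,B,C)": {
        "maximum likelihood training":          "A",
        "early-stopping weights / stop signal": "A / B",
        "final model choice":                   "B",
        "generalization estimate":              "C",
    },
    "4-subset, plan 1 (A,B,C,D)": {
        "maximum likelihood training":          "A+B",
        "early-stopping weights / stop signal": "A / B",
        "final model choice":                   "C",
        "generalization estimate":              "D",
    },
    "4-subset, plan 2 (A,B,C,D)": {
        "maximum likelihood training":          "A",
        "ML model choice, then stop signal":    "B",
        "final model choice":                   "C",
        "generalization estimate":              "D",
    },
}
for name, stages in plans.items():
    print(name)
    for stage, subsets in stages.items():
        print(f"  {stage}: {subsets}")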

You could argue that the first approach is biased towards choosing a net trained by early stopping. Early stopping involves a choice among a potentially large number of networks, and therefore provides more opportunity for overfitting the validation set than does the choice among only 3 networks trained by maximum likelihood. Hence if you make the final choice of networks using the same validation set (B) that was used for early stopping, you give an unfair advantage to early stopping. If you are writing an article to compare various training methods, this bias could be a serious flaw. But if you are using NNs for some practical application, this bias might not matter at all, since you obtain an honest estimate of generalization error using C.

You could also argue that the second and third approaches are too wasteful in their use of data. This objection could be important if your sample contains 100 cases, but will probably be of little concern if your sample contains 100,000,000 cases. For small samples, there are other methods that make more efficient use of data; see "What are cross-validation and bootstrapping?"


References:

Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford: Oxford University Press.

Ripley, B.D. (1996), Pattern Recognition and Neural Networks, Cambridge: Cambridge University Press.

Tsypkin, Y. (1971), Adaptation and Learning in Automatic Systems, NY: Academic Press.


------------------------------------------------------------------------



[In xinjian's post, it was mentioned:]
: They are not the same thing.
: I have seen systems that need three data sets: a training set, a validation set, and a test set.
: I can't tell them apart clearly either.
: My feeling is that the training set is for training the parameters of the learning process,
: while the validation set, I think, is a simple test; but whether it plays a selection role (picking among several learning algorithms...
: or a second-training role, I'm not sure. Perhaps that is up to the user.
: And the test set is the final judgment.
: [In Karpov's post, it was mentioned:]
: : Is it the validation set, and is it the same thing as the test set?


--


※ Source: SMTH BBS smth.org [FROM: 61.149.31.217]


--


※ Source: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 211.68.16.32]
--
※ Crossposted: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 211.80.38.17]
