From: GzLi (笑梨), Board: DataMining
Subject: [Forwarded] Re: What is a "validation set"? (forwarded)
Posted at: Nanjing University Lily BBS (Tue Dec 24 23:13:58 2002)

[ The following text was forwarded from GzLi's mailbox ]
[ Originally posted by <GzLi@smth.edu.cn> ]

Source: 211.68.16.32

From: IKM (IKM), Board: AI
Subject: Re: What is a "validation set"?
Posted at: SMTH BBS (Tue Dec 24 13:06:24 2002)

From the ANN FAQ:


Subject: What are the population, sample, training set, design set, validation set, and test set?

It is rarely useful to have a NN simply memorize a set of data, since memorization can be done much more efficiently by numerous algorithms for table look-up. Typically, you want the NN to be able to perform accurately on new data, that is, to generalize.

There seems to be no term in the NN literature for the set of all cases that you want to be able to generalize to. Statisticians call this set the "population". Tsypkin (1971) called it the "grand truth distribution," but this term has never caught on.


Neither is there a consistent term in the NN literature for the set of cases that are available for training and evaluating an NN. Statisticians call this set the "sample". The sample is usually a subset of the population.


(Neurobiologists mean something entirely different by "population," apparently some collection of neurons, but I have never found out the exact meaning. I am going to continue to use "population" in the statistical sense until NN researchers reach a consensus on some other terms for "population" and "sample"; I suspect this will never happen.)


In NN methodology, the sample is often subdivided into "training", "validation", and "test" sets. The distinctions among these subsets are crucial, but the terms "validation" and "test" sets are often confused. Bishop (1995), an indispensable reference on neural networks, provides the following explanation (p. 372):


Since our goal is to find the network having the best performance on new data, the simplest approach to the comparison of different networks is to evaluate the error function using data which is independent of that used for training. Various networks are trained by minimization of an appropriate error function defined with respect to a training data set. The performance of the networks is then compared by evaluating the error function using an independent validation set, and the network having the smallest error with respect to the validation set is selected. This approach is called the hold out method. Since this procedure can itself lead to some overfitting to the validation set, the performance of the selected network should be confirmed by measuring its performance on a third independent set of data called a test set.
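
To make the hold-out method concrete, here is a minimal sketch in Python. The FAQ itself gives no code, so scikit-learn, the synthetic data, and the candidate hidden-unit counts below are illustrative assumptions, not part of the original text:

# Hold-out method: select on the validation set, confirm on the test set.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.RandomState(0)
X = rng.randn(1000, 10)
y = (X[:, 0] + X[:, 1] > 0).astype(int)          # synthetic labels

# Subdivide the sample into training, validation, and test sets.
X_tr, X_hold, y_tr, y_hold = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_te, y_val, y_te = train_test_split(X_hold, y_hold, test_size=0.5, random_state=0)

best_net, best_err = None, np.inf
for h in (5, 10, 20):                            # candidate architectures
    net = MLPClassifier(hidden_layer_sizes=(h,), max_iter=1000,
                        random_state=0).fit(X_tr, y_tr)
    err = 1.0 - net.score(X_val, y_val)          # validation error drives the choice
    if err < best_err:
        best_net, best_err = net, err

# The test set is touched once, after the choice, so its error estimate is unbiased.
print("test error:", 1.0 - best_net.score(X_te, y_te))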

And there is no book in the NN literature more authoritative than Ripley (1996), from which the following definitions are taken (p. 354):

Training set:
A set of examples used for learning, that is, to fit the parameters [i.e., weights] of the classifier.

Validation set:
A set of examples used to tune the parameters [i.e., architecture, not weights] of a classifier, for example to choose the number of hidden units in a neural network.

Test set:
A set of examples used only to assess the performance [generalization] of a fully-specified classifier.

The literature on machine learning often reverses the meaning of "validation" and "test" sets. This is the most blatant example of the terminological confusion that pervades artificial intelligence research.

The crucial point is that a test set, by the standard definition in the NN literature, is never used to choose among two or more networks, so that the error on the test set provides an unbiased estimate of the generalization error (assuming that the test set is representative of the population, etc.). Any data set that is used to choose the best of two or more networks is, by definition, a validation set, and the error of the chosen network on the validation set is optimistically biased.


There is a problem with the usual distinction between training and validation sets. Some training approaches, such as early stopping, require a validation set, so in a sense, the validation set is used for training. Other approaches, such as maximum likelihood, do not inherently require a validation set. So the "training" set for maximum likelihood might encompass both the "training" and "validation" sets for early stopping. Greg Heath has suggested the term "design" set be used for cases that are used solely to adjust the weights in a network, while "training" set be used to encompass both design and validation sets. There is considerable merit to this suggestion, but it has not yet been widely adopted.
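
To illustrate how early stopping uses a validation set during training, here is a minimal sketch of my own, under the same scikit-learn assumption as above (a full implementation would also snapshot and restore the weights from the best epoch):

import numpy as np
from sklearn.neural_network import MLPClassifier

def train_early_stopping(X_design, y_design, X_val, y_val,
                         n_hidden, patience=10, max_epochs=200):
    """Fit on the design set; stop when validation error stops improving."""
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,), random_state=0)
    classes = np.unique(y_design)
    best_err, stale = np.inf, 0
    for epoch in range(max_epochs):
        net.partial_fit(X_design, y_design, classes=classes)  # one training pass
        err = 1.0 - net.score(X_val, y_val)                   # validation check
        if err < best_err:
            best_err, stale = err, 0
        else:
            stale += 1
        if stale >= patience:      # no improvement for `patience` epochs: stop
            break
    return net, best_err

In this sense the validation set "trains" the network too: it chooses among the networks produced at each epoch, which is exactly why the "design"/"training" distinction has merit.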


But things can get more complicated. Suppose you want to train nets with 5, 10, and 20 hidden units using maximum likelihood, and you want to train nets with 20 and 50 hidden units using early stopping. You also want to use a validation set to choose the best of these various networks. Should you use the same validation set for early stopping that you use for the final network choice, or should you use two separate validation sets? That is, you could divide the sample into 3 subsets, say A, B, C, and proceed as follows (a code sketch follows the list):


Do maximum likelihood using A.
Do early stopping with A to adjust the weights and B to decide when to stop (this makes B a validation set).
Choose among all 3 nets trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on B (the validation set).
Estimate the generalization error of the chosen network using C (the test set).
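
In code, the first procedure might look like this; it continues the sketches above (reusing train_early_stopping and the synthetic X, y), and the subset sizes are again illustrative:

from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Partition the sample into A (training), B (validation), C (test).
X_A, X_rest, y_A, y_rest = train_test_split(X, y, test_size=2/3, random_state=0)
X_B, X_C, y_B, y_C = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

candidates = []
for h in (5, 10, 20):   # maximum likelihood (plain training) on A
    candidates.append(MLPClassifier(hidden_layer_sizes=(h,), max_iter=1000,
                                    random_state=0).fit(X_A, y_A))
for h in (20, 50):      # early stopping: A adjusts the weights, B decides when to stop
    net, _ = train_early_stopping(X_A, y_A, X_B, y_B, n_hidden=h)
    candidates.append(net)

# Choose among all five nets on B (the validation set) ...
best = min(candidates, key=lambda n: 1.0 - n.score(X_B, y_B))
# ... and estimate generalization error once on C (the test set).
print("test error:", 1.0 - best.score(X_C, y_C))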

Or you could divide the sample into 4 subsets, say A, B, C, and D, and proceed as follows:
Do maximum likelihood using A and B combined.
Do early stopping with A to adjust the weights and B to decide when to stop (this makes B a validation set with respect to early stopping).
Choose among all 3 nets trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on C (this makes C a second validation set).
Estimate the generalization error of the chosen network using D (the test set).

Or, with the same 4 subsets, you could take a third approach:
Do maximum likelihood using A.
Choose among the 3 nets trained by maximum likelihood based on the error computed on B (the first validation set).
Do early stopping with A to adjust the weights and B (the first validation set) to decide when to stop.
Choose among the best net trained by maximum likelihood and the 2 nets trained by early stopping based on the error computed on C (the second validation set).
Estimate the generalization error of the chosen network using D (the test set).

You could argue that the first approach is biased towards choosing a net trained by early stopping. Early stopping involves a choice among a potentially large number of networks, and therefore provides more opportunity for overfitting the validation set than does the choice among only 3 networks trained by maximum likelihood. Hence if you make the final choice of networks using the same validation set (B) that was used for early stopping, you give an unfair advantage to early stopping. If you are writing an article to compare various training methods, this bias could be a serious flaw. But if you are using NNs for some practical application, this bias might not matter at all, since you obtain an honest estimate of generalization error using C.

You could also argue that the second and third approaches are too wasteful in their use of data. This objection could be important if your sample contains 100 cases, but will probably be of little concern if your sample contains 100,000,000 cases. For small samples, there are other methods that make more efficient use of data; see "What are cross-validation and bootstrapping?"
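
For completeness, k-fold cross-validation (the more data-efficient alternative the FAQ points to) can be sketched in a single scikit-learn call; the data and model here are again illustrative assumptions, not part of the FAQ:

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.RandomState(0)
X = rng.randn(100, 10)                   # a deliberately small sample
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# 5-fold cross-validation: each case is used for both fitting and
# evaluation (in different folds), so small samples go further.
net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=1000, random_state=0)
scores = cross_val_score(net, X, y, cv=5)
print("estimated accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))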


References:

Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford: Oxford University Press.

Ripley, B.D. (1996), Pattern Recognition and Neural Networks, Cambridge: Cambridge University Press.

Tsypkin, Y. (1971), Adaptation and Learning in Automatic Systems, NY: Academic Press.


------------------------------------------------------------------------



[ Quoting xinjian's post: ]
: They are not the same thing.
: I have seen systems that require three data sets: a training set, a validation set, and a test set.
: I cannot tell them apart clearly either.
: My feeling is that the training set is for fitting the parameters of the learning process,
: while the validation set seems to me a simple test. But whether it plays a selection role (picking among several learning algorithms..
:   or acts as a second round of training, I am not sure; perhaps that is left up to the user.
: The test set, then, is the final judgment.
: [ Quoting Karpov (卡爾波夫)'s post: ]
: : Is it the validation set, and is it the same thing as the test set?


--
※ Source: SMTH BBS smth.org [FROM: 61.149.31.217]
--
※ Source: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 211.68.16.32]
--
※ Forwarded: Nanjing University Lily BBS bbs.nju.edu.cn [FROM: 211.80.38.17]
