	Rule 4: (296, lift 1.1)
		on thyroxine = t
		FTI > 65
		->  class negative  [0.997]
	
	Rule 5: (240, lift 1.1)
		TT4 > 153
		->  class negative  [0.996]
	
	Rule 6: (29, lift 1.1)
		thyroid surgery = t
		FTI > 65
		->  class negative  [0.968]
	
	Rule 7: (31, lift 42.7)
		thyroid surgery = f
		TSH > 6
		TT4 <= 37
		->  class primary  [0.970]

The rules are divided into four bands of roughly equal sizes and a further summary is generated for both training and test cases. Here is the output for test cases: 
	Evaluation on test data (1000 cases):
	
		        Rules     
		  ----------------
		    No      Errors
	
		     7    5( 0.5%)   <<
	
	
		   (a)   (b)   (c)   (d)    <-classified as
		  ----  ----  ----  ----
		    32                      (a): class primary
		     1    39                (b): class compensated
		                            (c): class secondary
		     1     3         924    (d): class negative
	
	Rule utility summary:
	
		Rules	      Errors
		-----	      ------
		1-2	   56( 5.6%)
		1-4	   10( 1.0%)
		1-5	    6( 0.6%)

This shows that, when only the first two rules are used, the error rate on the test cases is 5.6%, dropping to 1.0% when the first four rules are used, and so on. The performance of the entire ruleset is not repeated since it is shown above the utility summary. 
Rule utility orderings are not given for cross-validations (see below). 
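
To make the cumulative figures concrete, the sketch below recomputes them outside See5 under a simplified first-match reading of an ordered ruleset. The rules and cases structures are hypothetical, and See5's own classification procedure weighs the confidences of all rules that fire rather than stopping at the first match, so treat this only as an illustration of how the utility bands are evaluated:

	# Simplified first-match reading of an ordered ruleset (hypothetical
	# structures; See5 itself weighs the confidences of all rules that fire).
	def rule_fires(rule, case):
	    # a rule holds a list of (attribute, predicate) conditions
	    return all(pred(case[attr]) for attr, pred in rule["conditions"])
	
	def cumulative_errors(rules, cases, default_class):
	    # error count when only the first k rules of the ordering are used
	    summary = []
	    for k in range(1, len(rules) + 1):
	        errors = 0
	        for case in cases:
	            prediction = default_class
	            for rule in rules[:k]:
	                if rule_fires(rule, case):
	                    prediction = rule["class"]
	                    break
	            if prediction != case["class"]:
	                errors += 1
	        summary.append((k, errors))
	    return summary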


Boosting
Another innovation incorporated in See5 is adaptive boosting, based on the work of Rob Schapire and Yoav Freund. The idea is to generate several classifiers (either decision trees or rulesets) rather than just one. When a new case is to be classified, each classifier votes for its predicted class and the votes are counted to determine the final class. 

But how can we generate several classifiers from a single dataset? As the first step, a single decision tree or ruleset is constructed as before from the training data (e.g. hypothyroid.data). This classifier will usually make mistakes on some cases in the data; the first decision tree, for instance, gives the wrong class for 7 cases in hypothyroid.data. When the second classifier is constructed, more attention is paid to these cases in an attempt to get them right. As a consequence, the second classifier will generally be different from the first. It also will make errors on some cases, and these become the focus of attention during construction of the third classifier. This process continues for a pre-determined number of iterations or trials, but stops if the most recent classifier is either extremely accurate or inaccurate. 
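
The reweighting idea can be sketched in a few lines. This is an illustration of adaptive boosting in the general AdaBoost style of Freund and Schapire, not See5's exact procedure; train_classifier is a hypothetical routine that builds a classifier from cases with the given weights:

	# Illustration of adaptive boosting in the AdaBoost style (not See5's
	# exact procedure); train_classifier is a hypothetical routine that
	# builds a classifier from cases with the given weights.
	import math
	
	def boost(train_classifier, cases, trials):
	    n = len(cases)
	    weights = [1.0 / n] * n
	    committee = []                          # (classifier, vote weight)
	    for _ in range(trials):
	        clf = train_classifier(cases, weights)
	        wrong = {i for i, c in enumerate(cases) if clf(c) != c["class"]}
	        err = sum(weights[i] for i in wrong)
	        if err == 0 or err >= 0.5:          # stop: too accurate or too inaccurate
	            break
	        alpha = 0.5 * math.log((1 - err) / err)
	        committee.append((clf, alpha))
	        # boost the weight of misclassified cases, shrink the rest
	        weights = [w * math.exp(alpha if i in wrong else -alpha)
	                   for i, w in enumerate(weights)]
	        total = sum(weights)
	        weights = [w / total for w in weights]
	    return committee
	
	def vote(committee, case):
	    # each classifier votes for its predicted class with its vote weight
	    tally = {}
	    for clf, alpha in committee:
	        predicted = clf(case)
	        tally[predicted] = tally.get(predicted, 0.0) + alpha
	    return max(tally, key=tally.get)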

The Boost option with x trials instructs See5 to construct up to x classifiers in this manner. Naturally, constructing multiple classifiers requires more computation than building a single classifier -- but the effort can pay dividends! Trials over numerous datasets, large and small, show that on average 10-classifier boosting reduces the error rate for test cases by about 25%. 

Selecting the Boost option with 10 trials causes ten decision trees to be generated. The summary of the trees' individual and aggregated performance on the 1000 test cases is: 

	Trial	    Decision Tree   
	-----	  ----------------  
		  Size      Errors  
	
	   0	    14    4( 0.4%)
	   1	     7   52( 5.2%)
	   2	    11    9( 0.9%)
	   3	    15   21( 2.1%)
	   4	     7   12( 1.2%)
	   5	    10    7( 0.7%)
	   6	     8    8( 0.8%)
	   7	    13   13( 1.3%)
	   8	    12   12( 1.2%)
	   9	    16   54( 5.4%)
	boost	          2( 0.2%)   <<

(Again, different hardware can lead to slightly different results.) The performance of the classifier constructed at each trial is summarized on a separate line, while the line labeled boost shows the result of voting all the classifiers. 

The decision tree constructed on Trial 0 is identical to that produced without the Boost option. Some of the subsequent trees produced by paying more attention to certain cases have relatively high overall error rates. Nevertheless, when the trees are combined by voting, the final predictions have a lower error rate of 0.2% on the test cases. 


Winnowing attributes
The decision trees and rulesets constructed by See5 do not generally use all of the attributes. The hypothyroid application has 22 predictive attributes (plus a class and a label attribute) but only six of them appear in the tree and the ruleset. This ability to pick and choose among the predictors is an important advantage of tree-based modeling techniques. 

Some applications, however, have an abundance of attributes! For instance, one approach to text classification describes each passage by the words that appear in it, so there is a separate attribute for each different word in a restricted dictionary. 

When there are numerous alternatives for each test in the tree or ruleset, it becomes likely that at least one of them will appear, purely by chance, to provide valuable predictive information even when it does not. In applications like these it can be useful to pre-select a subset of the attributes that will be used to construct the decision tree or ruleset. The See5 mechanism to do this is called "winnowing" by analogy with the process for separating wheat from chaff (or, here, useful attributes from unhelpful ones). 
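
As a rough analogue of such pre-selection, the sketch below filters discrete attributes by information gain on the training cases. See5's actual winnowing is a different procedure -- it estimates each attribute's contribution to the classifier directly -- so this is only meant to convey the idea of discarding apparently unhelpful attributes before tree construction:

	# Rough analogue of attribute pre-selection: keep only discrete
	# attributes whose information gain on the training cases exceeds a
	# small threshold (See5's actual winnowing is a different procedure).
	import math
	from collections import Counter
	
	def entropy(labels):
	    total = len(labels)
	    return -sum((n / total) * math.log2(n / total)
	                for n in Counter(labels).values())
	
	def information_gain(cases, attr):
	    base = entropy([c["class"] for c in cases])
	    remainder = 0.0
	    for value in {c[attr] for c in cases}:
	        subset = [c for c in cases if c[attr] == value]
	        remainder += len(subset) / len(cases) * entropy(
	            [c["class"] for c in subset])
	    return base - remainder
	
	def winnow(cases, attributes, threshold=0.01):
	    # discard attributes whose apparent usefulness is below the threshold
	    return [a for a in attributes
	            if information_gain(cases, a) >= threshold]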

Winnowing is not obviously relevant for the hypothyroid application since there are relatively few attributes. To illustrate the idea, however, here are the results when the Winnowing option is invoked: 

	See5 [Release 1.20a]	Wed Sep  1 11:02:48 2004
	
	    Options:
		Winnow attributes
	
	Class specified by attribute `diagnosis'
	
	Read 2772 cases (24 attributes) from hypothyroid.data
	
	Attributes winnowed:
	    age
	    sex
	    query on thyroxine
	    on antithyroid medication
	    sick
	    pregnant
	    I131 treatment
	    query hypothyroid
	    query hyperthyroid
	    lithium
	    tumor
	    goitre
	    hypopituitary
	    psych
	    T3
	    T4U
	    referral source
	
	Decision tree:
	
	TSH <= 6: negative (2472/2)
	TSH > 6:
	:...FTI <= 65: primary (72.4/13.9)
	    FTI > 65:
	    :...on thyroxine = t: negative (37.7)
	        on thyroxine = f:
	        :...thyroid surgery = t: negative (6.8)
	            thyroid surgery = f:
	            :...TT4 > 153: negative (6/0.1)
	                TT4 <= 153:
	                :...TT4 > 62: compensated (170.1/24.3)
	                    TT4 <= 62:
	                    :...TT4 <= 37: primary (2.5/0.2)
	                        TT4 > 37: compensated (4.5/0.4)
	
	
	Evaluation on training data (2772 cases):
	
		    Decision Tree   
		  ----------------  
		  Size      Errors  
	
		     8   12( 0.4%)   <<
	
	
		   (a)   (b)   (c)   (d)    <-classified as
		  ----  ----  ----  ----
		    60     3                (a): class primary
		     1   153                (b): class compensated
		                       2    (c): class secondary
		     5     1        2547    (d): class negative
	
	
	Evaluation on test data (1000 cases):
	
		    Decision Tree   
		  ----------------  
		  Size      Errors  
	
		     8    4( 0.4%)   <<
	
	
		   (a)   (b)   (c)   (d)    <-classified as
		  ----  ----  ----  ----
		    32                      (a): class primary
		     1    39                (b): class compensated
		                            (c): class secondary
		     1     2         925    (d): class negative
	
	
	Time: 0.0 secs

After analyzing the training cases, See5 winnows (discards) 17 of the 22 predictive attributes before the decision tree is built. Although it is smaller than the original tree, the new decision tree constructed from only five attributes is just as accurate on the test cases. 
Since winnowing the attributes can be a time-consuming process, it is recommended primarily for large applications (10,000 cases or more) where there is reason to suspect that many of the attributes have at best marginal relevance to the classification task. 


Softening thresholds
The top of our initial decision tree tests whether the value of the attribute TSH is less than or equal to, or greater than, 6. If the former holds, we go no further and predict that the case's class is negative, while if it does not we look at other information before making a decision. Thresholds like this are sharp by default, so that a case with a hypothetical value of 5.99 for TSH is treated quite differently from one with a value of 6.01. 

For some domains, this sudden change is quite appropriate -- for instance, there are hard-and-fast cutoffs for bands of the income tax table. For other applications, though, it is more reasonable to expect classification decisions to change more slowly with changes in attribute values. 

See5 contains an option to `soften' thresholds such as 6 above. When this is invoked, each threshold is broken into three ranges -- let us denote them by a lower bound lb, an upper bound ub, and a central value t. If the attribute value in question is below lb or above ub, classification is carried out using the single branch corresponding to the `<=' or '>' result respectively. If the value lies between lb and ub, both branches of the tree are investigated and the results combined probabilistically. The values of lb and ub are determined by See5 based on an analysis of the apparent sensitivity of classification to small changes in the threshold. They need not be symmetric -- a fuzzy threshold can be sharper on one side than on the other. 
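
The combination step can be sketched as follows, assuming a linear blend between the two branches across the range from lb to ub. The blending function is an assumption made for this sketch; See5 states only that the results of the two branches are combined probabilistically:

	# Classification with a softened threshold: outside [lb, ub] only one
	# branch is used; inside, the two branches' class probabilities are
	# blended (the linear ramp is an assumption made for this sketch).
	def soft_classify(value, lb, ub, classify_le, classify_gt):
	    # classify_le / classify_gt return {class: probability} for the
	    # `<=' and `>' subtrees respectively
	    if value <= lb:
	        return classify_le(value)
	    if value >= ub:
	        return classify_gt(value)
	    w = (value - lb) / (ub - lb)            # weight given to the `>' branch
	    left, right = classify_le(value), classify_gt(value)
	    return {c: (1 - w) * left.get(c, 0.0) + w * right.get(c, 0.0)
	            for c in set(left) | set(right)}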

Invoking the Fuzzy thresholds option gives the following decision tree: 

	TSH <= 6 (6.05): negative (2472/2)
	TSH >= 6.1 (6.05):
	:...FTI <= 64 (65.35):
	    :...thyroid surgery = t:
	    :   :...FTI <= 24 (38.25): negative (2.1)
	    :   :   FTI >= 52.5 (38.25): primary (2.1/0.1)
	    :   thyroid surgery = f:
	    :   :...TT4 <= 60 (61.5): primary (51/3.7)
	    :       TT4 >= 63 (61.5):
	    :       :...referral source in {WEST,SVHD}: primary (0)
	    :           referral source = STMW: primary (0.1)
	    :           referral source = SVHC: primary (1)
	    :           referral source = SVI: primary (3.8/0.8)
	    :           referral source = other:
	    :           :...TSH <= 19 (22.5): negative (6.4/2.7)
	    :               TSH >= 26 (22.5): primary (5.8/0.8)
	    FTI >= 65.7 (65.35):
	    :...on thyroxine = t: negative (37.7)
	        on thyroxine = f:
	        :...thyroid surgery = t: negative (6.8)
	            thyroid surgery = f:
	            :...TT4 >= 158 (153): negative (6/0.1)
	                TT4 <= 148 (153):
	                :...TT4 <= 31 (37.5): primary (2.5/0.2)
	                    TT4 >= 44 (37.5): compensated (174.6/24.8)

Each threshold is now of the form <= lb (t) or >= ub (t). In this example, most of the thresholds are still relatively tight, but notice the asymmetric threshold values for the test FTI <= 64. Soft thresholds slightly improve the classifier's accuracy on both training and test data. 
A final point: soft thresholds affect only decision tree classifiers -- they do not change the interpretation of rulesets. 


Advanced pruning options
Three further options enable aspects of the classifier-generation process to be tweaked. These are best regarded as advanced options that should be used sparingly (if at all), so that this section can be skipped without much loss. 

See5 constructs decision trees in two phases. A large tree is first grown to fit the data closely and is then `pruned' by removing parts that are predicted to have a relatively high error rate. This pruning process is first applied to every subtree to decide whether it should be replaced by a leaf or sub-branch, and then a global stage looks at the performance of the tree as a whole. 

Turning off the default Global pruning option disables this second pruning component and generally results in larger decision trees and rulesets. For the hypothyroid application, the tree increases in size from 14 to 15 leaves. 

The Pruning CF option affects the way that error rates are estimated and hence the severity of pruning; values smaller than the default (25%) cause more of the initial tree to be pruned, while larger values result in less pruning. 
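
The estimate behind this option is a pessimistic one: for a leaf covering n training cases of which e are misclassified, the predicted error rate is taken as the upper confidence limit of the binomial distribution at the chosen CF, so a smaller CF yields a larger estimate and heavier pruning. A minimal sketch of that calculation (See5's implementation uses its own approximations):

	# Pessimistic error estimate behind the Pruning CF option: for a leaf
	# covering n cases with e misclassified, take the upper binomial
	# confidence limit at the chosen CF (See5 uses its own approximations).
	from math import comb
	
	def binomial_cdf(e, n, p):
	    # probability of observing e or fewer errors in n cases
	    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(e + 1))
	
	def upper_error_limit(e, n, cf=0.25):
	    # find p with binomial_cdf(e, n, p) == cf, by bisection
	    lo, hi = 0.0, 1.0
	    for _ in range(50):
	        mid = (lo + hi) / 2
	        if binomial_cdf(e, n, mid) > cf:
	            lo = mid
	        else:
	            hi = mid
	    return (lo + hi) / 2
	
	# A smaller CF is more pessimistic and so prunes more heavily:
	# upper_error_limit(0, 6, 0.25) is about 0.21, versus about 0.05 at CF 0.75.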

The Minimum cases option constrains the degree to which the initial tree can fit the data. At each branch point in the decision tree, the stated minimum number of training cases must follow at least two of the branches. Values higher than the default (2 cases) can lead to an initial tree that fits the training data only approximately -- a form of pre-pruning. (This option is complicated by the presence of missing attribute values and by the use of differential misclassification costs, discussed below. Both cause adjustments to the apparent number of cases following a branch.) 
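
The constraint itself can be expressed directly; a minimal sketch, in which the treatment of fractional case weights is an assumption based on the note above:

	# Minimum-cases constraint on a candidate split: at least two branches
	# must receive the stated minimum weight of training cases (fractional
	# weights arise from missing values and misclassification costs).
	def split_allowed(branch_case_weights, min_cases=2):
	    return sum(w >= min_cases for w in branch_case_weights) >= 2
	
	# split_allowed([1.5, 0.5, 30.0]) is False with the default of 2 cases;
	# split_allowed([2.0, 30.0]) is True.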


Sampling from large datasets
Even though See5 is relatively fast, building classifiers from large numbers of cases can take an inconveniently long time, especially when options such as boosting are employed. See5 incorporates a facility to extract a random sample from a dataset, construct a classifier from the sample, and then test the classifier on a disjoint collection of cases. By using a smaller set of training cases in this way, the process of generating a classifier is expedited, but at the cost of a possible reduction in the classifier's predictive performance. 

The Sample option with x% has two consequences. Firstly, a random sample containing x% of the cases in the application's data file is used to construct the classifier. Secondly, the classifier is evaluated on a disjoint set of test cases, consisting either of another sample of the same size as the training set (if x is less than 50%) or of all cases that were not used for training (if x is 50% or more). 
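
The resulting split can be sketched with a hypothetical helper (See5 performs the sampling internally, including its handling of the random seed):

	# Sketch of the Sample option's split using a hypothetical helper
	# (See5 performs the sampling internally, including seed handling).
	import random
	
	def sample_split(cases, pct, seed=None):
	    rng = random.Random(seed)
	    shuffled = list(cases)
	    rng.shuffle(shuffled)
	    n_train = round(len(shuffled) * pct / 100.0)
	    train = shuffled[:n_train]
	    if pct < 50:
	        # disjoint test sample of the same size as the training set
	        test = shuffled[n_train:2 * n_train]
	    else:
	        # all cases not used for training
	        test = shuffled[n_train:]
	    return train, test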
