C4.5 Documentation Notes (c4.5文檔說明.txt)
Don't forget the commas between values! If you leave them out, See5 will not be able to process your data. 
Notice that `?' is used to denote a value that is missing or unknown. Similarly, `N/A' denotes a value that is not applicable for a particular case. Also note that the cases do not contain values for the attribute FTI since its values are computed from other attribute values. 
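
For example, a line of a data file for a hypothetical application with four attributes followed by the class (the values are invented for illustration, not taken from hypothyroid.data) might read

	35,F,?,N/A,negative

meaning: a 35-year-old female case whose third attribute value is unknown, whose fourth attribute does not apply to this case, and whose class is negative.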

Test and cases files (optional)
Of course, the value of predictive patterns lies in their ability to make accurate predictions! It is difficult to judge the accuracy of a classifier by measuring how well it does on the cases used in its construction; the performance of the classifier on new cases is much more informative. (For instance, any number of gurus tell us about patterns that `explain' the rise/fall behavior of the stock market in the past. Even though these patterns may appear plausible, they are only valuable to the extent that they make useful predictions about future rises and falls.) 
The third kind of file used by See5 consists of new test cases (e.g. hypothyroid.test) on which the classifier can be evaluated. This file is optional and, if used, has exactly the same format as the data file. 

Another optional file, the cases file (e.g. hypothyroid.cases), differs from a test file only in allowing the cases' classes to be unknown (`?'). The cases file is used primarily with the cross-referencing procedure and public source code, both of which are described later on. 

Costs file (optional)
The last kind of file, the costs file (e.g. hypothyroid.costs), is also optional and sets out differential misclassification costs. In some applications there is a much higher penalty for certain types of mistakes. In this application, a prediction that hypothyroidism is not present could be very costly if in fact it is. On the other hand, predicting incorrectly that a patient is hypothyroid may be a less serious error. See5 allows different misclassification costs to be associated with each combination of real class and predicted class. We will return to this topic near the end of the tutorial. 
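
See5 reads these costs from the costs file itself (its exact syntax is not covered in this excerpt), but the effect is easy to illustrate. The Python sketch below uses an invented two-class cost matrix to show how differential costs can flip a prediction:

	# Illustrative only: an invented cost matrix for the two kinds of
	# error discussed above. Keys are (true class, predicted class).
	costs = {
	    ("hypothyroid", "negative"): 10.0,  # missing the disease: expensive
	    ("negative", "hypothyroid"): 1.0,   # false alarm: much cheaper
	}

	def expected_cost(predicted, class_probs):
	    """Expected cost of predicting `predicted`, given class probabilities."""
	    return sum(p * costs.get((true, predicted), 0.0)
	               for true, p in class_probs.items())

	# With P(hypothyroid) = 0.2 a plain classifier would say "negative",
	# but its expected cost (0.2 * 10 = 2.0) exceeds that of predicting
	# "hypothyroid" (0.8 * 1 = 0.8), so the cost-sensitive choice flips.
	probs = {"hypothyroid": 0.2, "negative": 0.8}
	print(min(("negative", "hypothyroid"),
	          key=lambda c: expected_cost(c, probs)))   # -> hypothyroid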

User Interface
It is difficult to see what is going on in an interface without actually using it. As a simple illustration, here is the main window of See5 after the hypothyroid application has been selected. 

[Screenshot: the See5 main window with the hypothyroid application selected]

The main window of See5 has six buttons on its toolbar. From left to right, they are 

Locate Data 
invokes a browser to find the files for your application, or to change the current application; 
Construct Classifier 
selects the type of classifier to be constructed and sets other options; 
Stop 
interrupts the classifier-generating process; 
Review Output 
re-displays the output from the last classifier construction (if any); 
Use Classifier 
interactively applies the current classifier to one or more cases; and 
Cross-Reference 
shows how cases in training or test data relate to (parts of) a classifier and vice versa. 
These functions can also be initiated from the File menu. 
The Edit menu facilitates changes to the names and costs files after an application's files have been located. On-line help is available through the Help menu.

Constructing Classifiers
Once the names, data, and optional files have been set up, everything is ready to use See5. 
The first step is to locate the data using the Locate Data button on the toolbar (or the corresponding selection from the File menu). We will assume that the hypothyroid data above has been located in this manner.

There are several options that affect the type of classifier that See5 produces and the way that it is constructed. The Construct Classifier button on the toolbar (or selection from the File menu) displays a dialog box that sets out these classifier construction options: 

[Screenshot: the classifier construction options dialog box]

Many of the options have default values that should be satisfactory for most applications. 

Decision trees
When See5 is invoked with the default values of all options, it constructs a decision tree and generates output like this: 
	See5 [Release 1.20a]	Wed Sep  1 11:01:05 2004

	Class specified by attribute `diagnosis'
	
	Read 2772 cases (24 attributes) from hypothyroid.data
	
	Decision tree:
	
	TSH <= 6: negative (2472/2)
	TSH > 6:
	:...FTI <= 65:
	    :...thyroid surgery = t:
	    :   :...FTI <= 36.1: negative (2.1)
	    :   :   FTI > 36.1: primary (2.1/0.1)
	    :   thyroid surgery = f:
	    :   :...TT4 <= 61: primary (51/3.7)
	    :       TT4 > 61:
	    :       :...referral source in {WEST,SVHD}: primary (0)
	    :           referral source = STMW: primary (0.1)
	    :           referral source = SVHC: primary (1)
	    :           referral source = SVI: primary (3.8/0.8)
	    :           referral source = other:
	    :           :...TSH <= 22: negative (6.4/2.7)
	    :               TSH > 22: primary (5.8/0.8)
	    FTI > 65:
	    :...on thyroxine = t: negative (37.7)
	        on thyroxine = f:
	        :...thyroid surgery = t: negative (6.8)
	            thyroid surgery = f:
	            :...TT4 > 153: negative (6/0.1)
	                TT4 <= 153:
	                :...TT4 <= 37: primary (2.5/0.2)
	                    TT4 > 37: compensated (174.6/24.8)
	
	
	Evaluation on training data (2772 cases):
	
		    Decision Tree   
		  ----------------  
		  Size      Errors  
	
		    14    7( 0.3%)   <<
	
	
		   (a)   (b)   (c)   (d)    <-classified as
		  ----  ----  ----  ----
		    60     3                (a): class primary
		         153           1    (b): class compensated
		                       2    (c): class secondary
		           1        2552    (d): class negative
	
	
	Evaluation on test data (1000 cases):
	
		    Decision Tree   
		  ----------------  
		  Size      Errors  
	
		    14    4( 0.4%)   <<
	
	
		   (a)   (b)   (c)   (d)    <-classified as
		  ----  ----  ----  ----
		    31                 1    (a): class primary
		     1    39                (b): class compensated
		                            (c): class secondary
		           2         926    (d): class negative
	
	
	Time: 0.0 secs

(Since hardware platforms can differ in floating point precision and rounding, the output that you see might not be exactly the same as the above.) 
The first line identifies the version of See5 and the run date. See5 constructs a decision tree from the 2772 training cases in the file hypothyroid.data, and this appears next. Although it may not look much like a tree, this output can be paraphrased as: 


	if TSH is less than or equal to 6 then negative
	else
	if TSH is greater than 6 then
	    if FTI is less than or equal to 65 then
		if thyroid surgery equals t then
		    if FTI is less than or equal to 36.1 then negative
		    else
		    if FTI is greater than 36.1 then primary
		else
		if thyroid surgery equals f then
		    if TT4 is less than or equal to 61 then primary
		    else
		    if TT4 is greater than 61 then
		    . . . .

and so on. 
The tree employs a case's attribute values to map it to a leaf designating one of the classes. Every leaf of the tree is followed by a cryptic (n) or (n/m). For instance, the last leaf of the decision tree is compensated (174.6/24.8), for which n is 174.6 and m is 24.8. The value of n is the number of cases in the file hypothyroid.data that are mapped to this leaf, and m (if it appears) is the number of them that are classified incorrectly by the leaf. (A non-integral number of cases can arise because, when the value of an attribute in the tree is not known, See5 splits the case and sends a fraction down each branch.) 

The last section of the See5 output concerns the evaluation of the decision tree, first on the cases in hypothyroid.data from which it was constructed, and then on the new cases in hypothyroid.test. The size of the tree is its number of leaves and the column headed Errors shows the number and percentage of cases misclassified. The tree, with 14 leaves, misclassifies 7 of the 2772 given cases, an error rate of 0.3%. (This might seem inconsistent with the errors recorded at the leaves -- the leaf mentioned above shows 24.8 errors! The discrepancy arises because parts of a case split as a result of unknown attribute values can be misclassified and yet, when the votes from all the parts are aggregated, the correct class can still be chosen.) 
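
To make this splitting-and-voting mechanism concrete, here is a minimal Python sketch of the idea (the tree encoding and the branch fractions are invented for illustration; this is not See5's actual implementation):

	# A leaf is {"class": ...}; an internal node maps each attribute value
	# to (subtree, fraction), where fraction is the share of training
	# cases that followed that branch.
	def classify(node, case, weight=1.0, votes=None):
	    votes = {} if votes is None else votes
	    if "class" in node:                  # reached a leaf
	        votes[node["class"]] = votes.get(node["class"], 0.0) + weight
	        return votes
	    value = case.get(node["attr"])       # None stands for `?'
	    if value in node["branches"]:
	        classify(node["branches"][value][0], case, weight, votes)
	    else:                                # unknown value: split the case
	        for subtree, fraction in node["branches"].values():
	            classify(subtree, case, weight * fraction, votes)
	    return votes

	# A case whose tested value is unknown is split 0.9/0.1; the fragments
	# reach different leaves, but the aggregated vote picks one class.
	tree = {"attr": "on thyroxine",
	        "branches": {"t": ({"class": "negative"}, 0.9),
	                     "f": ({"class": "compensated"}, 0.1)}}
	votes = classify(tree, {})               # `on thyroxine' is unknown
	print(votes, max(votes, key=votes.get))  # negative wins, 0.9 to 0.1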

If the number of classes is twenty or fewer, performance on the training cases is further analyzed in a confusion matrix that pinpoints the kinds of errors made. In this example, the decision tree misclassifies 

three of the primary cases as compensated, 
one of the compensated cases as negative, 
both secondary cases as negative, and 
one negative case as compensated. 
A similar report of performance is given for the optional test cases. A very simple majority classifier predicts that every new case belongs to the most common class in the training data. In this example, 2553 of the 2772 training cases belong to class negative so that a majority classifier would always opt for negative. The 1000 test cases from file hypothyroid.test include 928 belonging to class negative, so a simple majority classifier would have an error rate of 7.2%. The decision tree has a lower error rate of 0.4% on the new cases, but notice that this is higher than its error rate on the training cases. If there are not more than twenty classes, the confusion matrix for the test cases again shows the detailed breakdown of correct and incorrect classifications. 
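
The baseline arithmetic is easy to verify (the counts below are the ones quoted above):

	# Majority classifier: always predict "negative", the most common
	# training class (2553 of the 2772 training cases).
	test_total, test_negative = 1000, 928
	errors = test_total - test_negative     # 72 misclassified test cases
	print(f"{errors / test_total:.1%}")     # -> 7.2%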


Discrete value subsets
By default, a test on a discrete attribute has a separate branch for each of its values present in the data. Tests with a high fan-out can have the undesirable side-effect of fragmenting the data during construction of the decision tree. See5 has a Subset option that can mitigate this fragmentation to some extent: attribute values are grouped into subsets and each subtree is associated with a subset rather than with a single value. 

In the hypothyroid example, invoking this option merely simplifies part of the tree as 


	referral source in {WEST,STMW,SVHC,SVI,SVHD}: primary (4.9/0.8)

with no effect on classification performance on either the training or test data. 
Although it does not help much for this application, the Subset option is recommended when a dataset has important discrete attributes with more than four or five values. 


Rulesets
Decision trees can sometimes be quite difficult to understand. An important feature of See5 is its ability to generate classifiers called rulesets that consist of unordered collections of (relatively) simple if-then rules. 

The Rulesets option causes classifiers to be expressed as rulesets rather than decision trees, here giving the following rules: 


	Rule 1: (31, lift 42.7)
		thyroid surgery = f
		TSH > 6
		TT4 <= 37
		->  class primary  [0.970]
	
	Rule 2: (63/6, lift 39.3)
		TSH > 6
		FTI <= 65
		->  class primary  [0.892]
	
	Rule 3: (270/116, lift 10.3)
		TSH > 6
		->  class compensated  [0.570]
	
	Rule 4: (2225/2, lift 1.1)
		TSH <= 6
		->  class negative  [0.999]
	
	Rule 5: (296, lift 1.1)
		on thyroxine = t
		FTI > 65
		->  class negative  [0.997]
	
	Rule 6: (240, lift 1.1)
		TT4 > 153
		->  class negative  [0.996]
	
	Rule 7: (29, lift 1.1)
		thyroid surgery = t
		FTI > 65
		->  class negative  [0.968]
	
	Default class: negative

Each rule consists of: 

A rule number -- this is quite arbitrary and serves only to identify the rule. 
Statistics (n, lift x) or (n/m, lift x) that summarize the performance of the rule. As with a leaf, n is the number of training cases covered by the rule and m, if it appears, shows how many of them do not belong to the class predicted by the rule. The rule's accuracy is estimated by the Laplace ratio (n-m+1)/(n+2), and the lift x is this estimated accuracy divided by the relative frequency of the predicted class in the training set. (A worked example follows this list.) 
One or more conditions that must all be satisfied if the rule is to be applicable. 
A class predicted by the rule. 
A value between 0 and 1 that indicates the confidence with which this prediction is made. (Note: If boosting is used, this confidence is measured using an artificial weighting of the training cases and so does not reflect the accuracy of the rule.) 
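
To make these statistics concrete, here is the Laplace ratio and lift for Rule 2 above, checked in a short Python snippet (the primary-class count of 63 is read off the training confusion matrix, 60 + 3):

	# Rule 2: (63/6, lift 39.3) -> class primary [0.892]
	n, m = 63, 6                          # covered cases / misclassified
	laplace = (n - m + 1) / (n + 2)       # estimated accuracy
	print(round(laplace, 3))              # -> 0.892, as shown in the rule

	# Lift = estimated accuracy / relative frequency of predicted class.
	primary_freq = 63 / 2772              # 63 primary cases in training data
	print(round(laplace / primary_freq, 1))   # -> 39.3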
When a ruleset like this is used to classify a case, it may happen that several of the rules are applicable (that is, all their conditions are satisfied). If the applicable rules predict different classes, there is an implicit conflict that could be resolved in two ways: we could believe the rule with the highest confidence, or we could attempt to aggregate the rules' predictions to reach a verdict. See5 adopts the latter strategy -- each applicable rule votes for its predicted class with a voting weight equal to its confidence value, the votes are totted up, and the class with the highest total vote is chosen as the final prediction. There is also a default class, here negative, that is used when none of the rules apply. 
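
The voting scheme itself fits in a few lines of Python (a simplified illustration with rule conditions written as predicates, not See5's internal representation):

	def vote(rules, case, default):
	    """Each applicable rule votes for its class with weight equal to
	    its confidence; the class with the largest total wins."""
	    totals = {}
	    for conditions, cls, confidence in rules:
	        if all(cond(case) for cond in conditions):
	            totals[cls] = totals.get(cls, 0.0) + confidence
	    return max(totals, key=totals.get) if totals else default

	# Two of the hypothyroid rules, expressed as predicates:
	rules = [
	    ([lambda c: c["TSH"] > 6, lambda c: c["FTI"] <= 65], "primary", 0.892),
	    ([lambda c: c["TSH"] > 6], "compensated", 0.570),
	]
	# Both rules fire, but primary's vote (0.892) beats compensated's.
	print(vote(rules, {"TSH": 10, "FTI": 60}, "negative"))   # -> primary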
Rulesets are generally easier to understand than trees since each rule describes a specific context associated with a class. Furthermore, a ruleset generated from a tree usually has fewer rules than the tree has leaves, another plus for comprehensibility. (In this example, the first decision tree with 14 leaves is reduced to seven rules.) Finally, rules are often more accurate predictors than decision trees -- a point not illustrated here, since the ruleset has an error rate of 0.5% on the test cases. For very large datasets, however, generating rules with the Ruleset option can require considerably more computer time. 

In the example above, rules are ordered by class and sub-ordered by confidence. An alternative ordering by estimated contribution to predictive accuracy can be selected using the Sort by utility option. Under this option, the rule that most reduces the error rate appears first and the rule that contributes least appears last. Furthermore, results are reported in a selected number of bands so that the predictive accuracies of the more important subsets of rules are also estimated. For example, if the Sort by utility option with four bands is selected, the hypothyroid rules are reordered as 

	Rule 1: (2225/2, lift 1.1)
		TSH <= 6
		->  class negative  [0.999]
	
	Rule 2: (270/116, lift 10.3)
		TSH > 6
		->  class compensated  [0.570]
	
	Rule 3: (63/6, lift 39.3)
		TSH > 6
		FTI <= 65
		->  class primary  [0.892]
	

中文字幕一区不卡| 久久狠狠亚洲综合| 一区二区三区四区av| 成人欧美一区二区三区小说| 成人欧美一区二区三区视频网页| 国产精品视频线看| ...xxx性欧美| 亚洲一区二区高清| 天天综合色天天综合色h| 亚洲电影一级黄| 秋霞午夜av一区二区三区 | 国产精一品亚洲二区在线视频| 久久99久久久欧美国产| 国产一区二区伦理片| 国产成人超碰人人澡人人澡| 高清不卡在线观看| 色综合天天综合| 欧美精品三级日韩久久| 精品99999| 亚洲激情欧美激情| 日本在线不卡一区| 国产一二三精品| 99久久99久久免费精品蜜臀| 色视频成人在线观看免| 欧美精品一二三| 久久伊人蜜桃av一区二区| 国产精品电影一区二区三区| 亚洲一区在线观看免费观看电影高清| 日韩国产欧美三级| 国产精品一区二区黑丝| 91在线观看视频| 欧美一区二区三区视频免费播放| 久久久久久日产精品| 中文字幕亚洲一区二区va在线| 亚洲福利视频一区| 国产一区二区久久| 日本精品一区二区三区高清 | 3d成人动漫网站| 久久伊99综合婷婷久久伊| 亚洲人成人一区二区在线观看| 亚洲第一精品在线| 国产福利91精品一区二区三区| 在线视频观看一区| 久久久777精品电影网影网| 玉米视频成人免费看| 久久精品国内一区二区三区| 91老师国产黑色丝袜在线| 91精品国产色综合久久不卡蜜臀 | 亚洲精品免费一二三区| 免费国产亚洲视频| 99久久精品国产导航| 日韩一区二区电影网| 国产精品免费观看视频| 蜜臀av一区二区三区| 色域天天综合网| 久久久久久久网| 天堂蜜桃一区二区三区 | 久久久久久黄色| 亚洲成va人在线观看| 成人性生交大片免费看中文| 欧美一区二区三区四区五区 | 欧美日韩精品一区二区天天拍小说 | 国产亚洲欧美激情| 亚洲福利电影网| 不卡视频一二三四| www一区二区| 日韩极品在线观看| 在线观看区一区二| 国产精品进线69影院| 韩国av一区二区三区在线观看| 日本韩国一区二区| 中文字幕一区二区三区在线播放| 激情欧美日韩一区二区| 欧美顶级少妇做爰| 伊人开心综合网| 99久久精品国产导航| 国产亚洲欧美中文| 狠狠色伊人亚洲综合成人| 91精品国产一区二区人妖| 亚洲一区免费视频| 色吧成人激情小说| 亚洲欧美一区二区在线观看| 国产黄色成人av| 久久综合九色综合欧美98| 日韩电影在线观看网站| 欧美在线视频日韩| 一区二区三区精品在线| 91丨九色丨尤物| 亚洲欧美在线视频| 91无套直看片红桃| 亚洲视频免费看| 一本色道a无线码一区v| 亚洲私人影院在线观看| 99国产精品99久久久久久| 中文字幕在线一区免费| 成人一区二区视频| 国产精品动漫网站| 色综合视频一区二区三区高清| 中文字幕一区二区不卡| 91网站在线观看视频| 亚洲免费观看高清| 91成人免费在线| 一区二区免费在线播放| 欧美日韩一区小说| 日本伊人色综合网| 欧美成人精品福利| 国产美女一区二区| 中文字幕不卡的av| aaa欧美色吧激情视频| 亚洲视频一二三| 欧美丝袜丝交足nylons图片| 狠狠v欧美v日韩v亚洲ⅴ| 久久青草欧美一区二区三区| 大白屁股一区二区视频| 自拍av一区二区三区| 91成人免费在线| 日韩电影在线免费看| 精品精品欲导航| 国产不卡高清在线观看视频| 亚洲欧洲无码一区二区三区| 色综合色狠狠综合色| 婷婷久久综合九色综合绿巨人| 日韩精品最新网址| 国产成人欧美日韩在线电影| 亚洲欧美欧美一区二区三区| 欧美日韩专区在线| 久久99热99| 日韩一区欧美小说| 777a∨成人精品桃花网| 国产黑丝在线一区二区三区| 亚洲色欲色欲www在线观看| 欧美日韩精品一区视频| 国产乱码精品一区二区三区忘忧草 | 日韩福利电影在线| 国产亚洲成aⅴ人片在线观看| 91免费小视频| 美国精品在线观看| 国产精品第一页第二页第三页| 欧美日韩在线观看一区二区 | 成人免费的视频| 亚洲一区二区三区四区在线观看 | 亚洲激情av在线| 日韩欧美一级在线播放| 成人性色生活片免费看爆迷你毛片| 一区二区高清视频在线观看| 亚洲精品在线观看视频| 一本久久a久久精品亚洲| 久久97超碰色| 一区二区不卡在线播放| 久久久久久久久久美女| 欧美亚洲综合久久| 国产精品一品二品| 亚洲国产精品久久久久婷婷884| 亚洲精品在线观看网站| 91电影在线观看| 国产a区久久久| 天天综合日日夜夜精品| 中文字幕在线一区免费| 日韩精品一区二区三区中文精品 | 欧美电影免费观看高清完整版在线 | 在线一区二区观看| 韩国av一区二区| 婷婷综合五月天| 亚洲色欲色欲www在线观看| 欧美精品一区二区三区久久久 | 精品美女一区二区| 欧美自拍偷拍一区| 国产98色在线|日韩| 蜜臀av性久久久久蜜臀av麻豆| 亚洲精品自拍动漫在线| 国产亚洲精品超碰| 日韩免费高清视频| 欧美日韩不卡一区二区| 色视频成人在线观看免| 成人黄页毛片网站| 国产综合成人久久大片91| 日韩va亚洲va欧美va久久| 一区二区成人在线观看| 国产精品久久久久久久久动漫 | 久久精品免费观看| 亚洲成人av在线电影| 亚洲乱码日产精品bd| 国产精品污网站| 国产日产精品1区| 久久综合av免费| 欧美tickling网站挠脚心| 91精品国产91久久综合桃花 | 亚洲三级小视频| 国产精品少妇自拍| 国产欧美综合在线观看第十页| 亚洲精品一区二区三区精华液 | 国产精品久久久久影视| 久久久久久麻豆| 久久久久久久免费视频了| 精品久久一区二区| 欧美一区二区三区精品| 欧美伊人久久久久久午夜久久久久| 91在线丨porny丨国产| 成人免费毛片app| proumb性欧美在线观看|