===============================================================
KnowledgeFlow GUI Quick Primer
===============================================================

Introduction:

The KnowledgeFlow provides an alternative to the Explorer as a
graphical front end to Weka's core algorithms. The KnowledgeFlow is a
work in progress so some of the functionality from the Explorer is not
yet available. On the other hand, there are things that can be done in
the KnowledgeFlow but not in the Explorer.

The KnowledgeFlow presents a "data-flow" inspired interface to
Weka. The user can select Weka components from a tool bar, place them
on a layout canvas and connect them together in order to form a
"knowledge flow" for processing and analyzing data. At present, all of
Weka's classifiers, filters, clusterers, loaders and savers are
available in the KnowledgeFlow along with some extra tools.

The KnowledgeFlow can handle data either incrementally or in batches
(the Explorer handles batch data only). Of course learning from data
incrementally requires a classifier that can be updated on an instance
by instance basis. Classifiers in Weka that can currently handle data
incrementally include NaiveBayesUpdateable, IB1, IBk and LWR (locally
weighted regression). There is also one meta classifier -
RacedIncrementalLogitBoost - that can use any regression base
learner to learn from discrete class data incrementally.

Features of the KnowledgeFlow:

* intuitive data flow style layout
* process data in batches or incrementally
* process multiple batches or streams in parallel!
  (each separate flow executes in its own thread)
* chain filters together
* view models produced by classifiers for each fold in a cross
  validation
* visualize performance of incremental classifiers during processing
  (scrolling plots of classification accuracy, RMS error, predictions
  etc)

Components available in the KnowledgeFlow:

DataSources:
  All of Weka's loaders are available

DataSinks:
  All of Weka's savers are available

Filters:
  All of Weka's filters are available

Classifiers:
  All of Weka's classifiers are available

Clusterers:
  All of Weka's clusterers are available

Evaluation:
  TrainingSetMaker - make a data set into a training set
  TestSetMaker - make a data set into a test set
  CrossValidationFoldMaker - split any data set, training set or test
    set into folds
  TrainTestSplitMaker - split any data set, training set or test set
    into a training set and a test set
  ClassAssigner - assign a column to be the class for any data set,
    training set or test set
  ClassValuePicker - choose a class value to be considered as the
    "positive" class. This is useful when generating data for ROC
    style curves (see below)
  ClassifierPerformanceEvaluator - evaluate the performance of batch
    trained/tested classifiers
  IncrementalClassifierEvaluator - evaluate the performance of
    incrementally trained classifiers
  ClustererPerformanceEvaluator - evaluate the performance of batch
    trained/tested clusterers
  PredictionAppender - append classifier predictions to a test set.
    For discrete class problems, can either append predicted class
    labels or probability distributions

Visualization:
  DataVisualizer - component that can pop up a panel for visualizing
    data in a single large 2D scatter plot
  ScatterPlotMatrix - component that can pop up a panel containing a
    matrix of small scatter plots (clicking on a small plot pops up a
    large scatter plot)
  AttributeSummarizer - component that can pop up a panel containing a
    matrix of histogram plots - one for each of the attributes in the
    input data
  ModelPerformanceChart - component that can pop up a panel for
    visualizing threshold (i.e. ROC style) curves.
  TextViewer - component for showing textual data. Can show data sets,
    classification performance statistics etc.
  GraphViewer - component that can pop up a panel for visualizing tree
    based models
  StripChart - component that can pop up a panel that displays a
    scrolling plot of data (used for viewing the online performance of
    incremental classifiers)

---------------

Launching the KnowledgeFlow:

The Weka GUI Chooser window is used to launch Weka's graphical
environments. Select the button labeled "KnowledgeFlow" to start the
KnowledgeFlow. Alternatively, you can launch the KnowledgeFlow from a
terminal window by typing "java weka.gui.beans.KnowledgeFlow".

At the top of the KnowledgeFlow window are seven tabs: DataSources,
DataSinks, Filters, Classifiers, Clusterers, Evaluation and
Visualization.
The names are pretty much self explanatory.

EXAMPLE:
-----------------

Setting up a flow to load an arff file (batch mode) and
perform a cross validation using J48 (Weka's C4.5 implementation).

First start the KnowledgeFlow.

Next click on the DataSources tab and choose "ArffLoader" from the
toolbar (the mouse pointer will change to a "cross hairs").

Next place the ArffLoader component on the layout area by clicking
somewhere on the layout (a copy of the ArffLoader icon will appear on
the layout area).

Next specify an arff file to load by first right clicking the mouse
over the ArffLoader icon on the layout. A pop-up menu will
appear. Select "Configure" under "Edit" in the list from this menu and
browse to the location of your arff file.

Next click the "Evaluation" tab at the top of the window and choose the
"ClassAssigner" (allows you to choose which column to be the class)
component from the toolbar. Place this on the layout.

Now connect the ArffLoader to the ClassAssigner: first right click
over the ArffLoader and select the "dataSet" under "Connections" in
the menu. A "rubber band" line will appear. Move the mouse over the
ClassAssigner component and left click - a red line labeled "dataSet"
will connect the two components.

Next right click over the ClassAssigner and choose "Configure" from
the menu. This will pop up a window from which you can specify which
column is the class in your data (last is the default).

Next grab a "CrossValidationFoldMaker" component from the Evaluation
toolbar and place it on the layout. Connect the ClassAssigner to the
CrossValidationFoldMaker by right clicking over "ClassAssigner" and
selecting "dataSet" from under "Connections" in the menu.

Next click on the "Classifiers" tab at the top of the window and
scroll along the toolbar until you reach the "J48" component in the
"trees" section.
Place a J48 component on the layout.

Connect the CrossValidationFoldMaker to J48 TWICE by first choosing
"trainingSet" and then "testSet" from the pop-up menu for the
CrossValidationFoldMaker.

Next go back to the "Evaluation" tab and place a
"ClassifierPerformanceEvaluator" component on the layout. Connect J48
to this component by selecting the "batchClassifier" entry from the
pop-up menu for J48.

Next go to the "Visualization" toolbar and place a "TextViewer"
component on the layout. Connect the ClassifierPerformanceEvaluator to
the TextViewer by selecting the "text" entry from the pop-up menu for
ClassifierPerformanceEvaluator.

Now start the flow executing by selecting "Start loading" from the
pop-up menu for ArffLoader. Depending on how big the data set is and
how long cross validation takes you will see some animation from some
of the icons in the layout (J48's tree will "grow" in the icon and the
ticks will animate on the ClassifierPerformanceEvaluator). You will
also see some progress information in the "Status" bar and "Log" at
the bottom of the window.

When the flow has finished you can view the results by choosing
"Show results" from the pop-up menu for the TextViewer component.

Other cool things to add to this flow: connect a TextViewer and/or a
GraphViewer to J48 in order to view the textual or graphical
representations of the trees produced for each fold of the cross
validation (this is something that is not possible in the Explorer).

-----------------------------
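
A note on the two connections in the flow above: the fold maker hands
the classifier one "trainingSet" and one "testSet" per fold, which is
why it is connected to J48 twice. The following is a minimal sketch of
that idea in plain Java, using row indices instead of real instances.
It is NOT Weka's code; the names FoldSketch, Fold and makeFolds are
made up for illustration, and refinements such as stratifying by class
value are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not part of Weka.
class FoldSketch {

    /** One fold: the row indices used for training and for testing. */
    static final class Fold {
        final List<Integer> train;
        final List<Integer> test;

        Fold(List<Integer> train, List<Integer> test) {
            this.train = train;
            this.test = test;
        }
    }

    /**
     * Shuffle the row indices 0..numInstances-1 with the given seed, then
     * build numFolds train/test pairs. Each row lands in exactly one test
     * set and in the training set of every other fold.
     */
    static List<Fold> makeFolds(int numInstances, int numFolds, long seed) {
        if (numFolds < 2 || numFolds > numInstances) {
            throw new IllegalArgumentException(
                "numFolds must be between 2 and numInstances");
        }
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed));

        List<Fold> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            List<Integer> train = new ArrayList<>();
            List<Integer> test = new ArrayList<>();
            for (int pos = 0; pos < numInstances; pos++) {
                // Every numFolds-th shuffled position goes to this fold's test set.
                if (pos % numFolds == f) {
                    test.add(order.get(pos));
                } else {
                    train.add(order.get(pos));
                }
            }
            folds.add(new Fold(train, test));
        }
        return folds;
    }

    public static void main(String[] args) {
        // Print the train/test row indices for a 5-fold split of 10 rows.
        for (Fold fold : makeFolds(10, 5, 1L)) {
            System.out.println("train=" + fold.train + " test=" + fold.test);
        }
    }
}
```

In the flow, each such pair would correspond to one model built by J48
and one set of predictions passed on to the
ClassifierPerformanceEvaluator.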