<html xmlns:mwsh="http://www.mathworks.com/namespace/mcode/v1/syntaxhighlight.dtd"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> <!--This HTML is auto-generated from an M-file.To make changes, update the M-file and republish this document. --> <title>Modeling Traffic Patterns using Subtractive Clustering</title> <meta name="generator" content="MATLAB 7.1"> <meta name="date" content="2005-07-28"> <meta name="m-file" content="trips"> <link rel="stylesheet" type="text/css" href="../../../matlab/demos/private/style.css"> </head> <body> <div class="header"> <div class="left"><a href="matlab:edit trips">Open trips.m in the Editor</a></div> <div class="right"><a href="matlab:echodemo trips">Run in the Command Window</a></div> </div> <div class="content"> <h1>Modeling Traffic Patterns using Subtractive Clustering</h1> <introduction> <p>This demo demonstrates the use of subtractive clustering to model traffic patterns in an area based on the area's demographics.</p> </introduction> <h2>Contents</h2> <div> <ul> <li><a href="#1">The problem: Understanding Traffic Patterns</a></li> <li><a href="#2">The Data</a></li> <li><a href="#6">Why clustering and fuzzy logic?</a></li> <li><a href="#7">Clustering the data</a></li> <li><a href="#13">Generating the Fuzzy Inference System (FIS)</a></li> <li><a href="#15">Understanding the clusters-FIS relationship</a></li> <li><a href="#22">Using the FIS for data exploration</a></li> <li><a href="#27">Conclusion</a></li> <li><a href="#28">Glossary</a></li> </ul> </div> <h2>The problem: Understanding Traffic Patterns<a name="1"></a></h2> <p>In this demo we attempt to understand the relationship between the number of automobile trips generated from an area and the area's demographics. Demographic and trip data were collected from traffic analysis zones in New Castle County, Delaware. 
Five demographic factors are considered: population, number of dwelling units, vehicle ownership, median household income and total employment. </p> <p>From here on, the demographic factors are referred to as inputs and the trips generated as the output. Hence our problem has five input variables (the five demographic factors) and one output variable (the number of trips generated). </p> <h2>The Data<a name="2"></a></h2> <p>We will now load the input and output variables used for this demo into the workspace.</p><pre class="codeinput">tripdata</pre><p>Two variables are loaded in the workspace, <tt>datin</tt> and <tt>datout</tt>. <tt>datin</tt> has 5 columns representing the 5 input variables and <tt>datout</tt> has 1 column representing the 1 output variable. </p><pre class="codeinput">subplot(2,1,1)
plot(datin)
legend(<span class="string">'population'</span>, <span class="string">'num. of dwelling units'</span>, <span class="string">'vehicle ownership'</span>,<span class="keyword">...</span>
    <span class="string">'median household income'</span>, <span class="string">'total employment'</span>);
title(<span class="string">'Input Variables'</span>)
subplot(2,1,2)
plot(datout)
legend(<span class="string">'num of trips'</span>);
title(<span class="string">'Output Variable'</span>)</pre><img vspace="5" hspace="5" src="trips_01.png"> <p><b>Figure 1:</b> Input and Output variables </p> <p>The number of rows in <tt>datin</tt> and <tt>datout</tt>, 75, is the number of observations (samples or datapoints) available. A row in <tt>datin</tt>, say row 11, constitutes a set of observed values of the 5 input variables (population, number of dwelling units, vehicle ownership, median household income and total employment) and the corresponding row, row 11, in <tt>datout</tt> represents the observed value for the number of trips generated given the observations made for the input variables.
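</p> <p>As a quick check of this layout, the dimensions of the two matrices and a single observation can be inspected at the command line. This is a minimal sketch that assumes <tt>tripdata</tt> has already been run so that <tt>datin</tt> and <tt>datout</tt> exist in the workspace; row 11 is used only because it is the row mentioned above.</p><pre class="codeinput">% Number of observations (rows) and variables (columns)
size(datin)
size(datout)

% Observed values of the 5 input variables for observation 11
datin(11,:)

% Observed number of trips for the same observation
datout(11)</pre><p>The row counts of the two matrices should match, since each row of <tt>datout</tt> is paired with the same row of <tt>datin</tt>.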
</p> <p>We will model the relationship between the input variables (demographics) and the output variable (trips) by first clustering the data. The cluster centers will then be used as a basis to define a Fuzzy Inference System (FIS) which can then be used to explore and understand traffic patterns. </p> <h2>Why clustering and fuzzy logic?<a name="6"></a></h2> <p>Clustering can be a very effective technique to identify natural groupings in data from a large data set, thereby allowing concise representation of relationships embedded in the data. In this example, clustering allows us to group traffic patterns into broad categories, making them easier to understand. </p> <p>Fuzzy logic is an effective paradigm to handle uncertainty. It can be used to take fuzzy or imprecise observations for inputs and yet arrive at crisp and precise values for outputs. Also, the <a href="matlab:helpview([docroot,'/toolbox/fuzzy/fuzzy.map'],'fuzzy_inference_systems')">Fuzzy Inference System (FIS)</a> is a simple and commonsensical way to build systems without using complex analytical equations. </p> <p>In our example, fuzzy logic will be employed to capture the broad categories identified during clustering in a Fuzzy Inference System (FIS). The FIS will then act as a model that reflects the relationship between demographics and auto trips. </p> <p>Clustering and fuzzy logic together provide a simple yet powerful means to model the traffic relationship that we want to study. </p> <h2>Clustering the data<a name="7"></a></h2> <p><tt>subclust</tt> is the function that implements a clustering technique called subtractive clustering. Subtractive clustering, [Chi94], is a fast, one-pass algorithm for estimating the number of clusters and the cluster centers in a dataset. </p> <p>In this section, we will see how subtractive clustering is performed on a dataset, and in the next section we will explore separately how clustering is used to build a Fuzzy Inference System (FIS).
</p><pre class="codeinput">[C,S] = subclust([datin datout],0.5);</pre><p>The first argument to the <tt>subclust</tt> function is the data to be clustered. The second argument is <tt>radii</tt>, which specifies a cluster's radius of influence in the <a href="#28">input space</a>. </p> <p>The variable <tt>C</tt> now holds all the centers of the clusters that have been identified by <tt>subclust</tt>. Each row of <tt>C</tt> contains the position of a cluster. </p><pre class="codeinput">C</pre><pre class="codeoutput">C =
    1.8770    0.7630    0.9170   18.7500    1.5650    2.1830
    0.3980    0.1510    0.1320    8.1590    0.6250    0.6480
    3.1160    1.1930    1.4870   19.7330    0.6030    2.3850</pre><p>In this case, <tt>C</tt> has 3 rows representing 3 clusters with 6 columns representing the positions of the clusters in each dimension. </p> <p><tt>subclust</tt> has hence identified 3 natural groupings in the demographic-trip dataset being considered. The following plot shows how the clusters have been identified in the 'total employment' and 'trips' dimensions of the input space. </p><pre class="codeinput">clf;
plot(datin(:,5), datout(:,1), <span class="string">'.'</span>, C(:,5),C(:,6),<span class="string">'r*'</span>)
legend(<span class="string">'Data points'</span>, <span class="string">'Cluster centers'</span>, <span class="string">'Location'</span>, <span class="string">'SouthEast'</span>)
xlabel(<span class="string">'total employment'</span>)
ylabel(<span class="string">'num of trips'</span>)
title(<span class="string">'Data and Clusters in selected two dimensions of the input space'</span>)</pre><img vspace="5" hspace="5" src="trips_02.png"> <p><b>Figure 2:</b> Cluster centers in the 'total employment' and 'trips' dimensions of the input space </p> <p>The variable <tt>S</tt> contains the sigma values that specify the range of influence of a cluster center in each of the data dimensions. All cluster centers share the same set of sigma values.
</p><pre class="codeinput">S</pre><pre class="codeoutput">S = 1.1621 0.4117 0.6555 7.6139 2.8931 1.4395</pre><p><tt>S</tt> in this case has 6 columns representing the influence of the cluster centers on each of the 6 dimensions. </p> <h2>Generating the Fuzzy Inference System (FIS)<a name="13"></a></h2> <p><tt>genfis2</tt> is the function that creates a FIS using subtractive clustering. <tt>genfis2</tt> employs <tt>subclust</tt> behind the scenes to cluster the data and uses the cluster centers and their range of influences to build a FIS which will then be used to explore and understand traffic patterns. </p><pre class="codeinput">myfis=genfis2(datin,datout,0.5);</pre><p>The first argument is the input variables matrix <tt>datin</tt>, the second argument is the output variables matrix <tt>datout</tt> and the third argument is the <tt>radii</tt> that should be used while using <tt>subclust</tt>. </p> <p><tt>genfis2</tt> assigns default names for inputs, outputs and membership functions. For our understanding it is beneficial to rename the inputs and outputs meaningfully. 
</p><pre class="codeinput"><span class="comment">% Assign names to inputs and outputs</span>
myfis = setfis(myfis, <span class="string">'input'</span>,1,<span class="string">'name'</span>,<span class="string">'population'</span>);
myfis = setfis(myfis, <span class="string">'input'</span>,2,<span class="string">'name'</span>,<span class="string">'dwelling units'</span>);
myfis = setfis(myfis, <span class="string">'input'</span>,3,<span class="string">'name'</span>,<span class="string">'num vehicles'</span>);
myfis = setfis(myfis, <span class="string">'input'</span>,4,<span class="string">'name'</span>,<span class="string">'income'</span>);
myfis = setfis(myfis, <span class="string">'input'</span>,5,<span class="string">'name'</span>,<span class="string">'employment'</span>);
myfis = setfis(myfis, <span class="string">'output'</span>,1,<span class="string">'name'</span>,<span class="string">'num of trips'</span>);</pre><h2>Understanding the clusters-FIS relationship<a name="15"></a></h2> <p>A FIS is composed of inputs, outputs and rules. Each input and output can have any number of membership functions. The rules dictate the behavior of the fuzzy system based on inputs, outputs and membership functions. <tt>genfis2</tt> constructs the FIS in an attempt to capture the position and influence of each cluster in the input space. </p> <p><tt>myfis</tt> is the FIS that <tt>genfis2</tt> has generated. Since the dataset has 5 input variables and 1 output variable, <tt>genfis2</tt> constructs a FIS with 5 inputs and 1 output. Each input and output has as many membership functions as the number of clusters that <tt>subclust</tt> has identified. As seen previously, for the current dataset <tt>subclust</tt> identified 3 clusters. Therefore each input and output will be characterized by 3 membership functions. Also, the number of rules equals the number of clusters and hence 3 rules are created.
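</p> <p>These counts can be checked directly on the generated FIS. The sketch below assumes that the FIS is stored as a structure with <tt>input</tt>, <tt>output</tt> and <tt>rule</tt> fields; if a given release represents a FIS differently, the summary printed by <tt>getfis</tt> is the safer route.</p><pre class="codeinput">% Print a summary of the FIS
getfis(myfis)

% Count the pieces directly (assumes a structure-based FIS)
numInputs  = numel(myfis.input)
numOutputs = numel(myfis.output)
numRules   = numel(myfis.rule)

% Number of membership functions on the first input and on the output
numInputMFs  = numel(myfis.input(1).mf)
numOutputMFs = numel(myfis.output(1).mf)</pre><p>For this dataset the counts should agree with the description above: 5 inputs, 1 output, and 3 rules and 3 membership functions per variable, one for each cluster.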
</p> <p>We can now probe the FIS to understand how the clusters got converted internally into membership functions and rules.</p><pre class="codeinput">fuzzy(myfis)</pre><img vspace="5" hspace="5" src="trips_03.png"> <p><b>Figure 3:</b> The graphical editor for building Fuzzy Inference Systems (FIS) </p> <p><tt>fuzzy</tt> is the function that launches the graphical editor for building fuzzy systems. <tt>fuzzy(myfis)</tt> launches the editor set up to edit <tt>myfis</tt>, the FIS that we just generated. As can be seen, the FIS has 5 inputs and 1 output with the inputs mapped to the outputs through a rulebase (white box in the figure). </p> <p>Let's now try to analyze how the cluster centers and the membership functions are related.</p><pre class="codeinput">mfedit(myfis)</pre><img vspace="5" hspace="5" src="trips_04.png"> <p><b>Figure 4:</b> The graphical membership function editor </p> <p><tt>mfedit(myfis)</tt> launches the graphical membership function editor. It can also be launched by clicking on the inputs or the outputs in the FIS editor launched by <tt>fuzzy</tt>. </p> <p>Notice that all the inputs and outputs have exactly 3 membership functions. The 3 membership functions represent the 3 clusters that were identified by <tt>subclust</tt>. </p> <p>Each input in the FIS represents an input variable in the input dataset <tt>datin</tt> and each output in the FIS represents an output variable in the output dataset <tt>datout</tt>. </p> <p>By default, the first membership function, <tt>in1cluster1</tt>, of the first input <tt>population</tt> would be selected in the membership function editor. Notice that the membership function type is "gaussmf" (gaussian type membership function) and the parameters of the membership function are <tt>[1.162 1.877]</tt>, where <tt>1.162</tt> represents the spread coefficient of the gaussian curve and <tt>1.877</tt> represents the center of the gaussian curve. 
<tt>in1cluster1</tt> captures the position and influence of the first cluster for the input variable <tt>population</tt>. <tt>(C(1,1)=1.877, S(1)=1.1621 )</tt></p> <p>Similarly, the position and influence of the other 2 clusters for the input variable <tt>population</tt> are captured by the other two membership functions <tt>in1cluster2</tt> and <tt>in1cluster3</tt>. </p> <p>The rest of the 4 inputs follow the exact pattern mimicking the position and influence of the 3 clusters along their respective dimensions in the dataset. </p> <p>Now, let's explore how the fuzzy rules are constructed.</p><pre class="codeinput">ruleedit(myfis)</pre><img vspace="5" hspace="5" src="trips_05.png"> <p><b>Figure 5:</b> The graphical rule editor </p> <p><tt>ruleedit</tt> is the graphical fuzzy rule editor. As you can notice, there are exactly three rules. Each rule attempts to map a cluster in the input space to a cluster in the output space. </p> <p>The first rule can be explained simply as follows. If the inputs to the FIS, <tt>population</tt>, <tt>dwelling units</tt>, <tt>num vehicles</tt>, <tt>income</tt>, and <tt>employment</tt>, strongly belong to their respective <tt>cluster1</tt> membership functions then the output, <tt>num of trips</tt>, must strongly belong to its <tt>cluster1</tt> membership function. The (1) at the end of the rule is to indicate that the rule has a weight or an importance of "1". Weights can take any value between 0 and 1. Rules with lesser weights will count for less in the final output. </p> <p>The significance of the rule is that it succinctly maps cluster 1 in the input space to cluster 1 in the output space. Similarly the other two rules map cluster 2 and cluster 3 in the input space to cluster 2 and cluster 3 in the output space. 
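</p> <p>The rule base can also be listed at the command line, and the degree to which a datapoint matches each rule can be worked out by hand. The sketch below rests on several assumptions: that the FIS is a structure with <tt>input</tt> and <tt>rule</tt> fields, that rule <tt>i</tt> uses membership function <tt>i</tt> of every input (as described above), and that the individual memberships are combined by taking their product. Row 11 of <tt>datin</tt> is an arbitrary example point.</p><pre class="codeinput">% List the rules in text form
showrule(myfis)

% Degree to which one datapoint matches each rule
x = datin(11,:);
numRules = numel(myfis.rule);
w = zeros(numRules,1);
for i = 1:numRules
    mu = zeros(1,numel(x));
    for j = 1:numel(x)
        % params holds [spread center] of the gaussian for input j, cluster i
        mu(j) = gaussmf(x(j), myfis.input(j).mf(i).params);
    end
    w(i) = prod(mu);   % assumption: memberships are combined by product
end
w</pre><p>The largest entry of <tt>w</tt> indicates the cluster that the datapoint resembles most.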
</p> <p>If a datapoint closer to the first cluster, or in other words having strong membership to the first cluster, is fed as input to <tt>myfis</tt> then rule1 will fire with more <a href="#28">firing strength</a> than the other two rules. Similarly, an input with strong membership to the second cluster will fire the second rule with more firing strength than the other two rules and so on. </p> <p>The outputs of the rules (firing strengths) are then used to generate the output of the FIS through the output membership functions.</p> <p>The one output of the FIS, <tt>num of trips</tt>, has 3 linear membership functions representing the 3 clusters identified by <tt>subclust</tt>. The coefficients of the linear membership functions, though, are not taken directly from the cluster centers. Instead, they are estimated from the dataset using the least squares estimation technique. </p> <p>All 3 membership functions in this case will be of the form <tt>a*population + b*dwelling units + c*num vehicles + d*income + e*employment + f</tt>, where <tt>a</tt>, <tt>b</tt>, <tt>c</tt>, <tt>d</tt>, <tt>e</tt> and <tt>f</tt> represent the coefficients of the linear membership function. Click on any of the <tt>num of trips</tt> membership functions in the membership function editor to observe the parameters of these linear membership functions. </p> <h2>Using the FIS for data exploration<a name="22"></a></h2> <p>You can now use the FIS that has been constructed to understand the underlying dynamics of the relationship being modeled.</p><pre class="codeinput">surfview(myfis)</pre><img vspace="5" hspace="5" src="trips_06.png"> <p><b>Figure 6:</b> Input-Output Surface viewer </p> <p><tt>surfview</tt> is the surface viewer that helps view the input-output surface of the fuzzy system. In other words, this tool simulates the response of the fuzzy system for the entire range of inputs that the system is configured to work for.
Thereafter, the response of the FIS to the inputs is plotted against the inputs as a surface. This visualization helps you understand how the system will behave over the entire range of values in the input space.
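</p> <p>The same kind of exploration can be scripted without the graphical viewer. The sketch below assumes the argument order <tt>evalfis(inputs, fis)</tt>, with the data first and the FIS second; some releases may expect the FIS first. The choice of inputs 1 and 5 for the surface is arbitrary.</p><pre class="codeinput">% Model output for every observed set of inputs
modelTrips = evalfis(datin, myfis);

% Root mean squared difference between modeled and observed trips
rmsError = norm(modelTrips - datout) / sqrt(length(datout))

% Surface of output 1 against population (input 1) and employment (input 5)
figure
gensurf(myfis, [1 5], 1)</pre><p>A small <tt>rmsError</tt> relative to the spread of <tt>datout</tt> suggests that the three rules reproduce the observed trips reasonably well on the data used to build the model.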