% userguide.tex
\documentclass[12pt]{book}
\usepackage{amssymb}
\newcommand{\toolboxname}{NetPack}
\title{\toolboxname \\ User Guide (command line)}
%\parindent 0cm
\begin{document}
\newcommand{\file}[1]{\texttt{#1}}
\newcommand{\field}[1]{\texttt{#1}}
\newcommand{\matlab}{Matlab}
\newcommand{\netpackroot}{\texttt{NETPACKROOT}}
% begin definition \tbcmd
{\catcode`\}=12\catcode`\]=2\global\def\tbcmd#{\verb}\let\troep]]
% end definition \tbcmd
% begin definition \mlcmd
{\catcode`\}=12\catcode`\]=2\global\def\mlcmd#{\verb}\let\troep]]
% end definition \mlcmd
% begin definition \seealso
{\catcode`\}=12\catcode`\]=2\global\def\seealso#{\par See also: \verb}\let\troep]]
% end definition \seealso
% begin definition environment example
\newenvironment{example}{\redefxverbatim\begin{verbatim}}{\end{verbatim}}
{\makeatletter\catcode`\/=0\catcode`\\=12/catcode`/{=12/catcode`/}=12/catcode`/[=1/catcode`/]=2/global/def/redefxverbatim[/def/@xverbatim##1\end{example}[##1/end[example]]]]
% end definition environment example
\newcommand{\MU}{MU}
\newcommand{\veci}[1]{\vec{#1}}
\newcommand{\veco}[1]{#1}
\newcommand{\veca}[1]{#1}
\newcommand{\vw}{\vec{w}}
\newcommand{\vdw}{\vec{\Delta w}}
\newcommand{\vx}{\vec{x}}
\newcommand{\vxmu}{\vec{x}_{\mu}}
\newcommand{\vxa}[1]{\vec{x}_{#1}}
\newcommand{\vt}{\vec{t}}
\newcommand{\vtmu}{\vec{t}_{\mu}}
\newcommand{\vta}[1]{\vec{t}_{#1}}
\newcommand{\vy}{\vec{y}}
\newcommand{\vymu}{\vec{y}_{\mu}}
\newcommand{\vxi}{\vec{\xi}}
\newcommand{\vf}{\vec{f}}
\newcommand{\vfhat}{\vec{\hat{f}}}
\newcommand{\vecay}{Y}
\newcommand{\vecat}{T}
\newcommand{\gradE}{\triangledown E}
\maketitle
%---------------------------------------------------------------------
%---------------------------------------------------------------------
\chapter{Overview}
%---------------------------------------------------------------------
The functions in SNN's \toolboxname\ toolbox for \matlab\ can be used
from the \matlab\ command line or from a graphical user interface.
This manual mainly describes the command line interface. The next
chapter describes how to install the toolbox. Chapter~\ref{nn} is a
short introduction to feedforward neural networks and ensembles, and
chapter~\ref{netpacktoolbox} describes how to use the toolbox.

\section{Toolbox functions and algorithms}
The \toolboxname\ toolbox can be used to train an ensemble of
feedforward neural networks on regression tasks. For training you can
choose from three different algorithms, known as \emph{gradient
descent}, \emph{conjugate gradient} and \emph{Levenberg-Marquardt},
which all try to minimize a cost function. For this cost function you
can write your own, or you can use \tbcmd{wcf_snn}, which is a very
flexible cost function. It computes a weighted average of output
errors, where different weights can be set for each training pattern
and for each output. The output errors are calculated from output
error functions, which can be set for each output separately. For the
output error functions you can choose from squared error, relative
error, log likelihood, cross entropy and cross logistic, or you can
write your own. Additionally, for each pattern/output pair you can
specify not to include the output error in the average, thus allowing
for multi-task learning.
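To make the structure of such a cost function concrete, the sketch
below computes a weighted average of squared output errors over the
selected pattern/output pairs. It only illustrates the idea and is not
the implementation of \tbcmd{wcf_snn}; the function name and argument
layout are assumptions.
\begin{example}
function E = weighted_cost(Y, T, a, g, D)
% WEIGHTED_COST  Illustrative weighted cost function (not wcf_snn itself).
%   Y, T : (outputs x patterns) network outputs and targets
%   a    : (outputs x 1) weight for each output
%   g    : (1 x patterns) weight for each training pattern
%   D    : (outputs x patterns) 0/1 mask; a zero excludes a
%          pattern/output pair from the average (multi-task learning)
W = (a * g) .* D;                    % combined weight per pair
f = (Y - T) .^ 2;                    % squared error as output error function
E = sum(sum(W .* f)) / sum(sum(W));  % weighted average of included errors
\end{example}
In \tbcmd{wcf_snn} the squared error used in this sketch can be
replaced by any of the output error functions listed above, chosen per
output.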
%Mathematically this cost function is:
%\begin{equation}
% E(y, t) = \frac{1}{Z} \sum_{ \{i, \mu | \Delta_{i\mu} = 1\} }
% a_i g_{\mu} f^i(y_{i\mu}, t_{i\mu})
%\end{equation}
%with $Z$ a normalization factor:
%\begin{equation}
%Z = \sum_{ \{i, \mu | \Delta_{i\mu} = 1\} } a_i g_{\mu}
%\end{equation}
%
%\begin{tabular}{c l}
%$i$ & output index \\
%$\mu$ & pattern index \\
%$a_i$ & output weight \\
%$g_{\mu}$ & pattern weight \\
%$f^i$ & output error function for output $i$ \\
%$y_{i\mu}$ & prediction for output $i$, pattern $\mu$ \\
%$t_{i\mu}$ & target for output $i$, pattern $\mu$ \\
%$\Delta_{i\mu}$ & parameter controlling over which outputs and
%patterns the summation is done \\
%\end{tabular}

Instead of using just one network, a more robust estimator can be
obtained by training an ensemble of networks. To create the ensemble,
the networks can be trained on different subdivisions of the input
data set into a training and a validation set. This subdivision can be
done in several ways, of which this toolbox supports
\emph{bootstrapping} and \emph{leaving half out}.

Each network in the ensemble gives an estimate of the regression. By
combining the networks, a more robust estimate is obtained. The
networks can be combined in several ways. This toolbox provides a
technique called \emph{balancing}, which computes a weighted average
of the networks in the ensemble. Standard bagging, i.e.\ averaging
with equal weights, is also available. After balancing, you can
compute \emph{confidence} and \emph{prediction intervals}, which give
you an idea about the accuracy of the estimates.

Using \emph{backward elimination}, the relevance of (groups of) inputs
can be determined.
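As an illustration of how the members of an ensemble are combined, the
sketch below forms a weighted average of the predictions of $K$
networks. With equal weights this is plain bagging; balancing
corresponds to supplying non-uniform weights. The function is a
hypothetical example, not a routine from the toolbox.
\begin{example}
function Y = combine_ensemble(Yk, alpha)
% COMBINE_ENSEMBLE  Illustrative combination (not a toolbox routine).
%   Yk    : (outputs x patterns x K) predictions of the K ensemble members
%   alpha : (K x 1) combination weights summing to one;
%           alpha = ones(K,1)/K gives plain bagging
Y = zeros(size(Yk,1), size(Yk,2));
for k = 1:size(Yk,3)
    Y = Y + alpha(k) * Yk(:,:,k);   % add weighted member prediction
end
\end{example}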
%---------------------------------------------------------------------
%---------------------------------------------------------------------
\chapter{Installation}
\label{install}
%---------------------------------------------------------------------
\section{Requirements}
The toolbox can be installed on Unix and Windows. To use it you must
have \matlab\ installed (tested on version 5.3.1).

\emph{Note:} For the graphical user interface (GUI) a stand-alone
version is available, which does not require \matlab.

\section{Unpacking}
The toolbox is installed by unpacking a zip file named
\file{netpack-toolbox-1.1.zip}. To unpack it, follow the instructions
below.

\subsection{Unix}
The toolbox can be installed in any directory you want. We advise you
to install it in your home directory. To install it in the directory
\file{/home/user/}, unzip the file \file{netpack-toolbox-1.1.zip}:
\begin{example}
> unzip -d /home/user netpack-toolbox-1.1.zip
\end{example}
This will create a directory \file{/home/user/netpack-toolbox-1.1},
which contains the toolbox and documentation. From now on, we will
refer to this directory as \netpackroot.

\subsection{Windows}
Use a standard unzip utility to extract all files from the file
\file{netpack-toolbox-1.1.zip}. You can use any folder you like to
extract the files to. All files in the toolbox are installed in a
subfolder called \file{netpack-toolbox-1.1}. So, for example, when you
install the toolbox in \file{C:}, all files will be in
\file{C:$\backslash$netpack-toolbox-1.1}. From now on, we will refer
to this subfolder as \netpackroot.

%---------------------------------------------------------------------
\chapter[Feedforward neural networks]{Feedforward neural networks for
regression and classification tasks}
\label{nn}
%---------------------------------------------------------------------
%---------------------------------------------------------------------
\section{Regression tasks}
In a regression task, we try to estimate an underlying mathematical
function between input variables $\vx$ and output variables $\vt$,
based on a finite number of data points possibly corrupted by noise
\cite{bis95c}. We are given a data set of $\MU$ pairs
$\{ \vxmu, \vtmu \}$ of inputs and output targets (also known as
patterns), which are assumed to be generated according to
\begin{equation}
\vtmu = \vf(\vxmu) + \vxi(\vxmu),
\end{equation}
where $\vxi(\vx)$ denotes noise with zero mean. The regression task is
to find an estimator $\vfhat(\vx)$ of the regression $\vf(\vx)$.

\section{Classification tasks}
In a classification task, $\vtmu$ is restricted to
$t_{\mu} \in \{-1, 1\}$. The targets are assumed to be generated from
a probability distribution given by
\begin{equation}
p(t_{\mu} | \vx) = \frac{1}{1 + e^{- t_{\mu} f(\vx)}}.
\end{equation}
The classification task is to find an estimator $\vfhat(\vx)$ of the
classifier $\vf(\vx)$ \cite{bis95c}.

\section{Outputs and cost functions}
A feedforward neural network can be understood as a function producing
some output $\vy(\vx, \vw)$ given some input $\vx$ and network
parameters $\vw$. With the network parameters $\vw$ fixed, we can
interpret the output $\vy(\vx)$ of the network as an estimator of the
regression or classifier $\vf(\vx)$.

Training a network means adjusting the network parameters $\vw$ in
such a way that the network output is a good estimator. This is done
by minimizing a cost function $E(\vecay, \vecat)$ (also known as a
performance function) with respect to $\vw$. This cost function
depends on all network outputs
$\vecay = \{\vy(\vxa{1}), \ldots, \vy(\vxa{\MU})\}$ for the given
inputs, and on all target outputs
$\vecat = \{\vta{1}, \ldots, \vta{\MU}\}$. The cost function must have
a global minimum for $\vecay = \vecat$; it is a measure of the error
of the network outputs. Both regression and classification tasks can
be implemented in this way, each with a particular cost function.

\section{Network architecture}
The functional form of $\vy(\vx, \vw)$ depends on the architecture of
the network. A feedforward network consists of a number of layers,
each with a number of units, where the value of each unit depends on
the values of the units in the previous layer. The inputs $\vx$ form
the zeroth layer. For each subsequent layer, the value $v^l_i$ of the
$i$th unit in the $l$th layer is computed from
\begin{equation}
v^l_i = g_l(\sum_j w^l_{ij} v^{l-1}_j + b^l_i),
\end{equation}
where $g_l$ is a transfer function. All the $w^l_{ij}$ and $b^l_i$
together form the network parameters $\vw$ that need to be optimized.
For the input and output layer, we have
\begin{equation}
v^0_i = x_i,
\end{equation}
\begin{equation}
y_i = v^L_i,
\end{equation}
where $L$ is the total number of layers in the network. Thus the
functional form of $\vy$ is specified by the number of layers, the
number of units in each layer and the transfer function for each
layer.
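As a concrete reading of these equations, the sketch below implements
the forward pass layer by layer. It only illustrates the architecture
described above and is not code from the toolbox; the choice of
$\tanh$ hidden layers with a linear output layer, and the
representation of $\vw$ as one weight matrix and bias vector per layer
stored in cell arrays, are assumptions.
\begin{example}
function y = forward_pass(x, W, b)
% FORWARD_PASS  Illustrative forward pass; not a toolbox function.
%   x : (inputs x 1) input vector, forming the zeroth layer v^0 = x
%   W : cell array, W{l} holds the weights w^l_{ij} of layer l
%   b : cell array, b{l} holds the biases b^l_i of layer l
L = length(W);                   % total number of layers
v = x;
for l = 1:L-1
    v = tanh(W{l} * v + b{l});   % hidden layers: g_l = tanh
end
y = W{L} * v + b{L};             % linear output layer: y_i = v^L_i
\end{example}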
\section{Training algorithms}
Finding the (global) minimum of the cost function is a difficult task.
In training multi-layered feedforward networks, one tries to reach a
minimum by iteratively adjusting the network parameters $\vw$. In
so-called back-propagation algorithms \cite{pdp86}, $\vw$ changes in
each iteration depending on the gradient of the cost function. This
gradient is determined using the back-propagation technique, which
involves performing computations backwards through the network.

Many variations of back-propagation algorithms exist, which differ in
the way they use the gradient to compute the adjustment of the network
parameters. The simplest algorithm, \emph{gradient descent}, updates
the parameters in the direction in which the cost function decreases
most rapidly, i.e.\ the negative of the gradient,
$\vdw \propto -\gradE$. Although the decrease of the cost function is
largest in this direction, this algorithm does not necessarily produce
the fastest convergence to a minimum. Algorithms that also use
information about previous adjustments or second-order derivatives
generally give faster convergence. Often-used algorithms of this kind
are \emph{conjugate gradient} algorithms and the
\emph{Levenberg-Marquardt} algorithm \cite{lue84}.

\section{Early stopping}
When you train a network on a data set, it will try to fit the network
output $\vy$ as well as possible to the target outputs $\vtmu$. As a
consequence, the network will get biased on