Train a two layer neural network with a recursive prediction error
% algorithm ("recursive Gauss-Newton"). Also pruned (i.e., not fully
% connected) networks can be trained.
%
% The activation functions can either be linear or tanh. The network
% architecture is defined by the matrix NetDef , which has of two
% rows. The first row specifies the hidden layer while the second
% specifies the output layer.