Omlin and Giles (1996a) and Omlin and Giles (1996b) have used a second-order recurrent neural network (similar to the one used by Giles et al. (1992), Pollack (1991), Forcada and Carrasco (1995), Watrous and Kuhn (1992), and Zeng et al. (1993)), which may be formulated as a Mealy NSM described by a next-state function whose $i$-th coordinate ($i = 1, \ldots, n_X$) is

\[
x_i[t] = g\!\left( \sum_{j=1}^{n_X} \sum_{k=1}^{n_U} W^{xxu}_{ijk}\, x_j[t-1]\, u_k[t] + W^x_i \right) \qquad (4.13)
\]
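To make the form of (4.13) concrete, here is a minimal NumPy sketch of a single state update; the function name next_state_second_order, the array names W_xxu and W_x, and the choice of the logistic function for g are illustrative assumptions, not notation fixed by the text.

import numpy as np

def logistic(x):
    """Logistic sigmoid, bounded by 0 and 1 (one common activation choice)."""
    return 1.0 / (1.0 + np.exp(-x))

def next_state_second_order(x_prev, u, W_xxu, W_x, g=logistic):
    """One step of a second-order next-state function in the style of (4.13).

    x_prev : (n_X,)           previous state x[t-1]
    u      : (n_U,)           current input u[t]
    W_xxu  : (n_X, n_X, n_U)  second-order weights
    W_x    : (n_X,)           biases
    """
    # x_i[t] = g( sum_j sum_k W_xxu[i, j, k] * x_prev[j] * u[k] + W_x[i] )
    return g(np.einsum('ijk,j,k->i', W_xxu, x_prev, u) + W_x)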
Activation functions are usually required to be real-valued, monotonically increasing, continuous (very often also differentiable), and bounded; they are usually nonlinear. Two commonly used examples of differentiable activation functions are the logistic function $1/(1+e^{-x})$, which is bounded by 0 and 1, and the hyperbolic tangent $\tanh(x)$, which is bounded by $-1$ and $1$. Differentiability is usually required because it allows the use of gradient-based learning algorithms. There are also a number of architectures that do not use sigmoid-like activation functions but instead use radial basis functions (Haykin (1998), ch. 5; Hertz et al. (1991), p. 248), which are not monotonic but Gaussian-like, reaching their maximum value at a particular value of their input. DTRNN architectures using radial basis functions have been used by Frasconi et al. (1996) and Cid-Sueiro et al. (1994).
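Purely as an illustration of the three kinds of activation functions mentioned above, they might be written as follows; the Gaussian center and width parameters are arbitrary placeholders, not values taken from the text.

import numpy as np

def logistic(x):
    """Logistic function 1 / (1 + exp(-x)): monotonic, bounded by 0 and 1."""
    return 1.0 / (1.0 + np.exp(-x))

def hyperbolic_tangent(x):
    """Hyperbolic tangent: monotonic, bounded by -1 and 1."""
    return np.tanh(x)

def gaussian_rbf(x, center=0.0, width=1.0):
    """Gaussian-like radial basis function: not monotonic; reaches its
    maximum when x equals center (center and width are illustrative)."""
    return np.exp(-((x - center) / width) ** 2)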
Another Mealy NSM is the one defined by Robinson and Fallside (1991) under the name of recurrent error propagation network, a first-order DTRNN whose next-state function has an $i$-th coordinate ($i = 1, \ldots, n_X$) given by

\[
x_i[t] = g\!\left( \sum_{j=1}^{n_X} W^{xx}_{ij}\, x_j[t-1] + \sum_{k=1}^{n_U} W^{xu}_{ik}\, u_k[t] + W^x_i \right) \qquad (4.14)
\]
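For comparison with the second-order sketch above, a first-order update in the style of (4.14) could be written as below; again, the function and weight names and the use of tanh as the activation are assumptions made for illustration.

import numpy as np

def next_state_first_order(x_prev, u, W_xx, W_xu, W_x, g=np.tanh):
    """One step of a first-order next-state function in the style of (4.14).

    x_prev : (n_X,)      previous state x[t-1]
    u      : (n_U,)      current input u[t]
    W_xx   : (n_X, n_X)  state-to-state weights
    W_xu   : (n_X, n_U)  input-to-state weights
    W_x    : (n_X,)      biases
    """
    # x_i[t] = g( sum_j W_xx[i, j] * x_prev[j] + sum_k W_xu[i, k] * u[k] + W_x[i] )
    return g(W_xx @ x_prev + W_xu @ u + W_x)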