Other architectures without hidden state
There are a number of discrete-time neural network
architectures
that do not have a
hidden state (their
state is observable because it is simply a combination of
past inputs and past outputs) but may still be classified as
recurrent. One such example is the NARX (Nonlinear Auto-Regressive
with eXogenous inputs) network used by
Narendra and Parthasarathy (1990) and then later by
Lin et al. (1996) and Siegelmann et al. (1996)
(see also Haykin, 1998, p. 746), which may be formulated in
state-space form by defining a state that is simply a window of the
last $n_u$ inputs and a window of the last $n_y$ outputs. Accordingly,
the next-state function simply incorporates the new input (discarding the oldest
one) and the freshly computed output (discarding the oldest one) into the
corresponding windows and shifts each one of them one position. The
components of the state vector $\mathbf{x}[t]$
are distributed as follows:
- The first $n_u n_U$ components are allocated to the window of the last $n_u$
inputs: $u_j[t-k+1]$ ($j=1,\ldots,n_U$; $k=1,\ldots,n_u$) is stored in
$x_{(k-1)n_U+j}[t]$;
- The components from $n_u n_U+1$ to $n_u n_U+n_y n_Y$ are allocated to
the window of the last $n_y$ outputs: $y_j[t-k+1]$ ($j=1,\ldots,n_Y$; $k=1,\ldots,n_y$)
is stored in $x_{n_u n_U+(k-1)n_Y+j}[t]$.
The next-state function performs, therefore, the following operations:
- Incorporating the new input and shifting past inputs:

$x_j[t] = u_j[t]$, $j=1,\ldots,n_U$; $\quad x_{(k-1)n_U+j}[t] = x_{(k-2)n_U+j}[t-1]$, $j=1,\ldots,n_U$, $k=2,\ldots,n_u$   (4.16)

- Shifting past outputs:

$x_{n_u n_U+(k-1)n_Y+j}[t] = x_{n_u n_U+(k-2)n_Y+j}[t-1]$, $j=1,\ldots,n_Y$, $k=2,\ldots,n_y$   (4.17)

- Computing the new state components using an intermediate hidden
layer of $n_H$ units:

$x_{n_u n_U+j}[t] = g\Bigl(\sum_{i=1}^{n_H} W^{xz}_{ji} z_i[t] + b^x_j\Bigr)$, $j=1,\ldots,n_Y$   (4.18)

with

$z_i[t] = g\Bigl(\sum_{k=1}^{n_X} W^{zx}_{ik} x_k[t-1] + \sum_{m=1}^{n_U} W^{zu}_{im} u_m[t] + b^z_i\Bigr)$, $i=1,\ldots,n_H$,   (4.19)

where $n_X = n_u n_U + n_y n_Y$ and $g$ is the activation function.
The output function is then simply

$y_j[t] = x_{n_u n_U+j}[t]$   (4.20)

with $j=1,\ldots,n_Y$. Note that the output is computed by a
two-layer feedforward neural network. The
operation of a NARX network may then be summarized as follows (see figure 3.3):

$\mathbf{y}[t] = h\bigl(\mathbf{u}[t], \mathbf{u}[t-1], \ldots, \mathbf{u}[t-n_u], \mathbf{y}[t-1], \ldots, \mathbf{y}[t-n_y]\bigr)$   (4.21)
Its operation is therefore a nonlinear counterpart of that of an ARMA
(Auto-Regressive, Moving-Average) model or of an IIR (Infinite Impulse Response) filter.
Figure 3.3:
Block diagram of a NARX network (the network is fully
connected, but not all arrows have been drawn, for clarity).
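To make the state-space formulation of equations (4.16)-(4.21) concrete, here is a minimal Python/NumPy sketch of one NARX time step. The function and variable names, the array shapes, and the choice of $g=\tanh$ are illustrative assumptions rather than part of the formulation above.

    import numpy as np

    def narx_step(u_window, y_window, u_new, Wzx, Wzu, bz, Wxz, bx, g=np.tanh):
        # u_window: (n_u, n_U) last n_u inputs, most recent first
        # y_window: (n_y, n_Y) last n_y outputs, most recent first
        # u_new:    (n_U,) current input u[t]
        # Previous state x[t-1] is the concatenation of the two windows
        x_prev = np.concatenate([u_window.ravel(), y_window.ravel()])
        # Hidden layer fed by the previous state and the new input (cf. eq. 4.19)
        z = g(Wzx @ x_prev + Wzu @ u_new + bz)
        # Freshly computed output y[t] (cf. eqs. 4.18 and 4.20)
        y_new = g(Wxz @ z + bx)
        # Shift the windows: insert the new input/output, discard the oldest ones
        u_window = np.vstack([u_new, u_window[:-1]])
        y_window = np.vstack([y_new, y_window[:-1]])
        return u_window, y_window, y_new

Processing a whole sequence then amounts to calling narx_step once per input vector and collecting the successive values of y_new.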
When the state of the discrete-time neural network is simply a window of
past inputs, we have a network usually called a time-delay neural
network (TDNN) (see also Haykin, 1998, p. 641). In state-space formulation, the
state is simply the window of the last $n_u$ inputs, allocated as in the NARX
case above, and the next-state function
simply incorporates the new input into the window and shifts it one
position in time:

$x_j[t] = u_j[t]$, $j=1,\ldots,n_U$; $\quad x_{(k-1)n_U+j}[t] = x_{(k-2)n_U+j}[t-1]$, $j=1,\ldots,n_U$, $k=2,\ldots,n_u$,   (4.22)

with $n_X = n_u n_U$; and the
output is usually computed by a two-layer perceptron (feedforward
net):

$y_j[t] = g\Bigl(\sum_{i=1}^{n_H} W^{yz}_{ji} z_i[t] + b^y_j\Bigr)$, $j=1,\ldots,n_Y$,   (4.23)

with

$z_i[t] = g\Bigl(\sum_{k=1}^{n_X} W^{zx}_{ik} x_k[t-1] + \sum_{m=1}^{n_U} W^{zu}_{im} u_m[t] + b^z_i\Bigr)$, $i=1,\ldots,n_H$.   (4.24)
The operation of a TDNN may then be summarized as follows
(see figure 3.4):

$\mathbf{y}[t] = h\bigl(\mathbf{u}[t], \mathbf{u}[t-1], \ldots, \mathbf{u}[t-n_u]\bigr)$   (4.25)

Its operation is therefore a nonlinear counterpart of that of an MA
(Moving-Average) model or of an FIR (Finite Impulse
Response) filter.
Figure 3.4:
Block diagram of a TDNN (the network is fully
connected, but not all arrows have been drawn, for clarity).
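For comparison with the NARX sketch above, the following Python/NumPy sketch performs one TDNN time step under the same illustrative assumptions (names, shapes and $g=\tanh$ are not from the original text).

    import numpy as np

    def tdnn_step(u_window, u_new, Wzx, Wzu, bz, Wyz, by, g=np.tanh):
        # u_window: (n_u, n_U) window of past inputs -- the whole state
        x_prev = u_window.ravel()
        # Hidden layer over the past window and the new input (cf. eq. 4.24)
        z = g(Wzx @ x_prev + Wzu @ u_new + bz)
        # Output of the two-layer perceptron (cf. eq. 4.23)
        y = g(Wyz @ z + by)
        # Shift the window: insert u[t], discard the oldest input (cf. eq. 4.22)
        u_window = np.vstack([u_new, u_window[:-1]])
        return u_window, y

The only difference with respect to the NARX step is that no window of past outputs is kept, so the output feeds back into nothing.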
The weights
connecting the window of inputs to the hidden layer may be organized in
blocks sharing weight
values, so that the components of the hidden layer retain some of the temporal ordering in
the input window. TDNNs have been used for tasks
such as phonetic transcription
(Sejnowski and Rosenberg, 1987), protein secondary
structure prediction (Qian and Sejnowski, 1988), and phoneme recognition
(Waibel et al., 1989; Lang et al., 1990). Clouse et al. (1997b) have studied the ability
of TDNNs to represent and learn a class of finite-state
recognizers from examples (see also
Clouse et al., 1997a; Clouse et al., 1994).
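The block-wise weight sharing mentioned at the start of the previous paragraph amounts to applying one and the same block of input-to-hidden weights at every temporal position of the input window, i.e. a one-dimensional convolution over time. A minimal sketch, in which the names, the block width and the shapes are assumptions made for illustration only:

    import numpy as np

    def shared_block_hidden(u_window, W_block, b, g=np.tanh):
        # u_window: (n_u, n_U) window of past inputs
        # W_block:  (n_H, width, n_U) one block of weights, reused at every
        #           temporal position of the window (a 1-D convolution over time)
        n_H, width, _ = W_block.shape
        W_flat = W_block.reshape(n_H, -1)
        z = []
        for k in range(u_window.shape[0] - width + 1):
            patch = u_window[k:k + width].ravel()  # width consecutive time steps
            z.append(g(W_flat @ patch + b))
        # Rows of the result preserve the temporal ordering of the input window
        return np.stack(z)

Because the same W_block is reused at every position, the hidden activations inherit the temporal ordering of the window, which is the property exploited in the phoneme-recognition TDNNs cited above.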