When neural networks are used for sequence processing, the most general architecture is a recurrent neural network, that is, a neural network in which the output of some units is fed back as an input to others; for generality, unit outputs are allowed to take any real value in a given interval instead of just the two characteristic values of linear threshold units. Since sequences are discrete in nature (that is, they consist of data indexed by integers), the processing occurs in discrete steps, as if the network were driven by an external clock, and each neuron is assumed to compute its output instantaneously, hence the name discrete-time recurrent neural networks. There is another wide class of recurrent neural networks in which inputs and outputs are functions of a continuous time variable and the temporal response of each neuron (relating its state to its inputs) is described by a differential equation in time (Pineda, 1987). These networks are aptly called continuous-time recurrent neural networks (for an excellent review, see Pearlmutter (1995)).
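For concreteness, the following minimal Python sketch (an illustration, not part of the original text) implements one clock tick of a first-order discrete-time recurrent network with real-valued logistic units; the names `dtrnn_step` and `sigmoid`, the weight shapes, and the random toy sequence are all assumptions made for the example.

```python
# A sketch of one step of a first-order discrete-time recurrent network:
# all unit outputs are real-valued in (0, 1), and the whole state is
# updated instantaneously at each tick of an external clock.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def dtrnn_step(x_prev, u, W_xx, W_xu, b):
    """Compute x[t] = g(W_xx x[t-1] + W_xu u[t] + b).

    x_prev : state vector at time t-1
    u      : input vector at time t
    """
    return sigmoid(W_xx @ x_prev + W_xu @ u + b)

# Drive the network with a discrete input sequence, one tick per item.
rng = np.random.default_rng(0)
n_x, n_u = 4, 2
W_xx = rng.normal(size=(n_x, n_x))
W_xu = rng.normal(size=(n_x, n_u))
b, x = rng.normal(size=n_x), np.zeros(n_x)
for u in rng.normal(size=(5, n_u)):   # a sequence of 5 input vectors
    x = dtrnn_step(x, u, W_xx, W_xu, b)
```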
Discrete-time recurrent neural networks (DTRNN) are adaptive, state-based sequence processors that may be applied to any of the four broad classes of sequence processing tasks mentioned in section 3.1: in sequence classification, the output of the DTRNN is examined only at the end of the sequence; in synchronous sequence transduction tasks, the DTRNN produces a temporal sequence of outputs corresponding to the sequence of inputs it is processing; in sequence continuation or prediction tasks, the output of the DTRNN after it has seen an input sequence may be interpreted as a continuation of that sequence; finally, in sequence generation tasks, a constant input, or no input at all, is applied at each cycle to generate a sequence of outputs.
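The following Python sketch (illustrative only; the toy tanh network and the helper `run_dtrnn` are assumptions, not taken from the text) makes the four usage modes concrete by driving one and the same network in the four ways just described.

```python
import numpy as np

def run_dtrnn(inputs, step, readout, x0):
    """Iterate a DTRNN over a sequence; return the final state and all outputs."""
    x, ys = x0, []
    for u in inputs:
        x = step(x, u)
        ys.append(readout(x))
    return x, ys

# A toy network, just to make the four usage modes concrete.
rng = np.random.default_rng(1)
W, V, C = rng.normal(size=(3, 3)), rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
step = lambda x, u: np.tanh(W @ x + V @ u)
readout = lambda x: C @ x
x0, seq = np.zeros(3), rng.normal(size=(6, 2))

x_final, ys = run_dtrnn(seq, step, readout, x0)
label = readout(x_final)       # classification: examine output only at the end
transduced = ys                # synchronous transduction: one output per input
prediction = ys[-1]            # continuation/prediction: last output continues the sequence
_, generated = run_dtrnn(np.zeros((6, 2)), step, readout, x_final)  # generation: constant (zero) input
```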
In this document, it has been found convenient to view discrete-time recurrent neural networks (see Haykin (1998), ch. 15; Hertz et al. (1991), ch. 7; Hush and Horne (1993); Tsoi and Back (1997)) as neural state machines (NSM), and to define them in a way that parallels the definitions of Mealy and Moore machines given in section 2.3. This parallelism is inspired by the relationship established by Pollack (1991) between deterministic finite-state automata (DFA) and a class of second-order DTRNN, under the name of dynamical recognizers.
A neural state machine is a six-tuple

\[
N = (X, U, Y, f, h, \mathbf{x}_0),
\tag{4.5}
\]

where $X$, $U$, and $Y$ are respectively the state, input, and output spaces, now real-valued sets, $f$ is the next-state function, $h$ is the output function, and $\mathbf{x}_0 \in X$ is the initial state. The state at time $t$ is computed from the previous state and the current input by

\[
\mathbf{x}[t] = f(\mathbf{x}[t-1], \mathbf{u}[t]),
\tag{4.6}
\]

and the output, as in a Mealy machine, by

\[
\mathbf{y}[t] = h(\mathbf{x}[t-1], \mathbf{u}[t]).
\tag{4.7}
\]
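A minimal Python sketch of such a neural state machine can be read directly off equations (4.5)-(4.7); realizing $f$ and $h$ as single-layer logistic networks is an assumption made for the example (Pollack's dynamical recognizers use second-order connections instead), as are the class and weight names.

```python
import numpy as np

def logistic(a):
    return 1.0 / (1.0 + np.exp(-a))

class NeuralStateMachine:
    """A Mealy NSM N = (X, U, Y, f, h, x0); f and h are single-layer logistic nets."""

    def __init__(self, W_xx, W_xu, W_yx, W_yu, x0):
        self.W_xx, self.W_xu = W_xx, W_xu   # next-state weights (state, input)
        self.W_yx, self.W_yu = W_yx, W_yu   # output weights (state, input)
        self.x = x0                         # initial state x_0, a point of X

    def f(self, x, u):
        # Next-state function, eq. (4.6): x[t] = f(x[t-1], u[t])
        return logistic(self.W_xx @ x + self.W_xu @ u)

    def h(self, x, u):
        # Mealy output function, eq. (4.7): y[t] = h(x[t-1], u[t])
        return logistic(self.W_yx @ x + self.W_yu @ u)

    def tick(self, u):
        # One clock step: emit the output, then update the state instantaneously.
        y = self.h(self.x, u)
        self.x = self.f(self.x, u)
        return y

# Usage: process a short random input sequence, one output per tick.
rng = np.random.default_rng(2)
n_x, n_u, n_y = 3, 2, 1
nsm = NeuralStateMachine(rng.normal(size=(n_x, n_x)), rng.normal(size=(n_x, n_u)),
                         rng.normal(size=(n_y, n_x)), rng.normal(size=(n_y, n_u)),
                         np.zeros(n_x))
outputs = [nsm.tick(u) for u in rng.normal(size=(5, n_u))]
```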