The paper by Pollack, ``Recursive distributed representations'' (http://www.dlsi.ua.es/~mlf/nnafmc/papers/pollack90recursive.pdf), introduces a new architecture, nowadays called the recursive auto-associative memory (RAAM). When used to process sequences, the system is basically a pair of discrete-time recurrent neural networks: the encoder (or compressor) and the decoder (or reconstructor). The encoder is trained to produce a different final state vector for each sequence, so that the trained decoder may trace back the steps and retrieve the sequences from the states; therefore, a distributed representation of sequences is achieved. But RAAMs are more general devices, because they may be used to obtain distributed representations not only of sequences but also of trees with a maximum valence M.
A RAAM, that is, a recursive auto-associative memory (Pollack, 1990), with valence M is a tree-storing device composed of two different subsystems:
- the encoder, where X, a subset of [S0, S1]^nX with S0 and S1 real numbers, is the state space of the RAAM, nX the order (number of state units), U, a subset of [S0, S1]^nU, the set of possible inputs, with nU the number of input signals, and f : U x X^M -> X is the encoding function;
- the decoder, where h : X -> U is the output function and the functions d_1, ..., d_M, with d_i : X -> X, are the decoding functions.
Encoding, decoding, and output functions are realized in a RAAM by feedforward neural networks, usually single-layer feedforward neural networks with neurons whose outputs are in the interval [S0, S1]. The usual choices for S0 and S1 are 0 and 1 when the activation function of the neurons is the logistic function g(x) = 1/(1 + exp(-x)).
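As a rough illustration of these building blocks, the following is a minimal sketch (not the paper's implementation; the sizes, names, and random weights are assumptions) of a single-layer encoder and decoder for a valence-2 RAAM with logistic units:

```python
import numpy as np

# Hypothetical sizes: valence M = 2, order n = 10 state units.
rng = np.random.default_rng(0)
M, n = 2, 10

def logistic(x):
    # activation keeping unit outputs in (0, 1), i.e. S0 = 0, S1 = 1
    return 1.0 / (1.0 + np.exp(-x))

# single-layer encoder: maps M daughter states (M*n inputs) to one parent state
W_enc = rng.normal(scale=0.1, size=(n, M * n))
b_enc = np.zeros(n)

def encode(daughters):
    # daughters: list of M state vectors, concatenated into one input vector
    return logistic(W_enc @ np.concatenate(daughters) + b_enc)

# single-layer decoder: maps one parent state back to M candidate daughter states
W_dec = rng.normal(scale=0.1, size=(M * n, n))
b_dec = np.zeros(M * n)

def decode(state):
    out = logistic(W_dec @ state + b_dec)
    return [out[i * n:(i + 1) * n] for i in range(M)]

parent = encode([rng.random(n), rng.random(n)])
kids = decode(parent)
```

In an actual RAAM the two networks would be trained jointly, auto-associatively, so that decode(encode(daughters)) reproduces the daughters; here the weights are untrained and only the shapes of the computation are shown.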
RAAMs may be used as tree-storing devices which store trees of maximum valence M as follows:
- A special value of X, which we will call nil, is used to represent the missing daughters of a node in an input tree when there are fewer than M daughters.
- Each possible node label in the set of tree node labels is assigned a value in U.
- For each of the decoding functions d_i, a special region of X, which will be called the nil region, is defined such that, when the output of the decoding function is in that region, the RAAM is interpreted as designating a missing daughter (for a node having fewer than M daughters). This is necessary for the RAAM to produce finite-sized trees as outputs (they have to end in nodes having no daughters).
- Each possible output node label is also assigned a nonempty region of U such that, when the result of the output function is in that region, the RAAM is interpreted as outputting a node with that label.
- The encoder walks the input tree in a bottom-up fashion, computing a state value in X for each possible subtree from the state values corresponding to its daughter subtrees; that is, it encodes the tree as a vector in X.
- The decoder generates the output tree in a top-down fashion, producing, from the state representation of each output node, suitable state representations in X for its daughter nodes and suitable labels in U; that is, it decodes the vector obtained by the encoder to produce a tree.
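The two walks above can be sketched as follows for a valence-2 RAAM with untrained random weights; this is a simplified illustration, not the paper's procedure (labels are handled crudely by using leaf label vectors directly as states, and the names nil, is_nil, and the depth bound are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                                   # order of the RAAM

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

W_enc = rng.normal(scale=0.1, size=(n, 2 * n))   # single-layer encoder
W_dec = rng.normal(scale=0.1, size=(2 * n, n))   # single-layer decoder

nil = np.zeros(n)          # special state standing for a missing daughter

def is_nil(v, eps=0.2):
    # the "nil region": a decoded daughter close enough to nil ends a branch
    return np.linalg.norm(v - nil) < eps

def encode_tree(tree):
    # bottom-up walk: leaves carry their label vectors, internal nodes
    # are (left, right) pairs; a missing daughter is encoded as nil
    if tree is None:
        return nil
    if isinstance(tree, np.ndarray):
        return tree                     # leaf: its label vector is its code
    left, right = tree
    return logistic(W_enc @ np.concatenate(
        [encode_tree(left), encode_tree(right)]))

def decode_state(state, depth=3):
    # top-down walk; a trained RAAM would stop in the nil region, but with
    # random weights we also bound the depth so the recursion terminates
    if depth == 0:
        return None
    out = logistic(W_dec @ state)
    left, right = out[:n], out[n:]
    return (None if is_nil(left) else decode_state(left, depth - 1),
            None if is_nil(right) else decode_state(right, depth - 1))

leaf = lambda: rng.random(n)            # stand-in label vectors
root = encode_tree((leaf(), (leaf(), leaf())))
tree_out = decode_state(root)
```

With trained weights, tree_out would recover the shape (and, via the output function, the labels) of the encoded tree; here it merely shows how the decoder unfolds a single state vector back into a branching structure.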
Pollack (1990) trained RAAMs to store the trees in a learning set. It should be noted that, in principle, RAAMs may be used to store trees even when labels are not taken from a finite alphabet of symbols but instead consist of arbitrary vectors in U; Pollack (1990), however, emphasizes symbolic computations.
RAAMs have been used for various tasks, most of them related to language processing:
- For translating sentences from one language to another, by training RAAMs to represent the source and target sentences and then either training a multilayer perceptron to obtain the RAAM representation for the target sentence from the RAAM representation of the source sentence (Chalmers, 1990) or training the two RAAMs so that the corresponding representations are identical (Chrisman, 1991).
- More recently, RAAMs have been extended to RHAMs (recursive hetero-associative memories), which learn to obtain representations of input trees that are directly decoded into a different output tree (Forcada and Ñeco, 1997).
- Kwasny and Kalman (1995) have used Elman (1990) nets to obtain, from a sentence, a RAAM representation of its parse tree.
- Sperduti (1994) has introduced labeling RAAMs, which may be used to store directed labeled graphs (Sperduti and Starita, 1995; Sperduti, 1994, 1995).
Debian User
2002-01-21