
Zeng et al. (1994) describe a method, presented in preliminary form in (Zeng et al., 1993), to use and train a second-order DTRNN such as the one used by Giles et al. (1992), both without and with an external stack, so that stable finite-state or pushdown-automaton behavior is ensured. The method has two basic ingredients: (a) a discretization function

\begin{displaymath}
D(x)=\left\{\begin{array}{ll}
0.8 & \mbox{if } x>0.5 \\
0.2 & \mbox{otherwise}
\end{array}\right. ,
\end{displaymath}

which is applied after the sigmoid function when computing the new state ${\bf x}[t]$ of the DTRNN, and (b) a pseudo-gradient learning method, which may be intuitively described as follows: the RTRL formulas are written for the corresponding second-order DTRNN without the discretization function (as in (Giles et al., 1992)), but are evaluated using the discretized states. The resulting algorithm is investigated empirically to characterize its learning behavior; the conclusion is that, even though it does not guarantee a reduction of the error at each step, the algorithm is able to train the DTRNN to perform the task. One advantage of the discretization is that FSM extraction becomes trivial: each FSM state is represented by a single point in state space. Special error functions and learning strategies are used when the DTRNN manipulates an external stack to recognize a subset of the context-free languages (the stack alphabet is taken to be the same as the input alphabet, and transitions consuming no input are not allowed; unlike Giles et al. (1990), these authors use a discrete external stack).
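The following Python sketch illustrates the two ingredients together: a second-order state update discretized by $D$ after the sigmoid, and a pseudo-gradient RTRL step in which the sensitivities are computed as if $D$ were absent but are evaluated with the discretized states. The class and variable names, the single accept/reject state unit, the per-string update, and the learning rate are illustrative assumptions, not details taken from Zeng et al. (1994).

\begin{verbatim}
import numpy as np

def g(a):                           # sigmoid activation
    return 1.0 / (1.0 + np.exp(-a))

def D(x):                           # discretization, applied after the sigmoid
    return np.where(x > 0.5, 0.8, 0.2)

class PseudoGradientDTRNN:
    def __init__(self, n_states, n_symbols, lr=0.5, seed=0):
        rng = np.random.default_rng(seed)
        # Second-order weights W[i, j, k]: state i <- state j, input symbol k
        self.W = rng.uniform(-1.0, 1.0, (n_states, n_states, n_symbols))
        self.lr = lr

    def process(self, string, target):
        """string: sequence of one-hot input vectors; target: desired
        value of state unit 0 at the end of the string."""
        n_states = self.W.shape[0]
        x = np.full(n_states, 0.2)
        x[0] = 0.8                  # discrete initial state
        # p[i, l, m, n] ~ dx_i/dW_lmn, with D treated as the identity
        p = np.zeros((n_states,) + self.W.shape)
        for u in string:
            net = np.einsum('ijk,j,k->i', self.W, x, u)
            gp = g(net) * (1.0 - g(net))
            # RTRL recurrence written WITHOUT the discretization ...
            new_p = np.einsum('ijk,k,jlmn->ilmn', self.W, u, p)
            for i in range(n_states):
                new_p[i, i] += np.outer(x, u)   # direct term: delta_il x_m u_n
                new_p[i] *= gp[i]
            p = new_p
            x = D(g(net))           # ... but the network runs WITH it
        err = x[0] - target         # unit 0 plays the accept/reject role
        self.W -= self.lr * err * p[0]          # pseudo-gradient update
        return x[0]
\end{verbatim}

With the targets set to the two discretization levels (e.g. 0.8 for strings in the language, 0.2 otherwise), the error in this sketch is exactly zero whenever the discretized output is already correct, so weights only move on misclassified strings; the per-string update shown here is just one simple choice of learning strategy.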
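For the pushdown case, the description above fixes only a few facts: the external stack is discrete, its alphabet equals the input alphabet, and every transition consumes an input symbol. A minimal sketch of such a discrete stack, driven by a thresholded action value, might look as follows; the action encoding and the one-hot reading of the top symbol are assumptions made here for illustration, not the authors' design.

\begin{verbatim}
import numpy as np

class DiscreteStack:
    """Discrete external stack over the input alphabet (symbols 0..n-1)."""

    def __init__(self, n_symbols):
        self.n_symbols = n_symbols
        self.items = []

    def apply(self, action, symbol):
        # Hypothetical encoding: a discretized action value > 0.5 pushes
        # the current input symbol; otherwise the top symbol is popped.
        if action > 0.5:
            self.items.append(symbol)
        elif self.items:
            self.items.pop()
        else:
            raise ValueError('pop on empty stack: the string is rejected')

    def reading(self):
        # One-hot encoding of the top symbol (all zeros if empty), fed
        # back to the DTRNN as an extra input at the next time step.
        r = np.zeros(self.n_symbols)
        if self.items:
            r[self.items[-1]] = 1.0
        return r
\end{verbatim}

A pop attempted on an empty stack is treated here as rejection of the string; handling such situations during training is one plausible role for the special error functions and learning strategies mentioned above.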

