#Cle89a###:

Next: #pollack91j###: Up: Inference of finite-state machines Previous: Inference of finite-state machines Contents Index

Cleeremans et al. (1989):

this paper describes how an Elman (1990) simple recurrent network may be trained to read a string and predict the next symbol . When a network is trained to predict the next symbol, a one-hot, exclusive, or local interpretation (one unit per symbol, high when the symbol is present and low otherwise) is used for the alphabet, and a quadratic error function is used, it is the case that the actual outputs of the DTRNN after having seen a string are a good approximation to the frequencies observed for each of the successors of that string in the learning set. If the learning set is interpreted to be a representative draw from a given probability distribution over strings, then the outputs may actually be interpreted as probabilities, and the DTRNN may be used as a generator having a similar probability distribution.

However, Cleeremans et al. (1989) do not use the trained DTRNN as a generator but rather as an acceptor for the language (they explored two languages). A string is accepted if each of its successors is predicted with a probability higher than a threshold, which the authors set to 0.3^6.2. For the simplest language, using 70,000 random strings, of which only 210 were grammatical, the network only accepted the grammatical ones. Also, the network accepted all string in a set of 20,000 random grammatical strings. The second language modelled long-term dependencies: the strings started and ended with the same symbol; if the probabilities of intervening strings were independent of that symbol, the network failed to learn the language; however, when the distribution of intervening strings depended even if very slightly on the initial symbol, then the DTRNN learned the language.

The authors also perform a hierarchical clustering of the observed values of the hidden units and observe clusters that correspond to the states in the automata defining the languages.

Next: #pollack91j###: Up: Inference of finite-state machines Previous: Inference of finite-state machines Contents Index

Debian User 2002-01-21