Discrete-time recurrent neural networks for grammatical inference
This chapter is concerned with the use of discrete-time recurrent
neural networks (DTRNN) for grammatical inference. DTRNN may be used
as sequence processors in three main modes:
- Neural acceptors/recognizers: DTRNN may be trained to accept strings belonging to a language and to reject strings not belonging to it, by producing suitable labels after the whole string has been processed. In view of the computational equivalence between some DTRNN architectures and some classes of finite-state machines (FSM), it is reasonable to expect DTRNN to learn regular (finite-state) languages. A set of neural acceptors (separately or merged into a single DTRNN) may be used as a neural classifier.
- Neural transducers/translators: If the output of the DTRNN is examined not only at the end of the string but also after processing each of the symbols in the input, then its output may be interpreted as a synchronous, sequential transduction (translation) of the input string. DTRNN may easily be trained to perform synchronous sequential transductions, and also some asynchronous transductions.
- Neural predictors: DTRNN may be trained to predict the next symbol of strings in a given language. After reading a prefix of a string, the trained DTRNN outputs a mixture of the possible successor symbols; under certain conditions (see, e.g., Elman (1990)), the output of the DTRNN may be interpreted as the probabilities of each of the possible successors in the language. In this last case, the DTRNN may be used as a probabilistic generator of strings. (All three modes are illustrated by the sketch following this list.)
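To make the three modes concrete, the following sketch (Python with NumPy) runs a single first-order, Elman-style DTRNN over a string and reads its output in the three ways just described: the output after the last symbol as an accept/reject label, the sequence of outputs as a synchronous transduction, and a renormalized output vector as a distribution over successor symbols. The two-symbol alphabet, the network sizes, the untrained random weights, the example string "abba", the 0.5 acceptance threshold and the softmax renormalization are all illustrative assumptions, not details taken from the text.

import numpy as np

alphabet = ['a', 'b']
nU, nX, nY = len(alphabet), 3, len(alphabet)    # input lines, state units, output units

rng = np.random.default_rng(0)
Wxx = rng.normal(size=(nX, nX))                 # state -> state weights
Wxu = rng.normal(size=(nX, nU))                 # input -> state weights
Wyx = rng.normal(size=(nY, nX))                 # state -> output weights
x0 = np.zeros(nX)                               # initial state

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def one_hot(symbol):
    u = np.zeros(nU)
    u[alphabet.index(symbol)] = 1.0
    return u

def run(string):
    """Process a string symbol by symbol, recording the output after each symbol."""
    x, outputs = x0, []
    for s in string:
        x = sigmoid(Wxx @ x + Wxu @ one_hot(s))  # next state
        outputs.append(sigmoid(Wyx @ x))         # output after this symbol
    return outputs

outputs = run("abba")

# 1. Acceptor/recognizer: look only at the output after the whole string,
#    e.g. accept if the first output unit is above 0.5.
print("accept" if outputs[-1][0] > 0.5 else "reject")

# 2. Transducer/translator: read one output symbol per input symbol
#    (a synchronous, sequential transduction).
print([alphabet[int(np.argmax(y))] for y in outputs])

# 3. Predictor: read each output, here renormalized with a softmax,
#    as a distribution over the possible successor symbols.
def softmax(y):
    e = np.exp(y - y.max())
    return e / e.sum()
print([np.round(softmax(y), 2) for y in outputs])

Because the weights are untrained, the printed label, transduction and distributions are arbitrary; the point is only that the same forward pass supports all three readings.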
When DTRNN are used for grammatical inference, the following have to
be defined:
- A learning set. The learning set may contain: strings labeled as belonging or not to a language, or as belonging to a class in a finite set of classes (recognition/classification task); a sample of unlabeled strings, possibly with repetitions, drawn according to a given probability distribution (prediction/generation task); or pairs of strings (translation/transduction task).
- An encoding for input symbols
as input signals for the DTRNN. This defines the number of input
lines of the DTRNN.
- An interpretation for outputs: as labels, probabilities for
successor symbols or
transduced symbols. This defines the number of output units of
the DTRNN.
- A suitable DTRNN architecture, including the number of state units and the number of units in any other hidden layers.
- Initial values for the learnable parameters of the DTRNN (weights, biases and initial
states).
- A learning algorithm (including a suitable error function and a suitable stopping criterion) and a presentation scheme (the whole learning set may be presented from the beginning, or a staged presentation may be devised). A sketch putting these ingredients together follows this list.
- A mechanism for extracting an automaton or grammar rules from the weights of the DTRNN. This will be discussed in detail in section 5.4.
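As an illustration of how these elements fit together for a recognition task, the sketch below (Python with NumPy) sets up a small labeled learning set, a one-hot input encoding, an Elman-style architecture with a single sigmoid acceptance output, randomly initialized parameters, and gradient-descent training with a quadratic error function, gradients computed by backpropagation through time, and an error-threshold stopping criterion. The toy language, network size, learning rate and tolerance are illustrative assumptions; automaton extraction (the last item above) is not shown.

import numpy as np

# 1. Learning set: strings over {a, b}, labeled 1 if they contain no 'b'
#    (a toy regular language chosen only for illustration).
alphabet = ['a', 'b']
learning_set = [("", 1), ("a", 1), ("aa", 1), ("aaa", 1), ("aaaa", 1), ("aaaaa", 1),
                ("b", 0), ("ab", 0), ("ba", 0), ("aba", 0), ("bb", 0), ("aab", 0)]

# 2. Input encoding: one one-hot input line per alphabet symbol.
def one_hot(symbol):
    u = np.zeros(len(alphabet))
    u[alphabet.index(symbol)] = 1.0
    return u

# 3/4. Output interpretation and architecture: an Elman-style network with nX
#      state units and one sigmoid output unit, read after the last symbol
#      (close to 1 means "accept", close to 0 means "reject").
nU, nX = len(alphabet), 4

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# 5. Initial values: small random weights, zero biases, a fixed initial state.
rng = np.random.default_rng(0)
Wxx = rng.normal(scale=0.5, size=(nX, nX))   # state -> state
Wxu = rng.normal(scale=0.5, size=(nX, nU))   # input -> state
bx  = np.zeros(nX)
wy  = rng.normal(scale=0.5, size=nX)         # state -> output
by  = 0.0
x0  = np.zeros(nX)

def forward(string):
    """Run the network over a string; return the state sequence and the final output."""
    xs = [x0]
    for s in string:
        xs.append(sigmoid(Wxx @ xs[-1] + Wxu @ one_hot(s) + bx))
    y = sigmoid(wy @ xs[-1] + by)
    return xs, y

# 6. Learning algorithm: gradient descent on a quadratic error function, with
#    gradients obtained by backpropagation through time (BPTT), the whole
#    learning set presented every epoch, and an error-threshold stopping criterion.
def train(epochs=3000, eta=0.5, tol=0.05):
    global Wxx, Wxu, bx, wy, by
    for epoch in range(epochs):
        total_error = 0.0
        for string, d in learning_set:
            xs, y = forward(string)
            total_error += 0.5 * (y - d) ** 2
            gy = (y - d) * y * (1.0 - y)              # error signal at the output unit
            gWy, gBy = gy * xs[-1], gy
            gX = gy * wy                              # dE/dx_T
            gWxx = np.zeros_like(Wxx)
            gWxu = np.zeros_like(Wxu)
            gBx = np.zeros_like(bx)
            for t in range(len(string), 0, -1):       # walk back through time
                ga = gX * xs[t] * (1.0 - xs[t])       # through the state sigmoid
                gWxx += np.outer(ga, xs[t - 1])
                gWxu += np.outer(ga, one_hot(string[t - 1]))
                gBx += ga
                gX = Wxx.T @ ga                       # dE/dx_{t-1}
            Wxx -= eta * gWxx; Wxu -= eta * gWxu; bx -= eta * gBx
            wy -= eta * gWy; by -= eta * gBy
        if total_error < tol:                         # stopping criterion
            return epoch, total_error
    return epochs, total_error

print("epochs, final error:", train())
for string, d in learning_set:
    print(repr(string), "->", round(float(forward(string)[1]), 2), "target", d)

Here the whole learning set is presented from the start; a staged presentation would simply enlarge learning_set over the epochs. Convergence on this toy language is likely with these settings but not guaranteed.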