One of the most interesting questions that arises when using DTRNN for grammatical inference is expressed in the title of this section. Whereas chapter 4 showed that certain DTRNN architectures may actually perform some symbolic string-processing tasks because they behave like the corresponding automata, it remains to be seen whether the available learning algorithms are capable of finding the corresponding sets of weights from examples of the task to be learned. This is because all of the learning algorithms implement a certain heuristic to search for the solution in weight space, but none of them guarantees that a solution will be found even when one exists. Some of the problems have already been mentioned in chapter 3, such as the presence of local minima that do not correspond to the solution, or the difficulty of capturing long-term dependencies along the sequences to be processed. It may be said that each learning algorithm has its own inductive bias, that is, its own preferences for certain solutions in weight space.
But even when the DTRNN appears to have learned the task from the examples, the internal representation it has acquired may not be easily interpretable in terms of grammar rules or transitions in an automaton. In fact, most learning algorithms do not force the DTRNN to acquire such a representation, which makes grammatical inference with DTRNN more difficult.
However, as we will see, in recognition and transduction tasks this problem is surprisingly infrequent: hidden states tend to cluster in certain regions of the state space, and these clusters may be interpreted as automaton states or as variables in a grammar.
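This cluster-based interpretation can be illustrated with a small sketch; all details below are hypothetical and not taken from the text. We hand-pick the weights of a tiny second-order DTRNN so that it implements the two-state parity automaton over binary strings, run it over random strings while recording every hidden state visited, and then cluster those states with a naive two-center k-means. Each cluster coincides with one automaton state, which is the kind of interpretability the passage describes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def step(h, x, W, b):
    # Second-order DTRNN update: h_i(t+1) = g(sum_jk W[i,j,k] h_j(t) x_k(t) + b_i)
    return sigmoid(np.einsum('ijk,j,k->i', W, h, x) + b)

# Hand-picked weights (hypothetical) implementing the two-state parity
# automaton, with a one-hot state h = (even, odd) and a one-hot input
# x = (symbol 0, symbol 1).
H = 20.0
W = np.zeros((2, 2, 2))
W[0, 0, 0] = H   # even --0--> even
W[0, 1, 1] = H   # odd  --1--> even
W[1, 1, 0] = H   # odd  --0--> odd
W[1, 0, 1] = H   # even --1--> odd
b = np.array([-H / 2, -H / 2])
onehot = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}

# Run the network over random binary strings, recording every hidden
# state visited together with the true automaton state (the parity).
rng = np.random.default_rng(0)
states, labels = [], []
for _ in range(100):
    h, parity = np.array([1.0, 0.0]), 0   # start in the "even" state
    for sym in rng.integers(0, 2, size=8):
        h = step(h, onehot[int(sym)], W, b)
        parity ^= int(sym)
        states.append(h.copy())
        labels.append(parity)
states, labels = np.array(states), np.array(labels)

# Naive k-means with k=2 over the visited hidden states.
centers = np.array([[1.0, 0.0], [0.0, 1.0]])
for _ in range(10):
    dist = ((states[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = dist.argmin(axis=1)
    centers = np.array([states[assign == k].mean(axis=0) for k in (0, 1)])

# Each cluster should coincide with one automaton state.
purity = max((assign == labels).mean(), (assign != labels).mean())
print("cluster centers:", centers.round(3))
print("cluster/state agreement:", purity)
```

Because the weights here are saturated by construction, the hidden states fall into two tight clusters; for a network trained by gradient descent the clusters are typically looser, which is precisely why a clustering step is needed to read an automaton out of the trained network.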