Finite-state machines may also be extracted from
DTRNN by means of clustering methods. These methods rely on the following
assumption: when the DTRNN behaves as an FSM, the points it visits when reading strings form
low-dimensional clusters in state space that correspond to the states
of the FSM. Therefore,
one may use a clustering method to discover this structure and then
define the transitions of the
FSM in
terms of the transitions observed for these
clusters. This behavior was first reported by Cleeremans et al. (1989) for a
simple grammar. Manolios and Fanelli (1994) start with randomly distributed markers that are
iteratively moved toward the closest network states until they reach
the centroid of the sets of points they are closest to; then, they
check whether the transitions between DTRNN states are compatible with
clusters being interpreted as discrete states; if not, a new set of
markers is generated and clustering starts again.
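The marker-based extraction scheme just described can be sketched in a few lines. The following is a minimal illustration, not any author's actual implementation: a hypothetical two-dimensional "DTRNN" computing the parity of 1s stands in for a trained network (its state vectors form two noisy clusters, one per FSM state, which is exactly the assumption these methods make), markers are moved to centroids in the style of Lloyd's k-means, and clustering is restarted with fresh markers whenever the observed transitions are not consistent with a deterministic FSM.

```python
import random

# Hypothetical stand-in for a trained DTRNN: it computes the parity of the 1s
# in a binary string, and its state vectors cluster around one centroid per
# FSM state -- the situation the clustering methods assume.
CENTROIDS = [(0.1, 0.1), (0.9, 0.9)]

def jitter(p, eps=0.05):
    return (p[0] + random.uniform(-eps, eps), p[1] + random.uniform(-eps, eps))

def run_dtrnn(strings):
    """Collect (state, symbol, next_state) triples while reading the strings."""
    triples = []
    for s in strings:
        q, pt = 0, jitter(CENTROIDS[0])
        for sym in s:
            q = q ^ sym                      # parity update (the "true" FSM)
            nxt = jitter(CENTROIDS[q])       # noisy state near the q-th centroid
            triples.append((pt, sym, nxt))
            pt = nxt
    return triples

def nearest(markers, p):
    return min(range(len(markers)),
               key=lambda i: (p[0] - markers[i][0]) ** 2
                           + (p[1] - markers[i][1]) ** 2)

def kmeans(points, k, iters=30):
    """Lloyd's algorithm: markers move to the centroid of their nearest points."""
    markers = random.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(markers, p)].append(p)
        markers = [(sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
                   if g else markers[i] for i, g in enumerate(groups)]
    return markers

def extract_fsm(triples, k, restarts=10):
    """Cluster the visited states, read transitions off the clusters, and
    restart with fresh markers when they are not consistent with a DFA."""
    points = [p for p, _, _ in triples] + [n for _, _, n in triples]
    for _ in range(restarts):
        markers = kmeans(points, k)
        delta, used, ok = {}, set(), True
        for p, sym, nxt in triples:
            src, tgt = nearest(markers, p), nearest(markers, nxt)
            used.update((src, tgt))
            if delta.setdefault((src, sym), tgt) != tgt:
                ok = False                   # two observed transitions disagree
                break
        if ok and used == set(range(k)):
            return delta                     # deterministic FSM over k clusters
    return None
```

With k = 2 and enough sample strings, the extracted transition table is, up to the arbitrary labeling of the two clusters, the parity machine: symbol 0 maps each cluster to itself and symbol 1 swaps the clusters.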
Gori et al. (1998) use a clustering method to extract simple approximate FSM descriptions from DTRNN trained on noisy learning sets, that is, learning sets generated by flipping the membership labels of a few strings in a clean learning set compatible with a small FSM; they find that the original FSM is sometimes recovered. Their method relies on the assumption that small DTRNN tend to form clusters corresponding to a simple finite-state description of a majority of the strings in the learning set, because an FSM corresponding exactly to the noisy learning set may be impossible to represent. They contend that approximate FSM learning may indeed be one promising application of DTRNN.
Some authors integrate clustering in the learning algorithm, as Das and Mozer (1998) do (see also Das and Das, 1991; Das and Mozer, 1994).