Much of the work performed by researchers in the area of contact between neural networks and formal language and computation theory concerns the training of sigmoid-based DTRNN to identify or approximate finite-state machines and, in particular, regular language acceptors such as deterministic finite-state automata (DFA). This is the subject of chapter 5. Most of this work starts by assuming that a given DTRNN architecture is capable of performing the same computation as a finite-state machine (FSM). This section addresses the following question: ``When does a DTRNN behave as an FSM?''
In a recent paper, Casey (1996) has shown that a DTRNN performing a robust DFA-like computation (a special case of FSM-like computation) must organize its state space into mutually disjoint, closed sets with nonempty interiors, each corresponding to a state of the DFA. Each of these sets may be taken to contain all points such that, if the DTRNN is initialized at any one of them, it produces the same output as the DFA initialized in the corresponding state. This formulation, which is closely connected to what Pollack (1991) called a dynamical recognizer, has inspired the following definition. A DTRNN behaves as an FSM when the following conditions hold:
Note that these regions may have a nondenumerable number of points, because they are subsets of $\mathbb{R}^n$ for some $n$. However, for a finite input alphabet $\Sigma$, only a denumerable number of points in the state and output spaces are actually visited by the net for the set of all possible finite-length input strings over $\Sigma$, denoted $\Sigma^*$, which is also denumerable.
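To make the state-space picture concrete, the following sketch (not from the source; the gain value, names, and the choice of automaton are illustrative assumptions) hand-crafts a one-neuron second-order sigmoid DTRNN that behaves like the two-state parity DFA: with a high enough gain, every visited state falls into one of two disjoint regions, one per DFA state.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def parity_dtrnn(string, gain=10.0):
    """Single-neuron second-order DTRNN whose state x stays near 0 or 1.
    The next state approximates XOR(x, u) via x + u - 2*x*u, thresholded at 0.5.
    """
    x = sigmoid(-0.5 * gain)          # start near 0 (DFA start state: even parity)
    for ch in string:
        u = 1.0 if ch == "1" else 0.0
        x = sigmoid(gain * (x + u - 2.0 * x * u - 0.5))
    return x > 0.5                    # accept iff the state lies in the "odd" region

def parity_dfa(string):
    """Reference DFA: accept strings containing an odd number of '1's."""
    q = 0
    for ch in string:
        if ch == "1":
            q = 1 - q
    return q == 1

random.seed(0)
for _ in range(200):
    s = "".join(random.choice("01") for _ in range(random.randint(0, 20)))
    assert parity_dtrnn(s) == parity_dfa(s)
print("DTRNN matches the parity DFA on all sampled strings")
```

Although $\Sigma^*$ is denumerable, the states visited by the network cluster around two attracting values (one near 0, one near 1), which is exactly the kind of region structure described above.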
Note also that DFA represent a special case: as said in section 2.3.3, deterministic finite-state automata may be seen as Moore or Mealy machines having an output alphabet $\Gamma = \{\mathrm{Y}, \mathrm{N}\}$ whose output is only examined after the last symbol of the input string has been presented, and which is such that, for a Moore machine,

$$\lambda(q_i) = \mathrm{Y} \quad \text{if } q_i \in F \eqno{(5.6)}$$

$$\lambda(q_i) = \mathrm{N} \quad \text{otherwise} \eqno{(5.7)}$$

where $\lambda$ is the output function and $F$ the set of accepting states.
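The DFA-as-Moore-machine view above can be sketched in a few lines; the function and variable names and the example automaton below are illustrative assumptions, not from the source.

```python
def run_moore_dfa(delta, lam, q0, string):
    """Run a Moore machine but, as in the DFA special case, return only the
    output symbol emitted after the last input symbol has been read."""
    q = q0
    for symbol in string:
        q = delta[(q, symbol)]
    return lam[q]

# Example: a DFA over {a, b} accepting strings that end in 'b'.
delta = {("q0", "a"): "q0", ("q0", "b"): "q1",
         ("q1", "a"): "q0", ("q1", "b"): "q1"}
# Output function lambda: "Y" on accepting states, "N" elsewhere.
lam = {"q0": "N", "q1": "Y"}

print(run_moore_dfa(delta, lam, "q0", "ab"))   # final state q1 -> "Y"
print(run_moore_dfa(delta, lam, "q0", "ba"))   # final state q0 -> "N"
```

Intermediate outputs are still produced at every step of the Moore machine; the special case consists precisely in ignoring all of them except the last.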
Recently, Šíma (1997) has shown that the behavior of any DTRNN using threshold activation functions may be stably emulated by another DTRNN using activation functions from a very general class, which includes the sigmoid functions considered here, and has described the conditions under which this emulation remains correct for inputs of any length. This means that any of the constructions described in section 4.2.1 to encode FSM in DTRNN may be converted, using Šíma's (1997) method, into an equally capable analog DTRNN.
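The following is a simplified sketch of the underlying idea only, not Šíma's actual construction: scaling the affine argument of each unit by a large gain keeps the sigmoid states within a small distance of the threshold network's binary states, so quantizing the analog trajectory recovers the discrete one. The gain value, the toy network, and all names are illustrative assumptions.

```python
import math
import itertools

def step(z):
    return 1.0 if z >= 0.0 else 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# A tiny threshold DTRNN: one recurrent neuron acting as a "seen a 1" latch,
# x_{t+1} = step(x_t + u_t - 0.5).
def threshold_net(inputs):
    x = 0.0
    for u in inputs:
        x = step(x + u - 0.5)
    return x

# Analog emulation: the same affine form multiplied by a large gain H.
# The sigmoid state stays close to {0, 1}, so the small perturbation never
# crosses the decision boundary and rounding recovers the binary trajectory.
def sigmoid_net(inputs, H=20.0):
    x = 0.0
    for u in inputs:
        x = sigmoid(H * (x + u - 0.5))
    return x

for bits in itertools.product([0.0, 1.0], repeat=8):
    assert round(sigmoid_net(bits)) == threshold_net(bits)
print("sigmoid net stably emulates the threshold net on all length-8 inputs")
```

Šíma's analysis makes precise how large the gain must be, for which activation-function classes this works, and why the emulation then remains correct for inputs of any length rather than only up to a fixed length as checked here.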