Computational Linguistics — Nondeterminism

Nondeterministic Finite Automata (NFA) [Wikipedia]

Extending a Finite State Machine to include the concept of "nondeterminism".

E.g., consider fixing a nonfunctioning automobile, with 10,000 parts, numbered from 1 to 10,000.

Suppose we begin by replacing part 1, then part 2, etc., until a fix is made. Suppose it takes us 150 replacements. Is that a measure of how difficult the repair actually is?

Suppose the best fix was to replace parts 138 and 150. Then this is really just a two-part problem.


E.g., consider finding your way through a maze. Suppose your path is 5 miles long, but there was a short route just 100 yards long. The actual problem is just 100 yards.

Nondeterminism means that at any given "state" in the solution of a problem, there may be several possible "next states" for a given input.


NFA Definition

A nondeterministic finite automaton is a 5-tuple,


   M=(Q,A,δ,q0,F),

where


   Q  = set of states
   A  = input alphabet
   δ  = transition function Q x A -> 2^Q
   q0 = initial or start state
   F  = set of final states


Matt Foley Pictorial Example:

                             ____ 
            ----   a       | ---- |    b      ---- 
           |    |--------->||    ||--------->|    |
      ---->| s0 |--------->|| s1 ||          | s2 |
            ----   b       | ---- |<--------- ----
            ^ |              ----       a       |
           a| |b             |  |              b|
            | V              |  |               |
            ----            a|  |b              V
           |    |_____       |  |             ---- _______
           | s4 |     | b    |   ----------->|    |       | a,b
            ---- <----        -------------->| s3 |<------
                                              ----


Q = {s0, s1, s2, s3, s4}
A = {a,b}
q0 = s0
F = {s1}
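δ: Q x A -> 2^Q

    δ  |  a    |    b
  -----|-------|--------
   s0  | {s1}  | {s1,s4}
  -----|-------|--------
   s1  | {s3}  | {s2,s3}
  -----|-------|--------
   s2  | {s1}  | {s3}
  -----|-------|--------
   s3  | {s3}  | {s3}
  -----|-------|--------
   s4  | {s0}  | {s4}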
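
The pictured machine can be written down directly as data. The following is a minimal Python sketch, not part of the original notes: the names DELTA, START, FINAL and step are illustrative assumptions, and the transition function is stored as a dictionary from (state, symbol) pairs to sets of states.

```python
# Transition function of the pictured NFA: (state, symbol) -> set of next states.
# The names DELTA, START, FINAL and step are illustrative, not from the notes.
DELTA = {
    ("s0", "a"): {"s1"}, ("s0", "b"): {"s1", "s4"},
    ("s1", "a"): {"s3"}, ("s1", "b"): {"s2", "s3"},
    ("s2", "a"): {"s1"}, ("s2", "b"): {"s3"},
    ("s3", "a"): {"s3"}, ("s3", "b"): {"s3"},
    ("s4", "a"): {"s0"}, ("s4", "b"): {"s4"},
}
START = "s0"
FINAL = {"s1"}


def step(states, symbol):
    """Return every state reachable from any state in `states` on `symbol`."""
    result = set()
    for q in states:
        # A missing entry means "no move", i.e. the empty set.
        result |= DELTA.get((q, symbol), set())
    return result


# From s0 on input b the machine may be in either of two states.
print(sorted(step({START}, "b")))  # ['s1', 's4']
```

Because each entry is a set, a single (state, symbol) pair can offer several next states, which is exactly what separates this table from a DFA table.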


Extending δ to "operate" on strings.

Given an NFA M with transition function δ, the function δ' is defined much as in the DFA case, but since:

       δ  : Q x A  -> 2^Q

We have

       δ' : Q x A* -> 2^Q

       δ'(q,lambda) = {q}       - on null input stays in original state
       δ'(q,a)      = δ(q,a)  if a in A

       δ'(q,w)      = {p | p in δ(r,a) for some r in δ'(q,x)}

                                       if w = xa for x in A*, a in A

This is called the extended transition function.
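
The recursion can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the notes' own code: the function name delta_hat is hypothetical, the empty Python string stands for lambda, and the string is processed by splitting off its last symbol. The table used is the five-state pictorial example above.

```python
# Transition table of the five-state pictorial example.
DELTA = {
    ("s0", "a"): {"s1"}, ("s0", "b"): {"s1", "s4"},
    ("s1", "a"): {"s3"}, ("s1", "b"): {"s2", "s3"},
    ("s2", "a"): {"s1"}, ("s2", "b"): {"s3"},
    ("s3", "a"): {"s3"}, ("s3", "b"): {"s3"},
    ("s4", "a"): {"s0"}, ("s4", "b"): {"s4"},
}


def delta_hat(delta, q, w):
    """Extended transition function: all states reachable from q on string w."""
    if w == "":
        return {q}                      # on null input, stay in q
    x, a = w[:-1], w[-1]                # w = xa: prefix x, last symbol a
    result = set()
    for r in delta_hat(delta, q, x):    # every state reachable on the prefix
        result |= delta.get((r, a), set())
    return result


print(sorted(delta_hat(DELTA, "s0", "ba")))  # ['s0', 's3']
```

For a single symbol the function reduces to δ itself, since the prefix is empty and the only state reachable on it is q.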


A clarifying example:

Given, F = {q2}


    δ  |  a    |    b
  -----|-------|--------
   q0  | {q0}  | {q0,q1}
  -----|-------|--------
   q1  | empty | {q2}
  -----|-------|--------
   q2  | empty | empty

[Draw state diagram.]

How M works on a sample string:


   [q0,ababb]

   |- [q0,babb]
   |- [q0,abb]
   |- [q0,bb]
   |- [q0,b]
   |- [q0,lambda]  --> no (halts in non-final state)

Sample computation number 2 (same string):


   [q0,ababb]

   |- [q0,babb]
   |- [q1,abb]    --> no (halts by "incomplete specification")

Sample computation number 3 (same string):


   [q0,ababb]

   |- [q0,babb]
   |- [q0,abb]
   |- [q0,bb]
   |- [q1,b]
   |- [q2,lambda]  --> yes (halts in final state)


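The above shows why specifying the "value" of δ', the extended transition function, is more involved than in the DFA case: different computations on the same string can end in different states, or halt before the input is used up.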

A machine is said to accept a string if there is any computation that would leave the machine in a final state with no remaining input.

[This is just saying that δ'(q0,w) intersect F is not empty.]
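
The acceptance rule can be checked mechanically by enumerating every computation of the three-state example on a string. The sketch below is illustrative only: the names computations and accepts are assumptions, and a computation is recorded as a list of [state, remaining input] pairs in the style of the traces above.

```python
# Three-state example from the table above; missing entries mean "empty".
DELTA = {
    ("q0", "a"): {"q0"},
    ("q0", "b"): {"q0", "q1"},
    ("q1", "b"): {"q2"},
}
FINAL = {"q2"}


def computations(state, remaining):
    """Yield (trace, accepted) for every computation from [state, remaining]."""
    if remaining == "":
        # Input used up: accept only if we stopped in a final state.
        yield [(state, "")], state in FINAL
        return
    successors = DELTA.get((state, remaining[0]), set())
    if not successors:
        # No move defined: halts by "incomplete specification".
        yield [(state, remaining)], False
        return
    for nxt in sorted(successors):
        for trace, accepted in computations(nxt, remaining[1:]):
            yield [(state, remaining)] + trace, accepted


def accepts(w):
    """A string is accepted if at least one computation accepts it."""
    return any(accepted for _, accepted in computations("q0", w))


for trace, accepted in computations("q0", "ababb"):
    print(" |- ".join(f"[{q},{rest or 'lambda'}]" for q, rest in trace),
          "--> yes" if accepted else "--> no")
```

On ababb this enumerates four computations, the three traced above plus one that ends in q1 with no input left, and exactly one of them accepts.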

The language accepted by M, or the language of M is denoted L(M) and is defined to be:

L(M) is the set of all strings from A* accepted by the machine, M.


Epsilon-transitions (ε-transitions) aka lambda moves aka epsilon moves

Epsilon transitions allow the possibility of moving to a new state without consuming input.

δ : Q x (A U {ε}) -> 2^Q
Or
δ' : Q x (A U {λ})* -> 2^Q

It can be shown that "lambda move" machines are equivalent to machines without lambda moves, in the sense that both kinds recognize exactly the same class of languages.
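
One common way to simulate lambda moves is to follow them eagerly before and after each input symbol (the "epsilon closure" of a set of states). The sketch below assumes a small hypothetical machine that does not appear in these notes: states p0, p1, p2, with lambda moves p0 -> p1 and p1 -> p2, a loop on a at p1 and a loop on b at p2. The names EPS, eps_closure and accepts are likewise illustrative.

```python
EPS = ""  # label used here for a lambda (epsilon) move; an assumption of this sketch

# Hypothetical machine, not from the notes: p0 -ε-> p1, p1 -a-> p1, p1 -ε-> p2, p2 -b-> p2
DELTA = {
    ("p0", EPS): {"p1"},
    ("p1", "a"): {"p1"},
    ("p1", EPS): {"p2"},
    ("p2", "b"): {"p2"},
}
FINAL = {"p2"}


def eps_closure(states):
    """All states reachable from `states` using zero or more lambda moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return closure


def accepts(w, start="p0"):
    """Track the set of possible states, closing under lambda moves each step."""
    current = eps_closure({start})
    for a in w:
        moved = set()
        for q in current:
            moved |= DELTA.get((q, a), set())
        current = eps_closure(moved)
    return bool(current & FINAL)


print(accepts("aab"), accepts("ba"))  # True False
```

Tracking a set of states in this way never needs the lambda moves as separate steps, which gives some intuition for why they add no recognizing power.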


Tasks you should be able to do