Skip to main content

Nondeterministic Finite-State Automata

(Content adapted from Critchlow & Eck)

As mentioned briefly above, there is an alternative school of though as to what properties should be required of a finite-state automaton's transition function. Recall our motivating example of a C++ compiler and a legal if statement. In our description, we had the compiler in an "expecting a )" state; on seeing a ), the compiler moved into an "expecting a { or a legal statement" state. An alternative way to view this would be to say that the compiler, on seeing a ), could move into one of two different states: it could move to an "expecting a {" state or move to an "expecting a legal statement" state. Thus, from a single state, on input ), the compiler has multiple moves. This alternative interpretation is not allowed by the DFA model.

A second point on which one might question the DFA model is the fact that input must be consumed for the machine to change state. Think of the syntax for C++ function declarations. The return type of a function need not be specified (the default is taken to be int). The start state of the compiler when parsing a function declaration might be "expecting a return type"; then with no input consumed, the compiler can move to the state "expecting a legal function name". To model this, it might seem reasonable to allow transitions that do not require the consumption of input (such transitions are called ε\varepsilon-transitions). Again, this is not supported by the DFA abstraction. There is, therefore, a second class of finite-state automata that people study, the class of nondeterministic finite-state automata.

A nondeterministic finite-state automaton (NFA) is the same as a deterministic finite-state automaton except that the transition function is no longer a function that maps a state-input pair to a state; rather, it maps a state-input pair or a state-ε\varepsilon pair to a set of states. No longer do we have δ(q,a)=q\delta(q,a) = q', meaning that the machine must change to state qq' if it is in state qq and consumes an aa. Rather, we have (q,a)={q1,q2,,qn}\partial(q,a) = \{q_1, q_2, \ldots, q_n\}, meaning that if the machine is in state qq and consumes an aa, it might move directly to any one of the states q1,,qnq_1, \ldots, q_n. Note that the set of next states (q,a)\partial(q,a) is defined for every state qq and every input symbol aa, but for some qq's and aa's it could be empty, or contain just one state (there don't have to be multiple next states). The function \partial must also specify whether it is possible for the machine to make any moves without input being consumed, i.e., (q,ε)\partial(q, \varepsilon) must be specified for every state qq. Again, it is quite possible that (q,ε)\partial(q, \varepsilon) may be empty for some states qq: there need not be ε\varepsilon-transitions out of qq.

Formally, a nondeterministic finite-state automaton MM is specified by 5 components: M=(Q,Σ,q0,,F)M=(Q, \Sigma, q_0, \partial, F) where

  • QQ, Σ\Sigma, q0q_0 and FF are as in the definition of DFAs;
  • \partial is a transition function that takes state, input symbol\langle\textrm{state, input symbol}\rangle pairs and maps each one to a set of states. To say (q,a)={q1,q2,,qn}\partial(q,a) = \{q_1, q_2, \ldots , q_n\} means that if the machine is in state qq and the input symbol aa is consumed, then the machine may move directly into any one of states q1,q2,,qnq_1, q_2, \ldots , q_n. The function \partial must also be defined for every state,ε\langle\textrm{state},\varepsilon\rangle pair. To say (q,ε)={q1,q2,,qn}\partial(q,\varepsilon) = \{q_1, q_2, \ldots , q_n\} means that there are direct ε\varepsilon-transitions from state qq to each of q1,q2,,qnq_1, q_2, \ldots , q_n.

The formal description of the function \partial is :Q×(Σ{ε})P(Q)\partial : Q \times (\Sigma \cup \{\varepsilon\}) \rightarrow {\mathscr P}(Q).

The function \partial describes how the machine functions on zero or one input symbol. As with DFAs, we will often want to refer to the behavior of the machine on a string of inputs, and so we use the notation (q,w)\partial^*(q,w) as shorthand for "the set of states in which the machine might be if it starts in state qq and consumes input string ww". As with DFAs, (q,w)\partial^*(q,w) is determined by the specification of \partial. Note that for every state qq, (q,ε)\partial^*(q, \varepsilon) contains at least qq, and may contain additional states if there are (sequences of) ε\varepsilon-transitions out of qq.

We do have to think a bit carefully about what it means for an NFA to accept a string ww. Suppose (q0,w)\partial^*(q_0,w) contains both accepting and non-accepting states, i.e., the machine could end in an accepting state after consuming ww, but it might also end in a non-accepting state. Should we consider the machine to accept ww, or should we require every state in (q0,w)\partial^*(q_0,w) to be accepting before we admit ww to the ranks of the accepted? Think of the C++ compiler again: provided that an if statement fits one of the legal syntax specifications, the compiler will accept it. So we take as the definition of acceptance by an NFA: A string ww is accepted by an NFA provided that at least one of the states in (q0,w)\partial^*(q_0, w) is an accepting state. That is, if there is some sequence of steps of the machine that consumes ww and leaves the machine in an accepting state, then the machine accepts ww. Formally:

Let M=(Q,Σ,q0,,F)M= (Q, \Sigma, q_0, \partial, F) be a nondeterministic finite-state automaton. The string wΣw\in\Sigma^* is accepted by MM iff (q0,w)\partial^*(q_0,w) contains at least one state qFFq_F \in F.

The language accepted by MM, denoted L(M)L(M), is the set of all strings wΣw\in\Sigma^* that are accepted by MM: L(M)={wΣ  (q0,w)F}L(M) = \{ w \in \Sigma^* \ | \ \partial^*(q_0, w) \cap F \not= \emptyset\}.


Example: The NFA shown below accepts all strings of aa's and bb's in which the second-to-last symbol is aa.

Example NFA


Connection with Deterministic Finite-State Automata

It should be fairly clear that every language that is accepted by a DFA is also accepted by an NFA. Pictorially, a DFA looks exactly like an NFA (an NFA that doesn't happen to have any ε\varepsilon-transitions or multiple same-label transitions from any state), though there is slightly more going on behind the scenes. Formally, given the DFA M=(Q,Σ,q0,δ,F)M=(Q, \Sigma, q_0, \delta, F), you can build an NFA M=(Q,Σ,q0,,F)M'=(Q, \Sigma, q_0, \partial, F) where 4 of the 5 components are the same and where every transition δ(q,a)=q\delta(q,a) = q' has been replaced by (q,a)={q}\partial(q,a) = \{q'\}.

But is the reverse true? Can any NFA-recognized language be recognized by a DFA? Look, for example, at the language in the example above. Can you come up with a DFA that accepts this language? Try it. It's pretty difficult to do. But does that mean that there really is no DFA that accepts the language, or only that we haven't been clever enough to find one?

It turns out that the limitation is in fact in our cleverness, and not in the power of DFAs.1

Theorem: The Subset Construction
Every language that is accepted by an NFA is accepted by a DFA.

Proof: Suppose we are given an NFA N=(P,Σ,p0,,Fp)N = (P, \Sigma, p_0, \partial, F_p), and we want to build a DFA D=(Q,Σ,q0,δ,Fq)D=(Q, \Sigma, q_0, \delta, F_q) that accepts the same language. The idea is to make the states in DD correspond to subsets of NN's states, and then to set up DD's transition function δ\delta so that for any string ww, δ(q0,w)\delta^*(q_0, w) corresponds to (p0,w)\partial^*(p_0,w); i.e., the single state that ww gets you to in DD corresponds to the set of states that ww could get you to in NN. If any of those states is accepting in NN, ww would be accepted by NN, and so the corresponding state in DD would be made accepting as well.

So how do we make this work? The first thing to do is to deal with a start state q0q_0 for DD. If we're going to make this state correspond to a subset of NN's states, what subset should it be? Well, remember (1) that in any DFA, δ(q0,ε)=q0\delta^*(q_0, \varepsilon) = q_0; and (2) we want to make δ(q0,w)\delta^*(q_0, w) correspond to (p0,w)\partial^*(p_0,w) for every ww. Putting these two limitations together tells us that we should make q0q_0 correspond to (p0,ε)\partial^*(p_0, \varepsilon). So q0q_0 corresponds to the subset of all of NN's states that can be reached with no input.

Now we progressively set up DD's transition function δ\delta by repeatedly doing the following:

  • Find a state qq that has been added to DD but whose out-transitions have not yet been added. (Note that q0q_0 initially fits this description.) Remember that the state qq corresponds to some subset {p1,,pn}\{p_1, \ldots , p_n\} of NN's states.

  • For each input symbol aa, look at all NN's states that can be reached from any one of p1,,pnp_1, \ldots , p_n by consuming aa (perhaps making some ε\varepsilon-transitions as well). That is, look at (p1,a)(pn,a)\partial^*(p_1,a) \cup \ldots \cup \partial^*(p_n,a). If there is not already a DFA state qq' that corresponds to this subset of NN's states, then add one, and add the transition δ(q,a)=q\delta(q, a)= q' to DD's transitions.

The above process must halt eventually, as there are only a finite number of states nn in the NFA, and therefore there can be at most 2n2^n states in the DFA, as that is the number of subsets of the NFA's states. The final states of the new DFA are those where at least one of the associated NFA states is an accepting state of the NFA.

Can we now argue that L(D)=L(N)L(D) = L(N)? We can, if we can argue that δ(q0,w)\delta^*(q_0,w) corresponds to (p0,w)\partial^*(p_0,w) for all wΣw\in\Sigma^*: if this latter property holds, then wL(D)w \in L(D) iff δ(q0,w)\delta^*(q_0,w) is accepting, which we made be so iff (p0,w)\partial^*(p_0,w) contains an accepting state of NN, which happens iff NN accepts ww i.e., iff wL(N)w \in L(N).

So can we argue that δ(q0,w)\delta^*(q_0,w) does in fact correspond to (p0,w)\partial^*(p_0,w) for all ww? We can, using induction on the length of ww.

First, a preliminary observation. Suppose w=xaw=xa, i.e., ww is the string xx followed by the single symbol aa. How are (p0,x)\partial^*(p_0,x) and (p0,w)\partial^*(p_0,w) related? Well, recall that (p0,x)\partial^*(p_0,x) is the set of all states that NN can reach when it starts in p0p_0 and consumes xx: (p0,x)={p1,,pn}\partial^*(p_0,x) = \{p_1, \ldots, p_n\} for some states p1,,pnp_1, \ldots, p_n. Now, ww is just xx with an additional aa, so where might NN end up if it starts in p0p_0 and consumes ww? We know that xx gets NN to p1p_1 or \ldots or pnp_n, so xaxa gets NN to any state that can be reached from p1p_1 with an aa (and maybe some ε\varepsilon-transitions), and to any state that can be reached from p2p_2 with an aa (and maybe some ε\varepsilon-transitions), etc. Thus, our relationship between (p0,x)\partial^*(p_0,x) and (p0,w)\partial^*(p_0,w) is that if (p0,x)={p1,,pn}\partial^*(p_0,x) = \{p_1, \ldots, p_n\}, then (p0,w)=(p1,a)(pn,a)\partial^*(p_0,w) = \partial^*(p_1,a) \cup \ldots \cup \partial^*(p_n,a). With this observation in hand, let's proceed to our proof by induction.

We want to prove that δ(q0,w)\delta^*(q_0,w) corresponds to (p0,w)\partial^*(p_0,w) for all wΣw\in\Sigma^*. We use induction on the length of ww.

Base case: Suppose ww has length 0. The only string ww with length 0 is ε\varepsilon, so we want to show that δ(q0,ε)\delta^*(q_0,\varepsilon) corresponds to (p0,ε)\partial^*(p_0,\varepsilon). Well, δ(q0,ε)=q0\delta^*(q_0, \varepsilon) = q_0, since in a DFA, δ(q,ε)=q\delta^*(q, \varepsilon) = q for any state~qq. We explicitly made q0q_0 correspond to (p0,ε)\partial^*(p_0,\varepsilon), and so the property holds for ww with length 0.

Inductive case: Assume that the desired property holds for some number nn, i.e., that δ(q0,x)\delta^*(q_0,x) corresponds to (p0,x)\partial^*(p_0,x) for all xx with length nn. Look at an arbitrary string ww with length n+1n+1. We want to show that δ(q0,w)\delta^*(q_0,w) corresponds to (p0,w)\partial^*(p_0,w). Well, the string ww must look like xaxa for some string xx (whose length is nn) and some symbol aa. By our inductive hypothesis, we know δ(q0,x)\delta^*(q_0,x) corresponds to (p0,x)\partial^*(p_0,x). We know (p0,x)\partial^*(p_0,x) is a set of NN's states, say (p0,x)={p1,,pn}\partial^*(p_0,x) = \{p_1, \ldots, p_n\}.

At this point, our subsequent reasoning might be a bit clearer if we give explicit names to δ(q0,w)\delta^*(q_0,w) (the state DD reaches on input ww) and δ(q0,x)\delta^*(q_0,x) (the state DD reaches on input xx). Let qw=δ(q0,w)q_w = \delta^*(q_0, w), and let qx=δ(q0,x)q_x = \delta^*(q_0,x). We know, because w=xaw=xa, there must be an aa-transition from qxq_x to qwq_w. Look at how we added transitions to δ\delta: the fact that there is an aa-transition from qxq_x to qwq_w means that qwq_w corresponds to the set (p1,a)(pn,a)\partial^*(p_1,a) \cup \ldots \cup \partial^*(p_n,a) of NN's states. By our preliminary observation, (p1,a)(pn,a)\partial^*(p_1,a) \cup \ldots \cup \partial^*(p_n,a) is just (p0,w)\partial^*(p_0,w). So qwq_w (or δ(q0,w)\delta^*(q_0,w)) corresponds to (p0,w)\partial^*(p_0,w), which is what we wanted to prove. Since ww was an arbitrary string of length n+1n+1, we have shown that the property holds for n+1n+1.

Altogether, we have shown by induction that δ(q0,w)\delta^*(q_0,w) corresponds to (p0,w)\partial^*(p_0,w) for all wΣw\in\Sigma^*. As indicated at the very beginning of this proof, that is enough to prove that L(D)=L(N)L(D)= L(N). So for any NFA NN, we can find a DFA DD that accepts the same language.


Example: Consider the NFA shown below.

Example NFA

We start by looking at (p0,ε)\partial^*(p_0, \varepsilon), and then add transitions and states as described above.

  • (p0,ε)={p0}\partial^*(p_0, \varepsilon) = \{p_0\} so q0={p0}q_0 = \{p_0\}.

  • δ(q0,a)\delta(q_0,a) will be (p0,a)\partial^*(p_0,a), which is {p0}\{p_0\}, so δ(q0,a)=q0\delta(q_0,a) = q_0.

  • δ(q0,b)\delta(q_0,b) will be (p0,b)\partial^*(p_0,b), which is {p0,p1}\{p_0, p_1\}, so we need to add a new state q1={p0,p1}q_1 = \{p_0, p_1\} to the DFA; and add δ(q0,b)=q1\delta(q_0,b) = q_1 to the DFA's transition function.

  • δ(q1,a)\delta(q_1,a) will be (p0,a)\partial^*(p_0,a) unioned with (p1,a)\partial^*(p_1,a) since q1={p0,p1}q_1 = \{p_0, p_1\}. Since (p0,a)(p1,a)={p0}{p2}={p0,p2}\partial^*(p_0,a) \cup \partial^*(p_1,a) = \{p_0\} \cup \{p_2\} = \{p_0,p_2\}, we need to add a new state q2={p0,p2}q_2 = \{p_0, p_2\} to the DFA, and a transition δ(q1,a)=q2\delta(q_1,a) = q_2.

  • δ(q1,b)\delta(q_1,b) will be (p0,b)\partial^*(p_0,b) unioned with (p1,b)\partial^*(p_1,b), which gives {p0,p1}{p2}\{p_0, p_1\} \cup \{p_2\}, which again gives us a new state q3q_3 to add to the DFA, together with the transition δ(q1,b)=q3\delta(q_1,b) = q_3.

At this point, our partially-constructed DFA looks as shown below:

Example DFA

The construction continues as long as there are new states being added, and new transitions from those states that have to be computed. The final DFA is shown below.

Example DFA


Exercises

  1. What language does the NFA in the above example accept?

    Answer

    Strings of aa's and bb's where the third-from-last character is bb.

  2. Give a DFA that accepts the language accepted by the following NFA.

Example NFA

  1. Give a DFA that accepts the language accepted by the following NFA. (Be sure to note that, for example, it is possible to reach both q1q_1 and q3q_3 from q0q_0 on consumption of an aa, because of the ε\varepsilon-transition.)

Example NFA


  1. The fault, dear reader, is not in our δ\delta^*, but in ourselves.