Skip to main content

Finite-State Automata

(Content adapted from Critchlow & Eck)

We have seen how regular expressions can be used to generate languages mechanically. How might languages be recognized mechanically? The question is of interest because if we can mechanically recognize languages like L={all legal C++ programs that will not go into infinite loops on any input}L= \{\textrm{all legal C++ programs that will not go into infinite loops on any input}\}, then it would be possible to write über-compilers that can do semantic error-checking like testing for infinite loops, in addition to the syntactic error-checking they currently do.

What formalism might we use to model what it means to recognize a language "mechanically"? We look for inspiration to a language-recognizer with which we are all familiar, and which we've already in fact mentioned: a compiler. Consider how a C++ compiler might handle recognizing a legal if statement. Having seen the word if, the compiler will be in a state or phase of its execution where it expects to see a (; in this state, any other (non-blank) character will put the compiler in a "failure" state. If the compiler does in fact see a ( next, it will then be in an "expecting a boolean condition" state; if it sees a sequence of symbols that make up a legal boolean condition, it will then be in an "expecting a )" state; and then "expecting a { or a legal statement"; and so on. Thus one can think of the compiler as being in a series of states; on seeing a new input symbol, it moves on to a new state; and this sequence of transitions eventually leads to either a "failure" state (if the if statement is not syntactically correct) or a "success" state (if the if statement is legal). We isolate these three conceptsstates, input-inspired transitions from state to state, and "accepting" vs "non-accepting" statesas the key features of a mechanical language-recognizer, and capture them in a model called a finite-state automaton. (Whether this is a successful distillation of the essence of mechanical language recognition remains to be seen; the question will be taken up later in this chapter.)

A finite-state automaton (FSA), then, is a machine which takes, as input, a finite string of symbols from some alphabet Σ\Sigma. There is a finite set of states in which the machine can find itself. The state it is in before consuming any input is called the start state. Some of the states are accepting or final. If the machine ends in such a state after completely consuming an input string, the string is said to be accepted by the machine. The actual functioning of the machine is described by something called a transition function, which specifies what happens if the machine is in a particular state and looking at a particular input symbol. ("What happens" means "in which state does the machine end up".)


Example: Below is a table that describes the transition function of a finite-state automaton with states pp, qq, and rr, on inputs 00 and 11:

pqr
0pqr
1qrr

The table indicates, for example, that if the FSA were in state pp and consumed a 11, it would move to state qq.


FSAs actually come in two flavors depending on what properties you require of the transition function. We will look first at a class of FSAs called deterministic finite-state automata (DFAs). In these machines, the current state of the machine and the current input symbol together determine exactly which state the machine ends up in: for every current state, current input symbol\langle\textrm{current state, current input symbol}\rangle pair, there is exactly one possible next state for the machine.

Formally, a deterministic finite-state automaton MM is specified by 5 components: M=(Q,Σ,q0,δ,F)M=(Q, \Sigma, q_0, \delta, F) where

  • QQ is a finite set of states;
    • Σ\Sigma is an alphabet called the input alphabet;
    • q0Qq_0 \in Q is a state which is designated as the start state;
    • FF is a subset of QQ; the states in FF are states designated as final or accepting states;
    • δ\delta is a transition function that takes state, input symbol\langle\textrm{state, input symbol}\rangle pairs and maps each one to a state: δ:Q×ΣQ\delta : Q \times \Sigma \rightarrow Q. To say δ(q,a)=q\delta(q,a) = q' means that if the machine is in state qq and the input symbol aa is consumed, then the machine will move into state qq'. The function δ\delta must be a total function, meaning that δ(q,a)\delta(q,a) must be defined for every state qq and every input symbol aa. (Recall also that, according to the definition of a function, there can be only one output for any particular input. This means that for any given qq and aa, δ(q,a)\delta(q,a) can have only one value. This is what makes the finite-state automaton deterministic: given the current state and input symbol, there is only one possible move the machine can make.)

Example: The transition function described by the table in the preceding example is that of a DFA. If we take pp to be the start state and rr to be a final state, then the formal description of the resulting machine is
M=({p,q,r},{0,1},p,δ,{r})M= (\{p,q,r\}, \{0,1\}, p, \delta, \{r\}), where δ\delta is given by

δ(p,0)=pδ(p,1)=q\hspace{0.5in}\delta(p,0)=p \hspace{1.5in} \delta(p,1)=q

δ(q,0)=qδ(q,1)=r\hspace{0.5in}\delta(q,0)=q \hspace{1.5in} \delta(q,1)=r

δ(r,0)=rδ(r,1)=r\hspace{0.5in}\delta(r,0)=r \hspace{1.5in} \delta(r,1)=r


The transition function δ\delta describes only individual steps of the machine as individual input symbols are consumed. However, we will often want to refer to "the state the automaton will be in if it starts in state qq and consumes input string ww", where ww is a string of input symbols rather than a single symbol. Following the usual practice of using ^* to designate "0 or more", we define δ(q,w)\delta^*(q,w) as a convenient shorthand for "the state that the automaton will be in if it starts in state qq and consumes the input string ww". For any string, it is easy to see, based on δ\delta, what steps the machine will make as those symbols are consumed, and what δ(q,w)\delta^*(q,w) will be for any qq and ww. Note that if no input is consumed, a DFA makes no move, and so δ(q,ε)=q\delta^*(q, \varepsilon) = q for any state qq.1


Example: Let MM be the automaton in the preceding example. Then:

δ(p,001)=q\delta^*(p, 001)=q, since δ(p,0)=p\delta(p,0)=p, δ(p,0)=p\delta(p,0)=p, and δ(p,1)=q\delta(p,1)=q;

δ(p,01000)=q\delta^*(p, 01000)= q;

δ(p,1111)=r\delta^*(p, 1111) = r;

δ(q,0010)=r\delta^*(q, 0010) = r.


We have divided the states of a DFA into accepting and non-accepting states, with the idea that some strings will be recognized as "legal" by the automaton, and some not. Formally:

Let M=(Q,Σ,q0,δ,F)M=(Q, \Sigma, q_0, \delta, F). A string wΣw \in \Sigma^* is accepted by MM iff δ(q0,w)F\delta^*(q_0, w) \in F. (Don't get confused by the notation. Remember, it's just a shorter and neater way of saying "wΣw \in \Sigma^* is accepted by MM if and only if the state that MM will end up in if it starts in q0q_0 and consumes ww is one of the states in FF.")

The language accepted by MM, denoted L(M)L(M), is the set of all strings wΣw \in \Sigma^* that are accepted by MM: L(M)={wΣ  δ(q0,w)F}L(M) = \{ w \in\Sigma^* \ | \ \delta^*(q_0, w) \in F\}.

Note that we sometimes use a slightly different phrasing and say that a language LL is accepted by some machine MM. We don't mean by this that LL and maybe some other strings are accepted by MM; we mean L=L(M)L = L(M), i.e., LL is exactly the set of strings accepted by MM.

It may not be easy, looking at a formal specification of a DFA, to determine what language that automaton accepts. Fortunately, the mathematical description of the automaton M=(Q,Σ,q0,δ,F)M=(Q, \Sigma, q_0, \delta, F) can be neatly and helpfully captured in a picture called a transition diagram. Consider again the DFA of the two preceding examples. It can be represented pictorially as:

Example DFA

The arrow on the left indicates that pp is the start state; double circles indicate that a state is accepting. Looking at this picture, it should be fairly easy to see that the language accepted by the DFA MM is L(M)={x{0,1}  n1(x)2}L(M) = \{ x \in \{0,1\}^* \ | \ n_1(x) \geq 2\}.


Example: Find the language accepted by the DFA shown below (and describe it using a regular expression!):

Example DFA

The start state of MM is accepting, which means εL(M)\varepsilon \in L(M). If MM is in state q0q_0, a sequence of two aa's or three bb's will move MM back to q0q_0 and hence be accepted. So L(M)=L((aabbb))L(M) = L((aa | bbb)^*).


The state q4q_4 in the preceding example is often called a garbage, error, or trap state: it is a non-accepting state which, once reached by the machine, cannot be escaped. It is fairly common to omit such states from transition diagrams. For example, one is likely to see the diagram:

Example DFA

Note that this cannot be a complete DFA, because a DFA is required to have a transition defined for every state-input pair. The diagram is "short for" the full diagram:

Example DFA

As well as recognizing what language is accepted by a given DFA, we often want to do the reverse and come up with a DFA that accepts a given language. Building DFAs for specified languages is an art, not a science. There is no algorithm that you can apply to produce a DFA from an English-language description of the set of strings the DFA should accept. On the other hand, it is not generally successful, either, to simply write down a half-dozen strings that are in the language and design a DFA to accept those stringsinvariably there are strings that are in the language that aren't accepted, and other strings that aren't in the language that are accepted. So how do you go about building DFAs that accept all and only the strings they're supposed to accept? The best advice I can give is to think about relevant characteristics that determine whether a string is in the language or not, and to think about what the possible values or "states" of those characteristics are; then build a machine that has a state corresponding to each possible combination of values of relevant characteristics, and determine how the consumption of inputs affects those values. I'll illustrate what I mean with a couple of examples.


Example: Find a DFA with input alphabet Σ={a,b}\Sigma = \{a,b\} that accepts the language L={wΣ  na(w) and nb(w) are both even }L= \{w \in\Sigma^* \ | \ n_a(w) \textrm{ and } n_b(w) \textrm{ are both even } \}.

The characteristics that determine whether or not a string ww is in LL are the parity of na(w)n_a(w) and nb(w)n_b(w). There are four possible combinations of "values" for these characteristics: both numbers could be even, both could be odd, the first could be odd and the second even, or the first could be even and the second odd. So we build a machine with four states q1,q2,q3,q4q_1, q_2, q_3, q_4 corresponding to the four cases. We want to set up δ\delta so that the machine will be in state q1q_1 exactly when it has consumed a string with an even number of aa's and an even number of bb's, in state q2q_2 exactly when it has consumed a string with an odd number of aa's and an odd number of bb's, and so on.

To do this, we first make the state q1q_1 into our start state, because the DFA will be in the start state after consuming the empty string ε\varepsilon, and ε\varepsilon has an even number (zero) of both aa's and bb's. Now we add transitions by reasoning about how the parity of aa's and bb's is changed by additional input. For instance, if the machine is in q1q_1 (meaning an even number of aa's and an even number of bb's have been seen) and a further aa is consumed, then we want the machine to move to state q3q_3, since the machine has now consumed an odd number of aa's and still an even number of bb's. So we add the transition δ(q1,a)=q3\delta(q_1, a) = q_3 to the machine. Similarly, if the machine is in q2q_2 (meaning an odd number of aa's and an odd number of bb's have been seen) and a further bb is consumed, then we want the machine to move to state q3q_3 again, since the machine has still consumed an odd number of aa's, and now an even number of bb's. So we add the transition δ(q2,b)=q3\delta(q_2, b) = q_3 to the machine. Similar reasoning produces a total of eight transitions, one for each state-input pair. Finally, we have to decide which states should be final states. The only state that corresponds to the desired criteria for the language LL is q1q_1, so we make q1q_1 a final state. The complete machine is shown below.

Example DFA



Example: Find a DFA with input alphabet Σ={a,b}\Sigma = \{a,b\} that accepts the language LL = {wΣ  na(w) is divisible by 3 }\{w\in\Sigma^* \ | \ n_a(w) \textrm{ is divisible by 3 } \}.

The relevant characteristic here is of course whether or not the number of aa's in a string is divisible by 3, perhaps suggesting a two-state machine. But in fact, there is more than one way for a number to not be divisible by 3: dividing the number by 3 could produce a remainder of either 1 or 2 (a remainder of 0 corresponds to the number in fact being divisible by 3). So we build a machine with three states q0q_0, q1q_1, q2q_2, and add transitions so that the machine will be in state q0q_0 exactly when the number of aa's it has consumed is evenly divisible by 3, in state q1q_1 exactly when the number of aa's it has consumed is equivalent to 1mod31 \bmod{3}, and similarly for q2q_2. State q0q_0 will be the start state, as ε\varepsilon has 0 aa's and 0 is divisible by 3. The completed machine is shown below. Notice that because the consumption of a bb does not affect the only relevant characteristic, bb's do not cause changes of state.

Example DFA



Example: Find a DFA with input alphabet Σ={a,b}\Sigma = \{a,b\} that accepts the language LL = {wΣ w contains three consecutive a’s }\{w\in\Sigma^* \ | w \textrm{ contains three consecutive a's } \}.

Again, it is not quite so simple as making a two-state machine where the states correspond to "have seen aaaaaa" and "have not seen aaaaaa". Think dynamically: as you move through the input string, how do you arrive at the goal of having seen three consecutive aa's? You might have seen two consecutive aa's and still need a third, or you might just have seen one aa and be looking for two more to come immediately, or you might just have seen a bb and be right back at the beginning as far as seeing 3 consecutive aa's goes. So this time there will be four states, with the "last symbol was not an aa" state being the start state. The complete automaton is shown below.

Example DFA


Exercises

  1. Give DFAs that accept the following languages over Σ={a,b}\Sigma =\{a,b\}.

    • L1={x  x contains the substring aba}L_1= \{ x \ | \ x \textrm{ contains the substring } aba\}
    • L2=L(ab)L_2= L(a^*b^*)
    • L3={x  na(x)+nb(x) is even }L_3= \{ x \ | \ n_a(x)+n_b(x) \textrm{ is even }\}
    • L4={x  na(x) is a multiple of 5 }L_4= \{ x \ | \ n_a(x) \textrm{ is a multiple of 5 }\}
    • L5={x  x does not contain the substring abb}L_5= \{ x \ | \ x \textrm{ does not contain the substring } abb\}
    • L6={x  x has no a’s in the even positions}L_6= \{ x \ | \ x \textrm{ has no a's in the even positions} \}
    • L7=L(aaabab)L_7 = L(aa^* | aba^*b^*)
  2. What languages do the following DFAs accept?

Example DFA

Answer

Strings of aa's and bb's with no more than two consecutive bb's.

Example DFA

Answer

Strings of aa's and bb's with at least one of each.

  1. Let Σ={0,1}\Sigma=\{0,1\}. Give a DFA that accepts the language
    L={xΣ  x is the binary representation of an integer divisible by 3}.L = \{ x \in \Sigma^* \ | \ x \textrm{ is the binary representation of an integer divisible by 3}\}.

  1. δ\delta^* can be defined formally by saying that δ(q,ε)=q\delta^*(q,\varepsilon)=q for every state qq, and δ(q,ax)=δ(δ(q,a),x)\delta^*(q,ax)=\delta^*(\delta(q,a),x) for any state qq, aΣa\in\Sigma and xΣx\in\Sigma^*. Note that this is a recursive definition. You may also recognize it as the "left fold" of δ\delta over the string ww.