Skip to main content

Pushdown Automata

(Content adapted from Critchlow & Eck)

In the previous chapter, we saw that there is a neat correspondence between regular expressions and finite automata. That is, a language is generated by a regular expression if and only if that language is accepted by a finite automaton. Finite automata come in two types, deterministic and nondeterministic, but the two types of finite automata are equivalent in terms of their ability to recognize languages. So, the class of regular languages can be defined in two ways: either as the set of languages that can be generated by regular expressions or as the set of languages that can be recognized by finite automata (either deterministic or nondeterministic).

In this chapter, we have introduced the class of context-free languages, and we have considered how context-free grammars can be used to generate context-free languages. You might wonder whether there is any type of automaton that can be used to recognize context-free languages. In fact, there is: The abstract machines known as pushdown automata can be used to define context-free languages. That is, a language is context-free if and only if there is a pushdown automaton that accepts that language.

A pushdown automaton is essentially a finite automaton with an auxiliary data structure known as a stack. A stack consists of a finite list of symbols. Symbols can be added to and removed from the list, but only at one end of the list. The end of the list where items can be added and removed is called the top of the stack. The list is usually visualized as a vertical "stack" of symbols, with items being added and removed at the top. Adding a symbol at the top of the stack is referred to as pushing a symbol onto the stack, and removing a symbol is referred to as popping an item from the stack. During each step of its computation, a pushdown automaton is capable of doing several push and pop operations on its stack (this in addition to possibly reading a symbol from the input string that is being processed by the automaton).

Before giving a formal definition of pushdown automata, we will look at how they can be represented by transition diagrams. A diagram of a pushdown automaton is similar to a diagram for an NFA, except that each transition in the diagram can involve stack operations. We will use a label of the form σ,x\sigma,x/yy on a transition to mean that the automaton consumes σ\sigma from its input string, pops xx from the stack, and pushes yy onto the stack. σ\sigma can be either ε\varepsilon or a single symbol. xx and yy are strings, possibly empty. (When a string x=a1a2amx=a_1a_2\ldots a_m is popped from the stack, a1a_1 must be the top symbol on the stack, followed by a2a_2, etc.; for y=b1b2bny=b_1b_2\ldots b_n to be pushed onto a stack, the symbols are pushed in the order bn,,b1b_n,\ldots,b_1, so that b1b_1 ends up on the top of the stack.) For example, consider the following transition diagram for a pushdown automaton:

Sample PDA

This pushdown automaton has start state q0q_0 and one accepting state, q1q_1. It can read strings over the alphabet Σ={a,b}\Sigma=\{a,b\}. The transition from q0q_0 to q0q_0, labeled with a,εa,\varepsilon/1, means that if the machine is in state q0q_0, then it can read an aa from its input string, pop nothing from the stack, push 1 onto the stack, and remain in state q0q_0. Similarly, the transition from q1q_1 to q1q_1 means that if the machine is in state q1q_1, it can read a bb from its input string, pop a 1 from the stack, and push nothing onto the stack. Finally, the transition from state q0q_0 to q1q_1, labeled with ε,ε\varepsilon,\varepsilon/ε\varepsilon, means that the machine can transition from state q0q_0 to state q1q_1 without reading, pushing, or popping anything.

Note that the automation can follow transition b,1b,1/ε\varepsilon only if the next symbol in the input string is bb and if 11 is on the top of the stack. When it makes the transition, it consumes the bb from input and pops the 11 from the stack. Since in this case, the automaton pushes ε\varepsilon (that is, no symbols at all) onto the stack, the net change in the stack is simply to pop the 1.

We have to say what it means for this pushdown automaton to accept a string. For w{a,b}w\in\{a,b\}^*, we say that the pushdown automaton accepts ww if and only if it is possible for the machine to start in its start state, q0q_0, read all of ww, and finish in the accepting state, q1q_1, with an empty stack. Note in particular that it is not enough for the machine to finish in an accepting stateit must also empty the stack.1

It's not difficult to see that with this definition, the language accepted by our pushdown automaton is {anbn    nN}\{a^nb^n\;|\; n\in\N\}. In fact, given the string w=akbkw=a^kb^k, the machine can process this string by following the transition from q0q_0 to q0q_0 kk times. This will consume all the aa's and will push kk 1's onto the stack. The machine can then jump to state q1q_1 and follow the transition from q1q_1 to q1q_1 kk times. Each time it does so, it consumes one bb from the input and pops one 1 from the stack. At the end, the input has been completely consumed and the stack is empty. So, the string ww is accepted by the automaton. Conversely, this pushdown automaton only accepts strings of the form akbka^kb^k, since the only way that the automaton can finish in the accepting state, q1q_1, is to follow the transition from q0q_0 to q0q_0 some number of times, reading aa's as it does so, then jump at some point to q1q_1, and then follow the transition from q1q_1 to q1q_1 some number of times, reading bb's as it does so. This means that an accepted string must be of the form akba^kb^\ell for some k,Nk,\ell\in\N. However, in reading this string, the automaton pushes kk 1's onto the stack and pops \ell 1's from the stack. For the stack to end up empty, \ell must equal kk, which means that in fact the string is of the form akbka^kb^k, as claimed.


Here are two more examples. These pushdown automata use the capability to push or pop more than one symbol at a time:

Sample PDAs

The automaton on the left accepts the language {anbm    nm2n}\{a^nb^m\;|\; n\le m\le 2n\}. Each time it reads an aa, it pushes either one or two 1's onto the stack, so that after reading nn aa's, the number of 1's on the stack is between nn and 2n2n. If the machine then jumps to state q1q_1, it must be able to read exactly enough bb's to empty the stack, so any string accepted by this machine must be of the form anbma^nb^m with nm2nn\le m\le 2n. Conversely, any such string can be accepted by the machine. Similarly, the automaton on the right above accepts the language {anbm    n/2mn}\{a^nb^m\;|\; n/2 \le m \le n\}. To accept anbma^nb^m, it must push nn 1'a onto the stack and then pop one or two 1's for each b; this can succeed only if the number of bb's is between n/2n/2 and nn.


Note that an NFA can be considered to be a pushdown automaton that does not make any use of its stack. This means that any language that can be accepted by an NFA (that is, any regular language) can be accepted by a pushdown automaton. Since the language {anbn    nN}\{a^nb^n\;|\; n\in\N\} is context-free but not regular, and since it is accepted by the above pushdown automaton, we see that pushdown automata are capable of recognizing context-free languages that are not regular, therefore pushdown automata are strictly more powerful than finite automata.

Formal Definition

Although it is not particularly illuminating, we can give a formal definition of pushdown automaton. The definition does at least make it clear that the set of symbols that can be used on the stack is not necessarily the same as the set of symbols that can be used as input.

A pushdown automaton MM is specified by six components M=(Q,Σ,Λ,q0,,F)M=(Q,\Sigma,\Lambda,q_0,\partial,F) where

  • QQ is a finite set of states.
  • Σ\Sigma is an alphabet. Σ\Sigma is the input alphabet for MM.
  • Λ\Lambda is an alphabet. Λ\Lambda is the stack alphabet for MM.
  • q0Qq_0\in Q is the start state of MM.
  • FQF\subseteq Q is the set of final or accepting states in MM.
  • \partial is the set of transitions in MM. \partial can be taken to be a finite subset of the set (Q×(Σ{ε})×Λ)×(Q×Λ)(Q\times(\Sigma\cup\{\varepsilon\})\times\Lambda^*\big)\times\big(Q\times\Lambda^*\big). An element ((q1,σ,x),(q2,y))\big((q_1,\sigma,x),(q_2,y)\big) of \partial represents a transition from state q1q_1 to state q2q_2 in which MM reads σ\sigma from its input string, pops xx from the stack, and pushes yy onto the stack.

We can then define the language L(M)L(M) accepted by a pushdown automaton M=(Q,Σ,Λ,q0,,F)M=(Q,\Sigma,\Lambda,q_0,\partial,F) to be the set L(M)={wΣ    M can start from q0, read all of w, and end with an empty stack in a state in F}L(M)=\{w\in\Sigma^*\;|\;M\textrm{ can start from }q_0\textrm{, read all of }w\textrm{, and end with an empty stack in a state in }F\}.

With this definition, the class of languages accepted by pushdown automata is the same as the class of languages generated by context-free grammars.

Theorem: Let Σ\Sigma be an alphabet, and let LL be a language over LL. Then LL is context-free if and only if there is a pushdown automaton whose input alphabet is Σ\Sigma such that L=L(M)L=L(M).

We will not prove this theorem, but we do discuss how one direction can be proved. Suppose that LL is a context-free language over an alphabet Σ\Sigma. Let G=(V,Σ,P,S)G=(V,\Sigma,P,S) be a context-free grammar for LL. Then we can construct a pushdown automaton MM that accepts LL. In fact, we can take M=(Q,Σ,Λ,q0,,F)M=(Q,\Sigma,\Lambda,q_0,\partial,F) where Q={q0,q1}Q=\{q_0,q_1\}, Λ=ΣV\Lambda=\Sigma\cup V, F={q1}F=\{q_1\}, and \partial contains transitions of the forms

  • ((q0,ε,ε),(q1,S))\big((q_0,\varepsilon,\varepsilon),(q_1,S)\big);
  • ((q1,σ,σ),(q1,ε))\big((q_1,\sigma,\sigma),(q_1,\varepsilon)\big), for σΣ\sigma\in\Sigma; and
  • ((q1,ε,A),(q1,x))\big((q_1,\varepsilon,A),(q_1,x)\big), for each production AxA\longrightarrow x in GG.

The transition ((q0,ε,ε),(q1,S))\big((q_0,\varepsilon,\varepsilon),(q_1,S)\big) lets MM move from the start state q0q_0 to the accepting state q1q_1 while reading no input and pushing SS onto the stack. This is the only possible first move by MM.

A transition of the form ((q1,σ,σ),(q1,ε))\big((q_1,\sigma,\sigma),(q_1,\varepsilon)\big), for σΣ\sigma\in\Sigma, allows MM to read σ\sigma from its input string, provided there is a σ\sigma on the top of the stack. Note that if σ\sigma is at the top of the stack, then this transition is the only transition that applies. Effectively, any terminal symbol that appears at the top of the stack must be matched by the same symbol in the input string, and the transition rule allows MM to consume the symbol from the input string and remove it from the stack at the same time.

A transition of the third form, ((q1,ε,A),(q1,x))\big((q_1,\varepsilon,A),(q_1,x)\big), can be applied if and only if the non-terminal symbol AA is at the top of the stack. MM consumes no input when this rule is applied, but AA is replaced on the top of the stack by the string on the right-hand side of the production rule AxA\longrightarrow x. Since the grammar GG can contain several production rules that have AA as their left-hand side, there can be several transition rules in MM that apply when AA is on the top of the stack. This is the only source of nondeterminism in MM; note that is also the source of nondeterminism in GG.

The proof that L(M)=L(G)L(M)=L(G) follows from the fact that a computation of MM that accepts a string wΣw\in\Sigma^* corresponds in a natural way to a left derivation of ww from GG's start symbol, SS. Instead of giving a proof of this fact, we look at an example. Consider the following context-free grammar:

SABAaAbAεBbBBb\begin{aligned} S&\longrightarrow AB\\ A&\longrightarrow aAb\\ A&\longrightarrow \varepsilon\\ B&\longrightarrow bB\\ B&\longrightarrow b \end{aligned}

This grammar generates the language {anbm    m>n}\{a^nb^m\;|\; m > n\}. The pushdown automaton constructed from this grammar by the procedure given above has the following set of transition rules:

((q0,ε,ε),(q1,S))((q1,a,a),(q1,ε))((q1,b,b),(q1,ε))((q1,ε,S),(q1,AB))((q1,ε,A),(q1,aAb))((q1,ε,A),(q1,ε))((q1,ε,B),(q1,bB))((q1,ε,B),(q1,b))\begin{aligned} &\big((q_0,\varepsilon,\varepsilon),(q_1,S)\big)\\ &\big((q_1,a,a),(q_1,\varepsilon)\big)\\ &\big((q_1,b,b),(q_1,\varepsilon)\big)\\ &\big((q_1,\varepsilon,S),(q_1,AB)\big)\\ &\big((q_1,\varepsilon,A),(q_1,aAb)\big)\\ &\big((q_1,\varepsilon,A),(q_1,\varepsilon)\big)\\ &\big((q_1,\varepsilon,B),(q_1,bB)\big)\\ &\big((q_1,\varepsilon,B),(q_1,b)\big) \end{aligned}

Suppose that the automaton is run on the input aabbbbaabbbb. We can trace the sequence of transitions that are applied in a computation that accepts this input, and we can compare that computation to a left derivation of the string:

TransitionInput ConsumedStackDerivationProduction
((q0,ε,ε),(q1,S))\big((q_0,\varepsilon,\varepsilon),(q_1,S)\big)SS
((q1,ε,S),(q1,AB))\big((q_1,\varepsilon,S),(q_1,AB)\big)ABABSABS\Longrightarrow ABSABS\longrightarrow AB
((q1,ε,A),(q1,aAb))\big((q_1,\varepsilon,A),(q_1,aAb)\big)aAbBaAbBaAbB\Longrightarrow aAbBAaAbA\longrightarrow aAb
((q1,a,a),(q1,ε))\big((q_1,a,a),(q_1,\varepsilon)\big)aaAbBAbB
((q1,ε,A),(q1,aAb))\big((q_1,\varepsilon,A),(q_1,aAb)\big)aaaAbbBaAbbBaaAbbB\Longrightarrow aaAbbBAaAbA\longrightarrow aAb
((q1,a,a),(q1,ε))\big((q_1,a,a),(q_1,\varepsilon)\big)aaaaAbbBAbbB
((q1,ε,A),(q1,ε))\big((q_1,\varepsilon,A),(q_1,\varepsilon)\big)aaaabbBbbBaabbB\Longrightarrow aabbBAεA\longrightarrow\varepsilon
((q1,b,b),(q1,ε))\big((q_1,b,b),(q_1,\varepsilon)\big)aabaabbBbB
((q1,b,b),(q1,ε))\big((q_1,b,b),(q_1,\varepsilon)\big)aabbaabbBB
((q1,ε,B),(q1,bB))\big((q_1,\varepsilon,B),(q_1,bB)\big)aabbaabbbBbBaabbbB\Longrightarrow aabbbBBbBB\longrightarrow bB
((q1,b),(q1,b,ε))\big((q_1,b),(q_1,b,\varepsilon)\big)aabbbaabbbBB
((q1,ε,B),(q1,b))\big((q_1,\varepsilon,B),(q_1,b)\big)aabbbaabbbbbaabbbb\Longrightarrow aabbbbBbB\longrightarrow b
((q1,b,b),(q1,ε))\big((q_1,b,b),(q_1,\varepsilon)\big)aabbbbaabbbb

Note that at all times during this computation, the concatenation of the input that has been consumed so far with the contents of the stack is equal to one of the strings in the left derivation. Application of a rule of the form ((q1,σ,σ),(q1,ε))\big((q_1,\sigma,\sigma),(q_1,\varepsilon)\big) has the effect of removing one terminal symbol from the "Stack" column to the "Input Consumed" column. Application of a rule of the form ((q1,ε,A),(q1,x))\big((q_1,\varepsilon,A),(q_1,x)\big) has the effect of applying the next step in the left derivation to the non-terminal symbol on the top of the stack. (In the "Stack" column, the pushdown automaton's stack is shown with its top on the left.) In the end, the entire input string has been consumed and the stack is empty, which means that the string has been accepted by the pushdown automaton. It should be easy to see that for any context free grammar GG, the same correspondence will always hold between left derivations and computations performed by the pushdown automaton constructed from GG.


The computation of a pushdown automaton can involve nondeterminism. That is, at some point in the computation, there might be more than one transition rule that apply. When this is not the casethat is, when there is no circumstance in which two different transition rules applythen we say that the pushdown automaton is deterministic. Note that a deterministic pushdown automaton can have transition rules of the form ((qi,ε,x),(qj,y))\big((q_i,\varepsilon,x),(q_j,y)\big) (or even ((qi,ε,ε),(qj,y))\big((q_i,\varepsilon,\varepsilon),(q_j,y)\big) if that is the only transition from state qiq_i). Note also that is is possible for a deterministic pushdown automaton to get "stuck"; that is, it is possible that no rules apply in some circumstances even though the input has not been completely consumed or the stack is not empty. If a deterministic pushdown automaton gets stuck while reading a string xx, then xx is not accepted by the automaton.

The automaton given at the beginning of this section, which accepts the language {anbn    nN}\{a^nb^n\;|\; n\in\N\}, is not deterministic. However, it is easy to construct a deterministic pushdown automaton for this language:

DPDA for a^nb^n

However, consider the language {wwR    w{a,b}}\{ww^R\;|\; w\in\{a,b\}^*\}. Here is a pushdown automaton that accepts this language:

PDA for ww^R

In state q0q_0, this machine copies the first part of its input string onto the stack. In state q1q_1, it tries to match the remainder of the input against the contents of the stack. In order for this to work, it must "guess" where the middle of the string occurs by following the transition from state q0q_0 to state q1q_1. In this case, it is by no means clear that it is possible to construct a deterministic pushdown automaton that accepts the same language.


At this point, it might be tempting to define a deterministic context-free language as one for which there exists a deterministic pushdown automaton which accepts that language. However, there is a technical problem with this definition: we need to make it possible for the pushdown automaton to detect the end of the input string. Consider the language {w    w{a,b}na(w)=nb(w)}\{w\;|\; w\in\{a,b\}^* \land n_a(w)=n_b(w)\}, which consists of strings over the alphabet {a,b}\{a,b\} in which the number of aa's is equal to the number of bb's. This language is accepted by the following pushdown automaton:

Example PDA

In this automaton, a cc is first pushed onto the stack, and it remains on the bottom of the stack until the computation ends. During the process of reading an input string, if the machine is in state q3q_3, then the number of aa's that have been read is greater than or equal to the number of bb's that have been read, and the stack contains (copies of) the excess aa's that have been read. Similarly, if the machine is in state q4q_4, then the number of bb's that have been read is greater than or equal to the number of aa's that have been read, and the stack contains (copies of) the excess bb's that have been read. As the computation proceeds, if the stack contains nothing but a cc, then the number of aa's that have been consumed by the machine is equal to the number of bb's that have been consumed; in such cases, the machine can pop the cc from the stackleaving the stack emptyand jump to state q2q_2. If the entire string has been read at that time, then the string is accepted. This involves nondeterminism because the automaton has to "guess" when to jump to state q2q_2; it has no way of knowing whether it has actually reached the end of the string.

Although this pushdown automaton is not deterministic, we can modify it easily to get a deterministic pushdown automaton that accepts a closely related language. We just have to add a special end-of-string symbol to the language. We use the symbol $\$ for this purpose. The following deterministic automaton accepts the language {w$    w{a,b}na(w)=nb(w)}\{w\$\;|\; w\in \{a,b\}^*\land n_a(w)=n_b(w)\}\,:

Example DPDA

In this modified automaton, it is only possible for the machine to reach the accepting state q2q_2 by reading the end-of-string symbol at a time when the number of aa's that have been consumed is equal to the number of bb's. Taking our cue from this example, we define what it means for a language to be deterministic context-free as follows:

Let LL be a language over an alphabet Σ\Sigma, and let $\$ be a symbol that is not in Σ\Sigma. We say that LL is a deterministic context-free language if there is a deterministic pushdown automaton that accepts the language L$L\$ (which is equal to {w$    wL}\{w\$\;|\; w\in L\}).

There are context-free languages that are not deterministic context-free. An example, given without proof, is the language of even-length palindromes over the alphabet {a,b}\{a, b\}: {wwR    w{a,b}}\{ww^R\;|\; w\in\{a,b\}^*\}. This means that for pushdown automata, nondeterminism adds real power. This contrasts with the case of finite automata, where deterministic finite automata and nondeterministic finite automata are equivalent in power in the sense that they accept the same class of languages.

A deterministic context-free language can be parsed efficiently (in O(n)O(n) time, where nn is the length of the input). LL(1) parsing and LR(1) parsing can both be defined in terms of deterministic pushdown automata, although we have not pursued that approach here. It can be shown that every deterministic context-free language can be generated by an LR(1) grammar. However, the languages that have LL(1) grammars are a strict subset: an example of a language that has an LR(1) grammar but not an LL(1) grammar is {anbn    nN}{ancn    nN}\{a^nb^n\;|\; n\in\N\}\cup\{a^nc^n\;|\; n\in\N\}.

Exercises

  1. Identify the context-free language that is accepted by each of the following pushdown automata. Explain your answers.

    a)

Exercise PDA

b)

Exercise PDA

c)

Exercise PDA

d)

Exercise PDA

  1. Let BB be the language over the alphabet {(,)}\{\,\texttt{(}\,,\texttt{)}\,\} that consists of strings of parentheses that are balanced in the sense that every left parenthesis has a matching right parenthesis. Examples include ()\texttt{()}, (())()\texttt{(())()}, ((())())()(())\texttt{((())())()(())}, and the empty string. Find a deterministic pushdown automaton with a single state that accepts the language BB. Explain how your automaton works, and explain the circumstances in which it will fail to accept a given string of parentheses.

  2. Suppose that LL is a language over an alphabet Σ\Sigma. Suppose that there is a deterministic pushdown automaton that accepts LL. Show that LL is deterministic context-free. That is, show how to construct a deterministic pushdown automaton that accepts the language L$L\$. (Assume that the symbol $\$ is not in Σ\Sigma.)

  3. Find a deterministic pushdown automaton that accepts the language {wcwR    w{a,b}}\{wcw^R\;|\; w\in\{a,b\}^*\}.

  4. Show that the language {anbm    nm}\{a^nb^m\;|\; n\not=m\} is deterministic context-free.

  5. Show that the language L={w{a,b}    na(w)>nb(w)}L=\{w\in\{a,b\}^*\;|\; n_a(w) > n_b(w)\} is deterministic context-free.

  6. Let M=(Q,Σ,Λ,q0,,F)M=(Q,\Sigma,\Lambda,q_0,\partial,F) be a pushdown automaton. Define L(M)L^\prime(M) to be the language L(M)={wΣ L^\prime(M)=\{w\in\Sigma^*\ | it is possible for MM to start in state q0q_0, read all of ww, and end in an accepting state}\}. L(M)L^\prime(M) differs from L(M)L(M) in that for wL(M)w\in L^\prime(M), we do not require that the stack be empty at the end of the computation.

    • Show that there is a pushdown automaton MM^\prime such that L(M)=L(M)L(M^\prime)=L^\prime(M).
    • Show that a language LL is context-free if and only if there is a pushdown automaton MM such that L=L(M)L=L^\prime(M).
    • Identify the language L(M)L^\prime(M) for each of the automata in Exercise 1.
  7. Let LL be a regular language over an alphabet Σ\Sigma, and let KK be a context-free language over the same alphabet. Let M=(Q,Σ,q0,δ,F)M=(Q,\Sigma,q_0,\delta,F) be a DFA that accepts LL, and let N=(P,Σ,Λ,p0,,E))N=(P,\Sigma,\Lambda,p_0,\partial,E)) be a pushdown automaton that accepts KK. Show that the language LKL\cap K is context-free by constructing a pushdown automaton that accepts LKL\cap K. The pushdown automaton can be constructed as a "cross product" of MM and NN in which the set of states is Q×PQ\times P. The construction is analogous to the proof that the intersection of two regular languages is regular, as outlined in an exercise in the section on regular languages.


  1. We could relax this restriction and require only that the machine finish in an accepting state after reading the string ww, without requiring that the stack be empty. In fact, using this definition of accepting would not change the class of languages that are accepted by pushdown automata.