Introduction to the Theory of Computation, Third Edition

Version: 4 (current) | Updated: 11/12/2025, 10:32:48 PM

Description

Overview

This is a digital copy of the third edition of Michael Sipser’s textbook on the theory of computation. The file is a 25‑megabyte PDF (filename: Introduction to the theory of computation_third edition - Michael Sipser.pdf), hosted on the Arke Institute CDN and identified by the IPFS CID `bafkreiapw4caqqlid6oo7nczbqaggcuiui5whfa6k5bouvqndgkpztgccy`. The edition was published in 2012 by Cengage Learning and is written in English.

Background

Michael Sipser, a professor of computer science, has authored several foundational texts in theoretical computer science. The third edition, released in 2012, updates earlier versions with new material on complexity theory and recent developments in the field. Cengage Learning holds the copyright, and the electronic version is subject to restrictions: it may not be copied, scanned, or duplicated in whole or part.

Contents

The book is organized into chapters covering the core topics of automata theory, formal languages, computability, and computational complexity. Each chapter presents definitions, theorems, proofs, and illustrative examples, followed by exercises that reinforce the material. The text includes appendices on mathematical preliminaries and a glossary of key terms. The ISBN‑13 is 978‑1‑133‑18779‑0, and the ISBN‑10 is 1‑133‑18779‑X.

Scope

The textbook addresses the theoretical foundations of computer science, focusing on formal models of computation and their limitations. It covers deterministic and nondeterministic finite automata, context‑free grammars, Turing machines, decidability, and the P vs. NP problem, among other topics. The scope is limited to theoretical concepts; it does not delve into implementation details, programming languages, or practical software engineering. The material is intended for upper‑level undergraduate or graduate courses in computer science.

Raw Cheimarros Data

@book:document {title: "Introduction to the Theory of Computation, Third Edition", creator: @michael_sipser, publisher: @cengage_learning, year: @date_2012, isbn_13: "978-1-133-18779-0", isbn_10: "1-133-18779-X", language: "en", subjects: ["Theory of computation","Computer science","Textbooks"]}  
@michael_sipser:person {full_name: "Michael Sipser"}  
@cengage_learning:organization {name: "Cengage Learning"}  

@file_pinax:metadata {id: "01K9W88VRNVERAZNDKJRB5Y76F", type: "Text", creator: @michael_sipser, created: @date_2012, language: "en", subjects: ["Theory of computation","Computer science","Textbooks"], description: "Electronic version of the print textbook on the theory of computation"}  
@file_introduction_to_the_theory_of_computation_third_edition_michael_sipser_text:file {type: "text"}  

@file_pinax -> documents -> @book  
@file_introduction_to_the_theory_of_computation_third_edition_michael_sipser_text -> contains_text_of -> @book  

--- Concepts introduced in the text (content‑specific) ---  

@finite_automaton:concept {description: "Deterministic finite automaton (DFA), a 5‑tuple (Q, Σ, δ, q₀, F)"}  
@nondeterministic_finite_automaton:concept {description: "NFA, like a DFA but transition function maps to sets of states and may include ε‑moves"}  
@regular_language:concept {definition: "A language recognized by some finite automaton"}  
@regular_expression:concept {definition: "Algebraic description of regular languages using union, concatenation, and star"}  
@context_free_grammar:concept {definition: "A 4‑tuple (V, Σ, R, S) generating context‑free languages"}  
@pushdown_automaton:concept {description: "PDA, a 6‑tuple (Q, Σ, Γ, δ, q₀, F) with a stack"}  
@context_free_language:concept {definition: "A language generated by a CFG or recognized by a PDA"}  
@pumping_lemma_regular:concept {statement: "If L is regular, then ∃p such that every s∈L with |s|≥p can be written s=xyz with |xy|≤p, |y|>0, and ∀i≥0, xyⁱz∈L"}  
@pumping_lemma_cfl:concept {statement: "If L is context‑free, then ∃p such that every s∈L with |s|≥p can be written s=uvxyz with |vxy|≤p, |vy|>0, and ∀i≥0, uvⁱxyⁱz∈L"}  
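The 5‑tuple definition of a DFA above can be made concrete with a short sketch. The `run_dfa` helper and the even‑zeros machine are illustrative names, not taken from the book:

```python
# Minimal sketch of the 5-tuple (Q, Σ, δ, q0, F) from @finite_automaton.
# All names here (run_dfa, even_zeros) are illustrative.

def run_dfa(delta, start, accept, s):
    """Follow the transition function delta from the start state over
    string s; accept iff the final state lies in the accept set."""
    state = start
    for symbol in s:
        state = delta[(state, symbol)]
    return state in accept

# A DFA over {0,1} accepting strings with an even number of 0s:
# Q = {even, odd}, q0 = even, F = {even}.
even_zeros = {
    ("even", "0"): "odd",  ("even", "1"): "even",
    ("odd", "0"): "even",  ("odd", "1"): "odd",
}

print(run_dfa(even_zeros, "even", {"even"}, "0100"))  # three 0s -> False
print(run_dfa(even_zeros, "even", {"even"}, "1001"))  # two 0s  -> True
```

Because δ is total on Q × Σ, the loop body is a single dictionary lookup per input symbol, mirroring the formal definition of computation.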

--- Relationships among concepts (as presented in the book) ---  

@finite_automaton -> defines -> @regular_language  
@nondeterministic_finite_automaton -> equivalent_to -> @finite_automaton (in expressive power)  
@regular_expression -> equivalent_to -> @regular_language  
@regular_language -> closed_under -> @union_operation:concept  
@regular_language -> closed_under -> @concatenation_operation:concept  
@regular_language -> closed_under -> @star_operation:concept  
@context_free_grammar -> generates -> @context_free_language  
@pushdown_automaton -> recognizes -> @context_free_language  
@regular_language -> subset_of -> @context_free_language  
@pumping_lemma_regular -> used_to_prove -> @nonregular_language:concept  
@pumping_lemma_cfl -> used_to_prove -> @noncfl_language:concept  
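The NFA/DFA equivalence asserted above is witnessed by the powerset (subset) construction. A compact sketch, restricted to ε‑move‑free NFAs for brevity; all names are illustrative:

```python
from collections import deque

def subset_construction(alphabet, delta, start, accept):
    """Powerset construction: build a DFA whose states are frozensets of
    NFA states. `delta` maps (state, symbol) -> set of successor states;
    ε-moves are omitted for brevity."""
    start_set = frozenset([start])
    dfa_delta, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        S = queue.popleft()
        for a in alphabet:
            # The DFA successor of S on a is the union of NFA successors.
            T = frozenset(q for s in S for q in delta.get((s, a), ()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    dfa_accept = {S for S in seen if S & accept}
    return dfa_delta, start_set, dfa_accept

# NFA accepting strings over {0,1} that end in "01".
nfa_delta = {("a", "0"): {"a", "b"}, ("a", "1"): {"a"}, ("b", "1"): {"c"}}
dfa_delta, q0, F = subset_construction("01", nfa_delta, "a", {"c"})

def accepts(s):
    state = q0
    for ch in s:
        state = dfa_delta[(state, ch)]
    return state in F

print(accepts("101"))  # ends in 01 -> True
print(accepts("110"))  # ends in 0  -> False
```

Only reachable subsets are materialized, so the DFA has at most 2^|Q| states but usually far fewer.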

--- Sample specific entities from the text (examples) ---  

@machine_m1:concept {type: "finite_automaton", states: ["q1","q2","q3"], alphabet: ["0","1"], start_state: "q1", accept_states: ["q2"], transitions: {"(q1,0)":"q1","(q1,1)":"q2","(q2,0)":"q3","(q2,1)":"q2","(q3,0)":"q2","(q3,1)":"q2"}}  
@machine_m2:concept {type: "finite_automaton", states: ["q1","q2"], alphabet: ["0","1"], start_state: "q1", accept_states: ["q2"], transitions: {"(q1,0)":"q1","(q1,1)":"q2","(q2,0)":"q1","(q2,1)":"q2"}}  

@machine_m1 -> example_of -> @finite_automaton  
@machine_m2 -> example_of -> @finite_automaton  
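The two example machines can be exercised directly. The transition tables below are transcribed from the @machine_m1 and @machine_m2 entities above; the `accepts` helper itself is an illustrative sketch:

```python
# Transition tables transcribed from @machine_m1 and @machine_m2;
# the accepts() helper is illustrative, not from the book.

M1 = {"start": "q1", "accept": {"q2"},
      "delta": {("q1", "0"): "q1", ("q1", "1"): "q2",
                ("q2", "0"): "q3", ("q2", "1"): "q2",
                ("q3", "0"): "q2", ("q3", "1"): "q2"}}

M2 = {"start": "q1", "accept": {"q2"},
      "delta": {("q1", "0"): "q1", ("q1", "1"): "q2",
                ("q2", "0"): "q1", ("q2", "1"): "q2"}}

def accepts(machine, s):
    """Run the machine's transition table over s and test acceptance."""
    state = machine["start"]
    for symbol in s:
        state = machine["delta"][(state, symbol)]
    return state in machine["accept"]

print(accepts(M1, "1101"))  # q1 -> q2 -> q2 -> q3 -> q2: True
print(accepts(M2, "0110"))  # does not end in 1: False
```

M2 in particular accepts exactly the strings that end in 1, which is easy to confirm by tracing its two states.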

--- Additional structural entities ---  

@chapter1:document {title: "Regular Languages", part_of: @book}  
@chapter2:document {title: "Context‑Free Languages", part_of: @book}  

@chapter1 -> contains -> @finite_automaton  
@chapter1 -> contains -> @regular_expression  
@chapter2 -> contains -> @context_free_grammar  
@chapter2 -> contains -> @pushdown_automaton  

--- End of knowledge graph extraction ---

Metadata

Files (1)

Version History (4 versions)

  • ✓ v4 (current) · 11/12/2025, 10:32:48 PM
    "Added description"
  • v3 · 11/12/2025, 10:31:51 PM · View this version
    "Added knowledge graph extraction"
  • v2 · 11/12/2025, 2:42:16 PM · View this version
    "Added PINAX metadata"
  • v1 · 11/12/2025, 2:42:07 PM · View this version
    "Reorganization group: Textbook_References"

Additional Components

Introduction to the theory of computation_third edition - Michael Sipser_text.txt
File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 2 ---
This is an electronic version of the print textbook. Due to electronic rights restrictions, some third party content may be suppressed. Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. The publisher reserves the right to remove content from this title at any time if subsequent rights restrictions require it. For valuable information on pricing, previous editions, changes to current editions, and alternate formats, please visit www.cengage.com/highered to search by ISBN#, author, title, or keyword for materials in your areas of interest.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 3 ---
Introduction to the Theory of Computation
THIRD EDITION
MICHAEL SIPSER
Australia • Brazil • Japan • Korea • Mexico • Singapore • Spain • United Kingdom • United States

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 4 ---
Introduction to the Theory of Computation, Third Edition
Michael Sipser
Editor-in-Chief: Marie Lee
Senior Product Manager: Alyssa Pratt
Associate Product Manager: Stephanie Lorenz
Content Project Manager: Jennifer Feltri-George
Art Director: GEX Publishing Services
Associate Marketing Manager: Shanna Shelton
Cover Designer: Wing-ip Ngan, Ink design, inc.
Cover Image Credit: ©Superstock

© 2013 Cengage Learning

ALL RIGHTS RESERVED. No part of this work covered by the copyright herein may be reproduced, transmitted, stored or used in any form or by any means graphic, electronic, or mechanical, including but not limited to photocopying, recording, scanning, digitizing, taping, Web distribution, information networks, or information storage and retrieval systems, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without the prior written permission of the publisher.

Library of Congress Control Number: 2012938665
ISBN-13: 978-1-133-18779-0
ISBN-10: 1-133-18779-X

Cengage Learning
20 Channel Center Street
Boston, MA 02210
USA

Cengage Learning is a leading provider of customized learning solutions with office locations around the globe, including Singapore, the United Kingdom, Australia, Mexico, Brazil, and Japan. Locate your local office at: international.cengage.com/region

Cengage Learning products are represented in Canada by Nelson Education, Ltd.

For your lifelong learning solutions, visit www.cengage.com

Cengage Learning reserves the right to revise this publication and make changes from time to time in its content without notice.

The programs in this book are for instructional purposes only. They have been tested with care, but are not guaranteed for any particular intent beyond educational purposes. The author and the publisher do not offer any warranties or representations, nor do they accept any liabilities with respect to the programs.

Printed in the United States of America
1 2 3 4 5 6 7 8 16 15 14 13 12

For product information and technology assistance, contact us at Cengage Learning Customer & Sales Support, 1-800-354-9706
For permission to use material from this text or product, submit all requests online at cengage.com/permissions
Further permissions questions can be emailed to permissionrequest@cengage.com

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 5 ---
To Ina, Rachel, and Aaron

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 7 ---
CONTENTS

Preface to the First Edition xi
  To the student xi
  To the educator xii
  The first edition xiii
  Feedback to the author xiii
  Acknowledgments xiv
Preface to the Second Edition xvii
Preface to the Third Edition xxi

0 Introduction 1
  0.1 Automata, Computability, and Complexity 1
    Complexity theory 2
    Computability theory 3
    Automata theory 3
  0.2 Mathematical Notions and Terminology 3
    Sets 3
    Sequences and tuples 6
    Functions and relations 7
    Graphs 10
    Strings and languages 13
    Boolean logic 14
    Summary of mathematical terms 16
  0.3 Definitions, Theorems, and Proofs 17
    Finding proofs 17
  0.4 Types of Proof 21
    Proof by construction 21
    Proof by contradiction 21
    Proof by induction 22
  Exercises, Problems, and Solutions 25

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 8 ---
Part One: Automata and Languages 29

1 Regular Languages 31
  1.1 Finite Automata 31
    Formal definition of a finite automaton 35
    Examples of finite automata 37
    Formal definition of computation 40
    Designing finite automata 41
    The regular operations 44
  1.2 Nondeterminism 47
    Formal definition of a nondeterministic finite automaton 53
    Equivalence of NFAs and DFAs 54
    Closure under the regular operations 58
  1.3 Regular Expressions 63
    Formal definition of a regular expression 64
    Equivalence with finite automata 66
  1.4 Nonregular Languages 77
    The pumping lemma for regular languages 77
  Exercises, Problems, and Solutions 82

2 Context-Free Languages 101
  2.1 Context-Free Grammars 102
    Formal definition of a context-free grammar 104
    Examples of context-free grammars 105
    Designing context-free grammars 106
    Ambiguity 107
    Chomsky normal form 108
  2.2 Pushdown Automata 111
    Formal definition of a pushdown automaton 113
    Examples of pushdown automata 114
    Equivalence with context-free grammars 117
  2.3 Non-Context-Free Languages 125
    The pumping lemma for context-free languages 125
  2.4 Deterministic Context-Free Languages 130
    Properties of DCFLs 133
    Deterministic context-free grammars 135
    Relationship of DPDAs and DCFGs 146
    Parsing and LR(k) Grammars 151
  Exercises, Problems, and Solutions 154

Part Two: Computability Theory 163

3 The Church–Turing Thesis 165
  3.1 Turing Machines 165
    Formal definition of a Turing machine 167

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 9 ---
    Examples of Turing machines 170
  3.2 Variants of Turing Machines 176
    Multitape Turing machines 176
    Nondeterministic Turing machines 178
    Enumerators 180
    Equivalence with other models 181
  3.3 The Definition of Algorithm 182
    Hilbert's problems 182
    Terminology for describing Turing machines 184
  Exercises, Problems, and Solutions 187

4 Decidability 193
  4.1 Decidable Languages 194
    Decidable problems concerning regular languages 194
    Decidable problems concerning context-free languages 198
  4.2 Undecidability 201
    The diagonalization method 202
    An undecidable language 207
    A Turing-unrecognizable language 209
  Exercises, Problems, and Solutions 210

5 Reducibility 215
  5.1 Undecidable Problems from Language Theory 216
    Reductions via computation histories 220
  5.2 A Simple Undecidable Problem 227
  5.3 Mapping Reducibility 234
    Computable functions 234
    Formal definition of mapping reducibility 235
  Exercises, Problems, and Solutions 239

6 Advanced Topics in Computability Theory 245
  6.1 The Recursion Theorem 245
    Self-reference 246
    Terminology for the recursion theorem 249
    Applications 250
  6.2 Decidability of logical theories 252
    A decidable theory 255
    An undecidable theory 257
  6.3 Turing Reducibility 260
  6.4 A Definition of Information 261
    Minimal length descriptions 262
    Optimality of the definition 266
    Incompressible strings and randomness 267
  Exercises, Problems, and Solutions 270

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 10 ---
Part Three: Complexity Theory 273

7 Time Complexity 275
  7.1 Measuring Complexity 275
    Big-O and small-o notation 276
    Analyzing algorithms 279
    Complexity relationships among models 282
  7.2 The Class P 284
    Polynomial time 284
    Examples of problems in P 286
  7.3 The Class NP 292
    Examples of problems in NP 295
    The P versus NP question 297
  7.4 NP-completeness 299
    Polynomial time reducibility 300
    Definition of NP-completeness 304
    The Cook–Levin Theorem 304
  7.5 Additional NP-complete Problems 311
    The vertex cover problem 312
    The Hamiltonian path problem 314
    The subset sum problem 319
  Exercises, Problems, and Solutions 322

8 Space Complexity 331
  8.1 Savitch's Theorem 333
  8.2 The Class PSPACE 336
  8.3 PSPACE-completeness 337
    The TQBF problem 338
    Winning strategies for games 341
    Generalized geography 343
  8.4 The Classes L and NL 348
  8.5 NL-completeness 351
    Searching in graphs 353
  8.6 NL equals coNL 354
  Exercises, Problems, and Solutions 356

9 Intractability 363
  9.1 Hierarchy Theorems 364
    Exponential space completeness 371
  9.2 Relativization 376
    Limits of the diagonalization method 377
  9.3 Circuit Complexity 379
  Exercises, Problems, and Solutions 388

10 Advanced Topics in Complexity Theory 393
  10.1 Approximation Algorithms 393

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 11 ---
  10.2 Probabilistic Algorithms 396
    The class BPP 396
    Primality 399
    Read-once branching programs 404
  10.3 Alternation 408
    Alternating time and space 410
    The polynomial time hierarchy 414
  10.4 Interactive Proof Systems 415
    Graph nonisomorphism 415
    Definition of the model 416
    IP = PSPACE 418
  10.5 Parallel Computation 427
    Uniform Boolean circuits 428
    The class NC 430
    P-completeness 432
  10.6 Cryptography 433
    Secret keys 433
    Public-key cryptosystems 435
    One-way functions 435
    Trapdoor functions 437
  Exercises, Problems, and Solutions 439

Selected Bibliography 443
Index 448

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 13 ---
PREFACE TO THE FIRST EDITION

TO THE STUDENT

Welcome!

You are about to embark on the study of a fascinating and important subject: the theory of computation. It comprises the fundamental mathematical properties of computer hardware, software, and certain applications thereof. In studying this subject, we seek to determine what can and cannot be computed, how quickly, with how much memory, and on which type of computational model. The subject has obvious connections with engineering practice, and, as in many sciences, it also has purely philosophical aspects.

I know that many of you are looking forward to studying this material but some may not be here out of choice. You may want to obtain a degree in computer science or engineering, and a course in theory is required—God knows why. After all, isn't theory arcane, boring, and worst of all, irrelevant?

To see that theory is neither arcane nor boring, but instead quite understandable and even interesting, read on. Theoretical computer science does have many fascinating big ideas, but it also has many small and sometimes dull details that can be tiresome. Learning any new subject is hard work, but it becomes easier and more enjoyable if the subject is properly presented. My primary objective in writing this book is to expose you to the genuinely exciting aspects of computer theory, without getting bogged down in the drudgery. Of course, the only way to determine whether theory interests you is to try learning it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 14 ---
xii PREFACE TO THE FIRST EDITION
Theory is relevant to practice. It provides conceptual tools that practition-
ers use in computer engineering. Designing a new programming language for a
specialized application? What you learned about grammars in this course comes
in handy. Dealing with string searching and pattern matching? Remember finite
automata andregular expressions . Confronted with a problem that seems to re-
quire more computer time than you can afford? Think back to what you learned
about NP-completeness .V a r i o u sa p p l i c a t i o na r e a s ,s u c ha sm o d e r nc r y p t o g r a p h i c
protocols, rely on theoretical principles that you will learn here.
Theory also is relevant to you because it shows you a new, simpler, and more
elegant side of computers, which we normally consider to be complicated ma-
chines. The best computer designs and applications are conceived with elegance
in mind. A theoretical course can heighten your aesthetic sense and help you
build more beautiful systems.
Finally, theory is good for you because studying it expands your mind. Com-
puter technology changes quickly. Specific technical knowledge, though useful
today, becomes outdated in just a few years. Consider instead the abilities to
think, to express yourself clearly and precisely, to solve problems, and to know
when you haven’t solved a problem. These abilities have lasting value. Studying
theory trains you in these areas.
Practical considerations aside, nearly everyone working with computers is cu-
rious about these amazing creations, their capabilities, and their limitations. A
whole new branch of mathematics has grown up in the past 30 years to answer
certain basic questions. Here’s a big one that remains unsolved: If I give you a
large number—say, with 500 digits—can you find its factors (the numbers that
divide it evenly) in a reasonable amount of time? Even using a supercomputer, no
one presently knows how to do that in all cases within the lifetime of the universe!
The factoring problem is connected to certain secret codes in modern cryptosys-
tems. Find a fast way to factor, and fame is yours!
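To get a feel for the scale involved, here is a hypothetical sketch (not from the text) of trial division, the naive factoring method. It must test on the order of the square root of n candidate divisors, which for a 500-digit n is roughly 10 to the 250th power:

```python
import math

def trial_division(n):
    """Factor n by testing divisors up to sqrt(n) -- the naive method."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(trial_division(8051))  # -> [83, 97]

# For a 500-digit n, the loop above could need about sqrt(10**500),
# i.e. roughly 10**250, iterations -- far beyond any computer.
```

The fast algorithms actually used in practice still take more than the lifetime of the universe on 500-digit numbers, which is exactly the open question the text describes.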
TO THE EDUCATOR
This book is intended as an upper-level undergraduate or introductory gradu-
ate text in computer science theory. It contains a mathematical treatment of
the subject, designed around theorems and proofs. I have made some effort to
accommodate students with little prior experience in proving theorems, though
more experienced students will have an easier time.
My primary goal in presenting the material has been to make it clear and
interesting. In so doing, I have emphasized intuition and “the big picture” in the
subject over some lower level details.
For example, even though I present the method of proof by induction in
Chapter 0 along with other mathematical preliminaries, it doesn’t play an im-
portant role subsequently. Generally, I do not present the usual induction proofs
of the correctness of various constructions concerning automata. If presented
clearly, these constructions convince and do not need further argument. An in-
duction may confuse rather than enlighten because induction itself is a rather
sophisticated technique that many find mysterious. Belaboring the obvious with
an induction risks teaching students that a mathematical proof is a formal ma-
nipulation instead of teaching them what is and what is not a cogent argument.
A second example occurs in Parts Two and Three, where I describe algorithms in prose instead of pseudocode. I don’t spend much time programming Turing machines (or any other formal model). Students today come with a programming background and find the Church–Turing thesis to be self-evident. Hence I don’t present lengthy simulations of one model by another to establish their equivalence.
Besides giving extra intuition and suppressing some details, I give what might
be called a classical presentation of the subject material. Most theorists will find
the choice of material, terminology, and order of presentation consistent with
that of other widely used textbooks. I have introduced original terminology in
only a few places, when I found the standard terminology particularly obscure
or confusing. For example, I introduce the term mapping reducibility instead of
many–one reducibility.
Practice through solving problems is essential to learning any mathemati-
cal subject. In this book, the problems are organized into two main categories
called Exercises and Problems. The Exercises review definitions and concepts.
The Problems require some ingenuity. Problems marked with a star are more
difficult. I have tried to make the Exercises and Problems interesting challenges.
THE FIRST EDITION
Introduction to the Theory of Computation first appeared as a Preliminary Edition
in paperback. The first edition differs from the Preliminary Edition in several
substantial ways. The final three chapters are new: Chapter 8 on space complex-
ity; Chapter 9 on provable intractability; and Chapter 10 on advanced topics in
complexity theory. Chapter 6 was expanded to include several advanced topics
in computability theory. Other chapters were improved through the inclusion
of additional examples and exercises.
Comments from instructors and students who used the Preliminary Edition
were helpful in polishing Chapters 0–7. Of course, the errors they reported have
been corrected in this edition.
Chapters 6 and 10 give a survey of several more advanced topics in com-
putability and complexity theories. They are not intended to comprise a cohesive
unit in the way that the remaining chapters are. These chapters are included to
allow the instructor to select optional topics that may be of interest to the serious
student. The topics themselves range widely. Some, such as Turing reducibility
and alternation, are direct extensions of other concepts in the book. Others, such as decidable logical theories and cryptography, are brief introductions to large fields.
FEEDBACK TO THE AUTHOR
The internet provides new opportunities for interaction between authors and
readers. I have received much e-mail offering suggestions, praise, and criticism,
and reporting errors for the Preliminary Edition. Please continue to correspond!
I try to respond to each message personally, as time permits. The e-mail address
for correspondence related to this book is
sipserbook@math.mit.edu.
A web site that contains a list of errata is maintained. Other material may be
added to that site to assist instructors and students. Let me know what you
would like to see there. The location for that site is
http://math.mit.edu/~sipser/book.html.
ACKNOWLEDGMENTS
I could not have written this book without the help of many friends, colleagues,
and my family.
I wish to thank the teachers who helped shape my scientific viewpoint and
educational style. Five of them stand out. My thesis advisor, Manuel Blum, is
due a special note for his unique way of inspiring students through clarity of
thought, enthusiasm, and caring. He is a model for me and for many others.
I am grateful to Richard Karp for introducing me to complexity theory, to John
Addison for teaching me logic and assigning those wonderful homework sets,
to Juris Hartmanis for introducing me to the theory of computation, and to my
father for introducing me to mathematics, computers, and the art of teaching.
This book grew out of notes from a course that I have taught at MIT for
the past 15 years. Students in my classes took these notes from my lectures. I
hope they will forgive me for not listing them all. My teaching assistants over
the years—Avrim Blum, Thang Bui, Benny Chor, Andrew Chou, Stavros Cos-
madakis, Aditi Dhagat, Wayne Goddard, Parry Husbands, Dina Kravets, Jakov
Kučan, Brian O’Neill, Ioana Popescu, and Alex Russell—helped me to edit and
expand these notes and provided some of the homework problems.
Nearly three years ago, Tom Leighton persuaded me to write a textbook on the theory of computation. I had been thinking of doing so for some time, but it took Tom’s persuasion to turn theory into practice. I appreciate his generous
advice on book writing and on many other things.
I wish to thank Eric Bach, Peter Beebee, Cris Calude, Marek Chrobak, Anna
Chefter, Guang-Ien Cheng, Elias Dahlhaus, Michael Fischer, Steve Fisk, Lance
Fortnow, Henry J. Friedman, Jack Fu, Seymour Ginsburg, Oded Goldreich,
Brian Grossman, David Harel, Micha Hofri, Dung T. Huynh, Neil Jones, H.
Chad Lane, Kevin Lin, Michael Loui, Silvio Micali, Tadao Murata, Christos Papadimitriou, Vaughan Pratt, Daniel Rosenband, Brian Scassellati, Ashish Sharma, Nir Shavit, Alexander Shen, Ilya Shlyakhter, Matt Stallmann, Perry Susskind, Y. C. Tay, Joseph Traub, Osamu Watanabe, Peter Widmayer, David
Williamson, Derick Wood, and Charles Yang for comments, suggestions, and
assistance as the writing progressed.
The following people provided additional comments that have improved
this book: Isam M. Abdelhameed, Eric Allender, Shay Artzi, Michelle Ather-
ton, Rolfe Blodgett, Al Briggs, Brian E. Brooks, Jonathan Buss, Jin Yi Cai,
Steve Chapel, David Chow, Michael Ehrlich, Yaakov Eisenberg, Farzan Fallah,
Shaun Flisakowski, Hjalmtyr Hafsteinsson, C. R. Hale, Maurice Herlihy, Vegard
Holmedahl, Sandy Irani, Kevin Jiang, Rhys Price Jones, James M. Jowdy, David
M. Martin Jr., Manrique Mata-Montero, Ryota Matsuura, Thomas Minka,
Farooq Mohammed, Tadao Murata, Jason Murray, Hideo Nagahashi, Kazuo
Ohta, Constantine Papageorgiou, Joseph Raj, Rick Regan, Rhonda A. Reumann,
Michael Rintzler, Arnold L. Rosenberg, Larry Roske, Max Rozenoer, Walter L.
Ruzzo, Sanatan Sahgal, Leonard Schulman, Steve Seiden, Joel Seiferas, Ambuj
Singh, David J. Stucki, Jayram S. Thathachar, H. Venkateswaran, Tom Whaley,
Christopher Van Wyk, Kyle Young, and Kyoung Hwan Yun.
Robert Sloan used an early version of the manuscript for this book in a class
that he taught and provided me with invaluable commentary and ideas from
his experience with it. Mark Herschberg, Kazuo Ohta, and Latanya Sweeney
read over parts of the manuscript and suggested extensive improvements. Shafi
Goldwasser helped me with material in Chapter 10.
I received expert technical support from William Baxter at Superscript, who wrote the LaTeX macro package implementing the interior design, and from Larry Nolan at the MIT mathematics department, who keeps things running.
It has been a pleasure to work with the folks at PWS Publishing in creat-
ing the final product. I mention Michael Sugarman, David Dietz, Elise Kaiser,
Monique Calello, Susan Garland and Tanja Brull because I have had the most
contact with them, but I know that many others have been involved, too. Thanks
to Jerry Moore for the copy editing, to Diane Levy for the cover design, and to
Catherine Hawkes for the interior design.
I am grateful to the National Science Foundation for support provided under
grant CCR-9503322.
My father, Kenneth Sipser, and sister, Laura Sipser, converted the book di-
agrams into electronic form. My other sister, Karen Fisch, saved us in various
computer emergencies, and my mother, Justine Sipser, helped out with motherly
advice. I thank them for contributing under difficult circumstances, including
insane deadlines and recalcitrant software.
Finally, my love goes to my wife, Ina, and my daughter, Rachel. Thanks for
putting up with all of this.
Cambridge, Massachusetts Michael Sipser
October, 1996
PREFACE TO THE
SECOND EDITION
Judging from the email communications that I’ve received from so many of you,
the biggest deficiency of the first edition is that it provides no sample solutions
to any of the problems. So here they are. Every chapter now contains a new
Selected Solutions section that gives answers to a representative cross-section of
that chapter’s exercises and problems. To make up for the loss of the solved
problems as interesting homework challenges, I’ve also added a variety of new
problems. Instructors may request an Instructor’s Manual that contains addi-
tional solutions by contacting the sales representative for their region designated
at www.course.com.
A number of readers would have liked more coverage of certain “standard”
topics, particularly the Myhill–Nerode Theorem and Rice’s Theorem. I’ve par-
tially accommodated these readers by developing these topics in the solved prob-
lems. I did not include the Myhill–Nerode Theorem in the main body of the text
because I believe that this course should provide only an introduction to finite
automata and not a deep investigation. In my view, the role of finite automata
here is for students to explore a simple formal model of computation as a prelude
to more powerful models, and to provide convenient examples for subsequent
topics. Of course, some people would prefer a more thorough treatment, while
others feel that I ought to omit all references to (or at least dependence on) finite
automata. I did not include Rice’s Theorem in the main body of the text because,
though it can be a useful “tool” for proving undecidability, some students might
use it mechanically without really understanding what is going on. Using reduc-
tions instead, for proving undecidability, gives more valuable preparation for the
reductions that appear in complexity theory.
I am indebted to my teaching assistants—Ilya Baran, Sergi Elizalde, Rui Fan,
Jonathan Feldman, Venkatesan Guruswami, Prahladh Harsha, Christos Kapout-
sis, Julia Khodor, Adam Klivans, Kevin Matulef, Ioana Popescu, April Rasala,
Sofya Raskhodnikova, and Iuliu Vasilescu—who helped me to craft some of
the new problems and solutions. Ching Law, Edmond Kayi Lee, and Zulfikar
Ramzan also contributed to the solutions. I thank Victor Shoup for coming up
with a simple way to repair the gap in the analysis of the probabilistic primality
algorithm that appears in the first edition.
I appreciate the efforts of the people at Course Technology in pushing me
and the other parts of this project along, especially Alyssa Pratt and Aimee
Poirier. Many thanks to Gerald Eisman, Weizhen Mao, Rupak Majumdar,
Chris Umans, and Christopher Wilson for their reviews. I’m indebted to Jerry
Moore for his superb job copy editing and to Laura Segel of ByteGraphics
(lauras@bytegraphics.com) for her beautiful rendition of the figures.
The volume of email I’ve received has been more than I expected. Hearing
from so many of you from so many places has been absolutely delightful, and I’ve
tried to respond to all eventually—my apologies for those I missed. I’ve listed
here the people who made suggestions that specifically affected this edition, but
I thank everyone for their correspondence:
Luca Aceto, Arash Afkanpour, Rostom Aghanian, Eric Allender, Karun Bakshi, Brad Ballinger, Ray Bartkus, Louis Barton, Arnold Beckmann, Mihir Bellare, Kevin Trent Bergeson, Matthew Berman, Rajesh Bhatt, Somenath Biswas,
Lenore Blum, Mauro A. Bonatti, Paul Bondin, Nicholas Bone, Ian Bratt, Gene
Browder, Doug Burke, Sam Buss, Vladimir Bychkovsky, Bruce Carneal, Soma
Chaudhuri, Rong-Jaye Chen, Samir Chopra, Benny Chor, John Clausen, Alli-
son Coates, Anne Condon, Jeffrey Considine, John J. Crashell, Claude Crepeau,
Shaun Cutts, Susheel M. Daswani, Geoff Davis, Scott Dexter, Peter Drake,
Jeff Edmonds, Yaakov Eisenberg, Kurtcebe Eroglu, Georg Essl, Alexander T.
Fader, Farzan Fallah, Faith Fich, Joseph E. Fitzgerald, Perry Fizzano, David
Ford, Jeannie Fromer, Kevin Fu, Atsushi Fujioka, Michel Galley, K. Ganesan, Simson Garfinkel, Travis Gebhardt, Peymann Gohari, Ganesh Gopalakrishnan, Steven Greenberg, Larry Griffith, Jerry Grossman, Rudolf de Haan,
Michael Halper, Nick Harvey, Mack Hendricks, Laurie Hiyakumoto, Steve
Hockema, Michael Hoehle, Shahadat Hossain, Dave Isecke, Ghaith Issa, Raj D.
Iyer, Christian Jacobi, Thomas Janzen, Mike D. Jones, Max Kanovitch, Aaron
Kaufman, Roger Khazan, Sarfraz Khurshid, Kevin Killourhy, Seungjoo Kim,
Victor Kuncak, Kanata Kuroda, Thomas Lasko, Suk Y. Lee, Edward D. Leg-
enski, Li-Wei Lehman, Kong Lei, Zsolt Lengvarszky, Jeffrey Levetin, Baekjun
Lim, Karen Livescu, Stephen Louie, TzerHung Low, Wolfgang Maass, Arash
Madani, Michael Manapat, Wojciech Marchewka, David M. Martin Jr., Anders
Martinson, Lyle McGeoch, Alberto Medina, Kurt Mehlhorn, Nihar Mehta, Al-
bert R. Meyer, Thomas Minka, Mariya Minkova, Daichi Mizuguchi, G. Allen
Morris III, Damon Mosk-Aoyama, Xiaolong Mou, Paul Muir, German Muller,
Donald Nelson, Gabriel Nivasch, Mary Obelnicki, Kazuo Ohta, Thomas M.
Oleson, Jr., Curtis Oliver, Owen Ozier, Rene Peralta, Alexander Perlis, Holger
Petersen, Detlef Plump, Robert Prince, David Pritchard, Bina Reed, Nicholas
Riley, Ronald Rivest, Robert Robinson, Christi Rockwell, Phil Rogaway, Max
Rozenoer, John Rupf, Teodor Rus, Larry Ruzzo, Brian Sanders, Cem Say, Kim
Schioett, Joel Seiferas, Joao Carlos Setubal, Geoff Lee Seyon, Mark Skandera,
Bob Sloan, Geoff Smith, Marc L. Smith, Stephen Smith, Alex C. Snoeren, Guy
St-Denis, Larry Stockmeyer, Radu Stoleru, David Stucki, Hisham M. Sueyllam,
Kenneth Tam, Elizabeth Thompson, Michel Toulouse, Eric Tria, Chittaranjan Tripathy, Dan Trubow, Hiroki Ueda, Giora Unger, Kurt L. Van Etten, Jesir
Vargas, Bienvenido Velez-Rivera, Kobus Vos, Alex Vrenios, Sven Waibel, Marc
Waldman, Tom Whaley, Anthony Widjaja, Sean Williams, Joseph N. Wilson,
Chris Van Wyk, Guangming Xing, Vee Voon Yee, Cheng Yongxi, Neal Young,
Timothy Yuen, Kyle Yung, Jinghua Zhang, Lilla Zollei.
I thank Suzanne Balik, Matthew Kane, Kurt L. Van Etten, Nancy Lynch,
Gregory Roberts, and Cem Say for pointing out errata in the first printing.
Most of all, I thank my family—Ina, Rachel, and Aaron—for their patience,
understanding, and love as I sat for endless hours here in front of my computer
screen.
Cambridge, Massachusetts Michael Sipser
December, 2004
PREFACE TO THE
THIRD EDITION
The third edition contains an entirely new section on deterministic context-free
languages. I chose this topic for several reasons. First of all, it fills an obvious
gap in my previous treatment of the theory of automata and languages. The
older editions introduced finite automata and Turing machines in deterministic
and nondeterministic variants, but covered only the nondeterministic variant of
pushdown automata. Adding a discussion of deterministic pushdown automata
provides a missing piece of the puzzle.
Second, the theory of deterministic context-free grammars is the basis for
LR(k) grammars, an important and nontrivial application of automata theory in
programming languages and compiler design. This application brings together
several key concepts, including the equivalence of deterministic and nondeter-
ministic finite automata, and the conversions between context-free grammars
and pushdown automata, to yield an efficient and beautiful method for parsing.
Here we have a concrete interplay between theory and practice.
Last, this topic seems underserved in existing theory textbooks, considering
its importance as a genuine application of automata theory. I studied LR(k) grammars years ago but without fully understanding how they work, and without
seeing how nicely they fit into the theory of deterministic context-free languages.
My goal in writing this section is to give an intuitive yet rigorous introduction
to this area for theorists as well as practitioners, and thereby contribute to its
broader appreciation. One note of caution, however: Some of the material in
this section is rather challenging, so an instructor in a basic first theory course
may prefer to designate it as supplementary reading. Later chapters do not de-
pend on this material.
Many people helped directly or indirectly in developing this edition. I’m in-
debted to reviewers Christos Kapoutsis and Cem Say who read a draft of the new
section and provided valuable feedback. Several individuals at Cengage Learning
assisted with the production, notably Alyssa Pratt and Jennifer Feltri-George.
Suzanne Huizenga copyedited the text and Laura Segel of ByteGraphics created
the new figures and modified some of the older figures.
I wish to thank my teaching assistants at MIT, Victor Chen, Andy Drucker,
Michael Forbes, Elena Grigorescu, Brendan Juba, Christos Kapoutsis, Jon Kel-
ner, Swastik Kopparty, Kevin Matulef, Amanda Redlich, Zack Remscrim, Ben
Rossman, Shubhangi Saraf, and Oren Weimann. Each of them helped me by
discussing new problems and their solutions, and by providing insight into how
well our students understood the course content. I’ve greatly enjoyed working
with such talented and enthusiastic young people.
It has been gratifying to receive email from around the globe. Thanks to all
for your suggestions, questions, and ideas. Here is a list of those correspondents
whose comments affected this edition:
Djihed Afifi, Steve Aldrich, Eirik Bakke, Suzanne Balik, Victor Bandur, Paul
Beame, Elazar Birnbaum, Goutam Biswas, Rob Bittner, Marina Blanton, Rod-
ney Bliss, Promita Chakraborty, Lewis Collier, Jonathan Deber, Simon Dex-
ter, Matt Diephouse, Peter Dillinger, Peter Drake, Zhidian Du, Peter Fe-
jer, Margaret Fleck, Atsushi Fujioka, Valerio Genovese, Evangelos Georgiadis,
Joshua Grochow, Jerry Grossman, Andreas Guelzow, Hjalmtyr Hafsteinsson,
Arthur Hall III, Cihat Imamoglu, Chinawat Isradisaikul, Kayla Jacobs, Flem-
ming Jensen, Barbara Kaiser, Matthew Kane, Christos Kapoutsis, Ali Durlov
Khan, Edwin Sze Lun Khoo, Yongwook Kim, Akash Kumar, Eleazar Leal, Zsolt
Lengvarszky, Cheng-Chung Li, Xiangdong Liang, Vladimir Lifschitz, Ryan
Lortie, Jonathan Low, Nancy Lynch, Alexis Maciel, Kevin Matulef, Nelson
Max, Hans-Rudolf Metz, Mladen Mikša, Sara Miner More, Rajagopal Nagarajan, Marvin Nakayama, Jonas Nyrup, Gregory Roberts, Ryan Romero, Santhosh
Samarthyam, Cem Say, Joel Seiferas, John Sieg, Marc Smith, John Steinberger,
Nuri Taşdemir, Tamir Tassa, Mark Testa, Jesse Tjang, John Trammell, Hiroki Ueda, Jeroen Vaelen, Kurt L. Van Etten, Guillermo Vázquez, Phanisekhar
Botlaguduru Venkata, Benjamin Bing-Yi Wang, Lutz Warnke, David Warren,
Thomas Watson, Joseph Wilson, David Wittenberg, Brian Wongchaowart, Kis-
han Yerubandi, Dai Yi.
Above all, I thank my family—my wife, Ina, and our children, Rachel and
Aaron. Time is finite and fleeting. Your love is everything.
Cambridge, Massachusetts Michael Sipser
April, 2012
0
INTRODUCTION
We begin with an overview of those areas in the theory of computation that
we present in this course. Following that, you’ll have a chance to learn and/or
review some mathematical concepts that you will need later.
0.1
AUTOMATA, COMPUTABILITY, AND COMPLEXITY
This book focuses on three traditionally central areas of the theory of computa-
tion: automata, computability, and complexity. They are linked by the question:
What are the fundamental capabilities and limitations of computers?
This question goes back to the 1930s when mathematical logicians first began
to explore the meaning of computation. Technological advances since that time
have greatly increased our ability to compute and have brought this question out
of the realm of theory into the world of practical concern.
In each of the three areas—automata, computability, and complexity—this
question is interpreted differently, and the answers vary according to the in-
terpretation. Following this introductory chapter, we explore each area in a
separate part of this book. Here, we introduce these parts in reverse order be-
cause by starting from the end you can better understand the reason for the
beginning.
COMPLEXITY THEORY
Computer problems come in different varieties; some are easy, and some are
hard. For example, the sorting problem is an easy one. Say that you need to
arrange a list of numbers in ascending order. Even a small computer can sort
a million numbers rather quickly. Compare that to a scheduling problem. Say
that you must find a schedule of classes for the entire university to satisfy some
reasonable constraints, such as that no two classes take place in the same room
at the same time. The scheduling problem seems to be much harder than the
sorting problem. If you have just a thousand classes, finding the best schedule
may require centuries, even with a supercomputer.
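The gap is easy to experience firsthand. A small illustrative Python sketch (not from the text) sorts a million numbers almost instantly, while brute-force scheduling blows up combinatorially:

```python
import random
import time

# Easy: sort a million numbers.
nums = [random.random() for _ in range(1_000_000)]
start = time.perf_counter()
nums.sort()
print(f"sorted 1,000,000 numbers in {time.perf_counter() - start:.3f} s")

# Hard: brute-force scheduling. Even assigning each of 1000 classes to
# one of just 2 time slots gives 2**1000 combinations to examine --
# a count with over 300 digits.
print(len(str(2**1000)), "digits")
```

No known algorithm avoids an explosion like this for the general scheduling problem, which is what makes it a candidate for the "hard" category discussed below.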
What makes some problems computationally hard and others easy?
This is the central question of complexity theory. Remarkably, we don’t know
the answer to it, though it has been intensively researched for over 40 years.
Later, we explore this fascinating question and some of its ramifications.
In one important achievement of complexity theory thus far, researchers have
discovered an elegant scheme for classifying problems according to their com-
putational difficulty. It is analogous to the periodic table for classifying elements
according to their chemical properties. Using this scheme, we can demonstrate
a method for giving evidence that certain problems are computationally hard,
even if we are unable to prove that they are.
You have several options when you confront a problem that appears to be
computationally hard. First, by understanding which aspect of the problem is at
the root of the difficulty, you may be able to alter it so that the problem is more
easily solvable. Second, you may be able to settle for less than a perfect solution
to the problem. In certain cases, finding solutions that only approximate the
perfect one is relatively easy. Third, some problems are hard only in the worst
case situation, but easy most of the time. Depending on the application, you may
be satisfied with a procedure that occasionally is slow but usually runs quickly.
Finally, you may consider alternative types of computation, such as randomized
computation, that can speed up certain tasks.
One applied area that has been affected directly by complexity theory is the
ancient field of cryptography. In most fields, an easy computational problem is
preferable to a hard one because easy ones are cheaper to solve. Cryptography
is unusual because it specifically requires computational problems that are hard,
rather than easy. Secret codes should be hard to break without the secret key
or password. Complexity theory has pointed cryptographers in the direction of
computationally hard problems around which they have designed revolutionary
new codes.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
COMPUTABILITY THEORY
During the first half of the twentieth century, mathematicians such as Kurt
Gödel, Alan Turing, and Alonzo Church discovered that certain basic problems
cannot be solved by computers. One example of this phenomenon is the prob-
lem of determining whether a mathematical statement is true or false. This task
is the bread and butter of mathematicians. It seems like a natural for solution
by computer because it lies strictly within the realm of mathematics. But no
computer algorithm can perform this task.
Among the consequences of this profound result was the development of ideas
concerning theoretical models of computers that eventually would help lead to
the construction of actual computers.
The theories of computability and complexity are closely related. In com-
plexity theory, the objective is to classify problems as easy ones and hard ones;
whereas in computability theory, the classification of problems is by those that
are solvable and those that are not. Computability theory introduces several of
the concepts used in complexity theory.
AUTOMATA THEORY
Automata theory deals with the definitions and properties of mathematical mod-
els of computation. These models play a role in several applied areas of computer
science. One model, called the finite automaton, is used in text processing,
compilers, and hardware design. Another model, called the context-free grammar,
is used in programming languages and artificial intelligence.
Automata theory is an excellent place to begin the study of the theory of
computation. The theories of computability and complexity require a precise
definition of a computer. Automata theory allows practice with formal definitions
of computation as it introduces concepts relevant to other nontheoretical areas
of computer science.
0.2
MATHEMATICAL NOTIONS AND TERMINOLOGY
As in any mathematical subject, we begin with a discussion of the basic mathe-
matical objects, tools, and notation that we expect to use.
SETS
A set is a group of objects represented as a unit. Sets may contain any type of
object, including numbers, symbols, and even other sets. The objects in a set are
called its elements or members. Sets may be described formally in several ways.
One way is by listing a set's elements inside braces. Thus the set
S = {7, 21, 57}
contains the elements 7, 21, and 57. The symbols ∈ and ∉ denote set membership
and nonmembership. We write 7 ∈ {7, 21, 57} and 8 ∉ {7, 21, 57}. For two
sets A and B, we say that A is a subset of B, written A ⊆ B, if every member of
A also is a member of B. We say that A is a proper subset of B, written A ⊊ B,
if A is a subset of B and not equal to B.
The order of describing a set doesn't matter, nor does repetition of its members.
We get the same set S by writing {57, 7, 7, 7, 21}. If we do want to take the
number of occurrences of members into account, we call the group a multiset
instead of a set. Thus {7} and {7, 7} are different as multisets but identical as
sets. An infinite set contains infinitely many elements. We cannot write a list of
all the elements of an infinite set, so we sometimes use the "..." notation to mean
"continue the sequence forever." Thus we write the set of natural numbers N
as
{1, 2, 3, ...}.
The set of integers Z is written as
{..., −2, −1, 0, 1, 2, ...}.
The set with zero members is called the empty set and is written ∅. A set with
one member is sometimes called a singleton set, and a set with two members is
called an unordered pair.
When we want to describe a set containing elements according to some rule,
we write {n | rule about n}. Thus {n | n = m² for some m ∈ N} means the set of
perfect squares.
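Set-builder notation carries over almost verbatim into a short Python sketch: a set comprehension plays the role of {n | rule about n}, restricted here to a finite range of m since we cannot enumerate an infinite set.

```python
# The set-builder expression {n | n = m^2 for some m in N}, restricted
# to 1 <= m <= 10 so the set is finite.
squares = {m * m for m in range(1, 11)}
print(sorted(squares))  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```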
If we have two sets A and B, the union of A and B, written A ∪ B, is the set we
get by combining all the elements in A and B into a single set. The intersection
of A and B, written A ∩ B, is the set of elements that are in both A and B. The
complement of A, written Ā, is the set of all elements under consideration that
are not in A.
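These three operations can be tried directly with Python's built-in set type; the only wrinkle is the complement, which needs an explicit universe U because "all elements under consideration" must be spelled out in code.

```python
# Union, intersection, and complement on small finite sets.
U = {1, 2, 3, 4, 5, 6}   # the universe of elements under consideration
A = {1, 2, 3}
B = {3, 4, 5}

print(A | B)   # union A ∪ B
print(A & B)   # intersection A ∩ B
print(U - A)   # complement of A relative to U
```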
As is often the case in mathematics, a picture helps clarify a concept. For sets,
we use a type of picture called a Venn diagram. It represents sets as regions
enclosed by circular lines. Let the set START-t be the set of all English words
that start with the letter "t". For example, in the figure, the circle represents the
set START-t. Several members of this set are represented as points inside the
circle.
FIGURE 0.1
Venn diagram for the set of English words starting with “t”
Similarly, we represent the set END-z of English words that end with "z" in
the following figure.
FIGURE 0.2
Venn diagram for the set of English words ending with “z”
To represent both sets in the same Venn diagram, we must draw them so that
they overlap, indicating that they share some elements, as shown in the following
figure. For example, the word topaz is in both sets. The figure also contains a
circle for the set START-j. It doesn't overlap the circle for START-t because no
word lies in both sets.
FIGURE 0.3
Overlapping circles indicate common elements
The next two Venn diagrams depict the union and intersection of sets A
and B.
FIGURE 0.4
Diagrams for (a) A ∪ B and (b) A ∩ B
SEQUENCES AND TUPLES
A sequence of objects is a list of these objects in some order. We usually designate
a sequence by writing the list within parentheses. For example, the sequence 7,
21, 57 would be written
(7, 21, 57).
The order doesn't matter in a set, but in a sequence it does. Hence (7, 21, 57) is
not the same as (57, 7, 21). Similarly, repetition does matter in a sequence, but
it doesn't matter in a set. Thus (7, 7, 21, 57) is different from both of the other
sequences, whereas the set {7, 21, 57} is identical to the set {7, 7, 21, 57}.
As with sets, sequences may be finite or infinite. Finite sequences often are
called tuples. A sequence with k elements is a k-tuple. Thus (7, 21, 57) is a
3-tuple. A 2-tuple is also called an ordered pair.
Sets and sequences may appear as elements of other sets and sequences. For
example, the power set of A is the set of all subsets of A. If A is the set {0, 1},
the power set of A is the set {∅, {0}, {1}, {0, 1}}. The set of all ordered pairs
whose elements are 0s and 1s is {(0, 0), (0, 1), (1, 0), (1, 1)}.
If A and B are two sets, the Cartesian product or cross product of A and
B, written A × B, is the set of all ordered pairs wherein the first element is a
member of A and the second element is a member of B.
EXAMPLE 0.5
If A = {1, 2} and B = {x, y, z},
A × B = {(1, x), (1, y), (1, z), (2, x), (2, y), (2, z)}.
We can also take the Cartesian product of k sets, A1, A2, ..., Ak, written
A1 × A2 × ··· × Ak. It is the set consisting of all k-tuples (a1, a2, ..., ak) where
ai ∈ Ai.
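Cartesian products, including products of k sets, are exactly what itertools.product computes; the names A and B below follow Example 0.5.

```python
from itertools import product

A = [1, 2]
B = ['x', 'y', 'z']

AxB = set(product(A, B))          # A × B: all ordered pairs
print(len(AxB))                   # 6 pairs, as in Example 0.5
print(len(set(product(A, B, A)))) # 12 triples, as in Example 0.6
```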
EXAMPLE 0.6
If A and B are as in Example 0.5,
A × B × A = {(1, x, 1), (1, x, 2), (1, y, 1), (1, y, 2), (1, z, 1), (1, z, 2),
             (2, x, 1), (2, x, 2), (2, y, 1), (2, y, 2), (2, z, 1), (2, z, 2)}.
If we have the Cartesian product of a set with itself, we use the shorthand
A × A × ··· × A = Aᵏ,
where the product contains k copies of A.

EXAMPLE 0.7
The set N² equals N × N. It consists of all ordered pairs of natural numbers.
We also may write it as {(i, j) | i, j ≥ 1}.
FUNCTIONS AND RELATIONS
Functions are central to mathematics. A function is an object that sets up an
input–output relationship. A function takes an input and produces an output.
In every function, the same input always produces the same output. If f is a
function whose output value is b when the input value is a, we write
f(a) = b.
A function also is called a mapping, and, if f(a) = b, we say that f maps a to b.
For example, the absolute value function abs takes a number x as input and
returns x if x is positive and −x if x is negative. Thus abs(2) = abs(−2) = 2.
Addition is another example of a function, written add. The input to the
addition function is an ordered pair of numbers, and the output is the sum of
those numbers.
The set of possible inputs to the function is called its domain. The outputs
of a function come from a set called its range. The notation for saying that f is
a function with domain D and range R is
f : D → R.
In the case of the function abs, if we are working with integers, the domain and
the range are Z, so we write abs : Z → Z. In the case of the addition function
for integers, the domain is the set of pairs of integers Z × Z and the range is Z,
so we write add : Z × Z → Z. Note that a function may not necessarily use all
the elements of the specified range. The function abs never takes on the value
−1 even though −1 ∈ Z. A function that does use all the elements of the range
is said to be onto the range.
We may describe a specific function in several ways. One way is with a pro-
cedure for computing an output from a specified input. Another way is with a
table that lists all possible inputs and gives the output for each input.
EXAMPLE 0.8
Consider the function f : {0, 1, 2, 3, 4} → {0, 1, 2, 3, 4}.

    n | f(n)
    --+-----
    0 |  1
    1 |  2
    2 |  3
    3 |  4
    4 |  0

This function adds 1 to its input and then outputs the result modulo 5. A number
modulo m is the remainder after division by m. For example, the minute hand
on a clock face counts modulo 60. When we do modular arithmetic, we define
Zm = {0, 1, 2, ..., m−1}. With this notation, the aforementioned function f
has the form f : Z5 → Z5.
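The function of Example 0.8 is a one-liner in Python, using the % remainder operator for reduction modulo 5.

```python
# f : Z5 -> Z5 from Example 0.8: add 1, then reduce modulo 5.
def f(n):
    return (n + 1) % 5

print([f(n) for n in range(5)])  # [1, 2, 3, 4, 0], matching the table
```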
EXAMPLE 0.9
Sometimes a two-dimensional table is used if the domain of the function is the
Cartesian product of two sets. Here is another function, g : Z4 × Z4 → Z4. The
entry at the row labeled i and the column labeled j in the table is the value of
g(i, j).

    g | 0  1  2  3
    --+-----------
    0 | 0  1  2  3
    1 | 1  2  3  0
    2 | 2  3  0  1
    3 | 3  0  1  2

The function g is the addition function modulo 4.
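The table above can be regenerated from the definition of g, with the nested list indexed so that table[i][j] holds g(i, j).

```python
# g : Z4 x Z4 -> Z4 is addition modulo 4 (Example 0.9).
def g(i, j):
    return (i + j) % 4

table = [[g(i, j) for j in range(4)] for i in range(4)]
for row in table:
    print(row)
```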
When the domain of a function f is A1 × ··· × Ak for some sets A1, ..., Ak, the
input to f is a k-tuple (a1, a2, ..., ak) and we call the ai the arguments to f. A
function with k arguments is called a k-ary function, and k is called the arity of
the function. If k is 1, f has a single argument and f is called a unary function.
If k is 2, f is a binary function. Certain familiar binary functions are written
in a special infix notation, with the symbol for the function placed between its
two arguments, rather than in prefix notation, with the symbol preceding. For
example, the addition function add usually is written in infix notation with the
+ symbol between its two arguments as in a + b instead of in prefix notation
add(a, b).
A predicate or property is a function whose range is {TRUE, FALSE}. For
example, let even be a property that is TRUE if its input is an even number and
FALSE if its input is an odd number. Thus even(4) = TRUE and even(5) =
FALSE.
A property whose domain is a set of k-tuples A × ··· × A is called a relation,
a k-ary relation, or a k-ary relation on A. A common case is a 2-ary relation,
called a binary relation. When writing an expression involving a binary rela-
tion, we customarily use infix notation. For example, "less than" is a relation
usually written with the infix operation symbol <. "Equality", written with the
= symbol, is another familiar relation. If R is a binary relation, the statement
aRb means that aRb = TRUE. Similarly, if R is a k-ary relation, the statement
R(a1, ..., ak) means that R(a1, ..., ak) = TRUE.
EXAMPLE 0.10
In a children's game called Scissors–Paper–Stone, the two players simultaneously
select a member of the set {SCISSORS, PAPER, STONE} and indicate their selec-
tions with hand signals. If the two selections are the same, the game starts over.
If the selections differ, one player wins, according to the relation beats.

    beats    | SCISSORS | PAPER | STONE
    ---------+----------+-------+------
    SCISSORS |  FALSE   | TRUE  | FALSE
    PAPER    |  FALSE   | FALSE | TRUE
    STONE    |  TRUE    | FALSE | FALSE

From this table we determine that SCISSORS beats PAPER is TRUE and that
PAPER beats SCISSORS is FALSE.
Sometimes describing predicates with sets instead of functions is more con-
venient. The predicate P : D → {TRUE, FALSE} may be written (D, S), where
S = {a ∈ D | P(a) = TRUE}, or simply S if the domain D is obvious from the
context. Hence the relation beats may be written
{(SCISSORS, PAPER), (PAPER, STONE), (STONE, SCISSORS)}.
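The set-of-pairs view of a relation is the natural one in code: membership in the set plays the role of the predicate being TRUE. Lowercase strings stand in here for the small-caps SCISSORS, PAPER, STONE.

```python
# The beats relation of Example 0.10 as a set of ordered pairs.
beats = {("scissors", "paper"), ("paper", "stone"), ("stone", "scissors")}

def wins(a, b):
    """TRUE exactly when the pair (a, b) is in the relation."""
    return (a, b) in beats

print(wins("scissors", "paper"))  # True
print(wins("paper", "scissors"))  # False
```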
A special type of binary relation, called an equivalence relation, captures the
notion of two objects being equal in some feature. A binary relation R is an
equivalence relation if R satisfies three conditions:
1. R is reflexive if for every x, xRx;
2. R is symmetric if for every x and y, xRy implies yRx; and
3. R is transitive if for every x, y, and z, xRy and yRz implies xRz.
EXAMPLE 0.11
Define an equivalence relation on the natural numbers, written ≡7. For i, j ∈ N,
say that i ≡7 j if i − j is a multiple of 7. This is an equivalence relation because it
satisfies the three conditions. First, it is reflexive, as i − i = 0, which is a multiple
of 7. Second, it is symmetric, as i − j is a multiple of 7 if j − i is a multiple of 7.
Third, it is transitive, as whenever i − j is a multiple of 7 and j − k is a multiple
of 7, then i − k = (i − j) + (j − k) is the sum of two multiples of 7 and hence a
multiple of 7, too.
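The three conditions can be spot-checked exhaustively on a finite sample of naturals; this is an illustration of the definitions, not a substitute for the proof above.

```python
# i ≡7 j iff i - j is a multiple of 7.
def equiv7(i, j):
    return (i - j) % 7 == 0

sample = range(1, 30)
reflexive = all(equiv7(x, x) for x in sample)
symmetric = all(equiv7(y, x)
                for x in sample for y in sample if equiv7(x, y))
transitive = all(equiv7(x, z)
                 for x in sample for y in sample for z in sample
                 if equiv7(x, y) and equiv7(y, z))
print(reflexive, symmetric, transitive)  # True True True
```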
GRAPHS
An undirected graph, or simply a graph, is a set of points with lines connecting
some of the points. The points are called nodes or vertices, and the lines are
called edges, as shown in the following figure.
FIGURE 0.12
Examples of graphs
The number of edges at a particular node is the degree of that node. In
Figure 0.12(a), all the nodes have degree 2. In Figure 0.12(b), all the nodes have
degree 3. No more than one edge is allowed between any two nodes. We may
allow an edge from a node to itself, called a self-loop, depending on the situation.
In a graph G that contains nodes i and j, the pair (i, j) represents the edge that
connects i and j. The order of i and j doesn't matter in an undirected graph,
so the pairs (i, j) and (j, i) represent the same edge. Sometimes we describe
undirected edges with unordered pairs using set notation as in {i, j}. If V is the
set of nodes of G and E is the set of edges, we say G = (V, E). We can describe
a graph with a diagram or more formally by specifying V and E. For example, a
formal description of the graph in Figure 0.12(a) is
({1, 2, 3, 4, 5}, {(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)}),
and a formal description of the graph in Figure 0.12(b) is
({1, 2, 3, 4}, {(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)}).
Graphs frequently are used to represent data. Nodes might be cities and edges
the connecting highways, or nodes might be people and edges the friendships
between them. Sometimes, for convenience, we label the nodes and/or edges of
a graph, which then is called a labeled graph. Figure 0.13 depicts a graph whose
nodes are cities and whose edges are labeled with the dollar cost of the cheapest
nonstop airfare for travel between those cities if flying nonstop between them is
possible.
FIGURE 0.13
Cheapest nonstop airfares between various cities
We say that graph G is a subgraph of graph H if the nodes of G are a subset
of the nodes of H, and the edges of G are the edges of H on the corresponding
nodes. The following figure shows a graph H and a subgraph G.
FIGURE 0.14
Graph G (shown darker) is a subgraph of H
A path in a graph is a sequence of nodes connected by edges. A simple path
is a path that doesn't repeat any nodes. A graph is connected if every two nodes
have a path between them. A path is a cycle if it starts and ends in the same node.
A simple cycle is one that contains at least three nodes and repeats only the first
and last nodes. A graph is a tree if it is connected and has no simple cycles, as
shown in Figure 0.15. A tree may contain a specially designated node called the
root. The nodes of degree 1 in a tree, other than the root, are called the leaves
of the tree.
FIGURE 0.15
(a) A path in a graph, (b) a cycle in a graph, and (c) a tree
A directed graph has arrows instead of lines, as shown in the following figure.
The number of arrows pointing from a particular node is the outdegree of that
node, and the number of arrows pointing to a particular node is the indegree.
FIGURE 0.16
A directed graph
In a directed graph, we represent an edge from i to j as a pair (i, j). The
formal description of a directed graph G is (V, E), where V is the set of nodes
and E is the set of edges. The formal description of the graph in Figure 0.16 is
({1, 2, 3, 4, 5, 6}, {(1, 2), (1, 5), (2, 1), (2, 4), (5, 4), (5, 6), (6, 1), (6, 3)}).
A path in which all the arrows point in the same direction as its steps is called a
directed path. A directed graph is strongly connected if a directed path connects
every two nodes. Directed graphs are a handy way of depicting binary relations.
If R is a binary relation whose domain is D × D, a labeled graph G = (D, E)
represents R, where E = {(x, y) | xRy}.
EXAMPLE 0.17
The directed graph shown here represents the relation given in Example 0.10.
FIGURE 0.18
The graph of the relation beats
STRINGS AND LANGUAGES
Strings of characters are fundamental building blocks in computer science. The
alphabet over which the strings are defined may vary with the application. For
our purposes, we define an alphabet to be any nonempty finite set. The members
of the alphabet are the symbols of the alphabet. We generally use capital Greek
letters Σ and Γ to designate alphabets and a typewriter font for symbols from an
alphabet. The following are a few examples of alphabets.
Σ1 = {0, 1}
Σ2 = {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z}
Γ = {0, 1, x, y, z}
A string over an alphabet is a finite sequence of symbols from that alphabet,
usually written next to one another and not separated by commas. If Σ1 = {0, 1},
then 01001 is a string over Σ1. If Σ2 = {a, b, c, ..., z}, then abracadabra is a
string over Σ2. If w is a string over Σ, the length of w, written |w|, is the number
of symbols that it contains. The string of length zero is called the empty string
and is written ε. The empty string plays the role of 0 in a number system. If w
has length n, we can write w = w1w2···wn where each wi ∈ Σ. The reverse
of w, written wᴿ, is the string obtained by writing w in the opposite order (i.e.,
wnwn−1···w1). String z is a substring of w if z appears consecutively within w.
For example, cad is a substring of abracadabra.
If we have string x of length m and string y of length n, the concatenation
of x and y, written xy, is the string obtained by appending y to the end of x, as
in x1···xm y1···yn. To concatenate a string with itself many times, we use the
superscript notation xᵏ to mean the concatenation xx···x of k copies of x.
The lexicographic order of strings is the same as the familiar dictionary order.
We'll occasionally use a modified lexicographic order, called shortlex order or
simply string order, that is identical to lexicographic order, except that shorter
strings precede longer strings. Thus the string ordering of all strings over the
alphabet {0, 1} is
(ε, 0, 1, 00, 01, 10, 11, 000, ...).
Say that string x is a prefix of string y if a string z exists where xz = y, and that
x is a proper prefix of y if in addition x ≠ y. A language is a set of strings. A
language is prefix-free if no member is a proper prefix of another member.
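Shortlex order is easy to realize in code: sort by length first and break ties lexicographically. The sketch below enumerates all strings over {0, 1} up to length 3, with the empty string playing the role of ε.

```python
from itertools import product

# All strings over {0, 1} of length at most 3, then sorted in shortlex
# (string) order: by length first, lexicographically within a length.
strings = ["".join(p) for n in range(4) for p in product("01", repeat=n)]
ordered = sorted(strings, key=lambda s: (len(s), s))
print(ordered[:8])  # ['', '0', '1', '00', '01', '10', '11', '000']
```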
BOOLEAN LOGIC
Boolean logic is a mathematical system built around the two values TRUE and
FALSE. Though originally conceived of as pure mathematics, this system is now
considered to be the foundation of digital electronics and computer design. The
values TRUE and FALSE are called the Boolean values and are often represented
by the values 1 and 0. We use Boolean values in situations with two possibilities,
such as a wire that may have a high or a low voltage, a proposition that may be
true or false, or a question that may be answered yes or no.
We can manipulate Boolean values with the Boolean operations. The sim-
plest Boolean operation is the negation or NOT operation, designated with the
symbol ¬. The negation of a Boolean value is the opposite value. Thus ¬0 = 1
and ¬1 = 0. We designate the conjunction or AND operation with the sym-
bol ∧. The conjunction of two Boolean values is 1 if both of those values are 1.
The disjunction or OR operation is designated with the symbol ∨. The disjunc-
tion of two Boolean values is 1 if either of those values is 1. We summarize this
information as follows.

    0 ∧ 0 = 0    0 ∨ 0 = 0    ¬0 = 1
    0 ∧ 1 = 0    0 ∨ 1 = 1    ¬1 = 0
    1 ∧ 0 = 0    1 ∨ 0 = 1
    1 ∧ 1 = 1    1 ∨ 1 = 1
We use Boolean operations for combining simple statements into more com-
plex Boolean expressions, just as we use the arithmetic operations + and × to
construct complex arithmetic expressions. For example, if P is the Boolean value
representing the truth of the statement "the sun is shining" and Q represents the
truth of the statement "today is Monday", we may write P ∧ Q to represent the
truth value of the statement "the sun is shining and today is Monday" and sim-
ilarly for P ∨ Q with and replaced by or. The values P and Q are called the
operands of the operation.
Several other Boolean operations occasionally appear. The exclusive or, or
XOR, operation is designated by the ⊕ symbol and is 1 if either but not both of
its two operands is 1. The equality operation, written with the symbol ↔, is 1
if both of its operands have the same value. Finally, the implication operation
is designated by the symbol → and is 0 if its first operand is 1 and its second
operand is 0; otherwise, → is 1. We summarize this information as follows.

    0 ⊕ 0 = 0    0 ↔ 0 = 1    0 → 0 = 1
    0 ⊕ 1 = 1    0 ↔ 1 = 0    0 → 1 = 1
    1 ⊕ 0 = 1    1 ↔ 0 = 0    1 → 0 = 0
    1 ⊕ 1 = 0    1 ↔ 1 = 1    1 → 1 = 1
We can establish various relationships among these operations. In fact, we
can express all Boolean operations in terms of the AND and NOT operations, as
the following identities show. The two expressions in each row are equivalent.
Each row expresses the operation in the left-hand column in terms of operations
above it and AND and NOT.

    P ∨ Q    ¬(¬P ∧ ¬Q)
    P → Q    ¬P ∨ Q
    P ↔ Q    (P → Q) ∧ (Q → P)
    P ⊕ Q    ¬(P ↔ Q)
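Because each operation has only four input combinations, the identities can be verified by brute-force truth tables. The sketch below represents Boolean values as the integers 0 and 1 and defines each operation as a small helper function (the function names are ours, chosen to mirror the text).

```python
from itertools import product

def NOT(p):        return 1 - p
def AND(p, q):     return p & q
def OR(p, q):      return p | q
def IMPLIES(p, q): return 0 if (p == 1 and q == 0) else 1
def IFF(p, q):     return 1 if p == q else 0
def XOR(p, q):     return p ^ q

# Check each identity on every pair of operand values.
for p, q in product((0, 1), repeat=2):
    assert OR(p, q) == NOT(AND(NOT(p), NOT(q)))        # P ∨ Q
    assert IMPLIES(p, q) == OR(NOT(p), q)              # P → Q
    assert IFF(p, q) == AND(IMPLIES(p, q), IMPLIES(q, p))  # P ↔ Q
    assert XOR(p, q) == NOT(IFF(p, q))                 # P ⊕ Q
print("all four identities hold")
```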
The distributive law for AND and OR comes in handy when we manipulate
Boolean expressions. It is similar to the distributive law for addition and multi-
plication, which states that a × (b + c) = (a × b) + (a × c). The Boolean version
comes in two forms:
• P ∧ (Q ∨ R) equals (P ∧ Q) ∨ (P ∧ R), and its dual
• P ∨ (Q ∧ R) equals (P ∨ Q) ∧ (P ∨ R).
SUMMARY OF MATHEMATICAL TERMS

Alphabet              A finite, nonempty set of objects called symbols
Argument              An input to a function
Binary relation       A relation whose domain is a set of pairs
Boolean operation     An operation on Boolean values
Boolean value         The values TRUE or FALSE, often represented by 1 or 0
Cartesian product     An operation on sets forming a set of all tuples of elements from respective sets
Complement            An operation on a set, forming the set of all elements not present
Concatenation         An operation that joins strings together
Conjunction           Boolean AND operation
Connected graph       A graph with paths connecting every two nodes
Cycle                 A path that starts and ends in the same node
Directed graph        A collection of points and arrows connecting some pairs of points
Disjunction           Boolean OR operation
Domain                The set of possible inputs to a function
Edge                  A line in a graph
Element               An object in a set
Empty set             The set with no members
Empty string          The string of length zero
Equivalence relation  A binary relation that is reflexive, symmetric, and transitive
Function              An operation that translates inputs into outputs
Graph                 A collection of points and lines connecting some pairs of points
Intersection          An operation on sets forming the set of common elements
k-tuple               A list of k objects
Language              A set of strings
Member                An object in a set
Node                  A point in a graph
Ordered pair          A list of two elements
Path                  A sequence of nodes in a graph connected by edges
Predicate             A function whose range is {TRUE, FALSE}
Property              A predicate
Range                 The set from which outputs of a function are drawn
Relation              A predicate, most typically when the domain is a set of k-tuples
Sequence              A list of objects
Set                   A group of objects
Simple path           A path without repetition
Singleton set         A set with one member
String                A finite list of symbols from an alphabet
Symbol                A member of an alphabet
Tree                  A connected graph without simple cycles
Union                 An operation on sets combining all elements into a single set
Unordered pair        A set with two members
Vertex                A point in a graph
0.3
DEFINITIONS, THEOREMS, AND PROOFS
Theorems and proofs are the heart and soul of mathematics and definitions are
its spirit. These three entities are central to every mathematical subject, includ-
ing ours.
Definitions describe the objects and notions that we use. A definition may be
simple, as in the definition of set given earlier in this chapter, or complex as in
the definition of security in a cryptographic system. Precision is essential to any
mathematical definition. When defining some object, we must make clear what
constitutes that object and what does not.
After we have defined various objects and notions, we usually make math-
ematical statements about them. Typically, a statement expresses that some
object has a certain property. The statement may or may not be true; but like a
definition, it must be precise. No ambiguity about its meaning is allowed.
A proof is a convincing logical argument that a statement is true. In mathe-
matics, an argument must be airtight; that is, convincing in an absolute sense. In
everyday life or in the law, the standard of proof is lower. A murder trial demands
proof “beyond any reasonable doubt.” The weight of evidence may compel the
jury to accept the innocence or guilt of the suspect. However, evidence plays
no role in a mathematical proof. A mathematician demands proof beyond any
doubt.
A theorem is a mathematical statement proved true. Generally we reserve the
use of that word for statements of special interest. Occasionally we prove state-
ments that are interesting only because they assist in the proof of another, more
significant statement. Such statements are called lemmas. Occasionally a theo-
rem or its proof may allow us to conclude easily that other, related statements
are true. These statements are called corollaries of the theorem.
FINDING PROOFS
The only way to determine the truth or falsity of a mathematical statement is
with a mathematical proof. Unfortunately, finding proofs isn’t always easy. It
can’t be reduced to a simple set of rules or processes. During this course, you will
be asked to present proofs of various statements. Don’t despair at the prospect!
Even though no one has a recipe for producing proofs, some helpful general
strategies are available.
First, carefully read the statement you want to prove. Do you understand
all the notation? Rewrite the statement in your own words. Break it down and
consider each part separately.
Sometimes the parts of a multipart statement are not immediately evident.
One frequently occurring type of multipart statement has the form “P if and
only if Q”, often written “P iff Q”, where both P and Q are mathematical state-
ments. This notation is shorthand for a two-part statement. The first part is “P
only if Q,” which means: If P is true, then Q is true, written P⇒Q. The second
is “P if Q,” which means: If Q is true, then P is true, written P⇐Q. The first
of these parts is the forward direction of the original statement and the second
is the reverse direction. We write “P if and only if Q” as P⇐⇒Q. To prove a
statement of this form, you must prove each of the two directions. Often, one of
these directions is easier to prove than the other.
Another type of multipart statement states that two sets A and B are equal.
The first part states that A is a subset of B, and the second part states that B
is a subset of A. Thus one common way to prove that A = B is to prove that
every member of A also is a member of B, and that every member of B also is a
member of A.
Next, when you want to prove a statement or part thereof, try to get an in-
tuitive, “gut” feeling of why it should be true. Experimenting with examples is
especially helpful. Thus if the statement says that all objects of a certain type
have a particular property, pick a few objects of that type and observe that they
actually do have that property. After doing so, try to find an object that fails to
have the property, called a counterexample. If the statement actually is true, you
will not be able to find a counterexample. Seeing where you run into difficulty
when you attempt to find a counterexample can help you understand why the
statement is true.
EXAMPLE 0.19
Suppose that you want to prove the statement for every graph G, the sum of the
degrees of all the nodes in G is an even number.
First, pick a few graphs and observe this statement in action. Here are two
examples.
[Figure: two example graphs, each node labeled with its degree.]
Next, try to find a counterexample; that is, a graph in which the sum is an odd
number.
Can you now begin to see why the statement is true and how to prove it?
If you are still stuck trying to prove a statement, try something easier. Attempt
to prove a special case of the statement. For example, if you are trying to prove
that some property is true for every k>0, first try to prove it for k=1. If you
succeed, try it for k=2, and so on until you can understand the more general
case. If a special case is hard to prove, try a different special case or perhaps a
special case of the special case.
Finally, when you believe that you have found the proof, you must write it
up properly. A well-written proof is a sequence of statements, wherein each one
follows by simple reasoning from previous statements in the sequence. Carefully
writing a proof is important, both to enable a reader to understand it, and for
you to be sure that it is free from errors.
The following are a few tips for producing a proof.
• Be patient. Finding proofs takes time. If you don’t see how to do it right
away, don’t worry. Researchers sometimes work for weeks or even years to
find a single proof.
• Come back to it. Look over the statement you want to prove, think about
it a bit, leave it, and then return a few minutes or hours later. Let the
unconscious, intuitive part of your mind have a chance to work.
• Be neat. When you are building your intuition for the statement you are
trying to prove, use simple, clear pictures and/or text. You are trying to
develop your insight into the statement, and sloppiness gets in the way of
insight. Furthermore, when you are writing a solution for another person
to read, neatness will help that person understand it.
• Be concise. Brevity helps you express high-level ideas without getting lost in
details. Good mathematical notation is useful for expressing ideas concisely.
But be sure to include enough of your reasoning when writing up a proof
so that the reader can easily understand what you are trying to say.
For practice, let’s prove one of DeMorgan’s laws.
THEOREM 0.20
For any two sets A and B, (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ, where Xᶜ denotes the
complement of X (written with an overbar in the original).
First, is the meaning of this theorem clear? If you don’t understand the mean-
ing of the symbols ∪ or ∩ or the complement, review the discussion on page 4.
To prove this theorem, we must show that the two sets (A ∪ B)ᶜ and Aᶜ ∩ Bᶜ are
equal. Recall that we may prove that two sets are equal by showing that every
member of one set also is a member of the other and vice versa. Before looking
at the following proof, consider a few examples and then try to prove it yourself.
PROOF This theorem states that two sets, (A ∪ B)ᶜ and Aᶜ ∩ Bᶜ, are equal. We
prove this assertion by showing that every element of one also is an element of
the other and vice versa.
Suppose that x is an element of (A ∪ B)ᶜ. Then x is not in A ∪ B, from the
definition of the complement of a set. Therefore, x is not in A and x is not in B,
from the definition of the union of two sets. In other words, x is in Aᶜ and x is in
Bᶜ. Hence the definition of the intersection of two sets shows that x is in Aᶜ ∩ Bᶜ.
For the other direction, suppose that x is in Aᶜ ∩ Bᶜ. Then x is in both Aᶜ and
Bᶜ. Therefore, x is not in A and x is not in B, and thus not in the union of
these two sets. Hence x is in the complement of the union of these sets; in other
words, x is in (A ∪ B)ᶜ, which completes the proof of the theorem.
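Before or after working through the proof, the law can be spot-checked on concrete sets. In this Python sketch (mine, not the book's), complements are taken relative to an arbitrary finite universe `U`, and the sample sets `A` and `B` are illustrative choices:

```python
# Check De Morgan's law: the complement of (A ∪ B) equals the
# intersection of the complements, inside a finite universe U.
U = set(range(10))        # arbitrary sample universe
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

# Python's set difference U - X plays the role of the complement of X.
assert U - (A | B) == (U - A) & (U - B)

# The dual law, complement of (A ∩ B) = union of complements, also holds.
assert U - (A & B) == (U - A) | (U - B)
```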
Let’s now prove the statement in Example 0.19.
THEOREM 0.21
For every graph G, the sum of the degrees of all the nodes in G is an even
number.
PROOF Every edge in G is connected to two nodes. Each edge contributes 1
to the degree of each node to which it is connected. Therefore, each edge con-
tributes 2 to the sum of the degrees of all the nodes. Hence, if G contains e
edges, then the sum of the degrees of all the nodes of G is 2e, which is an even
number.
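Theorem 0.21 is also easy to confirm empirically. This Python sketch (the example graphs are my own, except the second, which is the edge list from Exercise 0.8) tallies degrees from an edge list and checks that the sum is 2e:

```python
from collections import defaultdict

# Sum the degrees of all nodes in a graph given as a list of edges,
# and check that the result equals 2e (and so is even).
def degree_sum(edges):
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1        # each edge adds 1 to the degree of each endpoint
        deg[v] += 1
    return sum(deg.values())

triangle = [(1, 2), (2, 3), (3, 1)]
g08 = [(1, 2), (2, 3), (1, 3), (2, 4), (1, 4)]   # graph of Exercise 0.8
for edges in (triangle, g08):
    assert degree_sum(edges) == 2 * len(edges)   # hence always even
```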
0.4
TYPES OF PROOF
Several types of arguments arise frequently in mathematical proofs. Here, we
describe a few that often occur in the theory of computation. Note that a proof
may contain more than one type of argument because the proof may contain
within it several different subproofs.
PROOF BY CONSTRUCTION
Many theorems state that a particular type of object exists. One way to prove
such a theorem is by demonstrating how to construct the object. This technique
is aproof by construction .
Let’s use a proof by construction to prove the following theorem. We define
a graph to be k-regular if every node in the graph has degree k.
THEOREM 0.22
For each even number n greater than 2, there exists a 3-regular graph with n
nodes.
PROOF Let n be an even number greater than 2. Construct graph G = (V, E)
with n nodes as follows. The set of nodes of G is V = {0, 1, . . . , n−1}, and the
set of edges of G is the set
    E = {{i, i+1} | for 0 ≤ i ≤ n−2} ∪ {{n−1, 0}}
        ∪ {{i, i+n/2} | for 0 ≤ i ≤ n/2−1}.
Picture the nodes of this graph written consecutively around the circumference
of a circle. In that case, the edges described in the top line of E go between
adjacent pairs around the circle. The edges described in the bottom line of E go
between nodes on opposite sides of the circle. This mental picture clearly shows
that every node in G has degree 3.
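A proof by construction can often be carried out mechanically. This Python sketch (mine, not the text's) builds the graph of Theorem 0.22 for several even n and verifies 3-regularity:

```python
# Construct the 3-regular graph of Theorem 0.22 for an even n > 2.
def three_regular(n):
    assert n > 2 and n % 2 == 0
    # Ring edges {i, i+1} for 0 <= i <= n-2, plus the wrap-around {n-1, 0}.
    ring = [frozenset({i, i + 1}) for i in range(n - 1)] + [frozenset({n - 1, 0})]
    # Diameter edges {i, i + n/2} for 0 <= i <= n/2 - 1.
    chords = [frozenset({i, i + n // 2}) for i in range(n // 2)]
    return set(ring) | set(chords)

def degrees(n, edges):
    deg = [0] * n
    for e in edges:
        for v in e:
            deg[v] += 1
    return deg

for n in (4, 6, 8, 10):
    assert all(d == 3 for d in degrees(n, three_regular(n)))
```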
PROOF BY CONTRADICTION
In one common form of argument for proving a theorem, we assume that the
theorem is false and then show that this assumption leads to an obviously false
consequence, called a contradiction. We use this type of reasoning frequently in
everyday life, as in the following example.
EXAMPLE 0.23
Jack sees Jill, who has just come in from outdoors. On observing that she is
completely dry, he knows that it is not raining. His “proof” that it is not raining
is that if it were raining (the assumption that the statement is false), Jill would be
wet (the obviously false consequence). Therefore, it must not be raining.
Next, let’s prove by contradiction that the square root of 2 is an irrational
number. A number is rational if it is a fraction m/n, where m and n are integers;
in other words, a rational number is the ratio of integers m and n. For example,
2/3 obviously is a rational number. A number is irrational if it is not rational.
THEOREM 0.24
√2 is irrational.
PROOF First, we assume for the purpose of later obtaining a contradiction
that √2 is rational. Thus
    √2 = m/n,
where m and n are integers. If both m and n are divisible by the same integer
greater than 1, divide both by the largest such integer. Doing so doesn’t change
the value of the fraction. Now, at least one of m and n must be an odd number.
We multiply both sides of the equation by n and obtain
    n√2 = m.
We square both sides and obtain
    2n² = m².
Because m² is 2 times the integer n², we know that m² is even. Therefore, m,
too, is even, as the square of an odd number always is odd. So we can write
m = 2k for some integer k. Then, substituting 2k for m, we get
    2n² = (2k)² = 4k².
Dividing both sides by 2, we obtain
    n² = 2k².
But this result shows that n² is even and hence that n is even. Thus we have
established that both m and n are even. But we had earlier reduced m and n so
that they were not both even—a contradiction.
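The key equation of the proof, m² = 2n², can also be probed by brute force: no pair of positive integers in any finite range satisfies it. A Python sketch (mine; the search bounds are arbitrary):

```python
# If √2 were rational, some integers m, n >= 1 would satisfy m² = 2n².
# Search a finite range; Theorem 0.24 guarantees no solution exists.
solutions = [(m, n)
             for n in range(1, 500)
             for m in range(1, 1000)
             if m * m == 2 * n * n]
assert solutions == []
```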
PROOF BY INDUCTION
Proof by induction is an advanced method used to show that all elements of
an infinite set have a specified property. For example, we may use a proof by
induction to show that an arithmetic expression computes a desired quantity for
every assignment to its variables, or that a program works correctly at all steps
or for all inputs.
To illustrate how proof by induction works, let’s take the infinite set to be the
natural numbers, N = {1, 2, 3, . . .}, and say that the property is called P. Our
goal is to prove that P(k) is true for each natural number k. In other words, we
want to prove that P(1) is true, as well as P(2), P(3), P(4), and so on.
Every proof by induction consists of two parts, the basis and the induction
step. Each part is an individual proof on its own. The basis proves that P(1) is
true. The induction step proves that for each i ≥ 1, if P(i) is true, then so is
P(i+1).
When we have proven both of these parts, the desired result follows—namely,
that P(i) is true for each i. Why? First, we know that P(1) is true because the
basis alone proves it. Second, we know that P(2) is true because the induction
step proves that if P(1) is true then P(2) is true, and we already know that P(1)
is true. Third, we know that P(3) is true because the induction step proves that
if P(2) is true then P(3) is true, and we already know that P(2) is true. This
process continues for all natural numbers, showing that P(4) is true, P(5) is
true, and so on.
Once you understand the preceding paragraph, you can easily understand
variations and generalizations of the same idea. For example, the basis doesn’t
necessarily need to start with 1; it may start with any value b. In that case, the
induction proof shows that P(k) is true for every k that is at least b.
In the induction step, the assumption that P(i) is true is called the induction
hypothesis. Sometimes having the stronger induction hypothesis that P(j) is
true for every j ≤ i is useful. The induction proof still works because when we
want to prove that P(i+1) is true, we have already proved that P(j) is true for
every j ≤ i.
The format for writing down a proof by induction is as follows.
Basis: Prove that P(1) is true.
...
Induction step: For each i ≥ 1, assume that P(i) is true and use this assumption
to show that P(i+1) is true.
...
Now, let’s prove by induction the correctness of the formula used to calculate
the size of monthly payments of home mortgages. When buying a home, many
people borrow some of the money needed for the purchase and repay this loan
over a certain number of years. Typically, the terms of such repayments stipulate
that a fixed amount of money is paid each month to cover the interest, as well as
part of the original sum, so that the total is repaid in 30 years. The formula for
calculating the size of the monthly payments is shrouded in mystery, but actually
is quite simple. It touches many people’s lives, so you should find it interesting.
We use induction to prove that it works, making it a good illustration of that
technique.
First, we set up the names and meanings of several variables. Let P be the
principal, the amount of the original loan. Let I > 0 be the yearly interest rate of
the loan, where I = 0.06 indicates a 6% rate of interest. Let Y be the monthly
payment. For convenience, we use I to define another variable M, the monthly
multiplier. It is the rate at which the loan changes each month because of the
interest on it. Following standard banking practice, the monthly interest rate is
one-twelfth of the annual rate, so M = 1 + I/12, and interest is paid monthly
(monthly compounding).
Two things happen each month. First, the amount of the loan tends to in-
crease because of the monthly multiplier. Second, the amount tends to decrease
because of the monthly payment. Let Pₜ be the amount of the loan outstand-
ing after the tth month. Then P₀ = P is the amount of the original loan,
P₁ = MP₀ − Y is the amount of the loan after one month, P₂ = MP₁ − Y is
the amount of the loan after two months, and so on. Now we are ready to state
and prove a theorem by induction on t that gives a formula for the value of Pₜ.
THEOREM 0.25
For each t ≥ 0,
    Pₜ = PMᵗ − Y · (Mᵗ − 1)/(M − 1).
PROOF
Basis: Prove that the formula is true for t = 0. If t = 0, then the formula states
that
    P₀ = PM⁰ − Y · (M⁰ − 1)/(M − 1).
We can simplify the right-hand side by observing that M⁰ = 1. Thus we get
    P₀ = P,
which holds because we have defined P₀ to be P. Therefore, we have proved
that the basis of the induction is true.
Induction step: For each k ≥ 0, assume that the formula is true for t = k and
show that it is true for t = k+1. The induction hypothesis states that
    Pₖ = PMᵏ − Y · (Mᵏ − 1)/(M − 1).
Our objective is to prove that
    Pₖ₊₁ = PMᵏ⁺¹ − Y · (Mᵏ⁺¹ − 1)/(M − 1).
We do so with the following steps. First, from the definition of Pₖ₊₁ from
Pₖ, we know that
    Pₖ₊₁ = PₖM − Y.
Therefore, using the induction hypothesis to calculate Pₖ,
    Pₖ₊₁ = [ PMᵏ − Y · (Mᵏ − 1)/(M − 1) ] · M − Y.
Multiplying through by M and rewriting Y yields
    Pₖ₊₁ = PMᵏ⁺¹ − Y · (Mᵏ⁺¹ − M)/(M − 1) − Y · (M − 1)/(M − 1)
         = PMᵏ⁺¹ − Y · (Mᵏ⁺¹ − 1)/(M − 1).
Thus the formula is correct for t = k+1, which proves the theorem.
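The induction above can be mirrored computationally: iterate the recurrence Pₜ₊₁ = MPₜ − Y month by month and compare each balance against the closed form. A Python sketch (mine; the sample loan values are arbitrary, not from the text):

```python
# Compare the month-by-month recurrence P_{t+1} = M*P_t - Y with the
# closed form P_t = P*M**t - Y*(M**t - 1)/(M - 1) of Theorem 0.25.
P, I, Y = 100_000.0, 0.06, 600.0   # sample principal, rate, and payment
M = 1 + I / 12                     # monthly multiplier

def closed_form(t):
    return P * M**t - Y * (M**t - 1) / (M - 1)

balance = P                        # P_0 = P
for t in range(1, 121):            # simulate ten years of payments
    balance = M * balance - Y      # the recurrence
    assert abs(balance - closed_form(t)) < 1e-6
```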
Problem 0.15 asks you to use the preceding formula to calculate actual mort-
gage payments.
EXERCISES
0.1 Examine the following formal descriptions of sets so that you understand which
members they contain. Write a short informal English description of each set.
a. {1, 3, 5, 7, . . .}
b. {. . . , −4, −2, 0, 2, 4, . . .}
c. {n | n = 2m for some m in N}
d. {n | n = 2m for some m in N, and n = 3k for some k in N}
e. {w | w is a string of 0s and 1s and w equals the reverse of w}
f. {n | n is an integer and n = n + 1}
0.2 Write formal descriptions of the following sets.
a. The set containing the numbers 1, 10, and 100
b. The set containing all integers that are greater than 5
c. The set containing all natural numbers that are less than 5
d. The set containing the string aba
e. The set containing the empty string
f. The set containing nothing at all
0.3 Let A be the set {x, y, z} and B be the set {x, y}.
a. Is A a subset of B?
b. Is B a subset of A?
c. What is A ∪ B?
d. What is A ∩ B?
e. What is A × B?
f. What is the power set of B?
0.4 If A has a elements and B has b elements, how many elements are in A × B?
Explain your answer.
0.5 If C is a set with c elements, how many elements are in the power set of C? Explain
your answer.
0.6 Let X be the set {1, 2, 3, 4, 5} and Y be the set {6, 7, 8, 9, 10}. The unary function
f : X → Y and the binary function g : X × Y → Y are described in the following
tables.
     n | f(n)        g |  6   7   8   9  10
    ---+-----       ---+--------------------
     1 |  6          1 | 10  10  10  10  10
     2 |  7          2 |  7   8   9  10   6
     3 |  6          3 |  7   7   8   8   9
     4 |  7          4 |  9   8   7   6  10
     5 |  6          5 |  6   6   6   6   6
a. What is the value of f(2)?
b. What are the range and domain of f?
c. What is the value of g(2, 10)?
d. What are the range and domain of g?
e. What is the value of g(4, f(4))?
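Finite functions like f and g are just lookup tables, so transcribing them as Python dictionaries (my own encoding, not the book's) makes evaluations such as those in this exercise mechanical:

```python
# The unary function f: X -> Y and binary function g: X x Y -> Y,
# transcribed directly from the tables of Exercise 0.6.
f = {1: 6, 2: 7, 3: 6, 4: 7, 5: 6}

g = {1: {6: 10, 7: 10, 8: 10, 9: 10, 10: 10},
     2: {6: 7,  7: 8,  8: 9,  9: 10, 10: 6},
     3: {6: 7,  7: 7,  8: 8,  9: 8,  10: 9},
     4: {6: 9,  7: 8,  8: 7,  9: 6,  10: 10},
     5: {6: 6,  7: 6,  8: 6,  9: 6,  10: 6}}

print(f[2])        # f(2)
print(g[2][10])    # g(2, 10)
print(g[4][f[4]])  # g(4, f(4)): evaluate f(4) first, then apply g
```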
0.7 For each part, give a relation that satisfies the condition.
a. Reflexive and symmetric but not transitive
b. Reflexive and transitive but not symmetric
c. Symmetric and transitive but not reflexive
0.8 Consider the undirected graph G = (V, E) where V, the set of nodes, is {1, 2, 3, 4}
and E, the set of edges, is {{1, 2}, {2, 3}, {1, 3}, {2, 4}, {1, 4}}. Draw the
graph G. What are the degrees of each node? Indicate a path from node 3 to
node 4 on your drawing of G.
0.9 Write a formal description of the following graph.
[Figure: the graph to be described appears here in the original.]
PROBLEMS
0.10 Find the error in the following proof that 2 = 1.
Consider the equation a = b. Multiply both sides by a to obtain a² = ab. Subtract
b² from both sides to get a² − b² = ab − b². Now factor each side, (a+b)(a−b) =
b(a−b), and divide each side by (a−b) to get a+b = b. Finally, let a and b equal 1,
which shows that 2 = 1.
0.11 Let S(n) = 1 + 2 + ··· + n be the sum of the first n natural numbers and let
C(n) = 1³ + 2³ + ··· + n³ be the sum of the first n cubes. Prove the following
equalities by induction on n, to arrive at the curious conclusion that C(n) = S²(n)
for every n.
a. S(n) = ½ n(n+1).
b. C(n) = ¼ (n⁴ + 2n³ + n²) = ¼ n²(n+1)².
0.12 Find the error in the following proof that all horses are the same color.
CLAIM: In any set of h horses, all horses are the same color.
PROOF: By induction on h.
Basis: For h = 1. In any set containing just one horse, all horses clearly are the
same color.
Induction step: For k ≥ 1, assume that the claim is true for h = k and prove that
it is true for h = k+1. Take any set H of k+1 horses. We show that all the horses
in this set are the same color. Remove one horse from this set to obtain the set H₁
with just k horses. By the induction hypothesis, all the horses in H₁ are the same
color. Now replace the removed horse and remove a different one to obtain the set
H₂. By the same argument, all the horses in H₂ are the same color. Therefore, all
the horses in H must be the same color, and the proof is complete.
0.13 Show that every graph with two or more nodes contains two nodes that have equal
degrees.
A⋆0.14 Ramsey’s theorem. Let G be a graph. A clique in G is a subgraph in which every
two nodes are connected by an edge. An anti-clique, also called an independent
set, is a subgraph in which every two nodes are not connected by an edge. Show
that every graph with n nodes contains either a clique or an anti-clique with at least
½ log₂ n nodes.
A0.15 Use Theorem 0.25 to derive a formula for calculating the size of the monthly pay-
ment for a mortgage in terms of the principal P, the interest rate I, and the number
of payments t. Assume that after t payments have been made, the loan amount is
reduced to 0. Use the formula to calculate the dollar amount of each monthly pay-
ment for a 30-year mortgage with 360 monthly payments on an initial loan amount
of $100,000 with a 5% annual interest rate.
SELECTED SOLUTIONS
0.14 Make space for two piles of nodes: A and B. Then, starting with the entire graph,
repeatedly add each remaining node x to A if its degree is greater than one half the
number of remaining nodes and to B otherwise, and discard all nodes to which x
isn’t (is) connected if it was added to A (B). Continue until no nodes are left. At
most half of the nodes are discarded at each of these steps, so at least log₂ n steps
will occur before the process terminates. Each step adds a node to one of the piles,
so one of the piles ends up with at least ½ log₂ n nodes. The A pile contains the
nodes of a clique and the B pile contains the nodes of an anti-clique.
0.15 We let Pₜ = 0 and solve for Y to get the formula: Y = PMᵗ(M−1)/(Mᵗ−1).
For P = $100,000, I = 0.05, and t = 360, we have M = 1 + (0.05)/12. We use a
calculator to find that Y ≈ $536.82 is the monthly payment.
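The calculator step in the solution to 0.15 takes one line of Python (my sketch, using the solution's own formula and values):

```python
# Y = P * M^t * (M - 1) / (M^t - 1), from setting P_t = 0 in Theorem 0.25.
P, I, t = 100_000.0, 0.05, 360
M = 1 + I / 12
Y = P * M**t * (M - 1) / (M**t - 1)
print(round(Y, 2))   # monthly payment in dollars
```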
PART ONE
AUTOMATA AND LANGUAGES
1
REGULAR LANGUAGES
The theory of computation begins with a question: What is a computer? It is
perhaps a silly question, as everyone knows that this thing I type on is a com-
puter. But these real computers are quite complicated—too much so to allow us
to set up a manageable mathematical theory of them directly. Instead, we use an
idealized computer called a computational model. As with any model in science, a computational model may be accurate in some ways but perhaps not in others. Thus we will use several different computational models, depending on the features we want to focus on. We begin with the simplest model, called the finite state machine or finite automaton.
1.1
FINITE AUTOMATA
Finite automata are good models for computers with an extremely limited
amount of memory. What can a computer do with such a small memory? Many
useful things! In fact, we interact with such computers all the time, as they lie at
the heart of various electromechanical devices.
The controller for an automatic door is one example of such a device. Often
found at supermarket entrances and exits, automatic doors swing open when the
controller senses that a person is approaching. An automatic door has a pad
in front to detect the presence of a person about to walk through the doorway.
Another pad is located to the rear of the doorway so that the controller can hold
the door open long enough for the person to pass all the way through and also
so that the door does not strike someone standing behind it as it opens. This
configuration is shown in the following figure.
FIGURE 1.1
To p v i e w o f a n a u t o m a t i c d o o r
The controller is in either of two states: “OPEN” or “CLOSED,” representing the corresponding condition of the door. As shown in the following figures, there are four possible input conditions: “FRONT” (meaning that a person is standing on the pad in front of the doorway), “REAR” (meaning that a person is standing on the pad to the rear of the doorway), “BOTH” (meaning that people are standing on both pads), and “NEITHER” (meaning that no one is standing on either pad).
FIGURE 1.2
State diagram for an automatic door controller
                     input signal
state      NEITHER    FRONT    REAR     BOTH
CLOSED     CLOSED     OPEN     CLOSED   CLOSED
OPEN       CLOSED     OPEN     OPEN     OPEN
FIGURE 1.3
State transition table for an automatic door controller
The controller moves from state to state, depending on the input it receives.
When in the CLOSED state and receiving input NEITHER or REAR, it remains in the CLOSED state. In addition, if the input BOTH is received, it stays CLOSED because opening the door risks knocking someone over on the rear pad. But if the input FRONT arrives, it moves to the OPEN state. In the OPEN state, if input FRONT, REAR, or BOTH is received, it remains in OPEN. If input NEITHER arrives, it returns to CLOSED.
For example, a controller might start in state CLOSED and receive the series of input signals FRONT, REAR, NEITHER, FRONT, BOTH, NEITHER, REAR, and NEITHER. It then would go through the series of states CLOSED (starting), OPEN, OPEN, CLOSED, OPEN, OPEN, CLOSED, CLOSED, and CLOSED.
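The transition table of Figure 1.3 can be simulated directly. In this sketch (ours, for illustration), the table becomes a dictionary and the controller is stepped through the example input sequence above:

```python
# Transition table of Figure 1.3, written as a dictionary.
DOOR = {
    ("CLOSED", "NEITHER"): "CLOSED",
    ("CLOSED", "FRONT"):   "OPEN",
    ("CLOSED", "REAR"):    "CLOSED",
    ("CLOSED", "BOTH"):    "CLOSED",
    ("OPEN",   "NEITHER"): "CLOSED",
    ("OPEN",   "FRONT"):   "OPEN",
    ("OPEN",   "REAR"):    "OPEN",
    ("OPEN",   "BOTH"):    "OPEN",
}

def run(start, signals):
    """Return the sequence of states the controller passes through."""
    states = [start]
    for s in signals:
        states.append(DOOR[(states[-1], s)])
    return states

trace = run("CLOSED", ["FRONT", "REAR", "NEITHER", "FRONT",
                       "BOTH", "NEITHER", "REAR", "NEITHER"])
print(trace)
# → ['CLOSED', 'OPEN', 'OPEN', 'CLOSED', 'OPEN', 'OPEN', 'CLOSED', 'CLOSED', 'CLOSED']
```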
Thinking of an automatic door controller as a finite automaton is useful be-
cause that suggests standard ways of representation as in Figures 1.2 and 1.3.
This controller is a computer that has just a single bit of memory, capable of
recording which of the two states the controller is in. Other common devices
have controllers with somewhat larger memories. In an elevator controller, a
state may represent the floor the elevator is on and the inputs might be the sig-
nals received from the buttons. This computer might need several bits to keep
track of this information. Controllers for various household appliances such as
dishwashers and electronic thermostats, as well as parts of digital watches and
calculators, are additional examples of computers with limited memories. The
design of such devices requires keeping the methodology and terminology of
finite automata in mind.
Finite automata and their probabilistic counterpart Markov chains are useful
tools when we are attempting to recognize patterns in data. These devices are
used in speech processing and in optical character recognition. Markov chains
have even been used to model and predict price changes in financial markets.
We will now take a closer look at finite automata from a mathematical per-
spective. We will develop a precise definition of a finite automaton, terminology
for describing and manipulating finite automata, and theoretical results that de-
scribe their power and limitations. Besides giving you a clearer understanding
of what finite automata are and what they can and cannot do, this theoreti-
cal development will allow you to practice and become more comfortable with
mathematical definitions, theorems, and proofs in a relatively simple setting.
In beginning to describe the mathematical theory of finite automata, we do
so in the abstract, without reference to any particular application. The following
figure depicts a finite automaton called M1.
FIGURE 1.4
A finite automaton called M1 that has three states
Figure 1.4 is called the state diagram of M1. It has three states, labeled q1, q2, and q3. The start state, q1, is indicated by the arrow pointing at it from nowhere. The accept state, q2, is the one with a double circle. The arrows going from one state to another are called transitions.
When this automaton receives an input string such as 1101, it processes that string and produces an output. The output is either accept or reject. We will consider only this yes/no type of output for now to keep things simple. The processing begins in M1's start state. The automaton receives the symbols from the input string one by one from left to right. After reading each symbol, M1 moves from one state to another along the transition that has that symbol as its label. When it reads the last symbol, M1 produces its output. The output is accept if M1 is now in an accept state and reject if it is not.
For example, when we feed the input string 1101 into the machine M1 in Figure 1.4, the processing proceeds as follows:
1. Start in state q1.
2. Read 1, follow transition from q1 to q2.
3. Read 1, follow transition from q2 to q2.
4. Read 0, follow transition from q2 to q3.
5. Read 1, follow transition from q3 to q2.
6. Accept because M1 is in an accept state q2 at the end of the input.
Experimenting with this machine on a variety of input strings reveals that it accepts the strings 1, 01, 11, and 0101010101. In fact, M1 accepts any string that ends with a 1, as it goes to its accept state q2 whenever it reads the symbol 1. In addition, it accepts the strings 100, 0100, 110000, and 0101000000, and any string that ends with an even number of 0s following the last 1. It rejects other strings, such as 0, 10, and 101000. Can you describe the language consisting of all strings that M1 accepts? We will do so shortly.
FORMAL DEFINITION OF A FINITE AUTOMATON
In the preceding section, we used state diagrams to introduce finite automata.
Now we define finite automata formally. Although state diagrams are easier to
grasp intuitively, we need the formal definition, too, for two specific reasons.
First, a formal definition is precise. It resolves any uncertainties about what
is allowed in a finite automaton. If you were uncertain about whether finite
automata were allowed to have 0 accept states or whether they must have exactly one transition exiting every state for each possible input symbol, you could
consult the formal definition and verify that the answer is yes in both cases. Sec-
ond, a formal definition provides notation. Good notation helps you think and
express your thoughts clearly.
The language of a formal definition is somewhat arcane, having some similarity to the language of a legal document. Both need to be precise, and every detail must be spelled out.
A finite automaton has several parts. It has a set of states and rules for going from one state to another, depending on the input symbol. It has an input alphabet that indicates the allowed input symbols. It has a start state and a set of accept states. The formal definition says that a finite automaton is a list of those five objects: set of states, input alphabet, rules for moving, start state, and accept states. In mathematical language, a list of five elements is often called a 5-tuple. Hence we define a finite automaton to be a 5-tuple consisting of these five parts.
We use something called a transition function, frequently denoted δ, to define the rules for moving. If the finite automaton has an arrow from a state x to a state y labeled with the input symbol 1, that means that if the automaton is in state x when it reads a 1, it then moves to state y. We can indicate the same thing with the transition function by saying that δ(x, 1) = y. This notation is a kind of mathematical shorthand. Putting it all together, we arrive at the formal definition of finite automata.
DEFINITION 1.5
A finite automaton is a 5-tuple (Q, Σ, δ, q0, F), where
1. Q is a finite set called the states,
2. Σ is a finite set called the alphabet,
3. δ : Q × Σ → Q is the transition function,¹
4. q0 ∈ Q is the start state, and
5. F ⊆ Q is the set of accept states.²
¹Refer back to page 7 if you are uncertain about the meaning of δ : Q × Σ → Q.
²Accept states sometimes are called final states.
The formal definition precisely describes what we mean by a finite automaton. For example, returning to the earlier question of whether 0 accept states is allowable, you can see that setting F to be the empty set ∅ yields 0 accept states, which is allowable. Furthermore, the transition function δ specifies exactly one next state for each possible combination of a state and an input symbol. That answers our other question affirmatively, showing that exactly one transition arrow exits every state for each possible input symbol.
We can use the notation of the formal definition to describe individual finite automata by specifying each of the five parts listed in Definition 1.5. For example, let's return to the finite automaton M1 we discussed earlier, redrawn here for convenience.
FIGURE 1.6
The finite automaton M1
We can describe M1 formally by writing M1 = (Q, Σ, δ, q1, F), where
1. Q = {q1, q2, q3},
2. Σ = {0, 1},
3. δ is described as

           0    1
   q1     q1   q2
   q2     q3   q2
   q3     q2   q2 ,

4. q1 is the start state, and
5. F = {q2}.
If A is the set of all strings that machine M accepts, we say that A is the language of machine M and write L(M) = A. We say that M recognizes A or that M accepts A. Because the term accept has different meanings when we refer to machines accepting strings and machines accepting languages, we prefer the term recognize for languages in order to avoid confusion.
A machine may accept several strings, but it always recognizes only one language. If the machine accepts no strings, it still recognizes one language, namely the empty language ∅.
In our example, let
A = {w | w contains at least one 1 and an even number of 0s follow the last 1}.
Then L(M1) = A, or equivalently, M1 recognizes A.
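The five parts of M1's formal description translate directly into a simulation. This sketch (ours, for illustration) encodes δ as a dictionary and runs M1 on the strings discussed above:

```python
# The five parts of M1 from the formal description.
Q = {"q1", "q2", "q3"}
SIGMA = {"0", "1"}
delta = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q2", ("q3", "1"): "q2",
}
start, F = "q1", {"q2"}

def accepts(w):
    """Run M1 on string w; accept iff the final state lies in F."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in F

# Strings the text says M1 accepts and rejects.
assert all(accepts(w) for w in ["1101", "1", "01", "11", "100", "0101000000"])
assert not any(accepts(w) for w in ["0", "10", "101000"])
```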
EXAMPLES OF FINITE AUTOMATA
EXAMPLE 1.7
Here is the state diagram of finite automaton M2.
FIGURE 1.8
State diagram of the two-state finite automaton M2
In the formal description, M2 is ({q1, q2}, {0, 1}, δ, q1, {q2}). The transition function δ is

           0    1
   q1     q1   q2
   q2     q1   q2 .
Remember that the state diagram of M2 and the formal description of M2 contain the same information, only in different forms. You can always go from one to the other if necessary.
A good way to begin understanding any machine is to try it on some sample input strings. When you do these “experiments” to see how the machine is working, its method of functioning often becomes apparent. On the sample string 1101, the machine M2 starts in its start state q1 and proceeds first to state q2 after reading the first 1, and then to states q2, q1, and q2 after reading 1, 0, and 1. The string is accepted because q2 is an accept state. But string 110 leaves M2 in state q1, so it is rejected. After trying a few more examples, you would see that M2 accepts all strings that end in a 1. Thus L(M2) = {w | w ends in a 1}.
EXAMPLE 1.9
Consider the finite automaton M3.
FIGURE 1.10
State diagram of the two-state finite automaton M3
Machine M3 is similar to M2 except for the location of the accept state. As usual, the machine accepts all strings that leave it in an accept state when it has finished reading. Note that because the start state is also an accept state, M3 accepts the empty string ε. As soon as a machine begins reading the empty string, it is at the end; so if the start state is an accept state, ε is accepted. In addition to the empty string, this machine accepts any string ending with a 0. Here,
L(M3) = {w | w is the empty string ε or ends in a 0}.
EXAMPLE 1.11
The following figure shows a five-state machine M4.
FIGURE 1.12
Finite automaton M4
Machine M4 has two accept states, q1 and r1, and operates over the alphabet Σ = {a, b}. Some experimentation shows that it accepts the strings a, b, aa, bb, and bab, but not the strings ab, ba, or bbba. This machine begins in state s, and after it reads the first symbol in the input, it goes either left into the q states or right into the r states. In both cases, it can never return to the start state (in contrast to the previous examples), as it has no way to get from any other state back to s. If the first symbol in the input string is a, then it goes left and accepts when the string ends with an a. Similarly, if the first symbol is a b, the machine goes right and accepts when the string ends in b. So M4 accepts all strings that start and end with a or that start and end with b. In other words, M4 accepts strings that start and end with the same symbol.
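Figure 1.12 is not reproduced in this copy, but the description above pins down the machine's behavior. The transition table below is our plausible reconstruction of M4 (the state names q2 and r2, for "on the a side but last read b" and vice versa, are ours):

```python
# States: s (start); q1/q2 on the "a side"; r1/r2 on the "b side".
# q1 and r1 are the accept states.
delta = {
    ("s",  "a"): "q1", ("s",  "b"): "r1",
    ("q1", "a"): "q1", ("q1", "b"): "q2",
    ("q2", "a"): "q1", ("q2", "b"): "q2",
    ("r1", "b"): "r1", ("r1", "a"): "r2",
    ("r2", "b"): "r1", ("r2", "a"): "r2",
}

def m4_accepts(w):
    state = "s"
    for c in w:
        state = delta[(state, c)]
    return state in {"q1", "r1"}

# The strings the text lists as accepted and rejected.
assert all(m4_accepts(w) for w in ["a", "b", "aa", "bb", "bab"])
assert not any(m4_accepts(w) for w in ["ab", "ba", "bbba"])
```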
EXAMPLE 1.13
Figure 1.14 shows the three-state machine M5, which has a four-symbol input alphabet, Σ = {⟨RESET⟩, 0, 1, 2}. We treat ⟨RESET⟩ as a single symbol.
FIGURE 1.14
Finite automaton M5
Machine M5 keeps a running count of the sum of the numerical input symbols it reads, modulo 3. Every time it receives the ⟨RESET⟩ symbol, it resets the count to 0. It accepts if the sum is 0 modulo 3, or in other words, if the sum is a multiple of 3.
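Since each state qj of M5 just records the running sum modulo 3, the machine can be simulated by tracking that one number. A sketch (ours, for illustration; we write the reset symbol as the string "RESET"):

```python
def m5_accepts(w):
    """w is a sequence of symbols drawn from {"RESET", 0, 1, 2}.
    Being in state qj means the running sum is j modulo 3; accept in q0."""
    j = 0
    for s in w:
        j = 0 if s == "RESET" else (j + s) % 3
    return j == 0

# The sum 1+2 = 3 is a multiple of 3, so this is accepted...
assert m5_accepts([1, 2])
# ...and RESET wipes out whatever came before it.
assert m5_accepts([1, 0, "RESET", 2, 2, "RESET", 0, 1, 2])
assert not m5_accepts([1])
```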
Describing a finite automaton by state diagram is not possible in some cases.
That may occur when the diagram would be too big to draw or if, as in the next
example, the description depends on some unspecified parameter. In these cases,
we resort to a formal description to specify the machine.
EXAMPLE 1.15
Consider a generalization of Example 1.13, using the same four-symbol alphabet Σ. For each i ≥ 1, let Ai be the language of all strings where the sum of the numbers is a multiple of i, except that the sum is reset to 0 whenever the symbol ⟨RESET⟩ appears. For each Ai we give a finite automaton Bi, recognizing Ai. We describe the machine Bi formally as follows: Bi = (Qi, Σ, δi, q0, {q0}), where Qi is the set of i states {q0, q1, q2, . . . , qi−1}, and we design the transition function δi so that for each j, if Bi is in qj, the running sum is j, modulo i. For each qj let

δi(qj, 0) = qj,
δi(qj, 1) = qk, where k = j + 1 modulo i,
δi(qj, 2) = qk, where k = j + 2 modulo i, and
δi(qj, ⟨RESET⟩) = q0.
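Because Bi is given by a formula rather than a diagram, it is natural to build its transition function programmatically. A sketch (ours; states qj are represented as the integers 0..i−1, and ⟨RESET⟩ as the string "RESET"):

```python
def make_delta(i):
    """Build the transition function delta_i of B_i as a dictionary,
    following the four rules above."""
    delta = {}
    for j in range(i):
        delta[(j, 0)] = j
        delta[(j, 1)] = (j + 1) % i
        delta[(j, 2)] = (j + 2) % i
        delta[(j, "RESET")] = 0
    return delta

def B_accepts(i, w):
    delta, state = make_delta(i), 0
    for s in w:
        state = delta[(state, s)]
    return state == 0     # q0 is both the start state and the only accept state

assert B_accepts(5, [2, 2, 1])       # 2+2+1 = 5, a multiple of 5
assert not B_accepts(5, [2, 2])
assert B_accepts(5, [2, "RESET"])    # RESET brings the sum back to 0
```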
FORMAL DEFINITION OF COMPUTATION
So far we have described finite automata informally, using state diagrams, and
with a formal definition, as a 5-tuple. The informal description is easier to grasp
at first, but the formal definition is useful for making the notion precise, resolv-
ing any ambiguities that may have occurred in the informal description. Next we
do the same for a finite automaton’s computation. We already have an informal
idea of the way it computes, and we now formalize it mathematically.
Let M = (Q, Σ, δ, q0, F) be a finite automaton and let w = w1w2 · · · wn be a string where each wi is a member of the alphabet Σ. Then M accepts w if a sequence of states r0, r1, . . . , rn in Q exists with three conditions:
1. r0 = q0,
2. δ(ri, wi+1) = ri+1, for i = 0, . . . , n − 1, and
3. rn ∈ F.
Condition 1 says that the machine starts in the start state. Condition 2 says
that the machine goes from state to state according to the transition function.
Condition 3 says that the machine accepts its input if it ends up in an accept
state. We say that M recognizes language A if A = {w | M accepts w}.
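The three conditions can be checked mechanically: build the sequence r0, . . . , rn by repeated application of δ and test whether rn ∈ F. A sketch (ours, for illustration), run here on machine M2 of Example 1.7:

```python
def state_sequence(delta, q0, F, w):
    """Compute r0, r1, ..., rn for M = (Q, SIGMA, delta, q0, F) on w,
    and report whether M accepts w."""
    r = [q0]                                 # condition 1: r0 = q0
    for symbol in w:
        r.append(delta[(r[-1], symbol)])     # condition 2: delta(ri, wi+1) = ri+1
    return r, r[-1] in F                     # condition 3: rn in F

# M2 from Example 1.7, which accepts exactly the strings ending in 1.
delta = {("q1", "0"): "q1", ("q1", "1"): "q2",
         ("q2", "0"): "q1", ("q2", "1"): "q2"}
seq, ok = state_sequence(delta, "q1", {"q2"}, "1101")
print(seq, ok)   # → ['q1', 'q2', 'q2', 'q1', 'q2'] True
```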
DEFINITION 1.16
A language is called a regular language if some finite automaton recognizes it.
EXAMPLE 1.17
Take machine M5 from Example 1.13. Let w be the string
10⟨RESET⟩22⟨RESET⟩012.
Then M5 accepts w according to the formal definition of computation because the sequence of states it enters when computing on w is
q0, q1, q1, q0, q2, q1, q0, q0, q1, q0,
which satisfies the three conditions. The language of M5 is
L(M5) = {w | the sum of the symbols in w is 0 modulo 3, except that ⟨RESET⟩ resets the count to 0}.
As M5 recognizes this language, it is a regular language.
DESIGNING FINITE AUTOMATA
Whether it be of automaton or artwork, design is a creative process. As such, it cannot be reduced to a simple recipe or formula. However, you might find a particular approach helpful when designing various types of automata. That is, put yourself in the place of the machine you are trying to design and then see how you would go about performing the machine's task. Pretending that you are the machine is a psychological trick that helps engage your whole mind in the design process.
Let’s design a finite automaton using the “reader as automaton” method just
described. Suppose that you are given some language and want to design a finite
automaton that recognizes it. Pretending to be the automaton, you receive an
input string and must determine whether it is a member of the language the
automaton is supposed to recognize. You get to see the symbols in the string
one by one. After each symbol, you must decide whether the string seen so far is
in the language. The reason is that you, like the machine, don’t know when the
end of the string is coming, so you must always be ready with the answer.
First, in order to make these decisions, you have to figure out what you need
to remember about the string as you are reading it. Why not simply remember
all you have seen? Bear in mind that you are pretending to be a finite automaton
and that this type of machine has only a finite number of states, which means
a finite memory. Imagine that the input is extremely long—say, from here to
the moon—so that you could not possibly remember the entire thing. You have
a finite memory—say, a single sheet of paper—which has a limited storage ca-
pacity. Fortunately, for many languages you don’t need to remember the entire
input. You need to remember only certain crucial information. Exactly which
information is crucial depends on the particular language considered.
For example, suppose that the alphabet is {0,1} and that the language consists of all strings with an odd number of 1s. You want to construct a finite automaton E1 to recognize this language. Pretending to be the automaton, you start getting
an input string of 0s and 1s symbol by symbol. Do you need to remember the entire string seen so far in order to determine whether the number of 1s is odd? Of course not. Simply remember whether the number of 1s seen so far is even or odd and keep track of this information as you read new symbols. If you read a 1, flip the answer; but if you read a 0, leave the answer as is.
But how does this help you design E1? Once you have determined the necessary information to remember about the string as it is being read, you represent this information as a finite list of possibilities. In this instance, the possibilities would be
1. even so far, and
2. odd so far.
Then you assign a state to each of the possibilities. These are the states of E1, as shown here.

FIGURE 1.18
The two states qeven and qodd
Next, you assign the transitions by seeing how to go from one possibility to another upon reading a symbol. So, if state qeven represents the even possibility and state qodd represents the odd possibility, you would set the transitions to flip state on a 1 and stay put on a 0, as shown here.

FIGURE 1.19
Transitions telling how the possibilities rearrange
Next, you set the start state to be the state corresponding to the possibility associated with having seen 0 symbols so far (the empty string ε). In this case, the start state corresponds to state qeven because 0 is an even number. Last, set the accept states to be those corresponding to possibilities where you want to accept the input string. Set qodd to be an accept state because you want to accept
when you have seen an odd number of 1s. These additions are shown in the
following figure.
FIGURE 1.20
Adding the start and accept states
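The finished machine E1 (start state qeven, accept state qodd, flip on 1, stay put on 0) can be sketched in a few lines of Python (ours, for illustration):

```python
def e1_accepts(w):
    """E1: states q_even and q_odd; start in q_even, accept in q_odd."""
    state = "even"
    for c in w:
        if c == "1":                      # a 1 flips the state
            state = "odd" if state == "even" else "even"
        # a 0 leaves the state unchanged
    return state == "odd"

# Strings with an odd number of 1s are accepted; all others rejected.
assert e1_accepts("1") and e1_accepts("0100") and e1_accepts("111")
assert not e1_accepts("") and not e1_accepts("11") and not e1_accepts("00")
```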
EXAMPLE 1.21
This example shows how to design a finite automaton E2 to recognize the regular language of all strings that contain the string 001 as a substring. For example, 0010, 1001, 001, and 11111110011111 are all in the language, but 11 and 0000 are not. How would you recognize this language if you were pretending to be E2? As symbols come in, you would initially skip over all 1s. If you come to a 0, then you note that you may have just seen the first of the three symbols in the pattern 001 you are seeking. If at this point you see a 1, there were too few 0s, so you go back to skipping over 1s. But if you see a 0 at that point, you should remember that you have just seen two symbols of the pattern. Now you simply need to continue scanning until you see a 1. If you find it, remember that you succeeded in finding the pattern and continue reading the input string until you get to the end.
So there are four possibilities: You
1. haven't just seen any symbols of the pattern,
2. have just seen a 0,
3. have just seen 00, or
4. have seen the entire pattern 001.
Assign the states q, q0, q00, and q001 to these possibilities. You can assign the transitions by observing that from q, reading a 1 you stay in q, but reading a 0 you move to q0. In q0, reading a 1 you return to q, but reading a 0 you move to q00. In q00, reading a 1 you move to q001, but reading a 0 leaves you in q00. Finally, in q001, reading a 0 or a 1 leaves you in q001. The start state is q, and the only accept state is q001, as shown in Figure 1.22.
44 CHAPTER 1 / REGULAR LANGUAGES
FIGURE 1.22
Accepts strings containing 001
THE REGULAR OPERATIONS
In the preceding two sections, we introduced and defined finite automata and regular languages. We now begin to investigate their properties. Doing so will help develop a toolbox of techniques for designing automata to recognize particular languages. The toolbox also will include ways of proving that certain other languages are nonregular (i.e., beyond the capability of finite automata).
In arithmetic, the basic objects are numbers and the tools are operations for manipulating them, such as + and ×. In the theory of computation, the objects are languages and the tools include operations specifically designed for manipulating them. We define three operations on languages, called the regular operations, and use them to study properties of the regular languages.
DEFINITION 1.23
Let A and B be languages. We define the regular operations union, concatenation, and star as follows:
• Union: A ∪ B = {x | x ∈ A or x ∈ B}.
• Concatenation: A ◦ B = {xy | x ∈ A and y ∈ B}.
• Star: A∗ = {x1x2 ... xk | k ≥ 0 and each xi ∈ A}.
You are already familiar with the union operation. It simply takes all the strings in both A and B and lumps them together into one language.
The concatenation operation is a little trickier. It attaches a string from A in front of a string from B in all possible ways to get the strings in the new language.
The star operation is a bit different from the other two because it applies to a single language rather than to two different languages. That is, the star operation is a unary operation instead of a binary operation. It works by attaching any number of strings in A together to get a string in the new language. Because
“any number” includes 0 as a possibility, the empty string ε is always a member of A∗, no matter what A is.
EXAMPLE 1.24
Let the alphabet Σ be the standard 26 letters {a, b, ..., z}. If A = {good, bad} and B = {boy, girl}, then
A ∪ B = {good, bad, boy, girl},
A ◦ B = {goodboy, goodgirl, badboy, badgirl}, and
A∗ = {ε, good, bad, goodgood, goodbad, badgood, badbad, goodgoodgood, goodgoodbad, goodbadgood, goodbadbad, ...}.
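For finite languages such as these, the three regular operations can be sketched directly on Python sets of strings. This is an illustrative sketch, not from the text; since A∗ is infinite, the star operation here is truncated at a chosen maximum string length.

```python
# The three regular operations of Definition 1.23, sketched for
# finite languages represented as sets of strings.
def union(A: set, B: set) -> set:
    return A | B

def concat(A: set, B: set) -> set:
    return {x + y for x in A for y in B}

def star_up_to(A: set, max_len: int) -> set:
    """All concatenations x1 x2 ... xk (k >= 0) of length <= max_len."""
    result = {""}                     # k = 0 contributes the empty string
    frontier = {""}
    while frontier:
        frontier = {w + x for w in frontier for x in A
                    if len(w + x) <= max_len} - result
        result |= frontier
    return result

A, B = {"good", "bad"}, {"boy", "girl"}
print(sorted(union(A, B)))   # ['bad', 'boy', 'girl', 'good']
print(sorted(concat(A, B)))  # ['badboy', 'badgirl', 'goodboy', 'goodgirl']
```

Running `star_up_to(A, 7)` reproduces the beginning of the A∗ listing in Example 1.24, including ε (the empty string) as a member.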
Let N = {1, 2, 3, ...} be the set of natural numbers. When we say that N is closed under multiplication, we mean that for any x and y in N, the product x × y also is in N. In contrast, N is not closed under division, as 1 and 2 are in N but 1/2 is not. Generally speaking, a collection of objects is closed under some operation if applying that operation to members of the collection returns an object still in the collection. We show that the collection of regular languages is closed under all three of the regular operations. In Section 1.3, we show that these are useful tools for manipulating regular languages and understanding the power of finite automata. We begin with the union operation.
THEOREM 1.25
The class of regular languages is closed under the union operation.
In other words, if A1 and A2 are regular languages, so is A1 ∪ A2.
PROOF IDEA We have regular languages A1 and A2 and want to show that A1 ∪ A2 also is regular. Because A1 and A2 are regular, we know that some finite automaton M1 recognizes A1 and some finite automaton M2 recognizes A2. To prove that A1 ∪ A2 is regular, we demonstrate a finite automaton, call it M, that recognizes A1 ∪ A2.
This is a proof by construction. We construct M from M1 and M2. Machine M must accept its input exactly when either M1 or M2 would accept it in order to recognize the union language. It works by simulating both M1 and M2 and accepting if either of the simulations accepts.
How can we make machine M simulate M1 and M2? Perhaps it first simulates M1 on the input and then simulates M2 on the input. But we must be careful here! Once the symbols of the input have been read and used to simulate M1, we can't “rewind the input tape” to try the simulation on M2. We need another approach.
Pretend that you are M. As the input symbols arrive one by one, you simulate both M1 and M2 simultaneously. That way, only one pass through the input is necessary. But can you keep track of both simulations with finite memory? All you need to remember is the state that each machine would be in if it had read up to this point in the input. Therefore, you need to remember a pair of states. How many possible pairs are there? If M1 has k1 states and M2 has k2 states, the number of pairs of states, one from M1 and the other from M2, is the product k1 × k2. This product will be the number of states in M, one for each pair. The transitions of M go from pair to pair, updating the current state for both M1 and M2. The accept states of M are those pairs wherein either M1 or M2 is in an accept state.
PROOF
Let M1 recognize A1, where M1 = (Q1, Σ, δ1, q1, F1), and
M2 recognize A2, where M2 = (Q2, Σ, δ2, q2, F2).
Construct M to recognize A1 ∪ A2, where M = (Q, Σ, δ, q0, F).
1. Q = {(r1, r2) | r1 ∈ Q1 and r2 ∈ Q2}.
This set is the Cartesian product of sets Q1 and Q2 and is written Q1 × Q2. It is the set of all pairs of states, the first from Q1 and the second from Q2.
2. Σ, the alphabet, is the same as in M1 and M2. In this theorem and in all subsequent similar theorems, we assume for simplicity that both M1 and M2 have the same input alphabet Σ. The theorem remains true if they have different alphabets, Σ1 and Σ2. We would then modify the proof to let Σ = Σ1 ∪ Σ2.
3. δ, the transition function, is defined as follows. For each (r1, r2) ∈ Q and each a ∈ Σ, let
δ((r1, r2), a) = (δ1(r1, a), δ2(r2, a)).
Hence δ gets a state of M (which actually is a pair of states from M1 and M2), together with an input symbol, and returns M's next state.
4. q0 is the pair (q1, q2).
5. F is the set of pairs in which either member is an accept state of M1 or M2. We can write it as
F = {(r1, r2) | r1 ∈ F1 or r2 ∈ F2}.
This expression is the same as F = (F1 × Q2) ∪ (Q1 × F2). (Note that it is not the same as F = F1 × F2. What would that give us instead?³)
³This expression would define M's accept states to be those for which both members of the pair are accept states. In this case, M would accept a string only if both M1 and M2 accept it, so the resulting language would be the intersection and not the union. In fact, this result proves that the class of regular languages is closed under intersection.
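The product construction can be sketched in code. The sketch below is illustrative, not from the text: the DFA encoding (a tuple of states, start state, accept states, and a transition dictionary) and the two sample machines M1 and M2 are assumptions chosen for the demonstration.

```python
# Product construction from the proof of Theorem 1.25.  A DFA is
# modeled as (states, start, accepts, delta), with delta a dict
# mapping (state, symbol) -> state.
def union_dfa(M1, M2, alphabet):
    Q1, s1, F1, d1 = M1
    Q2, s2, F2, d2 = M2
    Q = {(r1, r2) for r1 in Q1 for r2 in Q2}          # Cartesian product
    delta = {((r1, r2), a): (d1[(r1, a)], d2[(r2, a)])
             for (r1, r2) in Q for a in alphabet}
    F = {(r1, r2) for (r1, r2) in Q if r1 in F1 or r2 in F2}
    return Q, (s1, s2), F, delta

def run(M, w):
    Q, start, F, delta = M
    state = start
    for a in w:
        state = delta[(state, a)]
    return state in F

# M1 accepts strings with an even number of 0s; M2 accepts strings
# ending in 1 (both hypothetical examples).
M1 = ({"e", "o"}, "e", {"e"},
      {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"})
M2 = ({"x", "y"}, "x", {"y"},
      {("x", "0"): "x", ("x", "1"): "y", ("y", "0"): "x", ("y", "1"): "y"})
M = union_dfa(M1, M2, {"0", "1"})
print(run(M, "010"))   # True: even number of 0s
print(run(M, "0"))     # False: odd 0s and doesn't end in 1
```

Changing `or` to `and` in the definition of F yields the intersection construction mentioned in the footnote.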
This concludes the construction of the finite automaton M that recognizes the union of A1 and A2. This construction is fairly simple, and thus its correctness is evident from the strategy described in the proof idea. More complicated constructions require additional discussion to prove correctness. A formal correctness proof for a construction of this type usually proceeds by induction. For an example of a construction proved correct, see the proof of Theorem 1.54. Most of the constructions that you will encounter in this course are fairly simple and so do not require a formal correctness proof.
We have just shown that the union of two regular languages is regular, thereby
proving that the class of regular languages is closed under the union operation.
We now turn to the concatenation operation and attempt to show that the class
of regular languages is closed under that operation, too.
THEOREM 1.26
The class of regular languages is closed under the concatenation operation.
In other words, if A1 and A2 are regular languages, then so is A1 ◦ A2.
To prove this theorem, let's try something along the lines of the proof of the union case. As before, we can start with finite automata M1 and M2 recognizing the regular languages A1 and A2. But now, instead of constructing automaton M to accept its input if either M1 or M2 accepts, it must accept if its input can be broken into two pieces, where M1 accepts the first piece and M2 accepts the second piece. The problem is that M doesn't know where to break its input (i.e., where the first part ends and the second begins). To solve this problem, we introduce a new technique called nondeterminism.
1.2 NONDETERMINISM
Nondeterminism is a useful concept that has had great impact on the theory of
computation. So far in our discussion, every step of a computation follows in a
unique way from the preceding step. When the machine is in a given state and
reads the next input symbol, we know what the next state will be—it is deter-
mined. We call this deterministic computation. In a nondeterministic machine,
several choices may exist for the next state at any point.
Nondeterminism is a generalization of determinism, so every deterministic finite automaton is automatically a nondeterministic finite automaton. As Figure 1.27 shows, nondeterministic finite automata may have additional features.
FIGURE 1.27
The nondeterministic finite automaton N1
The difference between a deterministic finite automaton, abbreviated DFA, and a nondeterministic finite automaton, abbreviated NFA, is immediately apparent. First, every state of a DFA always has exactly one exiting transition arrow for each symbol in the alphabet. The NFA shown in Figure 1.27 violates that rule. State q1 has one exiting arrow for 0, but it has two for 1; q2 has one arrow for 0, but it has none for 1. In an NFA, a state may have zero, one, or many exiting arrows for each alphabet symbol.
Second, in a DFA, labels on the transition arrows are symbols from the alphabet. This NFA has an arrow with the label ε. In general, an NFA may have arrows labeled with members of the alphabet or ε. Zero, one, or many arrows may exit from each state with the label ε.
How does an NFA compute? Suppose that we are running an NFA on an input string and come to a state with multiple ways to proceed. For example, say that we are in state q1 in NFA N1 and that the next input symbol is a 1. After reading that symbol, the machine splits into multiple copies of itself and follows all the possibilities in parallel. Each copy of the machine takes one of the possible ways to proceed and continues as before. If there are subsequent choices, the machine splits again. If the next input symbol doesn't appear on any of the arrows exiting the state occupied by a copy of the machine, that copy of the machine dies, along with the branch of the computation associated with it. Finally, if any one of these copies of the machine is in an accept state at the end of the input, the NFA accepts the input string.
If a state with an ε symbol on an exiting arrow is encountered, something similar happens. Without reading any input, the machine splits into multiple copies, one following each of the exiting ε-labeled arrows and one staying at the current state. Then the machine proceeds nondeterministically as before.
Nondeterminism may be viewed as a kind of parallel computation wherein multiple independent “processes” or “threads” can be running concurrently. When the NFA splits to follow several choices, that corresponds to a process “forking” into several children, each proceeding separately. If at least one of these processes accepts, then the entire computation accepts.
Another way to think of a nondeterministic computation is as a tree of possibilities. The root of the tree corresponds to the start of the computation. Every branching point in the tree corresponds to a point in the computation at which the machine has multiple choices. The machine accepts if at least one of the computation branches ends in an accept state, as shown in Figure 1.28.
FIGURE 1.28
Deterministic and nondeterministic computations with an accepting
branch
Let's consider some sample runs of the NFA N1 shown in Figure 1.27. The computation of N1 on input 010110 is depicted in the following figure.
FIGURE 1.29
The computation of N1 on input 010110
On input 010110, start in the start state q1 and read the first symbol 0. From q1 there is only one place to go on a 0, namely back to q1, so remain there. Next, read the second symbol 1. In q1 on a 1 there are two choices: either stay in q1 or move to q2. Nondeterministically, the machine splits in two to follow each choice. Keep track of the possibilities by placing a finger on each state where a machine could be. So you now have fingers on states q1 and q2. An ε arrow exits state q2 so the machine splits again; keep one finger on q2, and move the other to q3. You now have fingers on q1, q2, and q3.
When the third symbol 0 is read, take each finger in turn. Keep the finger on q1 in place, move the finger on q2 to q3, and remove the finger that has been on q3. That last finger had no 0 arrow to follow and corresponds to a process that simply “dies.” At this point, you have fingers on states q1 and q3.
When the fourth symbol 1 is read, split the finger on q1 into fingers on states q1 and q2, then further split the finger on q2 to follow the ε arrow to q3, and move the finger that was on q3 to q4. You now have a finger on each of the four states.
When the fifth symbol 1 is read, the fingers on q1 and q3 result in fingers on states q1, q2, q3, and q4, as you saw with the fourth symbol. The finger on state q2 is removed. The finger that was on q4 stays on q4. Now you have two fingers on q4, so remove one because you only need to remember that q4 is a possible state at this point, not that it is possible for multiple reasons.
When the sixth and final symbol 0 is read, keep the finger on q1 in place, move the one on q2 to q3, remove the one that was on q3, and leave the one on q4 in place. You are now at the end of the string, and you accept if some finger is on an accept state. You have fingers on states q1, q3, and q4; and as q4 is an accept state, N1 accepts this string.
What does N1 do on input 010? Start with a finger on q1. After reading the 0, you still have a finger only on q1; but after the 1 there are fingers on q1, q2, and q3 (don't forget the ε arrow). After the third symbol 0, remove the finger on q3, move the finger on q2 to q3, and leave the finger on q1 where it is. At this point you are at the end of the input; and as no finger is on an accept state, N1 rejects this input.
By continuing to experiment in this way, you will see that N1 accepts all strings that contain either 101 or 11 as a substring.
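The finger-tracking strategy amounts to maintaining a set of current states. Below is an illustrative Python sketch of N1, using the transitions given later in Example 1.38; the dictionary encoding, with the empty string "" standing for ε, is an assumption of this sketch, not the book's notation.

```python
# "Fingers" as a set of current states.  N1's transitions: q1 loops on
# 0 and 1, a 1 also moves to q2; q2 goes to q3 on 0 or on epsilon;
# q3 goes to q4 on 1; q4 loops on 0 and 1.  "" encodes epsilon.
DELTA = {
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"}, ("q2", ""): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}

def eps_closure(states):
    """Add every state reachable along epsilon arrows."""
    result = set(states)
    frontier = set(states)
    while frontier:
        frontier = {t for q in frontier
                    for t in DELTA.get((q, ""), set())} - result
        result |= frontier
    return result

def n1_accepts(w: str) -> bool:
    fingers = eps_closure({"q1"})            # start with a finger on q1
    for symbol in w:
        moved = {t for q in fingers
                 for t in DELTA.get((q, symbol), set())}
        fingers = eps_closure(moved)         # dead branches simply vanish
    return "q4" in fingers                   # q4 is the accept state

print(n1_accepts("010110"))  # True, as in the walkthrough above
print(n1_accepts("010"))     # False
```

Using a set automatically handles the "two fingers on q4, remove one" step: a set records only which states are possible, not how many ways they arose.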
Nondeterministic finite automata are useful in several respects. As we will show, every NFA can be converted into an equivalent DFA, and constructing NFAs is sometimes easier than directly constructing DFAs. An NFA may be much smaller than its deterministic counterpart, or its functioning may be easier to understand. Nondeterminism in finite automata is also a good introduction to nondeterminism in more powerful computational models because finite automata are especially easy to understand. Now we turn to several examples of NFAs.
EXAMPLE 1.30
Let A be the language consisting of all strings over {0,1} containing a 1 in the third position from the end (e.g., 000100 is in A but 0011 is not). The following four-state NFA N2 recognizes A.
FIGURE 1.31
The NFA N2 recognizing A
One good way to view the computation of this NFA is to say that it stays in the start state q1 until it “guesses” that it is three places from the end. At that point, if the input symbol is a 1, it branches to state q2 and uses q3 and q4 to “check” on whether its guess was correct.
As mentioned, every NFA can be converted into an equivalent DFA; but sometimes that DFA may have many more states. The smallest DFA for A contains eight states. Furthermore, understanding the functioning of the NFA is much easier, as you may see by examining the following figure for the DFA.
FIGURE 1.32
A DFA recognizing A
Suppose that we added ε to the labels on the arrows going from q2 to q3 and from q3 to q4 in machine N2 in Figure 1.31. So both arrows would then have the label 0,1,ε instead of just 0,1. What language would N2 recognize with this modification? Try modifying the DFA in Figure 1.32 to recognize that language.
EXAMPLE 1.33
The following NFA N3 has an input alphabet {0} consisting of a single symbol. An alphabet containing only one symbol is called a unary alphabet.
FIGURE 1.34
The NFA N3
This machine demonstrates the convenience of having ε arrows. It accepts all strings of the form 0^k where k is a multiple of 2 or 3. (Remember that the superscript denotes repetition, not numerical exponentiation.) For example, N3 accepts the strings ε, 00, 000, 0000, and 000000, but not 0 or 00000.
Think of the machine operating by initially guessing whether to test for a multiple of 2 or a multiple of 3 by branching into either the top loop or the bottom loop and then checking whether its guess was correct. Of course, we could replace this machine by one that doesn't have ε arrows or even any nondeterminism at all, but the machine shown is the easiest one to understand for this language.
EXAMPLE 1.35
We give another example of an NFA in Figure 1.36. Practice with it to satisfy yourself that it accepts the strings ε, a, baba, and baa, but that it doesn't accept the strings b, bb, and babba. Later we use this machine to illustrate the procedure for converting NFAs to DFAs.
FIGURE 1.36
The NFA N4
FORMAL DEFINITION OF A NONDETERMINISTIC FINITE AUTOMATON
The formal definition of a nondeterministic finite automaton is similar to that of a deterministic finite automaton. Both have states, an input alphabet, a transition function, a start state, and a collection of accept states. However, they differ in one essential way: in the type of transition function. In a DFA, the transition function takes a state and an input symbol and produces the next state. In an NFA, the transition function takes a state and an input symbol or the empty string and produces the set of possible next states. In order to write the formal definition, we need to set up some additional notation. For any set Q we write P(Q) to be the collection of all subsets of Q. Here P(Q) is called the power set of Q. For any alphabet Σ we write Σε to be Σ ∪ {ε}. Now we can write the formal description of the type of the transition function in an NFA as δ: Q × Σε → P(Q).
DEFINITION 1.37
A nondeterministic finite automaton is a 5-tuple (Q, Σ, δ, q0, F), where
1. Q is a finite set of states,
2. Σ is a finite alphabet,
3. δ: Q × Σε → P(Q) is the transition function,
4. q0 ∈ Q is the start state, and
5. F ⊆ Q is the set of accept states.
EXAMPLE 1.38
Recall the NFA N1. The formal description of N1 is (Q, Σ, δ, q1, F), where
1. Q = {q1, q2, q3, q4},
2. Σ = {0, 1},
3. δ is given as

          0         1         ε
    q1   {q1}    {q1, q2}    ∅
    q2   {q3}       ∅       {q3}
    q3    ∅        {q4}      ∅
    q4   {q4}     {q4}       ∅

4. q1 is the start state, and
5. F = {q4}.
The formal definition of computation for an NFA is similar to that for a DFA. Let N = (Q, Σ, δ, q0, F) be an NFA and w a string over the alphabet Σ. Then we say that N accepts w if we can write w as w = y1y2···ym, where each yi is a member of Σε and a sequence of states r0, r1, ..., rm exists in Q with three conditions:
1. r0 = q0,
2. ri+1 ∈ δ(ri, yi+1), for i = 0, ..., m−1, and
3. rm ∈ F.
Condition 1 says that the machine starts out in the start state. Condition 2 says that state ri+1 is one of the allowable next states when N is in state ri and reading yi+1. Observe that δ(ri, yi+1) is the set of allowable next states and so we say that ri+1 is a member of that set. Finally, condition 3 says that the machine accepts its input if the last state is an accept state.
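The three conditions can be checked mechanically for a proposed decomposition y1, ..., ym and state sequence r0, ..., rm. This Python sketch is illustrative, not from the text; the empty string "" plays the role of ε, and the dictionary encoding of δ is an assumption.

```python
# Verify one candidate accepting computation against the three
# conditions in the formal definition of NFA acceptance.
def valid_computation(delta, start, accepts, ys, rs):
    if len(rs) != len(ys) + 1:
        return False
    if rs[0] != start:                       # condition 1
        return False
    for i, y in enumerate(ys):               # condition 2
        if rs[i + 1] not in delta.get((rs[i], y), set()):
            return False
    return rs[-1] in accepts                 # condition 3

# delta of N1 from Example 1.38, with "" encoding epsilon.
delta = {
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"}, ("q2", ""): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}

# One accepting computation of N1 on w = 11: q1 -1-> q2 -eps-> q3 -1-> q4.
ys = ["1", "", "1"]
rs = ["q1", "q2", "q3", "q4"]
print("".join(ys))                                     # 11
print(valid_computation(delta, "q1", {"q4"}, ys, rs))  # True
```

Note that the checker verifies a single certificate; N accepts w when at least one such decomposition and sequence exists.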
EQUIVALENCE OF NFAS AND DFAS
Deterministic and nondeterministic finite automata recognize the same class of languages. Such equivalence is both surprising and useful. It is surprising because NFAs appear to have more power than DFAs, so we might expect that NFAs recognize more languages. It is useful because describing an NFA for a given language sometimes is much easier than describing a DFA for that language.
Say that two machines are equivalent if they recognize the same language.
THEOREM 1.39
Every nondeterministic finite automaton has an equivalent deterministic finite
automaton.
PROOF IDEA If a language is recognized by an NFA, then we must show the existence of a DFA that also recognizes it. The idea is to convert the NFA into an equivalent DFA that simulates the NFA.
Recall the “reader as automaton” strategy for designing finite automata. How would you simulate the NFA if you were pretending to be a DFA? What do you need to keep track of as the input string is processed? In the examples of NFAs, you kept track of the various branches of the computation by placing a finger on each state that could be active at given points in the input. You updated the simulation by moving, adding, and removing fingers according to the way the NFA operates. All you needed to keep track of was the set of states having fingers on them.
If k is the number of states of the NFA, it has 2^k subsets of states. Each subset corresponds to one of the possibilities that the DFA must remember, so the DFA simulating the NFA will have 2^k states. Now we need to figure out which will be the start state and accept states of the DFA, and what will be its transition function. We can discuss this more easily after setting up some formal notation.
PROOF Let N = (Q, Σ, δ, q0, F) be the NFA recognizing some language A. We construct a DFA M = (Q′, Σ, δ′, q0′, F′) recognizing A. Before doing the full construction, let's first consider the easier case wherein N has no ε arrows. Later we take the ε arrows into account.
1. Q′ = P(Q).
Every state of M is a set of states of N. Recall that P(Q) is the set of subsets of Q.
2. For R ∈ Q′ and a ∈ Σ, let δ′(R, a) = {q ∈ Q | q ∈ δ(r, a) for some r ∈ R}.
If R is a state of M, it is also a set of states of N. When M reads a symbol a in state R, it shows where a takes each state in R. Because each state may go to a set of states, we take the union of all these sets. Another way to write this expression is
δ′(R, a) = ⋃_{r ∈ R} δ(r, a).⁴
3. q0′ = {q0}.
M starts in the state corresponding to the collection containing just the start state of N.
4. F′ = {R ∈ Q′ | R contains an accept state of N}.
The machine M accepts if one of the possible states that N could be in at this point is an accept state.
⁴The notation ⋃_{r ∈ R} δ(r, a) means: the union of the sets δ(r, a) for each possible r in R.
Now we need to consider the ε arrows. To do so, we set up an extra bit of notation. For any state R of M, we define E(R) to be the collection of states that can be reached from members of R by going only along ε arrows, including the members of R themselves. Formally, for R ⊆ Q let
E(R) = {q | q can be reached from R by traveling along 0 or more ε arrows}.
Then we modify the transition function of M to place additional fingers on all states that can be reached by going along ε arrows after every step. Replacing δ(r, a) by E(δ(r, a)) achieves this effect. Thus
δ′(R, a) = {q ∈ Q | q ∈ E(δ(r, a)) for some r ∈ R}.
Additionally, we need to modify the start state of M to move the fingers initially to all possible states that can be reached from the start state of N along the ε arrows. Changing q0′ to be E({q0}) achieves this effect. We have now completed the construction of the DFA M that simulates the NFA N.
The construction of M obviously works correctly. At every step in the computation of M on an input, it clearly enters a state that corresponds to the subset of states that N could be in at that point. Thus our proof is complete.
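The full construction, with ε arrows handled through E(R), can be sketched as follows. This is an illustrative Python version that builds only the reachable subsets rather than all 2^k states; the encoding of N (a dictionary δ with "" standing for ε) is an assumption of the sketch, not the book's notation.

```python
# Subset construction from the proof of Theorem 1.39.  DFA states are
# frozensets of NFA states, so they can serve as dictionary keys.
def eps_closure(delta, states):
    """E(R): everything reachable along 0 or more epsilon arrows."""
    result, frontier = set(states), set(states)
    while frontier:
        frontier = {t for q in frontier
                    for t in delta.get((q, ""), set())} - result
        result |= frontier
    return frozenset(result)

def nfa_to_dfa(delta, start, accepts, alphabet):
    q0 = eps_closure(delta, {start})         # start state E({q0})
    dfa_delta, todo, seen = {}, [q0], {q0}
    while todo:
        R = todo.pop()
        for a in alphabet:
            # delta'(R, a): move every finger on a, then close under eps
            S = eps_closure(delta, {t for r in R
                                    for t in delta.get((r, a), set())})
            dfa_delta[(R, a)] = S
            if S not in seen:
                seen.add(S)
                todo.append(S)
    F = {R for R in seen if R & set(accepts)}  # R contains an accept state
    return seen, q0, F, dfa_delta

# Convert N1 (Example 1.38) and run the resulting DFA on 010110.
delta = {
    ("q1", "0"): {"q1"}, ("q1", "1"): {"q1", "q2"},
    ("q2", "0"): {"q3"}, ("q2", ""): {"q3"},
    ("q3", "1"): {"q4"},
    ("q4", "0"): {"q4"}, ("q4", "1"): {"q4"},
}
states, q0, F, d = nfa_to_dfa(delta, "q1", {"q4"}, ["0", "1"])
R = q0
for symbol in "010110":
    R = d[(R, symbol)]
print(R in F)  # True: N1 accepts 010110
```

Building only reachable subsets is a common practical refinement; the proof's construction over all of P(Q) recognizes the same language.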
Theorem 1.39 states that every NFA can be converted into an equivalent DFA. Thus nondeterministic finite automata give an alternative way of characterizing the regular languages. We state this fact as a corollary of Theorem 1.39.
COROLLARY 1.40
A language is regular if and only if some nondeterministic finite automaton recognizes it.
One direction of the “if and only if” condition states that a language is regular if some NFA recognizes it. Theorem 1.39 shows that any NFA can be converted into an equivalent DFA. Consequently, if an NFA recognizes some language, so does some DFA, and hence the language is regular. The other direction of the “if and only if” condition states that a language is regular only if some NFA recognizes it. That is, if a language is regular, some NFA must be recognizing it. Obviously, this condition is true because a regular language has a DFA recognizing it and any DFA is also an NFA.
EXAMPLE 1.41
Let’s illustrate the procedure we gave in the proof of Theorem 1.39 for convert-
ing an NFAto a DFAby using the machine N4that appears in Example 1.35. For
clarity, we have relabeled the states of N4to be {1,2,3}.T h u s i n t h e f o r m a l
description of N4=(Q,{a,b},δ ,1,{1}),t h es e to fs t a t e s Qis{1,2,3}as shown
in Figure 1.42.
1.2 NONDETERMINISM 57
To construct a DFA D that is equivalent to N4, we first determine D's states. N4 has three states, {1,2,3}, so we construct D with eight states, one for each subset of N4's states. We label each of D's states with the corresponding subset. Thus D's state set is

{∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}.
FIGURE 1.42
The NFA N4
Next, we determine the start and accept states of D. The start state is E({1}), the set of states that are reachable from 1 by traveling along ε arrows, plus 1 itself. An ε arrow goes from 1 to 3, so E({1}) = {1,3}. The new accept states are those containing N4's accept state; thus

{{1}, {1,2}, {1,3}, {1,2,3}}.
Finally, we determine D's transition function. Each of D's states goes to one place on input a and one place on input b. We illustrate the process of determining the placement of D's transition arrows with a few examples.
In D, state {2} goes to {2,3} on input a because in N4, state 2 goes to both 2 and 3 on input a and we can't go farther from 2 or 3 along ε arrows. State {2} goes to state {3} on input b because in N4, state 2 goes only to state 3 on input b and we can't go farther from 3 along ε arrows.
State {1} goes to ∅ on a because no a arrows exit it. It goes to {2} on b. Note that the procedure in Theorem 1.39 specifies that we follow the ε arrows after each input symbol is read. An alternative procedure based on following the ε arrows before reading each input symbol works equally well, but that method is not illustrated in this example.
State {3} goes to {1,3} on a because in N4, state 3 goes to 1 on a and 1 in turn goes to 3 with an ε arrow. State {3} on b goes to ∅.
State {1,2} on a goes to {2,3} because 1 points at no states with a arrows, 2 points at both 2 and 3 with a arrows, and neither points anywhere with ε arrows. State {1,2} on b goes to {2,3}. Continuing in this way, we obtain the diagram for D in Figure 1.43.
FIGURE 1.43
A DFA D that is equivalent to the NFA N4
We may simplify this machine by observing that no arrows point at states {1} and {1,2}, so they may be removed without affecting the performance of the machine. Doing so yields the following figure.
FIGURE 1.44
DFA D after removing unnecessary states
CLOSURE UNDER THE REGULAR OPERATIONS
Now we return to the closure of the class of regular languages under the regular
operations that we began in Section 1.1. Our aim is to prove that the union,
concatenation, and star of regular languages are still regular. We abandoned the
original attempt to do so when dealing with the concatenation operation was too
complicated. The use of nondeterminism makes the proofs much easier.
First, let’s consider again closure under union. Earlier we proved closure
under union by simulating deterministically both machines simultaneously via
a Cartesian product construction. We now give a new proof to illustrate the
technique of nondeterminism. Reviewing the first proof, appearing on page 45,
may be worthwhile to see how much easier and more intuitive the new proof is.
THEOREM 1.45
The class of regular languages is closed under the union operation.
PROOF IDEA We have regular languages A1 and A2 and want to prove that A1 ∪ A2 is regular. The idea is to take two NFAs, N1 and N2 for A1 and A2, and combine them into one new NFA, N.
Machine N must accept its input if either N1 or N2 accepts this input. The new machine has a new start state that branches to the start states of the old machines with ε arrows. In this way, the new machine nondeterministically guesses which of the two machines accepts the input. If one of them accepts the input, N will accept it, too.
We represent this construction in the following figure. On the left, we indicate the start and accept states of machines N1 and N2 with large circles and some additional states with small circles. On the right, we show how to combine N1 and N2 into N by adding additional transition arrows.
FIGURE 1.46
Construction of an NFA N to recognize A1 ∪ A2
PROOF
Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1, and N2 = (Q2, Σ, δ2, q2, F2) recognize A2.
Construct N = (Q, Σ, δ, q0, F) to recognize A1 ∪ A2.
1. Q = {q0} ∪ Q1 ∪ Q2.
   The states of N are all the states of N1 and N2, with the addition of a new start state q0.
2. The state q0 is the start state of N.
3. The set of accept states F = F1 ∪ F2.
   The accept states of N are all the accept states of N1 and N2. That way, N accepts if either N1 accepts or N2 accepts.
4. Define δ so that for any q ∈ Q and any a ∈ Σε,

   δ(q, a) = δ1(q, a)     if q ∈ Q1
             δ2(q, a)     if q ∈ Q2
             {q1, q2}     if q = q0 and a = ε
             ∅            if q = q0 and a ≠ ε.
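Construction steps 1 through 4 transcribe almost verbatim into code. A Python sketch (the dict-based NFA representation and the fresh state name 'q0' are illustrative assumptions, not the book's notation; the two machines' state sets are assumed disjoint, and '' stands for ε):

```python
def nfa_union(n1, n2):
    """Union construction of Theorem 1.45 (sketch).
    Each NFA is a dict with keys: states, delta, start, accept;
    delta maps (state, symbol) -> set of states, '' is epsilon."""
    q0 = 'q0'  # fresh start state, assumed unused by either machine
    delta = dict(n1['delta'])
    delta.update(n2['delta'])
    delta[(q0, '')] = {n1['start'], n2['start']}  # epsilon-branch to both
    return {
        'states': {q0} | n1['states'] | n2['states'],   # step 1
        'start': q0,                                    # step 2
        'accept': n1['accept'] | n2['accept'],          # step 3
        'delta': delta,                                 # step 4
    }
```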
Now we can prove closure under concatenation. Recall that earlier, without
nondeterminism, completing the proof would have been difficult.
THEOREM 1.47
The class of regular languages is closed under the concatenation operation.
PROOF IDEA We have regular languages A1 and A2 and want to prove that A1 ◦ A2 is regular. The idea is to take two NFAs, N1 and N2 for A1 and A2, and combine them into a new NFA N as we did for the case of union, but this time in a different way, as shown in Figure 1.48.
Assign N's start state to be the start state of N1. The accept states of N1 have additional ε arrows that nondeterministically allow branching to N2 whenever N1 is in an accept state, signifying that it has found an initial piece of the input that constitutes a string in A1. The accept states of N are the accept states of N2 only. Therefore, it accepts when the input can be split into two parts, the first accepted by N1 and the second by N2. We can think of N as nondeterministically guessing where to make the split.
FIGURE 1.48
Construction of N to recognize A1 ◦ A2
PROOF
Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1, and N2 = (Q2, Σ, δ2, q2, F2) recognize A2.
Construct N = (Q, Σ, δ, q1, F2) to recognize A1 ◦ A2.
1. Q = Q1 ∪ Q2.
   The states of N are all the states of N1 and N2.
2. The state q1 is the same as the start state of N1.
3. The accept states F2 are the same as the accept states of N2.
4. Define δ so that for any q ∈ Q and any a ∈ Σε,

   δ(q, a) = δ1(q, a)           if q ∈ Q1 and q ∉ F1
             δ1(q, a)           if q ∈ F1 and a ≠ ε
             δ1(q, a) ∪ {q2}    if q ∈ F1 and a = ε
             δ2(q, a)           if q ∈ Q2.
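The concatenation construction can be sketched the same way as the union construction (same assumed dict representation, disjoint state sets, '' for ε); the only new work is step 4's ε arrows from N1's accept states to N2's start state:

```python
def nfa_concat(n1, n2):
    """Concatenation construction of Theorem 1.47 (sketch).
    Each NFA is a dict with keys: states, delta, start, accept;
    delta maps (state, symbol) -> set of states, '' is epsilon."""
    delta = dict(n1['delta'])
    delta.update(n2['delta'])
    for f in n1['accept']:
        # epsilon arrow from each accept state of N1 to N2's start state
        delta[(f, '')] = delta.get((f, ''), set()) | {n2['start']}
    return {
        'states': n1['states'] | n2['states'],   # step 1
        'start': n1['start'],                    # step 2
        'accept': set(n2['accept']),             # step 3: N2's accepts only
        'delta': delta,                          # step 4
    }
```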
THEOREM 1.49
The class of regular languages is closed under the star operation.
PROOF IDEA We have a regular language A1 and want to prove that A1∗ also is regular. We take an NFA N1 for A1 and modify it to recognize A1∗, as shown in the following figure. The resulting NFA N will accept its input whenever it can be broken into several pieces and N1 accepts each piece.
We can construct N like N1 with additional ε arrows returning to the start state from the accept states. This way, when processing gets to the end of a piece that N1 accepts, the machine N has the option of jumping back to the start state to try to read another piece that N1 accepts. In addition, we must modify N so that it accepts ε, which always is a member of A1∗. One (slightly bad) idea is simply to add the start state to the set of accept states. This approach certainly adds ε to the recognized language, but it may also add other, undesired strings. Exercise 1.15 asks for an example of the failure of this idea. The way to fix it is to add a new start state, which also is an accept state, and which has an ε arrow to the old start state. This solution has the desired effect of adding ε to the language without adding anything else.
FIGURE 1.50
Construction of N to recognize A1∗
PROOF Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1.
Construct N = (Q, Σ, δ, q0, F) to recognize A1∗.
1. Q = {q0} ∪ Q1.
   The states of N are the states of N1 plus a new start state.
2. The state q0 is the new start state.
3. F = {q0} ∪ F1.
   The accept states are the old accept states plus the new start state.
4. Define δ so that for any q ∈ Q and any a ∈ Σε,

   δ(q, a) = δ1(q, a)           if q ∈ Q1 and q ∉ F1
             δ1(q, a)           if q ∈ F1 and a ≠ ε
             δ1(q, a) ∪ {q1}    if q ∈ F1 and a = ε
             {q1}               if q = q0 and a = ε
             ∅                  if q = q0 and a ≠ ε.
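The star construction rounds out the trio; in the same assumed dict representation (fresh state name 'q0_star' is an illustrative assumption, '' stands for ε):

```python
def nfa_star(n1):
    """Star construction of Theorem 1.49 (sketch).
    The NFA is a dict with keys: states, delta, start, accept;
    delta maps (state, symbol) -> set of states, '' is epsilon."""
    q0 = 'q0_star'  # fresh start state, also accepting, assumed unused by n1
    delta = dict(n1['delta'])
    delta[(q0, '')] = {n1['start']}  # epsilon arrow to the old start state
    for f in n1['accept']:
        # epsilon arrows from the old accept states back to the old start
        delta[(f, '')] = delta.get((f, ''), set()) | {n1['start']}
    return {
        'states': {q0} | n1['states'],   # step 1
        'start': q0,                     # step 2
        'accept': {q0} | n1['accept'],   # step 3: accepts epsilon too
        'delta': delta,                  # step 4
    }
```

Making the fresh state (rather than the old start state) accepting is exactly the fix discussed above for the "slightly bad" idea.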
1.3
REGULAR EXPRESSIONS
In arithmetic, we can use the operations + and × to build up expressions such as

(5 + 3) × 4.

Similarly, we can use the regular operations to build up expressions describing languages, which are called regular expressions. An example is:

(0∪1)0∗.

The value of the arithmetic expression is the number 32. The value of a regular expression is a language. In this case, the value is the language consisting of all strings starting with a 0 or a 1 followed by any number of 0s. We get this result by dissecting the expression into its parts. First, the symbols 0 and 1 are shorthand for the sets {0} and {1}. So (0∪1) means ({0} ∪ {1}). The value of this part is the language {0,1}. The part 0∗ means {0}∗, and its value is the language consisting of all strings containing any number of 0s. Second, like the × symbol in algebra, the concatenation symbol ◦ often is implicit in regular expressions. Thus (0∪1)0∗ actually is shorthand for (0∪1)◦0∗. The concatenation attaches the strings from the two parts to obtain the value of the entire expression.
Regular expressions have an important role in computer science applications. In applications involving text, users may want to search for strings that satisfy certain patterns. Regular expressions provide a powerful method for describing such patterns. Utilities such as awk and grep in UNIX, modern programming languages such as Perl, and text editors all provide mechanisms for the description of patterns by using regular expressions.
EXAMPLE 1.51
Another example of a regular expression is

(0∪1)∗.

It starts with the language (0∪1) and applies the ∗ operation. The value of this expression is the language consisting of all possible strings of 0s and 1s. If Σ = {0,1}, we can write Σ as shorthand for the regular expression (0∪1). More generally, if Σ is any alphabet, the regular expression Σ describes the language consisting of all strings of length 1 over this alphabet, and Σ∗ describes the language consisting of all strings over that alphabet. Similarly, Σ∗1 is the language that contains all strings that end in a 1. The language (0Σ∗)∪(Σ∗1) consists of all strings that start with a 0 or end with a 1.
In arithmetic, we say that × has precedence over + to mean that when there is a choice, we do the × operation first. Thus in 2 + 3 × 4, the 3 × 4 is done before the addition. To have the addition done first, we must add parentheses to obtain (2 + 3) × 4. In regular expressions, the star operation is done first, followed by concatenation, and finally union, unless parentheses change the usual order.
FORMAL DEFINITION OF A REGULAR EXPRESSION
DEFINITION 1.52
Say that R is a regular expression if R is
1. a for some a in the alphabet Σ,
2. ε,
3. ∅,
4. (R1∪R2), where R1 and R2 are regular expressions,
5. (R1◦R2), where R1 and R2 are regular expressions, or
6. (R1∗), where R1 is a regular expression.
In items 1 and 2, the regular expressions a and ε represent the languages {a} and {ε}, respectively. In item 3, the regular expression ∅ represents the empty language. In items 4, 5, and 6, the expressions represent the languages obtained by taking the union or concatenation of the languages R1 and R2, or the star of the language R1, respectively.
Don't confuse the regular expressions ε and ∅. The expression ε represents the language containing a single string, namely the empty string, whereas ∅ represents the language that doesn't contain any strings.
Seemingly, we are in danger of defining the notion of a regular expression in terms of itself. If true, we would have a circular definition, which would be invalid. However, R1 and R2 always are smaller than R. Thus we actually are defining regular expressions in terms of smaller regular expressions and thereby avoiding circularity. A definition of this type is called an inductive definition.
Parentheses in an expression may be omitted. If they are, evaluation is done in the precedence order: star, then concatenation, then union.
For convenience, we let R+ be shorthand for RR∗. In other words, whereas R∗ has all strings that are 0 or more concatenations of strings from R, the language R+ has all strings that are 1 or more concatenations of strings from R. So R+∪ε = R∗. In addition, we let Rk be shorthand for the concatenation of k R's with each other.
When we want to distinguish between a regular expression R and the language that it describes, we write L(R) to be the language of R.
EXAMPLE 1.53
In the following instances, we assume that the alphabet Σ is {0,1}.
1. 0∗10∗ = {w | w contains a single 1}.
2. Σ∗1Σ∗ = {w | w has at least one 1}.
3. Σ∗001Σ∗ = {w | w contains the string 001 as a substring}.
4. 1∗(01+)∗ = {w | every 0 in w is followed by at least one 1}.
5. (ΣΣ)∗ = {w | w is a string of even length}. (The length of a string is the number of symbols that it contains.)
6. (ΣΣΣ)∗ = {w | the length of w is a multiple of 3}.
7. 01∪10 = {01, 10}.
8. 0Σ∗0 ∪ 1Σ∗1 ∪ 0 ∪ 1 = {w | w starts and ends with the same symbol}.
9. (0∪ε)1∗ = 01∗∪1∗.
   The expression 0∪ε describes the language {0, ε}, so the concatenation operation adds either 0 or ε before every string in 1∗.
10. (0∪ε)(1∪ε) = {ε, 0, 1, 01}.
11. 1∗∅ = ∅.
    Concatenating the empty set to any set yields the empty set.
12. ∅∗ = {ε}.
    The star operation puts together any number of strings from the language to get a string in the result. If the language is empty, the star operation can put together 0 strings, giving only the empty string.
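Several of these instances can be spot-checked with Python's re module, whose syntax writes union as | and needs explicit anchoring to test whole-string membership (a sketch; Σ is spelled out as the character class [01]):

```python
import re

# Spot checks for a few entries of Example 1.53.
single_one = re.compile(r'0*10*\Z')          # item 1: exactly one 1
zeros_followed = re.compile(r'1*(01+)*\Z')   # item 4: every 0 followed by a 1
even_length = re.compile(r'([01][01])*\Z')   # item 5: even length

assert single_one.match('00100')
assert not single_one.match('0110')
assert zeros_followed.match('0110111')
assert not zeros_followed.match('010')
assert even_length.match('0110')
assert not even_length.match('011')
```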
If we let R be any regular expression, we have the following identities. They are good tests of whether you understand the definition.

R∪∅ = R.
Adding the empty language to any other language will not change it.

R◦ε = R.
Joining the empty string to any string will not change it.

However, exchanging ∅ and ε in the preceding identities may cause the equalities to fail.

R∪ε may not equal R.
For example, if R = 0, then L(R) = {0} but L(R∪ε) = {0, ε}.

R◦∅ may not equal R.
For example, if R = 0, then L(R) = {0} but L(R◦∅) = ∅.
Regular expressions are useful tools in the design of compilers for programming languages. Elemental objects in a programming language, called tokens, such as the variable names and constants, may be described with regular expressions. For example, a numerical constant that may include a fractional part and/or a sign may be described as a member of the language

(+ ∪ - ∪ ε)(D+ ∪ D+.D∗ ∪ D∗.D+)

where D = {0,1,2,3,4,5,6,7,8,9} is the alphabet of decimal digits. Examples of generated strings are: 72, 3.14159, +7., and -.01.
Once the syntax of a programming language has been described with a regular expression in terms of its tokens, automatic systems can generate the lexical analyzer, the part of a compiler that initially processes the input program.
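In Python's re notation, with \d standing for the digit alphabet D and ? expressing the (+ ∪ - ∪ ε) option, the numerical-constant language above can be transcribed and spot-checked directly (a sketch, not a production lexer):

```python
import re

# (+|-)? mirrors (+ U - U eps); the three alternatives mirror
# D+ U D+.D* U D*.D+ from the text.
number = re.compile(r'(\+|-)?(\d+|\d+\.\d*|\d*\.\d+)\Z')

for s in ['72', '3.14159', '+7.', '-.01']:   # examples from the text
    assert number.match(s)
assert not number.match('.')    # a lone dot is not in the language
assert not number.match('+-1')  # at most one sign
```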
EQUIVALENCE WITH FINITE AUTOMATA
Regular expressions and finite automata are equivalent in their descriptive power. This fact is surprising because finite automata and regular expressions superficially appear to be rather different. However, any regular expression can be converted into a finite automaton that recognizes the language it describes, and vice versa. Recall that a regular language is one that is recognized by some finite automaton.
THEOREM 1.54
A language is regular if and only if some regular expression describes it.
This theorem has two directions. We state and prove each direction as a separate lemma.
LEMMA 1.55
If a language is described by a regular expression, then it is regular.
PROOF IDEA Say that we have a regular expression R describing some language A. We show how to convert R into an NFA recognizing A. By Corollary 1.40, if an NFA recognizes A then A is regular.
PROOF Let's convert R into an NFA N. We consider the six cases in the formal definition of regular expressions.
1. R = a for some a ∈ Σ. Then L(R) = {a}, and the following NFA recognizes L(R). Note that this machine fits the definition of an NFA but not that of a DFA because it has some states with no exiting arrow for each possible input symbol. Of course, we could have presented an equivalent DFA here; but an NFA is all we need for now, and it is easier to describe.
   Formally, N = ({q1, q2}, Σ, δ, q1, {q2}), where we describe δ by saying that δ(q1, a) = {q2} and that δ(r, b) = ∅ for r ≠ q1 or b ≠ a.
2. R = ε. Then L(R) = {ε}, and the following NFA recognizes L(R). Formally, N = ({q1}, Σ, δ, q1, {q1}), where δ(r, b) = ∅ for any r and b.
3. R = ∅. Then L(R) = ∅, and the following NFA recognizes L(R). Formally, N = ({q}, Σ, δ, q, ∅), where δ(r, b) = ∅ for any r and b.
4. R = R1∪R2.
5. R = R1◦R2.
6. R = R1∗.
For the last three cases, we use the constructions given in the proofs that the class of regular languages is closed under the regular operations. In other words, we construct the NFA for R from the NFAs for R1 and R2 (or just R1 in case 6) and the appropriate closure construction.
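The six cases of this proof translate into a short recursive converter. The sketch below assumes a hypothetical tuple-based AST for regular expressions, ('sym', a), ('eps',), ('empty',), ('union', r1, r2), ('concat', r1, r2), and ('star', r1), and bundles a small simulator to test the result; fresh integer state names keep the submachines disjoint, and '' stands for ε:

```python
import itertools

_ids = itertools.count()

def to_nfa(r):
    """Convert a regex AST to (delta, start, accepts), one branch per
    case of the proof of Lemma 1.55 (a sketch)."""
    tag = r[0]
    if tag == 'sym':                       # case 1: R = a
        q1, q2 = next(_ids), next(_ids)
        return {(q1, r[1]): {q2}}, q1, {q2}
    if tag == 'eps':                       # case 2: R = epsilon
        q = next(_ids)
        return {}, q, {q}
    if tag == 'empty':                     # case 3: R = empty set
        q = next(_ids)
        return {}, q, set()
    if tag == 'union':                     # case 4: fresh start, eps to both
        d1, s1, f1 = to_nfa(r[1])
        d2, s2, f2 = to_nfa(r[2])
        q0 = next(_ids)
        return {**d1, **d2, (q0, ''): {s1, s2}}, q0, f1 | f2
    if tag == 'concat':                    # case 5: eps from F1 to N2's start
        d1, s1, f1 = to_nfa(r[1])
        d2, s2, f2 = to_nfa(r[2])
        d = {**d1, **d2}
        for f in f1:
            d[(f, '')] = d.get((f, ''), set()) | {s2}
        return d, s1, f2
    if tag == 'star':                      # case 6: fresh accepting start
        d1, s1, f1 = to_nfa(r[1])
        q0 = next(_ids)
        d = dict(d1)
        d[(q0, '')] = {s1}
        for f in f1:
            d[(f, '')] = d.get((f, ''), set()) | {s1}
        return d, q0, f1 | {q0}
    raise ValueError(tag)

def accepts(nfa, w):
    """Simulate the NFA on string w with on-the-fly epsilon closures."""
    delta, start, finals = nfa
    def close(S):
        S, stack = set(S), list(S)
        while stack:
            for q in delta.get((stack.pop(), ''), set()):
                if q not in S:
                    S.add(q)
                    stack.append(q)
        return S
    cur = close({start})
    for a in w:
        cur = close({q2 for q in cur for q2 in delta.get((q, a), set())})
    return bool(cur & finals)
```

Running it on (ab∪a)∗ from Example 1.56 below exercises all the machinery at once.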
That ends the first part of the proof of Theorem 1.54, giving the easier direction of the if and only if condition. Before going on to the other direction, let's consider some examples whereby we use this procedure to convert a regular expression to an NFA.
EXAMPLE 1.56
We convert the regular expression (ab∪a)∗ to an NFA in a sequence of stages. We build up from the smallest subexpressions to larger subexpressions until we have an NFA for the original expression, as shown in the following diagram. Note that this procedure generally doesn't give the NFA with the fewest states. In this example, the procedure gives an NFA with eight states, but the smallest equivalent NFA has only two states. Can you find it?
FIGURE 1.57
Building an NFA from the regular expression (ab∪a)∗
EXAMPLE 1.58
In Figure 1.59, we convert the regular expression (a∪b)∗aba to an NFA. A few of the minor steps are not shown.
FIGURE 1.59
Building an NFA from the regular expression (a∪b)∗aba
Now let's turn to the other direction of the proof of Theorem 1.54.
LEMMA 1.60
If a language is regular, then it is described by a regular expression.
PROOF IDEA We need to show that if a language A is regular, a regular expression describes it. Because A is regular, it is accepted by a DFA. We describe a procedure for converting DFAs into equivalent regular expressions.
We break this procedure into two parts, using a new type of finite automaton called a generalized nondeterministic finite automaton, GNFA. First we show how to convert DFAs into GNFAs, and then GNFAs into regular expressions.
Generalized nondeterministic finite automata are simply nondeterministic finite automata wherein the transition arrows may have any regular expressions as labels, instead of only members of the alphabet or ε. The GNFA reads blocks of symbols from the input, not necessarily just one symbol at a time as in an ordinary NFA. The GNFA moves along a transition arrow connecting two states by reading a block of symbols from the input, which themselves constitute a string described by the regular expression on that arrow. A GNFA is nondeterministic and so may have several different ways to process the same input string. It accepts its input if its processing can cause the GNFA to be in an accept state at the end of the input. The following figure presents an example of a GNFA.
FIGURE 1.61
A generalized nondeterministic finite automaton
For convenience, we require that GNFAs always have a special form that meets the following conditions.
• The start state has transition arrows going to every other state but no arrows coming in from any other state.
• There is only a single accept state, and it has arrows coming in from every other state but no arrows going to any other state. Furthermore, the accept state is not the same as the start state.
• Except for the start and accept states, one arrow goes from every state to every other state and also from each state to itself.
We can easily convert a DFA into a GNFA in the special form. We simply add a new start state with an ε arrow to the old start state and a new accept state with ε arrows from the old accept states. If any arrows have multiple labels (or if there are multiple arrows going between the same two states in the same direction), we replace each with a single arrow whose label is the union of the previous labels. Finally, we add arrows labeled ∅ between states that had no arrows. This last step won't change the language recognized because a transition labeled with ∅ can never be used. From here on we assume that all GNFAs are in the special form.
Now we show how to convert a GNFA into a regular expression. Say that the GNFA has k states. Then, because a GNFA must have a start and an accept state and they must be different from each other, we know that k ≥ 2. If k > 2, we construct an equivalent GNFA with k − 1 states. This step can be repeated on the new GNFA until it is reduced to two states. If k = 2, the GNFA has a single arrow that goes from the start state to the accept state. The label of this arrow is the equivalent regular expression. For example, the stages in converting a DFA with three states to an equivalent regular expression are shown in the following figure.
FIGURE 1.62
Typical stages in converting a DFA to a regular expression
The crucial step is constructing an equivalent GNFA with one fewer state when k > 2. We do so by selecting a state, ripping it out of the machine, and repairing the remainder so that the same language is still recognized. Any state will do, provided that it is not the start or accept state. We are guaranteed that such a state will exist because k > 2. Let's call the removed state qrip.
After removing qrip we repair the machine by altering the regular expressions that label each of the remaining arrows. The new labels compensate for the absence of qrip by adding back the lost computations. The new label going from a state qi to a state qj is a regular expression that describes all strings that would
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

72 CHAPTER 1 / REGULAR LANGUAGES
take the machine from qi to qj either directly or via qrip. We illustrate this
approach in Figure 1.63.
FIGURE 1.63
Constructing an equivalent GNFA with one fewer state
In the old machine, if
1. qi goes to qrip with an arrow labeled R1,
2. qrip goes to itself with an arrow labeled R2,
3. qrip goes to qj with an arrow labeled R3, and
4. qi goes to qj with an arrow labeled R4,
then in the new machine, the arrow from qi to qj gets the label
(R1)(R2)∗(R3) ∪ (R4).
We make this change for each arrow going from any state qi to any state qj,
including the case where qi = qj. The new machine recognizes the original
language.
PROOF Let's now carry out this idea formally. First, to facilitate the proof,
we formally define the new type of automaton introduced. A GNFA is similar
to a nondeterministic finite automaton except for the transition function, which
has the form
δ: (Q − {qaccept}) × (Q − {qstart}) → R.
The symbol R is the collection of all regular expressions over the alphabet Σ,
and qstart and qaccept are the start and accept states. If δ(qi, qj) = R, the arrow
from state qi to state qj has the regular expression R as its label. The domain
of the transition function is (Q − {qaccept}) × (Q − {qstart}) because an arrow
connects every state to every other state, except that no arrows are coming from
qaccept or going to qstart.
DEFINITION 1.64
A generalized nondeterministic finite automaton is a 5-tuple,
(Q, Σ, δ, qstart, qaccept), where
1. Q is the finite set of states,
2. Σ is the input alphabet,
3. δ: (Q − {qaccept}) × (Q − {qstart}) → R is the transition
function,
4. qstart is the start state, and
5. qaccept is the accept state.
A GNFA accepts a string w in Σ∗ if w = w1w2···wk, where each wi is in Σ∗
and a sequence of states q0, q1, ..., qk exists such that
1. q0 = qstart is the start state,
2. qk = qaccept is the accept state, and
3. for each i, we have wi ∈ L(Ri), where Ri = δ(qi−1, qi); in other words, Ri
is the expression on the arrow from qi−1 to qi.
Returning to the proof of Lemma 1.60, we let M be the DFA for language
A. Then we convert M to a GNFA G by adding a new start state and a new
accept state and additional transition arrows as necessary. We use the procedure
CONVERT(G), which takes a GNFA and returns an equivalent regular expression.
This procedure uses recursion, which means that it calls itself. An infinite loop
is avoided because the procedure calls itself only to process a GNFA that has
one fewer state. The case where the GNFA has two states is handled without
recursion.
CONVERT(G):
1. Let k be the number of states of G.
2. If k = 2, then G must consist of a start state, an accept state, and a single
arrow connecting them and labeled with a regular expression R.
Return the expression R.
3. If k > 2, we select any state qrip ∈ Q different from qstart and qaccept and let
G′ be the GNFA (Q′, Σ, δ′, qstart, qaccept), where
Q′ = Q − {qrip},
and for any qi ∈ Q′ − {qaccept} and any qj ∈ Q′ − {qstart}, let
δ′(qi, qj) = (R1)(R2)∗(R3) ∪ (R4),
for R1 = δ(qi, qrip), R2 = δ(qrip, qrip), R3 = δ(qrip, qj), and R4 = δ(qi, qj).
4. Compute CONVERT(G′) and return this value.
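The state-elimination procedure above can be sketched in Python. Everything here is an illustrative assumption rather than the book's formalism: regular expressions are encoded as strings, "@" stands for ∅ and "~" for ε, and the names convert, union, concat, and star are ours. The simplifications for ∅ and ε mirror the ones used informally in Example 1.66.

```python
import re  # only used below to sanity-check the resulting expression

EMPTY, EPSILON = "@", "~"  # illustrative encodings of the labels ∅ and ε

def union(r1, r2):
    # R ∪ ∅ simplifies to R
    if r1 == EMPTY:
        return r2
    if r2 == EMPTY:
        return r1
    return f"({r1})|({r2})"

def concat(r1, r2):
    # R∅ = ∅ and Rε = R
    if r1 == EMPTY or r2 == EMPTY:
        return EMPTY
    if r1 == EPSILON:
        return r2
    if r2 == EPSILON:
        return r1
    return f"({r1})({r2})"

def star(r):
    # ∅∗ = ε∗ = ε
    if r in (EMPTY, EPSILON):
        return EPSILON
    return f"({r})*"

def convert(states, delta, q_start, q_accept):
    """Rip out interior states one at a time (step 3 of CONVERT),
    then return the single remaining label (step 2)."""
    states = list(states)
    while len(states) > 2:
        q_rip = next(q for q in states if q not in (q_start, q_accept))
        states.remove(q_rip)
        r2 = delta.get((q_rip, q_rip), EMPTY)
        new_delta = {}
        for qi in states:
            if qi == q_accept:
                continue
            for qj in states:
                if qj == q_start:
                    continue
                r1 = delta.get((qi, q_rip), EMPTY)
                r3 = delta.get((q_rip, qj), EMPTY)
                r4 = delta.get((qi, qj), EMPTY)
                # the repair label (R1)(R2)∗(R3) ∪ (R4)
                new_delta[(qi, qj)] = union(concat(concat(r1, star(r2)), r3), r4)
        delta = new_delta
    return delta.get((q_start, q_accept), EMPTY)

# A GNFA shaped like the one in Figure 1.67(b): start state s, accept
# state a, and the two DFA states 1 and 2 (arrows labeled ∅ are omitted).
delta = {("s", "1"): EPSILON, ("1", "1"): "a", ("1", "2"): "b",
         ("2", "2"): "(a)|(b)", ("2", "a"): EPSILON}
rx = convert(["s", "1", "2", "a"], delta, "s", "a")
assert re.fullmatch(rx, "aab") and not re.fullmatch(rx, "a")
```

Because any interior state may be ripped first, different elimination orders yield differently shaped but equivalent expressions, just as the text says any state will do.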
Next we prove that CONVERT returns a correct value.
CLAIM 1.65
For any GNFA G, CONVERT(G) is equivalent to G.
We prove this claim by induction on k, the number of states of the GNFA.
Basis: Prove the claim true for k = 2 states. If G has only two states, it can
have only a single arrow, which goes from the start state to the accept state. The
regular expression label on this arrow describes all the strings that allow G to get
to the accept state. Hence this expression is equivalent to G.
Induction step: Assume that the claim is true for k − 1 states and use this
assumption to prove that the claim is true for k states. First we show that G and
G′ recognize the same language. Suppose that G accepts an input w. Then in an
accepting branch of the computation, G enters a sequence of states:
qstart, q1, q2, q3, ..., qaccept.
If none of them is the removed state qrip, clearly G′ also accepts w. The reason
is that each of the new regular expressions labeling the arrows of G′ contains the
old regular expression as part of a union.
If qrip does appear, removing each run of consecutive qrip states forms an
accepting computation for G′. The states qi and qj bracketing a run have a new
regular expression on the arrow between them that describes all strings taking qi
to qj via qrip on G. So G′ accepts w.
Conversely, suppose that G′ accepts an input w. As each arrow between any
two states qi and qj in G′ describes the collection of strings taking qi to qj in G,
either directly or via qrip, G must also accept w. Thus G and G′ are equivalent.
The induction hypothesis states that when the algorithm calls itself recursively
on input G′, the result is a regular expression that is equivalent to G′
because G′ has k − 1 states. Hence this regular expression also is equivalent to
G, and the algorithm is proved correct.
This concludes the proof of Claim 1.65, Lemma 1.60, and Theorem 1.54.
EXAMPLE 1.66
In this example, we use the preceding algorithm to convert a DFA into a regular
expression. We begin with the two-state DFA in Figure 1.67(a).
In Figure 1.67(b), we make a four-state GNFA by adding a new start state and
a new accept state, called s and a instead of qstart and qaccept so that we can draw
them conveniently. To avoid cluttering up the figure, we do not draw the arrows
labeled ∅, even though they are present. Note that we replace the label a, b on
the self-loop at state 2 on the DFA with the label a ∪ b at the corresponding point
on the GNFA. We do so because the DFA's label represents two transitions, one
for a and the other for b, whereas the GNFA may have only a single transition
going from 2 to itself.
In Figure 1.67(c), we remove state 2 and update the remaining arrow labels.
In this case, the only label that changes is the one from 1 to a. In part (b) it was
∅, but in part (c) it is b(a ∪ b)∗. We obtain this result by following step 3 of the
CONVERT procedure. State qi is state 1, state qj is a, and qrip is 2, so R1 = b,
R2 = a ∪ b, R3 = ε, and R4 = ∅. Therefore, the new label on the arrow from 1
to a is (b)(a ∪ b)∗(ε) ∪ ∅. We simplify this regular expression to b(a ∪ b)∗.
In Figure 1.67(d), we remove state 1 from part (c) and follow the same
procedure. Because only the start and accept states remain, the label on the arrow
joining them is the regular expression that is equivalent to the original DFA.
FIGURE 1.67
Converting a two-state DFA to an equivalent regular expression
EXAMPLE 1.68
In this example, we begin with a three-state DFA. The steps in the conversion are
shown in the following figure.
FIGURE 1.69
Converting a three-state DFA to an equivalent regular expression
1.4
NONREGULAR LANGUAGES
To understand the power of finite automata, you must also understand their
limitations. In this section, we show how to prove that certain languages cannot
be recognized by any finite automaton.
Let's take the language B = {0^n1^n | n ≥ 0}. If we attempt to find a DFA
that recognizes B, we discover that the machine seems to need to remember
how many 0s have been seen so far as it reads the input. Because the number of
0s isn't limited, the machine will have to keep track of an unlimited number of
possibilities. But it cannot do so with any finite number of states.
Next, we present a method for proving that languages such as B are not regular.
Doesn't the argument already given prove nonregularity because the number
of 0s is unlimited? It does not. Just because the language appears to require
unbounded memory doesn't mean that it is necessarily so. It does happen to be true
for the language B; but other languages seem to require an unlimited number of
possibilities, yet actually they are regular. For example, consider two languages
over the alphabet Σ = {0,1}:
C = {w | w has an equal number of 0s and 1s}, and
D = {w | w has an equal number of occurrences of 01 and 10 as substrings}.
At first glance, a recognizing machine appears to need to count in each case,
and therefore neither language appears to be regular. As expected, C is not
regular, but surprisingly D is regular!6 Thus our intuition can sometimes lead
us astray, which is why we need mathematical proofs for certainty. In this section,
we show how to prove that certain languages are not regular.
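The claim about D can be checked mechanically for small strings. The characterization used below — that the counts of 01 and 10 agree exactly when w is empty or its first and last symbols are equal, a condition a small DFA checks easily — is our own observation, verified here by brute force; the text defers the proof of D's regularity to Problem 1.48.

```python
from itertools import product

def count(w, sub):
    # number of (overlapping) occurrences of a length-2 substring
    return sum(w[i:i + 2] == sub for i in range(len(w) - 1))

def in_D(w):
    return count(w, "01") == count(w, "10")

# Exhaustively confirm the regular characterization on all binary
# strings up to length 12.
for n in range(13):
    for bits in product("01", repeat=n):
        w = "".join(bits)
        assert in_D(w) == (w == "" or w[0] == w[-1])
```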
THE PUMPING LEMMA FOR REGULAR LANGUAGES
Our technique for proving nonregularity stems from a theorem about regular
languages, traditionally called the pumping lemma. This theorem states that all
regular languages have a special property. If we can show that a language does
not have this property, we are guaranteed that it is not regular. The property
states that all strings in the language can be "pumped" if they are at least as
long as a certain special value, called the pumping length. That means each
such string contains a section that can be repeated any number of times with the
resulting string remaining in the language.
6 See Problem 1.48.
THEOREM 1.70
Pumping lemma If A is a regular language, then there is a number p (the
pumping length) where if s is any string in A of length at least p, then s may be
divided into three pieces, s = xyz, satisfying the following conditions:
1. for each i ≥ 0, xy^iz ∈ A,
2. |y| > 0, and
3. |xy| ≤ p.
Recall the notation where |s| represents the length of string s, y^i means that i
copies of y are concatenated together, and y^0 equals ε.
When s is divided into xyz, either x or z may be ε, but condition 2 says that
y ≠ ε. Observe that without condition 2 the theorem would be trivially true.
Condition 3 states that the pieces x and y together have length at most p. It is an
extra technical condition that we occasionally find useful when proving certain
languages to be nonregular. See Example 1.74 for an application of condition 3.
PROOF IDEA Let M = (Q, Σ, δ, q1, F) be a DFA that recognizes A. We assign
the pumping length p to be the number of states of M. We show that any string
s in A of length at least p may be broken into the three pieces xyz, satisfying our
three conditions. What if no strings in A are of length at least p? Then our task
is even easier because the theorem becomes vacuously true: Obviously the three
conditions hold for all strings of length at least p if there aren't any such strings.
If s in A has length at least p, consider the sequence of states that M goes
through when computing with input s. It starts with q1, the start state, then goes
to, say, q3, then, say, q20, then q9, and so on, until it reaches the end of s in state
q13. With s in A, we know that M accepts s, so q13 is an accept state.
If we let n be the length of s, the sequence of states q1, q3, q20, q9, ..., q13 has
length n + 1. Because n is at least p, we know that n + 1 is greater than p, the
number of states of M. Therefore, the sequence must contain a repeated state.
This result is an example of the pigeonhole principle, a fancy name for the rather
obvious fact that if p pigeons are placed into fewer than p holes, some hole has
to have more than one pigeon in it.
The following figure shows the string s and the sequence of states that M
goes through when processing s. State q9 is the one that repeats.
FIGURE 1.71
Example showing state q9 repeating when M reads s
We now divide s into the three pieces x, y, and z. Piece x is the part of s
appearing before q9, piece y is the part between the two appearances of q9, and
piece z is the remaining part of s, coming after the second occurrence of q9. So
x takes M from the state q1 to q9, y takes M from q9 back to q9, and z takes M
from q9 to the accept state q13, as shown in the following figure.
FIGURE 1.72
Example showing how the strings x, y, and z affect M
Let's see why this division of s satisfies the three conditions. Suppose that we
run M on input xyyz. We know that x takes M from q1 to q9, and then the first
y takes it from q9 back to q9, as does the second y, and then z takes it to q13.
With q13 being an accept state, M accepts input xyyz. Similarly, it will accept
xy^iz for any i > 0. For the case i = 0, xy^iz = xz, which is accepted for similar
reasons. That establishes condition 1.
Checking condition 2, we see that |y| > 0, as it was the part of s that occurred
between two different occurrences of state q9.
In order to get condition 3, we make sure that q9 is the first repetition in the
sequence. By the pigeonhole principle, the first p + 1 states in the sequence must
contain a repetition. Therefore, |xy| ≤ p.
PROOF Let M = (Q, Σ, δ, q1, F) be a DFA recognizing A and p be the number
of states of M.
Let s = s1s2···sn be a string in A of length n, where n ≥ p. Let r1, ..., rn+1
be the sequence of states that M enters while processing s, so ri+1 = δ(ri, si)
for 1 ≤ i ≤ n. This sequence has length n + 1, which is at least p + 1. Among
the first p + 1 elements in the sequence, two must be the same state, by the
pigeonhole principle. We call the first of these rj and the second rl. Because rl
occurs among the first p + 1 places in a sequence starting at r1, we have l ≤ p + 1.
Now let x = s1···sj−1, y = sj···sl−1, and z = sl···sn.
As x takes M from r1 to rj, y takes M from rj to rj, and z takes M from rj
to rn+1, which is an accept state, M must accept xy^iz for i ≥ 0. We know that
j ≠ l, so |y| > 0; and l ≤ p + 1, so |xy| ≤ p. Thus we have satisfied all conditions
of the pumping lemma.
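The proof is constructive enough to run. The sketch below (the helper names run_dfa and pump_split are ours, not the book's) traces a small DFA on s, finds the first repeated state among the first p + 1 states of the run, cuts s into x, y, z exactly as in the proof, and checks that every pumped string stays in the language.

```python
def run_dfa(delta, start, s):
    """Return the sequence of states r1, ..., rn+1 that the DFA enters."""
    states = [start]
    for ch in s:
        states.append(delta[(states[-1], ch)])
    return states

def pump_split(delta, start, s, p):
    """Cut s = xyz as in the proof: rj and rl are the first repeated
    state among the first p + 1 states of the run."""
    r = run_dfa(delta, start, s)
    seen = {}
    for idx, state in enumerate(r[:p + 1]):
        if state in seen:
            j, l = seen[state], idx          # r[j] == r[l], with l <= p
            return s[:j], s[j:l], s[l:]      # x, y, z (0-based slices)
        seen[state] = idx
    raise ValueError("no repetition found; is len(s) >= p?")

# A DFA with p = 2 states recognizing strings with an even number of 0s.
delta = {("even", "0"): "odd",  ("even", "1"): "even",
         ("odd",  "0"): "even", ("odd",  "1"): "odd"}
accept = {"even"}

s = "001"                                    # in the language, len(s) >= p
x, y, z = pump_split(delta, "even", s, 2)
assert len(y) > 0 and len(x + y) <= 2        # conditions 2 and 3
for i in range(5):                           # condition 1: pumping stays in A
    assert run_dfa(delta, "even", x + y * i + z)[-1] in accept
```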
To use the pumping lemma to prove that a language B is not regular, first
assume that B is regular in order to obtain a contradiction. Then use the pumping
lemma to guarantee the existence of a pumping length p such that all strings of
length p or greater in B can be pumped. Next, find a string s in B that has length
p or greater but that cannot be pumped. Finally, demonstrate that s cannot be
pumped by considering all ways of dividing s into x, y, and z (taking condition 3
of the pumping lemma into account if convenient) and, for each such division,
finding a value i where xy^iz ∉ B. This final step often involves grouping the
various ways of dividing s into several cases and analyzing them individually.
The existence of such an s contradicts the pumping lemma, which would have to
hold if B were regular. Hence B cannot be regular.
Finding s sometimes takes a bit of creative thinking. You may need to hunt
through several candidates for s before you discover one that works. Try members
of B that seem to exhibit the "essence" of B's nonregularity. We further
discuss the task of finding s in some of the following examples.
EXAMPLE 1.73
Let B be the language {0^n1^n | n ≥ 0}. We use the pumping lemma to prove that
B is not regular. The proof is by contradiction.
Assume to the contrary that B is regular. Let p be the pumping length given
by the pumping lemma. Choose s to be the string 0^p1^p. Because s is a member
of B and s has length more than p, the pumping lemma guarantees that s can be
split into three pieces, s = xyz, where for any i ≥ 0 the string xy^iz is in B. We
consider three cases to show that this result is impossible.
1. The string y consists only of 0s. In this case, the string xyyz has more 0s
than 1s and so is not a member of B, violating condition 1 of the pumping
lemma. This case is a contradiction.
2. The string y consists only of 1s. This case also gives a contradiction.
3. The string y consists of both 0s and 1s. In this case, the string xyyz may
have the same number of 0s and 1s, but they will be out of order with some
1s before 0s. Hence it is not a member of B, which is a contradiction.
Thus a contradiction is unavoidable if we make the assumption that B is regular,
so B is not regular. Note that we can simplify this argument by applying
condition 3 of the pumping lemma to eliminate cases 2 and 3.
In this example, finding the string s was easy because any string in B of
length p or more would work. In the next two examples, some choices for s
do not work so additional care is required.
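For small values of p, the case analysis of Example 1.73 can be confirmed exhaustively: every division of s = 0^p1^p already fails at some i ≤ 2, even before condition 3 is invoked. The helper names below are illustrative.

```python
def in_B(w):
    # membership in B = {0^n 1^n | n >= 0}: zeros then ones, equal counts
    zeros = w.count("0")
    return w == "0" * zeros + "1" * (len(w) - zeros) and zeros == len(w) - zeros

def pumpable(s):
    """True if some split s = xyz with |y| > 0 keeps xy^i z in B for
    i = 0, 1, 2 (finding a failing i <= 2 witnesses non-pumpability)."""
    for j in range(len(s) + 1):               # |x| = j
        for l in range(j + 1, len(s) + 1):    # |xy| = l, so |y| > 0
            x, y, z = s[:j], s[j:l], s[l:]
            if all(in_B(x + y * i + z) for i in (0, 1, 2)):
                return True
    return False

# No division of 0^p 1^p survives pumping, for each small p.
for p in range(1, 7):
    assert not pumpable("0" * p + "1" * p)
```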
EXAMPLE 1.74
Let C = {w | w has an equal number of 0s and 1s}. We use the pumping lemma
to prove that C is not regular. The proof is by contradiction.
Assume to the contrary that C is regular. Let p be the pumping length given
by the pumping lemma. As in Example 1.73, let s be the string 0^p1^p. With
s being a member of C and having length more than p, the pumping lemma
guarantees that s can be split into three pieces, s = xyz, where for any i ≥ 0 the
string xy^iz is in C. We would like to show that this outcome is impossible. But
wait, it is possible! If we let x and z be the empty string and y be the string 0^p1^p,
then xy^iz always has an equal number of 0s and 1s and hence is in C. So it seems
that s can be pumped.
Here condition 3 in the pumping lemma is useful. It stipulates that when
pumping s, it must be divided so that |xy| ≤ p. That restriction on the way that
s may be divided makes it easier to show that the string s = 0^p1^p we selected
cannot be pumped. If |xy| ≤ p, then y must consist only of 0s, so xyyz ∉ C.
Therefore, s cannot be pumped. That gives us the desired contradiction.
Selecting the string s in this example required more care than in
Example 1.73. If we had chosen s = (01)^p instead, we would have run into trouble
because we need a string that cannot be pumped and that string can be pumped,
even taking condition 3 into account. Can you see how to pump it? One way to
do so sets x = ε, y = 01, and z = (01)^(p−1). Then xy^iz ∈ C for every value of
i. If you fail on your first attempt to find a string that cannot be pumped, don't
despair. Try another one!
An alternative method of proving that C is nonregular follows from our
knowledge that B is nonregular. If C were regular, C ∩ 0∗1∗ also would be
regular. The reasons are that the language 0∗1∗ is regular and that the class of
regular languages is closed under intersection, which we proved in footnote 3
(page 46). But C ∩ 0∗1∗ equals B, and we know that B is nonregular from
Example 1.73.
EXAMPLE 1.75
Let F = {ww | w ∈ {0,1}∗}. We show that F is nonregular, using the pumping
lemma.
Assume to the contrary that F is regular. Let p be the pumping length given
by the pumping lemma. Let s be the string 0^p10^p1. Because s is a member of
F and s has length more than p, the pumping lemma guarantees that s can be
split into three pieces, s = xyz, satisfying the three conditions of the lemma.
We show that this outcome is impossible.
Condition 3 is once again crucial because without it we could pump s if we
let x and z be the empty string. With condition 3 the proof follows because y
must consist only of 0s, so xyyz ∉ F.
Observe that we chose s = 0^p10^p1 to be a string that exhibits the "essence" of
the nonregularity of F, as opposed to, say, the string 0^p0^p. Even though 0^p0^p is
a member of F, it fails to demonstrate a contradiction because it can be pumped.
EXAMPLE 1.76
Here we demonstrate a nonregular unary language. Let D = {1^(n^2) | n ≥ 0}.
In other words, D contains all strings of 1s whose length is a perfect square.
We use the pumping lemma to prove that D is not regular. The proof is by
contradiction.
Assume to the contrary that D is regular. Let p be the pumping length given
by the pumping lemma. Let s be the string 1^(p^2). Because s is a member of D and
s has length at least p, the pumping lemma guarantees that s can be split into
three pieces, s = xyz, where for any i ≥ 0 the string xy^iz is in D. As in the
preceding examples, we show that this outcome is impossible. Doing so in this
case requires a little thought about the sequence of perfect squares:
0, 1, 4, 9, 16, 25, 36, 49, ...
Note the growing gap between successive members of this sequence. Large
members of this sequence cannot be near each other.
Now consider the two strings xyz and xy^2z. These strings differ from each
other by a single repetition of y, and consequently their lengths differ by the
length of y. By condition 3 of the pumping lemma, |xy| ≤ p and thus |y| ≤ p.
We have |xyz| = p^2 and so |xy^2z| ≤ p^2 + p. But p^2 + p < p^2 + 2p + 1 = (p + 1)^2.
Moreover, condition 2 implies that y is not the empty string and so |xy^2z| >
p^2. Therefore, the length of xy^2z lies strictly between the consecutive perfect
squares p^2 and (p + 1)^2. Hence this length cannot be a perfect square itself. So
we arrive at the contradiction xy^2z ∉ D and conclude that D is not regular.
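The arithmetic at the heart of this example — that p^2 + p falls strictly short of the next square — is easy to check numerically:

```python
from math import isqrt

def is_square(n):
    return isqrt(n) ** 2 == n

for p in range(1, 200):
    # conditions 2 and 3 give p^2 < |xy^2 z| <= p^2 + p < (p + 1)^2,
    # so the pumped length is never a perfect square
    for extra in range(1, p + 1):
        assert not is_square(p * p + extra)
```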
EXAMPLE 1.77
Sometimes "pumping down" is useful when we apply the pumping lemma. We
use the pumping lemma to show that E = {0^i1^j | i > j} is not regular. The
proof is by contradiction.
Assume that E is regular. Let p be the pumping length for E given by the
pumping lemma. Let s = 0^(p+1)1^p. Then s can be split into xyz, satisfying the
conditions of the pumping lemma. By condition 3, y consists only of 0s. Let's
examine the string xyyz to see whether it can be in E. Adding an extra copy
of y increases the number of 0s. But E contains all strings in 0∗1∗ that have
more 0s than 1s, so increasing the number of 0s will still give a string in E. No
contradiction occurs. We need to try something else.
The pumping lemma states that xy^iz ∈ E even when i = 0, so let's consider
the string xy^0z = xz. Removing string y decreases the number of 0s in s. Recall
that s has just one more 0 than 1. Therefore, xz cannot have more 0s than 1s,
so it cannot be a member of E. Thus we obtain a contradiction.
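Pumping down can likewise be checked by brute force for small p: every split of s = 0^(p+1)1^p obeying condition 3 has y made of 0s, and deleting y (the i = 0 case) lands outside E. The helper in_E is an illustrative name.

```python
def in_E(w):
    # membership in E = {0^i 1^j | i > j}
    zeros = w.count("0")
    ones = len(w) - zeros
    return w == "0" * zeros + "1" * ones and zeros > ones

for p in range(1, 8):
    s = "0" * (p + 1) + "1" * p
    assert in_E(s)                        # s is in E, with one extra 0
    for j in range(p + 1):                # |x| = j
        for l in range(j + 1, p + 1):     # |xy| = l <= p, so |y| > 0
            x, y, z = s[:j], s[j:l], s[l:]
            assert not in_E(x + z)        # pumping down (i = 0) exits E
```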
EXERCISES
A1.1 The following are the state diagrams of two DFAs, M1 and M2. Answer the following
questions about each of these machines.
a. What is the start state?
b. What is the set of accept states?
c. What sequence of states does the machine go through on input aabb?
d. Does the machine accept the string aabb?
e. Does the machine accept the string ε?
A1.2 Give the formal description of the machines M1 and M2 pictured in Exercise 1.1.
1.3 The formal description of a DFA M is ({q1, q2, q3, q4, q5}, {u, d}, δ, q3, {q3}),
where δ is given by the following table. Give the state diagram of this machine.

         u    d
    q1   q1   q2
    q2   q1   q3
    q3   q2   q4
    q4   q3   q5
    q5   q4   q5
1.4 Each of the following languages is the intersection of two simpler languages. In
each part, construct DFAs for the simpler languages, then combine them using the
construction discussed in footnote 3 (page 46) to give the state diagram of a DFA
for the language given. In all parts, Σ = {a, b}.
a. {w | w has at least three a's and at least two b's}
Ab. {w | w has exactly two a's and at least two b's}
c. {w | w has an even number of a's and one or two b's}
Ad. {w | w has an even number of a's and each a is followed by at least one b}
e. {w | w starts with an a and has at most one b}
f. {w | w has an odd number of a's and ends with a b}
g. {w | w has even length and an odd number of a's}
1.5 Each of the following languages is the complement of a simpler language. In each
part, construct a DFA for the simpler language, then use it to give the state diagram
of a DFA for the language given. In all parts, Σ = {a, b}.
Aa. {w | w does not contain the substring ab}
Ab. {w | w does not contain the substring baba}
c. {w | w contains neither the substrings ab nor ba}
d. {w | w is any string not in a∗b∗}
e. {w | w is any string not in (ab+)∗}
f. {w | w is any string not in a∗ ∪ b∗}
g. {w | w is any string that doesn't contain exactly two a's}
h. {w | w is any string except a and b}
1.6 Give state diagrams of DFAs recognizing the following languages. In all parts, the
alphabet is {0,1}.
a. {w | w begins with a 1 and ends with a 0}
b. {w | w contains at least three 1s}
c. {w | w contains the substring 0101 (i.e., w = x0101y for some x and y)}
d. {w | w has length at least 3 and its third symbol is a 0}
e. {w | w starts with 0 and has odd length, or starts with 1 and has even length}
f. {w | w doesn't contain the substring 110}
g. {w | the length of w is at most 5}
h. {w | w is any string except 11 and 111}
i. {w | every odd position of w is a 1}
j. {w | w contains at least two 0s and at most one 1}
k. {ε, 0}
l. {w | w contains an even number of 0s, or contains exactly two 1s}
m. The empty set
n. All strings except the empty string
1.7 Give state diagrams of NFAs with the specified number of states recognizing each
of the following languages. In all parts, the alphabet is {0,1}.
Aa. The language {w | w ends with 00} with three states
b. The language of Exercise 1.6c with five states
c. The language of Exercise 1.6l with six states
d. The language {0} with two states
e. The language 0∗1∗0+ with three states
Af. The language 1∗(001+)∗ with three states
g. The language {ε} with one state
h. The language 0∗ with one state
1.8 Use the construction in the proof of Theorem 1.45 to give the state diagrams of
NFAs recognizing the union of the languages described in
a. Exercises 1.6a and 1.6b.
b. Exercises 1.6c and 1.6f.
1.9 Use the construction in the proof of Theorem 1.47 to give the state diagrams of
NFAs recognizing the concatenation of the languages described in
a. Exercises 1.6g and 1.6i.
b. Exercises 1.6b and 1.6m.
1.10 Use the construction in the proof of Theorem 1.49 to give the state diagrams of
NFAs recognizing the star of the languages described in
a. Exercise 1.6b.
b. Exercise 1.6j.
c. Exercise 1.6m.
A1.11 Prove that every NFA can be converted to an equivalent one that has a single accept
state.
1.12 Let D = {w | w contains an even number of a's and an odd number of b's and does
not contain the substring ab}. Give a DFA with five states that recognizes D and a
regular expression that generates D. (Suggestion: Describe D more simply.)
1.13 Let F be the language of all strings over {0,1} that do not contain a pair of 1s that
are separated by an odd number of symbols. Give the state diagram of a DFA with
five states that recognizes F. (You may find it helpful first to find a 4-state NFA for
the complement of F.)
1.14 a. Show that if M is a DFA that recognizes language B, swapping the accept
and nonaccept states in M yields a new DFA recognizing the complement of
B. Conclude that the class of regular languages is closed under complement.
b. Show by giving an example that if M is an NFA that recognizes language
C, swapping the accept and nonaccept states in M doesn't necessarily yield
a new NFA that recognizes the complement of C. Is the class of languages
recognized by NFAs closed under complement? Explain your answer.
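The swap construction of part (a) is easy to check mechanically. The sketch below uses a small hypothetical two-state DFA (strings over {0,1} ending in 0); it is an illustration of the idea, not a machine from the text.

```python
# Exercise 1.14(a) in miniature: complement a DFA by swapping the
# accept and non-accept states.  The example DFA is hypothetical.

def run_dfa(states, delta, start, accept, w):
    """Simulate a DFA on string w; return True iff w is accepted."""
    q = start
    for symbol in w:
        q = delta[(q, symbol)]
    return q in accept

states = {"q0", "q1"}
delta = {("q0", "0"): "q1", ("q0", "1"): "q0",
         ("q1", "0"): "q1", ("q1", "1"): "q0"}
accept = {"q1"}                      # accepts strings ending in 0
complement_accept = states - accept  # the swap of part (a)

# The swapped machine accepts exactly the strings the original rejects.
for w in ["", "0", "10", "011", "1100"]:
    assert run_dfa(states, delta, "q0", accept, w) != \
           run_dfa(states, delta, "q0", complement_accept, w)
```

The same swap applied to an NFA fails in general (part b) because an NFA rejects only when *every* computation branch rejects.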
1.15 Give a counterexample to show that the following construction fails to prove The-
orem 1.49, the closure of the class of regular languages under the star operation.7
Let N1 = (Q1, Σ, δ1, q1, F1) recognize A1. Construct N = (Q1, Σ, δ, q1, F) as
follows. N is supposed to recognize A1∗.
a. The states of N are the states of N1.
b. The start state of N is the same as the start state of N1.
c. F = {q1} ∪ F1.
The accept states F are the old accept states plus its start state.
d. Define δ so that for any q ∈ Q1 and any a ∈ Σε,

    δ(q,a) = δ1(q,a)            if q ∉ F1 or a ≠ ε
             δ1(q,a) ∪ {q1}     if q ∈ F1 and a = ε.

(Suggestion: Show this construction graphically, as in Figure 1.50.)
7 In other words, you must present a finite automaton, N1, for which the constructed
automaton N does not recognize the star of N1's language.
1.16 Use the construction given in Theorem 1.39 to convert the following two nonde-
terministic finite automata to equivalent deterministic finite automata.
1.17 a. Give an NFA recognizing the language (01 ∪ 001 ∪ 010)∗.
b. Convert this NFA to an equivalent DFA. Give only the portion of the DFA
that is reachable from the start state.
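The subset construction of Theorem 1.39 can be sketched generically. The specific automata of Exercise 1.16 are state diagrams not reproduced here, so the example below runs the construction on a hypothetical NFA without ε-transitions (the three-state "ends in 00" machine of Exercise 1.7(a)); building only reachable subsets is exactly the shortcut Exercise 1.17(b) asks for.

```python
# A generic sketch of the subset construction (Theorem 1.39) for NFAs
# without ε-transitions.  The example NFA is hypothetical: it guesses
# the final 00 of its input.

def nfa_to_dfa(nfa_delta, alphabet, start, nfa_accept):
    """nfa_delta maps (state, symbol) -> set of states.
    Returns the reachable part of the equivalent DFA."""
    start_set = frozenset({start})
    dfa_delta, seen, todo = {}, {start_set}, [start_set]
    while todo:
        S = todo.pop()
        for a in alphabet:
            # the DFA state reached from subset S on symbol a
            T = frozenset(r for q in S for r in nfa_delta.get((q, a), set()))
            dfa_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    dfa_accept = {S for S in seen if S & nfa_accept}
    return dfa_delta, start_set, dfa_accept

nfa_delta = {("q1", "0"): {"q1", "q2"}, ("q1", "1"): {"q1"},
             ("q2", "0"): {"q3"}}
dd, s0, acc = nfa_to_dfa(nfa_delta, "01", "q1", {"q3"})

def run(w):
    S = s0
    for a in w:
        S = dd[(S, a)]
    return S in acc

assert run("100") and run("000") and not run("010") and not run("0")
```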
1.18 Give regular expressions generating the languages of Exercise 1.6.
1.19 Use the procedure described in Lemma 1.55 to convert the following regular ex-
pressions to nondeterministic finite automata.
a. (0 ∪ 1)∗000(0 ∪ 1)∗
b. (((00)∗(11)) ∪ 01)∗
c. ∅∗
1.20 For each of the following languages, give two strings that are members and two
strings that are not members, a total of four strings for each part. Assume the
alphabet Σ = {a,b} in all parts.
a. a∗b∗
b. a(ba)∗b
c. a∗ ∪ b∗
d. (aaa)∗
e. Σ∗aΣ∗bΣ∗aΣ∗
f. aba ∪ bab
g. (ε ∪ a)b
h. (a ∪ ba ∪ bb)Σ∗
1.21 Use the procedure described in Lemma 1.60 to convert the following finite au-
tomata to regular expressions.
1.22 In certain programming languages, comments appear between delimiters such as
/# and #/. Let C be the language of all valid delimited comment strings. A mem-
ber of C must begin with /# and end with #/ but have no intervening #/. For
simplicity, assume that the alphabet for C is Σ = {a, b, /, #}.
a. Give a DFA that recognizes C.
b. Give a regular expression that generates C.
A1.23 Let B be any language over the alphabet Σ. Prove that B = B+ iff BB ⊆ B.
1.24 A finite state transducer (FST) is a type of deterministic finite automaton whose
output is a string and not just accept or reject. The following are state diagrams of
finite state transducers T1 and T2.
Each transition of an FST is labeled with two symbols, one designating the input
symbol for that transition and the other designating the output symbol. The two
symbols are written with a slash, /, separating them. In T1, the transition from
q1 to q2 has input symbol 2 and output symbol 1. Some transitions may have
multiple input–output pairs, such as the transition in T1 from q1 to itself. When
an FST computes on an input string w, it takes the input symbols w1 · · · wn one by
one and, starting at the start state, follows the transitions by matching the input
labels with the sequence of symbols w1 · · · wn = w. Every time it goes along a
transition, it outputs the corresponding output symbol. For example, on input
2212011, machine T1 enters the sequence of states q1, q2, q2, q2, q2, q1, q1, q1 and
produces output 1111000. On input abbb, T2 outputs 1011. Give the sequence of
states entered and the output produced in each of the following parts.
a. T1 on input 011
b. T1 on input 211
c. T1 on input 121
d. T1 on input 0202
e. T2 on input b
f. T2 on input bbab
g. T2 on input bbbbbb
h. T2 on input ε
1.25 Read the informal definition of the finite state transducer given in Exercise 1.24.
Give a formal definition of this model, following the pattern in Definition 1.5
(page 35). Assume that an FST has an input alphabet Σ and an output alphabet Γ but
not a set of accept states. Include a formal definition of the computation of an FST.
(Hint: An FST is a 5-tuple. Its transition function is of the form δ: Q × Σ −→ Q × Γ.)
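The step-by-step computation described in Exercise 1.24 can be simulated directly. The state diagrams of T1 and T2 are figures not reproduced in this copy, so the transition table below contains only the T1 transitions that can be read off the worked example (input 2212011, output 1111000); treat it as a partial, inferred table, not the full machine.

```python
# Sketch of FST computation (Exercise 1.24).  delta maps
# (state, input symbol) -> (next state, output symbol).

def run_fst(delta, start, w):
    """Return (state sequence, output string) of a deterministic FST."""
    states, out, q = [start], [], start
    for symbol in w:
        q, o = delta[(q, symbol)]
        states.append(q)
        out.append(o)
    return states, "".join(out)

# Partial table for T1, inferred from the worked example in the text.
t1_partial = {("q1", "2"): ("q2", "1"), ("q2", "2"): ("q2", "1"),
              ("q2", "1"): ("q2", "1"), ("q2", "0"): ("q1", "0"),
              ("q1", "1"): ("q1", "0")}

states, output = run_fst(t1_partial, "q1", "2212011")
assert output == "1111000"
assert states == ["q1", "q2", "q2", "q2", "q2", "q1", "q1", "q1"]
```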
1.26 Using the solution you gave to Exercise 1.25, give a formal description of the ma-
chines T1 and T2 depicted in Exercise 1.24.
1.27 Read the informal definition of the finite state transducer given in Exercise 1.24.
Give the state diagram of an FST with the following behavior. Its input and output
alphabets are {0,1}. Its output string is identical to the input string on the even
positions but inverted on the odd positions. For example, on input 0000111 it
should output 1010010.
1.28 Convert the following regular expressions to NFAs using the procedure given in
Theorem 1.54. In all parts, Σ = {a,b}.
a. a(abb)∗ ∪ b
b. a+ ∪ (ab)+
c. (a ∪ b+)a+b+
1.29 Use the pumping lemma to show that the following languages are not regular.
Aa. A1 = {0^n 1^n 2^n | n ≥ 0}
b. A2 = {www | w ∈ {a,b}∗}
Ac. A3 = {a^(2^n) | n ≥ 0} (Here, a^(2^n) means a string of 2^n a's.)
1.30 Describe the error in the following "proof" that 0∗1∗ is not a regular language. (An
error must exist because 0∗1∗ is regular.) The proof is by contradiction. Assume
that 0∗1∗ is regular. Let p be the pumping length for 0∗1∗ given by the pumping
lemma. Choose s to be the string 0^p 1^p. You know that s is a member of 0∗1∗, but
Example 1.73 shows that s cannot be pumped. Thus you have a contradiction. So
0∗1∗ is not regular.
PROBLEMS
1.31 For any string w = w1w2 · · · wn, the reverse of w, written w^R, is the string w in
reverse order, wn · · · w2w1. For any language A, let A^R = {w^R | w ∈ A}.
Show that if A is regular, so is A^R.
1.32 Let Σ3 be the alphabet whose symbols are the eight columns of three bits,

    Σ3 = { [0,0,0], [0,0,1], [0,1,0], ..., [1,1,1] },

each symbol written here as [top, middle, bottom]. Σ3 contains all size-3 columns
of 0s and 1s. A string of symbols in Σ3 gives three rows of 0s and 1s. Consider
each row to be a binary number and let

    B = {w ∈ Σ3∗ | the bottom row of w is the sum of the top two rows}.

For example, [0,0,1][1,0,0][1,1,0] ∈ B, but [0,0,1][1,0,1] ∉ B.
Show that B is regular. (Hint: Working with B^R is easier. You may assume the
result claimed in Problem 1.31.)
1.33 Let Σ2 be the alphabet of the four columns of two bits,

    Σ2 = { [0,0], [0,1], [1,0], [1,1] },

each symbol written as [top, bottom]. Here, Σ2 contains all columns of 0s and 1s
of height two. A string of symbols in Σ2 gives two rows of 0s and 1s. Consider
each row to be a binary number and let

    C = {w ∈ Σ2∗ | the bottom row of w is three times the top row}.

For example, [0,0][0,1][1,1][0,0] ∈ C, but [0,1][0,1][1,0] ∉ C. Show that C is regular.
(You may assume the result claimed in Problem 1.31.)
1.34 Let Σ2 be the same as in Problem 1.33. Consider each row to be a binary number
and let

    D = {w ∈ Σ2∗ | the top row of w is a larger number than is the bottom row}.

For example, [0,0][1,0][1,1][0,0] ∈ D, but [0,0][0,1][1,1][0,0] ∉ D. Show that D
is regular.
1.35 Let Σ2 be the same as in Problem 1.33. Consider the top and bottom rows to be
strings of 0s and 1s, and let

    E = {w ∈ Σ2∗ | the bottom row of w is the reverse of the top row of w}.

Show that E is not regular.
1.36 Let Bn = {a^k | k is a multiple of n}. Show that for each n ≥ 1, the language Bn is
regular.
1.37 Let Cn = {x | x is a binary number that is a multiple of n}. Show that for each
n ≥ 1, the language Cn is regular.
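The usual way to approach Exercise 1.37 is a DFA whose states are the residues 0 through n−1, tracking the value of the binary prefix read so far, modulo n. This is a standard construction rather than code from the text; a minimal sketch:

```python
# Residue-tracking DFA for Exercise 1.37: reading bit b takes
# remainder r to (2r + b) mod n, so the machine accepts in state 0
# exactly when the whole input is a multiple of n.

def multiple_of_n_dfa(n):
    """Return (delta, start, accept) for binary multiples of n."""
    delta = {(r, b): (2 * r + b) % n for r in range(n) for b in (0, 1)}
    return delta, 0, {0}

def accepts(n, x):
    delta, q, accept = multiple_of_n_dfa(n)
    for ch in x:
        q = delta[(q, int(ch))]
    return q in accept

assert accepts(3, "110")       # 6 is a multiple of 3
assert not accepts(3, "101")   # 5 is not
assert accepts(5, "1010")      # 10 is a multiple of 5
```

The state set has size n regardless of input length, which is what makes Cn regular.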
1.38 An all-NFA M is a 5-tuple (Q, Σ, δ, q0, F) that accepts x ∈ Σ∗ if every possible
state that M could be in after reading input x is a state from F. Note, in contrast,
that an ordinary NFA accepts a string if some state among these possible states is an
accept state. Prove that all-NFAs recognize the class of regular languages.
1.39 The construction in Theorem 1.54 shows that every GNFA is equivalent to a GNFA
with only two states. We can show that an opposite phenomenon occurs for DFAs.
Prove that for every k > 1, a language Ak ⊆ {0,1}∗ exists that is recognized by a
DFA with k states but not by one with only k − 1 states.
1.40 Recall that string x is a prefix of string y if a string z exists where xz = y, and that
x is a proper prefix of y if in addition x ≠ y. In each of the following parts, we
define an operation on a language A. Show that the class of regular languages is
closed under that operation.
Aa. NOPREFIX(A) = {w ∈ A | no proper prefix of w is a member of A}.
b. NOEXTEND(A) = {w ∈ A | w is not the proper prefix of any string in A}.
1.41 For languages A and B, let the perfect shuffle of A and B be the language
{w | w = a1b1 · · · akbk, where a1 · · · ak ∈ A and b1 · · · bk ∈ B, each ai, bi ∈ Σ}.
Show that the class of regular languages is closed under perfect shuffle.
1.42 For languages A and B, let the shuffle of A and B be the language
{w | w = a1b1 · · · akbk, where a1 · · · ak ∈ A and b1 · · · bk ∈ B, each ai, bi ∈ Σ∗}.
Show that the class of regular languages is closed under shuffle.
1.43 Let A be any language. Define DROP-OUT(A) to be the language containing all
strings that can be obtained by removing one symbol from a string in A. Thus,
DROP-OUT(A) = {xz | xyz ∈ A where x, z ∈ Σ∗, y ∈ Σ}. Show that the class
of regular languages is closed under the DROP-OUT operation. Give both a proof
by picture and a more formal proof by construction as in Theorem 1.47.
A1.44 Let B and C be languages over Σ = {0,1}. Define
B ←1 C = {w ∈ B | for some y ∈ C, strings w and y contain equal numbers of 1s}.
Show that the class of regular languages is closed under the ←1 operation.
⋆1.45 Let A/B = {w | wx ∈ A for some x ∈ B}. Show that if A is regular and B is any
language, then A/B is regular.
1.46 Prove that the following languages are not regular. You may use the pumping
lemma and the closure of the class of regular languages under union, intersection,
and complement.
a. {0^n 1^m 0^n | m, n ≥ 0}
Ab. {0^m 1^n | m ≠ n}
c. {w | w ∈ {0,1}∗ is not a palindrome}8
⋆d. {wtw | w, t ∈ {0,1}+}
1.47 Let Σ = {1, #} and let
Y = {w | w = x1#x2# · · · #xk for k ≥ 0, each xi ∈ 1∗, and xi ≠ xj for i ≠ j}.
Prove that Y is not regular.
1.48 Let Σ = {0,1} and let
D = {w | w contains an equal number of occurrences of the substrings 01 and 10}.
Thus 101 ∈ D because 101 contains a single 01 and a single 10, but 1010 ∉ D
because 1010 contains two 10s and one 01. Show that D is a regular language.
1.49 a. Let B = {1^k y | y ∈ {0,1}∗ and y contains at least k 1s, for k ≥ 1}.
Show that B is a regular language.
b. Let C = {1^k y | y ∈ {0,1}∗ and y contains at most k 1s, for k ≥ 1}.
Show that C isn't a regular language.
A1.50 Read the informal definition of the finite state transducer given in Exercise 1.24.
Prove that no FST can output w^R for every input w if the input and output alpha-
bets are {0,1}.
1.51 Let x and y be strings and let L be any language. We say that x and y are distin-
guishable by L if some string z exists whereby exactly one of the strings xz and yz
is a member of L; otherwise, for every string z, we have xz ∈ L whenever yz ∈ L
and we say that x and y are indistinguishable by L. If x and y are indistinguishable
by L, we write x ≡L y. Show that ≡L is an equivalence relation.
8 A palindrome is a string that reads the same forward and backward.
A⋆1.52 Myhill–Nerode theorem. Refer to Problem 1.51. Let L be a language and let X
be a set of strings. Say that X is pairwise distinguishable by L if every two distinct
strings in X are distinguishable by L. Define the index of L to be the maximum
number of elements in any set that is pairwise distinguishable by L. The index of
L may be finite or infinite.
a. Show that if L is recognized by a DFA with k states, L has index at most k.
b. Show that if the index of L is a finite number k, it is recognized by a DFA
with k states.
c. Conclude that L is regular iff it has finite index. Moreover, its index is the
size of the smallest DFA recognizing it.
1.53 Let Σ = {0, 1, +, =} and
ADD = {x=y+z | x, y, z are binary integers, and x is the sum of y and z}.
Show that ADD is not regular.
1.54 Consider the language F = {a^i b^j c^k | i, j, k ≥ 0 and if i = 1 then j = k}.
a. Show that F is not regular.
b. Show that F acts like a regular language in the pumping lemma. In other
words, give a pumping length p and demonstrate that F satisfies the three
conditions of the pumping lemma for this value of p.
c. Explain why parts (a) and (b) do not contradict the pumping lemma.
1.55 The pumping lemma says that every regular language has a pumping length p, such
that every string in the language can be pumped if it has length p or more. If p is a
pumping length for language A, so is any length p′ ≥ p. The minimum pumping
length for A is the smallest p that is a pumping length for A. For example, if
A = 01∗, the minimum pumping length is 2. The reason is that the string s = 0 is
in A and has length 1 yet s cannot be pumped; but any string in A of length 2 or
more contains a 1 and hence can be pumped by dividing it so that x = 0, y = 1,
and z is the rest. For each of the following languages, give the minimum pumping
length and justify your answer.
Aa. 0001∗
Ab. 0∗1∗
c. 001 ∪ 0∗1∗
Ad. 0∗1+0+1∗ ∪ 10∗1
e. (01)∗
f. ε
g. 1∗01∗01∗
h. 10(11∗0)∗0
i. 1011
j. Σ∗
⋆1.56 If A is a set of natural numbers and k is a natural number greater than 1, let
Bk(A) = {w | w is the representation in base k of some number in A}.
Here, we do not allow leading 0s in the representation of a number. For example,
B2({3,5}) = {11, 101} and B3({3,5}) = {10, 12}. Give an example of a set A for
which B2(A) is regular but B3(A) is not regular. Prove that your example works.
⋆1.57 If A is any language, let A_{1/2−} be the set of all first halves of strings in A so that
A_{1/2−} = {x | for some y, |x| = |y| and xy ∈ A}.
Show that if A is regular, then so is A_{1/2−}.
⋆1.58 If A is any language, let A_{1/3−1/3} be the set of all strings in A with their middle
thirds removed so that
A_{1/3−1/3} = {xz | for some y, |x| = |y| = |z| and xyz ∈ A}.
Show that if A is regular, then A_{1/3−1/3} is not necessarily regular.
⋆1.59 Let M = (Q, Σ, δ, q0, F) be a DFA and let h be a state of M called its "home".
A synchronizing sequence for M and h is a string s ∈ Σ∗ where δ(q, s) = h for
every q ∈ Q. (Here we have extended δ to strings, so that δ(q, s) equals the state
where M ends up when M starts at state q and reads input s.) Say that M is
synchronizable if it has a synchronizing sequence for some state h. Prove that if M
is a k-state synchronizable DFA, then it has a synchronizing sequence of length at
most k^3. Can you improve upon this bound?
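Synchronizing sequences are easy to explore experimentally: a string s synchronizes M exactly when δ applied to the whole state set collapses it to a single state, so a breadth-first search over subsets of Q finds a shortest one. This is an exploratory sketch on a made-up 3-state DFA, not part of the proof the problem asks for.

```python
# Brute-force search for a shortest synchronizing sequence (Problem
# 1.59) by BFS over subsets of states.
from collections import deque

def synchronizing_sequence(states, alphabet, delta):
    """Shortest s with delta(q, s) identical for every q, or None."""
    start = frozenset(states)
    seen = {start}
    queue = deque([(start, "")])
    while queue:
        S, s = queue.popleft()
        if len(S) == 1:          # all starting states have merged
            return s
        for a in alphabet:
            T = frozenset(delta[(q, a)] for q in S)
            if T not in seen:
                seen.add(T)
                queue.append((T, s + a))
    return None

# Hypothetical 3-state DFA in which symbol 'a' funnels states toward 0.
delta = {(0, "a"): 0, (1, "a"): 0, (2, "a"): 1,
         (0, "b"): 1, (1, "b"): 2, (2, "b"): 0}
assert synchronizing_sequence({0, 1, 2}, "ab", delta) == "aa"
```

For this machine the home state is 0: reading "aa" from any of the three states ends in state 0.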
1.60 Let Σ = {a,b}. For each k ≥ 1, let Ck be the language consisting of all strings
that contain an a exactly k places from the right-hand end. Thus Ck = Σ∗aΣ^(k−1).
Describe an NFA with k + 1 states that recognizes Ck in terms of both a state
diagram and a formal description.
1.61 Consider the languages Ck defined in Problem 1.60. Prove that for each k, no DFA
can recognize Ck with fewer than 2^k states.
1.62 Let Σ = {a,b}. For each k ≥ 1, let Dk be the language consisting of all strings
that have at least one a among the last k symbols. Thus Dk = Σ∗a(Σ ∪ ε)^(k−1).
Describe a DFA with at most k + 1 states that recognizes Dk in terms of both a state
diagram and a formal description.
⋆1.63 a. Let A be an infinite regular language. Prove that A can be split into two
infinite disjoint regular subsets.
b. Let B and D be two languages. Write B ⋐ D if B ⊆ D and D contains
infinitely many strings that are not in B. Show that if B and D are two
regular languages where B ⋐ D, then we can find a regular language C
where B ⋐ C ⋐ D.
1.64 Let N be an NFA with k states that recognizes some language A.
a. Show that if A is nonempty, A contains some string of length at most k.
b. Show, by giving an example, that part (a) is not necessarily true if you replace
both A's by the complement of A.
c. Show that if the complement of A is nonempty, then it contains some string
of length at most 2^k.
d. Show that the bound given in part (c) is nearly tight; that is, for each k,
demonstrate an NFA recognizing a language Ak where the complement of Ak
is nonempty and where the shortest strings in the complement of Ak are of
length exponential in k. Come as close to the bound in (c) as you can.
⋆1.65 Prove that for each n > 0, a language Bn exists where
a. Bn is recognizable by an NFA that has n states, and
b. if Bn = A1 ∪ · · · ∪ Ak, for regular languages Ai, then at least one of the Ai
requires a DFA with exponentially many states.
1.66 A homomorphism is a function f: Σ −→ Γ∗ from one alphabet to strings over
another alphabet. We can extend f to operate on strings by defining f(w) =
f(w1)f(w2) · · · f(wn), where w = w1w2 · · · wn and each wi ∈ Σ. We further
extend f to operate on languages by defining f(A) = {f(w) | w ∈ A}, for any
language A.
a. Show, by giving a formal construction, that the class of regular languages
is closed under homomorphism. In other words, given a DFA M that rec-
ognizes B and a homomorphism f, construct a finite automaton M′ that
recognizes f(B). Consider the machine M′ that you constructed. Is it a
DFA in every case?
b. Show, by giving an example, that the class of non-regular languages is not
closed under homomorphism.
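The two extension steps in Problem 1.66, from symbols to strings and from strings to languages, can be written as executable definitions. The particular homomorphism f below is a made-up example for illustration, not one from the text; note that f may erase a symbol by mapping it to the empty string.

```python
# Extending a homomorphism f: symbol -> string (Problem 1.66).

def extend_to_strings(f):
    """Lift f to strings: f(w) = f(w1) f(w2) ... f(wn)."""
    return lambda w: "".join(f[c] for c in w)

def extend_to_languages(f):
    """Lift f to languages (finite ones, for this sketch)."""
    fs = extend_to_strings(f)
    return lambda A: {fs(w) for w in A}

f = {"a": "01", "b": ""}          # hypothetical: f(a) = 01, f(b) = ε
f_str = extend_to_strings(f)
f_lang = extend_to_languages(f)

assert f_str("aba") == "0101"                 # 01 + ε + 01
assert f_lang({"ab", "ba", "bb"}) == {"01", ""}
```

Distinct strings can collapse under f (here both ab and ba map to 01), which is worth keeping in mind when answering the "Is it a DFA in every case?" question in part (a).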
⋆1.67 Let the rotational closure of language A be RC(A) = {yx | xy ∈ A}.
a. Show that for any language A, we have RC(A) = RC(RC(A)).
b. Show that the class of regular languages is closed under rotational closure.
⋆1.68 In the traditional method for cutting a deck of playing cards, the deck is arbitrarily
split into two parts, which are exchanged before reassembling the deck. In a more
complex cut, called Scarne's cut, the deck is broken into three parts and the middle
part is placed first in the reassembly. We'll take Scarne's cut as the inspiration for
an operation on languages. For a language A, let CUT(A) = {yxz | xyz ∈ A}.
a. Exhibit a language B for which CUT(B) ≠ CUT(CUT(B)).
b. Show that the class of regular languages is closed under CUT.
1.69 Let Σ = {0,1}. Let WWk = {ww | w ∈ Σ∗ and w is of length k}.
a. Show that for each k, no DFA can recognize WWk with fewer than 2^k states.
b. Describe a much smaller NFA for the complement of WWk.
1.70 We define the avoids operation for languages A and B to be
A avoids B = {w | w ∈ A and w doesn't contain any string in B as a substring}.
Prove that the class of regular languages is closed under the avoids operation.
1.71 Let Σ = {0,1}.
a. Let A = {0^k u 0^k | k ≥ 1 and u ∈ Σ∗}. Show that A is regular.
b. Let B = {0^k 1 u 0^k | k ≥ 1 and u ∈ Σ∗}. Show that B is not regular.
1.72 Let M1 and M2 be DFAs that have k1 and k2 states, respectively, and then let
U = L(M1) ∪ L(M2).
a. Show that if U ≠ ∅, then U contains some string s, where |s| < max(k1, k2).
b. Show that if U ≠ Σ∗, then U excludes some string s, where |s| < k1 k2.
1.73 Let Σ = {0, 1, #}. Let C = {x#x^R#x | x ∈ {0,1}∗}. Show that the complement
of C is a CFL.
SELECTED SOLUTIONS
1.1 For M1: (a) q1; (b) {q2}; (c) q1, q2, q3, q1, q1; (d) No; (e) No
For M2: (a) q1; (b) {q1, q4}; (c) q1, q1, q1, q2, q4; (d) Yes; (e) Yes
1.2 M1 = ({q1, q2, q3}, {a,b}, δ1, q1, {q2}).
M2 = ({q1, q2, q3, q4}, {a,b}, δ2, q1, {q1, q4}).
The transition functions are

    δ1 |  a   b          δ2 |  a   b
    ---+--------         ---+--------
    q1 | q2  q1          q1 | q1  q2
    q2 | q3  q3          q2 | q3  q4
    q3 | q2  q1          q3 | q2  q1
                         q4 | q3  q4
1.4 (b) The following are DFAs for the two languages {w | w has exactly two a's} and
{w | w has at least two b's}. Combining them using the intersection construction
gives a DFA for the language of part (b). Though the problem doesn't request you
to simplify the DFA, certain states can be combined to give a smaller equivalent
DFA. [The state diagrams are figures not reproduced in this copy.]
(d) These are DFAs for the two languages {w | w has an even number of a's} and
{w | each a in w is followed by at least one b}. Combining them using the
intersection construction gives a DFA for the language of part (d). Though the
problem doesn't request you to simplify the DFA, certain states can be combined to
give a smaller equivalent DFA. [The state diagrams are figures not reproduced in
this copy.]
1.5 (a) The left-hand DFA recognizes {w | w contains ab}. The right-hand DFA
recognizes its complement, {w | w doesn't contain ab}. (b) This DFA recognizes
{w | w contains baba}. This DFA recognizes {w | w does not contain baba}.
[The diagrams referred to are figures not reproduced in this copy.]
1.7 (a), (f) [State-diagram answers; figures not reproduced in this copy.]
1.11 Let N = (Q, Σ, δ, q0, F) be any NFA. Construct an NFA N′ with a single accept
state that recognizes the same language as N. Informally, N′ is exactly like N
except it has ε-transitions from the states corresponding to the accept states of
N, to a new accept state, q_accept. State q_accept has no emerging transitions.
More formally, N′ = (Q ∪ {q_accept}, Σ, δ′, q0, {q_accept}), where for each
q ∈ Q and a ∈ Σε,

    δ′(q,a) = δ(q,a)               if a ≠ ε or q ∉ F
              δ(q,a) ∪ {q_accept}  if a = ε and q ∈ F

and δ′(q_accept, a) = ∅ for each a ∈ Σε.
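The construction in the solution to 1.11 is purely structural, so it can be checked by building the new transition function and inspecting its shape rather than simulating ε-moves. The small example NFA below (over {0,1}, guessing a final 1) is hypothetical, and ε is represented by the empty string "".

```python
# Sketch of the single-accept-state construction from the solution to
# 1.11: add q_acc and ε-edges into it from each old accept state.

def single_accept(Q, delta, start, F):
    """Return an equivalent NFA whose only accept state is 'q_acc'."""
    new_delta = {key: set(targets) for key, targets in delta.items()}
    for q in F:
        # ε-transition ("" plays the role of ε) into the new accept state
        new_delta.setdefault((q, ""), set()).add("q_acc")
    return Q | {"q_acc"}, new_delta, start, {"q_acc"}

delta = {("q0", "0"): {"q0"}, ("q0", "1"): {"q0", "q1"}}
Q2, d2, s2, F2 = single_accept({"q0", "q1"}, delta, "q0", {"q1"})

assert F2 == {"q_acc"}                     # exactly one accept state
assert d2[("q1", "")] == {"q_acc"}         # ε-edge from old accept state
assert all(q != "q_acc" for (q, _) in d2)  # q_acc has no outgoing edges
```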
1.23 We prove both directions of the "iff."
(→) Assume that B = B+ and show that BB ⊆ B.
For every language, BB ⊆ B+ holds, so if B = B+, then BB ⊆ B.
(←) Assume that BB ⊆ B and show that B = B+.
For every language, B ⊆ B+, so we need to show only B+ ⊆ B. If w ∈ B+,
then w = x1x2 · · · xk where each xi ∈ B and k ≥ 1. Because x1, x2 ∈ B and
BB ⊆ B, we have x1x2 ∈ B. Similarly, because x1x2 is in B and x3 is in B, we
have x1x2x3 ∈ B. Continuing in this way, x1 · · · xk ∈ B. Hence w ∈ B, and so
we may conclude that B+ ⊆ B.
The latter argument may be written formally as the following proof by induction.
Assume that BB ⊆ B.
Claim: For each k ≥ 1, if x1, . . . , xk ∈ B, then x1 · · · xk ∈ B.
Basis: Prove for k = 1. This statement is obviously true.
Induction step: For each k ≥ 1, assume that the claim is true for k and prove it to be
true for k + 1.
If x1, . . . , xk, xk+1 ∈ B, then by the induction assumption, x1 · · · xk ∈ B. There-
fore, x1 · · · xk xk+1 ∈ BB, but BB ⊆ B, so x1 · · · xk+1 ∈ B. That proves the
induction step and the claim. The claim implies that if BB ⊆ B, then B+ ⊆ B.
1.29 (a) Assume that A1 = {0^n1^n2^n | n ≥ 0} is regular. Let p be the pumping length
given by the pumping lemma. Choose s to be the string 0^p1^p2^p. Because s is a
member of A1 and s is longer than p, the pumping lemma guarantees that s can
be split into three pieces, s = xyz, where for any i ≥ 0 the string x y^i z is in A1.
Consider two possibilities:
1. The string y consists only of 0s, only of 1s, or only of 2s. In these cases, the
string xyyz will not have equal numbers of 0s, 1s, and 2s. Hence xyyz is not
a member of A1, a contradiction.
2. The string y consists of more than one kind of symbol. In this case, xyyz
will have the 0s, 1s, or 2s out of order. Hence xyyz is not a member of A1,
a contradiction.
Either way we arrive at a contradiction. Therefore, A1 is not regular.
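The case analysis can also be confirmed by brute force for a concrete pumping length. In this sketch (ours; p = 4 is chosen arbitrarily), every split of s fails to pump, even without imposing the lemma's |xy| ≤ p restriction:

```python
def in_A1(w):
    # membership in {0^n 1^n 2^n | n >= 0}
    n = len(w) // 3
    return w == "0" * n + "1" * n + "2" * n

p = 4                                     # stand-in for the pumping length
s = "0" * p + "1" * p + "2" * p
# try every split s = xyz with |y| >= 1 (the argument above needs no
# restriction on |xy|, so we do not impose one)
for i in range(len(s)):
    for j in range(i + 1, len(s) + 1):
        x, y, z = s[:i], s[i:j], s[j:]
        # pumping y once (the lemma's i = 2) must leave A1
        assert not in_A1(x + y + y + z)
```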
(c) Assume that A3 = {a^(2^n) | n ≥ 0} is regular. Let p be the pumping length given
by the pumping lemma. Choose s to be the string a^(2^p). Because s is a member of
A3 and s is longer than p, the pumping lemma guarantees that s can be split into
three pieces, s = xyz, satisfying the three conditions of the pumping lemma.
The third condition tells us that |xy| ≤ p. Furthermore, p < 2^p and so |y| < 2^p.
Therefore, |xyyz| = |xyz| + |y| < 2^p + 2^p = 2^(p+1). The second condition requires
|y| > 0, so 2^p < |xyyz| < 2^(p+1). The length of xyyz cannot be a power of 2. Hence
xyyz is not a member of A3, a contradiction. Therefore, A3 is not regular.
1.40 (a) Let M = (Q, Σ, δ, q0, F) be a DFA recognizing A, where A is some regular
language. Construct M′ = (Q′, Σ, δ′, q0′, F′) recognizing NOPREFIX(A) as
follows:
1. Q′ = Q.
2. For r ∈ Q′ and a ∈ Σ, define

       δ′(r, a) = {δ(r, a)}   if r ∉ F
       δ′(r, a) = ∅           if r ∈ F.

3. q0′ = q0.
4. F′ = F.
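A quick executable rendering of this construction (ours, not the book's): deleting every transition out of an accept state means no proper extension of a member of A can be accepted. The even-length DFA is a hypothetical example.

```python
def noprefix_nfa(states, alphabet, delta, start, accepting):
    """Solution 1.40(a): delete every transition that leaves an accept
    state, so no proper extension of a member of A can be accepted."""
    delta2 = {(q, a): {delta[(q, a)]}      # sets of targets (NFA-style)
              for q in states for a in alphabet if q not in accepting}
    return states, delta2, start, accepting

def accepts(delta, start, accepting, w):
    cur = {start}
    for c in w:
        nxt = set()
        for q in cur:
            nxt |= delta.get((q, c), set())
        cur = nxt
    return bool(cur & accepting)

# Hypothetical DFA: A = strings over {0,1} of even length (ε included).
states, alphabet = {"e", "o"}, "01"
delta = {(q, a): ("o" if q == "e" else "e") for q in states for a in alphabet}
M2 = noprefix_nfa(states, alphabet, delta, "e", {"e"})
# Since ε ∈ A, NOPREFIX(A) = {ε}: every other member properly extends ε.
assert accepts(M2[1], M2[2], M2[3], "")
assert not accepts(M2[1], M2[2], M2[3], "00")
assert not accepts(M2[1], M2[2], M2[3], "0101")
```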
1.44 Let MB = (QB, Σ, δB, qB, FB) and MC = (QC, Σ, δC, qC, FC) be DFAs recog-
nizing B and C, respectively. Construct NFA M = (Q, Σ, δ, q0, F) that recognizes
B ←1 C as follows. To decide whether its input w is in B ←1 C, the machine
M checks that w ∈ B, and in parallel nondeterministically guesses a string y that
contains the same number of 1s as contained in w and checks that y ∈ C.
1. Q = QB × QC.
2. For (q, r) ∈ Q and a ∈ Σε, define

       δ((q, r), a) = {(δB(q, 0), r)}            if a = 0
       δ((q, r), a) = {(δB(q, 1), δC(r, 1))}     if a = 1
       δ((q, r), a) = {(q, δC(r, 0))}            if a = ε.

3. q0 = (qB, qC).
4. F = FB × FC.
1.46 (b) Let B = {0^m1^n | m ≠ n}. Observe that B̄ ∩ 0*1* = {0^k1^k | k ≥ 0}. If B were
regular, then B̄ would be regular and so would B̄ ∩ 0*1*. But we already know that
{0^k1^k | k ≥ 0} isn’t regular, so B cannot be regular.
Alternatively, we can prove B to be nonregular by using the pumping lemma di-
rectly, though doing so is trickier. Assume that B = {0^m1^n | m ≠ n} is regular.
Let p be the pumping length given by the pumping lemma. Observe that p! is di-
visible by all integers from 1 to p, where p! = p(p−1)(p−2)···1. The string
s = 0^p 1^(p+p!) ∈ B, and |s| ≥ p. Thus the pumping lemma implies that s can be di-
vided as xyz with x = 0^a, y = 0^b, and z = 0^c 1^(p+p!), where b ≥ 1 and a + b + c = p.
Let s′ be the string x y^(i+1) z, where i = p!/b. Then y^i = 0^(p!), so y^(i+1) = 0^(b+p!), and
so s′ = 0^(a+b+c+p!) 1^(p+p!). That gives s′ = 0^(p+p!) 1^(p+p!) ∉ B, a contradiction.
1.50 Assume to the contrary that some FST T outputs w^R on input w. Consider the
input strings 00 and 01. On input 00, T must output 00, and on input 01, T must
output 10. In both cases, the first input bit is a 0 but the first output bits differ.
Operating in this way is impossible for an FST because it produces its first output
bit before it reads its second input. Hence no such FST can exist.
1.52 (a) We prove this assertion by contradiction. Let M be a k-state DFA that recog-
nizes L. Suppose for a contradiction that L has index greater than k. That means
some set X with more than k elements is pairwise distinguishable by L. Because M
has k states, the pigeonhole principle implies that X contains two distinct strings x
and y, where δ(q0, x) = δ(q0, y). Here δ(q0, x) is the state that M is in after start-
ing in the start state q0 and reading input string x. Then, for any string z ∈ Σ*,
δ(q0, xz) = δ(q0, yz). Therefore, either both xz and yz are in L or neither is
in L. But then x and y aren’t distinguishable by L, contradicting our assumption
that X is pairwise distinguishable by L.
(b) Let X = {s1, ..., sk} be pairwise distinguishable by L. We construct DFA
M = (Q, Σ, δ, q0, F) with k states recognizing L. Let Q = {q1, ..., qk}, and
define δ(qi, a) to be qj, where sj ≡L si a (the relation ≡L is defined in Prob-
lem 1.51). Note that sj ≡L si a for some sj ∈ X; otherwise, X ∪ {si a} would have
k + 1 elements and would be pairwise distinguishable by L, which would contra-
dict the assumption that L has index k. Let F = {qi | si ∈ L}. Let the start
state q0 be the qi such that si ≡L ε. M is constructed so that for any state qi,
{s | δ(q0, s) = qi} = {s | s ≡L si}. Hence M recognizes L.
(c) Suppose that L is regular and let k be the number of states in a DFA recognizing
L. Then from part (a), L has index at most k. Conversely, if L has index k, then
by part (b) it is recognized by a DFA with k states and thus is regular. To show that
the index of L is the size of the smallest DFA accepting it, suppose that L’s index
is exactly k. Then, by part (b), there is a k-state DFA accepting L. That is the
smallest such DFA because if it were any smaller, then we could show by part (a)
that the index of L is less than k.
1.55 (a) The minimum pumping length is 4. The string 000 is in the language but
cannot be pumped, so 3 is not a pumping length for this language. If s has length
4 or more, it contains 1s. By dividing s into xyz, where x is 000, y is the first 1,
and z is everything afterward, we satisfy the pumping lemma’s three conditions.
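Taking the language of Exercise 1.55(a) to be 0001* (per the exercise statement; stated here as an assumption), both claims can be spot-checked mechanically. This sketch is ours and tests pumping only for i ≤ 3, a spot check rather than a proof:

```python
import re

def in_lang(w):                           # the language 0001* of Exercise 1.55(a)
    return re.fullmatch(r"0001*", w) is not None

def pumpable(w, p):
    """Is there a split w = xyz with |xy| <= p and |y| >= 1 such that
    x y^i z stays in the language for i = 0..3 (a spot check)?"""
    for j in range(1, min(p, len(w)) + 1):
        for i in range(j):
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_lang(x + y * k + z) for k in range(4)):
                return True
    return False

# p = 3 fails: the string 000 is in the language but cannot be pumped.
assert in_lang("000") and not pumpable("000", 3)
# p = 4 works for members of length >= 4 (checked up to length 10):
# split off x = 000 and pump the first 1.
assert all(pumpable("000" + "1" * n, 4) for n in range(1, 8))
```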
(b) The minimum pumping length is 1. The pumping length cannot be 0 because
the string ε is in the language and it cannot be pumped. Every nonempty string in
the language can be divided into xyz, where x, y, and z are ε, the first character,
and the remainder, respectively. This division satisfies the three conditions.
(d) The minimum pumping length is 3. The pumping length cannot be 2 because
the string 11 is in the language and it cannot be pumped. Let s be a string in the
language of length at least 3. If s is generated by 0*1+0+1* and s begins either 0
or 11, write s = xyz where x = ε, y is the first symbol, and z is the remainder of
s. If s is generated by 0*1+0+1* and s begins 10, write s = xyz where x = 10, y is
the next symbol, and z is the remainder of s. Breaking s up in this way shows that
it can be pumped. If s is generated by 10*1, we can write it as xyz where x = 1,
y = 0, and z is the remainder of s. This division gives a way to pump s.
2
CONTEXT-FREE
LANGUAGES
In Chapter 1 we introduced two different, though equivalent, methods of de-
scribing languages: finite automata and regular expressions. We showed that many
languages can be described in this way but that some simple languages, such as
{0^n1^n | n ≥ 0}, cannot.
In this chapter we present context-free grammars, a more powerful method
of describing languages. Such grammars can describe certain features that have
a recursive structure, which makes them useful in a variety of applications.
Context-free grammars were first used in the study of human languages. One
way of understanding the relationship of terms such as noun, verb, and preposition
and their respective phrases leads to a natural recursion because noun phrases
may appear inside verb phrases and vice versa. Context-free grammars help us
organize and understand these relationships.
An important application of context-free grammars occurs in the specification
and compilation of programming languages. A grammar for a programming lan-
guage often appears as a reference for people trying to learn the language syntax.
Designers of compilers and interpreters for programming languages often start
by obtaining a grammar for the language. Most compilers and interpreters con-
tain a component called a parser that extracts the meaning of a program prior to
generating the compiled code or performing the interpreted execution. A num-
ber of methodologies facilitate the construction of a parser once a context-free
grammar is available. Some tools even automatically generate the parser from
the grammar.
The collection of languages associated with context-free grammars is called
the context-free languages. They include all the regular languages and many
additional languages. In this chapter, we give a formal definition of context-free
grammars and study the properties of context-free languages. We also introduce
pushdown automata, a class of machines recognizing the context-free languages.
Pushdown automata are useful because they allow us to gain additional insight
into the power of context-free grammars.
2.1
CONTEXT-FREE GRAMMARS
The following is an example of a context-free grammar, which we call G1.
A→0A1
A→B
B→#
A grammar consists of a collection of substitution rules, also called produc-
tions. Each rule appears as a line in the grammar, comprising a symbol and
a string separated by an arrow. The symbol is called a variable. The string
consists of variables and other symbols called terminals. The variable symbols
often are represented by capital letters. The terminals are analogous to the in-
put alphabet and often are represented by lowercase letters, numbers, or special
symbols. One variable is designated as the start variable. It usually occurs on
the left-hand side of the topmost rule. For example, grammar G1 contains three
rules. G1’s variables are A and B, where A is the start variable. Its terminals are
0, 1, and #.
You use a grammar to describe a language by generating each string of that
language in the following manner.
1.Write down the start variable. It is the variable on the left-hand side of the
top rule, unless specified otherwise.
2.Find a variable that is written down and a rule that starts with that variable.
Replace the written down variable with the right-hand side of that rule.
3.Repeat step 2 until no variables remain.
For example, grammar G1 generates the string 000#111. The sequence of
substitutions to obtain a string is called a derivation. A derivation of string
000#111 in grammar G1 is

A ⇒ 0A1 ⇒ 00A11 ⇒ 000A111 ⇒ 000B111 ⇒ 000#111.
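The derivation above can be replayed mechanically. This small sketch (ours, not part of the text) applies each substitution of G1 in turn and confirms the result:

```python
rules = {"A": ["0A1", "B"], "B": ["#"]}   # grammar G1

def step(sentential, var, rhs):
    """Replace the leftmost occurrence of `var` with `rhs`."""
    return sentential.replace(var, rhs, 1)

s = "A"                                   # start variable
for var, rhs in [("A", "0A1"), ("A", "0A1"), ("A", "0A1"),
                 ("A", "B"), ("B", "#")]:
    assert rhs in rules[var]              # only legal rules of G1 are used
    s = step(s, var, rhs)
assert s == "000#111"
```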
You may also represent the same information pictorially with a parse tree. An
example of a parse tree is shown in Figure 2.1.
FIGURE 2.1
Parse tree for 000#111 in grammar G1
All strings generated in this way constitute the language of the grammar.
We write L(G1) for the language of grammar G1. Some experimentation with
the grammar G1 shows us that L(G1) is {0^n#1^n | n ≥ 0}. Any language that can
be generated by some context-free grammar is called a context-free language
(CFL). For convenience when presenting a context-free grammar, we abbreviate
several rules with the same left-hand variable, such as A → 0A1 and A → B,
into a single line A → 0A1 | B, using the symbol “|” as an “or”.
The following is a second example of a context-free grammar, called G2,
which describes a fragment of the English language.
⟨SENTENCE⟩ → ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⟨NOUN-PHRASE⟩ → ⟨CMPLX-NOUN⟩ | ⟨CMPLX-NOUN⟩⟨PREP-PHRASE⟩
⟨VERB-PHRASE⟩ → ⟨CMPLX-VERB⟩ | ⟨CMPLX-VERB⟩⟨PREP-PHRASE⟩
⟨PREP-PHRASE⟩ → ⟨PREP⟩⟨CMPLX-NOUN⟩
⟨CMPLX-NOUN⟩ → ⟨ARTICLE⟩⟨NOUN⟩
⟨CMPLX-VERB⟩ → ⟨VERB⟩ | ⟨VERB⟩⟨NOUN-PHRASE⟩
⟨ARTICLE⟩ → a | the
⟨NOUN⟩ → boy | girl | flower
⟨VERB⟩ → touches | likes | sees
⟨PREP⟩ → with
Grammar G2 has 10 variables (the capitalized grammatical terms written in-
side brackets); 27 terminals (the standard English alphabet plus a space charac-
ter); and 18 rules. Strings in L(G2) include:

a boy sees
the boy sees a flower
a girl with a flower likes the boy

Each of these strings has a derivation in grammar G2. The following is a deriva-
tion of the first string on this list.
⟨SENTENCE⟩ ⇒ ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩
⇒ ⟨CMPLX-NOUN⟩⟨VERB-PHRASE⟩
⇒ ⟨ARTICLE⟩⟨NOUN⟩⟨VERB-PHRASE⟩
⇒ a ⟨NOUN⟩⟨VERB-PHRASE⟩
⇒ a boy ⟨VERB-PHRASE⟩
⇒ a boy ⟨CMPLX-VERB⟩
⇒ a boy ⟨VERB⟩
⇒ a boy sees
FORMAL DEFINITION OF A CONTEXT-FREE GRAMMAR
Let’s formalize our notion of a context-free grammar (CFG).
DEFINITION 2.2
A context-free grammar is a 4-tuple (V, Σ, R, S), where
1. V is a finite set called the variables,
2. Σ is a finite set, disjoint from V, called the terminals,
3. R is a finite set of rules, with each rule being a variable and a
string of variables and terminals, and
4. S ∈ V is the start variable.

If u, v, and w are strings of variables and terminals, and A → w is a rule of the
grammar, we say that uAv yields uwv, written uAv ⇒ uwv. Say that u derives v,
written u ⇒* v, if u = v or if a sequence u1, u2, ..., uk exists for k ≥ 0 and
u ⇒ u1 ⇒ u2 ⇒ ... ⇒ uk ⇒ v.
The language of the grammar is {w ∈ Σ* | S ⇒* w}.
In grammar G1, V = {A, B}, Σ = {0, 1, #}, S = A, and R is the collection
of the three rules appearing on page 102. In grammar G2,

V = {⟨SENTENCE⟩, ⟨NOUN-PHRASE⟩, ⟨VERB-PHRASE⟩,
⟨PREP-PHRASE⟩, ⟨CMPLX-NOUN⟩, ⟨CMPLX-VERB⟩,
⟨ARTICLE⟩, ⟨NOUN⟩, ⟨VERB⟩, ⟨PREP⟩},

and Σ = {a, b, c, ..., z, “ ”}. The symbol “ ” is the blank symbol, placed invisibly
after each word (a, boy, etc.), so the words won’t run together.
Often we specify a grammar by writing down only its rules. We can identify
the variables as the symbols that appear on the left-hand side of the rules and
the terminals as the remaining symbols. By convention, the start variable is the
variable on the left-hand side of the first rule.
EXAMPLES OF CONTEXT-FREE GRAMMARS
EXAMPLE 2.3
Consider grammar G3 = ({S}, {a, b}, R, S). The set of rules, R, is

S → aSb | SS | ε.

This grammar generates strings such as abab, aaabbb, and aababb. You can
see more easily what this language is if you think of a as a left parenthesis “(”
and b as a right parenthesis “)”. Viewed in this way, L(G3) is the language of
all strings of properly nested parentheses. Observe that the right-hand side of a
rule may be the empty string ε.
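To see the nested-parentheses claim concretely, one can enumerate short strings of L(G3) by breadth-first search over sentential forms and check each for proper nesting. This sketch is ours; its pruning bound of 2·maxlen is a heuristic chosen for this particular grammar:

```python
from collections import deque

RULES = ["aSb", "SS", ""]                 # right-hand sides of S in G3
MAXLEN = 6

def generate(maxlen):
    """Terminal strings of length <= maxlen derivable from S in G3,
    found by breadth-first search over sentential forms."""
    out, seen, queue = set(), {"S"}, deque(["S"])
    while queue:
        form = queue.popleft()
        if "S" not in form:
            if len(form) <= maxlen:
                out.add(form)
            continue
        i = form.index("S")               # expand the leftmost variable
        for rhs in RULES:
            new = form[:i] + rhs + form[i + 1:]
            # bound form length so the search terminates; 2*maxlen
            # suffices for the short strings collected here
            if len(new) <= 2 * maxlen and new not in seen:
                seen.add(new)
                queue.append(new)
    return out

def balanced(w):                          # read a as "(" and b as ")"
    depth = 0
    for c in w:
        depth += 1 if c == "a" else -1
        if depth < 0:
            return False
    return depth == 0

L = generate(MAXLEN)
assert {"", "ab", "aabb", "abab", "aababb"} <= L
assert all(balanced(w) for w in L)
```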
EXAMPLE 2.4
Consider grammar G4 = (V, Σ, R, ⟨EXPR⟩).
V is {⟨EXPR⟩, ⟨TERM⟩, ⟨FACTOR⟩} and Σ is {a, +, x, (, )}. The rules are

⟨EXPR⟩ → ⟨EXPR⟩+⟨TERM⟩ | ⟨TERM⟩
⟨TERM⟩ → ⟨TERM⟩x⟨FACTOR⟩ | ⟨FACTOR⟩
⟨FACTOR⟩ → (⟨EXPR⟩) | a

The two strings a+axa and (a+a)xa can be generated with grammar G4.
The parse trees are shown in the following figure.
FIGURE 2.5
Parse trees for the strings a+axa and (a+a)xa
A compiler translates code written in a programming language into another
form, usually one more suitable for execution. To do so, the compiler extracts
the meaning of the code to be compiled in a process called parsing. One rep-
resentation of this meaning is the parse tree for the code, in the context-free
grammar for the programming language. We discuss an algorithm that parses
context-free languages later in Theorem 7.16 and in Problem 7.45.
Grammar G4 describes a fragment of a programming language concerned
with arithmetic expressions. Observe how the parse trees in Figure 2.5 “group”
the operations. The tree for a+axa groups the x operator and its operands
(the second two a’s) as one operand of the + operator. In the tree for (a+a)xa,
the grouping is reversed. These groupings fit the standard precedence of mul-
tiplication before addition and the use of parentheses to override the standard
precedence. Grammar G4 is designed to capture these precedence relations.
DESIGNING CONTEXT-FREE GRAMMARS
As with the design of finite automata, discussed in Section 1.1 (page 41), the
design of context-free grammars requires creativity. Indeed, context-free gram-
mars are even trickier to construct than finite automata because we are more
accustomed to programming a machine for specific tasks than we are to describ-
ing languages with grammars. The following techniques are helpful, singly or in
combination, when you’re faced with the problem of constructing a CFG.
First, many CFLs are the union of simpler CFLs. If you must construct a CFG for
a CFL that you can break into simpler pieces, do so and then construct individual
grammars for each piece. These individual grammars can be easily merged into
a grammar for the original language by combining their rules and then adding
the new rule S → S1 | S2 | ··· | Sk, where the variables Si are the start variables
for the individual grammars. Solving several simpler problems is often easier
than solving one complicated problem.
For example, to get a grammar for the language {0^n1^n | n ≥ 0} ∪ {1^n0^n | n ≥ 0},
first construct the grammar

S1 → 0S11 | ε

for the language {0^n1^n | n ≥ 0} and the grammar

S2 → 1S20 | ε

for the language {1^n0^n | n ≥ 0} and then add the rule S → S1 | S2 to give the
grammar

S → S1 | S2
S1 → 0S11 | ε
S2 → 1S20 | ε.
Second, constructing a CFG for a language that happens to be regular is easy
if you can first construct a DFA for that language. You can convert any DFA into
an equivalent CFG as follows. Make a variable Ri for each state qi of the DFA.
Add the rule Ri → aRj to the CFG if δ(qi, a) = qj is a transition in the DFA. Add
the rule Ri → ε if qi is an accept state of the DFA. Make R0 the start variable of
the grammar, where q0 is the start state of the machine. Verify on your own that
the resulting CFG generates the same language that the DFA recognizes.
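The conversion is easy to automate. The sketch below is ours: it builds the rules Ri → aRj and Ri → ε from a small hypothetical DFA (strings ending in 1) and cross-checks the resulting right-linear grammar against the DFA on all short strings.

```python
from itertools import product

# Hypothetical DFA over {0,1}: accepts strings that end in 1.
states, start, accepting = ["q0", "q1"], "q0", {"q1"}
delta = {("q0", "0"): "q0", ("q0", "1"): "q1",
         ("q1", "0"): "q0", ("q1", "1"): "q1"}

# Variable Ri per state qi; Ri -> a Rj per transition; Ri -> ε per accept state.
rules = {q: [] for q in states}
for (q, a), r in delta.items():
    rules[q].append((a, r))               # the rule  R_q -> a R_r
for q in accepting:
    rules[q].append(None)                 # the rule  R_q -> ε

def derives(q, w):
    """Can variable R_q derive w in the resulting grammar?"""
    if w == "":
        return None in rules[q]
    for rule in rules[q]:
        if rule is not None:
            a, r = rule
            if a == w[0] and derives(r, w[1:]):
                return True
    return False

def dfa_accepts(w):
    q = start
    for c in w:
        q = delta[(q, c)]
    return q in accepting

for n in range(6):
    for tup in product("01", repeat=n):
        w = "".join(tup)
        assert derives(start, w) == dfa_accepts(w)
```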
Third, certain context-free languages contain strings with two substrings that
are “linked” in the sense that a machine for such a language would need to re-
member an unbounded amount of information about one of the substrings to
verify that it corresponds properly to the other substring. This situation occurs
in the language {0^n1^n | n ≥ 0} because a machine would need to remember the
number of 0s in order to verify that it equals the number of 1s. You can construct
a CFG to handle this situation by using a rule of the form R → uRv, which gen-
erates strings wherein the portion containing the u’s corresponds to the portion
containing the v’s.
Finally, in more complex languages, the strings may contain certain structures
that appear recursively as part of other (or the same) structures. That situation
occurs in the grammar that generates arithmetic expressions in Example 2.4.
Any time the symbol a appears, an entire parenthesized expression might appear
recursively instead. To achieve this effect, place the variable symbol generating
the structure in the location of the rules corresponding to where that structure
may recursively appear.
AMBIGUITY
Sometimes a grammar can generate the same string in several different ways.
Such a string will have several different parse trees and thus several different
meanings. This result may be undesirable for certain applications, such as pro-
gramming languages, where a program should have a unique interpretation.
If a grammar generates the same string in several different ways, we say that
the string is derived ambiguously in that grammar. If a grammar generates some
string ambiguously, we say that the grammar is ambiguous.
For example, consider grammar G5:

⟨EXPR⟩ → ⟨EXPR⟩+⟨EXPR⟩ | ⟨EXPR⟩x⟨EXPR⟩ | (⟨EXPR⟩) | a

This grammar generates the string a+axa ambiguously. The following figure
shows the two different parse trees.
FIGURE 2.6
The two parse trees for the string a+axa in grammar G5

This grammar doesn’t capture the usual precedence relations and so may
group the + before the x or vice versa. In contrast, grammar G4 generates
exactly the same language, but every generated string has a unique parse tree.
Hence G4 is unambiguous, whereas G5 is ambiguous.
Grammar G2 (page 103) is another example of an ambiguous grammar. The
sentence the girl touches the boy with the flower has two different
derivations. In Exercise 2.8 you are asked to give the two parse trees and observe
their correspondence with the two different ways to read that sentence.
Now we formalize the notion of ambiguity. When we say that a grammar
generates a string ambiguously, we mean that the string has two different parse
trees, not two different derivations. Two derivations may differ merely in the
order in which they replace variables yet not in their overall structure. To con-
centrate on structure, we define a type of derivation that replaces variables in a
fixed order. A derivation of a string w in a grammar G is a leftmost derivation if
at every step the leftmost remaining variable is the one replaced. The derivation
preceding Definition 2.2 (page 104) is a leftmost derivation.
DEFINITION 2.7
A string w is derived ambiguously in context-free grammar G if
it has two or more different leftmost derivations. Grammar G is
ambiguous if it generates some string ambiguously.
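The number of parse trees of a string can be counted by recursion over substrings, since the top rule and split position determine a decomposition. This sketch (ours) does so for G5; an unambiguous grammar would always yield 1:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def trees(w):
    """Number of parse trees of w under G5:
       EXPR -> EXPR + EXPR | EXPR x EXPR | ( EXPR ) | a"""
    n = 0
    if w == "a":
        n += 1
    if len(w) >= 3 and w[0] == "(" and w[-1] == ")":
        n += trees(w[1:-1])
    for i, c in enumerate(w):
        if c in "+x":                     # top-level split at a + or x
            n += trees(w[:i]) * trees(w[i + 1:])
    return n

assert trees("a+axa") == 2                # derived ambiguously, as in Figure 2.6
assert trees("(a+a)xa") == 1              # the parentheses force one grouping
```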
Sometimes when we have an ambiguous grammar we can find an unambigu-
ous grammar that generates the same language. Some context-free languages,
however, can be generated only by ambiguous grammars. Such languages are
called inherently ambiguous. Problem 2.29 asks you to prove that the language
{a^i b^j c^k | i = j or j = k} is inherently ambiguous.
CHOMSKY NORMAL FORM
When working with context-free grammars, it is often convenient to have them
in simplified form. One of the simplest and most useful forms is called the
Chomsky normal form. Chomsky normal form is useful in giving algorithms
for working with context-free grammars, as we do in Chapters 4 and 7.
DEFINITION 2.8
A context-free grammar is in Chomsky normal form if every rule is
of the form

A → BC
A → a

where a is any terminal and A, B, and C are any variables—except
that B and C may not be the start variable. In addition, we permit
the rule S → ε, where S is the start variable.
THEOREM 2.9
Any context-free language is generated by a context-free grammar in Chomsky
normal form.
PROOF IDEA We can convert any grammar G into Chomsky normal form.
The conversion has several stages wherein rules that violate the conditions are
replaced with equivalent ones that are satisfactory. First, we add a new start
variable. Then, we eliminate all ε-rules of the form A → ε. We also eliminate
all unit rules of the form A → B. In both cases we patch up the grammar to be
sure that it still generates the same language. Finally, we convert the remaining
rules into the proper form.
PROOF First, we add a new start variable S0 and the rule S0 → S, where
S was the original start variable. This change guarantees that the start variable
doesn’t occur on the right-hand side of a rule.
Second, we take care of all ε-rules. We remove an ε-rule A → ε, where A
is not the start variable. Then for each occurrence of an A on the right-hand
side of a rule, we add a new rule with that occurrence deleted. In other words,
if R → uAv is a rule in which u and v are strings of variables and terminals, we
add rule R → uv. We do so for each occurrence of an A, so the rule R → uAvAw
causes us to add R → uvAw, R → uAvw, and R → uvw. If we have the rule
R → A, we add R → ε unless we had previously removed the rule R → ε. We
repeat these steps until we eliminate all ε-rules not involving the start variable.
Third, we handle all unit rules. We remove a unit rule A → B. Then,
whenever a rule B → u appears, we add the rule A → u unless this was a unit
rule previously removed. As before, u is a string of variables and terminals. We
repeat these steps until we eliminate all unit rules.
Finally, we convert all remaining rules into the proper form. We replace each
rule A → u1u2···uk, where k ≥ 3 and each ui is a variable or terminal symbol,
with the rules A → u1A1, A1 → u2A2, A2 → u3A3, ..., and A(k−2) → u(k−1)uk.
The Ai’s are new variables. We replace any terminal ui in the preceding rule(s)
with the new variable Ui and add the rule Ui → ui.
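This final step can be carried out rule by rule. The helper below is our sketch; its fresh-name scheme (S_1, U_a) merely stands in for the proof's Ai and Ui, and it assumes single-character symbols with lowercase meaning terminal:

```python
def binarize(head, body):
    """Last step of Theorem 2.9 for one rule head -> u1 u2 ... uk (k >= 2):
    lift terminals into U variables, then chain the rest with new variables.
    Symbols are single characters; lowercase = terminal, uppercase = variable."""
    rules, syms = [], []
    for u in body:
        if u.islower():                    # replace terminal u by variable U_u
            var = "U_" + u
            rules.append((var, [u]))
            syms.append(var)
        else:
            syms.append(u)
    prev = head
    for i in range(len(syms) - 2):
        new = f"{head}_{i + 1}"            # stands in for the proof's A1, A2, ...
        rules.append((prev, [syms[i], new]))
        prev = new
    rules.append((prev, syms[-2:]))
    return rules

# S -> ASA becomes S -> A S_1 and S_1 -> S A, matching Example 2.10's A1 -> SA.
assert binarize("S", list("ASA")) == [("S", ["A", "S_1"]), ("S_1", ["S", "A"])]
# S -> aB becomes U_a -> a and S -> U_a B, matching U -> a and S -> UB.
assert binarize("S", list("aB")) == [("U_a", ["a"]), ("S", ["U_a", "B"])]
```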
EXAMPLE 2.10
Let G6 be the following CFG and convert it to Chomsky normal form by using
the conversion procedure just given. The series of grammars presented illus-
trates the steps in the conversion. (In the original, newly added rules were shown
in bold and just-removed rules in gray; here those changes are noted in brackets.)
1. The original CFG G6 is shown first; the result of applying the first
step, adding a new start variable, appears second.
S → ASA | aB
A → B | S
B → b | ε

S0 → S
S → ASA | aB
A → B | S
B → b | ε
2. Remove the ε-rule B → ε (first grammar below), and then A → ε (second grammar).
S0 → S
S → ASA | aB | a
A → B | S | ε
B → b

S0 → S
S → ASA | aB | a | SA | AS | S
A → B | S
B → b
3a. Remove the unit rule S → S (first grammar below), and then S0 → S (second grammar).
S0 → S
S → ASA | aB | a | SA | AS
A → B | S
B → b

S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → B | S
B → b
3b. Remove the unit rules A → B and then A → S.
S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → S | b
B → b

S0 → ASA | aB | a | SA | AS
S → ASA | aB | a | SA | AS
A → b | ASA | aB | a | SA | AS
B → b
4. Convert the remaining rules into the proper form by adding additional vari-
ables and rules. The final grammar in Chomsky normal form is equivalent to G6.
(Actually the procedure given in Theorem 2.9 produces several variables Ui and
several rules Ui → a. We simplified the resulting grammar by using a single
variable U and rule U → a.)
S0 → AA1 | UB | a | SA | AS
S → AA1 | UB | a | SA | AS
A → b | AA1 | UB | a | SA | AS
A1 → SA
U → a
B → b
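Because the final grammar is in Chomsky normal form, membership in its language can be tested with the standard CYK dynamic-programming algorithm (not part of this section; the sketch below is our own illustration). We encode a two-variable body as a tuple and a terminal body as a character:

```python
def cyk(s, rules, start="S0"):
    """CYK membership test for a grammar in Chomsky normal form.
    `rules` holds (head, body) pairs; body is a terminal character or a
    tuple of two variables. This grammar has no S0 -> epsilon rule, so
    the empty string is rejected outright."""
    n = len(s)
    if n == 0:
        return False
    # table[i][l] = set of variables deriving the substring s[i : i+l+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, ch in enumerate(s):
        table[i][0] = {h for h, b in rules if b == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for k in range(1, length):              # split point
                for h, b in rules:
                    if (isinstance(b, tuple)
                            and b[0] in table[i][k - 1]
                            and b[1] in table[i + k][length - k - 1]):
                        table[i][length - 1].add(h)
    return start in table[0][n - 1]

# the grammar produced in step 4 above
CNF = []
for head in ("S0", "S", "A"):
    CNF += [(head, ("A", "A1")), (head, ("U", "B")), (head, "a"),
            (head, ("S", "A")), (head, ("A", "S"))]
CNF += [("A", "b"), ("A1", ("S", "A")), ("U", "a"), ("B", "b")]
```

For instance, `cyk("ab", CNF)` succeeds through S0 → UB, matching the derivation S ⇒ aB ⇒ ab in the original G6.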
2.2
PUSHDOWN AUTOMATA
In this section we introduce a new type of computational model called pushdown
automata. These automata are like nondeterministic finite automata but have an
extra component called a stack. The stack provides additional memory beyond
the finite amount available in the control. The stack allows pushdown automata
to recognize some nonregular languages.
Pushdown automata are equivalent in power to context-free grammars. This
equivalence is useful because it gives us two options for proving that a language is
context free. We can give either a context-free grammar generating it or a push-
down automaton recognizing it. Certain languages are more easily described in
terms of generators, whereas others are more easily described by recognizers.
The following figure is a schematic representation of a finite automaton. The
control represents the states and transition function, the tape contains the
input string, and the arrow represents the input head, pointing at the next input
symbol to be read.
FIGURE 2.11
Schematic of a finite automaton
With the addition of a stack component we obtain a schematic representation
of a pushdown automaton, as shown in the following figure.
FIGURE 2.12
Schematic of a pushdown automaton
A pushdown automaton (PDA) can write symbols on the stack and read them
back later. Writing a symbol “pushes down” all the other symbols on the stack.
At any time the symbol on the top of the stack can be read and removed. The
remaining symbols then move back up. Writing a symbol on the stack is of-
ten referred to as pushing the symbol, and removing a symbol is referred to as
popping it. Note that all access to the stack, for both reading and writing, may
be done only at the top. In other words, a stack is a “last in, first out” storage
device. If certain information is written on the stack and additional information
is written afterward, the earlier information becomes inaccessible until the later
information is removed.
Plates on a cafeteria serving counter illustrate a stack. The stack of plates
rests on a spring so that when a new plate is placed on top of the stack, the plates
below it move down. The stack on a pushdown automaton is like a stack of
plates, with each plate having a symbol written on it.
A stack is valuable because it can hold an unlimited amount of information.
Recall that a finite automaton is unable to recognize the language {0^n1^n | n ≥ 0}
because it cannot store very large numbers in its finite memory. A PDA is able to
recognize this language because it can use its stack to store the number of 0s it
has seen. Thus the unlimited nature of a stack allows the PDA to store numbers of
unbounded size. The following informal description shows how the automaton
for this language works.
Read symbols from the input. As each 0 is read, push it onto the stack. As
soon as 1s are seen, pop a 0 off the stack for each 1 read. If reading the
input is finished exactly when the stack becomes empty of 0s, accept the
input. If the stack becomes empty while 1s remain or if the 1s are finished
while the stack still contains 0s or if any 0s appear in the input following
1s, reject the input.
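The informal description above translates almost line for line into code. In this Python sketch (our own, not from the text), the PDA's stack becomes an explicit list:

```python
def accepts_0n1n(s):
    """Direct simulation of the informal PDA description: push each 0;
    once 1s start, pop a 0 per 1; accept iff the stack empties exactly
    when the input ends and no 0 follows a 1."""
    stack = []
    seen_one = False
    for ch in s:
        if ch == "0":
            if seen_one:
                return False          # a 0 appears after a 1: reject
            stack.append("0")
        elif ch == "1":
            seen_one = True
            if not stack:
                return False          # stack empty while 1s remain
            stack.pop()
        else:
            return False              # symbol outside {0,1}
    return not stack                  # accept iff the stack is empty at the end
```

Note that the empty string is accepted, matching n ≥ 0 in the language's definition.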
As mentioned earlier, pushdown automata may be nondeterministic. Deter-
ministic and nondeterministic pushdown automata are not equivalent in power.
Nondeterministic pushdown automata recognize certain languages that no
deterministic pushdown automata can recognize, as we will see in Section 2.4. We
give languages requiring nondeterminism in Examples 2.16 and 2.18. Recall
that deterministic and nondeterministic finite automata do recognize the same
class of languages, so the pushdown automata situation is different. We focus on
nondeterministic pushdown automata because these automata are equivalent in
power to context-free grammars.
FORMAL DEFINITION OF A PUSHDOWN AUTOMATON
The formal definition of a pushdown automaton is similar to that of a finite
automaton, except for the stack. The stack is a device containing symbols drawn
from some alphabet. The machine may use different alphabets for its input and
its stack, so now we specify both an input alphabet Σ and a stack alphabet Γ.
At the heart of any formal definition of an automaton is the transition
function, which describes its behavior. Recall that Σε = Σ ∪ {ε} and Γε = Γ ∪ {ε}.
The domain of the transition function is Q × Σε × Γε. Thus the current state,
next input symbol read, and top symbol of the stack determine the next move of
a pushdown automaton. Either symbol may be ε, causing the machine to move
without reading a symbol from the input or without reading a symbol from the
stack.
For the range of the transition function we need to consider what to allow
the automaton to do when it is in a particular situation. It may enter some
new state and possibly write a symbol on the top of the stack. The function δ
can indicate this action by returning a member of Q together with a member
of Γε, that is, a member of Q × Γε. Because we allow nondeterminism in this
model, a situation may have several legal next moves. The transition function
incorporates nondeterminism in the usual way, by returning a set of members of
Q × Γε, that is, a member of P(Q × Γε). Putting it all together, our transition
function δ takes the form δ: Q × Σε × Γε → P(Q × Γε).
DEFINITION 2.13
A pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q0, F), where Q, Σ,
Γ, and F are all finite sets, and
1. Q is the set of states,
2. Σ is the input alphabet,
3. Γ is the stack alphabet,
4. δ: Q × Σε × Γε → P(Q × Γε) is the transition function,
5. q0 ∈ Q is the start state, and
6. F ⊆ Q is the set of accept states.
A pushdown automaton M = (Q, Σ, Γ, δ, q0, F) computes as follows. It ac-
cepts input w if w can be written as w = w1w2···wm, where each wi ∈ Σε, and
sequences of states r0, r1, ..., rm ∈ Q and strings s0, s1, ..., sm ∈ Γ* exist that
satisfy the following three conditions. The strings si represent the sequence of
stack contents that M has on the accepting branch of the computation.
1. r0 = q0 and s0 = ε. This condition signifies that M starts out properly, in
the start state and with an empty stack.
2. For i = 0, ..., m−1, we have (ri+1, b) ∈ δ(ri, wi+1, a), where si = at
and si+1 = bt for some a, b ∈ Γε and t ∈ Γ*. This condition states that M
moves properly according to the state, stack, and next input symbol.
3. rm ∈ F. This condition states that an accept state occurs at the input end.
EXAMPLES OF PUSHDOWN AUTOMATA
EXAMPLE 2.14
The following is the formal description of the PDA (page 112) that recognizes
the language {0^n1^n | n ≥ 0}. Let M1 be (Q, Σ, Γ, δ, q1, F), where
Q = {q1, q2, q3, q4},
Σ = {0,1},
Γ = {0,$},
F = {q1, q4}, and
δ is given by the following table of its nonempty entries; all other entries
signify ∅.
δ(q1, ε, ε) = {(q2, $)}
δ(q2, 0, ε) = {(q2, 0)}
δ(q2, 1, 0) = {(q3, ε)}
δ(q3, 1, 0) = {(q3, ε)}
δ(q3, ε, $) = {(q4, ε)}
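The formal definition of acceptance can also be checked mechanically. The sketch below (our own illustration, not from the text) searches breadth-first over configurations (state, input position, stack), using the nonempty entries of δ for M1; it assumes, as holds for M1, that no chain of ε-moves grows the stack forever.

```python
from collections import deque

# delta maps (state, input symbol or "", popped symbol or "") to a set of
# (new state, pushed symbol or "") pairs; missing keys mean the empty set.
DELTA = {
    ("q1", "", ""):   {("q2", "$")},
    ("q2", "0", ""):  {("q2", "0")},
    ("q2", "1", "0"): {("q3", "")},
    ("q3", "1", "0"): {("q3", "")},
    ("q3", "", "$"):  {("q4", "")},
}

def pda_accepts(w, delta, start, accept):
    """BFS over PDA configurations, following the formal definition of
    acceptance: accept iff some branch reads all of w and ends in an
    accept state. The stack is a tuple with its top at the end."""
    frontier = deque([(start, 0, ())])
    seen = set()
    while frontier:
        config = frontier.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, pos, stack = config
        if pos == len(w) and state in accept:
            return True
        # try reading the next symbol or moving on epsilon input,
        # and popping the top symbol or leaving the stack alone
        for a in ([w[pos]] if pos < len(w) else []) + [""]:
            for s in ([stack[-1]] if stack else []) + [""]:
                for r, push in delta.get((state, a, s), ()):
                    rest = stack[:-1] if s else stack
                    new_stack = rest + ((push,) if push else ())
                    frontier.append((r, pos + (1 if a else 0), new_stack))
    return False
```

For example, on input 0011 the search finds the branch q1 → q2 (push $), push two 0s, pop them on the 1s, then take the ε-move on $ into q4.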
We can also use a state diagram to describe a PDA, as in Figures 2.15, 2.17,
and 2.19. Such diagrams are similar to the state diagrams used to describe finite
automata, modified to show how the PDA uses its stack when going from state
to state. We write “a,b→c” to signify that when the machine is reading an
a from the input, it may replace the symbol b on the top of the stack with a c.
Any of a, b, and c may be ε. If a is ε, the machine may make this transition
without reading any symbol from the input. If b is ε, the machine may make
this transition without reading and popping any symbol from the stack. If c
is ε, the machine does not write any symbol on the stack when going along this
transition.
FIGURE 2.15
State diagram for the PDA M1 that recognizes {0^n1^n | n ≥ 0}
The formal definition of a PDA contains no explicit mechanism to allow the
PDA to test for an empty stack. This PDA is able to get the same effect by initially
placing a special symbol $ on the stack. Then if it ever sees the $ again, it knows
that the stack effectively is empty. Subsequently, when we refer to testing for an
empty stack in an informal description of a PDA, we implement the procedure in
the same way.
Similarly, PDAs cannot test explicitly for having reached the end of the input
string. This PDA is able to achieve that effect because the accept state takes effect
only when the machine is at the end of the input. Thus from now on, we assume
that PDAs can test for the end of the input, and we know that we can implement
it in the same manner.
EXAMPLE 2.16
This example illustrates a pushdown automaton that recognizes the language
{a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k}.
Informally, the PDA for this language works by first reading and pushing
the a's. When the a's are done, the machine has all of them on the stack so
that it can match them with either the b's or the c's. This maneuver is a bit
tricky because the machine doesn't know in advance whether to match the a's
with the b's or the c's. Nondeterminism comes in handy here.
Using its nondeterminism, the PDA can guess whether to match the a's with
the b's or with the c's, as shown in Figure 2.17. Think of the machine as having
two branches of its nondeterminism, one for each possible guess. If either of
them matches, that branch accepts and the entire machine accepts. Problem 2.57
asks you to show that nondeterminism is essential for recognizing this language
with a PDA.
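Setting the stack machinery aside, the language itself is easy to test directly. This Python sketch (our own, not from the text) mirrors the PDA's two nondeterministic branches:

```python
import re

def in_L(s):
    """Mirror the PDA's two guesses: the string must have the shape
    a* b* c*, and either #a == #b (branch 1) or #a == #c (branch 2)."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    if not m:
        return False
    i, j, k = (len(g) for g in m.groups())
    return i == j or i == k
```

A string such as aabbc succeeds on the first branch (i = j = 2) even though the second branch fails, matching the rule that the machine accepts if either branch does.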
FIGURE 2.17
State diagram for PDA M2 that recognizes
{a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k}
EXAMPLE 2.18
In this example we give a PDA M3 recognizing the language {ww^R | w ∈ {0,1}*}.
Recall that w^R means w written backwards. The informal description and state
diagram of the PDA follow.
Begin by pushing the symbols that are read onto the stack. At each point,
nondeterministically guess that the middle of the string has been reached and
then switch to popping a symbol off the stack for each symbol read, checking
that the two are the same. If they always match and the stack empties at
the same time as the input is finished, accept; otherwise reject.
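The guess-the-middle idea can be mimicked directly. The Python sketch below (our own illustration) tries every possible middle point, accepting exactly when some guess succeeds:

```python
def accepts_wwR(s):
    """Mirror M3's nondeterminism: guess every split point i, treat s[:i]
    as the pushed half, and check that the rest reads it back in reverse.
    Some guess succeeds iff s = w + w reversed for some w."""
    return any(s[:i] == s[i:][::-1] for i in range(len(s) + 1))
```

Only the exact middle of an even-length string can succeed, since the two halves must have equal length; the loop simply plays out every nondeterministic branch.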
FIGURE 2.19
State diagram for the PDA M3 that recognizes {ww^R | w ∈ {0,1}*}
Problem 2.58 shows that this language requires a nondeterministic PDA.
EQUIVALENCE WITH CONTEXT-FREE GRAMMARS
In this section we show that context-free grammars and pushdown automata are
equivalent in power. Both are capable of describing the class of context-free
languages. We show how to convert any context-free grammar into a pushdown
automaton that recognizes the same language and vice versa. Recalling that we
defined a context-free language to be any language that can be described with a
context-free grammar, our objective is the following theorem.
THEOREM 2.20
A language is context free if and only if some pushdown automaton recognizes it.
As usual for “if and only if” theorems, we have two directions to prove. In
this theorem, both directions are interesting. First, we do the easier forward
direction.
LEMMA 2.21
If a language is context free, then some pushdown automaton recognizes it.
PROOF IDEA Let A be a CFL. From the definition we know that A has a CFG,
G, generating it. We show how to convert G into an equivalent PDA, which we
call P.
The PDA P that we now describe will work by accepting its input w, if G gen-
erates that input, by determining whether there is a derivation for w. Recall that
a derivation is simply the sequence of substitutions made as a grammar generates
a string. Each step of the derivation yields an intermediate string of variables
and terminals. We design P to determine whether some series of substitutions
using the rules of G can lead from the start variable to w.
One of the difficulties in testing whether there is a derivation for w is in
figuring out which substitutions to make. The PDA's nondeterminism allows it
to guess the sequence of correct substitutions. At each step of the derivation, one
of the rules for a particular variable is selected nondeterministically and used to
substitute for that variable.
The PDA P begins by writing the start variable on its stack. It goes through a
series of intermediate strings, making one substitution after another. Eventually
it may arrive at a string that contains only terminal symbols, meaning that it has
used the grammar to derive a string. Then P accepts if this string is identical to
the string it has received as input.
Implementing this strategy on a PDA requires one additional idea. We need
to see how the PDA stores the intermediate strings as it goes from one to an-
other. Simply using the stack for storing each intermediate string is tempting.
However, that doesn't quite work because the PDA needs to find the variables in
the intermediate string and make substitutions. The PDA can access only the top
symbol on the stack and that may be a terminal symbol instead of a variable. The
way around this problem is to keep only part of the intermediate string on the
stack: the symbols starting with the first variable in the intermediate string. Any
terminal symbols appearing before the first variable are matched immediately
with symbols in the input string. The following figure shows the PDA P.
FIGURE 2.22
P representing the intermediate string 01A1A0
The following is an informal description of P.
1. Place the marker symbol $ and the start variable on the stack.
2. Repeat the following steps forever.
a. If the top of stack is a variable symbol A, nondeterministically select
one of the rules for A and substitute A by the string on the right-hand
side of the rule.
b. If the top of stack is a terminal symbol a, read the next symbol from
the input and compare it to a. If they match, repeat. If they do not
match, reject on this branch of the nondeterminism.
c. If the top of stack is the symbol $, enter the accept state. Doing so
accepts the input if it has all been read.
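The loop above can be sketched as a backtracking search in which the nondeterministic rule choice of step 2(a) becomes trial of every rule. In this Python sketch (our own illustration), the $ marker is represented implicitly by the stack running empty, variables are uppercase and terminals lowercase by assumption, and the substitution bound is our own safeguard against runaway derivations, not part of the construction:

```python
def cfg_pda_accepts(w, rules, start, bound=50):
    """Simulate P's loop: the stack string holds the not-yet-matched part
    of the intermediate string, starting at its first variable. Backtrack
    over nondeterministic rule choices, cut off after `bound` substitutions."""
    def search(stack, pos, steps):
        if steps > bound:
            return False
        # step 2(b): match leading terminals against the input
        while stack and stack[0].islower():
            if pos < len(w) and w[pos] == stack[0]:
                stack, pos = stack[1:], pos + 1
            else:
                return False
        if not stack:                       # step 2(c): only $ would remain
            return pos == len(w)
        A = stack[0]                        # step 2(a): expand a variable
        return any(search(body + stack[1:], pos, steps + 1)
                   for head, body in rules if head == A)
    return search(start, 0, 0)

# illustrative grammar (our own): S -> aSb | eps, generating a^n b^n
RULES = [("S", "aSb"), ("S", "")]
```

On input aabb the search expands S twice and then takes S → ε, exactly the derivation S ⇒ aSb ⇒ aaSbb ⇒ aabb.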
PROOF We now give the formal details of the construction of the pushdown
automaton P = (Q, Σ, Γ, δ, qstart, F). To make the construction clearer, we use
shorthand notation for the transition function. This notation provides a way to
write an entire string on the stack in one step of the machine. We can simulate
this action by introducing additional states to write the string one symbol at a
time, as implemented in the following formal construction.
Let q and r be states of the PDA and let a be in Σε and s be in Γε. Say that
we want the PDA to go from q to r when it reads a and pops s. Furthermore, we
want it to push the entire string u = u1···ul on the stack at the same time. We
can implement this action by introducing new states q1, ..., ql−1 and setting the
transition function as follows:
δ(q, a, s) to contain (q1, ul),
δ(q1, ε, ε) = {(q2, ul−1)},
δ(q2, ε, ε) = {(q3, ul−2)},
...
δ(ql−1, ε, ε) = {(r, u1)}.
We use the notation (r, u) ∈ δ(q, a, s) to mean that when q is the state of the
automaton, a is the next input symbol, and s is the symbol on the top of the
stack, the PDA may read the a and pop the s, then push the string u onto the
stack and go on to the state r. The following figure shows this implementation.
FIGURE 2.23
Implementing the shorthand (r, xyz) ∈ δ(q, a, s)
The states of P are Q = {qstart, qloop, qaccept} ∪ E, where E is the set of states
we need for implementing the shorthand just described. The start state is qstart.
The only accept state is qaccept.
The transition function is defined as follows. We begin by initializing the
stack to contain the symbols $ and S, implementing step 1 in the informal de-
scription: δ(qstart, ε, ε) = {(qloop, S$)}. Then we put in transitions for the main
loop of step 2.
First, we handle case (a) wherein the top of the stack contains a variable. Let
δ(qloop, ε, A) = {(qloop, w) | where A → w is a rule in R}.
Second, we handle case (b) wherein the top of the stack contains a terminal.
Let δ(qloop, a, a) = {(qloop, ε)}.
Finally, we handle case (c) wherein the empty stack marker $ is on the top of
the stack. Let δ(qloop, ε, $) = {(qaccept, ε)}.
The state diagram is shown in Figure 2.24.
FIGURE 2.24
State diagram of P
That completes the proof of Lemma 2.21.
EXAMPLE 2.25
We use the procedure developed in Lemma 2.21 to construct a PDA P1 from the
following CFG G.
S → aTb | b
T → Ta | ε
The transition function is shown in the following diagram.
FIGURE 2.26
State diagram of P1
Now we prove the reverse direction of Theorem 2.20. For the forward di-
rection, we gave a procedure for converting a CFG into a PDA. The main idea
was to design the automaton so that it simulates the grammar. Now we want
to give a procedure for going the other way: converting a PDA into a CFG. We
design the grammar to simulate the automaton. This task is challenging because
“programming” an automaton is easier than “programming” a grammar.
LEMMA 2.27
If a pushdown automaton recognizes some language, then it is context free.
PROOF IDEA We have a PDA P, and we want to make a CFG G that generates
all the strings that P accepts. In other words, G should generate a string if that
string causes the PDA to go from its start state to an accept state.
To achieve this outcome, we design a grammar that does somewhat more.
For each pair of states p and q in P, the grammar will have a variable Apq. This
variable generates all the strings that can take P from p with an empty stack to
q with an empty stack. Observe that such strings can also take P from p to q,
regardless of the stack contents at p, leaving the stack at q in the same condition
as it was at p.
First, we simplify our task by modifying P slightly to give it the following
three features.
1. It has a single accept state, qaccept.
2. It empties its stack before accepting.
3. Each transition either pushes a symbol onto the stack (a push move) or pops
one off the stack (a pop move), but it does not do both at the same time.
Giving P features 1 and 2 is easy. To give it feature 3, we replace each transition
that simultaneously pops and pushes with a two-transition sequence that goes
through a new state, and we replace each transition that neither pops nor pushes
with a two-transition sequence that pushes then pops an arbitrary stack symbol.
To design G so that Apq generates all strings that take P from p to q, starting
and ending with an empty stack, we must understand how P operates on these
strings. For any such string x, P's first move on x must be a push, because every
move is either a push or a pop and P can't pop an empty stack. Similarly, the last
move on x must be a pop because the stack ends up empty.
Two possibilities occur during P's computation on x. Either the symbol
popped at the end is the symbol that was pushed at the beginning, or not. If
so, the stack could be empty only at the beginning and end of P's computation
on x. If not, the initially pushed symbol must get popped at some point be-
fore the end of x and thus the stack becomes empty at this point. We simulate
the former possibility with the rule Apq → aArsb, where a is the input read at
the first move, b is the input read at the last move, r is the state following p,
and s is the state preceding q. We simulate the latter possibility with the rule
Apq → AprArq, where r is the state when the stack becomes empty.
PROOF Say that P = (Q, Σ, Γ, δ, q0, {qaccept}) and construct G. The variables
of G are {Apq | p, q ∈ Q}. The start variable is Aq0,qaccept. Now we describe G's
rules in three parts.
1. For each p, q, r, s ∈ Q, u ∈ Γ, and a, b ∈ Σε, if δ(p, a, ε) contains (r, u)
and δ(s, b, u) contains (q, ε), put the rule Apq → aArsb in G.
2. For each p, q, r ∈ Q, put the rule Apq → AprArq in G.
3. Finally, for each p ∈ Q, put the rule App → ε in G.
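The three parts of the rule construction can be generated mechanically. In the sketch below (our own illustration, not from the text), a variable Apq is encoded as the tuple ("A", p, q), δ is a dictionary of the nonempty values of a PDA assumed to be in the simplified form above, and the tiny two-state δ fragment used at the end is hypothetical:

```python
from itertools import product

def pda_to_cfg_rules(states, stack_alphabet, input_alphabet, delta):
    """Emit G's rules as (head, body) pairs, where a body is a tuple of
    symbols and variables; epsilon components are kept as empty strings."""
    eps_input = list(input_alphabet) + [""]
    rules = set()
    # part 1: Apq -> a Ars b, for a push of u at p matched by a pop before q
    for p, q, r, s in product(states, repeat=4):
        for u in stack_alphabet:
            for a, b in product(eps_input, eps_input):
                if ((r, u) in delta.get((p, a, ""), set())
                        and (q, "") in delta.get((s, b, u), set())):
                    rules.add((("A", p, q), (a, ("A", r, s), b)))
    # part 2: Apq -> Apr Arq
    for p, q, r in product(states, repeat=3):
        rules.add((("A", p, q), (("A", p, r), ("A", r, q))))
    # part 3: App -> epsilon
    for p in states:
        rules.add((("A", p, p), ()))
    return rules

# hypothetical pure-push/pure-pop fragment: push 0s at p, pop them on 1s
DELTA_FRAG = {
    ("p", "0", ""):  {("p", "0")},
    ("p", "1", "0"): {("q", "")},
    ("q", "1", "0"): {("q", "")},
}
RULES_G = pda_to_cfg_rules({"p", "q"}, {"0"}, {"0", "1"}, DELTA_FRAG)
```

On the fragment, part 1 yields exactly the rules Apq → 0 App 1 and Apq → 0 Apq 1, mirroring how a pushed 0 must eventually be popped by a matching 1.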
You may gain some insight for this construction from the following figures.
FIGURE 2.28
PDA computation corresponding to the rule Apq → AprArq
FIGURE 2.29
PDA computation corresponding to the rule Apq → aArsb
Now we prove that this construction works by demonstrating that Apq gener-
ates x if and only if (iff) x can bring P from p with empty stack to q with empty
stack. We consider each direction of the iff as a separate claim.
CLAIM 2.30
If Apq generates x, then x can bring P from p with empty stack to q with empty
stack.
We prove this claim by induction on the number of steps in the derivation of
x from Apq.
Basis: The derivation has 1 step.
A derivation with a single step must use a rule whose right-hand side contains no
variables. The only rules in G where no variables occur on the right-hand side
are App → ε. Clearly, input ε takes P from p with empty stack to p with empty
stack, so the basis is proved.
Induction step: Assume true for derivations of length at most k, where k ≥ 1,
and prove true for derivations of length k + 1.
Suppose that Apq ⇒* x with k + 1 steps. The first step in this derivation is either
Apq ⇒ aArsb or Apq ⇒ AprArq. We handle these two cases separately.
In the first case, consider the portion y of x that Ars generates, so x = ayb.
Because Ars ⇒* y with k steps, the induction hypothesis tells us that P can go
from r on empty stack to s on empty stack. Because Apq → aArsb is a rule of
G, δ(p, a, ε) contains (r, u) and δ(s, b, u) contains (q, ε), for some stack symbol
u. Hence, if P starts at p with empty stack, after reading a it can go to state r
and push u onto the stack. Then reading string y can bring it to s and leave u
on the stack. Then after reading b it can go to state q and pop u off the stack.
Therefore, x can bring it from p with empty stack to q with empty stack.
In the second case, consider the portions y and z of x that Apr and Arq re-
spectively generate, so x = yz. Because Apr ⇒* y in at most k steps and Arq ⇒* z
in at most k steps, the induction hypothesis tells us that y can bring P from p
to r, and z can bring P from r to q, with empty stacks at the beginning and
end. Hence x can bring it from p with empty stack to q with empty stack. This
completes the induction step.
CLAIM 2.31
If x can bring P from p with empty stack to q with empty stack, Apq generates x.
We prove this claim by induction on the number of steps in the computation
of P that goes from p to q with empty stacks on input x.
Basis: The computation has 0 steps.
If a computation has 0 steps, it starts and ends at the same state, say p. So we must show that A_pp ⇒* x. In 0 steps, P cannot read any characters, so x = ε. By construction, G has the rule A_pp → ε, so the basis is proved.
Induction step: Assume true for computations of length at most k, where k ≥ 0, and prove true for computations of length k + 1.
Suppose that P has a computation wherein x brings p to q with empty stacks in k + 1 steps. Either the stack is empty only at the beginning and end of this computation, or it becomes empty elsewhere, too.
In the first case, the symbol that is pushed at the first move must be the same as the symbol that is popped at the last move. Call this symbol u. Let a be the input read in the first move, b be the input read in the last move, r be the state after the first move, and s be the state before the last move. Then δ(p, a, ε) contains (r, u) and δ(s, b, u) contains (q, ε), and so rule A_pq → aA_rs b is in G.
Let y be the portion of x without a and b, so x = ayb. Input y can bring P from r to s without touching the symbol u that is on the stack, and so P can go from r with an empty stack to s with an empty stack on input y. We have removed the first and last steps of the k + 1 steps in the original computation on x, so the computation on y has (k + 1) − 2 = k − 1 steps. Thus the induction hypothesis tells us that A_rs ⇒* y. Hence A_pq ⇒* x.
In the second case, let r be a state where the stack becomes empty other than at the beginning or end of the computation on x. Then the portions of the computation from p to r and from r to q each contain at most k steps. Say that y is the input read during the first portion and z is the input read during the second portion. The induction hypothesis tells us that A_pr ⇒* y and A_rq ⇒* z. Because rule A_pq → A_pr A_rq is in G, A_pq ⇒* x, and the proof is complete.
That completes the proof of Lemma 2.27 and of Theorem 2.20.
We have just proved that pushdown automata recognize the class of context-
free languages. This proof allows us to establish a relationship between the regular languages and the context-free languages. Because every regular language is recognized by a finite automaton and every finite automaton is automatically a pushdown automaton that simply ignores its stack, we now know that every regular language is also a context-free language.
COROLLARY 2.32
Every regular language is context free.
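The construction behind this corollary can be made concrete: a finite automaton becomes a pushdown automaton whose moves never consult or change the stack. The sketch below is illustrative only; the particular DFA (over {0,1}, accepting strings ending in 1) and the dictionary encoding of transitions are assumptions of this sketch, not machines from the text.

```python
# Sketch: any DFA is a PDA that simply ignores its stack.
# The example DFA and the dict encoding are hypothetical illustrations.

def run_dfa(delta, start, accept, w):
    """Run a DFA given as a dict delta[(state, symbol)] -> state."""
    q = start
    for c in w:
        q = delta[(q, c)]
    return q in accept

def dfa_to_pda(delta):
    """View each DFA move as a PDA move that pops nothing and pushes nothing.
    A PDA transition maps (state, input, stack-top) to (state, push); here
    every move uses stack-top '' (epsilon) and pushes ''."""
    return {(q, c, ''): (r, '') for (q, c), r in delta.items()}

def run_pda_ignoring_stack(pda_delta, start, accept, w):
    q, stack = start, []
    for c in w:
        q, push = pda_delta[(q, c, '')]  # never inspects the stack
        # push is '' on every move, so the stack stays empty throughout
    return q in accept

dfa = {('a', '0'): 'a', ('a', '1'): 'b',
       ('b', '0'): 'a', ('b', '1'): 'b'}
for w in ['', '1', '10', '011']:
    assert run_dfa(dfa, 'a', {'b'}, w) == \
           run_pda_ignoring_stack(dfa_to_pda(dfa), 'a', {'b'}, w)
print('DFA and stack-ignoring PDA agree on all test strings')
```

Because the converted machine's moves are independent of the stack, it accepts exactly the strings the original DFA accepts.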
FIGURE 2.33
Relationship of the regular and context-free languages
2.3
NON-CONTEXT-FREE LANGUAGES
In this section we present a technique for proving that certain languages are not context free. Recall that in Section 1.4 we introduced the pumping lemma for showing that certain languages are not regular. Here we present a similar pumping lemma for context-free languages. It states that every context-free language has a special value called the pumping length such that all longer strings in the language can be “pumped.” This time the meaning of pumped is a bit more complex. It means that the string can be divided into five parts so that the second and the fourth parts may be repeated together any number of times and the resulting string still remains in the language.
THE PUMPING LEMMA FOR CONTEXT-FREE LANGUAGES
THEOREM 2.34
Pumping lemma for context-free languages. If A is a context-free language, then there is a number p (the pumping length) where, if s is any string in A of length at least p, then s may be divided into five pieces s = uvxyz satisfying the conditions
1. for each i ≥ 0, u v^i x y^i z ∈ A,
2. |vy| > 0, and
3. |vxy| ≤ p.
When s is being divided into uvxyz, condition 2 says that either v or y is not the empty string. Otherwise the theorem would be trivially true. Condition 3
states that the pieces v, x, and y together have length at most p. This technical condition sometimes is useful in proving that certain languages are not context free.
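To see the three conditions in action on a language that is context free, consider A = {0^n 1^n | n ≥ 0}. The sketch below checks that one division satisfies all three conditions; the value p = 4 and the particular division are assumptions chosen for illustration, not values fixed by the lemma.

```python
# Illustrative check of the pumping lemma's conditions on A = {0^n 1^n}.
# p = 4 is an assumed pumping length for demonstration only.
p = 4
s = '0' * p + '1' * p

# Divide s = uvxyz with v and y straddling the 0/1 boundary.
u, v, x, y, z = '0' * (p - 1), '0', '', '1', '1' * (p - 1)
assert u + v + x + y + z == s

def in_A(w):
    """Membership test for {0^n 1^n | n >= 0}."""
    n = len(w) // 2
    return len(w) % 2 == 0 and w == '0' * n + '1' * n

# Condition 1: u v^i x y^i z stays in A for every i >= 0.
for i in range(6):
    assert in_A(u + v * i + x + y * i + z)
# Condition 2: |vy| > 0.
assert len(v + y) > 0
# Condition 3: |vxy| <= p.
assert len(v + x + y) <= p
print('all three conditions hold for this division')
```

Pumping v and y together adds matching 0s and 1s, which is why this division keeps every pumped string inside A.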
PROOF IDEA Let A be a CFL and let G be a CFG that generates it. We must show that any sufficiently long string s in A can be pumped and remain in A. The idea behind this approach is simple.
Let s be a very long string in A. (We make clear later what we mean by “very long.”) Because s is in A, it is derivable from G and so has a parse tree. The parse tree for s must be very tall because s is very long. That is, the parse tree must contain some long path from the start variable at the root of the tree to one of the terminal symbols at a leaf. On this long path, some variable symbol R must repeat because of the pigeonhole principle. As the following figure shows, this repetition allows us to replace the subtree under the second occurrence of R with the subtree under the first occurrence of R and still get a legal parse tree. Therefore, we may cut s into five pieces uvxyz as the figure indicates, and we may repeat the second and fourth pieces and obtain a string still in the language. In other words, u v^i x y^i z is in A for any i ≥ 0.
FIGURE 2.35
Surgery on parse trees
Let’s now turn to the details to obtain all three conditions of the pumping
lemma. We also show how to calculate the pumping length p.
PROOF Let G be a CFG for CFL A. Let b be the maximum number of symbols in the right-hand side of a rule (assume at least 2). In any parse tree using this grammar, we know that a node can have no more than b children. In other words, at most b leaves are 1 step from the start variable; at most b^2 leaves are within 2 steps of the start variable; and at most b^h leaves are within h steps of the start variable. So, if the height of the parse tree is at most h, the length of the string generated is at most b^h. Conversely, if a generated string is at least b^h + 1 long, each of its parse trees must be at least h + 1 high.
Say |V| is the number of variables in G. We set p, the pumping length, to be b^{|V|+1}. Now if s is a string in A and its length is p or more, its parse tree must be at least |V| + 1 high, because b^{|V|+1} ≥ b^{|V|} + 1.
To see how to pump any such string s, let τ be one of its parse trees. If s has several parse trees, choose τ to be a parse tree that has the smallest number of nodes. We know that τ must be at least |V| + 1 high, so its longest path from the root to a leaf has length at least |V| + 1. That path has at least |V| + 2 nodes; one at a terminal, the others at variables. Hence that path has at least |V| + 1 variables. With G having only |V| variables, some variable R appears more than once on that path. For convenience later, we select R to be a variable that repeats among the lowest |V| + 1 variables on this path.
We divide s into uvxyz according to Figure 2.35. Each occurrence of R has a subtree under it, generating a part of the string s. The upper occurrence of R has a larger subtree and generates vxy, whereas the lower occurrence generates just x with a smaller subtree. Both of these subtrees are generated by the same variable, so we may substitute one for the other and still obtain a valid parse tree. Replacing the smaller by the larger repeatedly gives parse trees for the strings u v^i x y^i z for each i > 1. Replacing the larger by the smaller generates the string uxz. That establishes condition 1 of the lemma. We now turn to conditions 2 and 3.
To get condition 2, we must be sure that v and y are not both ε. If they were, the parse tree obtained by substituting the smaller subtree for the larger would have fewer nodes than τ does and would still generate s. This result isn't possible because we had already chosen τ to be a parse tree for s with the smallest number of nodes. That is the reason for selecting τ in this way.
In order to get condition 3, we need to be sure that vxy has length at most p. In the parse tree for s the upper occurrence of R generates vxy. We chose R so that both occurrences fall within the bottom |V| + 1 variables on the path, and we chose the longest path in the parse tree, so the subtree where R generates vxy is at most |V| + 1 high. A tree of this height can generate a string of length at most b^{|V|+1} = p.
For some tips on using the pumping lemma to prove that languages are not context free, review the text preceding Example 1.73 (page 80) where we discuss the related problem of proving nonregularity with the pumping lemma for regular languages.
EXAMPLE 2.36
Use the pumping lemma to show that the language B = {a^n b^n c^n | n ≥ 0} is not context free.
We assume that B is a CFL and obtain a contradiction. Let p be the pumping length for B that is guaranteed to exist by the pumping lemma. Select the string s = a^p b^p c^p. Clearly s is a member of B and of length at least p. The pumping lemma states that s can be pumped, but we show that it cannot. In other words, we show that no matter how we divide s into uvxyz, one of the three conditions of the lemma is violated.
First, condition 2 stipulates that either v or y is nonempty. Then we consider one of two cases, depending on whether substrings v and y contain more than one type of alphabet symbol.
1. When both v and y contain only one type of alphabet symbol, v does not contain both a's and b's or both b's and c's, and the same holds for y. In this case, the string u v^2 x y^2 z cannot contain equal numbers of a's, b's, and c's. Therefore, it cannot be a member of B. That violates condition 1 of the lemma and is thus a contradiction.
2. When either v or y contains more than one type of symbol, u v^2 x y^2 z may contain equal numbers of the three alphabet symbols but not in the correct order. Hence it cannot be a member of B and a contradiction occurs.
One of these cases must occur. Because both cases result in a contradiction, a contradiction is unavoidable. So the assumption that B is a CFL must be false. Thus we have proved that B is not a CFL.
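The case analysis above can be checked exhaustively for a small value. The sketch below assumes p = 3 (so the search is tiny) and tries every division of s = a^p b^p c^p that satisfies conditions 2 and 3, confirming that pumping with i = 2 always produces a string outside B.

```python
# Exhaustive check for B = {a^n b^n c^n}: with p = 3 (an assumption made
# only to keep the search small), no division s = uvxyz with |vy| > 0 and
# |vxy| <= p keeps u v^2 x y^2 z inside B.
p = 3
s = 'a' * p + 'b' * p + 'c' * p

def in_B(w):
    """Membership test for {a^n b^n c^n | n >= 0}."""
    n = len(w) // 3
    return len(w) % 3 == 0 and w == 'a' * n + 'b' * n + 'c' * n

failures = 0
n = len(s)
for i in range(n + 1):            # s = s[:i] + s[i:j] + s[j:k] + s[k:l] + s[l:]
    for j in range(i, n + 1):     #       u       v        x        y       z
        for k in range(j, n + 1):
            for l in range(k, n + 1):
                u, v, x, y, z = s[:i], s[i:j], s[j:k], s[k:l], s[l:]
                if len(v + y) == 0 or len(v + x + y) > p:
                    continue      # division violates condition 2 or 3
                if in_B(u + v * 2 + x + y * 2 + z):
                    failures += 1
assert failures == 0
print('every legal division fails condition 1 at i = 2')
```

Because |vxy| ≤ p, the window vxy touches at most two symbol types, so pumping leaves at least one count unchanged while lengthening the string, exactly as the proof's case 1 argues.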
EXAMPLE 2.37
Let C = {a^i b^j c^k | 0 ≤ i ≤ j ≤ k}. We use the pumping lemma to show that C is not a CFL. This language is similar to language B in Example 2.36, but proving that it is not context free is a bit more complicated.
Assume that C is a CFL and obtain a contradiction. Let p be the pumping length given by the pumping lemma. We use the string s = a^p b^p c^p that we used earlier, but this time we must “pump down” as well as “pump up.” Let s = uvxyz and again consider the two cases that occurred in Example 2.36.
1. When both v and y contain only one type of alphabet symbol, v does not contain both a's and b's or both b's and c's, and the same holds for y. Note that the reasoning used previously in case 1 no longer applies. The reason is that C contains strings with unequal numbers of a's, b's, and c's as long as the numbers are not decreasing. We must analyze the situation more carefully to show that s cannot be pumped. Observe that because v and y contain only one type of alphabet symbol, one of the symbols a, b, or c doesn't appear in v or y. We further subdivide this case into three subcases according to which symbol does not appear.
a. The a's do not appear. Then we try pumping down to obtain the string u v^0 x y^0 z = uxz. That contains the same number of a's as s does, but it contains fewer b's or fewer c's. Therefore, it is not a member of C, and a contradiction occurs.
b. The b's do not appear. Then either a's or c's must appear in v or y because both can't be the empty string. If a's appear, the string u v^2 x y^2 z contains more a's than b's, so it is not in C. If c's appear, the string u v^0 x y^0 z contains more b's than c's, so it is not in C. Either way, a contradiction occurs.
c. The c's do not appear. Then the string u v^2 x y^2 z contains more a's or more b's than c's, so it is not in C, and a contradiction occurs.
2. When either v or y contains more than one type of symbol, u v^2 x y^2 z will not contain the symbols in the correct order. Hence it cannot be a member of C, and a contradiction occurs.
Thus we have shown that s cannot be pumped in violation of the pumping lemma and that C is not context free.
EXAMPLE 2.38
Let D = {ww | w ∈ {0,1}*}. Use the pumping lemma to show that D is not a CFL. Assume that D is a CFL and obtain a contradiction. Let p be the pumping length given by the pumping lemma.
This time choosing string s is less obvious. One possibility is the string 0^p 1 0^p 1. It is a member of D and has length greater than p, so it appears to be a good candidate. But this string can be pumped by dividing it as follows, so it is not adequate for our purposes: take u = 0^{p−1}, v = 0, x = 1, y = 0, and z = 0^{p−1} 1. Then u v^i x y^i z = 0^{p−1+i} 1 0^{p−1+i} 1, which remains of the form ww for every i ≥ 0.
Let's try another candidate for s. Intuitively, the string 0^p 1^p 0^p 1^p seems to capture more of the “essence” of the language D than the previous candidate did. In fact, we can show that this string does work, as follows.
We show that the string s = 0^p 1^p 0^p 1^p cannot be pumped. This time we use condition 3 of the pumping lemma to restrict the way that s can be divided. It says that we can pump s by dividing s = uvxyz, where |vxy| ≤ p.
First, we show that the substring vxy must straddle the midpoint of s. Otherwise, if the substring occurs only in the first half of s, pumping s up to u v^2 x y^2 z moves a 1 into the first position of the second half, and so it cannot be of the form ww. Similarly, if vxy occurs in the second half of s, pumping s up to u v^2 x y^2 z moves a 0 into the last position of the first half, and so it cannot be of the form ww.
But if the substring vxy straddles the midpoint of s, when we try to pump s down to uxz it has the form 0^p 1^i 0^j 1^p, where i and j cannot both be p. This string is not of the form ww. Thus s cannot be pumped, and D is not a CFL.
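The straddling argument can likewise be checked by brute force. The sketch below assumes p = 2 (an assumption so the search stays tiny) and confirms that every legal division of s = 0^p 1^p 0^p 1^p fails at i = 0 or i = 2.

```python
# Exhaustive check for D = {ww}: with p = 2 (assumed for a small search),
# every division of s = 0^p 1^p 0^p 1^p with |vy| > 0 and |vxy| <= p
# fails at i = 0 (pump down) or i = 2 (pump up).
p = 2
s = ('0' * p + '1' * p) * 2

def in_D(w):
    """Membership test for {ww | w in {0,1}*}."""
    h = len(w) // 2
    return len(w) % 2 == 0 and w[:h] == w[h:]

n = len(s)
for a in range(n + 1):
    for b in range(a, n + 1):
        for c in range(b, n + 1):
            for d in range(c, n + 1):
                u, v, x, y, z = s[:a], s[a:b], s[b:c], s[c:d], s[d:]
                if len(v + y) == 0 or len(v + x + y) > p:
                    continue
                assert not (in_D(u + x + z) and
                            in_D(u + v * 2 + x + y * 2 + z))
print('no division of 0^p 1^p 0^p 1^p survives both pump-down and pump-up')
```

Running the same search on the rejected candidate 0^p 1 0^p 1 would find surviving divisions (the one shown above), which is exactly why that string is inadequate.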
2.4
DETERMINISTIC CONTEXT-FREE LANGUAGES
As you recall, deterministic finite automata and nondeterministic finite automata
are equivalent in language recognition power. In contrast, nondeterministic
pushdown automata are more powerful than their deterministic counterparts.
We will show that certain context-free languages cannot be recognized by deterministic PDAs; these languages require nondeterministic PDAs. The languages that are recognizable by deterministic pushdown automata (DPDAs) are called deterministic context-free languages (DCFLs). This subclass of the context-free languages is relevant to practical applications, such as the design of parsers in compilers for programming languages, because the parsing problem is generally easier for DCFLs than for CFLs. This section gives a short overview of this important and beautiful subject.
In defining DPDAs, we conform to the basic principle of determinism: at each step of its computation, the DPDA has at most one way to proceed according to its transition function. Defining DPDAs is more complicated than defining DFAs because DPDAs may read an input symbol without popping a stack symbol, and vice versa. Accordingly, we allow ε-moves in the DPDA's transition function even though ε-moves are prohibited in DFAs. These ε-moves take two forms: ε-input moves corresponding to δ(q, ε, x), and ε-stack moves corresponding to δ(q, a, ε). A move may combine both forms, corresponding to δ(q, ε, ε). If a DPDA can make an ε-move in a certain situation, it is prohibited from making a move in that same situation that involves processing a symbol instead of ε. Otherwise multiple valid computation branches might occur, leading to nondeterministic behavior. The formal definition follows.
DEFINITION 2.39
A deterministic pushdown automaton is a 6-tuple (Q, Σ, Γ, δ, q0, F), where Q, Σ, Γ, and F are all finite sets, and
1. Q is the set of states,
2. Σ is the input alphabet,
3. Γ is the stack alphabet,
4. δ: Q × Σ_ε × Γ_ε → (Q × Γ_ε) ∪ {∅} is the transition function,
5. q0 ∈ Q is the start state, and
6. F ⊆ Q is the set of accept states.
The transition function δ must satisfy the following condition. For every q ∈ Q, a ∈ Σ, and x ∈ Γ, exactly one of the values
δ(q, a, x), δ(q, a, ε), δ(q, ε, x), and δ(q, ε, ε)
is not ∅.
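The condition on δ can be verified mechanically for any finite transition table. The sketch below is illustrative: the dictionary encoding (absent keys playing the role of ∅) and the particular states and symbols are assumptions of this sketch, not the book's notation.

```python
# Sketch: check the DPDA determinism condition. A transition table is a
# dict mapping (state, input-or-'', stacktop-or-'') to (state, push);
# absent keys stand for the empty set. This encoding is an illustrative
# assumption, not the text's formalism.
EPS = ''

def is_deterministic(delta, states, sigma, gamma):
    """For every q, a, x: exactly one of delta(q,a,x), delta(q,a,eps),
    delta(q,eps,x), delta(q,eps,eps) is defined."""
    for q in states:
        for a in sigma:
            for x in gamma:
                defined = sum(key in delta for key in
                              [(q, a, x), (q, a, EPS),
                               (q, EPS, x), (q, EPS, EPS)])
                if defined != 1:
                    return False
    return True

# A tiny made-up transition fragment in the spirit of {0^n 1^n}:
delta = {
    ('q', '0', EPS): ('q', '$'),      # read 0, push a marker
    ('q', '1', '$'): ('r', EPS),      # read 1, pop a marker
    ('r', '1', '$'): ('r', EPS),
    ('r', '0', EPS): ('dead', EPS),   # out-of-order 0 goes to a dead state
    ('dead', '0', EPS): ('dead', EPS),
    ('dead', '1', EPS): ('dead', EPS),
}
print(is_deterministic(delta, {'q', 'r', 'dead'}, {'0', '1'}, {'$'}))
```

Adding a second entry for the same (state, input, stack-top) situation, or one that combines with an existing ε-move, makes the check fail, mirroring how the condition rules out multiple computation branches.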
The transition function may output either a single move of the form (r, y) or it may indicate no action by outputting ∅. To illustrate these possibilities, let's consider an example. Suppose a DPDA M with transition function δ is in state q, has a as its next input symbol, and has symbol x on the top of its stack. If δ(q, a, x) = (r, y), then M reads a, pops x off the stack, enters state r, and pushes y on the stack. Alternatively, if δ(q, a, x) = ∅, then when M is in state q, it has no move that reads a and pops x. In that case, the condition on δ requires that one of δ(q, ε, x), δ(q, a, ε), or δ(q, ε, ε) is nonempty, and then M moves accordingly. The condition enforces deterministic behavior by preventing the DPDA from taking two different actions in the same situation, such as would be the case if both δ(q, a, x) ≠ ∅ and δ(q, a, ε) ≠ ∅. A DPDA has exactly one legal move in every situation where its stack is nonempty. If the stack is empty, a DPDA can move only if the transition function specifies a move that pops ε. Otherwise the DPDA has no legal move and it rejects without reading the rest of the input.
Acceptance for DPDAs works in the same way it does for PDAs. If a DPDA enters an accept state after it has read the last input symbol of an input string, it accepts that string. In all other cases, it rejects that string. Rejection occurs if the DPDA reads the entire input but doesn't enter an accept state when it is at the end, or if the DPDA fails to read the entire input string. The latter case may arise if the DPDA tries to pop an empty stack or if the DPDA makes an endless sequence of ε-input moves without reading the input past a certain point.
The language of a DPDA is called a deterministic context-free language .
EXAMPLE 2.40
The language {0^n 1^n | n ≥ 0} in Example 2.14 is a DCFL. We can easily modify its PDA M1 to be a DPDA by adding transitions for any missing state, input symbol, and stack symbol combinations to a “dead” state from which acceptance isn't possible.
Examples 2.16 and 2.18 give CFLs {a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k} and {w w^R | w ∈ {0,1}*}, which are not DCFLs. Problems 2.57 and 2.58 show that nondeterminism is necessary for recognizing these languages.
Arguments involving DPDAs tend to be somewhat technical in nature, and though we strive to emphasize the primary ideas behind the constructions, readers may find this section to be more challenging than other sections in the first few chapters. Later material in the book doesn't depend on this section, so it may be skipped if desired.
We'll begin with a technical lemma that will simplify the discussion later on. As noted, DPDAs may reject inputs by failing to read the entire input string, but such DPDAs introduce messy cases. Fortunately, the next lemma shows that we can convert a DPDA into one that avoids this inconvenient behavior.
LEMMA 2.41
Every DPDA has an equivalent DPDA that always reads the entire input string.
PROOF IDEA A DPDA may fail to read the entire input if it tries to pop an empty stack or because it makes an endless sequence of ε-input moves. Call the first situation hanging and the second situation looping. We solve the hanging problem by initializing the stack with a special symbol. If that symbol is later popped from the stack before the end of the input, the DPDA reads to the end of the input and rejects. We solve the looping problem by identifying the looping situations, i.e., those from which no further input symbol is ever read, and reprogramming the DPDA so that it reads and rejects the input instead of looping. We must adjust these modifications to accommodate the case where hanging or looping occurs on the last symbol of the input. If the DPDA enters an accept state at any point after it has read the last symbol, the modified DPDA accepts instead of rejects.
PROOF Let P = (Q, Σ, Γ, δ, q0, F) be a DPDA. First, add a new start state q_start, an additional accept state q_accept, and a new state q_reject, as well as other new states as described. Perform the following changes for every r ∈ Q, a ∈ Σ_ε, and x, y ∈ Γ_ε.
First modify P so that, once it enters an accept state, it remains in accepting states until it reads the next input symbol. Add a new accept state q_a for every q ∈ Q. For each q ∈ Q, if δ(q, ε, x) = (r, y), set δ(q_a, ε, x) = (r_a, y), and then if q ∈ F, also change δ so that δ(q, ε, x) = (r_a, y). For each q ∈ Q and a ∈ Σ, if δ(q, a, x) = (r, y), set δ(q_a, a, x) = (r, y). Let F′ be the set of new and old accept states.
Next, modify P to reject when it tries to pop an empty stack, by initializing the stack with a special new stack symbol $. If P subsequently detects $ while in a non-accepting state, it enters q_reject and scans the input to the end. If P detects $ while in an accept state, it enters q_accept. Then, if any input remains unread, it enters q_reject and scans the input to the end. Formally, set δ(q_start, ε, ε) = (q0, $). For x ∈ Γ and δ(q, a, x) ≠ ∅, if q ∉ F′ then set δ(q, a, $) = (q_reject, ε), and if q ∈ F′ then set δ(q, a, $) = (q_accept, ε). For a ∈ Σ, set δ(q_reject, a, ε) = (q_reject, ε) and δ(q_accept, a, ε) = (q_reject, ε).
Lastly, modify P to reject instead of making an endless sequence of ε-input moves prior to the end of the input. For every q ∈ Q and x ∈ Γ, call (q, x) a looping situation if, when P is started in state q with x ∈ Γ on the top of the stack, it never pops anything below x and it never reads an input symbol. Say the looping situation is accepting if P enters an accept state during its subsequent moves, and otherwise it is rejecting. If (q, x) is an accepting looping situation, set δ(q, ε, x) = (q_accept, ε), whereas if (q, x) is a rejecting looping situation, set δ(q, ε, x) = (q_reject, ε).
For simplicity, we'll assume henceforth that DPDAs read their input to the end.
PROPERTIES OF DCFLS
We'll explore closure and nonclosure properties of the class of DCFLs, and use these to exhibit a CFL that is not a DCFL.
THEOREM 2.42
The class of DCFLs is closed under complementation.
PROOF IDEA Swapping the accept and non-accept states of a DFA yields a new DFA that recognizes the complementary language, thereby proving that the class of regular languages is closed under complementation. The same approach works for DPDAs, except for one problem. The DPDA may accept its input by entering both accept and non-accept states in a sequence of moves at the end of the input string. Interchanging accept and non-accept states would still accept in this case.
We fix this problem by modifying the DPDA to limit when acceptance can occur. For each symbol of the input, the modified DPDA can enter an accept state only when it is about to read the next symbol. In other words, only reading states (states that always read an input symbol) may be accept states. Then, by swapping acceptance and non-acceptance only among these reading states, we invert the output of the DPDA.
PROOF First modify P as described in the proof of Lemma 2.41 and let (Q, Σ, Γ, δ, q0, F) be the resulting machine. This machine always reads the entire input string. Moreover, once it enters an accept state, it remains in accept states until it reads the next input symbol.
In order to carry out the proof idea, we need to identify the reading states. If the DPDA in state q reads an input symbol a ∈ Σ without popping the stack, i.e., δ(q, a, ε) ≠ ∅, designate q to be a reading state. However, if it reads and also pops, the decision to read may depend on the popped symbol, so divide that step into two: a pop and then a read. Thus if δ(q, a, x) = (r, y) for a ∈ Σ and x ∈ Γ, add a new state q_x and modify δ so that δ(q, ε, x) = (q_x, ε) and δ(q_x, a, ε) = (r, y). Designate q_x to be a reading state. The states q_x never pop the stack, so their action is independent of the stack contents. Assign q_x to be an accept state if q ∈ F. Finally, remove the accepting state designation from any state which isn't a reading state. The modified DPDA is equivalent to P, but it enters an accept state at most once per input symbol, when it is about to read the next symbol.
Now, invert which reading states are classified as accepting. The resulting DPDA recognizes the complementary language.
This theorem implies that some CFLs are not DCFLs. Any CFL whose complement isn't a CFL isn't a DCFL. Thus A = {a^i b^j c^k | i ≠ j or j ≠ k, where i, j, k ≥ 0} is a CFL but not a DCFL. Otherwise the complement Ā would be a CFL, so the result of Problem 2.18 would incorrectly imply that Ā ∩ a*b*c* = {a^n b^n c^n | n ≥ 0} is context free.
Problem 2.53 asks you to show that the class of DCFLs isn't closed under other familiar operations such as union, intersection, star, and reversal.
To simplify arguments, we will occasionally consider endmarked inputs whereby the special endmarker symbol ⊣ is appended to the input string. Here we add ⊣ to the DPDA's input alphabet. As we show in the next theorem, adding endmarkers doesn't change the power of DPDAs. However, designing DPDAs on endmarked inputs is often easier because we can take advantage of knowing when the input string ends. For any language A, we write the endmarked language A⊣ to be the collection of strings w⊣ where w ∈ A.
THEOREM 2.43
A is a DCFL if and only if A⊣ is a DCFL.
PROOF IDEA Proving the forward direction of this theorem is routine. Say
DPDA P recognizes A. Then DPDA P′ recognizes A⊣ by simulating P until P′
reads ⊣. At that point, P′ accepts if P had entered an accept state during the
previous symbol. P′ doesn't read any symbols after ⊣.
To prove the reverse direction, let DPDA P recognize A⊣ and construct a
DPDA P′ that recognizes A. As P′ reads its input, it simulates P. Prior to reading
each input symbol, P′ determines whether P would accept if that symbol
were ⊣. If so, P′ enters an accept state. Observe that P may operate the stack
after it reads ⊣, so determining whether it accepts after reading ⊣ may depend
on the stack contents. Of course, P′ cannot afford to pop the entire stack at
every input symbol, so it must determine what P would do after reading ⊣, but
without popping the stack. Instead, P′ stores additional information on the stack
that allows P′ to determine immediately whether P would accept. This information
indicates from which states P would eventually accept while (possibly)
manipulating the stack, but without reading further input.
PROOF We give proof details of the reverse direction only. As we described in
the proof idea, let DPDA P = (Q, Σ∪{⊣}, Γ, δ, q0, F) recognize A⊣ and construct
a DPDA P′ = (Q′, Σ, Γ′, δ′, q0′, F′) that recognizes A. First, modify P so that
each of its moves does exactly one of the following operations: read an input
symbol; push a symbol onto the stack; or pop a symbol from the stack. Making
this modification is straightforward by introducing new states.
P′ simulates P, while maintaining a copy of its stack contents interleaved
with additional information on the stack. Every time P′ pushes one of P's stack
symbols, P′ follows that by pushing a symbol that represents a subset of P's
states. Thus we set Γ′ = Γ ∪ P(Q). The stack in P′ interleaves members of Γ
with members of P(Q). If R ∈ P(Q) is the top stack symbol, then by starting
P in any one of R's states, P will eventually accept without reading any more
input.
2.4 DETERMINISTIC CONTEXT-FREE LANGUAGES 135
Initially, P′ pushes the set R0 on the stack, where R0 contains every state
q such that when P is started in q with an empty stack, it eventually accepts
without reading any input symbols. Then P′ begins simulating P. To simulate a
pop move, P′ first pops and discards the set of states that appears as the top stack
symbol, then it pops again to obtain the symbol that P would have popped at
this point, and uses it to determine the next move of P. Simulating a push move
δ(q, ε, ε) = (r, x), where P pushes x as it goes from state q to state r, goes as
follows. First P′ examines the set of states R on the top of its stack, and then it
pushes x and after that the set S, where q ∈ S if q ∈ F or if δ(q, ε, x) = (r, ε) and
r ∈ R. In other words, S is the set of states that are either accepting immediately,
or that would lead to a state in R after popping x. Lastly, P′ simulates a read
move δ(q, a, ε) = (r, ε), by examining the set R on the top of the stack and
entering an accept state if r ∈ R. If P′ is at the end of the input string when
it enters this state, it will accept the input. If it is not at the end of the input
string, it will continue simulating P, so this accept state must also record P's
state. Thus we create this state as a second copy of P's original state, marking it
as an accept state in P′.
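The interleaved stack that P′ maintains can be pictured as a list alternating between state sets and P's stack symbols. The sketch below illustrates only the bookkeeping; the state names and the set pushed alongside "x" are hypothetical, since computing the correct set S requires P's transition function as described in the proof:

```python
# P′'s stack alternates state sets (members of P(Q)) with P's symbols (members of Γ).
R0 = frozenset({"q3"})      # hypothetical: states from which P accepts on an empty stack
stack = [R0]                # initially P′ pushes R0

def push(symbol, S):
    """Push one of P's stack symbols, then the state set computed for it."""
    stack.append(symbol)
    stack.append(frozenset(S))

def pop():
    """Simulate a pop move: discard the top state set, return P's symbol."""
    stack.pop()             # the set of states on top
    return stack.pop()      # the symbol P would have popped

push("x", {"q1", "q3"})                       # hypothetical set S for symbol x
assert stack[-1] == frozenset({"q1", "q3"})   # the set sits above the symbol
assert pop() == "x"
assert stack == [R0]                          # back to the initial configuration
```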
DETERMINISTIC CONTEXT-FREE GRAMMARS
This section defines deterministic context-free grammars, the counterpart to
deterministic pushdown automata. We will show that these two models are
equivalent in power, provided that we restrict our attention to endmarked
languages, where all strings are terminated with ⊣. Thus the correspondence isn't
quite as strong as we saw in regular expressions and finite automata, or in CFGs
and PDAs, where the generating model and the recognizing model describe
exactly the same class of languages without the need for endmarkers. However, in
the case of DPDAs and DCFGs, the endmarkers are necessary because equivalence
doesn't hold otherwise.
In a deterministic automaton, each step in a computation determines the next
step. The automaton cannot make choices about how it proceeds because only a
single possibility is available at every point. To define determinism in a grammar,
observe that computations in automata correspond to derivations in grammars.
In a deterministic grammar, derivations are constrained, as you will see.
Derivations in CFGs begin with the start variable and proceed "top down"
with a series of substitutions according to the grammar's rules, until the derivation
obtains a string of terminals. For defining DCFGs we take a "bottom up"
approach, by starting with a string of terminals and processing the derivation
in reverse, employing a series of reduce steps until reaching the start variable.
Each reduce step is a reversed substitution, whereby the string of terminals and
variables on the right-hand side of a rule is replaced by the variable on the
corresponding left-hand side. The string replaced is called the reducing string. We
call the entire reversed derivation a reduction. Deterministic CFGs are defined
in terms of reductions that have a certain property.
More formally, if u and v are strings of variables and terminals, write u ↣ v
to mean that v can be obtained from u by a reduce step. In other words, u ↣ v
means the same as v ⇒ u. A reduction from u to v is a sequence

u = u1 ↣ u2 ↣ ··· ↣ uk = v

and we say that u is reducible to v, written u ↣* v. Thus u ↣* v whenever
v ⇒* u. A reduction from u is a reduction from u to the start variable. In a
leftmost reduction, each reducing string is reduced only after all other reducing
strings that lie entirely to its left. With a little thought we can see that a leftmost
reduction is a rightmost derivation in reverse.
Here's the idea behind determinism in CFGs. In a CFG with start variable S
and string w in its language, say that a leftmost reduction of w is

w = u1 ↣ u2 ↣ ··· ↣ uk = S.

First, we stipulate that every ui determines the next reduce step and hence ui+1.
Thus w determines its entire leftmost reduction. This requirement implies only
that the grammar is unambiguous. To get determinism, we need to go further.
In each ui, the next reduce step must be uniquely determined by the prefix of
ui up through and including the reducing string h of that reduce step. In other
words, the leftmost reduce step in ui doesn't depend on the symbols in ui to the
right of its reducing string.
Introducing terminology will help us make this idea precise. Let w be a string
in the language of CFG G, and let ui appear in a leftmost reduction of w. In
the reduce step ui ↣ ui+1, say that rule T → h was applied in reverse. That
means we can write ui = xhy and ui+1 = xTy, where h is the reducing string,
x is the part of ui that appears leftward of h, and y is the part of ui that appears
rightward of h. Pictorially,
ui = x h y = x1···xj h1···hk y1···yl  ↣  x1···xj T y1···yl = x T y = ui+1

FIGURE 2.44
Expanded view of xhy ↣ xTy
We call h, together with its reducing rule T → h, a handle of ui. In other
words, a handle of a string ui that appears in a leftmost reduction of w ∈ L(G) is
the occurrence of the reducing string in ui, together with the reducing rule for
ui in this reduction. Occasionally we associate a handle with its reducing string
only, when we aren't concerned with the reducing rule. A string that appears in
a leftmost reduction of some string in L(G) is called a valid string. We define
handles only for valid strings.
A valid string may have several handles, but only if the grammar is ambiguous.
Unambiguous grammars may generate strings by one parse tree only, and
therefore the leftmost reductions, and hence the handles, are also unique. In
that case, we may refer to the handle of a valid string.
Observe that y, the portion of ui following a handle, is always a string of
terminals because the reduction is leftmost. Otherwise, y would contain a variable
symbol and that could arise only from a previous reduce step whose reducing
string was completely to the right of h. But then the leftmost reduction should
have reduced the handle at an earlier step.
EXAMPLE 2.45
Consider the grammar G1:
R→S|T
S→aSb|ab
T→aTbb|abb
Its language is B ∪ C where B = {a^m b^m | m ≥ 1} and C = {a^m b^2m | m ≥ 1}.
In this leftmost reduction of the string aaabbb ∈ L(G1), the handle at each
step is shown in parentheses:

aa(ab)bb ↣ a(aSb)b ↣ (aSb) ↣ (S) ↣ R.
Similarly, this is a leftmost reduction of the string aaabbbbbb, again with the
handle at each step in parentheses:

aa(abb)bbbb ↣ a(aTbb)bb ↣ (aTbb) ↣ (T) ↣ R.
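The two reductions above can be checked mechanically. The sketch below searches for any reduction of a string to the start variable R by trying every reversed substitution; for G1 these inputs admit only one reduction, so the search necessarily returns the leftmost reduction shown above (the function names are ours, not the book's):

```python
RULES = [("R", "S"), ("R", "T"),
         ("S", "aSb"), ("S", "ab"),
         ("T", "aTbb"), ("T", "abb")]

def reduce_steps(s):
    # every string obtainable from s by one reduce step (reversed substitution)
    for lhs, rhs in RULES:
        i = s.find(rhs)
        while i != -1:
            yield s[:i] + lhs + s[i + len(rhs):]
            i = s.find(rhs, i + 1)

def reduction(s):
    # depth-first search for a reduction from s to the start variable R
    if s == "R":
        return [s]
    for t in reduce_steps(s):
        rest = reduction(t)
        if rest is not None:
            return [s] + rest
    return None

print(reduction("aaabbb"))      # ['aaabbb', 'aaSbb', 'aSb', 'S', 'R']
print(reduction("aaabbbbbb"))   # ['aaabbbbbb', 'aaTbbbb', 'aTbb', 'T', 'R']
```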
In both cases, the leftmost reduction shown happens to be the only reduction
possible; but in other grammars where several reductions may occur, we must
use a leftmost reduction to define the handles. Notice that the handles of aaabbb
and aaabbbbbb are unequal, even though the initial parts of these strings agree.
We'll discuss this point in more detail shortly when we define DCFGs.
A PDA can recognize L(G1) by using its nondeterminism to guess whether its
input is in B or in C. Then, after it pushes the a's on the stack, it pops the a's and
matches each one with b or bb accordingly. Problem 2.55 asks you to show that
L(G1) is not a DCFL. If you try to make a DPDA that recognizes this language,
you'll see that the machine cannot know in advance whether the input is in B or
in C so it doesn't know how to match the a's with the b's. Contrast this grammar
with grammar G2:

R → 1S | 2T
S → aSb | ab
T → aTbb | abb

where the first symbol in the input provides this information. Our definition of
DCFGs must include G2 yet exclude G1.
EXAMPLE 2.46
Let G3 be the following grammar:

S → T⊣
T → T(T) | ε

This grammar illustrates several points. First, it generates an endmarked
language. We will focus on endmarked languages later on when we prove the
equivalence between DPDAs and DCFGs. Second, ε handles may occur in
reductions. In the leftmost reduction of the string ()()⊣ below, the first, second,
and fourth reduce steps each reduce an occurrence of ε to T:

()()⊣ ↣ T()()⊣ ↣ T(T)()⊣ ↣ T()⊣ ↣ T(T)⊣ ↣ T⊣ ↣ S.
Handles play an important role in defining DCFGs because handles determine
reductions. Once we know the handle of a string, we know the next reduce step.
To make sense of the coming definition, keep our goal in mind: we aim to define
DCFGs so that they correspond to DPDAs. We'll establish that correspondence
by showing how to convert DCFGs to equivalent DPDAs, and vice versa. For this
conversion to work, the DPDA needs to find handles so that it can find reductions.
But finding a handle may be tricky. It seems that we need to know a string's next
reduce step to identify its handle, but a DPDA doesn't know the reduction in
advance. We'll solve this by restricting handles in a DCFG so that the DPDA can
find them more easily.
To motivate the definition, consider ambiguous grammars, where some
strings have several handles. Selecting a specific handle may require advance
knowledge of which parse tree derives the string, information that is certainly
unavailable to the DPDA. We'll see that DCFGs are unambiguous so handles
are unique. However, uniqueness alone is unsatisfactory for defining DCFGs as
grammar G1 in Example 2.45 shows.
Why don't unique handles imply that we have a DCFG? The answer is evident
by examining the handles in G1. If w ∈ B, the handle is ab, whereas if w ∈ C, the
handle is abb. Though w determines which of these cases applies, discovering
which of ab or abb is the handle may require examining all of w, and a DPDA
hasn't read the entire input when it needs to select the handle.
In order to define DCFGs that correspond to DPDAs, we impose a stronger
requirement on the handles. The initial part of a valid string, up to and including
its handle, must be sufficient to determine the handle. Thus, if we are reading a
valid string from left to right, as soon as we read the handle we know we have it.
We don't need to read beyond the handle in order to identify the handle. Recall
that the unread part of the valid string contains only terminals because the valid
string has been obtained by a leftmost reduction of an initial string of terminals,
and the unread part hasn't been processed yet. Accordingly, we say that a handle
h of a valid string v = xhy is a forced handle if h is the unique handle in every
valid string xhŷ where ŷ ∈ Σ*.
DEFINITION 2.47
A deterministic context-free grammar is a context-free grammar
such that every valid string has a forced handle.
For simplicity, we'll assume throughout this section on deterministic context-
free languages that the start variable of a CFG doesn't appear on the right-hand
side of any rule and that every variable in a grammar appears in a reduction of
some string in the grammar's language, i.e., grammars contain no useless
variables.
Though our definition of DCFGs is mathematically precise, it doesn't give any
obvious way to determine whether a CFG is deterministic. Next we'll present a
procedure to do exactly that, called the DK-test. We'll also use the construction
underlying the DK-test to enable a DPDA to find handles, when we show how to
convert a DCFG to a DPDA.
The DK-test relies on one simple but surprising fact. For any CFG G we
can construct an associated DFA DK that can identify handles. Specifically, DK
accepts its input z if

1. z is the prefix of some valid string v = zy, and
2. z ends with a handle of v.

Moreover, each accept state of DK indicates the associated reducing rule(s). In
a general CFG, multiple reducing rules may apply, depending on which valid v
extends z. But in a DCFG, as we'll see, each accept state corresponds to exactly
one reducing rule.
We will describe the DK-test after we've presented DK formally and
established its properties, but here's the plan. In a DCFG, all handles are forced. Thus
if zy is a valid string with a prefix z that ends in a handle of zy, that handle is
unique, and it is also the handle for all valid strings zŷ. For these properties
to hold, each of DK's accept states must be associated with a single handle and
hence with a single applicable reducing rule. Moreover, the accept state must
not have an outgoing path that leads to an accept state by reading a string in Σ*.
Otherwise, the handle of zy would not be unique or it would depend on y. In
the DK-test, we construct DK and then conclude that G is deterministic if all of
its accept states have these properties.
To construct DFA DK, we'll construct an equivalent NFA K and convert K to
DK¹ via the subset construction introduced in Theorem 1.39.

¹The name DK is a mnemonic for "deterministic K" but it also stands for Donald Knuth,
who first proposed this idea.

To understand K, first consider an NFA J that performs a simpler task. It accepts every input
string that ends with the right-hand side of any rule. Constructing J is easy. It
guesses which rule to use and it also guesses the point at which to start matching
the input with that rule's right-hand side. As it matches the input, J keeps track
of its progress through the chosen right-hand side. We represent this progress
by placing a dot in the corresponding point in the rule, yielding a dotted rule,
also called an item in some other treatments of this material. Thus for each rule
B → u1u2···uk with k symbols on the right-hand side, we get k+1 dotted
rules:

B → .u1u2···uk
B → u1.u2···uk
...
B → u1u2···.uk
B → u1u2···uk.
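The k+1 dotted rules for a rule can be enumerated directly by sliding the dot across the right-hand side (a small illustrative helper, not from the text):

```python
def dotted_rules(lhs, rhs):
    # a rule with k symbols on its right-hand side yields k+1 dotted rules
    return [f"{lhs} -> {rhs[:i]}.{rhs[i:]}" for i in range(len(rhs) + 1)]

print(dotted_rules("B", "uvw"))
# ['B -> .uvw', 'B -> u.vw', 'B -> uv.w', 'B -> uvw.']
```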
Each of these dotted rules corresponds to one state of J. We indicate the state
associated with the dotted rule B → u.v by writing it in a box, [B → u.v]. The
accept states, written with double boxes [[B → u.]], correspond to the completed
rules that have the dot at the end. We add a separate start state with a self-loop
on all symbols and an ε-move to [B → .u] for each rule B → u. Thus J accepts
if the match completes successfully at the end of the input. If a mismatch occurs
or if the end of the match doesn't coincide with the end of the input, this branch
of J's computation rejects.
NFA K operates similarly, but it is more judicious about choosing a rule for
matching. Only potential reducing rules are allowed. Like J, its states
correspond to all dotted rules. It has a special start state that has an ε-move to
[S1 → .u] for every rule involving the start variable S1. On each branch of its
computation, K matches a potential reducing rule with a substring of the input.
If that rule's right-hand side contains a variable, K may nondeterministically
switch to some rule that expands that variable. Lemma 2.48 formalizes this idea.
First we describe K in detail.
The transitions come in two varieties: shift-moves and ε-moves. The shift-
moves appear for every a that is a terminal or variable, and every rule B → uav:

[B → u.av] --a--> [B → ua.v]

The ε-moves appear for all rules B → uCv and C → r:

[B → u.Cv] --ε--> [C → .r]

The accept states are all [[B → u.]] corresponding to a completed rule. Accept
states have no outgoing transitions and are written with a double box.
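Following this description, K's states and transitions can be tabulated mechanically. The sketch below does so for a small illustrative grammar (R → S, S → aSb | ab); representing states as (lhs, rhs, dot-position) tuples is our own convention:

```python
RULES = {"R": ["S"], "S": ["aSb", "ab"]}
START = "R"

# states: a special start state plus one state per dotted rule (lhs, rhs, dot position)
items = [(lhs, rhs, pos)
         for lhs, alts in RULES.items()
         for rhs in alts
         for pos in range(len(rhs) + 1)]

shift_moves, eps_moves = [], []
for lhs, rhs, pos in items:
    if pos < len(rhs):
        a = rhs[pos]
        # shift-move over a (a terminal or a variable)
        shift_moves.append(((lhs, rhs, pos), a, (lhs, rhs, pos + 1)))
        if a in RULES:  # dot precedes variable C: eps-move to C -> .r for each rule C -> r
            eps_moves += [((lhs, rhs, pos), (a, r, 0)) for r in RULES[a]]

# eps-moves from the special start state to S1 -> .u for every start rule
eps_moves += [("start", (START, r, 0)) for r in RULES[START]]

# accept states: completed rules, with the dot at the end
accepts = [it for it in items if it[2] == len(it[1])]
print(len(items), len(shift_moves), len(eps_moves), len(accepts))  # 9 6 5 3
```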
The next lemma and its corollary prove that K accepts all strings z that end
with handles for some valid extension of z. Because K is nondeterministic, we
say that it "may" enter a state to mean that K does enter that state on some
branch of its nondeterminism.
LEMMA 2.48
K may enter state [T → u.v] on reading input z iff z = xu and xuvy is a valid
string with handle uv and reducing rule T → uv, for some y ∈ Σ*.
PROOF IDEA K operates by matching a selected rule's right-hand side with
a portion of the input. If that match completes successfully, it accepts. If that
right-hand side contains a variable C, either of two situations may arise. If C
is the next input symbol, then matching the selected rule simply continues. If
C has been expanded, the input will contain symbols derived from C, so K
nondeterministically selects a substitution rule for C and starts matching from
the beginning of the right-hand side of that rule. It accepts when the right-hand
side of the currently selected rule has been matched completely.
PROOF First we prove the forward direction. Assume that K on input z enters
[T → u.v]. Examine K's path from its start state to [T → u.v]. Think of the path
as runs of shift-moves separated by ε-moves. The shift-moves are transitions
between states sharing the same rule, shifting the dot rightward over symbols
read from the input. In the ith run, say that the rule is Si → uiSi+1vi, where
Si+1 is the variable expanded in the next run. The penultimate run is for rule
Sl → ulTvl, and the final run has rule T → uv.
Input z must then equal u1u2···ulu = xu because the strings ui and u were
the shift-move symbols read from the input. Letting y′ = vl···v2v1, we see that
xuvy′ is derivable in G because the rules above give the derivation as shown in
the parse tree illustrated in Figure 2.49.

[Figure omitted: parse tree rooted at S1, with the path S1, S2, S3, ..., T deriving the string x u v y′]
FIGURE 2.49
Parse tree leading to xuvy′
To obtain a valid string, fully expand all variables that appear in y′ until each
variable derives some string of terminals, and call the resulting string y. The
string xuvy is valid because it occurs in a leftmost reduction of w ∈ L(G), a
string of terminals obtained by fully expanding all variables in xuvy.
As is evident from the figure below, uv is the handle in the reduction and its
reducing rule is T → uv.

[Figure omitted: the same parse tree, now deriving the valid string x u v y]
FIGURE 2.50
Parse tree leading to valid string xuvy with handle uv
Now we prove the reverse direction of the lemma. Assume that string xuvy is
a valid string with handle uv and reducing rule T → uv. Show that K on input
xu may enter state [T → u.v].
The parse tree for xuvy appears in the preceding figure. It is rooted at the
start variable S1 and it must contain the variable T because T → uv is the first
reduce step in the reduction of xuvy. Let S2, ..., Sl be the variables on the path
from S1 to T as shown. Note that all variables in the parse tree that appear
leftward of this path must be unexpanded, or else uv wouldn't be the handle.
In this parse tree, each Si leads to Si+1 by some rule Si → uiSi+1vi. Thus
the grammar must contain the following rules for some strings ui and vi.
S1→u1S2v1
S2→u2S3v2
...
Sl→ulTvl
T→uv
K contains the following path from its start state to state [T → u.v] on reading
input z = xu. First, K makes an ε-move to [S1 → .u1S2v1]. Then, while reading
the symbols of u1, it performs the corresponding shift-moves until it enters
[S1 → u1.S2v1] at the end of u1. Then it makes an ε-move to [S2 → .u2S3v2] and
continues with shift-moves on reading u2 until it reaches [S2 → u2.S3v2], and so
on. After reading ul it enters [Sl → ul.Tvl], which leads by an ε-move to [T → .uv],
and finally, after reading u, it is in [T → u.v].
The following corollary shows that K accepts all strings ending with a handle
of some valid extension. It follows from Lemma 2.48 by taking u = h and v = ε.
COROLLARY 2.51
K may enter accept state [[T → h.]] on input z iff z = xh and h is a handle of
some valid string xhy with reducing rule T → h.
Finally, we convert NFA K to DFA DK by using the subset construction in
the proof of Theorem 1.39 on page 55 and then removing all states that are
unreachable from the start state. Each of DK's states thus contains one or more
dotted rules. Each accept state contains at least one completed rule. We can
apply Lemma 2.48 and Corollary 2.51 to DK by referring to the states that contain
the indicated dotted rules.
Now we are ready to describe the DK-test.
Starting with a CFG G, construct the associated DFA DK. Determine whether
G is deterministic by examining DK's accept states. The DK-test stipulates that
every accept state contains

1. exactly one completed rule, and
2. no dotted rule in which a terminal symbol immediately follows the dot,
i.e., no dotted rule of the form B → u.av for a ∈ Σ.
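The whole pipeline (dotted rules, the NFA K, the subset construction, and the two stipulations) is short enough to sketch in code. Assuming single-character symbols, with variables identified as the keys of the rule table (our own representation, not the book's), running it on the grammars of Example 2.45 confirms that G1 fails the DK-test while G2 passes:

```python
def closure(items, rules):
    # ε-closure: whenever the dot precedes a variable C, add C -> .r for every rule C -> r
    items, stack = set(items), list(items)
    while stack:
        lhs, rhs, pos = stack.pop()
        if pos < len(rhs) and rhs[pos] in rules:
            for r in rules[rhs[pos]]:
                if (rhs[pos], r, 0) not in items:
                    items.add((rhs[pos], r, 0))
                    stack.append((rhs[pos], r, 0))
    return frozenset(items)

def dk_states(rules, start):
    # subset construction (Theorem 1.39) applied to the dotted-rule NFA K
    init = closure({(start, r, 0) for r in rules[start]}, rules)
    states, work = {init}, [init]
    while work:
        state = work.pop()
        for a in {rhs[pos] for _, rhs, pos in state if pos < len(rhs)}:
            nxt = closure({(l, r, p + 1) for l, r, p in state
                           if p < len(r) and r[p] == a}, rules)
            if nxt not in states:
                states.add(nxt)
                work.append(nxt)
    return states

def dk_test(rules, start):
    for state in dk_states(rules, start):
        completed = [it for it in state if it[2] == len(it[1])]
        if not completed:
            continue                      # not an accept state
        if len(completed) > 1:
            return False                  # stipulation 1 violated: two completed rules
        if any(p < len(r) and r[p] not in rules for _, r, p in state):
            return False                  # stipulation 2 violated: terminal after the dot
    return True

G1 = {"R": ["S", "T"], "S": ["aSb", "ab"], "T": ["aTbb", "abb"]}   # Example 2.45
G2 = {"R": ["1S", "2T"], "S": ["aSb", "ab"], "T": ["aTbb", "abb"]}
print(dk_test(G1, "R"), dk_test(G2, "R"))   # False True
```

On G1, the state reached on input ending in ab contains the completed rule S → ab. together with T → ab.b, whose dot precedes the terminal b, so stipulation 2 fails, matching the discussion of Example 2.45.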
THEOREM 2.52
G passes the DK-test iff G is a DCFG.
PROOF IDEA We’ll show that the DK-test passes if and only if all handles
are forced. Equivalently, the test fails iff some handle isn’t forced. First, suppose
that some valid string has an unforced handle. If we run DKon this string,
Corollary 2.51 says that DKenters an accept state at the end of the handle. The
DK-test fails because that accept state has either a second completed rule or an
outgoing path leading to an accept state, where the outgoing path begins with a
terminal symbol. In the latter case, the accept state would contain a dotted rule
with a terminal symbol following the dot.
Conversely, if the DK-test fails because an accept state has two completed
rules, extend the associated string to two valid strings with differing handles at
that point. Similarly, if it has a completed rule and a dotted rule with a terminal
following the dot, employ Lemma 2.48 to get two valid extensions with differing
handles at that point. Constructing the valid extension corresponding to the
second rule is a bit delicate.
PROOF Start with the forward direction. Assume that G isn't deterministic
and show that it fails the DK-test. Take a valid string xhy that has an unforced
handle h. Hence some valid string xhy′ has a different handle ĥ ≠ h, where y′
is a string of terminals. We can thus write xhy′ as xhy′ = x̂ĥŷ.
If xh = x̂ĥ, the reducing rules differ because h and ĥ aren't the same handle.
Therefore, input xh sends DK to a state that contains two completed rules, a
violation of the DK-test.
If xh ≠ x̂ĥ, one of these extends the other. Assume that xh is the proper
prefix of x̂ĥ. The argument is the same with the strings interchanged and y in
place of y′, if x̂ĥ is the shorter string. Let q be the state that DK enters on input
xh. State q must be accepting because h is a handle of xhy. A transition arrow
must exit q because x̂ĥ sends DK to an accept state via q. Furthermore, that
transition arrow is labeled with a terminal symbol, because y′ ∈ Σ+. Here y′ ≠ ε
because x̂ĥ extends xh. Hence q contains a dotted rule with a terminal symbol
immediately following the dot, violating the DK-test.
To prove the reverse direction, assume G fails the DK-test at some accept
state q, and show that G isn't deterministic by exhibiting an unforced handle.
Because q is accepting, it has a completed rule T → h. Let z be a string that
leads DK to q. Then z = xh where some valid string xhy has handle h with
reducing rule T → h, for y ∈ Σ*. Now we consider two cases, depending on
how the DK-test fails.
First, say q has another completed rule B → ĥ. Then some valid string xhy′
must have a different handle ĥ with reducing rule B → ĥ. Therefore, h isn't a
forced handle.
Second, say qcontains a rule B→u.avwhere a∈Σ.B e c a u s e xhtakes DK
toq,w eh a v e xh=ˆxu,w h e r e ˆxuav ˆyis valid and has a handle uavwith reducing
ruleB→uav,f o rs o m e ˆy∈Σ∗.T o s h o w t h a t his unforced, fully expand all
variables in vto get the result v′∈Σ∗,t h e nl e t y′=av′ˆyand notice that y′∈Σ∗.
The following leftmost reduction shows that xhy′is a valid string and his not
the handle.
xhy′=xhav′ˆy=ˆxu a v′ˆy∗/arrowtailrightˆxu a v ˆy/arrowtailrightˆxBˆy∗/arrowtailrightS
where Sis the start variable. We know that ˆxu a v ˆyis valid and we can obtain
ˆxu a v′ˆyfrom it by using a rightmost derivation so ˆxu a v′ˆyis also valid. More-
over, the handle of ˆxu a v′ˆyeither lies inside v′(ifv̸=v′)o ri s uav(ifv=v′).
In either case, the handle includes aor follows aand thus cannot be hbecause h
fully precedes a.H e n c e hisn’t a forced handle.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

2.4 DETERMINISTIC CONTEXT-FREE LANGUAGES 145
When building the DFA DK in practice, a direct construction may be faster than first constructing the NFA K. Begin by adding a dot at the initial point in all rules involving the start variable and place these now-dotted rules into DK's start state. If a dot precedes a variable C in any of these rules, place dots at the initial position in all rules that have C on the left-hand side and add these rules to the state, continuing this process until no new dotted rules are obtained. For any symbol c that follows a dot, add an outgoing edge labeled c to a new state containing the dotted rules obtained by shifting the dot across the c in any of the dotted rules where the dot precedes the c, and add rules corresponding to the rules where a dot precedes a variable as before.
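The direct construction just described can be sketched in code. The following Python sketch is our own illustration (the helper names `closure` and `goto` are not from the text, and `'$'` stands in for the endmarker ⊣): a DK state is a set of dotted rules, completed by chasing dots that precede variables, and a transition shifts the dot across a symbol.

```python
from collections import deque

# A dotted rule is (lhs, rhs, dot): the rule lhs -> rhs with the dot just
# before rhs[dot]. `rules` maps each variable to a list of right-hand sides
# (tuples of symbols); a symbol is a variable exactly when it is a key of
# `rules`, and a terminal otherwise.

def closure(items, rules):
    """Complete a DK state: whenever a dot precedes a variable C, add a
    freshly dotted rule for every rule with C on the left-hand side."""
    state = set(items)
    work = deque(state)
    while work:
        lhs, rhs, dot = work.popleft()
        if dot < len(rhs) and rhs[dot] in rules:      # dot precedes a variable
            for body in rules[rhs[dot]]:
                item = (rhs[dot], body, 0)
                if item not in state:
                    state.add(item)
                    work.append(item)
    return frozenset(state)

def goto(state, symbol, rules):
    """Shift the dot across `symbol` in every rule where the dot precedes it,
    then complete the resulting state."""
    shifted = {(l, r, d + 1) for (l, r, d) in state
               if d < len(r) and r[d] == symbol}
    return closure(shifted, rules)

# The grammar of Example 2.55: S -> T$ and T -> T(T) | e  ('$' for ⊣).
rules = {'S': [('T', '$')], 'T': [('T', '(', 'T', ')'), ()]}
start = closure({('S', ('T', '$'), 0)}, rules)
```

Repeating `goto` on every symbol that follows a dot, and collecting the states reached, yields all of DK; an accept state is any state containing a completed rule.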
EXAMPLE 2.53
Here we illustrate how the DK-test fails for the following grammar.
S → E⊣
E → E+T | T
T → T×a | a

[State diagram of DK omitted.]
FIGURE 2.54
Example of a failed DK-test
Notice the two problematic states at the lower left and the second from the
top right, where an accept state contains a dotted rule where a terminal symbol
follows the dot.

EXAMPLE 2.55
Here is the DFA DK showing that the grammar below is a DCFG.

S → T⊣
T → T(T) | ε

[State diagram of DK omitted.]
FIGURE 2.56
Example of a DK-test that passes
Observe that all accept states satisfy the DK-test conditions.
RELATIONSHIP OF DPDAs AND DCFGs
In this section we will show that DPDAs and DCFGs describe the same class of endmarked languages. First, we will demonstrate how to convert DCFGs to equivalent DPDAs. This conversion works in all cases. Second, we will show how to do the reverse conversion, from DPDAs to equivalent DCFGs. The latter conversion works only for endmarked languages. We restrict the equivalence to endmarked languages, because the models are not equivalent without this restriction. We showed earlier that endmarkers don't affect the class of languages that DPDAs recognize, but they do affect the class of languages that DCFGs generate. Without endmarkers, DCFGs generate only a subclass of the DCFLs: those that are prefix-free (see Problem 2.52). Note that every endmarked language is prefix-free.
THEOREM 2.57
An endmarked language is generated by a deterministic context-free grammar if
and only if it is deterministic context free.

We have two directions to prove. First we will show that every DCFG has an equivalent DPDA. Then we will show that every DPDA that recognizes an endmarked language has an equivalent DCFG. We handle these two directions in separate lemmas.
LEMMA 2.58
Every DCFG has an equivalent DPDA.
PROOF IDEA We show how to convert a DCFG G to an equivalent DPDA P. P uses the DFA DK to operate as follows. It simulates DK on the symbols it reads from the input until DK accepts. As shown in the proof of Theorem 2.52, DK's accept state indicates a specific dotted rule because G is deterministic, and that rule identifies a handle for some valid string extending the input it has seen so far. Moreover, this handle applies to every valid extension because G is deterministic, and in particular it will apply to the full input to P, if that input is in L(G). So P can use this handle to identify the first reduce step for its input string, even though it has read only a part of its input at this point.

How does P identify the second and subsequent reduce steps? One idea is to perform the reduce step directly on the input string, and then run the modified input through DK as we did above. But the input can be neither modified nor reread, so this idea doesn't work. Another approach would be to copy the input to the stack and carry out the reduce step there, but then P would need to pop the entire stack to run the modified input through DK, and so the modified input would not remain available for later steps.

The trick here is to store the states of DK on the stack, instead of storing the input string there. Every time P reads an input symbol and simulates a move in DK, it records DK's state by pushing it on the stack. When it performs a reduce step using reducing rule T → u, it pops |u| states off the stack, revealing the state DK was in prior to reading u. It resets DK to that state, then simulates it on input T and pushes the resulting state on the stack. Then P proceeds by reading and processing input symbols as before.

When P pushes the start variable on the stack, it has found a reduction of its input to the start variable, so it enters an accept state.
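A minimal executable sketch of this proof idea (ours, not the text's) follows. It assumes the grammar is a DCFG, so at most one completed rule ever applies in a reached state; the `closure`/`goto` helpers repeat the direct DK construction, and `'$'` again stands for the endmarker ⊣.

```python
from collections import deque

def closure(items, rules):
    """Complete a DK state by chasing dots that precede variables."""
    state = set(items)
    work = deque(state)
    while work:
        lhs, rhs, dot = work.popleft()
        if dot < len(rhs) and rhs[dot] in rules:
            for body in rules[rhs[dot]]:
                item = (rhs[dot], body, 0)
                if item not in state:
                    state.add(item)
                    work.append(item)
    return frozenset(state)

def goto(state, symbol, rules):
    """DK's transition: shift the dot across `symbol` and complete."""
    shifted = {(l, r, d + 1) for (l, r, d) in state
               if d < len(r) and r[d] == symbol}
    return closure(shifted, rules)

def accepts(tokens, rules, start_var='S'):
    """Simulate the DPDA P of Lemma 2.58: push DK's state after each shift;
    on a completed rule T -> u, pop |u| states and rerun DK on the symbol T."""
    stack = [closure({(start_var, body, 0) for body in rules[start_var]}, rules)]
    i = 0
    while True:
        completed = [(l, r) for (l, r, d) in stack[-1] if d == len(r)]
        if completed:                       # reduce step (unique for a DCFG)
            lhs, rhs = completed[0]
            if lhs == start_var:
                return True                 # reduced to the start variable
            del stack[len(stack) - len(rhs):]    # pop |u| states
            stack.append(goto(stack[-1], lhs, rules))
        elif i < len(tokens):               # shift step
            nxt = goto(stack[-1], tokens[i], rules)
            if not nxt:
                return False                # DK has no move: reject
            stack.append(nxt)
            i += 1
        else:
            return False                    # input exhausted without accepting

# Example 2.55's grammar: S -> T$ and T -> T(T) | e.
rules = {'S': [('T', '$')], 'T': [('T', '(', 'T', ')'), ()]}
```

For instance, `accepts(list("()$"), rules)` shifts `(`, reduces T → ε, shifts `)`, reduces T → T(T), shifts `$`, and accepts on the completed rule S → T$.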
Next we prove the other direction of Theorem 2.57.
LEMMA 2.59
Every DPDA that recognizes an endmarked language has an equivalent DCFG.

PROOF IDEA This proof is a modification of the construction in Lemma 2.27 on page 121 that describes the conversion of a PDA P to an equivalent CFG G.

Here P and G are deterministic. In the proof idea for Lemma 2.27, we altered P to empty its stack and enter a specific accept state qaccept when it accepts. A PDA cannot directly determine that it is at the end of its input, so P uses its nondeterminism to guess that it is in that situation. We don't want to introduce nondeterminism in constructing DPDA P. Instead we use the assumption that L(P) is endmarked. We modify P to empty its stack and enter qaccept when it enters one of its original accept states after it has read the endmarker ⊣.

Next we apply the grammar construction to obtain G. Simply applying the original construction to a DPDA produces a nearly deterministic grammar because the CFG's derivations closely correspond to the DPDA's computations. That grammar fails to be deterministic in one minor, fixable way.
The original construction introduces rules of the form Apq → AprArq, and these may cause ambiguity. These rules cover the case where Apq generates a string that takes P from state p to state q with its stack empty at both ends, and the stack empties midway. The substitution corresponds to dividing the computation at that point. But if the stack empties several times, several divisions are possible. Each of these divisions yields different parse trees, so the resulting grammar is ambiguous. We fix this problem by modifying the grammar to divide the computation only at the very last point where the stack empties midway, thereby removing this ambiguity. For illustration, a similar but simpler situation
occurs in the ambiguous grammar

S → T⊣
T → TT | (T) | ε

which is equivalent to the unambiguous, and deterministic, grammar

S → T⊣
T → T(T) | ε.
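To make the ambiguity concrete, here is a small illustration of our own (not from the text): even the two-symbol string () has two distinct parse trees in the first grammar, because T → TT can be applied and one factor erased by T → ε.

T ⇒ (T) ⇒ ()
T ⇒ TT ⇒ (T)T ⇒ ()T ⇒ ()

The second grammar eliminates the choice: every application of T → T(T) appends a parenthesized group on the right, so each generated string has a unique parse.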
Next we show the modified grammar is deterministic by using the DK-test. The grammar is designed to simulate the DPDA. As we proved in Lemma 2.27, Apq generates exactly those strings on which P goes from state p on empty stack to state q on empty stack. We'll prove G's determinism using P's determinism, so we will find it useful to define P's computation on valid strings to observe its action on handles. Then we can use P's deterministic behavior to show that handles are forced.

PROOF Say that P = (Q, Σ, Γ, δ, q0, {qaccept}) and construct G. The start variable is Aq0,qaccept. The construction on page 121 contains parts 1, 2, and 3, repeated here for convenience.

1. For each p, q, r, s ∈ Q, u ∈ Γ, and a, b ∈ Σε, if δ(p, a, ε) contains (r, u) and δ(s, b, u) contains (q, ε), put the rule Apq → aArsb in G.
2. For each p, q, r ∈ Q, put the rule Apq → AprArq in G.
3. For each p ∈ Q, put the rule App → ε in G.

We modify the construction to avoid introducing ambiguity, by combining rules of types 1 and 2 into a single type 1-2 rule that achieves the same effect.

1-2. For each p, q, r, s, t ∈ Q, u ∈ Γ, and a, b ∈ Σε, if δ(r, a, ε) = (s, u) and δ(t, b, u) = (q, ε), put the rule Apq → ApraAstb in G.
To see that the modified grammar generates the same language, consider any derivation in the original grammar. For each substitution due to a type 2 rule Apq → AprArq, we can assume that r is P's state when it is at the rightmost point where the stack becomes empty midway, by modifying the proof of Claim 2.31 on page 123 to select r in this way. Then the subsequent substitution of Arq must expand it using a type 1 rule Arq → aAstb. We can combine these two substitutions into a single type 1-2 rule Apq → ApraAstb.

Conversely, in a derivation using the modified grammar, if we replace each type 1-2 rule Apq → ApraAstb by the type 2 rule Apq → AprArq followed by the type 1 rule Arq → aAstb, we get the same result.
Now we use the DK-test to show that G is deterministic. To do that, we'll analyze how P operates on valid strings by extending its input alphabet and transition function to process variable symbols in addition to terminal symbols. We add all symbols Apq to P's input alphabet and we extend its transition function δ by defining δ(p, Apq, ε) = (q, ε). Set all other transitions involving Apq to ∅. To preserve P's deterministic behavior, if P reads Apq from the input, then disallow an ε-input move.

The following claim applies to a derivation of any string w in L(G) such as

Aq0,qaccept = v0 ⇒ v1 ⇒ ··· ⇒ vi ⇒ ··· ⇒ vk = w.
CLAIM 2.60
If P reads vi containing a variable Apq, it enters state p just prior to reading Apq.

The proof uses induction on i, the number of steps to derive vi from Aq0,qaccept.

Basis: i = 0.
In this case, vi = Aq0,qaccept and P starts in state q0, so the basis is true.

Induction step: Assume the claim for i and prove it for i + 1.
First consider the case where vi = xApqy and Apq is the variable substituted in the step vi ⇒ vi+1. The induction hypothesis implies that P enters state p after it reads x, prior to reading symbol Apq. According to G's construction, the substitution rules may be of two types:

1. Apq → ApraAstb, or
2. App → ε.

Thus either vi+1 = xApraAstby or vi+1 = xy, depending on which type of rule was used. In the first case, when P reads ApraAstb in vi+1, we know it starts in state p, because it has just finished reading x. As P reads ApraAstb in vi+1, it enters the sequence of states r, s, t, and q, due to the substitution rule's construction. Therefore, it enters state p just prior to reading Apr and it enters state s just prior to reading Ast, thereby establishing the claim for these two occurrences of variables. The claim holds on occurrences of variables in the y part because, after P reads b, it enters state q and then it reads string y. On input vi, it also enters q just before reading y, so the computations agree on the y parts of vi and vi+1. Obviously, the computations agree on the x parts. Therefore, the claim holds for vi+1. In the second case, no new variables are introduced, so we only need to observe that the computations agree on the x and y parts of vi and vi+1. This proves the claim.
CLAIM 2.61
G passes the DK-test.

We show that each of DK's accept states satisfies the DK-test requirements. Select one of these accept states. It contains a completed rule R. This completed rule may have one of two forms:

1. Apq → ApraAstb.
2. App → .

In both situations, we need to show that the accept state cannot contain
a. another completed rule, and
b. a dotted rule that has a terminal symbol immediately after the dot.

We consider each of these four cases separately. In each case, we start by considering a string z on which DK goes to the accept state we selected above.

Case 1a. Here R is a completed type 1-2 rule. For any rule in this accept state, z must end with the symbols preceding the dot in that rule, because DK goes to that state on z. Hence the symbols preceding the dot must be consistent in all such rules. These symbols are ApraAstb in R, so any other type 1-2 completed rule must have exactly the same symbols on the right-hand side. It follows that the variables on the left-hand side must also agree, so the rules must be the same.

Suppose the accept state contains R and some type 3 completed ε-rule T. From R we know that z ends with ApraAstb. Moreover, we know that P pops its stack at the very end of z because a pop occurs at that point in R, due to G's construction. According to the way we build DK, a completed ε-rule in a state must derive from a dotted rule that resides in the same state, where the dot isn't at the very beginning and the dot immediately precedes some variable. (An

exception occurs at DK's start state, where this dot may occur at the beginning of the rule, but this accept state cannot be the start state because it contains a completed type 1-2 rule.) In G, that means T derives from a type 1-2 dotted rule where the dot precedes the second variable. From G's construction, a push occurs just before the dot. This implies that P does a push move at the very end of z, contradicting our previous statement. Thus the completed ε-rule T cannot exist. Either way, a second completed rule of either type cannot occur in this accept state.
Case 2a. Here R is a completed ε-rule App → . We show that no other completed ε-rule Aqq → . can coexist with R. If it does, the preceding claim shows that P must be in p after reading z and it must also be in q after reading z. Hence p = q, and therefore the two completed ε-rules are the same.

Case 1b. Here R is a completed type 1-2 rule. From Case 1a, we know that P pops its stack at the end of z. Suppose the accept state also contains a dotted rule T where a terminal symbol immediately follows the dot. From T we know that P doesn't pop its stack at the end of z. This contradiction shows that this situation cannot arise.

Case 2b. Here R is a completed ε-rule. Assume that the accept state also contains a dotted rule T where a terminal symbol immediately follows the dot. Because T is of type 1-2, a variable symbol immediately precedes the dot, and thus z ends with that variable symbol. Moreover, after P reads z, it is prepared to read a non-ε input symbol because a terminal follows the dot. As in Case 1a, the completed ε-rule R derives from a type 1-2 dotted rule S where the dot immediately precedes the second variable. (Again this accept state cannot be DK's start state because the dot doesn't occur at the beginning of T.) Thus some symbol â ∈ Σε immediately precedes the dot in S, and so z ends with â. Either â ∈ Σ or â = ε, but because z ends with a variable symbol, â ∉ Σ, so â = ε. Therefore, after P reads z but before it makes the ε-input move to process â, it is prepared to read an ε input. We also showed above that P is prepared to read a non-ε input symbol at this point. But a DPDA isn't allowed to make both an ε-input move and a move that reads a non-ε input symbol at a given state and stack, so the above situation is impossible. Thus this situation cannot occur.
PARSING AND LR(k) GRAMMARS
Deterministic context-free languages are of major practical importance. Their algorithms for membership and parsing are based on DPDAs and are therefore efficient, and they encompass a rich class of CFLs that includes most programming languages. However, DCFGs are sometimes inconvenient for expressing particular DCFLs. The requirement that all handles are forced is often an obstacle to designing intuitive DCFGs.

Fortunately, a broader class of grammars called the LR(k) grammars gives us the best of both worlds. They are close enough to DCFGs to allow direct conversion into DPDAs. Yet they are expressive enough for many applications.

Algorithms for LR(k) grammars introduce lookahead. In a DCFG, all handles are forced. A handle depends only on the symbols in a valid string up through and including the handle, but not on terminal symbols that follow the handle. In an LR(k) grammar, a handle may also depend on symbols that follow the handle, but only on the first k of these. The acronym LR(k) stands for: Left to right input processing, Rightmost derivations (or equivalently, leftmost reductions), and k symbols of lookahead.

To make this precise, let h be a handle of a valid string v = xhy. Say that h is forced by lookahead k if h is the unique handle of every valid string xhŷ where ŷ ∈ Σ∗ and where y and ŷ agree on their first k symbols. (If either string is shorter than k, the strings must agree up to the length of the shorter one.)
DEFINITION 2.62
An LR(k) grammar is a context-free grammar such that the handle of every valid string is forced by lookahead k.
Thus a DCFG is the same as an LR(0) grammar. We can show that for every k we can convert LR(k) grammars to DPDAs. We've already shown that DPDAs are equivalent to LR(0) grammars. Hence LR(k) grammars are equivalent in power for all k, and all describe exactly the DCFLs. The following example shows that LR(1) grammars are more convenient than DCFGs for specifying certain languages.

To avoid cumbersome notation and technical details, we will show how to convert LR(k) grammars to DPDAs only for the special case where k = 1. The conversion in the general case works in essentially the same way.
To begin, we'll present a variant of the DK-test, modified for LR(1) grammars. We call it the DK-test with lookahead 1, or simply the DK1-test. As before, we'll construct an NFA, called K1 here, and convert it to a DFA DK1. Each of K1's states has a dotted rule T → u.v and now also a terminal symbol a, called the lookahead symbol, shown together as [T → u.v, a]. This state indicates that K1 has recently read the string u, which would be a part of a handle uv provided that v follows after u and a follows after v.

The formal construction works much as before. The start state has an ε-move to [S1 → .u, a] for every rule S1 → u involving the start variable S1 and every a ∈ Σ. The shift transitions take [T → u.xv, a] to [T → ux.v, a] on input x, where x is a variable symbol or terminal symbol. The ε-transitions take [T → u.Cv, a] to [C → .r, b] for each rule C → r, where b is the first symbol of any string of terminals that can be derived from v. If v derives ε, add b = a. The accept states are all [B → u., a] for completed rules B → u. and a ∈ Σ.
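The lookahead computation in the ε-transitions can also be sketched in code. The following Python sketch is our own illustration (not the text's): items pair a dotted rule with a lookahead symbol, `'$'` stands for ⊣, `'*'` for ×, and `'#'` is a sentinel lookahead we attach to the start rule. The ε-moves compute b as the first terminal derivable from v, falling back to a when v derives ε.

```python
from collections import deque

def first_sets(rules):
    """FIRST sets and nullable variables, computed by fixpoint iteration.
    Variables are exactly the keys of `rules`."""
    first = {v: set() for v in rules}
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, bodies in rules.items():
            for body in bodies:
                broke = False
                for sym in body:
                    add = first[sym] if sym in rules else {sym}
                    if not add <= first[lhs]:
                        first[lhs] |= add
                        changed = True
                    if sym in rules and sym in nullable:
                        continue            # nullable variable: keep scanning
                    broke = True
                    break
                if not broke and lhs not in nullable:
                    nullable.add(lhs)       # every symbol in body was nullable
                    changed = True
    return first, nullable

def first_after(seq, lookahead, rules, first, nullable):
    """First terminals derivable from seq, plus `lookahead` when seq
    can derive the empty string."""
    out = set()
    for sym in seq:
        if sym not in rules:
            out.add(sym)
            return out
        out |= first[sym]
        if sym not in nullable:
            return out
    out.add(lookahead)
    return out

def closure1(items, rules, first, nullable):
    """DK1 state completion over items (lhs, rhs, dot, lookahead)."""
    state = set(items)
    work = deque(state)
    while work:
        lhs, rhs, dot, la = work.popleft()
        if dot < len(rhs) and rhs[dot] in rules:
            for b in first_after(rhs[dot + 1:], la, rules, first, nullable):
                for body in rules[rhs[dot]]:
                    item = (rhs[dot], body, 0, b)
                    if item not in state:
                        state.add(item)
                        work.append(item)
    return frozenset(state)

# Example 2.64's grammar: S -> E$,  E -> E+T | T,  T -> T*a | a.
rules = {'S': [('E', '$')],
         'E': [('E', '+', 'T'), ('T',)],
         'T': [('T', '*', 'a'), ('a',)]}
first, nullable = first_sets(rules)
start = closure1({('S', ('E', '$'), 0, '#')}, rules, first, nullable)
```

Shift transitions carry the lookahead along unchanged, so the `goto` operation of the DK construction extends to DK1 items directly.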

Let R1 be a completed rule with lookahead symbol a1, and let R2 be a dotted rule with lookahead symbol a2. Say that R1 and R2 are consistent if

1. R2 is completed and a1 = a2, or
2. R2 is not completed and a1 immediately follows its dot.

Now we are ready to describe the DK1-test. Construct the DFA DK1. The test stipulates that every accept state must not contain any two consistent dotted rules.
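The consistency check itself is mechanical. Here is a small helper of our own (a hedged sketch, not from the text) that reports the consistent pairs within one accept state, using the same item representation as above.

```python
def dk1_conflicts(state):
    """Report pairs of items violating the DK1-test within one accept state.
    Items are (lhs, rhs, dot, lookahead); completed means dot == len(rhs)."""
    conflicts = []
    items = list(state)
    for r1 in items:
        if r1[2] != len(r1[1]):
            continue                         # R1 must be a completed rule
        for r2 in items:
            if r2 == r1:
                continue
            completed = r2[2] == len(r2[1])
            if completed and r2[3] == r1[3]:
                conflicts.append((r1, r2))   # two completed rules, same lookahead
            elif not completed and r2[1][r2[2]] == r1[3]:
                conflicts.append((r1, r2))   # R1's lookahead follows R2's dot
    return conflicts
```

A grammar passes the DK1-test exactly when this list is empty for every accept state of DK1.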
THEOREM 2.63
G passes the DK1-test iff G is an LR(1) grammar.
PROOF IDEA Corollary 2.51 still applies to DK1 because we can ignore the lookahead symbols.
EXAMPLE 2.64
This example shows that the following grammar passes the DK1-test. Recall that in Example 2.53 this grammar was shown to fail the DK-test. Hence it is an example of a grammar that is LR(1) but not a DCFG.

S → E⊣
E → E+T | T
T → T×a | a

[State diagram of DK1, with lookahead symbols, omitted.]
FIGURE 2.65
Passing the DK1-test

THEOREM 2.66
An endmarked language is generated by an LR(1) grammar iff it is a DCFL.

We've already shown that every DCFL has an LR(0) grammar, because an LR(0) grammar is the same as a DCFG. That proves the reverse direction of the theorem. What remains is the following lemma, which shows how to convert an LR(1) grammar to a DPDA.

LEMMA 2.67
Every LR(1) grammar has an equivalent DPDA.

PROOF IDEA We construct P1, a modified version of the DPDA P that we presented in Lemma 2.58. P1 reads its input and simulates DK1, while using the stack to keep track of the state DK1 would be in if all reduce steps were applied to this input up to this point. Moreover, P1 reads 1 symbol ahead and stores this lookahead information in its finite state memory. Whenever DK1 reaches an accept state, P1 consults its lookahead to see whether to perform a reduce step, and which step to do if several possibilities appear in this state. Only one option can apply because the grammar is LR(1).
EXERCISES
2.1 Recall the CFG G4 that we gave in Example 2.4. For convenience, let's rename its variables with single letters as follows.

E → E+T | T
T → T×F | F
F → (E) | a
Give parse trees and derivations for each string.
a. a
b. a+a
c. a+a+a
d. ((a))
2.2 a. Use the languages A = {a^m b^n c^n | m, n ≥ 0} and B = {a^n b^n c^m | m, n ≥ 0} together with Example 2.36 to show that the class of context-free languages is not closed under intersection.
b. Use part (a) and DeMorgan's law (Theorem 0.20) to show that the class of context-free languages is not closed under complementation.

A2.3 Answer each part for the following context-free grammar G.
R→XRX |S
S→aTb|bTa
T→XTX |X|ε
X→a|b
a. What are the variables of G?
b. What are the terminals of G?
c. Which is the start variable of G?
d. Give three strings in L(G).
e. Give three strings not in L(G).
f. True or False: T ⇒ aba.
g. True or False: T ⇒∗ aba.
h. True or False: T ⇒ T.
i. True or False: T ⇒∗ T.
j. True or False: XXX ⇒∗ aba.
k. True or False: X ⇒∗ aba.
l. True or False: T ⇒∗ XX.
m. True or False: T ⇒∗ XXX.
n. True or False: S ⇒∗ ε.
o. Give a description in English of L(G).
2.4 Give context-free grammars that generate the following languages. In all parts, the alphabet Σ is {0,1}.
Aa. {w | w contains at least three 1s}
b. {w | w starts and ends with the same symbol}
c. {w | the length of w is odd}
Ad. {w | the length of w is odd and its middle symbol is a 0}
e. {w | w = w^R, that is, w is a palindrome}
f. The empty set
2.5 Give informal descriptions and state diagrams of pushdown automata for the languages in Exercise 2.4.
2.6 Give context-free grammars generating the following languages.
Aa. The set of strings over the alphabet {a,b} with more a's than b's
b. The complement of the language {a^n b^n | n ≥ 0}
Ac. {w#x | w^R is a substring of x, for w, x ∈ {0,1}∗}
d. {x1#x2#···#xk | k ≥ 1, each xi ∈ {a,b}∗, and for some i and j, xi = xj^R}
A2.7 Give informal English descriptions of PDAs for the languages in Exercise 2.6.
A2.8 Show that the string "the girl touches the boy with the flower" has two different leftmost derivations in grammar G2 on page 103. Describe in English the two different meanings of this sentence.
2.9 Give a context-free grammar that generates the language

A = {a^i b^j c^k | i = j or j = k, where i, j, k ≥ 0}.

Is your grammar ambiguous? Why or why not?
2.10 Give an informal description of a pushdown automaton that recognizes the language A in Exercise 2.9.
2.11 Convert the CFG G4 given in Exercise 2.1 to an equivalent PDA, using the procedure given in Theorem 2.20.

2.12 Convert the CFGGgiven in Exercise 2.3 to an equivalent PDA,u s i n gt h ep r o c e d u r e
given in Theorem 2.20.
2.13 Let G = (V, Σ, R, S) be the following grammar. V = {S, T, U}; Σ = {0, #}; and R is the set of rules:
S → TT | U
T → 0T | T0 | #
U → 0U00 | #
a. Describe L(G) in English.
b. Prove that L(G) is not regular.
2.14 Convert the following CFG into an equivalent CFG in Chomsky normal form,
using the procedure given in Theorem 2.9.
A → BAB | B | ε
B → 00 | ε
2.15 Give a counterexample to show that the following construction fails to prove that the class of context-free languages is closed under star. Let A be a CFL that is generated by the CFG G = (V, Σ, R, S). Add the new rule S → SS and call the resulting grammar G′. This grammar is supposed to generate A*.
2.16 Show that the class of context-free languages is closed under the regular operations,
union, concatenation, and star.
2.17 Use the results of Exercise 2.16 to give another proof that every regular language is context free, by showing how to convert a regular expression directly to an equivalent context-free grammar.
PROBLEMS
A2.18 a. Let C be a context-free language and R be a regular language. Prove that the language C ∩ R is context free.
b. Let A = {w | w ∈ {a,b,c}* and w contains equal numbers of a's, b's, and c's}. Use part (a) to show that A is not a CFL.
⋆2.19 Let CFG G be the following grammar.
S → aSb | bY | Ya
Y → bY | aY | ε
Give a simple description of L(G) in English. Use that description to give a CFG for the complement of L(G).
2.20 Let A/B = {w | wx ∈ A for some x ∈ B}. Show that if A is context free and B is regular, then A/B is context free.
⋆2.21 Let Σ = {a,b}. Give a CFG generating the language of strings with twice as many a's as b's. Prove that your grammar is correct.
⋆2.22 Let C = {x#y | x, y ∈ {0,1}* and x ≠ y}. Show that C is a context-free language.
⋆2.23 Let D = {xy | x, y ∈ {0,1}* and |x| = |y| but x ≠ y}. Show that D is a context-free language.
⋆2.24 Let E = {a^i b^j | i ≠ j and 2i ≠ j}. Show that E is a context-free language.
2.25 For any language A, let SUFFIX(A) = {v | uv ∈ A for some string u}. Show that the class of context-free languages is closed under the SUFFIX operation.
2.26 Show that if G is a CFG in Chomsky normal form, then for any string w ∈ L(G) of length n ≥ 1, exactly 2n − 1 steps are required for any derivation of w.
⋆2.27 Let G = (V, Σ, R, ⟨STMT⟩) be the following grammar.
⟨STMT⟩ → ⟨ASSIGN⟩ | ⟨IF-THEN⟩ | ⟨IF-THEN-ELSE⟩
⟨IF-THEN⟩ → if condition then ⟨STMT⟩
⟨IF-THEN-ELSE⟩ → if condition then ⟨STMT⟩ else ⟨STMT⟩
⟨ASSIGN⟩ → a:=1
Σ = {if, condition, then, else, a:=1}
V = {⟨STMT⟩, ⟨IF-THEN⟩, ⟨IF-THEN-ELSE⟩, ⟨ASSIGN⟩}
G is a natural-looking grammar for a fragment of a programming language, but G is ambiguous.
a. Show that G is ambiguous.
b. Give a new unambiguous grammar for the same language.
⋆2.28 Give unambiguous CFGs for the following languages.
a. {w | in every prefix of w, the number of a's is at least the number of b's}
b. {w | the number of a's and the number of b's in w are equal}
c. {w | the number of a's is at least the number of b's in w}
⋆2.29 Show that the language A in Exercise 2.9 is inherently ambiguous.
2.30 Use the pumping lemma to show that the following languages are not context free.
a. {0^n 1^n 0^n 1^n | n ≥ 0}
Ab. {0^n # 0^2n # 0^3n | n ≥ 0}
Ac. {w#t | w is a substring of t, where w, t ∈ {a,b}*}
d. {t1#t2#···#tk | k ≥ 2, each ti ∈ {a,b}*, and ti = tj for some i ≠ j}
2.31 Let B be the language of all palindromes over {0,1} containing equal numbers of 0s and 1s. Show that B is not context free.
2.32 Let Σ = {1,2,3,4} and C = {w ∈ Σ* | in w, the number of 1s equals the number of 2s, and the number of 3s equals the number of 4s}. Show that C is not context free.
⋆2.33 Show that F = {a^i b^j | i = kj for some positive integer k} is not context free.
2.34 Consider the language B = L(G), where G is the grammar given in Exercise 2.13. The pumping lemma for context-free languages, Theorem 2.34, states the existence of a pumping length p for B. What is the minimum value of p that works in the pumping lemma? Justify your answer.
2.35 Let G be a CFG in Chomsky normal form that contains b variables. Show that if G generates some string with a derivation having at least 2^b steps, L(G) is infinite.
2.36 Give an example of a language that is not context free but that acts like a CFL in the pumping lemma. Prove that your example works. (See the analogous example for regular languages in Problem 1.54.)
⋆2.37 Prove the following stronger form of the pumping lemma, wherein both pieces v and y must be nonempty when the string s is broken up.
If A is a context-free language, then there is a number k where, if s is any string in A of length at least k, then s may be divided into five pieces, s = uvxyz, satisfying the conditions:
a. for each i ≥ 0, u v^i x y^i z ∈ A,
b. v ≠ ε and y ≠ ε, and
c. |vxy| ≤ k.
A2.38 Refer to Problem 1.41 for the definition of the perfect shuffle operation. Show that
the class of context-free languages is not closed under perfect shuffle.
2.39 Refer to Problem 1.42 for the definition of the shuffle operation. Show that the
class of context-free languages is not closed under shuffle.
⋆2.40 Say that a language is prefix-closed if all prefixes of every string in the language
are also in the language. Let C be an infinite, prefix-closed, context-free language. Show that C contains an infinite regular subset.
⋆2.41 Read the definitions of NOPREFIX(A) and NOEXTEND(A) in Problem 1.40.
a. Show that the class of CFLs is not closed under NOPREFIX.
b. Show that the class of CFLs is not closed under NOEXTEND.
⋆2.42 Let Y = {w | w = t1#t2#···#tk for k ≥ 0, each ti ∈ 1*, and ti ≠ tj whenever i ≠ j}. Here Σ = {1,#}. Prove that Y is not context free.
2.43 For strings w and t, write w ⊜ t if the symbols of w are a permutation of the symbols of t. In other words, w ⊜ t if t and w have the same symbols in the same quantities, but possibly in a different order.
For any string w, define SCRAMBLE(w) = {t | t ⊜ w}. For any language A, let SCRAMBLE(A) = {t | t ∈ SCRAMBLE(w) for some w ∈ A}.
a. Show that if Σ = {0,1}, then the SCRAMBLE of a regular language is context free.
b. What happens in part (a) if Σ contains three or more symbols? Prove your answer.
2.44 If A and B are languages, define A ⋄ B = {xy | x ∈ A and y ∈ B and |x| = |y|}. Show that if A and B are regular languages, then A ⋄ B is a CFL.
⋆2.45 Let A = {w t w^R | w, t ∈ {0,1}* and |w| = |t|}. Prove that A is not a CFL.
2.46 Consider the following CFG G:
S→SS|T
T→aTb|ab
Describe L(G) and show that G is ambiguous. Give an unambiguous grammar H where L(H) = L(G) and sketch a proof that H is unambiguous.
2.47 Let Σ = {0,1} and let B be the collection of strings that contain at least one 1 in their second half. In other words, B = {uv | u ∈ Σ*, v ∈ Σ*1Σ*, and |u| ≥ |v|}.
a. Give a PDA that recognizes B.
b. Give a CFG that generates B.
2.48 Let Σ = {0,1}. Let C1 be the language of all strings that contain a 1 in their middle third. Let C2 be the language of all strings that contain two 1s in their middle third. So C1 = {xyz | x, z ∈ Σ* and y ∈ Σ*1Σ*, where |x| = |z| ≥ |y|} and C2 = {xyz | x, z ∈ Σ* and y ∈ Σ*1Σ*1Σ*, where |x| = |z| ≥ |y|}.
a. Show that C1 is a CFL.
b. Show that C2 is not a CFL.
⋆2.49 We defined the rotational closure of language A to be RC(A) = {yx | xy ∈ A}. Show that the class of CFLs is closed under rotational closure.
⋆2.50 We defined the CUT of language A to be CUT(A) = {yxz | xyz ∈ A}. Show that the class of CFLs is not closed under CUT.
2.51 Show that every DCFG is an unambiguous CFG.
A⋆2.52 Show that every DCFG generates a prefix-free language.
⋆2.53 Show that the class of DCFLs is not closed under the following operations:
a.Union
b.Intersection
c.Concatenation
d.Star
e.Reversal
2.54 Let G be the following grammar:
S → T⊣
T → TaTb | TbTa | ε
a. Show that L(G) = {w⊣ | w contains equal numbers of a's and b's}. Use a proof by induction on the length of w.
b. Use the DK-test to show that G is a DCFG.
c. Describe a DPDA that recognizes L(G).
2.55 Let G1 be the following grammar that we introduced in Example 2.45. Use the DK-test to show that G1 is not a DCFG.
R → S | T
S → aSb | ab
T → aTbb | abb
⋆2.56 Let A = L(G1) where G1 is defined in Problem 2.55. Show that A is not a DCFL. (Hint: Assume that A is a DCFL and consider its DPDA P. Modify P so that its input alphabet is {a,b,c}. When it first enters an accept state, it pretends that c's are b's in the input from that point on. What language would the modified P accept?)
⋆2.57 Let B = {a^i b^j c^k | i, j, k ≥ 0 and i = j or i = k}. Prove that B is not a DCFL.
⋆2.58 Let C = {w w^R | w ∈ {0,1}*}. Prove that C is not a DCFL. (Hint: Suppose that when some DPDA P is started in state q with symbol x on the top of its stack, P never pops its stack below x, no matter what input string P reads from that point on. In that case, the contents of P's stack at that point cannot affect its subsequent behavior, so P's subsequent behavior can depend only on q and x.)
⋆2.59 If we disallow ε-rules in CFGs, we can simplify the DK-test. In the simplified test, we only need to check that each of DK's accept states has a single rule. Prove that a CFG without ε-rules passes the simplified DK-test iff it is a DCFG.
SELECTED SOLUTIONS
2.3 (a) R, X, S, T; (b) a, b; (c) R; (d) Three strings in L(G) are ab, ba, and aab; (e) Three strings not in L(G) are a, b, and ε; (f) False; (g) True; (h) False; (i) True; (j) True; (k) False; (l) True; (m) True; (n) False; (o) L(G) consists of all strings over a and b that are not palindromes.
2.4 (a) S → R1R1R1R
        R → 0R | 1R | ε
    (d) S → 0 | 0S0 | 0S1 | 1S0 | 1S1
2.6 (a) S → TaT
        T → TT | aTb | bTa | a | ε
    T generates all strings with at least as many a's as b's, and S forces an extra a.
    (c) S → TX
        T → 0T0 | 1T1 | #X
        X → 0X | 1X | ε
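As an informal sanity check on the grammar in part (a), one can enumerate short derivations by machine and confirm that every generated string has more a's than b's. This sketch is not from the text; it bounds the search, so it samples L(G) rather than enumerating it completely:

```python
from collections import deque

# Grammar from solution 2.6(a). Uppercase letters are variables,
# lowercase letters are terminals, and "" plays the role of epsilon.
RULES = {
    "S": ["TaT"],
    "T": ["TT", "aTb", "bTa", "a", ""],
}

def generate(max_len=6):
    """Enumerate terminal strings of length <= max_len derivable from S.

    Sentential forms longer than max_len are pruned, so this is a
    sound but non-exhaustive sample of L(G)."""
    seen = {"S"}
    queue = deque(seen)
    results = set()
    while queue:
        form = queue.popleft()
        # Find the leftmost variable, if any remain.
        pos = next((i for i, c in enumerate(form) if c.isupper()), None)
        if pos is None:                 # no variables left: a terminal string
            results.add(form)
            continue
        for rhs in RULES[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results

# Every sampled string really does have more a's than b's.
assert all(s.count("a") > s.count("b") for s in generate())
```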
2.7 (a) The PDA uses its stack to count the number of a's minus the number of b's. It enters an accepting state whenever this count is positive. In more detail, it operates as follows. The PDA scans across the input. If it sees a b and its top stack symbol is an a, it pops the stack. Similarly, if it scans an a and its top stack symbol is a b, it pops the stack. In all other cases, it pushes the input symbol onto the stack. After the PDA finishes the input, if a is on top of the stack, it accepts. Otherwise it rejects.
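The stack discipline just described is easy to mirror with an explicit Python list. This is an informal deterministic simulation of the counting idea, not the formal PDA itself:

```python
def more_as_than_bs(w):
    """Simulate the stack discipline from solution 2.7(a): an input
    symbol that differs from the top of stack cancels it; otherwise
    the symbol is pushed. Accept when an 'a' survives on top at the
    end of the input."""
    stack = []
    for symbol in w:
        if stack and stack[-1] != symbol:
            stack.pop()           # an a cancels a b (and vice versa)
        else:
            stack.append(symbol)  # otherwise remember the symbol
    return bool(stack) and stack[-1] == "a"

assert more_as_than_bs("aab")
assert not more_as_than_bs("ab")
assert not more_as_than_bs("")
```

Because the stack always holds copies of a single symbol, its top records the sign of (#a − #b), which is exactly the count the PDA tracks.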
(c) The PDA scans across the input string and pushes every symbol it reads until it reads a #. If a # is never encountered, it rejects. Then, the PDA skips over part of the input, nondeterministically deciding when to stop skipping. At that point, it compares the next input symbols with the symbols it pops off the stack. At any disagreement, or if the input finishes while the stack is nonempty, this branch of the computation rejects. If the stack becomes empty, the machine reads the rest of the input and accepts.
2.8 Here is one derivation:
⟨SENTENCE⟩ ⇒ ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩ ⇒
⟨CMPLX-NOUN⟩⟨VERB-PHRASE⟩ ⇒
⟨ARTICLE⟩⟨NOUN⟩⟨VERB-PHRASE⟩ ⇒
The ⟨NOUN⟩⟨VERB-PHRASE⟩ ⇒
The girl ⟨VERB-PHRASE⟩ ⇒
The girl ⟨CMPLX-VERB⟩⟨PREP-PHRASE⟩ ⇒
The girl ⟨VERB⟩⟨NOUN-PHRASE⟩⟨PREP-PHRASE⟩ ⇒
The girl touches ⟨NOUN-PHRASE⟩⟨PREP-PHRASE⟩ ⇒
The girl touches ⟨CMPLX-NOUN⟩⟨PREP-PHRASE⟩ ⇒
The girl touches ⟨ARTICLE⟩⟨NOUN⟩⟨PREP-PHRASE⟩ ⇒
The girl touches the ⟨NOUN⟩⟨PREP-PHRASE⟩ ⇒
The girl touches the boy ⟨PREP-PHRASE⟩ ⇒
The girl touches the boy ⟨PREP⟩⟨CMPLX-NOUN⟩ ⇒
The girl touches the boy with ⟨CMPLX-NOUN⟩ ⇒
The girl touches the boy with ⟨ARTICLE⟩⟨NOUN⟩ ⇒
The girl touches the boy with the ⟨NOUN⟩ ⇒
The girl touches the boy with the flower
Here is another leftmost derivation:
⟨SENTENCE⟩ ⇒ ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩ ⇒
⟨CMPLX-NOUN⟩⟨VERB-PHRASE⟩ ⇒
⟨ARTICLE⟩⟨NOUN⟩⟨VERB-PHRASE⟩ ⇒
The ⟨NOUN⟩⟨VERB-PHRASE⟩ ⇒
The girl ⟨VERB-PHRASE⟩ ⇒
The girl ⟨CMPLX-VERB⟩ ⇒
The girl ⟨VERB⟩⟨NOUN-PHRASE⟩ ⇒
The girl touches ⟨NOUN-PHRASE⟩ ⇒
The girl touches ⟨CMPLX-NOUN⟩⟨PREP-PHRASE⟩ ⇒
The girl touches ⟨ARTICLE⟩⟨NOUN⟩⟨PREP-PHRASE⟩ ⇒
The girl touches the ⟨NOUN⟩⟨PREP-PHRASE⟩ ⇒
The girl touches the boy ⟨PREP-PHRASE⟩ ⇒
The girl touches the boy ⟨PREP⟩⟨CMPLX-NOUN⟩ ⇒
The girl touches the boy with ⟨CMPLX-NOUN⟩ ⇒
The girl touches the boy with ⟨ARTICLE⟩⟨NOUN⟩ ⇒
The girl touches the boy with the ⟨NOUN⟩ ⇒
The girl touches the boy with the flower
Each of these derivations corresponds to a different English meaning. In the first derivation, the sentence means that the girl used the flower to touch the boy. In the second derivation, the boy is holding the flower when the girl touches him.
2.18 (a) Let C be a context-free language and R be a regular language. Let P be the PDA that recognizes C, and D be the DFA that recognizes R. If Q is the set of states of P and Q′ is the set of states of D, we construct a PDA P′ that recognizes C ∩ R with the set of states Q × Q′. P′ will do what P does and also keep track of the states of D. It accepts a string w if and only if it stops at a state q ∈ F_P × F_D, where F_P is the set of accept states of P and F_D is the set of accept states of D. Since C ∩ R is recognized by P′, it is context free.
(b) Let R be the regular language a*b*c*. If A were a CFL then A ∩ R would be a CFL by part (a). However, A ∩ R = {a^n b^n c^n | n ≥ 0}, and Example 2.36 proves that A ∩ R is not context free. Thus A is not a CFL.
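The product idea in part (a) can be seen in a small deterministic special case: run a stack machine and a DFA in lockstep, so the control state is the pair of their states. The stack machine for {a^n b^n | n ≥ 0} and the two-state DFA for a*b* below are illustrative choices, not from the text:

```python
# DFA for R = a*b*: state 0 = still reading a's, 1 = reading b's, 2 = dead.
DFA = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 2, (1, "b"): 1,
       (2, "a"): 2, (2, "b"): 2}
DFA_ACCEPT = {0, 1}

def product_accepts(w):
    """Recognize {a^n b^n | n >= 0} ∩ a*b* by running a deterministic
    stack machine and the DFA in lockstep; the pair (p, q) mirrors the
    Q x Q' state set from solution 2.18(a)."""
    p, q, stack = "push", 0, []
    for c in w:
        q = DFA[(q, c)]                  # DFA component of the pair
        if p == "push" and c == "a":
            stack.append("a")            # count a's on the stack
        elif c == "b" and stack:
            p = "pop"                    # once b's start, only pop
            stack.pop()
        else:
            return False                 # the stack machine gets stuck
    return not stack and q in DFA_ACCEPT # both components must accept

assert product_accepts("aabb")
assert not product_accepts("aab")
assert not product_accepts("abab")
```

The general construction differs only in bookkeeping: P′ updates the DFA coordinate on every input symbol while leaving the stack behavior of P untouched.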
2.30 (b) Let B = {0^n # 0^2n # 0^3n | n ≥ 0}. Let p be the pumping length given by the pumping lemma. Let s = 0^p # 0^2p # 0^3p. We show that s = uvxyz cannot be pumped.
Neither v nor y can contain #; otherwise uv^2 xy^2 z contains more than two #s. Therefore, if we divide s into three segments by #'s — 0^p, 0^2p, and 0^3p — at least one of the segments is not contained within either v or y. Hence uv^2 xy^2 z is not in B because the 1:2:3 length ratio of the segments is not maintained.
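For a small test value of p, this case analysis can be checked exhaustively by machine: no legal division s = uvxyz survives pumping with i ∈ {0, 1, 2}. A brute-force sketch (the small p here is a test parameter, not the true pumping length):

```python
def in_B(s):
    """Membership test for B = {0^n # 0^2n # 0^3n | n >= 0}."""
    parts = s.split("#")
    if len(parts) != 3 or any(set(part) - {"0"} for part in parts):
        return False
    n = len(parts[0])
    return len(parts[1]) == 2 * n and len(parts[2]) == 3 * n

def pumpable(s, p):
    """True if some division s = uvxyz with |vxy| <= p and |vy| > 0
    keeps u v^i x y^i z in B for all i in {0, 1, 2}."""
    for start in range(len(s)):
        for vlen in range(p + 1):
            for xlen in range(p - vlen + 1):
                for ylen in range(p - vlen - xlen + 1):
                    if vlen + ylen == 0 or start + vlen + xlen + ylen > len(s):
                        continue
                    u = s[:start]
                    v = s[start:start + vlen]
                    x = s[start + vlen:start + vlen + xlen]
                    y = s[start + vlen + xlen:start + vlen + xlen + ylen]
                    z = s[start + vlen + xlen + ylen:]
                    if all(in_B(u + v * i + x + y * i + z) for i in (0, 1, 2)):
                        return True
    return False

p = 3
s = "0" * p + "#" + "0" * 2 * p + "#" + "0" * 3 * p
assert in_B(s) and not pumpable(s, p)
```

Of course, the finite check for one p is only an illustration; the proof above is what covers every p.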
(c) Let C = {w#t | w is a substring of t, where w, t ∈ {a,b}*}. Let p be the pumping length given by the pumping lemma. Let s = a^p b^p # a^p b^p. We show that the string s = uvxyz cannot be pumped.
Neither v nor y can contain #; otherwise uv^0 xy^0 z does not contain # and therefore is not in C. If both v and y occur on the left-hand side of the #, the string uv^2 xy^2 z cannot be in C because it is longer on the left-hand side of the #. Similarly, if both strings occur on the right-hand side of the #, the string uv^0 xy^0 z cannot be in C because it is again longer on the left-hand side of the #. If one of v and y is empty (both cannot be empty), treat them as if both occurred on the same side of the # as above.
The only remaining case is where both v and y are nonempty and straddle the #. But then v consists of b's and y consists of a's because of the third pumping lemma condition |vxy| ≤ p. Hence, uv^2 xy^2 z contains more b's on the left-hand side of the #, so it cannot be a member of C.
2.38 Let A be the language {0^k 1^k | k ≥ 0} and let B be the language {a^k b^3k | k ≥ 0}. The perfect shuffle of A and B is the language C = {(0a)^k (0b)^k (1b)^2k | k ≥ 0}. Languages A and B are easily seen to be CFLs, but C is not a CFL, as follows.
If C were a CFL, let p be the pumping length given by the pumping lemma, and let s be the string (0a)^p (0b)^p (1b)^2p. Because s is longer than p and s ∈ C, we can divide s = uvxyz satisfying the pumping lemma's three conditions. Strings in C are exactly one-fourth 1s and one-eighth a's. In order for uv^2 xy^2 z to have that property, the string vxy must contain both 1s and a's. But that is impossible, because the 1s and a's are separated by 2p symbols in s yet the third condition says that |vxy| ≤ p. Hence C is not context free.
2.52 We use a proof by contradiction. Assume that w and wz are two unequal strings in L(G), where G is a DCFG. Both are valid strings so both have handles, and these handles must agree because we can write w = xhy and wz = xhyz = xhŷ, where h is the handle of w. Hence, the first reduce steps of w and wz produce valid strings u and uz, respectively. We can continue this process until we obtain S1 and S1z, where S1 is the start variable. However, S1 does not appear on the right-hand side of any rule, so we cannot reduce S1z. That gives a contradiction.
PART TWO
COMPUTABILITY THEORY
3
THE CHURCH–TURING THESIS
So far in our development of the theory of computation, we have presented several models of computing devices. Finite automata are good models for devices
that have a small amount of memory. Pushdown automata are good models for
devices that have an unlimited memory that is usable only in the last in, first out
manner of a stack. We have shown that some very simple tasks are beyond the
capabilities of these models. Hence they are too restricted to serve as models of
general purpose computers.
3.1
TURING MACHINES
We turn now to a much more powerful model, first proposed by Alan Turing in 1936, called the Turing machine. Similar to a finite automaton but with an unlimited and unrestricted memory, a Turing machine is a much more accurate model of a general purpose computer. A Turing machine can do everything that a real computer can do. Nonetheless, even a Turing machine cannot solve certain problems. In a very real sense, these problems are beyond the theoretical limits of computation.
The Turing machine model uses an infinite tape as its unlimited memory. It has a tape head that can read and write symbols and move around on the tape.
Initially the tape contains only the input string and is blank everywhere else. If the machine needs to store information, it may write this information on the tape. To read the information that it has written, the machine can move its head back over it. The machine continues computing until it decides to produce an output. The outputs accept and reject are obtained by entering designated accepting and rejecting states. If it doesn't enter an accepting or a rejecting state, it will go on forever, never halting.

FIGURE 3.1
Schematic of a Turing machine
The following list summarizes the differences between finite automata and Turing machines.
1. A Turing machine can both write on the tape and read from it.
2.The read–write head can move both to the left and to the right.
3.The tape is infinite.
4.The special states for rejecting and accepting take effect immediately.
Let's introduce a Turing machine M1 for testing membership in the language B = {w#w | w ∈ {0,1}*}. We want M1 to accept if its input is a member of B and to reject otherwise. To understand M1 better, put yourself in its place by imagining that you are standing on a mile-long input consisting of millions of characters. Your goal is to determine whether the input is a member of B—that is, whether the input comprises two identical strings separated by a # symbol. The input is too long for you to remember it all, but you are allowed to move back and forth over the input and make marks on it. The obvious strategy is to zig-zag to the corresponding places on the two sides of the # and determine whether they match. Place marks on the tape to keep track of which places correspond.
We design M1 to work in that way. It makes multiple passes over the input string with the read–write head. On each pass it matches one of the characters on each side of the # symbol. To keep track of which symbols have been checked already, M1 crosses off each symbol as it is examined. If it crosses off all the symbols, that means that everything matched successfully, and M1 goes into an accept state. If it discovers a mismatch, it enters a reject state. In summary, M1's algorithm is as follows.
M1 = "On input string w:
1. Zig-zag across the tape to corresponding positions on either side of the # symbol to check whether these positions contain the same symbol. If they do not, or if no # is found, reject. Cross off symbols as they are checked to keep track of which symbols correspond.
2. When all symbols to the left of the # have been crossed off, check for any remaining symbols to the right of the #. If any symbols remain, reject; otherwise, accept."
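M1's zig-zag strategy translates directly into a marked-tape simulation. The sketch below mirrors the two stages of the algorithm in plain Python (a simulation of the procedure, not a formal Turing machine; the single-# check is a simplifying assumption since inputs in B contain exactly one #):

```python
def m1_accepts(w):
    """Decide B = {w#w | w in {0,1}*} by crossing off matched symbols,
    mimicking M1's zig-zag passes."""
    tape = list(w)
    if tape.count("#") != 1:
        return False                 # stage 1: reject if no (unique) #
    sep = tape.index("#")
    i, j = 0, sep + 1
    while i < sep:                   # one pass per symbol on the left
        if j >= len(tape) or tape[i] != tape[j]:
            return False             # mismatch, or right side too short
        tape[i] = tape[j] = "x"      # cross off the matched pair
        i, j = i + 1, j + 1
    return j == len(tape)            # stage 2: reject leftover symbols

assert m1_accepts("011000#011000")
assert not m1_accepts("01#010")
assert not m1_accepts("0110")
```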
The following figure contains several nonconsecutive snapshots of M1's tape after it is started on input 011000#011000.
FIGURE 3.2
Snapshots of Turing machine M1 computing on input 011000#011000
This description of Turing machine M1 sketches the way it functions but does not give all its details. We can describe Turing machines in complete detail by giving formal descriptions analogous to those introduced for finite and pushdown automata. The formal descriptions specify each of the parts of the formal definition of the Turing machine model to be presented shortly. In actuality, we almost never give formal descriptions of Turing machines because they tend to be very big.
FORMAL DEFINITION OF A TURING MACHINE
The heart of the definition of a Turing machine is the transition function δ because it tells us how the machine gets from one step to the next. For a Turing machine, δ takes the form: Q × Γ → Q × Γ × {L, R}. That is, when the machine
is in a certain state q and the head is over a tape square containing a symbol a, and if δ(q, a) = (r, b, L), the machine writes the symbol b replacing the a, and goes to state r. The third component is either L or R and indicates whether the head moves to the left or right after writing. In this case, the L indicates a move to the left.
DEFINITION 3.3
A Turing machine is a 7-tuple, (Q, Σ, Γ, δ, q0, q_accept, q_reject), where Q, Σ, Γ are all finite sets and
1. Q is the set of states,
2. Σ is the input alphabet not containing the blank symbol ␣,
3. Γ is the tape alphabet, where ␣ ∈ Γ and Σ ⊆ Γ,
4. δ: Q × Γ → Q × Γ × {L, R} is the transition function,
5. q0 ∈ Q is the start state,
6. q_accept ∈ Q is the accept state, and
7. q_reject ∈ Q is the reject state, where q_reject ≠ q_accept.
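Definition 3.3 translates almost directly into code. Below is a minimal simulator sketch: the transition table maps (state, symbol) to (state, symbol, L/R), the head stays put when pushed off the left end, and the machine halts only in the accept or reject state. The state names, the `_` stand-in for the blank symbol ␣, the step bound, and the example machine are all illustrative choices, not from the text:

```python
BLANK = "_"  # stand-in for the blank symbol

def run_tm(delta, start, accept, reject, w, max_steps=10_000):
    """Simulate a single-tape Turing machine per Definition 3.3.
    Returns True/False on halting, or None if no decision is reached
    within max_steps (the machine may loop forever, so a step bound
    stands in for running indefinitely)."""
    tape, state, head = dict(enumerate(w)), start, 0
    for _ in range(max_steps):
        if state == accept:
            return True
        if state == reject:
            return False
        symbol = tape.get(head, BLANK)          # unwritten squares are blank
        state, write, move = delta[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
        head = max(head, 0)                     # cannot move off the left end
    return None

# Illustrative example: a machine deciding {0^n 1 | n >= 0}.
delta = {
    ("q0", "0"): ("q0", "0", "R"),      # skip over leading 0s
    ("q0", "1"): ("q1", "1", "R"),      # saw the single 1
    ("q0", BLANK): ("qr", BLANK, "R"),
    ("q1", BLANK): ("qa", BLANK, "R"),  # nothing after the 1: accept
    ("q1", "0"): ("qr", "0", "R"),
    ("q1", "1"): ("qr", "1", "R"),
}
assert run_tm(delta, "q0", "qa", "qr", "0001") is True
assert run_tm(delta, "q0", "qa", "qr", "0010") is False
```

The sparse-dictionary tape matches the convention, formalized just below, that blanks follow the represented portion of the tape.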
A Turing machine M = (Q, Σ, Γ, δ, q0, q_accept, q_reject) computes as follows. Initially, M receives its input w = w1 w2 ... wn ∈ Σ* on the leftmost n squares of the tape, and the rest of the tape is blank (i.e., filled with blank symbols). The head starts on the leftmost square of the tape. Note that Σ does not contain the blank symbol, so the first blank appearing on the tape marks the end of the input. Once M has started, the computation proceeds according to the rules described by the transition function. If M ever tries to move its head to the left off the left-hand end of the tape, the head stays in the same place for that move, even though the transition function indicates L. The computation continues until it enters either the accept or reject states, at which point it halts. If neither occurs, M goes on forever.
As a Turing machine computes, changes occur in the current state, the current tape contents, and the current head location. A setting of these three items is called a configuration of the Turing machine. Configurations often are represented in a special way. For a state q and two strings u and v over the tape alphabet Γ, we write uqv for the configuration where the current state is q, the current tape contents is uv, and the current head location is the first symbol of v. The tape contains only blanks following the last symbol of v. For example, 1011 q7 01111 represents the configuration when the tape is 101101111, the current state is q7, and the head is currently on the second 0. Figure 3.4 depicts a Turing machine with that configuration.
FIGURE 3.4
A Turing machine with configuration 1011 q7 01111
Here we formalize our intuitive understanding of the way that a Turing machine computes. Say that configuration C1 yields configuration C2 if the Turing machine can legally go from C1 to C2 in a single step. We define this notion formally as follows.
Suppose that we have a, b, and c in Γ, as well as u and v in Γ* and states qi and qj. In that case, ua qi bv and u qj acv are two configurations. Say that
ua qi bv yields u qj acv
if in the transition function δ(qi, b) = (qj, c, L). That handles the case where the Turing machine moves leftward. For a rightward move, say that
ua qi bv yields uac qj v
if δ(qi, b) = (qj, c, R).
Special cases occur when the head is at one of the ends of the configuration. For the left-hand end, the configuration qi bv yields qj cv if the transition is left-moving (because we prevent the machine from going off the left-hand end of the tape), and it yields c qj v for the right-moving transition. For the right-hand end, the configuration ua qi is equivalent to ua qi ␣ because we assume that blanks follow the part of the tape represented in the configuration. Thus we can handle this case as before, with the head no longer at the right-hand end.
The start configuration of M on input w is the configuration q0 w, which indicates that the machine is in the start state q0 with its head at the leftmost position on the tape. In an accepting configuration, the state of the configuration is qaccept. In a rejecting configuration, the state of the configuration is qreject. Accepting and rejecting configurations are halting configurations and do not yield further configurations. Because the machine is defined to halt when in the states qaccept and qreject, we equivalently could have defined the transition function to have the more complicated form δ: Q′ × Γ → Q × Γ × {L, R}, where Q′ is Q without qaccept and qreject. A Turing machine M accepts input w if a sequence of configurations C1, C2, ..., Ck exists, where

1. C1 is the start configuration of M on input w,
2. each Ci yields Ci+1, and
3. Ck is an accepting configuration.
The collection of strings that M accepts is the language of M, or the language recognized by M, denoted L(M).
DEFINITION 3.5
Call a language Turing-recognizable if some Turing machine recognizes it.1
When we start a Turing machine on an input, three outcomes are possible. The machine may accept, reject, or loop. By loop we mean that the machine simply does not halt. Looping may entail any simple or complex behavior that never leads to a halting state.
A Turing machine M can fail to accept an input by entering the qreject state and rejecting, or by looping. Sometimes distinguishing a machine that is looping from one that is merely taking a long time is difficult. For this reason, we prefer Turing machines that halt on all inputs; such machines never loop. These machines are called deciders because they always make a decision to accept or reject. A decider that recognizes some language also is said to decide that language.
DEFINITION 3.6
Call a language Turing-decidable or simply decidable if some Turing machine decides it.2
Next, we give examples of decidable languages. Every decidable language is Turing-recognizable. We present examples of languages that are Turing-recognizable but not decidable after we develop a technique for proving undecidability in Chapter 4.
EXAMPLES OF TURING MACHINES
As we did for finite and pushdown automata, we can formally describe a particular Turing machine by specifying each of its seven parts. However, going to that level of detail can be cumbersome for all but the tiniest Turing machines. Accordingly, we won't spend much time giving such descriptions. Mostly we will give only higher level descriptions because they are precise enough for our purposes and are much easier to understand. Nevertheless, it is important to remember that every higher level description is actually just shorthand for its formal counterpart. With patience and care we could describe any of the Turing machines in this book in complete formal detail.

1 It is called a recursively enumerable language in some other textbooks.
2 It is called a recursive language in some other textbooks.
To help you make the connection between the formal descriptions and the higher level descriptions, we give state diagrams in the next two examples. You may skip over them if you already feel comfortable with this connection.
EXAMPLE 3.7
Here we describe a Turing machine (TM) M2 that decides A = {0^(2^n) | n ≥ 0}, the language consisting of all strings of 0s whose length is a power of 2.
M2 = “On input string w:
1. Sweep left to right across the tape, crossing off every other 0.
2. If in stage 1 the tape contained a single 0, accept.
3. If in stage 1 the tape contained more than a single 0 and the number of 0s was odd, reject.
4. Return the head to the left-hand end of the tape.
5. Go to stage 1.”
Each iteration of stage 1 cuts the number of 0s in half. As the machine sweeps across the tape in stage 1, it keeps track of whether the number of 0s seen is even or odd. If that number is odd and greater than 1, the original number of 0s in the input could not have been a power of 2. Therefore, the machine rejects in this instance. However, if the number of 0s seen is 1, the original number must have been a power of 2. So in this case, the machine accepts.
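The arithmetic behind the halving loop can be checked with a short sketch (Python, for illustration only; the machine itself of course manipulates nothing but tape symbols):

```python
def power_of_two_by_halving(n_zeros):
    """Mirror M2's loop: repeatedly halve the count of surviving 0s,
    rejecting as soon as an odd count greater than 1 appears."""
    if n_zeros == 0:
        return False          # empty input: no 0s at all
    while n_zeros > 1:
        if n_zeros % 2 == 1:  # odd and greater than 1: not a power of 2
            return False
        n_zeros //= 2         # stage 1 crosses off every other 0
    return True               # exactly one 0 survives

# power_of_two_by_halving(4) -> True; power_of_two_by_halving(6) -> False
```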
Now we give the formal description of M2 = (Q, Σ, Γ, δ, q1, qaccept, qreject):
• Q = {q1, q2, q3, q4, q5, qaccept, qreject},
• Σ = {0}, and
• Γ = {0, x, ␣}.
• We describe δ with a state diagram (see Figure 3.8).
• The start, accept, and reject states are q1, qaccept, and qreject, respectively.
FIGURE 3.8
State diagram for Turing machine M2
In this state diagram, the label 0→␣,R appears on the transition from q1 to q2. This label signifies that when in state q1 with the head reading 0, the machine goes to state q2, writes ␣, and moves the head to the right. In other words, δ(q1, 0) = (q2, ␣, R). For clarity we use the shorthand 0→R in the transition from q3 to q4, to mean that the machine moves to the right when reading 0 in state q3 but doesn't alter the tape, so δ(q3, 0) = (q4, 0, R).
This machine begins by writing a blank symbol over the leftmost 0 on the tape so that it can find the left-hand end of the tape in stage 4. Whereas we would normally use a more suggestive symbol such as # for the left-hand end delimiter, we use a blank here to keep the tape alphabet, and hence the state diagram, small. Example 3.11 gives another method of finding the left-hand end of the tape.
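The transition notation translates directly into a lookup table, which makes the formal description executable. Below is a small illustrative single-tape simulator (Python; the table is our reading of Figure 3.8, with "_" standing in for the blank ␣, and any missing (state, symbol) pair implicitly rejecting, as the text describes):

```python
BLANK = "_"  # stands in for the blank symbol ␣

# delta[(state, symbol)] = (new_state, write_symbol, move)
# Our reading of Figure 3.8; any missing (state, symbol) pair rejects.
DELTA_M2 = {
    ("q1", "0"): ("q2", BLANK, "R"),
    ("q2", "x"): ("q2", "x", "R"),
    ("q2", "0"): ("q3", "x", "R"),
    ("q2", BLANK): ("qaccept", BLANK, "R"),
    ("q3", "x"): ("q3", "x", "R"),
    ("q3", "0"): ("q4", "0", "R"),
    ("q3", BLANK): ("q5", BLANK, "L"),
    ("q4", "x"): ("q4", "x", "R"),
    ("q4", "0"): ("q3", "x", "R"),
    ("q5", "0"): ("q5", "0", "L"),
    ("q5", "x"): ("q5", "x", "L"),
    ("q5", BLANK): ("q2", BLANK, "R"),
}

def run_tm(delta, w, start="q1"):
    """Run a single-tape deterministic TM until it halts; True iff accept.
    A leftward move at the leftmost cell leaves the head in place."""
    tape = list(w) if w else [BLANK]
    state, head = start, 0
    while state not in ("qaccept", "qreject"):
        if head == len(tape):
            tape.append(BLANK)            # blanks follow the input
        sym = tape[head]
        if (state, sym) not in delta:
            return False                  # implicit transition to qreject
        state, tape[head], move = delta[(state, sym)]
        head = head + 1 if move == "R" else max(head - 1, 0)
    return state == "qaccept"

# run_tm(DELTA_M2, "0000") -> True; run_tm(DELTA_M2, "000") -> False
```

Running this table on input 0000 retraces exactly the sample configurations shown next.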
Next we give a sample run of this machine on input 0000. The starting configuration is q10000. The sequence of configurations the machine enters appears as follows; read down the columns and then left to right.

q10000     ␣q5x0x␣    ␣xq5xx␣
␣q2000     q5␣x0x␣    ␣q5xxx␣
␣xq300     ␣q2x0x␣    q5␣xxx␣
␣x0q40     ␣xq20x␣    ␣q2xxx␣
␣x0xq3␣    ␣xxq3x␣    ␣xq2xx␣
␣x0q5x␣    ␣xxxq3␣    ␣xxq2x␣
␣xq50x␣    ␣xxq5x␣    ␣xxxq2␣
                      ␣xxx␣qaccept
EXAMPLE 3.9
The following is a formal description of M1 = (Q, Σ, Γ, δ, q1, qaccept, qreject), the Turing machine that we informally described (page 167) for deciding the language B = {w#w | w ∈ {0,1}*}.
• Q = {q1, ..., q8, qaccept, qreject},
• Σ = {0, 1, #}, and Γ = {0, 1, #, x, ␣}.
• We describe δ with a state diagram (see the following figure).
• The start, accept, and reject states are q1, qaccept, and qreject, respectively.
FIGURE 3.10
State diagram for Turing machine M1
In Figure 3.10, which depicts the state diagram of TM M1, you will find the label 0,1→R on the transition going from q3 to itself. That label means that the machine stays in q3 and moves to the right when it reads a 0 or a 1 in state q3. It doesn't change the symbol on the tape.
Stage 1 is implemented by states q1 through q7, and stage 2 by the remaining states. To simplify the figure, we don't show the reject state or the transitions going to the reject state. Those transitions occur implicitly whenever a state lacks an outgoing transition for a particular symbol. Thus because in state q5 no outgoing arrow with a # is present, if a # occurs under the head when the machine is in state q5, it goes to state qreject. For completeness, we say that the head moves right in each of these transitions to the reject state.
EXAMPLE 3.11
Here, a TM M3 is doing some elementary arithmetic. It decides the language C = {a^i b^j c^k | i × j = k and i, j, k ≥ 1}.
M3 = “On input string w:
1. Scan the input from left to right to determine whether it is a member of a+b+c+ and reject if it isn't.
2. Return the head to the left-hand end of the tape.
3. Cross off an a and scan to the right until a b occurs. Shuttle between the b's and the c's, crossing off one of each until all b's are gone. If all c's have been crossed off and some b's remain, reject.
4. Restore the crossed off b's and repeat stage 3 if there is another a to cross off. If all a's have been crossed off, determine whether all c's also have been crossed off. If yes, accept; otherwise, reject.”
Let's examine the four stages of M3 more closely. In stage 1, the machine operates like a finite automaton. No writing is necessary as the head moves from left to right, keeping track by using its states to determine whether the input is in the proper form.
Stage 2 looks equally simple but contains a subtlety. How can the TM find the left-hand end of the input tape? Finding the right-hand end of the input is easy because it is terminated with a blank symbol. But the left-hand end has no terminator initially. One technique that allows the machine to find the left-hand end of the tape is for it to mark the leftmost symbol in some way when the machine starts with its head on that symbol. Then the machine may scan left until it finds the mark when it wants to reset its head to the left-hand end. Example 3.7 illustrated this technique; a blank symbol marks the left-hand end.
A trickier method of finding the left-hand end of the tape takes advantage of the way that we defined the Turing machine model. Recall that if the machine tries to move its head beyond the left-hand end of the tape, it stays in the same place. We can use this feature to make a left-hand end detector. To detect whether the head is sitting on the left-hand end, the machine can write a special symbol over the current position while recording the symbol that it replaced in the control. Then it can attempt to move the head to the left. If it is still over the special symbol, the leftward move didn't succeed, and thus the head must have been at the left-hand end. If instead it is over a different symbol, some symbols remained to the left of that position on the tape. Before going farther, the machine must be sure to restore the changed symbol to the original.
Stages 3 and 4 have straightforward implementations and use several states each.
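Counting crossed-off symbols makes the stage-3/stage-4 bookkeeping concrete. A minimal sketch of the same schedule (Python; the regular-expression check only stands in for the stage-1 finite-automaton scan):

```python
import re

def decides_C(w):
    """Mirror M3's stages on a^i b^j c^k: for each a, cross off one c per b;
    accept iff the c's run out exactly when the a's do (i.e., i * j == k)."""
    if not re.fullmatch(r"a+b+c+", w):   # stage 1: input must be in a+b+c+
        return False
    i, j, k = w.count("a"), w.count("b"), w.count("c")
    remaining_c = k
    for _ in range(i):                   # stage 4 loops once per a
        if remaining_c < j:              # stage 3: b's remain but c's are gone
            return False
        remaining_c -= j                 # one shuttle crosses off j c's
    return remaining_c == 0              # all c's crossed off exactly

# decides_C("aabbcccc") -> True (2 * 2 == 4); decides_C("abcc") -> False
```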
EXAMPLE 3.12
Here, a TM M4 is solving what is called the element distinctness problem. It is given a list of strings over {0,1} separated by #s and its job is to accept if all the strings are different. The language is

E = {#x1#x2#···#xl | each xi ∈ {0,1}* and xi ≠ xj for each i ≠ j}.

Machine M4 works by comparing x1 with x2 through xl, then by comparing x2 with x3 through xl, and so on. An informal description of the TM M4 deciding this language follows.
M4 = “On input w:
1. Place a mark on top of the leftmost tape symbol. If that symbol was a blank, accept. If that symbol was a #, continue with the next stage. Otherwise, reject.
2. Scan right to the next # and place a second mark on top of it. If no # is encountered before a blank symbol, only x1 was present, so accept.
3. By zig-zagging, compare the two strings to the right of the marked #s. If they are equal, reject.
4. Move the rightmost of the two marks to the next # symbol to the right. If no # symbol is encountered before a blank symbol, move the leftmost mark to the next # to its right and the rightmost mark to the # after that. This time, if no # is available for the rightmost mark, all the strings have been compared, so accept.
5. Go to stage 3.”
This machine illustrates the technique of marking tape symbols. In stage 2, the machine places a mark above a symbol, # in this case. In the actual implementation, the machine has two different symbols, # and •#, in its tape alphabet. Saying that the machine places a mark above a # means that the machine writes the symbol •# at that location. Removing the mark means that the machine writes the symbol without the dot. In general, we may want to place marks over various symbols on the tape. To do so, we merely include versions of all these tape symbols with dots in the tape alphabet.
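A direct way to see what M4 computes is to replay its comparison schedule on the parsed list of strings. In the sketch below (Python, illustrative only), the two dotted #s become a pair of indices, and stage 4's mark movement is the familiar "compare every pair" loop:

```python
def decides_E(tape):
    """Mimic M4 on an input of the form #x1#x2#...#xl."""
    if tape == "":
        return True                      # stage 1: leftmost symbol is blank
    if not tape.startswith("#"):
        return False                     # stage 1: input must begin with #
    xs = tape[1:].split("#")             # the strings x1, ..., xl
    for i in range(len(xs)):             # position of the leftmost mark
        for j in range(i + 1, len(xs)):  # position of the rightmost mark
            if xs[i] == xs[j]:
                return False             # stage 3: an equal pair was found
    return True                          # stage 4 ran out of #s: all distinct

# decides_E("#01#1#10") -> True; decides_E("#01#01") -> False
```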
We conclude from the preceding examples that the described languages A, B, C, and E are decidable. All decidable languages are Turing-recognizable, so these languages are also Turing-recognizable. Demonstrating a language that is Turing-recognizable but undecidable is more difficult. We do so in Chapter 4.
3.2
VARIANTS OF TURING MACHINES
Alternative definitions of Turing machines abound, including versions with multiple tapes or with nondeterminism. They are called variants of the Turing machine model. The original model and its reasonable variants all have the same power—they recognize the same class of languages. In this section, we describe some of these variants and the proofs of equivalence in power. We call this invariance to certain changes in the definition robustness. Both finite automata and pushdown automata are somewhat robust models, but Turing machines have an astonishing degree of robustness.
To illustrate the robustness of the Turing machine model, let's vary the type of transition function permitted. In our definition, the transition function forces the head to move to the left or right after each step; the head may not simply stay put. Suppose that we had allowed the Turing machine the ability to stay put. The transition function would then have the form δ: Q × Γ → Q × Γ × {L, R, S}. Might this feature allow Turing machines to recognize additional languages, thus adding to the power of the model? Of course not, because we can convert any TM with the “stay put” feature to one that does not have it. We do so by replacing each stay put transition with two transitions: one that moves to the right and the second back to the left.
This small example contains the key to showing the equivalence of TM variants. To show that two models are equivalent, we simply need to show that one can simulate the other.
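The stay-put conversion is mechanical enough to write down. A sketch (Python; `delta` maps `(state, symbol)` to `(state, symbol, move)` triples, and the fresh intermediate states are our own naming convention):

```python
def eliminate_stay_put(delta, gamma):
    """Replace each stay-put transition (q, a) -> (p, b, 'S') with a right
    move into a fresh state, which then steps back left without writing."""
    new_delta, fresh = {}, 0
    for (q, a), (p, b, move) in delta.items():
        if move != "S":
            new_delta[(q, a)] = (p, b, move)
            continue
        mid = ("stay", fresh)              # fresh intermediate state
        fresh += 1
        new_delta[(q, a)] = (mid, b, "R")  # first half: move right
        for g in gamma:                    # second half: whatever mid reads,
            new_delta[(mid, g)] = (p, g, "L")  # leave it and move back left
    return new_delta
```

Because the right move always succeeds, the compensating left move returns the head to the original cell, so the two-step pair has exactly the effect of staying put.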
MULTITAPE TURING MACHINES
A multitape Turing machine is like an ordinary Turing machine with several tapes. Each tape has its own head for reading and writing. Initially the input appears on tape 1, and the others start out blank. The transition function is changed to allow for reading, writing, and moving the heads on some or all of the tapes simultaneously. Formally, it is

δ: Q × Γ^k → Q × Γ^k × {L, R, S}^k,

where k is the number of tapes. The expression

δ(qi, a1, ..., ak) = (qj, b1, ..., bk, L, R, ..., L)

means that if the machine is in state qi and heads 1 through k are reading symbols a1 through ak, the machine goes to state qj, writes symbols b1 through bk, and directs each head to move left or right, or to stay put, as specified.
Multitape Turing machines appear to be more powerful than ordinary Turing machines, but we can show that they are equivalent in power. Recall that two machines are equivalent if they recognize the same language.
THEOREM 3.13
Every multitape Turing machine has an equivalent single-tape Turing machine.
PROOF We show how to convert a multitape TM M to an equivalent single-tape TM S. The key idea is to show how to simulate M with S.
Say that M has k tapes. Then S simulates the effect of k tapes by storing their information on its single tape. It uses the new symbol # as a delimiter to separate the contents of the different tapes. In addition to the contents of these tapes, S must keep track of the locations of the heads. It does so by writing a tape symbol with a dot above it to mark the place where the head on that tape would be. Think of these as “virtual” tapes and heads. As before, the “dotted” tape symbols are simply new symbols that have been added to the tape alphabet. The following figure illustrates how one tape can be used to represent three tapes.
FIGURE 3.14
Representing three tapes with one
S = “On input w = w1···wn:
1. First S puts its tape into the format that represents all k tapes of M. The formatted tape contains

#•w1w2···wn#•␣#•␣#···#.

2. To simulate a single move, S scans its tape from the first #, which marks the left-hand end, to the (k+1)st #, which marks the right-hand end, in order to determine the symbols under the virtual heads. Then S makes a second pass to update the tapes according to the way that M's transition function dictates.
3. If at any point S moves one of the virtual heads to the right onto a #, this action signifies that M has moved the corresponding head onto the previously unread blank portion of that tape. So S writes a blank symbol on this tape cell and shifts the tape contents, from this cell until the rightmost #, one unit to the right. Then it continues the simulation as before.”
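Stage 1's formatted tape is easy to construct explicitly. A sketch (Python; we mark a virtual head by prefixing its cell with "•", and use "_" for the blank ␣):

```python
def single_tape_encoding(w, k, blank="_"):
    """Build S's initial tape for a k-tape machine on input w:
    '#' delimits the k virtual tapes, and '•' precedes the cell that
    each virtual head is scanning (every head starts at the left)."""
    tapes = [list(w) if w else [blank]]           # tape 1 holds the input
    tapes += [[blank] for _ in range(k - 1)]      # tapes 2..k start blank
    pieces = ["#"]
    for cells in tapes:
        pieces.append("•" + cells[0])             # dot the head position
        pieces.extend(cells[1:])
        pieces.append("#")
    return "".join(pieces)

# single_tape_encoding("0101", 3) -> "#•0101#•_#•_#"
```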
COROLLARY 3.15
A language is Turing-recognizable if and only if some multitape Turing machine recognizes it.
PROOF A Turing-recognizable language is recognized by an ordinary (single-tape) Turing machine, which is a special case of a multitape Turing machine. That proves one direction of this corollary. The other direction follows from Theorem 3.13.
NONDETERMINISTIC TURING MACHINES
A nondeterministic Turing machine is defined in the expected way. At any point in a computation, the machine may proceed according to several possibilities. The transition function for a nondeterministic Turing machine has the form

δ: Q × Γ → P(Q × Γ × {L, R}).

The computation of a nondeterministic Turing machine is a tree whose branches correspond to different possibilities for the machine. If some branch of the computation leads to the accept state, the machine accepts its input. If you feel the need to review nondeterminism, turn to Section 1.2 (page 47). Now we show that nondeterminism does not affect the power of the Turing machine model.
THEOREM 3.16
Every nondeterministic Turing machine has an equivalent deterministic Turing machine.
PROOF IDEA We can simulate any nondeterministic TM N with a deterministic TM D. The idea behind the simulation is to have D try all possible branches of N's nondeterministic computation. If D ever finds the accept state on one of these branches, D accepts. Otherwise, D's simulation will not terminate.
We view N's computation on an input w as a tree. Each branch of the tree represents one of the branches of the nondeterminism. Each node of the tree is a configuration of N. The root of the tree is the start configuration. The TM D searches this tree for an accepting configuration. Conducting this search carefully is crucial lest D fail to visit the entire tree. A tempting, though bad, idea is to have D explore the tree by using depth-first search. The depth-first search strategy goes all the way down one branch before backing up to explore other branches. If D were to explore the tree in this manner, D could go forever down one infinite branch and miss an accepting configuration on some other branch. Hence we design D to explore the tree by using breadth-first search instead. This strategy explores all branches to the same depth before going on to explore any branch to the next depth. This method guarantees that D will visit every node in the tree until it encounters an accepting configuration.
PROOF The simulating deterministic TM D has three tapes. By Theorem 3.13, this arrangement is equivalent to having a single tape. The machine D uses its three tapes in a particular way, as illustrated in the following figure. Tape 1 always contains the input string and is never altered. Tape 2 maintains a copy of N's tape on some branch of its nondeterministic computation. Tape 3 keeps track of D's location in N's nondeterministic computation tree.
FIGURE 3.17
Deterministic TM D simulating nondeterministic TM N
Let's first consider the data representation on tape 3. Every node in the tree can have at most b children, where b is the size of the largest set of possible choices given by N's transition function. To every node in the tree we assign an address that is a string over the alphabet Γb = {1, 2, ..., b}. We assign the address 231 to the node we arrive at by starting at the root, going to its 2nd child, going to that node's 3rd child, and finally going to that node's 1st child. Each symbol in the string tells us which choice to make next when simulating a step in one branch in N's nondeterministic computation. Sometimes a symbol may not correspond to any choice if too few choices are available for a configuration. In that case, the address is invalid and doesn't correspond to any node. Tape 3 contains a string over Γb. It represents the branch of N's computation from the root to the node addressed by that string unless the address is invalid. The empty string is the address of the root of the tree. Now we are ready to describe D.
1. Initially, tape 1 contains the input w, and tapes 2 and 3 are empty.
2. Copy tape 1 to tape 2 and initialize the string on tape 3 to be ε.
3. Use tape 2 to simulate N with input w on one branch of its nondeterministic computation. Before each step of N, consult the next symbol on tape 3 to determine which choice to make among those allowed by N's transition function. If no more symbols remain on tape 3 or if this nondeterministic choice is invalid, abort this branch by going to stage 4. Also go to stage 4 if a rejecting configuration is encountered. If an accepting configuration is encountered, accept the input.
4. Replace the string on tape 3 with the next string in the string ordering. Simulate the next branch of N's computation by going to stage 2.
COROLLARY 3.18
A language is Turing-recognizable if and only if some nondeterministic Turing machine recognizes it.
PROOF Any deterministic TM is automatically a nondeterministic TM, and so one direction of this corollary follows immediately. The other direction follows from Theorem 3.16.
We can modify the proof of Theorem 3.16 so that if N always halts on all branches of its computation, D will always halt. We call a nondeterministic Turing machine a decider if all branches halt on all inputs. Exercise 3.3 asks you to modify the proof in this way to obtain the following corollary to Theorem 3.16.
COROLLARY 3.19
A language is decidable if and only if some nondeterministic Turing machine decides it.
ENUMERATORS
As we mentioned earlier, some people use the term recursively enumerable language for Turing-recognizable language. That term originates from a type of Turing machine variant called an enumerator. Loosely defined, an enumerator is a Turing machine with an attached printer. The Turing machine can use that printer as an output device to print strings. Every time the Turing machine wants to add a string to the list, it sends the string to the printer. Exercise 3.4 asks you to give a formal definition of an enumerator. The following figure depicts a schematic of this model.
FIGURE 3.20
Schematic of an enumerator
An enumerator E starts with a blank input on its work tape. If the enumerator doesn't halt, it may print an infinite list of strings. The language enumerated by E is the collection of all the strings that it eventually prints out. Moreover, E may generate the strings of the language in any order, possibly with repetitions. Now we are ready to develop the connection between enumerators and Turing-recognizable languages.
THEOREM 3.21
A language is Turing-recognizable if and only if some enumerator enumerates it.
PROOF First we show that if we have an enumerator E that enumerates a language A, a TM M recognizes A. The TM M works in the following way.
M = “On input w:
1. Run E. Every time that E outputs a string, compare it with w.
2. If w ever appears in the output of E, accept.”
Clearly, M accepts those strings that appear on E's list.
Now we do the other direction. If TM M recognizes a language A, we can construct the following enumerator E for A. Say that s1, s2, s3, ... is a list of all possible strings in Σ*.
E = “Ignore the input.
1. Repeat the following for i = 1, 2, 3, ....
2.   Run M for i steps on each input, s1, s2, ..., si.
3.   If any computations accept, print out the corresponding sj.”
If M accepts a particular string s, eventually it will appear on the list generated by E. In fact, it will appear on the list infinitely many times because M runs from the beginning on each string for each repetition of step 1. This procedure gives the effect of running M in parallel on all possible input strings.
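The "run M for i steps on s1, ..., si" schedule is a dovetailing loop. A sketch (Python; `accepts_within(s, steps)` is a stand-in for a step-bounded run of the recognizer M, so each call is guaranteed to return):

```python
from itertools import count, islice, product

def all_strings(alphabet=("0", "1")):
    """s1, s2, s3, ...: every string over the alphabet in string order."""
    yield ""
    for length in count(1):
        for tup in product(alphabet, repeat=length):
            yield "".join(tup)

def enumerate_language(accepts_within, alphabet=("0", "1")):
    """Enumerator E: on round i, run the recognizer for i steps on each
    of s1, ..., si, emitting whatever accepts (repetitions included)."""
    for i in count(1):                            # stage 1: i = 1, 2, 3, ...
        for s in islice(all_strings(alphabet), i):  # stage 2: s1 .. si
            if accepts_within(s, i):
                yield s                           # stage 3: "print" it
```

A string that M accepts within t steps is emitted on every round i that is at least as large as both t and the string's index, so it appears infinitely often, just as the proof observes.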
EQUIVALENCE WITH OTHER MODELS
So far we have presented several variants of the Turing machine model and have
shown them to be equivalent in power. Many other models of general purpose
computation have been proposed. Some of these models are very much
like Turing machines, but others are quite different. All share the essential feature
of Turing machines, namely, unrestricted access to unlimited memory, which
distinguishes them from weaker models such as finite automata and pushdown
automata. Remarkably, all models with that feature turn out to be equivalent in
power, so long as they satisfy reasonable requirements.3
3For example, one requirement is the ability to perform only a finite amount of work in
a single step.
CHAPTER 3 / THE CHURCH–TURING THESIS
To understand this phenomenon, consider the analogous situation for programming
languages. Many, such as Pascal and LISP, look quite different from
one another in style and structure. Can some algorithm be programmed in one
of them and not the others? Of course not: we can compile LISP into Pascal
and Pascal into LISP, which means that the two languages describe exactly the
same class of algorithms. So do all other reasonable programming languages.
The widespread equivalence of computational models holds for precisely the
same reason. Any two computational models that satisfy certain reasonable
requirements can simulate one another and hence are equivalent in power.
This equivalence phenomenon has an important philosophical corollary.
Even though we can imagine many different computational models, the class
of algorithms that they describe remains the same. Whereas each individual
computational model has a certain arbitrariness to its definition, the underlying
class of algorithms that it describes is natural because the other models arrive
at the same, unique class. This phenomenon has had profound implications for
mathematics, as we show in the next section.
3.3
THE DEFINITION OF ALGORITHM
Informally speaking, an algorithm is a collection of simple instructions for
carrying out some task. Commonplace in everyday life, algorithms sometimes are
called procedures or recipes. Algorithms also play an important role in mathematics.
Ancient mathematical literature contains descriptions of algorithms for a
variety of tasks, such as finding prime numbers and greatest common divisors.
In contemporary mathematics, algorithms abound.
Even though algorithms have had a long history in mathematics, the notion
of algorithm itself was not defined precisely until the twentieth century. Before
that, mathematicians had an intuitive notion of what algorithms were, and relied
upon that notion when using and describing them. But that intuitive notion was
insufficient for gaining a deeper understanding of algorithms. The following
story relates how the precise definition of algorithm was crucial to one important
mathematical problem.
HILBERT’S PROBLEMS
In 1900, mathematician David Hilbert delivered a now-famous address at the
International Congress of Mathematicians in Paris. In his lecture, he identified
23 mathematical problems and posed them as a challenge for the coming century.
The tenth problem on his list concerned algorithms.
Before describing that problem, let’s briefly discuss polynomials. A polynomial
is a sum of terms, where each term is a product of certain variables and a
constant, called a coefficient. For example,

6 · x · x · x · y · z · z = 6x³yz²

is a term with coefficient 6, and

6x³yz² + 3xy² − x³ − 10

is a polynomial with four terms, over the variables x, y, and z. For this discussion,
we consider only coefficients that are integers. A root of a polynomial is an
assignment of values to its variables so that the value of the polynomial is 0. This
polynomial has a root at x = 5, y = 3, and z = 0. This root is an integral root
because all the variables are assigned integer values. Some polynomials have an
integral root and some do not.
Hilbert’s tenth problem was to devise an algorithm that tests whether a polynomial
has an integral root. He did not use the term algorithm but rather “a
process according to which it can be determined by a finite number of
operations.”4 Interestingly, in the way he phrased this problem, Hilbert explicitly
asked that an algorithm be “devised.” Thus he apparently assumed that such an
algorithm must exist; someone need only find it.
As we now know, no algorithm exists for this task; it is algorithmically unsolvable.
For mathematicians of that period to come to this conclusion with their
intuitive concept of algorithm would have been virtually impossible. The intuitive
concept may have been adequate for giving algorithms for certain tasks, but
it was useless for showing that no algorithm exists for a particular task. Proving
that an algorithm does not exist requires having a clear definition of algorithm.
Progress on the tenth problem had to wait for that definition.
The definition came in the 1936 papers of Alonzo Church and Alan Turing.
Church used a notational system called the λ-calculus to define algorithms.
Turing did it with his “machines.” These two definitions were shown to be
equivalent. This connection between the informal notion of algorithm and the
precise definition has come to be called the Church–Turing thesis.
The Church–Turing thesis provides the definition of algorithm necessary to
resolve Hilbert’s tenth problem. In 1970, Yuri Matijasevič, building on the work
of Martin Davis, Hilary Putnam, and Julia Robinson, showed that no algorithm
exists for testing whether a polynomial has integral roots. In Chapter 4 we develop
the techniques that form the basis for proving that this and other problems
are algorithmically unsolvable.
Intuitive notion of algorithms equals Turing machine algorithms

FIGURE 3.22
The Church–Turing thesis
4Translated from the original German.
Let’s phrase Hilbert’s tenth problem in our terminology. Doing so helps to
introduce some themes that we explore in Chapters 4 and 5. Let

D = {p | p is a polynomial with an integral root}.

Hilbert’s tenth problem asks in essence whether the set D is decidable. The
answer is negative. In contrast, we can show that D is Turing-recognizable.
Before doing so, let’s consider a simpler problem. It is an analog of Hilbert’s
tenth problem for polynomials that have only a single variable, such as
4x³ − 2x² + x − 7. Let

D1 = {p | p is a polynomial over x with an integral root}.

Here is a TM M1 that recognizes D1:

M1 = “On input ⟨p⟩, where p is a polynomial over the variable x:
1. Evaluate p with x set successively to the values 0, 1, −1, 2, −2, 3,
−3, .... If at any point the polynomial evaluates to 0, accept.”
If p has an integral root, M1 eventually will find it and accept. If p does not have
an integral root, M1 will run forever. For the multivariable case, we can present
a similar TM M that recognizes D. Here, M goes through all possible settings of
its variables to integral values.
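The behavior of M1 can be sketched as a short program. The coefficient-list representation and the max_candidates cutoff are our own additions for the sake of a runnable example; the machine itself, like any recognizer, simply runs forever when no root exists.

```python
def integer_candidates():
    """Yield 0, 1, -1, 2, -2, 3, -3, ... -- the order M1 uses."""
    yield 0
    n = 1
    while True:
        yield n
        yield -n
        n += 1

def recognize_root(coeffs, max_candidates=None):
    """Sketch of M1.  coeffs lists coefficients from the highest-order term
    down, so [4, -2, 1, -7] encodes 4x^3 - 2x^2 + x - 7.  Returns the first
    integral root found; with max_candidates=None it runs forever when no
    root exists, exactly as M1 does."""
    for tried, x in enumerate(integer_candidates()):
        if max_candidates is not None and tried >= max_candidates:
            return None                # artificial cutoff: "still running"
        value = 0
        for c in coeffs:               # evaluate p(x) by Horner's rule
            value = value * x + c
        if value == 0:
            return x                   # accept
```

For example, recognize_root([1, 0, -4]) finds the root 2 of x² − 4 after trying 0, 1, −1, 2, whereas on 4x³ − 2x² + x − 7 (which has no integral root) the search never accepts.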
Both M1 and M are recognizers but not deciders. We can convert M1 to be
a decider for D1 because we can calculate bounds within which the roots of a
single variable polynomial must lie and restrict the search to these bounds. In
Problem 3.21 you are asked to show that the roots of such a polynomial must lie
between the values

±k (cmax / c1),

where k is the number of terms in the polynomial, cmax is the coefficient with
the largest absolute value, and c1 is the coefficient of the highest order term. If a
root is not found within these bounds, the machine rejects. Matijasevič’s theorem
shows that calculating such bounds for multivariable polynomials is impossible.
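The resulting decider can be sketched as follows, assuming the Problem 3.21 bound. Because the bound is strict, rounding it down to an integer still covers every possible integral root.

```python
def decide_root(coeffs):
    """Sketch of a decider for D1.  Every root x0 satisfies
    |x0| < k * cmax / |c1| (Problem 3.21, with k the number of terms),
    so testing each integer in that range decides membership."""
    k = len(coeffs)
    cmax = max(abs(c) for c in coeffs)
    bound = k * cmax // abs(coeffs[0])   # floor is safe: the bound is strict
    for x in range(-bound, bound + 1):
        value = 0
        for c in coeffs:                 # Horner's rule again
            value = value * x + c
        if value == 0:
            return True                  # accept: integral root found
    return False                         # reject: no root within the bounds
```

Unlike recognize_root above, this version always halts, which is exactly what distinguishes a decider from a recognizer.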
TERMINOLOGY FOR DESCRIBING TURING MACHINES
We have come to a turning point in the study of the theory of computation. We
continue to speak of Turing machines, but our real focus from now on is on
algorithms. That is, the Turing machine merely serves as a precise model for the
definition of algorithm. We skip over the extensive theory of Turing machines
themselves and do not spend much time on the low-level programming of Turing
machines. We need only to be comfortable enough with Turing machines to
believe that they capture all algorithms.
With that in mind, let’s standardize the way we describe Turing machine
algorithms. Initially, we ask: What is the right level of detail to give when describing
such algorithms? Students commonly ask this question, especially when preparing
solutions to exercises and problems. Let’s entertain three possibilities. The
first is the formal description that spells out in full the Turing machine’s states,
transition function, and so on. It is the lowest, most detailed level of description.
The second is a higher level of description, called the implementation description,
in which we use English prose to describe the way that the Turing machine
moves its head and the way that it stores data on its tape. At this level we do not
give details of states or transition function. The third is the high-level description,
wherein we use English prose to describe an algorithm, ignoring the implementation
details. At this level we do not need to mention how the machine manages
its tape or head.
In this chapter, we have given formal and implementation-level descriptions
of various examples of Turing machines. Practicing with lower level Turing machine
descriptions helps you understand Turing machines and gain confidence
in using them. Once you feel confident, high-level descriptions are sufficient.
We now set up a format and notation for describing Turing machines. The
input to a Turing machine is always a string. If we want to provide an object other
than a string as input, we must first represent that object as a string. Strings
can easily represent polynomials, graphs, grammars, automata, and any combination
of those objects. A Turing machine may be programmed to decode the
representation so that it can be interpreted in the way we intend. Our notation
for the encoding of an object O into its representation as a string is ⟨O⟩. If
we have several objects O1, O2, ..., Ok, we denote their encoding into a single
string ⟨O1, O2, ..., Ok⟩. The encoding itself can be done in many reasonable
ways. It doesn’t matter which one we pick because a Turing machine can always
translate one such encoding into another.
In our format, we describe Turing machine algorithms with an indented segment
of text within quotes. We break the algorithm into stages, each usually
involving many individual steps of the Turing machine’s computation. We indicate
the block structure of the algorithm with further indentation. The first line
of the algorithm describes the input to the machine. If the input description is
simply w, the input is taken to be a string. If the input description is the encoding
of an object as in ⟨A⟩, the Turing machine first implicitly tests whether the
input properly encodes an object of the desired form and rejects it if it doesn’t.
EXAMPLE 3.23
Let A be the language consisting of all strings representing undirected graphs
that are connected. Recall that a graph is connected if every node can be reached
from every other node by traveling along the edges of the graph. We write

A = {⟨G⟩ | G is a connected undirected graph}.
The following is a high-level description of a TM M that decides A.

M = “On input ⟨G⟩, the encoding of a graph G:
1. Select the first node of G and mark it.
2. Repeat the following stage until no new nodes are marked:
3. For each node in G, mark it if it is attached by an edge to a
node that is already marked.
4. Scan all the nodes of G to determine whether they all are
marked. If they are, accept; otherwise, reject.”
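The four stages translate directly into a graph-marking routine. The Python representation below (explicit node and edge lists) is just a convenient stand-in for the tape encoding ⟨G⟩:

```python
def decides_connected(nodes, edges):
    """Sketch of M's stages on a graph given as a node list and edge list."""
    if not nodes:
        return True                          # vacuously connected
    marked = {nodes[0]}                      # stage 1: mark the first node
    changed = True
    while changed:                           # stage 2: until no new marks
        changed = False
        for u, v in edges:                   # stage 3: spread marks along edges
            if (u in marked) != (v in marked):
                marked.update((u, v))
                changed = True
    return marked == set(nodes)              # stage 4: accept iff all marked
```

Each pass over the edge list corresponds to one execution of stage 3; the loop stops exactly when a pass marks nothing new, matching the condition in stage 2.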
For additional practice, let’s examine some implementation-level details of
Turing machine M. Usually we won’t give this level of detail in the future and
you won’t need to either, unless specifically requested to do so in an exercise.
First, we must understand how ⟨G⟩ encodes the graph G as a string. Consider
an encoding that is a list of the nodes of G followed by a list of the edges of G.
Each node is a decimal number, and each edge is the pair of decimal numbers
that represent the nodes at the two endpoints of the edge. The following figure
depicts such a graph and its encoding.
FIGURE 3.24
A graph G and its encoding ⟨G⟩
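To make the encoding concrete, here is one such encoding sketched in Python. The particular separators (parentheses and commas) are our own choice; as noted above, any reasonable encoding would do. The decoder also performs the format checks that M makes before stage 1:

```python
import re

def encode_graph(nodes, edges):
    """One reasonable <G>: the node list, then the edge list, with each
    edge written as the pair of its endpoints' numbers."""
    node_part = ",".join(str(n) for n in nodes)
    edge_part = ",".join(f"({u},{v})" for u, v in edges)
    return f"({node_part})({edge_part})"

def decode_graph(s):
    """Inverse of encode_graph, performing the checks M makes before
    stage 1: two lists in proper form, no repeated nodes, and every edge
    endpoint on the node list.  Raises ValueError on a bad encoding."""
    m = re.fullmatch(r"\(([\d,]*)\)\((.*)\)", s)
    if m is None:
        raise ValueError("not two parenthesized lists")
    nodes = [int(x) for x in m.group(1).split(",") if x]
    if len(nodes) != len(set(nodes)):
        raise ValueError("repeated node")          # distinctness check
    edges = [(int(u), int(v))
             for u, v in re.findall(r"\((\d+),(\d+)\)", m.group(2))]
    for u, v in edges:
        if u not in nodes or v not in nodes:
            raise ValueError("edge endpoint missing from node list")
    return nodes, edges
```

A round trip such as decode_graph(encode_graph([1, 2, 3], [(1, 2), (2, 3)])) recovers the original node and edge lists, mirroring how a TM can translate one reasonable encoding into another.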
When M receives the input ⟨G⟩, it first checks to determine whether the
input is the proper encoding of some graph. To do so, M scans the tape to be
sure that there are two lists and that they are in the proper form. The first list
should be a list of distinct decimal numbers, and the second should be a list of
pairs of decimal numbers. Then M checks several things. First, the node list
should contain no repetitions; and second, every node appearing on the edge list
should also appear on the node list. For the first, we can use the procedure given
in Example 3.12 for TM M4 that checks element distinctness. A similar method
works for the second check. If the input passes these checks, it is the encoding
of some graph G. This verification completes the input check, and M goes on
to stage 1.
For stage 1, M marks the first node with a dot on the leftmost digit.
For stage 2, M scans the list of nodes to find an undotted node n1 and flags
it by marking it differently, say by underlining the first symbol. Then M scans
the list again to find a dotted node n2 and underlines it, too.
Now M scans the list of edges. For each edge, M tests whether the two
underlined nodes n1 and n2 are the ones appearing in that edge. If they are,
M dots n1, removes the underlines, and goes on from the beginning of stage 2.
If they aren’t, M checks the next edge on the list. If there are no more edges,
{n1, n2} is not an edge of G. Then M moves the underline on n2 to the next
dotted node and now calls this node n2. It repeats the steps in this paragraph
to check, as before, whether the new pair {n1, n2} is an edge. If there are no
more dotted nodes, n1 is not attached to any dotted nodes. Then M sets the
underlines so that n1 is the next undotted node and n2 is the first dotted node
and repeats the steps in this paragraph. If there are no more undotted nodes, M
has not been able to find any new nodes to dot, so it moves on to stage 4.
For stage 4, M scans the list of nodes to determine whether all are dotted.
If they are, it enters the accept state; otherwise, it enters the reject state. This
completes the description of TM M.
EXERCISES
3.1 This exercise concerns TM M2, whose description and state diagram appear in Example
3.7. In each of the parts, give the sequence of configurations that M2 enters
when started on the indicated input string.
a. 0.
Ab. 00.
c. 000.
d. 000000.
3.2 This exercise concerns TM M1, whose description and state diagram appear in Example
3.9. In each of the parts, give the sequence of configurations that M1 enters
when started on the indicated input string.
Aa. 11.
b. 1#1.
c. 1##1.
d. 10#11.
e. 10#10.
A3.3 Modify the proof of Theorem 3.16 to obtain Corollary 3.19, showing that a language
is decidable iff some nondeterministic Turing machine decides it. (You may
assume the following theorem about trees. If every node in a tree has finitely many
children and every branch of the tree has finitely many nodes, the tree itself has
finitely many nodes.)
3.4 Give a formal definition of an enumerator. Consider it to be a type of two-tape
Turing machine that uses its second tape as the printer. Include a definition of the
enumerated language.
A3.5 Examine the formal definition of a Turing machine to answer the following questions,
and explain your reasoning.
a. Can a Turing machine ever write the blank symbol ␣ on its tape?
b. Can the tape alphabet Γ be the same as the input alphabet Σ?
c. Can a Turing machine’s head ever be in the same location in two successive
steps?
d. Can a Turing machine contain just a single state?
3.6 In Theorem 3.21, we showed that a language is Turing-recognizable iff some enumerator
enumerates it. Why didn’t we use the following simpler algorithm for the
forward direction of the proof? As before, s1, s2, ... is a list of all strings in Σ*.

E = “Ignore the input.
1. Repeat the following for i = 1, 2, 3, ....
2. Run M on si.
3. If it accepts, print out si.”
3.7 Explain why the following is not a description of a legitimate Turing machine.

Mbad = “On input ⟨p⟩, a polynomial over variables x1, ..., xk:
1. Try all possible settings of x1, ..., xk to integer values.
2. Evaluate p on all of these settings.
3. If any of these settings evaluates to 0, accept; otherwise, reject.”
3.8 Give implementation-level descriptions of Turing machines that decide the following
languages over the alphabet {0,1}.
Aa. {w | w contains an equal number of 0s and 1s}
b. {w | w contains twice as many 0s as 1s}
c. {w | w does not contain twice as many 0s as 1s}
PROBLEMS
3.9 Let a k-PDA be a pushdown automaton that has k stacks. Thus a 0-PDA is an
NFA and a 1-PDA is a conventional PDA. You already know that 1-PDAs are more
powerful (recognize a larger class of languages) than 0-PDAs.
a. Show that 2-PDAs are more powerful than 1-PDAs.
b. Show that 3-PDAs are not more powerful than 2-PDAs.
(Hint: Simulate a Turing machine tape with two stacks.)
A3.10 Say that a write-once Turing machine is a single-tape TM that can alter each tape
square at most once (including the input portion of the tape). Show that this variant
Turing machine model is equivalent to the ordinary Turing machine model. (Hint:
As a first step, consider the case whereby the Turing machine may alter each tape
square at most twice. Use lots of tape.)
3.11 A Turing machine with doubly infinite tape is similar to an ordinary Turing machine,
but its tape is infinite to the left as well as to the right. The tape is initially
filled with blanks except for the portion that contains the input. Computation is
defined as usual except that the head never encounters an end to the tape as it
moves leftward. Show that this type of Turing machine recognizes the class of
Turing-recognizable languages.
3.12 A Turing machine with left reset is similar to an ordinary Turing machine, but the
transition function has the form

δ: Q × Γ → Q × Γ × {R, RESET}.

If δ(q, a) = (r, b, RESET), when the machine is in state q reading an a, the machine’s
head jumps to the left-hand end of the tape after it writes b on the tape and
enters state r. Note that these machines do not have the usual ability to move the
head one symbol left. Show that Turing machines with left reset recognize the class
of Turing-recognizable languages.
3.13 A Turing machine with stay put instead of left is similar to an ordinary Turing
machine, but the transition function has the form

δ: Q × Γ → Q × Γ × {R, S}.

At each point, the machine can move its head right or let it stay in the same position.
Show that this Turing machine variant is not equivalent to the usual version.
What class of languages do these machines recognize?
3.14 A queue automaton is like a push-down automaton except that the stack is replaced
by a queue. A queue is a tape allowing symbols to be written only on the left-hand
end and read only at the right-hand end. Each write operation (we’ll call it a push)
adds a symbol to the left-hand end of the queue and each read operation (we’ll
call it a pull) reads and removes a symbol at the right-hand end. As with a PDA,
the input is placed on a separate read-only input tape, and the head on the input
tape can move only from left to right. The input tape contains a cell with a blank
symbol following the input, so that the end of the input can be detected. A queue
automaton accepts its input by entering a special accept state at any time. Show that
a language can be recognized by a deterministic queue automaton iff the language
is Turing-recognizable.
3.15 Show that the collection of decidable languages is closed under the operation of
Aa. union.
b. concatenation.
c. star.
d. complementation.
e. intersection.
3.16 Show that the collection of Turing-recognizable languages is closed under the operation
of
Aa. union.
b. concatenation.
c. star.
d. intersection.
e. homomorphism.
⋆3.17 Let B = {⟨M1⟩, ⟨M2⟩, ...} be a Turing-recognizable language consisting of TM
descriptions. Show that there is a decidable language C consisting of TM descriptions
such that every machine described in B has an equivalent machine in C and
vice versa.
⋆3.18 Show that a language is decidable iff some enumerator enumerates the language in
the standard string order.
⋆3.19 Show that every infinite Turing-recognizable language has an infinite decidable
subset.
⋆3.20 Show that single-tape TMs that cannot write on the portion of the tape containing
the input string recognize only regular languages.
3.21 Let c1xⁿ + c2xⁿ⁻¹ + ··· + cnx + cn+1 be a polynomial with a root at x = x0. Let
cmax be the largest absolute value of a ci. Show that

|x0| < (n + 1) cmax / |c1|.
A3.22 Let A be the language containing only the single string s, where

s = 0 if life never will be found on Mars;
s = 1 if life will be found on Mars someday.

Is A decidable? Why or why not? For the purposes of this problem, assume that
the question of whether life will be found on Mars has an unambiguous YES or NO
answer.
SELECTED SOLUTIONS
3.1 (b) q₁00, ␣q₂0, ␣xq₃␣, ␣q₅x␣, q₅␣x␣, ␣q₂x␣, ␣xq₂␣, ␣x␣q_accept.
3.2 (a) q₁11, xq₃1, x1q₃␣, x1␣q_reject.
3.3 We prove both directions of the iff. First, if a language L is decidable, it can be
decided by a deterministic Turing machine, and that is automatically a nondeterministic
Turing machine.
Second, if a language L is decided by a nondeterministic TM N, we modify the
deterministic TM D that was given in the proof of Theorem 3.16 as follows.
Move stage 4 to be stage 5.
Add new stage 4: Reject if all branches of N’s nondeterminism have rejected.
We argue that this new TM D′ is a decider for L. If N accepts its input, D′ will
eventually find an accepting branch and accept, too. If N rejects its input, all of
its branches halt and reject because it is a decider. Hence each of the branches has
finitely many nodes, where each node represents one step of N’s computation along
that branch. Therefore, N’s entire computation tree on this input is finite, by virtue
of the theorem about trees given in the statement of the exercise. Consequently,
D′ will halt and reject when this entire tree has been explored.
3.5 (a) Yes. The tape alphabet Γ contains ␣. A Turing machine can write any characters
in Γ on its tape.
(b) No. Σ never contains ␣, but Γ always contains ␣. So they cannot be equal.
(c) Yes. If the Turing machine attempts to move its head off the left-hand end of
the tape, it remains on the same tape cell.
(d) No. Any Turing machine must contain two distinct states: q_accept and q_reject. So,
a Turing machine contains at least two states.
3.8 (a) “On input string w:
1. Scan the tape and mark the first 0 that has not been marked. If
no unmarked 0 is found, go to stage 4. Otherwise, move the
head back to the front of the tape.
2. Scan the tape and mark the first 1 that has not been marked. If
no unmarked 1 is found, reject.
3. Move the head back to the front of the tape and go to stage 1.
4. Move the head back to the front of the tape. Scan the tape to see
if any unmarked 1s remain. If none are found, accept; otherwise,
reject.”
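The staged marking procedure can be mirrored on an explicit list “tape” in Python; writing x over a symbol plays the role of marking it. This is an illustrative sketch of the solution’s logic, not of the head movements themselves.

```python
def decides_equal_0s_1s(w):
    """The solution's stages on an explicit list tape; each unmarked 0 is
    paired with an unmarked 1, and 'x' marks a consumed symbol."""
    tape = list(w)
    while True:
        try:
            i = tape.index("0")        # stage 1: first unmarked 0
        except ValueError:
            break                      # none left: go to stage 4
        tape[i] = "x"
        try:
            j = tape.index("1")        # stage 2: first unmarked 1
        except ValueError:
            return False               # reject: a 0 without a matching 1
        tape[j] = "x"
        # stage 3: head returns to the front; repeat from stage 1
    return "1" not in tape             # stage 4: accept iff no 1 remains
```

The function accepts exactly the strings with equally many 0s and 1s, since every pass removes one of each and the final scan catches any surplus 1s.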
3.10 We first simulate an ordinary Turing machine by a write-twice Turing machine.
The write-twice machine simulates a single step of the original machine by copying
the entire tape over to a fresh portion of the tape to the right-hand side of the
currently used portion. The copying procedure operates character by character,
marking a character as it is copied. This procedure alters each tape square twice:
once to write the character for the first time, and again to mark that it has been
copied. The position of the original Turing machine’s tape head is marked on
the tape. When copying the cells at or adjacent to the marked position, the tape
content is updated according to the rules of the original Turing machine.
To carry out the simulation with a write-once machine, operate as before, except
that each cell of the previous tape is now represented by two cells. The first of these
contains the original machine’s tape symbol and the second is for the mark used in
the copying procedure. The input is not presented to the machine in the format
with two cells per symbol, so the very first time the tape is copied, the copying
marks are put directly over the input symbols.
3.15 (a) For any two decidable languages L1 and L2, let M1 and M2 be the TMs that
decide them. We construct a TM M′ that decides the union of L1 and L2:

“On input w:
1. Run M1 on w. If it accepts, accept.
2. Run M2 on w. If it accepts, accept. Otherwise, reject.”

M′ accepts w if either M1 or M2 accepts it. If both reject, M′ rejects.
3.16 (a) For any two Turing-recognizable languages L1 and L2, let M1 and M2 be the
TMs that recognize them. We construct a TM M′ that recognizes the union of L1
and L2:
"On input w:
1. Run M1 and M2 alternately on w, step by step. If either accepts,
accept. If both halt and reject, reject."
If either M1 or M2 accepts w, M′ accepts w because the accepting TM arrives at its
accepting state after a finite number of steps. Note that if both M1 and M2 reject
and either of them does so by looping, then M′ will loop.
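The alternating, step-by-step simulation in this solution can be sketched in Python. The "machines" below are toy stand-ins (generators yielding one status per step), not encoded Turing machines; the names `machine` and `union_recognizer` are illustrative choices:

```python
def machine(steps, verdict):
    """Toy step-wise simulation: yield "running" for `steps` steps,
    then yield the final verdict; verdict None models a looping machine."""
    n = 0
    while verdict is None or n < steps:
        yield "running"
        n += 1
    yield verdict

def union_recognizer(sim1, sim2):
    """Recognize L1 ∪ L2 by running both simulations alternately,
    one step each. Accept as soon as either accepts; reject only if
    both halt and reject; loop if neither ever accepts and at least
    one loops -- exactly the behavior of M′ described above."""
    sims = [sim1, sim2]
    verdicts = [None, None]
    while True:
        for i in (0, 1):
            if verdicts[i] is None:
                status = next(sims[i])
                if status == "accept":
                    return "accept"
                if status == "reject":
                    verdicts[i] = "reject"
        if verdicts == ["reject", "reject"]:
            return "reject"
```

Note that `union_recognizer(machine(0, None), machine(4, "accept"))` returns "accept" even though the first machine loops, because the step-wise alternation never lets the looping machine starve the other.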
3.22 The language A is one of the two languages {0} or {1}. In either case, the language
is finite and hence decidable. If you aren't able to determine which of these two
languages is A, you won't be able to describe the decider for A. However, you can
give two Turing machines, one of which is A's decider.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
4
DECIDABILITY
In Chapter 3 we introduced the Turing machine as a model of a general purpose
computer and defined the notion of algorithm in terms of Turing machines by
means of the Church–Turing thesis.
In this chapter we begin to investigate the power of algorithms to solve prob-
lems. We demonstrate certain problems that can be solved algorithmically and
others that cannot. Our objective is to explore the limits of algorithmic solv-
ability. You are probably familiar with solvability by algorithms because much of
computer science is devoted to solving problems. The unsolvability of certain
problems may come as a surprise.
Why should you study unsolvability? After all, showing that a problem is
unsolvable doesn’t appear to be of any use if you have to solve it. You need
to study this phenomenon for two reasons. First, knowing when a problem is
algorithmically unsolvable is useful because then you realize that the problem
must be simplified or altered before you can find an algorithmic solution. Like
any tool, computers have capabilities and limitations that must be appreciated if
they are to be used well. The second reason is cultural. Even if you deal with
problems that clearly are solvable, a glimpse of the unsolvable can stimulate your
imagination and help you gain an important perspective on computation.
4.1
DECIDABLE LANGUAGES
In this section we give some examples of languages that are decidable by al-
gorithms. We focus on languages concerning automata and grammars. For
example, we present an algorithm that tests whether a string is a member of a
context-free language (CFL). These languages are interesting for several reasons.
First, certain problems of this kind are related to applications. This problem of
testing whether a CFG generates a string is related to the problem of recogniz-
ing and compiling programs in a programming language. Second, certain other
problems concerning automata and grammars are not decidable by algorithms.
Starting with examples where decidability is possible helps you to appreciate the
undecidable examples.
DECIDABLE PROBLEMS CONCERNING
REGULAR LANGUAGES
We begin with certain computational problems concerning finite automata. We
give algorithms for testing whether a finite automaton accepts a string, whether
the language of a finite automaton is empty, and whether two finite automata are
equivalent.
Note that we chose to represent various computational problems by lan-
guages. Doing so is convenient because we have already set up terminology for
dealing with languages. For example, the acceptance problem for DFAs of testing
whether a particular deterministic finite automaton accepts a given string can be
expressed as a language, A_DFA. This language contains the encodings of all DFAs
together with strings that the DFAs accept. Let

A_DFA = {⟨B, w⟩ | B is a DFA that accepts input string w}.

The problem of testing whether a DFA B accepts an input w is the same as the
problem of testing whether ⟨B, w⟩ is a member of the language A_DFA. Similarly,
we can formulate other computational problems in terms of testing membership
in a language. Showing that the language is decidable is the same as showing
that the computational problem is decidable.
In the following theorem we show that A_DFA is decidable. Hence this theorem
shows that the problem of testing whether a given finite automaton accepts a
given string is decidable.
THEOREM 4.1
A_DFA is a decidable language.
PROOF IDEA We simply need to present a TM M that decides A_DFA.
M = "On input ⟨B, w⟩, where B is a DFA and w is a string:
1. Simulate B on input w.
2. If the simulation ends in an accept state, accept. If it ends in a
nonaccepting state, reject."
PROOF We mention just a few implementation details of this proof. For
those of you familiar with writing programs in any standard programming lan-
guage, imagine how you would write a program to carry out the simulation.
First, let's examine the input ⟨B, w⟩. It is a representation of a DFA B together
with a string w. One reasonable representation of B is simply a list of its five
components: Q, Σ, δ, q0, and F. When M receives its input, M first determines
whether it properly represents a DFA B and a string w. If not, M rejects.
Then M carries out the simulation directly. It keeps track of B's current
state and B's current position in the input w by writing this information down
on its tape. Initially, B's current state is q0 and B's current input position is
the leftmost symbol of w. The states and position are updated according to the
specified transition function δ. When M finishes processing the last symbol of
w, M accepts the input if B is in an accepting state; M rejects the input if B is
in a nonaccepting state.
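The simulation M performs can be sketched directly in Python. The five-tuple encoding below is one reasonable choice, mirroring the list of components described in the proof; the example DFA is illustrative:

```python
def dfa_accepts(dfa, w):
    """Decide A_DFA for one encoded pair: simulate DFA B on string w,
    tracking only B's current state, just as M does on its tape.
    The DFA is encoded as (states, alphabet, delta, start, accepting)."""
    states, alphabet, delta, start, accepting = dfa
    if any(sym not in alphabet for sym in w):
        return False  # M rejects inputs that are not properly encoded
    state = start
    for sym in w:
        state = delta[(state, sym)]
    return state in accepting

# A DFA accepting binary strings with an even number of 1s:
even_ones = (
    {"q0", "q1"},
    {"0", "1"},
    {("q0", "0"): "q0", ("q0", "1"): "q1",
     ("q1", "0"): "q1", ("q1", "1"): "q0"},
    "q0",
    {"q0"},
)
```

The simulation halts after exactly |w| transition lookups, which is why M is a decider and not merely a recognizer.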
We can prove a similar theorem for nondeterministic finite automata. Let

A_NFA = {⟨B, w⟩ | B is an NFA that accepts input string w}.
THEOREM 4.2
A_NFA is a decidable language.
PROOF We present a TM N that decides A_NFA. We could design N to operate
like M, simulating an NFA instead of a DFA. Instead, we'll do it differently to
illustrate a new idea: Have N use M as a subroutine. Because M is designed
to work with DFAs, N first converts the NFA it receives as input to a DFA before
passing it to M.
N = "On input ⟨B, w⟩, where B is an NFA and w is a string:
1. Convert NFA B to an equivalent DFA C, using the procedure for
this conversion given in Theorem 1.39.
2. Run TM M from Theorem 4.1 on input ⟨C, w⟩.
3. If M accepts, accept; otherwise, reject."
Running TM M in stage 2 means incorporating M into the design of N as a
subprocedure.
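A sketch of N's strategy in Python: rather than materializing the DFA C up front, the subset construction of Theorem 1.39 can be run on the fly, so the "current DFA state" is the set of NFA states currently possible. The NFA is encoded as a five-tuple whose transition function maps to sets of states, with ε represented by the empty string; this encoding is an illustrative choice:

```python
def eps_closure(delta, states):
    """All states reachable from `states` by ε-transitions alone."""
    stack, closed = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in closed:
                closed.add(r)
                stack.append(r)
    return closed

def nfa_accepts(nfa, w):
    """Decide A_NFA: the subset (powerset) construction, performed
    step by step instead of built in advance."""
    states, alphabet, delta, start, accepting = nfa
    current = eps_closure(delta, {start})
    for sym in w:
        moved = set()
        for q in current:
            moved |= delta.get((q, sym), set())
        current = eps_closure(delta, moved)
    return bool(current & accepting)

# An NFA accepting binary strings that end in "01":
ends_01 = (
    {"q0", "q1", "q2"},
    {"0", "1"},
    {("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0"},
     ("q1", "1"): {"q2"}},
    "q0",
    {"q2"},
)
```

Building C lazily this way visits only the subsets actually reached on w, but it decides exactly the same language as the two-stage procedure in the proof.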
Similarly, we can determine whether a regular expression generates a given
string. Let A_REX = {⟨R, w⟩ | R is a regular expression that generates string w}.
THEOREM 4.3
A_REX is a decidable language.
PROOF The following TM P decides A_REX.
P = "On input ⟨R, w⟩, where R is a regular expression and w is a string:
1. Convert regular expression R to an equivalent NFA A by using
the procedure for this conversion given in Theorem 1.54.
2. Run TM N on input ⟨A, w⟩.
3. If N accepts, accept; if N rejects, reject."
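As a library-level illustration only: Python's `re` module implements a superset of the regular expressions defined in Chapter 1, so when R is restricted to the pure regular operators (literals, union `|`, concatenation, star `*`, and parentheses), `re.fullmatch` behaves as a decider for A_REX. This sketch substitutes a production regex engine for the Theorem 1.54 conversion; it is not the construction in the proof:

```python
import re

def regex_generates(r, w):
    """Decider sketch for A_REX. Valid only when r uses the pure
    regular operators of Chapter 1; Python's re syntax is a superset,
    so arbitrary re patterns are outside the scope of this sketch."""
    return re.fullmatch(r, w) is not None
```

For example, `regex_generates("(0|1)*0", "110")` holds because the string ends in 0, while `regex_generates("1*", "10")` does not.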
Theorems 4.1, 4.2, and 4.3 illustrate that, for decidability purposes, it is
equivalent to present the Turing machine with a DFA, an NFA, or a regular ex-
pression because the machine can convert one form of encoding to another.
Now we turn to a different kind of problem concerning finite automata:
emptiness testing for the language of a finite automaton. In the preceding three
theorems we had to determine whether a finite automaton accepts a particular
string. In the next proof we must determine whether or not a finite automaton
accepts any strings at all. Let

E_DFA = {⟨A⟩ | A is a DFA and L(A) = ∅}.
THEOREM 4.4
E_DFA is a decidable language.
PROOF A DFA accepts some string iff reaching an accept state from the start
state by traveling along the arrows of the DFA is possible. To test this condition,
we can design a TM T that uses a marking algorithm similar to that used in
Example 3.23.
T = "On input ⟨A⟩, where A is a DFA:
1. Mark the start state of A.
2. Repeat until no new states get marked:
3.   Mark any state that has a transition coming into it from any
state that is already marked.
4. If no accept state is marked, accept; otherwise, reject."
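T's marking algorithm is a reachability computation. A sketch in Python, with the DFA encoded as a five-tuple (an illustrative choice; only the transition table, start state, and accept states are actually consulted):

```python
def dfa_language_empty(dfa):
    """Decider T for E_DFA: mark the start state, then repeatedly mark
    every state with a transition coming into it from a marked state.
    Accept (return True: L(A) = ∅) iff no accept state ends up marked."""
    states, alphabet, delta, start, accepting = dfa
    marked = {start}
    changed = True
    while changed:
        changed = False
        for (q, sym), r in delta.items():
            if q in marked and r not in marked:
                marked.add(r)
                changed = True
    return not (marked & accepting)

# A DFA whose only accept state "c" is unreachable from the start state,
# so its language is empty:
unreachable_accept = (
    {"a", "b", "c"},
    {"0"},
    {("a", "0"): "b", ("b", "0"): "a", ("c", "0"): "c"},
    "a",
    {"c"},
)
```

The outer loop terminates because each pass either marks a new state or stops, and there are finitely many states, which is what makes T a decider.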
The next theorem states that determining whether two DFAs recognize the
same language is decidable. Let

EQ_DFA = {⟨A, B⟩ | A and B are DFAs and L(A) = L(B)}.
THEOREM 4.5
EQ_DFA is a decidable language.
PROOF To prove this theorem, we use Theorem 4.4. We construct a new
DFA C from A and B, where C accepts only those strings that are accepted by
either A or B but not by both. Thus, if A and B recognize the same language,
C will accept nothing. The language of C is

L(C) = ( L(A) ∩ complement(L(B)) ) ∪ ( complement(L(A)) ∩ L(B) ).

This expression is sometimes called the symmetric difference of L(A) and L(B)
and is illustrated in the following figure. Here, complement(L(A)) denotes the
complement of L(A).
The symmetric difference is useful here because L(C) = ∅ iff L(A) = L(B).
We can construct C from A and B with the constructions for proving the class
of regular languages closed under complementation, union, and intersection.
These constructions are algorithms that can be carried out by Turing machines.
Once we have constructed C, we can use Theorem 4.4 to test whether L(C) is
empty. If it is empty, L(A) and L(B) must be equal.
F = "On input ⟨A, B⟩, where A and B are DFAs:
1. Construct DFA C as described.
2. Run TM T from Theorem 4.4 on input ⟨C⟩.
3. If T accepts, accept. If T rejects, reject."
FIGURE 4.6
The symmetric difference of L(A) and L(B)
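The construction of C and the emptiness test can be sketched together in Python. The product construction below realizes the closure operations used in the proof in one step: C's states are pairs of A- and B-states, and a pair accepts when exactly one component does. Both DFAs are encoded as five-tuples over the same alphabet (an illustrative choice):

```python
def symmetric_difference_dfa(A, B):
    """Product construction for C with
    L(C) = (L(A) ∩ ~L(B)) ∪ (~L(A) ∩ L(B)):
    a pair state accepts iff exactly one of its components accepts."""
    sA, alpha, dA, qA, fA = A
    sB, _, dB, qB, fB = B
    states = {(p, q) for p in sA for q in sB}
    delta = {((p, q), a): (dA[(p, a)], dB[(q, a)])
             for (p, q) in states for a in alpha}
    accepting = {(p, q) for (p, q) in states if (p in fA) != (q in fB)}
    return states, alpha, delta, (qA, qB), accepting

def dfas_equivalent(A, B):
    """Decider F for EQ_DFA: build C, then test L(C) = ∅ by marking
    the states reachable from C's start state (Theorem 4.4)."""
    _, alpha, delta, start, accepting = symmetric_difference_dfa(A, B)
    marked, frontier = {start}, [start]
    while frontier:
        p = frontier.pop()
        for a in alpha:
            r = delta[(p, a)]
            if r not in marked:
                marked.add(r)
                frontier.append(r)
    return not (marked & accepting)

# Two DFAs for "even number of 1s" (different state names),
# and one for "odd number of 1s":
even1 = ({"e", "o"}, {"0", "1"},
         {("e", "0"): "e", ("e", "1"): "o",
          ("o", "0"): "o", ("o", "1"): "e"}, "e", {"e"})
even2 = ({"x", "y"}, {"0", "1"},
         {("x", "0"): "x", ("x", "1"): "y",
          ("y", "0"): "y", ("y", "1"): "x"}, "x", {"x"})
odd1  = ({"e", "o"}, {"0", "1"},
         {("e", "0"): "e", ("e", "1"): "o",
          ("o", "0"): "o", ("o", "1"): "e"}, "e", {"o"})
```

Equivalence is decided without ever comparing the two languages string by string: only the finitely many pair states are examined.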
DECIDABLE PROBLEMS CONCERNING
CONTEXT-FREE LANGUAGES
Here, we describe algorithms to determine whether a CFG generates a particular
string and to determine whether the language of a CFG is empty. Let

A_CFG = {⟨G, w⟩ | G is a CFG that generates string w}.
THEOREM 4.7
A_CFG is a decidable language.
PROOF IDEA For CFG G and string w, we want to determine whether G
generates w. One idea is to use G to go through all derivations to determine
whether any is a derivation of w. This idea doesn't work, as infinitely many
derivations may have to be tried. If G does not generate w, this algorithm would
never halt. This idea gives a Turing machine that is a recognizer, but not a
decider, for A_CFG.
To make this Turing machine into a decider, we need to ensure that the al-
gorithm tries only finitely many derivations. In Problem 2.26 (page 157) we
showed that if G were in Chomsky normal form, any derivation of w has 2n − 1
steps, where n is the length of w. In that case, checking only derivations with
2n − 1 steps to determine whether G generates w would be sufficient. Only
finitely many such derivations exist. We can convert G to Chomsky normal
form by using the procedure given in Section 2.1.
PROOF The TM S for A_CFG follows.
S = "On input ⟨G, w⟩, where G is a CFG and w is a string:
1. Convert G to an equivalent grammar in Chomsky normal form.
2. List all derivations with 2n − 1 steps, where n is the length of w;
except if n = 0, then instead list all derivations with one step.
3. If any of these derivations generate w, accept; if not, reject."
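Stage 2 of S can be sketched as a breadth-first search over leftmost derivations of a grammar already in Chomsky normal form (every derivable string has a leftmost derivation, so the restriction loses nothing). Sentential forms longer than n are pruned, since in CNF a derivation of a length-n string never produces a longer form. The rule-dictionary encoding is an illustrative choice; a symbol is a variable iff it is a key of `rules`:

```python
def cnf_generates(rules, start, w):
    """Decider sketch for A_CFG, given a CNF grammar: search all
    leftmost derivations of at most 2n-1 steps, where n = |w|.
    `rules` maps each variable to a list of rule bodies (tuples)."""
    n = len(w)
    if n == 0:
        # In CNF only the rule S -> ε (empty body) can derive ε.
        return () in rules.get(start, [])
    target = tuple(w)
    frontier = {(start,)}
    for _ in range(2 * n - 1):
        if target in frontier:
            return True
        next_frontier = set()
        for form in frontier:
            idx = next((i for i, s in enumerate(form) if s in rules), None)
            if idx is None:
                continue  # all-terminal form that is not the target
            for body in rules[form[idx]]:
                new = form[:idx] + body + form[idx + 1:]
                if len(new) <= n:  # CNF forms never need to exceed n
                    next_frontier.add(new)
        frontier = next_frontier
    return target in frontier

# A CNF grammar for { 0^k 1^k | k >= 1 }:
#   S -> AB | AC,  C -> SB,  A -> 0,  B -> 1
cnf = {"S": [("A", "B"), ("A", "C")],
       "C": [("S", "B")],
       "A": [("0",)],
       "B": [("1",)]}
```

As the proof observes, this exhaustive search is hopelessly inefficient compared with the Theorem 7.16 algorithm, but it visits only finitely many forms and therefore always halts.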
The problem of determining whether a CFG generates a particular string is
related to the problem of compiling programming languages. The algorithm in
TM S is very inefficient and would never be used in practice, but it is easy to de-
scribe and we aren't concerned with efficiency here. In Part Three of this book,
we address issues concerning the running time and memory use of algorithms.
In the proof of Theorem 7.16, we describe a more efficient algorithm for rec-
ognizing general context-free languages. Even greater efficiency is possible for
recognizing deterministic context-free languages.
Recall that we have given procedures for converting back and forth between
CFGs and PDAs in Theorem 2.20. Hence everything we say about the decidability
of problems concerning CFGs applies equally well to PDAs.
Let's turn now to the emptiness testing problem for the language of a CFG.
As we did for DFAs, we can show that the problem of determining whether a CFG
generates any strings at all is decidable. Let

E_CFG = {⟨G⟩ | G is a CFG and L(G) = ∅}.
THEOREM 4.8
E_CFG is a decidable language.
PROOF IDEA To find an algorithm for this problem, we might attempt to
use TM S from Theorem 4.7. It states that we can test whether a CFG generates
some particular string w. To determine whether L(G) = ∅, the algorithm might
try going through all possible w's, one by one. But there are infinitely many w's
to try, so this method could end up running forever. We need to take a different
approach.
In order to determine whether the language of a grammar is empty, we need
to test whether the start variable can generate a string of terminals. The algo-
rithm does so by solving a more general problem. It determines for each variable
whether that variable is capable of generating a string of terminals. When the
algorithm has determined that a variable can generate some string of terminals,
the algorithm keeps track of this information by placing a mark on that variable.
First, the algorithm marks all the terminal symbols in the grammar. Then, it
scans all the rules of the grammar. If it ever finds a rule that permits some vari-
able to be replaced by some string of symbols, all of which are already marked,
the algorithm knows that this variable can be marked, too. The algorithm con-
tinues in this way until it cannot mark any additional variables. The TM R
implements this algorithm.
PROOF
R = "On input ⟨G⟩, where G is a CFG:
1. Mark all terminal symbols in G.
2. Repeat until no new variables get marked:
3.   Mark any variable A where G has a rule A → U1U2···Uk and
each symbol U1, . . . , Uk has already been marked.
4. If the start variable is not marked, accept; otherwise, reject."
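R's marking loop translates almost line for line. The rule-dictionary encoding is an illustrative choice: any symbol that is not a key of `rules` counts as a terminal, and terminals are treated as marked from the start:

```python
def cfg_language_empty(rules, start):
    """Decider R for E_CFG: repeatedly mark any variable that has a
    rule whose body consists entirely of marked symbols (terminals
    count as marked). L(G) is empty iff `start` is never marked."""
    marked = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in rules.items():
            if var in marked:
                continue
            for body in bodies:
                if all(s in marked or s not in rules for s in body):
                    marked.add(var)
                    changed = True
                    break
    return start not in marked

# In the first grammar, B can never produce a string of terminals
# (its only rule is B -> B), so S is never marked and L(G) = ∅.
empty_cfg    = {"S": [("A", "B")], "A": [("a",)], "B": [("B",)]}
nonempty_cfg = {"S": [("A", "B")], "A": [("a",)], "B": [("b",)]}
```

Like T in Theorem 4.4, the loop halts because each pass either marks a new variable or stops, and there are finitely many variables.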
Next, we consider the problem of determining whether two context-free
grammars generate the same language. Let

EQ_CFG = {⟨G, H⟩ | G and H are CFGs and L(G) = L(H)}.

Theorem 4.5 gave an algorithm that decides the analogous language EQ_DFA for
finite automata. We used the decision procedure for E_DFA to prove that EQ_DFA
is decidable. Because E_CFG also is decidable, you might think that we can use
a similar strategy to prove that EQ_CFG is decidable. But something is wrong
with this idea! The class of context-free languages is not closed under comple-
mentation or intersection, as you proved in Exercise 2.2. In fact, EQ_CFG is not
decidable. The technique for proving so is presented in Chapter 5.
Now we show that context-free languages are decidable by Turing machines.
THEOREM 4.9
Every context-free language is decidable.
PROOF IDEA Let A be a CFL. Our objective is to show that A is decidable.
One (bad) idea is to convert a PDA for A directly into a TM. That isn't hard to
do because simulating a stack with the TM's more versatile tape is easy. The PDA
for A may be nondeterministic, but that seems okay because we can convert it
into a nondeterministic TM and we know that any nondeterministic TM can be
converted into an equivalent deterministic TM. Yet, there is a difficulty. Some
branches of the PDA's computation may go on forever, reading and writing the
stack without ever halting. The simulating TM then would also have some non-
halting branches in its computation, and so the TM would not be a decider. A
different idea is necessary. Instead, we prove this theorem with the TM S that we
designed in Theorem 4.7 to decide A_CFG.
PROOF Let G be a CFG for A and design a TM M_G that decides A. We build
a copy of G into M_G. It works as follows.
M_G = "On input w:
1. Run TM S on input ⟨G, w⟩.
2. If this machine accepts, accept; if it rejects, reject."
Theorem 4.9 provides the final link in the relationship among the four main
classes of languages that we have described so far: regular, context-free, decid-
able, and Turing-recognizable. Figure 4.10 depicts this relationship.
FIGURE 4.10
The relationship among classes of languages
4.2
UNDECIDABILITY
In this section, we prove one of the most philosophically important theorems of
the theory of computation: There is a specific problem that is algorithmically
unsolvable. Computers appear to be so powerful that you may believe that all
problems will eventually yield to them. The theorem presented here demon-
strates that computers are limited in a fundamental way.
What sorts of problems are unsolvable by computer? Are they esoteric,
dwelling only in the minds of theoreticians? No! Even some ordinary prob-
lems that people want to solve turn out to be computationally unsolvable.
In one type of unsolvable problem, you are given a computer program and
a precise specification of what that program is supposed to do (e.g., sort a list
of numbers). You need to verify that the program performs as specified (i.e.,
that it is correct). Because both the program and the specification are mathe-
matically precise objects, you hope to automate the process of verification by
feeding these objects into a suitably programmed computer. However, you will
be disappointed. The general problem of software verification is not solvable by
computer.
In this section and in Chapter 5, you will encounter several computationally
unsolvable problems. We aim to help you develop a feeling for the types of
problems that are unsolvable and to learn techniques for proving unsolvability.
Now we turn to our first theorem that establishes the undecidability of a spe-
cific language: the problem of determining whether a Turing machine accepts a
given input string. We call it A_TM by analogy with A_DFA and A_CFG. But, whereas
A_DFA and A_CFG were decidable, A_TM is not. Let

A_TM = {⟨M, w⟩ | M is a TM and M accepts w}.
THEOREM 4.11
A_TM is undecidable.
Before we get to the proof, let's first observe that A_TM is Turing-recognizable.
Thus, this theorem shows that recognizers are more powerful than deciders.
Requiring a TM to halt on all inputs restricts the kinds of languages that it can
recognize. The following Turing machine U recognizes A_TM.
U = "On input ⟨M, w⟩, where M is a TM and w is a string:
1. Simulate M on input w.
2. If M ever enters its accept state, accept; if M ever enters its
reject state, reject."
Note that this machine loops on input ⟨M, w⟩ if M loops on w, which is why
this machine does not decide A_TM. If the algorithm had some way to determine
that M was not halting on w, it could reject in this case. However, an algorithm
has no way to make this determination, as we shall see.
The Turing machine U is interesting in its own right. It is an example of the
universal Turing machine first proposed by Alan Turing in 1936. This machine
is called universal because it is capable of simulating any other Turing machine
from the description of that machine. The universal Turing machine played an
important early role in the development of stored-program computers.
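A sketch of U as an interpreter. The encoding of M below (a transition dictionary plus distinguished states) is an illustrative choice, and the optional step bound is not part of U: like U itself, the simulation runs forever if M loops unless that bound is supplied.

```python
def simulate_tm(tm, w, max_steps=None):
    """Universal-machine sketch: interpret a TM description on input w.
    tm = (delta, start, accept, reject); delta maps (state, symbol)
    to (state, symbol, move) with move "L" or "R"; "_" is the blank.
    Returns "accept" or "reject" if M halts, None if the optional
    step bound (not part of U) is exhausted first."""
    delta, state, accept, reject = tm
    tape = {i: s for i, s in enumerate(w)}
    head = steps = 0
    while state not in (accept, reject):
        if max_steps is not None and steps >= max_steps:
            return None
        sym = tape.get(head, "_")
        state, tape[head], move = delta[(state, sym)]
        head = max(0, head + (1 if move == "R" else -1))  # head stays on the tape
        steps += 1
    return "accept" if state == accept else "reject"

# A TM that accepts exactly the strings consisting only of 0s:
only_zeros = (
    {("q", "0"): ("q", "0", "R"),
     ("q", "1"): ("qrej", "1", "R"),
     ("q", "_"): ("qacc", "_", "R")},
    "q", "qacc", "qrej",
)
```

The bound illustrates the remark above: cutting the simulation off after k steps yields no verdict at all, not a rejection, because M might have accepted at step k + 1.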
THE DIAGONALIZATION METHOD
The proof of the undecidability of A_TM uses a technique called diagonalization,
discovered by mathematician Georg Cantor in 1873. Cantor was concerned
with the problem of measuring the sizes of infinite sets. If we have two infinite
sets, how can we tell whether one is larger than the other or whether they are of
the same size? For finite sets, of course, answering these questions is easy. We
simply count the elements in a finite set, and the resulting number is its size. But
if we try to count the elements of an infinite set, we will never finish! So we can’t
use the counting method to determine the relative sizes of infinite sets.
For example, take the set of even integers and the set of all strings over {0,1}.
Both sets are infinite and thus larger than any finite set, but is one of the two
larger than the other? How can we compare their relative size?
Cantor proposed a rather nice solution to this problem. He observed that two
finite sets have the same size if the elements of one set can be paired with the
elements of the other set. This method compares the sizes without resorting to
counting. We can extend this idea to infinite sets. Here it is more precisely.
DEFINITION 4.12
Assume that we have sets A and B and a function f from A to B.
Say that f is one-to-one if it never maps two different elements to
the same place—that is, if f(a) ≠ f(b) whenever a ≠ b. Say that
f is onto if it hits every element of B—that is, if for every b ∈ B
there is an a ∈ A such that f(a) = b. Say that A and B are the same
size if there is a one-to-one, onto function f : A → B. A function
that is both one-to-one and onto is called a correspondence. In a
correspondence, every element of A maps to a unique element of
B and each element of B has a unique element of A mapping to it.
A correspondence is simply a way of pairing the elements of A with
the elements of B.
Alternative common terminology for these types of functions is injective for
one-to-one, surjective for onto, and bijective for one-to-one and onto.
EXAMPLE 4.13
Let N be the set of natural numbers {1, 2, 3, . . .} and let E be the set of even
natural numbers {2, 4, 6, . . .}. Using Cantor's definition of size, we can see that
N and E have the same size. The correspondence f mapping N to E is simply
f(n) = 2n. We can visualize f more easily with the help of a table.

    n      f(n)
    1      2
    2      4
    3      6
    ...    ...

Of course, this example seems bizarre. Intuitively, E seems smaller than N be-
cause E is a proper subset of N. But pairing each member of N with its own
member of E is possible, so we declare these two sets to be the same size.
DEFINITION 4.14
A set A is countable if either it is finite or it has the same size as N.
EXAMPLE 4.15
Now we turn to an even stranger example. If we let Q = {m/n | m, n ∈ N} be the
set of positive rational numbers, Q seems to be much larger than N. Yet these
two sets are the same size according to our definition. We give a correspondence
with N to show that Q is countable. One easy way to do so is to list all the
elements of Q. Then we pair the first element on the list with the number 1
from N, the second element on the list with the number 2 from N, and so on.
We must ensure that every member of Q appears only once on the list.
To get this list, we make an infinite matrix containing all the positive ratio-
nal numbers, as shown in Figure 4.16. The ith row contains all numbers with
numerator i and the jth column has all numbers with denominator j. So the
number i/j occurs in the ith row and jth column.
Now we turn this matrix into a list. One (bad) way to attempt it would be to
begin the list with all the elements in the first row. That isn't a good approach
because the first row is infinite, so the list would never get to the second row.
Instead we list the elements on the diagonals, which are superimposed on the
diagram, starting from the corner. The first diagonal contains the single element
1/1, and the second diagonal contains the two elements 2/1 and 1/2. So the first
three elements on the list are 1/1, 2/1, and 1/2. In the third diagonal, a complication
arises. It contains 3/1, 2/2, and 1/3. If we simply added these to the list, we would
repeat 1/1 = 2/2. We avoid doing so by skipping an element when it would cause
a repetition. So we add only the two new elements 3/1 and 1/3. Continuing in this
way, we obtain a list of all the elements of Q.
FIGURE 4.16
A correspondence of N and Q
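The diagonal listing, including the skipping of repeated values such as 2/2, can be sketched as a generator. `fractions.Fraction` normalizes 2/2 to 1/1 automatically, which is exactly the duplicate test we need; the generator itself is an illustrative sketch:

```python
from fractions import Fraction
from itertools import islice

def rationals():
    """List the positive rationals diagonal by diagonal, as in
    Figure 4.16, skipping any value that has already appeared."""
    seen = set()
    d = 2                          # pairs (i, j) on a diagonal satisfy i + j = d
    while True:
        for i in range(d - 1, 0, -1):   # numerator descends along each diagonal
            q = Fraction(i, d - i)
            if q not in seen:           # e.g. 2/2 is skipped: it equals 1/1
                seen.add(q)
                yield q
        d += 1

first_five = list(islice(rationals(), 5))
# the list begins 1/1, 2/1, 1/2, 3/1, 1/3, matching the text
```

Pairing n with the nth element yielded gives the correspondence between N and Q: every positive rational eventually appears, and each appears exactly once.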
After seeing the correspondence of N and Q, you might think that any two
infinite sets can be shown to have the same size. After all, you need only demon-
strate a correspondence, and this example shows that surprising correspondences
do exist. However, for some infinite sets, no correspondence with N exists.
These sets are simply too big. Such sets are called uncountable.
The set of real numbers is an example of an uncountable set. A real number
is one that has a decimal representation. The numbers π = 3.1415926 . . . and
√2 = 1.4142135... are examples of real numbers. Let R be the set of real numbers. Cantor proved that R is uncountable. In doing so, he introduced the diagonalization method.
THEOREM 4.17
R is uncountable.
PROOF In order to show that R is uncountable, we show that no correspondence exists between N and R. The proof is by contradiction. Suppose that a correspondence f existed between N and R. Our job is to show that f fails to work as it should. For it to be a correspondence, f must pair all the members of N with all the members of R. But we will find an x in R that is not paired with anything in N, which will be our contradiction.

The way we find this x is by actually constructing it. We choose each digit of x to make x different from one of the real numbers that is paired with an element of N. In the end, we are sure that x is different from any real number that is paired.
We can illustrate this idea by giving an example. Suppose that the correspondence f exists. Let f(1) = 3.14159..., f(2) = 55.55555..., f(3) = ..., and so on, just to make up some values for f. Then f pairs the number 1 with 3.14159..., the number 2 with 55.55555..., and so on. The following table shows a few values of a hypothetical correspondence f between N and R.
 n    f(n)
 1    3.14159...
 2    55.55555...
 3    0.12345...
 4    0.50000...
 ...  ...
We construct the desired x by giving its decimal representation. It is a number between 0 and 1, so all its significant digits are fractional digits following the decimal point. Our objective is to ensure that x ≠ f(n) for any n. To ensure that x ≠ f(1), we let the first digit of x be anything different from the first fractional digit 1 of f(1) = 3.14159.... Arbitrarily, we let it be 4. To ensure that x ≠ f(2), we let the second digit of x be anything different from the second fractional digit 5 of f(2) = 55.55555.... Arbitrarily, we let it be 6. The third fractional digit of f(3) = 0.12345... is 3, so we let x be anything different, say 4. Continuing in this way down the diagonal of the table for f, we obtain all the digits of x, as shown in the following table. We know that x is not f(n) for any n because it differs from f(n) in the nth fractional digit. (A slight problem arises because certain numbers, such as 0.1999... and 0.2000..., are equal even though their decimal representations are different. We avoid this problem by never selecting the digits 0 or 9 when we construct x.)
 n    f(n)            (diagonal digit shown in brackets)
 1    3.[1]4159...
 2    55.5[5]555...
 3    0.12[3]45...
 4    0.500[0]0...
 ...  ...

                      x = 0.4641...
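The diagonal construction can be carried out in code. In this illustrative sketch (not part of the text), each listed real is given by its string of fractional digits, and x is built digit by digit; instead of the text's arbitrary choices, the rule here always picks 4 unless the diagonal digit is 4, in which case it picks 5, so the digits 0 and 9 are never selected.

```python
def diagonal_real(fractional_digits):
    """Given the fractional-digit strings of f(1), f(2), ..., return a
    decimal x whose nth digit differs from the nth fractional digit of
    f(n). Digits 0 and 9 are never chosen, which sidesteps the ambiguity
    of 0.1999... = 0.2000...."""
    out = []
    for n, digits in enumerate(fractional_digits):
        diag = digits[n]                        # nth fractional digit of f(n)
        out.append('5' if diag == '4' else '4')
    return '0.' + ''.join(out)
```

On the four rows of the hypothetical table, this rule yields 0.4444, which differs from each f(n) in the nth fractional digit (the text's arbitrary choices gave 0.4641... instead).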
The preceding theorem has an important application to the theory of computation. It shows that some languages are not decidable or even Turing-recognizable, for the reason that there are uncountably many languages yet only countably many Turing machines. Because each Turing machine can recognize a single language and there are more languages than Turing machines, some languages are not recognized by any Turing machine. Such languages are not Turing-recognizable, as we state in the following corollary.
COROLLARY 4.18
Some languages are not Turing-recognizable.
PROOF To show that the set of all Turing machines is countable, we first observe that the set of all strings Σ* is countable for any alphabet Σ. With only finitely many strings of each length, we may form a list of Σ* by writing down all strings of length 0, length 1, length 2, and so on.

The set of all Turing machines is countable because each Turing machine M has an encoding into a string ⟨M⟩. If we simply omit those strings that are not legal encodings of Turing machines, we can obtain a list of all Turing machines.
To show that the set of all languages is uncountable, we first observe that the set of all infinite binary sequences is uncountable. An infinite binary sequence is an unending sequence of 0s and 1s. Let B be the set of all infinite binary sequences. We can show that B is uncountable by using a proof by diagonalization similar to the one we used in Theorem 4.17 to show that R is uncountable.

Let L be the set of all languages over alphabet Σ. We show that L is uncountable by giving a correspondence with B, thus showing that the two sets are the same size. Let Σ* = {s1, s2, s3, ...}. Each language A ∈ L has a unique sequence in B. The ith bit of that sequence is a 1 if si ∈ A and is a 0 if si ∉ A, which is called the characteristic sequence of A. For example, if A were the language of all strings starting with a 0 over the alphabet {0,1}, its characteristic sequence χA would be

    Σ* = { ε,  0,  1,  00, 01, 10, 11, 000, 001, ... };
    A  = {     0,      00, 01,         000, 001, ... };
    χA =   0   1   0   1   1   0   0   1    1    ... .

The function f : L → B, where f(A) equals the characteristic sequence of A, is one-to-one and onto, and hence is a correspondence. Therefore, as B is uncountable, L is uncountable as well.
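The characteristic-sequence correspondence is easy to make concrete for any decidable example language. The sketch below is illustrative (not from the text): it enumerates Σ* in string order, shortest first, and emits one bit per string.

```python
from itertools import count, islice, product

def strings_in_order(alphabet=("0", "1")):
    """Yield s1, s2, ...: all strings over the alphabet, shortest first
    and in lexicographic order within each length."""
    yield ""
    for n in count(1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def characteristic_prefix(in_language, k):
    """First k bits of the characteristic sequence of the language
    decided by the membership predicate in_language."""
    return "".join("1" if in_language(s) else "0"
                   for s in islice(strings_in_order(), k))
```

For A = all strings starting with a 0, the first nine bits come out 010110011, matching the χA displayed above.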
Thus we have shown that the set of all languages cannot be put into a correspondence with the set of all Turing machines. We conclude that some languages are not recognized by any Turing machine.
AN UNDECIDABLE LANGUAGE
Now we are ready to prove Theorem 4.11, the undecidability of the language
    A_TM = {⟨M, w⟩ | M is a TM and M accepts w}.
PROOF We assume that A_TM is decidable and obtain a contradiction. Suppose that H is a decider for A_TM. On input ⟨M, w⟩, where M is a TM and w is a string, H halts and accepts if M accepts w. Furthermore, H halts and rejects if M fails to accept w. In other words, we assume that H is a TM, where

    H(⟨M, w⟩) =  accept   if M accepts w
                 reject   if M does not accept w.
Now we construct a new Turing machine D with H as a subroutine. This new TM calls H to determine what M does when the input to M is its own description ⟨M⟩. Once D has determined this information, it does the opposite. That is, it rejects if M accepts and accepts if M does not accept. The following is a description of D.

D = “On input ⟨M⟩, where M is a TM:
  1. Run H on input ⟨M, ⟨M⟩⟩.
  2. Output the opposite of what H outputs. That is, if H accepts, reject; and if H rejects, accept.”
Don’t be confused by the notion of running a machine on its own description! That is similar to running a program with itself as input, something that does occasionally occur in practice. For example, a compiler is a program that translates other programs. A compiler for the language Python may itself be written in Python, so running that program on itself would make sense. In summary,
    D(⟨M⟩) =  accept   if M does not accept ⟨M⟩
              reject   if M accepts ⟨M⟩.
What happens when we run D with its own description ⟨D⟩ as input? In that case, we get
    D(⟨D⟩) =  accept   if D does not accept ⟨D⟩
              reject   if D accepts ⟨D⟩.
No matter what D does, it is forced to do the opposite, which is obviously a contradiction. Thus, neither TM D nor TM H can exist.
Let’s review the steps of this proof. Assume that a TM H decides A_TM. Use H to build a TM D that takes an input ⟨M⟩, where D accepts its input ⟨M⟩ exactly when M does not accept its input ⟨M⟩. Finally, run D on itself. Thus, the machines take the following actions, with the last line being the contradiction.

• H accepts ⟨M, w⟩ exactly when M accepts w.
• D rejects ⟨M⟩ exactly when M accepts ⟨M⟩.
• D rejects ⟨D⟩ exactly when D accepts ⟨D⟩.
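The contradiction survives translation into ordinary code. The sketch below is an illustration (Python functions stand in for TMs and their descriptions, an assumption not made in the text): given any claimed two-argument acceptance decider h, we build d from it exactly as D is built from H, and d's actual behavior on itself always disagrees with h's prediction, so no candidate h can be correct.

```python
def build_d(h):
    """Given a claimed decider h(prog, inp) -> bool for 'prog accepts inp',
    return the program d that does the opposite of h's verdict on
    (prog, prog), mirroring the construction of D from H."""
    def d(prog):
        return not h(prog, prog)
    return d

def h_is_wrong_about_d(h):
    """Whatever h claims about d run on itself, d does the opposite."""
    d = build_d(h)
    return d(d) != h(d, d)
```

Every candidate h fails in the same place: the diagonal entry for d.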
Where is the diagonalization in the proof of Theorem 4.11? It becomes apparent when you examine tables of behavior for TMs H and D. In these tables we list all TMs down the rows, M1, M2, ..., and all their descriptions across the columns, ⟨M1⟩, ⟨M2⟩, .... The entries tell whether the machine in a given row accepts the input in a given column. The entry is accept if the machine accepts the input but is blank if it rejects or loops on that input. We made up the entries in the following figure to illustrate the idea.
        ⟨M1⟩     ⟨M2⟩     ⟨M3⟩     ⟨M4⟩     ...
 M1     accept            accept
 M2     accept   accept   accept   accept
 M3
 M4     accept   accept                      ...
 ...

FIGURE 4.19
Entry i, j is accept if Mi accepts ⟨Mj⟩
In the following figure, the entries are the results of running H on inputs corresponding to Figure 4.19. So if M3 does not accept input ⟨M2⟩, the entry for row M3 and column ⟨M2⟩ is reject because H rejects input ⟨M3, ⟨M2⟩⟩.
        ⟨M1⟩     ⟨M2⟩     ⟨M3⟩     ⟨M4⟩     ...
 M1     accept   reject   accept   reject
 M2     accept   accept   accept   accept
 M3     reject   reject   reject   reject    ...
 M4     accept   accept   reject   reject
 ...

FIGURE 4.20
Entry i, j is the value of H on input ⟨Mi, ⟨Mj⟩⟩
In the following figure, we added D to Figure 4.20. By our assumption, H is a TM and so is D. Therefore, it must occur on the list M1, M2, ... of all TMs. Note that D computes the opposite of the diagonal entries. The contradiction occurs at the point of the question mark where the entry must be the opposite of itself.
        ⟨M1⟩     ⟨M2⟩     ⟨M3⟩     ⟨M4⟩     ...   ⟨D⟩      ...
 M1     accept   reject   accept   reject         accept
 M2     accept   accept   accept   accept         accept
 M3     reject   reject   reject   reject   ...   reject   ...
 M4     accept   accept   reject   reject         accept
 ...                                        ...
 D      reject   reject   accept   accept          ?
 ...                                        ...

FIGURE 4.21
If D is in the figure, a contradiction occurs at “?”
A TURING-UNRECOGNIZABLE LANGUAGE
In the preceding section, we exhibited a language, namely A_TM, that is undecidable. Now we exhibit a language that isn’t even Turing-recognizable. Note that A_TM will not suffice for this purpose because we showed that A_TM is Turing-recognizable (page 202). The following theorem shows that if both a language and its complement are Turing-recognizable, the language is decidable. Hence for any undecidable language, either it or its complement is not Turing-recognizable. Recall that the complement of a language is the language consisting of all strings that are not in the language. We say that a language is co-Turing-recognizable if it is the complement of a Turing-recognizable language.
THEOREM 4.22
A language is decidable iff it is Turing-recognizable and co-Turing-recognizable. In other words, a language is decidable exactly when both it and its complement are Turing-recognizable.
PROOF We have two directions to prove. First, if A is decidable, we can easily see that both A and its complement Ā are Turing-recognizable. Any decidable language is Turing-recognizable, and the complement of a decidable language also is decidable.
For the other direction, if both A and Ā are Turing-recognizable, we let M1 be the recognizer for A and M2 be the recognizer for Ā. The following Turing machine M is a decider for A.

M = “On input w:
  1. Run both M1 and M2 on input w in parallel.
  2. If M1 accepts, accept; if M2 accepts, reject.”

Running the two machines in parallel means that M has two tapes, one for simulating M1 and the other for simulating M2. In this case, M takes turns simulating one step of each machine, which continues until one of them accepts.
Now we show that M decides A. Every string w is either in A or Ā. Therefore, either M1 or M2 must accept w. Because M halts whenever M1 or M2 accepts, M always halts and so it is a decider. Furthermore, it accepts all strings in A and rejects all strings not in A. So M is a decider for A, and thus A is decidable.
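The step-by-step alternation can be modeled with generators. This is an illustrative sketch of the proof's construction, not code from the text: each recognizer yields False once per simulated step and yields True when it accepts, and on strings outside its language it may yield False forever.

```python
def parallel_decider(rec_a, rec_comp, w):
    """Decide membership in A, given a recognizer rec_a for A and a
    recognizer rec_comp for the complement of A. Alternates one step of
    each; since w lies in A or its complement, one of them must accept."""
    sim_a, sim_c = rec_a(w), rec_comp(w)
    while True:
        if next(sim_a):
            return True    # M1 accepted: w is in A
        if next(sim_c):
            return False   # M2 accepted: w is not in A
```

For example, with rec_a recognizing even-length strings (and looping forever on odd-length ones) and rec_comp the reverse, the combined machine halts on every input even though neither recognizer does on its own.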
COROLLARY 4.23
Ā_TM (the complement of A_TM) is not Turing-recognizable.
PROOF We know that A_TM is Turing-recognizable. If Ā_TM also were Turing-recognizable, A_TM would be decidable. Theorem 4.11 tells us that A_TM is not decidable, so Ā_TM must not be Turing-recognizable.
EXERCISES
A4.1 Answer all parts for the following DFA M and give reasons for your answers. [The state diagram of M is a figure not reproduced in this copy.]

a. Is ⟨M, 0100⟩ ∈ A_DFA?
b. Is ⟨M, 011⟩ ∈ A_DFA?
c. Is ⟨M⟩ ∈ A_DFA?
d. Is ⟨M, 0100⟩ ∈ A_REX?
e. Is ⟨M⟩ ∈ E_DFA?
f. Is ⟨M, M⟩ ∈ EQ_DFA?
4.2 Consider the problem of determining whether a DFA and a regular expression are equivalent. Express this problem as a language and show that it is decidable.

4.3 Let ALL_DFA = {⟨A⟩ | A is a DFA and L(A) = Σ*}. Show that ALL_DFA is decidable.

4.4 Let A_εCFG = {⟨G⟩ | G is a CFG that generates ε}. Show that A_εCFG is decidable.

A4.5 Let E_TM = {⟨M⟩ | M is a TM and L(M) = ∅}. Show that Ē_TM, the complement of E_TM, is Turing-recognizable.
4.6 Let X be the set {1, 2, 3, 4, 5} and Y be the set {6, 7, 8, 9, 10}. We describe the functions f: X→Y and g: X→Y in the following tables. Answer each part and give a reason for each negative answer.

 n   f(n)      n   g(n)
 1   6         1   10
 2   7         2   9
 3   6         3   8
 4   7         4   7
 5   6         5   6

Aa. Is f one-to-one?
 b. Is f onto?
 c. Is f a correspondence?
Ad. Is g one-to-one?
 e. Is g onto?
 f. Is g a correspondence?
4.7 Let B be the set of all infinite sequences over {0,1}. Show that B is uncountable using a proof by diagonalization.

4.8 Let T = {(i, j, k) | i, j, k ∈ N}. Show that T is countable.

4.9 Review the way that we define sets to be the same size in Definition 4.12 (page 203). Show that “is the same size” is an equivalence relation.
PROBLEMS
A4.10 Let INFINITE_DFA = {⟨A⟩ | A is a DFA and L(A) is an infinite language}. Show that INFINITE_DFA is decidable.

4.11 Let INFINITE_PDA = {⟨M⟩ | M is a PDA and L(M) is an infinite language}. Show that INFINITE_PDA is decidable.

A4.12 Let A = {⟨M⟩ | M is a DFA that doesn’t accept any string containing an odd number of 1s}. Show that A is decidable.

4.13 Let A = {⟨R, S⟩ | R and S are regular expressions and L(R) ⊆ L(S)}. Show that A is decidable.

A4.14 Let Σ = {0,1}. Show that the problem of determining whether a CFG generates some string in 1* is decidable. In other words, show that

    {⟨G⟩ | G is a CFG over {0,1} and 1* ∩ L(G) ≠ ∅}

is a decidable language.
⋆4.15 Show that the problem of determining whether a CFG generates all strings in 1* is decidable. In other words, show that {⟨G⟩ | G is a CFG over {0,1} and 1* ⊆ L(G)} is a decidable language.

4.16 Let A = {⟨R⟩ | R is a regular expression describing a language containing at least one string w that has 111 as a substring (i.e., w = x111y for some x and y)}. Show that A is decidable.

4.17 Prove that EQ_DFA is decidable by testing the two DFAs on all strings up to a certain size. Calculate a size that works.

⋆4.18 Let C be a language. Prove that C is Turing-recognizable iff a decidable language D exists such that C = {x | ∃y (⟨x, y⟩ ∈ D)}.

⋆4.19 Prove that the class of decidable languages is not closed under homomorphism.

4.20 Let A and B be two disjoint languages. Say that language C separates A and B if A ⊆ C and B ⊆ C̄. Show that any two disjoint co-Turing-recognizable languages are separable by some decidable language.

4.21 Let S = {⟨M⟩ | M is a DFA that accepts w^R whenever it accepts w}. Show that S is decidable.

4.22 Let PREFIX-FREE_REX = {⟨R⟩ | R is a regular expression and L(R) is prefix-free}. Show that PREFIX-FREE_REX is decidable. Why does a similar approach fail to show that PREFIX-FREE_CFG is decidable?

A⋆4.23 Say that an NFA is ambiguous if it accepts some string along two different computation branches. Let AMBIG_NFA = {⟨N⟩ | N is an ambiguous NFA}. Show that AMBIG_NFA is decidable. (Suggestion: One elegant way to solve this problem is to construct a suitable DFA and then run E_DFA on it.)

4.24 A useless state in a pushdown automaton is never entered on any input string. Consider the problem of determining whether a pushdown automaton has any useless states. Formulate this problem as a language and show that it is decidable.

A⋆4.25 Let BAL_DFA = {⟨M⟩ | M is a DFA that accepts some string containing an equal number of 0s and 1s}. Show that BAL_DFA is decidable. (Hint: Theorems about CFLs are helpful here.)

⋆4.26 Let PAL_DFA = {⟨M⟩ | M is a DFA that accepts some palindrome}. Show that PAL_DFA is decidable. (Hint: Theorems about CFLs are helpful here.)

⋆4.27 Let E = {⟨M⟩ | M is a DFA that accepts some string with more 1s than 0s}. Show that E is decidable. (Hint: Theorems about CFLs are helpful here.)

4.28 Let C = {⟨G, x⟩ | G is a CFG and x is a substring of some y ∈ L(G)}. Show that C is decidable. (Hint: An elegant solution to this problem uses the decider for E_CFG.)

4.29 Let C_CFG = {⟨G, k⟩ | G is a CFG and L(G) contains exactly k strings, where k ≥ 0 or k = ∞}. Show that C_CFG is decidable.

4.30 Let A be a Turing-recognizable language consisting of descriptions of Turing machines, {⟨M1⟩, ⟨M2⟩, ...}, where every Mi is a decider. Prove that some decidable language D is not decided by any decider Mi whose description appears in A. (Hint: You may find it helpful to consider an enumerator for A.)

4.31 Say that a variable A in CFG G is usable if it appears in some derivation of some string w ∈ L(G). Given a CFG G and a variable A, consider the problem of testing whether A is usable. Formulate this problem as a language and show that it is decidable.
4.32 The proof of Lemma 2.41 says that (q, x) is a looping situation for a DPDA P if, when P is started in state q with x ∈ Γ on the top of the stack, it never pops anything below x and it never reads an input symbol. Show that F is decidable, where F = {⟨P, q, x⟩ | (q, x) is a looping situation for P}.
SELECTED SOLUTIONS
4.1 (a) Yes. The DFA M accepts 0100.
(b) No. M doesn’t accept 011.
(c) No. This input has only a single component and thus is not of the correct form.
(d) No. The first component is not a regular expression and so the input is not of the correct form.
(e) No. M’s language isn’t empty.
(f) Yes. M accepts the same language as itself.
4.5 Let s1, s2, ... be a list of all strings in Σ*. The following TM recognizes Ē_TM.

“On input ⟨M⟩, where M is a TM:
  1. Repeat the following for i = 1, 2, 3, ....
  2.   Run M for i steps on each input s1, s2, ..., si.
  3.   If M has accepted any of these, accept. Otherwise, continue.”
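The i-steps-on-the-first-i-inputs schedule is a dovetailing search. The sketch below is illustrative (not from the text): step_run(s, i) stands for "run M on s for i steps and report whether it has accepted", and the search returns the first accepted string it finds, running forever when L(M) is empty, just as a recognizer may.

```python
from itertools import count

def dovetail_search(step_run, strings):
    """Round i runs the machine for i steps on each of s1, ..., si.
    Returns some accepted input if one exists; otherwise never returns."""
    seen = []
    source = iter(strings)
    for i in count(1):
        seen.append(next(source))   # extend the list of inputs to s_i
        for s in seen:
            if step_run(s, i):      # accepted within i steps?
                return s
```

A toy machine that accepts only "aa", and needs five steps to do so, is found once round 5 re-examines that input.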
4.6 (a) No, f is not one-to-one because f(1) = f(3).
(d) Yes, g is one-to-one.
4.10 The following TM I decides INFINITE_DFA.

I = “On input ⟨A⟩, where A is a DFA:
  1. Let k be the number of states of A.
  2. Construct a DFA D that accepts all strings of length k or more.
  3. Construct a DFA M such that L(M) = L(A) ∩ L(D).
  4. Test L(M) = ∅ using the E_DFA decider T from Theorem 4.4.
  5. If T accepts, reject; if T rejects, accept.”

This algorithm works because a DFA that accepts infinitely many strings must accept arbitrarily long strings. Therefore, this algorithm accepts such DFAs. Conversely, if the algorithm accepts a DFA, the DFA accepts some string of length k or more, where k is the number of states of the DFA. This string may be pumped in the manner of the pumping lemma for regular languages to obtain infinitely many accepted strings.
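The same test can be phrased without constructing D and M explicitly: by the pumping argument just given, L(A) is infinite exactly when A accepts some string of length L with k ≤ L < 2k. The sketch below is illustrative, with the DFA given as a transition dictionary (an assumed representation, not from the text); it does a layered reachability search over string lengths.

```python
def dfa_accepts_infinitely_many(states, alphabet, delta, start, accepting):
    """Return True iff the DFA accepts infinitely many strings.
    With k = number of states, the language is infinite exactly when
    some string of length L with k <= L < 2k is accepted."""
    k = len(states)
    reachable = {start}
    for length in range(1, 2 * k):
        # states reachable by strings of exactly this length
        reachable = {delta[(q, a)] for q in reachable for a in alphabet}
        if length >= k and reachable & accepting:
            return True
    return False
```

A two-state DFA for "strings ending in 1" is reported infinite, while a DFA accepting only the single string "1" is reported finite.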
4.12 The following TM decides A.

“On input ⟨M⟩:
  1. Construct a DFA O that accepts every string containing an odd number of 1s.
  2. Construct a DFA B such that L(B) = L(M) ∩ L(O).
  3. Test whether L(B) = ∅ using the E_DFA decider T from Theorem 4.4.
  4. If T accepts, accept; if T rejects, reject.”
4.14 You showed in Problem 2.18 that if C is a context-free language and R is a regular language, then C ∩ R is context free. Therefore, 1* ∩ L(G) is context free. The following TM decides the language of this problem.

“On input ⟨G⟩:
  1. Construct CFG H such that L(H) = 1* ∩ L(G).
  2. Test whether L(H) = ∅ using the E_CFG decider R from Theorem 4.8.
  3. If R accepts, reject; if R rejects, accept.”
4.23 The following procedure decides AMBIG_NFA. Given an NFA N, we design a DFA D that simulates N and accepts a string iff it is accepted by N along two different computational branches. Then we use a decider for E_DFA to determine whether D accepts any strings.

Our strategy for constructing D is similar to the NFA-to-DFA conversion in the proof of Theorem 1.39. We simulate N by keeping a pebble on each active state. We begin by putting a red pebble on the start state and on each state reachable from the start state along ε transitions. We move, add, and remove pebbles in accordance with N’s transitions, preserving the color of the pebbles. Whenever two or more pebbles are moved to the same state, we replace its pebbles with a blue pebble. After reading the input, we accept if a blue pebble is on an accept state of N or if two different accept states of N have red pebbles on them.

The DFA D has a state corresponding to each possible position of pebbles. For each state of N, three possibilities occur: It can contain a red pebble, a blue pebble, or no pebble. Thus, if N has n states, D will have 3^n states. Its start state, accept states, and transition function are defined to carry out the simulation.
4.25 The language of all strings with an equal number of 0s and 1s is a context-free language, generated by the grammar S → 1S0S | 0S1S | ε. Let P be the PDA that recognizes this language. Build a TM M for BAL_DFA, which operates as follows. On input ⟨B⟩, where B is a DFA, use B and P to construct a new PDA R that recognizes the intersection of the languages of B and P. Then test whether R’s language is empty. If its language is empty, reject; otherwise, accept.
5
REDUCIBILITY
In Chapter 4 we established the Turing machine as our model of a general purpose computer. We presented several examples of problems that are solvable on a Turing machine and gave one example of a problem, A_TM, that is computationally unsolvable. In this chapter we examine several additional unsolvable problems. In doing so, we introduce the primary method for proving that problems are computationally unsolvable. It is called reducibility.

A reduction is a way of converting one problem to another problem in such a way that a solution to the second problem can be used to solve the first problem. Such reducibilities come up often in everyday life, even if we don’t usually refer to them in this way.
For example, suppose that you want to find your way around a new city. You
know that doing so would be easy if you had a map. Thus, you can reduce the
problem of finding your way around the city to the problem of obtaining a map
of the city.
Reducibility always involves two problems, which we call A and B. If A reduces to B, we can use a solution to B to solve A. So in our example, A is the problem of finding your way around the city and B is the problem of obtaining a map. Note that reducibility says nothing about solving A or B alone, but only about the solvability of A in the presence of a solution to B.

The following are further examples of reducibilities. The problem of traveling from Boston to Paris reduces to the problem of buying a plane ticket between the two cities. That problem in turn reduces to the problem of earning the money for the ticket. And that problem reduces to the problem of finding a job.
Reducibility also occurs in mathematical problems. For example, the problem
of measuring the area of a rectangle reduces to the problem of measuring its
length and width. The problem of solving a system of linear equations reduces
to the problem of inverting a matrix.
Reducibility plays an important role in classifying problems by decidability, and later in complexity theory as well. When A is reducible to B, solving A cannot be harder than solving B because a solution to B gives a solution to A. In terms of computability theory, if A is reducible to B and B is decidable, A also is decidable. Equivalently, if A is undecidable and reducible to B, B is undecidable. This last version is key to proving that various problems are undecidable.
In short, our method for proving that a problem is undecidable will be to
show that some other problem already known to be undecidable reduces to it.
5.1
UNDECIDABLE PROBLEMS FROM
LANGUAGE THEORY
We have already established the undecidability of A_TM, the problem of determining whether a Turing machine accepts a given input. Let’s consider a related problem, HALT_TM, the problem of determining whether a Turing machine halts (by accepting or rejecting) on a given input. This problem is widely known as the halting problem. We use the undecidability of A_TM to prove the undecidability of the halting problem by reducing A_TM to HALT_TM. Let

    HALT_TM = {⟨M, w⟩ | M is a TM and M halts on input w}.
THEOREM 5.1
HALT_TM is undecidable.
PROOF IDEA This proof is by contradiction. We assume that HALT_TM is decidable and use that assumption to show that A_TM is decidable, contradicting Theorem 4.11. The key idea is to show that A_TM is reducible to HALT_TM.

Let’s assume that we have a TM R that decides HALT_TM. Then we use R to construct S, a TM that decides A_TM. To get a feel for the way to construct S, pretend that you are S. Your task is to decide A_TM. You are given an input of the form ⟨M, w⟩. You must output accept if M accepts w, and you must output reject if M loops or rejects on w. Try simulating M on w. If it accepts or rejects, do the same. But you may not be able to determine whether M is looping, and in that case your simulation will not terminate. That’s bad because you are a decider and thus never permitted to loop. So this idea by itself does not work.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 241 ---
Instead, use the assumption that you have TM R that decides HALT_TM. With R, you can test whether M halts on w. If R indicates that M doesn't halt on w, reject because ⟨M, w⟩ isn't in A_TM. However, if R indicates that M does halt on w, you can do the simulation without any danger of looping.
Thus, if TM R exists, we can decide A_TM, but we know that A_TM is undecidable. By virtue of this contradiction, we can conclude that R does not exist. Therefore, HALT_TM is undecidable.
PROOF Let's assume for the purpose of obtaining a contradiction that TM R decides HALT_TM. We construct TM S to decide A_TM, with S operating as follows.

S = "On input ⟨M, w⟩, an encoding of a TM M and a string w:
1. Run TM R on input ⟨M, w⟩.
2. If R rejects, reject.
3. If R accepts, simulate M on w until it halts.
4. If M has accepted, accept; if M has rejected, reject."

Clearly, if R decides HALT_TM, then S decides A_TM. Because A_TM is undecidable, HALT_TM also must be undecidable.
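The shape of this reduction can be sketched in ordinary code, with Python predicates standing in for Turing machines. All names here (`make_A_TM_decider`, `halts_decider`, `M_even`) are hypothetical illustrations, not from the text; the `halts_decider` stub is honest only because every machine in this toy universe happens to halt.

```python
# Sketch of the reduction in Theorem 5.1. Python predicates stand in for
# Turing machines; `halts_decider` plays the role of the assumed decider
# R for HALT_TM.

def make_A_TM_decider(halts_decider):
    """Given a (hypothetical) decider R for HALT_TM, build S for A_TM."""
    def S(M, w):
        if not halts_decider(M, w):   # Step 2: R rejects, so M loops on w,
            return False              # hence <M, w> is not in A_TM.
        return M(w)                   # Steps 3-4: simulating M is now safe.
    return S

# Toy universe: these "machines" are total Python predicates, so a
# trivial halts_decider is honest here (a real R cannot exist).
M_even = lambda w: len(w) % 2 == 0
S = make_A_TM_decider(lambda M, w: True)
print(S(M_even, "aa"))  # True: M_even accepts "aa"
print(S(M_even, "a"))   # False
```

The contradiction in the theorem lives exactly in the stub: for arbitrary Turing machines, no correct `halts_decider` can exist.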
Theorem 5.1 illustrates our strategy for proving that a problem is undecidable. This strategy is common to most proofs of undecidability, except for the undecidability of A_TM itself, which is proved directly via the diagonalization method.
We now present several other theorems and their proofs as further examples of the reducibility method for proving undecidability. Let

E_TM = {⟨M⟩ | M is a TM and L(M) = ∅}.
THEOREM 5.2
E_TM is undecidable.
PROOF IDEA We follow the pattern adopted in Theorem 5.1. We assume that E_TM is decidable and then show that A_TM is decidable—a contradiction. Let R be a TM that decides E_TM. We use R to construct TM S that decides A_TM. How will S work when it receives input ⟨M, w⟩?
One idea is for S to run R on input ⟨M⟩ and see whether it accepts. If it does, we know that L(M) is empty and therefore that M does not accept w. But if R rejects ⟨M⟩, all we know is that L(M) is not empty and therefore that M accepts some string—but we still do not know whether M accepts the particular string w. So we need to use a different idea.
218 CHAPTER 5 / REDUCIBILITY
Instead of running R on ⟨M⟩, we run R on a modification of ⟨M⟩. We modify ⟨M⟩ to guarantee that M rejects all strings except w, but on input w it works as usual. Then we use R to determine whether the modified machine recognizes the empty language. The only string the machine can now accept is w, so its language will be nonempty iff it accepts w. If R accepts when it is fed a description of the modified machine, we know that the modified machine doesn't accept anything and that M doesn't accept w.
PROOF Let's write the modified machine described in the proof idea using our standard notation. We call it M1.

M1 = "On input x:
1. If x ≠ w, reject.
2. If x = w, run M on input w and accept if M does."

This machine has the string w as part of its description. It conducts the test of whether x = w in the obvious way, by scanning the input and comparing it character by character with w to determine whether they are the same.
Putting all this together, we assume that TM R decides E_TM and construct TM S that decides A_TM as follows.

S = "On input ⟨M, w⟩, an encoding of a TM M and a string w:
1. Use the description of M and w to construct the TM M1 just described.
2. Run R on input ⟨M1⟩.
3. If R accepts, reject; if R rejects, accept."

Note that S must actually be able to compute a description of M1 from a description of M and w. It is able to do so because it only needs to add extra states to M that perform the x = w test.
If R were a decider for E_TM, S would be a decider for A_TM. A decider for A_TM cannot exist, so we know that E_TM must be undecidable.
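The wrapper machine M1 can be mirrored directly in code, again with Python predicates as stand-ins for TMs (the helper name `make_M1` is hypothetical). This also illustrates why S can compute ⟨M1⟩ from ⟨M⟩ and w: the wrapper only puts the x = w test in front of M.

```python
# Sketch of the M1 construction in Theorem 5.2: reject everything
# except w, and on w defer to M. (If M loops on w, so does M1,
# exactly as in the text.)

def make_M1(M, w):
    def M1(x):
        if x != w:      # Step 1: x differs from w, so reject.
            return False
        return M(w)     # Step 2: run M on w and answer as M does.
    return M1

M = lambda s: s.startswith("a")   # toy machine
M1 = make_M1(M, "abba")
print(M1("abba"))  # True: M accepts w
print(M1("b"))     # False: any input other than w is rejected
```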
Another interesting computational problem regarding Turing machines concerns determining whether a given Turing machine recognizes a language that can also be recognized by a simpler computational model. For example, we let REGULAR_TM be the problem of determining whether a given Turing machine has an equivalent finite automaton. This problem is the same as determining whether the Turing machine recognizes a regular language. Let

REGULAR_TM = {⟨M⟩ | M is a TM and L(M) is a regular language}.
THEOREM 5.3
REGULAR_TM is undecidable.
PROOF IDEA As usual for undecidability theorems, this proof is by reduction from A_TM. We assume that REGULAR_TM is decidable by a TM R and use this assumption to construct a TM S that decides A_TM. Less obvious now is how to use R's ability to assist S in its task. Nonetheless, we can do so.
The idea is for S to take its input ⟨M, w⟩ and modify M so that the resulting TM recognizes a regular language if and only if M accepts w. We call the modified machine M2. We design M2 to recognize the nonregular language {0^n 1^n | n ≥ 0} if M does not accept w, and to recognize the regular language Σ* if M accepts w. We must specify how S can construct such an M2 from M and w. Here, M2 works by automatically accepting all strings in {0^n 1^n | n ≥ 0}. In addition, if M accepts w, M2 accepts all other strings.
Note that the TM M2 is not constructed for the purposes of actually running it on some input—a common confusion. We construct M2 only for the purpose of feeding its description into the decider for REGULAR_TM that we have assumed to exist. Once this decider returns its answer, we can use it to obtain the answer to whether M accepts w. Thus, we can decide A_TM, a contradiction.
PROOF We let R be a TM that decides REGULAR_TM and construct TM S to decide A_TM. Then S works in the following manner.

S = "On input ⟨M, w⟩, where M is a TM and w is a string:
1. Construct the following TM M2.
   M2 = "On input x:
   1. If x has the form 0^n 1^n, accept.
   2. If x does not have this form, run M on input w and accept if M accepts w."
2. Run R on input ⟨M2⟩.
3. If R accepts, accept; if R rejects, reject."
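The same stand-in style gives a sketch of M2. The helper names `is_0n1n` and `make_M2` are hypothetical; the point is that L(M2) is Σ* or {0^n 1^n | n ≥ 0} according to whether M accepts w, exactly as the proof requires.

```python
# Sketch of M2 from Theorem 5.3. If M accepts w, M2 accepts everything
# (a regular language); otherwise M2 accepts only {0^n 1^n | n >= 0},
# which is not regular.

def is_0n1n(x):
    n = len(x) // 2
    return len(x) % 2 == 0 and x == "0" * n + "1" * n

def make_M2(M, w):
    def M2(x):
        if is_0n1n(x):   # Step 1: strings of the form 0^n 1^n are accepted.
            return True
        return M(w)      # Step 2: otherwise, accept iff M accepts w.
    return M2

M2_yes = make_M2(lambda s: True, "w")   # M accepts w: L(M2) = Sigma*
M2_no = make_M2(lambda s: False, "w")   # M rejects w: L(M2) = {0^n 1^n}
print(M2_yes("abc"), M2_no("0011"), M2_no("abc"))  # True True False
```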
Similarly, the problems of testing whether the language of a Turing machine is a context-free language, a decidable language, or even a finite language can be shown to be undecidable with similar proofs. In fact, a general result, called Rice's theorem, states that determining any nontrivial property of the languages recognized by Turing machines is undecidable. We give Rice's theorem in Problem 5.28.
So far, our strategy for proving languages undecidable involves a reduction from A_TM. Sometimes reducing from some other undecidable language, such as E_TM, is more convenient when we are showing that certain languages are undecidable. Theorem 5.4 shows that testing the equivalence of two Turing
machines is an undecidable problem. We could prove it by a reduction from A_TM, but we use this opportunity to give an example of an undecidability proof by reduction from E_TM. Let

EQ_TM = {⟨M1, M2⟩ | M1 and M2 are TMs and L(M1) = L(M2)}.
THEOREM 5.4
EQ_TM is undecidable.
PROOF IDEA Show that if EQ_TM were decidable, E_TM also would be decidable, by giving a reduction from E_TM to EQ_TM. The idea is simple. E_TM is the problem of determining whether the language of a TM is empty. EQ_TM is the problem of determining whether the languages of two TMs are the same. If one of these languages happens to be ∅, we end up with the problem of determining whether the language of the other machine is empty—that is, the E_TM problem. So in a sense, the E_TM problem is a special case of the EQ_TM problem wherein one of the machines is fixed to recognize the empty language. This idea makes giving the reduction easy.
PROOF We let TM R decide EQ_TM and construct TM S to decide E_TM as follows.

S = "On input ⟨M⟩, where M is a TM:
1. Run R on input ⟨M, M1⟩, where M1 is a TM that rejects all inputs.
2. If R accepts, accept; if R rejects, reject."

If R decides EQ_TM, S decides E_TM. But E_TM is undecidable by Theorem 5.2, so EQ_TM also must be undecidable.
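This reduction is short enough to mirror line for line. The names are hypothetical, and the finite-sample `eq` stub stands in for the assumed decider R; it is only honest on this toy universe of total machines.

```python
# Sketch of Theorem 5.4's reduction: a decider for EQ_TM yields one for
# E_TM by comparing M against a machine that rejects all inputs.

def make_E_TM_decider(eq_decider):
    M1 = lambda x: False                 # a TM that rejects all inputs
    def S(M):
        return eq_decider(M, M1)         # L(M) = L(M1) iff L(M) is empty
    return S

# Toy stand-in for R: compare behavior on a small sample of strings.
SAMPLE = ["", "a", "b", "ab"]
eq = lambda A, B: all(A(x) == B(x) for x in SAMPLE)

S = make_E_TM_decider(eq)
print(S(lambda x: False))     # True: this machine's language is empty
print(S(lambda x: x == "a"))  # False: its language contains "a"
```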
REDUCTIONS VIA COMPUTATION HISTORIES
The computation history method is an important technique for proving that A_TM is reducible to certain languages. This method is often useful when the problem to be shown undecidable involves testing for the existence of something. For example, this method is used to show the undecidability of Hilbert's tenth problem, testing for the existence of integral roots of a polynomial.
The computation history for a Turing machine on an input is simply the sequence of configurations that the machine goes through as it processes the input. It is a complete record of the computation of this machine.
DEFINITION 5.5
Let M be a Turing machine and w an input string. An accepting computation history for M on w is a sequence of configurations, C1, C2, ..., Cl, where C1 is the start configuration of M on w, Cl is an accepting configuration of M, and each Ci legally follows from Ci−1 according to the rules of M. A rejecting computation history for M on w is defined similarly, except that Cl is a rejecting configuration.
Computation histories are finite sequences. If M doesn't halt on w, no accepting or rejecting computation history exists for M on w. Deterministic machines have at most one computation history on any given input. Nondeterministic machines may have many computation histories on a single input, corresponding to the various computation branches. For now, we continue to focus on deterministic machines. Our first undecidability proof using the computation history method concerns a type of machine called a linear bounded automaton.
DEFINITION 5.6
A linear bounded automaton is a restricted type of Turing machine wherein the tape head isn't permitted to move off the portion of the tape containing the input. If the machine tries to move its head off either end of the input, the head stays where it is—in the same way that the head will not move off the left-hand end of an ordinary Turing machine's tape.
A linear bounded automaton is a Turing machine with a limited amount of memory, as shown schematically in the following figure. It can only solve problems requiring memory that can fit within the tape used for the input. Using a tape alphabet larger than the input alphabet allows the available memory to be increased up to a constant factor. Hence we say that for an input of length n, the amount of memory available is linear in n—thus the name of this model.
FIGURE 5.7
Schematic of a linear bounded automaton
Despite their memory constraint, linear bounded automata (LBAs) are quite powerful. For example, the deciders for A_DFA, A_CFG, E_DFA, and E_CFG all are LBAs. Every CFL can be decided by an LBA. In fact, coming up with a decidable language that can't be decided by an LBA takes some work. We develop the techniques to do so in Chapter 9.
Here, A_LBA is the problem of determining whether an LBA accepts its input. Even though A_LBA is the same as the undecidable problem A_TM where the Turing machine is restricted to be an LBA, we can show that A_LBA is decidable. Let

A_LBA = {⟨M, w⟩ | M is an LBA that accepts string w}.

Before proving the decidability of A_LBA, we find the following lemma useful. It says that an LBA can have only a limited number of configurations when a string of length n is the input.
LEMMA 5.8
Let M be an LBA with q states and g symbols in the tape alphabet. There are exactly qng^n distinct configurations of M for a tape of length n.
PROOF Recall that a configuration of M is like a snapshot in the middle of its computation. A configuration consists of the state of the control, the position of the head, and the contents of the tape. Here, M has q states. The length of its tape is n, so the head can be in one of n positions, and g^n possible strings of tape symbols can appear on the tape. The product of these three quantities, qng^n, is the total number of different configurations of M with a tape of length n.
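The count in Lemma 5.8 is a straightforward product; a one-line helper (hypothetical name) makes the arithmetic concrete:

```python
# Lemma 5.8: an LBA with q states and tape alphabet of size g has
# q (choices of state) * n (head positions) * g**n (tape contents)
# configurations on a tape of length n.

def lba_configurations(q, g, n):
    return q * n * g ** n

print(lba_configurations(6, 4, 3))  # 6 * 3 * 4**3 = 1152
```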
THEOREM 5.9
A_LBA is decidable.
PROOF IDEA In order to decide whether LBA M accepts input w, we simulate M on w. During the course of the simulation, if M halts and accepts or rejects, we accept or reject accordingly. The difficulty occurs if M loops on w. We need to be able to detect looping so that we can halt and reject.
The idea for detecting when M is looping is that as M computes on w, it goes from configuration to configuration. If M ever repeats a configuration, it would go on to repeat this configuration over and over again and thus be in a loop. Because M is an LBA, the amount of tape available to it is limited. By Lemma 5.8, M can be in only a limited number of configurations on this amount of tape. Therefore, only a limited amount of time is available to M before it will enter some configuration that it has previously entered. Detecting that M is looping is possible by simulating M for the number of steps given by Lemma 5.8. If M has not halted by then, it must be looping.
PROOF The algorithm that decides A_LBA is as follows.

L = "On input ⟨M, w⟩, where M is an LBA and w is a string:
1. Simulate M on w for qng^n steps or until it halts.
2. If M has halted, accept if it has accepted and reject if it has rejected. If it has not halted, reject."

If M on w has not halted within qng^n steps, it must be repeating a configuration according to Lemma 5.8 and therefore looping. That is why our algorithm rejects in this instance.
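The decider L can be sketched as a bounded simulation loop. The configuration encoding and the single-step function `step` are hypothetical placeholders for a real LBA simulator; only the cutoff logic follows the proof.

```python
# Sketch of the decider L from Theorem 5.9: simulate for at most
# q * n * g**n steps (the bound from Lemma 5.8); a machine still
# running after that many steps has repeated a configuration and is
# looping, so reject.

def decide_A_LBA(step, start_config, q, g, n):
    bound = q * n * g ** n
    config = start_config
    for _ in range(bound):
        status, config = step(config)  # status: "accept", "reject", "run"
        if status == "accept":
            return True
        if status == "reject":
            return False
    return False  # never halted within the bound: M loops on w

# Toy machines over integer "configurations":
halting = lambda c: ("accept", c) if c == 3 else ("run", c + 1)
looping = lambda c: ("run", (c + 1) % 2)  # cycles 0, 1, 0, 1, ...
print(decide_A_LBA(halting, 0, 2, 2, 2))  # True
print(decide_A_LBA(looping, 0, 2, 2, 2))  # False: cutoff detects the loop
```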
Theorem 5.9 shows that LBAs and TMs differ in one essential way: For LBAs the acceptance problem is decidable, but for TMs it isn't. However, certain other problems involving LBAs remain undecidable. One is the emptiness problem E_LBA = {⟨M⟩ | M is an LBA where L(M) = ∅}. To prove that E_LBA is undecidable, we give a reduction that uses the computation history method.
THEOREM 5.10
E_LBA is undecidable.
PROOF IDEA This proof is by reduction from A_TM. We show that if E_LBA were decidable, A_TM would also be. Suppose that E_LBA is decidable. How can we use this supposition to decide A_TM?
For a TM M and an input w, we can determine whether M accepts w by constructing a certain LBA B and then testing whether L(B) is empty. The language that B recognizes comprises all accepting computation histories for M on w. If M accepts w, this language contains one string and so is nonempty. If M does not accept w, this language is empty. If we can determine whether B's language is empty, clearly we can determine whether M accepts w.
Now we describe how to construct B from M and w. Note that we need to show more than the mere existence of B. We have to show how a Turing machine can obtain a description of B, given descriptions of M and w.
As in the previous reductions we've given for proving undecidability, we construct B only to feed its description into the presumed E_LBA decider, not to run B on some input.
We construct B to accept its input x if x is an accepting computation history for M on w. Recall that an accepting computation history is the sequence of configurations, C1, C2, ..., Cl, that M goes through as it accepts some string w. For the purposes of this proof, we assume that the accepting computation history is presented as a single string with the configurations separated from each other by the # symbol, as shown in Figure 5.11.
#C1#C2#C3# ··· #Cl#

FIGURE 5.11
A possible input to B
The LBA B works as follows. When it receives an input x, B is supposed to accept if x is an accepting computation history for M on w. First, B breaks up x according to the delimiters into strings C1, C2, ..., Cl. Then B determines whether the Ci's satisfy the three conditions of an accepting computation history.
1. C1 is the start configuration for M on w.
2. Each Ci+1 legally follows from Ci.
3. Cl is an accepting configuration for M.
The start configuration C1 for M on w is the string q0 w1 w2 ··· wn, where q0 is the start state of M. Here, B has this string directly built in, so it is able to check the first condition. An accepting configuration is one that contains the q_accept state, so B can check the third condition by scanning Cl for q_accept. The second condition is the hardest to check. For each pair of adjacent configurations, B checks whether Ci+1 legally follows from Ci. This step involves verifying that Ci and Ci+1 are identical except for the positions under and adjacent to the head in Ci. These positions must be updated according to the transition function of M. Then B verifies that the updating was done properly by zig-zagging between corresponding positions of Ci and Ci+1. To keep track of the current positions while zig-zagging, B marks the current position with dots on the tape. Finally, if conditions 1, 2, and 3 are satisfied, B accepts its input.
By inverting the decider's answer, we obtain the answer to whether M accepts w. Thus we can decide A_TM, a contradiction.
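B's three checks are easy to mirror in code once a history is given a concrete encoding. Here a history is a Python list of configuration strings, and `yields` and `is_accepting` are hypothetical stand-ins for M's transition rules; the real B performs condition 2 by zig-zagging on its tape rather than by calling a helper.

```python
# Sketch of the three conditions B verifies on an alleged accepting
# computation history C1, C2, ..., Cl (encoded as a list of strings).

def is_accepting_history(history, start_config, yields, is_accepting):
    if not history or history[0] != start_config:  # condition 1
        return False
    if not is_accepting(history[-1]):              # condition 3
        return False
    return all(yields(a, b)                        # condition 2
               for a, b in zip(history, history[1:]))

# Toy machine: configuration "qi" may only step to "q(i+1)".
yields = lambda a, b: int(b[1:]) == int(a[1:]) + 1
is_acc = lambda c: c == "q2"
print(is_accepting_history(["q0", "q1", "q2"], "q0", yields, is_acc))  # True
print(is_accepting_history(["q0", "q2"], "q0", yields, is_acc))        # False
```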
PROOF Now we are ready to state the reduction of A_TM to E_LBA. Suppose that TM R decides E_LBA. Construct TM S to decide A_TM as follows.

S = "On input ⟨M, w⟩, where M is a TM and w is a string:
1. Construct LBA B from M and w as described in the proof idea.
2. Run R on input ⟨B⟩.
3. If R rejects, accept; if R accepts, reject."

If R accepts ⟨B⟩, then L(B) = ∅. Thus, M has no accepting computation history on w and M doesn't accept w. Consequently, S rejects ⟨M, w⟩. Similarly, if R rejects ⟨B⟩, the language of B is nonempty. The only string that B can accept is an accepting computation history for M on w. Thus, M must accept w. Consequently, S accepts ⟨M, w⟩. Figure 5.12 illustrates LBA B.
FIGURE 5.12
LBA B checking a TM computation history
We can also use the technique of reduction via computation histories to establish the undecidability of certain problems related to context-free grammars and pushdown automata. Recall that in Theorem 4.8 we presented an algorithm to decide whether a context-free grammar generates any strings—that is, whether L(G) = ∅. Now we show that a related problem is undecidable. It is the problem of determining whether a context-free grammar generates all possible strings. Proving that this problem is undecidable is the main step in showing that the equivalence problem for context-free grammars is undecidable. Let

ALL_CFG = {⟨G⟩ | G is a CFG and L(G) = Σ*}.
THEOREM 5.13
ALL_CFG is undecidable.
PROOF This proof is by contradiction. To get the contradiction, we assume that ALL_CFG is decidable and use this assumption to show that A_TM is decidable. This proof is similar to that of Theorem 5.10 but with a small extra twist: It is a reduction from A_TM via computation histories, but we modify the representation of the computation histories slightly, for a technical reason that we will explain later.
We now describe how to use a decision procedure for ALL_CFG to decide A_TM. For a TM M and an input w, we construct a CFG G that generates all strings if and only if M does not accept w. So if M does accept w, G does not generate some particular string. This string is—guess what—the accepting computation history for M on w. That is, G is designed to generate all strings that are not accepting computation histories for M on w.
To make the CFG G generate all strings that fail to be an accepting computation history for M on w, we utilize the following strategy. A string may fail to be an accepting computation history for several reasons. An accepting computation history for M on w appears as #C1#C2# ··· #Cl#, where Ci is the configuration of M on the ith step of the computation on w. Then, G generates all strings
1. that do not start with C1,
2. that do not end with an accepting configuration, or
3. in which some Ci does not properly yield Ci+1 under the rules of M.

If M does not accept w, no accepting computation history exists, so all strings fail in one way or another. Therefore, G would generate all strings, as desired.
Now we get down to the actual construction of G. Instead of constructing G, we construct a PDA D. We know that we can use the construction given in Theorem 2.20 (page 117) to convert D to a CFG. We do so because, for our purposes, designing a PDA is easier than designing a CFG. In this instance, D will start by nondeterministically branching to guess which of the preceding three conditions to check. One branch checks whether the beginning of the input string is C1 and accepts if it isn't. Another branch checks whether the input string ends with a configuration containing the accept state, q_accept, and accepts if it doesn't.
The third branch is supposed to accept if some Ci does not properly yield Ci+1. It works by scanning the input until it nondeterministically decides that it has come to Ci. Next, it pushes Ci onto the stack until it comes to the end, as marked by the # symbol. Then D pops the stack to compare with Ci+1. The two are supposed to match except around the head position, where the difference is dictated by the transition function of M. Finally, D accepts if it discovers a mismatch or an improper update.
The problem with this idea is that when D pops Ci off the stack, it is in reverse order and not suitable for comparison with Ci+1. At this point, the twist in the proof appears: We write the accepting computation history differently. Every other configuration appears in reverse order. The odd positions remain written in the forward order, but the even positions are written backward. Thus, an accepting computation history would appear as shown in the following figure.
#C1#C2^R#C3#C4^R# ··· #Cl#

FIGURE 5.14
Every other configuration written in reverse order (C^R denotes C written backward)
In this modified form, the PDA is able to push a configuration so that when it is popped, the order is suitable for comparison with the next one. We design D to accept any string that is not an accepting computation history in the modified form.
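The modified encoding is a one-line transformation on a concrete history (hypothetical helper; configurations here are plain strings): reverse every second configuration before joining with # delimiters.

```python
# The "twist" of Theorem 5.13: write every other configuration in
# reverse so that D's stack pops line up with the next configuration.

def modified_history(configs):
    cells = [c if i % 2 == 0 else c[::-1]    # C1, C3, ... forward;
             for i, c in enumerate(configs)]  # C2, C4, ... reversed
    return "#" + "#".join(cells) + "#"

print(modified_history(["ab", "cd", "ef"]))  # #ab#dc#ef#
```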
In Exercise 5.1 you can use Theorem 5.13 to show that EQ_CFG is undecidable.
5.2
A SIMPLE UNDECIDABLE PROBLEM
In this section we show that the phenomenon of undecidability is not confined to problems concerning automata. We give an example of an undecidable problem concerning simple manipulations of strings. It is called the Post Correspondence Problem, or PCP.
We can describe this problem easily as a type of puzzle. We begin with a collection of dominos, each containing two strings, one on each side. An individual domino looks like

[a/ab]

where we write [t/b] for a domino with t on top and b on the bottom. A collection of dominos looks like

{ [b/ca], [a/ab], [ca/a], [abc/c] }.
The task is to make a list of these dominos (repetitions permitted) so that the string we get by reading off the symbols on the top is the same as the string of symbols on the bottom. This list is called a match. For example, the following list is a match for this puzzle.

[a/ab] [b/ca] [ca/a] [a/ab] [abc/c]

Reading off the top string we get abcaaabc, which is the same as reading off the bottom. We can also depict this match by deforming the dominos so that the corresponding symbols from top and bottom line up.
For some collections of dominos, finding a match may not be possible. For example, the collection
{ [abc/ab], [ca/a], [acc/ba] }

cannot contain a match because every top string is longer than the corresponding bottom string.
The Post Correspondence Problem is to determine whether a collection of
dominos has a match. This problem is unsolvable by algorithms.
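Undecidability does not rule out searching for a match up to a cutoff. A brute-force sketch (hypothetical helper; dominos are (top, bottom) pairs) finds the match above, but no fixed cutoff can work for every collection, which is what the coming theorem makes precise.

```python
from itertools import product

# Brute-force search for a PCP match: try every sequence of domino
# indices up to max_len. This is only a semi-decision sketch; by
# Theorem 5.15 no algorithm decides the problem outright.

def find_match(dominos, max_len=6):
    """dominos: list of (top, bottom) pairs. Returns an index list or None."""
    for length in range(1, max_len + 1):
        for seq in product(range(len(dominos)), repeat=length):
            top = "".join(dominos[i][0] for i in seq)
            bottom = "".join(dominos[i][1] for i in seq)
            if top == bottom:
                return list(seq)
    return None

P = [("b", "ca"), ("a", "ab"), ("ca", "a"), ("abc", "c")]
print(find_match(P))  # [1, 0, 2, 1, 3]: both sides read abcaaabc

# Every top here is longer than its bottom, so no match can exist:
print(find_match([("abc", "ab"), ("ca", "a"), ("acc", "ba")]))  # None
```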
Before getting to the formal statement of this theorem and its proof, let's state the problem precisely and then express it as a language. An instance of the PCP is a collection P of dominos

P = { [t1/b1], [t2/b2], ..., [tk/bk] },

and a match is a sequence i1, i2, ..., il, where t_i1 t_i2 ··· t_il = b_i1 b_i2 ··· b_il. The problem is to determine whether P has a match. Let

PCP = {⟨P⟩ | P is an instance of the Post Correspondence Problem with a match}.
THEOREM 5.15
PCP is undecidable.
PROOF IDEA Conceptually this proof is simple, though it involves many details. The main technique is reduction from ATM via accepting computation histories. We show that from any TM M and input w, we can construct an instance P where a match is an accepting computation history for M on w. If we could determine whether the instance has a match, we would be able to determine whether M accepts w.

How can we construct P so that a match is an accepting computation history for M on w? We choose the dominos in P so that making a match forces a simulation of M to occur. In the match, each domino links a position or positions in one configuration with the corresponding one(s) in the next configuration.
Before getting to the construction, we handle three small technical points. (Don't worry about them too much on your initial reading through this construction.) First, for convenience in constructing P, we assume that M on w never attempts to move its head off the left-hand end of the tape. That requires first altering M to prevent this behavior. Second, if w = ε, we use the string ␣ in place of w in the construction. Third, we modify the PCP to require that a match starts with the first domino, [t1/b1]. Later we show how to eliminate this requirement. We call this problem the Modified Post Correspondence Problem (MPCP). Let

MPCP = { ⟨P⟩ | P is an instance of the Post Correspondence Problem with a match that starts with the first domino }.
Now let's move into the details of the proof and design P to simulate M on w.

PROOF We let TM R decide the PCP and construct S deciding ATM. Let

M = (Q, Σ, Γ, δ, q0, qaccept, qreject),
where Q, Σ, Γ, and δ are the state set, input alphabet, tape alphabet, and transition function of M, respectively.

In this case, S constructs an instance of the PCP P that has a match iff M accepts w. To do that, S first constructs an instance P′ of the MPCP. We describe the construction in seven parts, each of which accomplishes a particular aspect of simulating M on w. To explain what we are doing, we interleave the construction with an example of the construction in action.
Part 1. The construction begins in the following manner.

Put [# / #q0w1w2···wn#] into P′ as the first domino [t1/b1].

Because P′ is an instance of the MPCP, the match must begin with this domino. Thus, the bottom string begins correctly with C1 = q0w1w2···wn, the first configuration in the accepting computation history for M on w, as shown in the following figure.
FIGURE 5.16
Beginning of the MPCP match
In this depiction of the partial match achieved so far, the bottom string consists of #q0w1w2···wn# and the top string consists only of #. To get a match, we need to extend the top string to match the bottom string. We provide additional dominos to allow this extension. The additional dominos cause M's next configuration to appear at the extension of the bottom string by forcing a single-step simulation of M.

In parts 2, 3, and 4, we add to P′ dominos that perform the main part of the simulation. Part 2 handles head motions to the right, part 3 handles head motions to the left, and part 4 handles the tape cells not adjacent to the head.
Part 2. For every a, b ∈ Γ and every q, r ∈ Q where q ≠ qreject,

if δ(q, a) = (r, b, R), put [qa / br] into P′.

Part 3. For every a, b, c ∈ Γ and every q, r ∈ Q where q ≠ qreject,

if δ(q, a) = (r, b, L), put [cqa / rcb] into P′.
Part 4. For every a ∈ Γ,

put [a / a] into P′.
Now we make up a hypothetical example to illustrate what we have built so far. Let Γ = {0, 1, 2, ␣}. Say that w is the string 0100 and that the start state of M is q0. In state q0, upon reading a 0, let's say that the transition function dictates that M enters state q7, writes a 2 on the tape, and moves its head to the right. In other words, δ(q0, 0) = (q7, 2, R).

Part 1 places the domino [# / #q00100#] = [t1/b1] in P′, and the match begins
In addition, part 2 places the domino [q00 / 2q7], as δ(q0, 0) = (q7, 2, R), and part 4 places the dominos

[0/0], [1/1], [2/2], and [␣/␣]

in P′, as 0, 1, 2, and ␣ are the members of Γ. Together with part 5, that allows us to extend the match to
Thus, the dominos of parts 2, 3, and 4 let us extend the match by adding
the second configuration after the first one. We want this process to continue,
adding the third configuration, then the fourth, and so on. For it to happen, we
need to add one more domino for copying the # symbol.
Part 5.

Put [# / #] and [# / ␣#] into P′.
The first of these dominos allows us to copy the # symbol that marks the separation of the configurations. In addition to that, the second domino allows us to add a blank symbol ␣ at the end of the configuration to simulate the infinitely many blanks to the right that are suppressed when we write the configuration.

Continuing with the example, let's say that in state q7, upon reading a 1, M goes to state q5, writes a 0, and moves the head to the right. That is, δ(q7, 1) = (q5, 0, R). Then we have the domino [q71 / 0q5] in P′.
So the latest partial match extends to
Then, suppose that in state q5, upon reading a 0, M goes to state q9, writes a 2, and moves its head to the left. So δ(q5, 0) = (q9, 2, L). Then we have the dominos

[0q50 / q902], [1q50 / q912], [2q50 / q922], and [␣q50 / q9␣2].

The first one is relevant because the symbol to the left of the head is a 0. The preceding partial match extends to
Note that as we construct a match, we are forced to simulate M on input w.
This process continues until M reaches a halting state. If the accept state occurs,
we want to let the top of the partial match “catch up” with the bottom so that
the match is complete. We can arrange for that to happen by adding additional
dominos.
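Before moving on, note that the dominos of parts 2, 3, and 4 are determined mechanically by M's transition function. As a sketch (the dictionary encoding of δ and all names are ours, not the book's), the following Python generates them for the two transitions used in the running example, with string concatenation standing in for configurations:

```python
def simulation_dominos(delta, gamma, q_reject):
    """Build the MPCP dominos of parts 2-4 from a transition function
    delta[(q, a)] = (r, b, move), where states and tape symbols are
    strings, '_' stands in for the blank symbol."""
    dominos = []
    for (q, a), (r, b, move) in delta.items():
        if q == q_reject:
            continue                          # no dominos for the reject state
        if move == "R":                       # part 2: [qa / br]
            dominos.append((q + a, b + r))
        else:                                 # part 3: [cqa / rcb], one per c in Γ
            for c in gamma:
                dominos.append((c + q + a, r + c + b))
    for a in gamma:                           # part 4: [a / a]
        dominos.append((a, a))
    return dominos

# The hypothetical transitions from the example:
# δ(q0,0) = (q7,2,R) and δ(q5,0) = (q9,2,L).
delta = {("q0", "0"): ("q7", "2", "R"),
         ("q5", "0"): ("q9", "2", "L")}
doms = simulation_dominos(delta, gamma=["0", "1", "2", "_"], q_reject="qr")
```

The output contains exactly the dominos displayed in the text: [q00/2q7] from part 2, the four left-move dominos such as [0q50/q902] from part 3, and the copying dominos [a/a] from part 4.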
Part 6. For every a ∈ Γ,

put [a qaccept / qaccept] and [qaccept a / qaccept] into P′.
This step has the effect of adding “pseudo-steps” of the Turing machine after it has halted, where the head “eats” adjacent symbols until none are left. Continuing with the example, if the partial match up to the point when the machine halts in the accept state is as depicted in the figure, the dominos we have just added allow the match to continue:
Part 7. Finally, we add the domino

[qaccept## / #]

and complete the match:
That concludes the construction of P′. Recall that P′ is an instance of the MPCP whereby the match simulates the computation of M on w. To finish the proof, we recall that the MPCP differs from the PCP in that the match is required to start with the first domino in the list. If we view P′ as an instance of
the PCP instead of the MPCP, it obviously has a match, regardless of whether M accepts w. Can you find it? (Hint: It is very short.)
We now show how to convert P′ to P, an instance of the PCP that still simulates M on w. We do so with a somewhat technical trick. The idea is to take the requirement that the match starts with the first domino and build it directly into the problem instance itself so that it becomes enforced automatically. After that, the requirement isn't needed. We introduce some notation to implement this idea.
Let u = u1u2···un be any string of length n. Define ⋆u, u⋆, and ⋆u⋆ to be the three strings

⋆u  = ∗u1∗u2∗u3∗ ··· ∗un
u⋆  =  u1∗u2∗u3∗ ··· ∗un∗
⋆u⋆ = ∗u1∗u2∗u3∗ ··· ∗un∗.

Here, ⋆u adds the symbol ∗ before every character in u, u⋆ adds one after each character in u, and ⋆u⋆ adds one both before and after each character in u.
To convert P′ to P, an instance of the PCP, we do the following. If P′ were the collection

{ [t1/b1], [t2/b2], [t3/b3], ..., [tk/bk] },

we let P be the collection

{ [⋆t1 / ⋆b1⋆], [⋆t1 / b1⋆], [⋆t2 / b2⋆], [⋆t3 / b3⋆], ..., [⋆tk / bk⋆], [∗✸ / ✸] }.
Considering P as an instance of the PCP, we see that the only domino that could possibly start a match is the first one,

[⋆t1 / ⋆b1⋆],

because it is the only one where both the top and the bottom start with the same symbol, namely ∗. Besides forcing the match to start with the first domino, the presence of the ∗s doesn't affect possible matches because they simply interleave with the original symbols. The original symbols now occur in the even positions of the match. The domino [∗✸ / ✸] is there to allow the top to add the extra ∗ at the end of the match.
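The ∗-interleaving conversion is simple enough to write down directly. Here is a sketch in Python (ours, not the book's; '@' stands in for the ✸ symbol) that produces exactly the collection described above from a list of (top, bottom) dominos whose first element is the distinguished MPCP domino:

```python
def star_before(u):
    """The string ⋆u: a ∗ before every symbol of u."""
    return "".join("*" + ch for ch in u)

def star_after(u):
    """The string u⋆: a ∗ after every symbol of u."""
    return "".join(ch + "*" for ch in u)

def mpcp_to_pcp(dominos):
    """Turn an MPCP instance (first domino distinguished) into an
    ordinary PCP instance via the ∗-interleaving trick."""
    t1, b1 = dominos[0]
    pcp = [(star_before(t1), "*" + star_after(b1))]           # [⋆t1 / ⋆b1⋆]
    pcp += [(star_before(t), star_after(b)) for t, b in dominos]  # [⋆ti / bi⋆]
    pcp.append(("*@", "@"))                                   # [∗✸ / ✸]
    return pcp

P_prime = [("a", "ab"), ("b", "ca")]   # a toy MPCP instance, for illustration
P = mpcp_to_pcp(P_prime)
```

Checking the output confirms the argument in the text: only the first converted domino has top and bottom beginning with the same symbol, so only it can start a match.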
5.3
MAPPING REDUCIBILITY
We have shown how to use the reducibility technique to prove that various problems are undecidable. In this section we formalize the notion of reducibility. Doing so allows us to use reducibility in more refined ways, such as for proving that certain languages are not Turing-recognizable and for applications in complexity theory.

The notion of reducing one problem to another may be defined formally in one of several ways. The choice of which one to use depends on the application. Our choice is a simple type of reducibility called mapping reducibility.¹
Roughly speaking, being able to reduce problem A to problem B by using a mapping reducibility means that a computable function exists that converts instances of problem A to instances of problem B. If we have such a conversion function, called a reduction, we can solve A with a solver for B. The reason is that any instance of A can be solved by first using the reduction to convert it to an instance of B and then applying the solver for B. A precise definition of mapping reducibility follows shortly.
COMPUTABLE FUNCTIONS
A Turing machine computes a function by starting with the input to the function on the tape and halting with the output of the function on the tape.

DEFINITION 5.17
A function f: Σ∗ → Σ∗ is a computable function if some Turing machine M, on every input w, halts with just f(w) on its tape.
EXAMPLE 5.18
All usual arithmetic operations on integers are computable functions. For example, we can make a machine that takes input ⟨m, n⟩ and returns m + n, the sum of m and n. We don't give any details here, leaving them as exercises.
EXAMPLE 5.19
Computable functions may be transformations of machine descriptions. For example, one computable function f takes input w and returns the description of a Turing machine ⟨M′⟩ if w = ⟨M⟩ is an encoding of a Turing machine M.
¹It is called many-one reducibility in some other textbooks.
The machine M′ is a machine that recognizes the same language as M, but never attempts to move its head off the left-hand end of its tape. The function f accomplishes this task by adding several states to the description of M. The function returns ε if w is not a legal encoding of a Turing machine.
FORMAL DEFINITION OF MAPPING REDUCIBILITY
Now we define mapping reducibility. As usual, we represent computational
problems by languages.
DEFINITION 5.20
Language A is mapping reducible to language B, written A ≤m B, if there is a computable function f: Σ∗ → Σ∗, where for every w,

w ∈ A ⇐⇒ f(w) ∈ B.

The function f is called the reduction from A to B.
The following figure illustrates mapping reducibility.
FIGURE 5.21
Function f reducing A to B
A mapping reduction of A to B provides a way to convert questions about membership testing in A to membership testing in B. To test whether w ∈ A, we use the reduction f to map w to f(w) and test whether f(w) ∈ B. The term mapping reduction comes from the function or mapping that provides the means of doing the reduction.

If one problem is mapping reducible to a second, previously solved problem, we can thereby obtain a solution to the original problem. We capture this idea in Theorem 5.22.
THEOREM 5.22
If A ≤m B and B is decidable, then A is decidable.

PROOF We let M be the decider for B and f be the reduction from A to B. We describe a decider N for A as follows.

N = “On input w:
1. Compute f(w).
2. Run M on input f(w) and output whatever M outputs.”

Clearly, if w ∈ A, then f(w) ∈ B because f is a reduction from A to B. Thus, M accepts f(w) whenever w ∈ A. Therefore, N works as desired.
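The decider N is nothing more than function composition: answer the B-question about f(w). As a sketch in Python (the toy languages and all names are ours, chosen decidable so the example actually runs; they are not from the text):

```python
def decider_from_reduction(f, decide_B):
    """The decider N of Theorem 5.22: on input w, compute f(w) and
    output whatever the decider for B outputs on f(w)."""
    def decide_A(w):
        return decide_B(f(w))
    return decide_A

# Toy illustration: A = binary strings of even length,
# B = strings ending in '0'.  The reduction f maps even-length
# strings to "0" and odd-length ones to "1", so w in A iff f(w) in B,
# exactly as Definition 5.20 requires.
f = lambda w: "0" if len(w) % 2 == 0 else "1"
decide_B = lambda x: x.endswith("0")
decide_A = decider_from_reduction(f, decide_B)
```

Note that decide_A never inspects w directly; every question about A is routed through f and the solver for B, which is the whole content of the theorem.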
The following corollary of Theorem 5.22 has been our main tool for proving
undecidability.
COROLLARY 5.23
If A ≤m B and A is undecidable, then B is undecidable.
Now we revisit some of our earlier proofs that used the reducibility method
to get examples of mapping reducibilities.
EXAMPLE 5.24
In Theorem 5.1 we used a reduction from ATM to prove that HALTTM is undecidable. This reduction showed how a decider for HALTTM could be used to give a decider for ATM. We can demonstrate a mapping reducibility from ATM to HALTTM as follows. To do so, we must present a computable function f that takes input of the form ⟨M, w⟩ and returns output of the form ⟨M′, w′⟩, where ⟨M, w⟩ ∈ ATM if and only if ⟨M′, w′⟩ ∈ HALTTM.
The following machine F computes a reduction f.

F = “On input ⟨M, w⟩:
1. Construct the following machine M′.
   M′ = “On input x:
   1. Run M on x.
   2. If M accepts, accept.
   3. If M rejects, enter a loop.”
2. Output ⟨M′, w⟩.”
A minor issue arises here concerning improperly formed input strings. If TM F determines that its input is not of the correct form as specified in the input line “On input ⟨M, w⟩:” and hence that the input is not in ATM, the TM outputs a
string not in HALTTM. Any string not in HALTTM will do. In general, when we describe a Turing machine that computes a reduction from A to B, improperly formed inputs are assumed to map to strings outside of B.
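At the level of behavior, the machine M′ built by F simply wraps M so that rejection becomes looping. The sketch below is ours: it models a machine as a Python predicate, which is necessarily total, an assumption real TMs need not satisfy, since they may loop themselves; the sketch captures only the accept/reject-to-loop conversion.

```python
def make_M_prime(m_accepts):
    """Behavior of the machine M' constructed by F: run M on x,
    accept if M accepts, and loop forever if M rejects.  Here M is
    modeled as a total predicate m_accepts; a genuine TM may also
    fail to halt, a case a Python function cannot represent."""
    def M_prime(x):
        if m_accepts(x):
            return "accept"
        while True:        # rejection becomes non-halting, so
            pass           # M' halts on x  iff  M accepts x
    return M_prime

M = lambda x: x.startswith("a")   # a stand-in "machine" for illustration
M_prime = make_M_prime(M)
```

Calling M_prime on a string M accepts returns "accept"; calling it on a string M rejects never returns, which is exactly why ⟨M, w⟩ ∈ ATM iff ⟨M′, w⟩ ∈ HALTTM.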
EXAMPLE 5.25
The proof of the undecidability of the Post Correspondence Problem in Theorem 5.15 contains two mapping reductions. First, it shows that ATM ≤m MPCP, and then it shows that MPCP ≤m PCP. In both cases, we can easily obtain the actual reduction function and show that it is a mapping reduction. As Exercise 5.6 shows, mapping reducibility is transitive, so these two reductions together imply that ATM ≤m PCP.
EXAMPLE 5.26
A mapping reduction from ETM to EQTM lies in the proof of Theorem 5.4. In this case, the reduction f maps the input ⟨M⟩ to the output ⟨M, M1⟩, where M1 is the machine that rejects all inputs.
EXAMPLE 5.27
The proof of Theorem 5.2 showing that ETM is undecidable illustrates the difference between the formal notion of mapping reducibility that we have defined in this section and the informal notion of reducibility that we used earlier in this chapter. The proof shows that ETM is undecidable by reducing ATM to it. Let's see whether we can convert this reduction to a mapping reduction.

From the original reduction, we may easily construct a function f that takes input ⟨M, w⟩ and produces output ⟨M1⟩, where M1 is the Turing machine described in that proof. But M accepts w iff L(M1) is not empty, so f is a mapping reduction from ATM to the complement of ETM. It still shows that ETM is undecidable because decidability is not affected by complementation, but it doesn't give a mapping reduction from ATM to ETM. In fact, no such reduction exists, as you are asked to show in Exercise 5.5.
The sensitivity of mapping reducibility to complementation is important
in the use of reducibility to prove nonrecognizability of certain languages.
We can also use mapping reducibility to show that problems are not Turing-
recognizable. The following theorem is analogous to Theorem 5.22.
THEOREM 5.28
If A ≤m B and B is Turing-recognizable, then A is Turing-recognizable.

The proof is the same as that of Theorem 5.22, except that M and N are recognizers instead of deciders.
COROLLARY 5.29
If A ≤m B and A is not Turing-recognizable, then B is not Turing-recognizable.

In a typical application of this corollary, we let A be A̅TM, the complement of ATM. We know that A̅TM is not Turing-recognizable from Corollary 4.23. The definition of mapping reducibility implies that A ≤m B means the same as A̅ ≤m B̅. To prove that B isn't recognizable, we may show that ATM ≤m B̅. We can also use mapping reducibility to show that certain problems are neither Turing-recognizable nor co-Turing-recognizable, as in the following theorem.
THEOREM 5.30
EQTM is neither Turing-recognizable nor co-Turing-recognizable.

PROOF First we show that EQTM is not Turing-recognizable. We do so by showing that ATM is reducible to the complement of EQTM. The reducing function f works as follows.

F = “On input ⟨M, w⟩, where M is a TM and w a string:
1. Construct the following two machines, M1 and M2.
   M1 = “On any input:
   1. Reject.”
   M2 = “On any input:
   1. Run M on w. If it accepts, accept.”
2. Output ⟨M1, M2⟩.”

Here, M1 accepts nothing. If M accepts w, M2 accepts everything, and so the two machines are not equivalent. Conversely, if M doesn't accept w, M2 accepts nothing, and they are equivalent. Thus f reduces ATM to the complement of EQTM, as desired.

To show that the complement of EQTM is not Turing-recognizable, we give a reduction from ATM to its complement, namely EQTM itself. Hence we show that ATM ≤m EQTM. The following TM G computes the reducing function g.

G = “On input ⟨M, w⟩, where M is a TM and w a string:
1. Construct the following two machines, M1 and M2.
   M1 = “On any input:
   1. Accept.”
   M2 = “On any input:
   1. Run M on w.
   2. If it accepts, accept.”
2. Output ⟨M1, M2⟩.”

The only difference between f and g is in machine M1. In f, machine M1 always rejects, whereas in g it always accepts. In both f and g, M accepts w iff M2 always accepts. In g, M accepts w iff M1 and M2 are equivalent. That is why g is a reduction from ATM to EQTM.
EXERCISES
5.1 Show that EQCFG is undecidable.
5.2 Show that EQCFG is co-Turing-recognizable.
5.3 Find a match in the following instance of the Post Correspondence Problem.
{ [ab/abab], [b/a], [aba/b], [aa/a] }
5.4 If A ≤m B and B is a regular language, does that imply that A is a regular language? Why or why not?
A5.5 Show that ATM is not mapping reducible to ETM. In other words, show that no computable function reduces ATM to ETM. (Hint: Use a proof by contradiction, and facts you already know about ATM and ETM.)
A5.6 Show that ≤m is a transitive relation.
A5.7 Show that if A is Turing-recognizable and A ≤m A̅, then A is decidable.
A5.8 In the proof of Theorem 5.15, we modified the Turing machine M so that it never tries to move its head off the left-hand end of the tape. Suppose that we did not make this modification to M. Modify the PCP construction to handle this case.
PROBLEMS
5.9 Let T = {⟨M⟩ | M is a TM that accepts wR whenever it accepts w}. Show that T is undecidable.
A5.10 Consider the problem of determining whether a two-tape Turing machine ever writes a nonblank symbol on its second tape when it is run on input w. Formulate this problem as a language and show that it is undecidable.
A5.11 Consider the problem of determining whether a two-tape Turing machine ever writes a nonblank symbol on its second tape during the course of its computation on any input string. Formulate this problem as a language and show that it is undecidable.
5.12 Consider the problem of determining whether a single-tape Turing machine ever writes a blank symbol over a nonblank symbol during the course of its computation on any input string. Formulate this problem as a language and show that it is undecidable.
5.13 A useless state in a Turing machine is one that is never entered on any input string. Consider the problem of determining whether a Turing machine has any useless states. Formulate this problem as a language and show that it is undecidable.
5.14 Consider the problem of determining whether a Turing machine M on an input w ever attempts to move its head left when its head is on the left-most tape cell. Formulate this problem as a language and show that it is undecidable.
5.15 Consider the problem of determining whether a Turing machine M on an input w ever attempts to move its head left at any point during its computation on w. Formulate this problem as a language and show that it is decidable.
5.16 Let Γ = {0, 1, ␣} be the tape alphabet for all TMs in this problem. Define the busy beaver function BB: N → N as follows. For each value of k, consider all k-state TMs that halt when started with a blank tape. Let BB(k) be the maximum number of 1s that remain on the tape among all of these machines. Show that BB is not a computable function.
5.17 Show that the Post Correspondence Problem is decidable over the unary alphabet Σ = {1}.
5.18 Show that the Post Correspondence Problem is undecidable over the binary alphabet Σ = {0, 1}.
5.19 In the silly Post Correspondence Problem, SPCP, the top string in each pair has the same length as the bottom string. Show that the SPCP is decidable.
5.20 Prove that there exists an undecidable subset of {1}∗.
5.21 Let AMBIGCFG = {⟨G⟩ | G is an ambiguous CFG}. Show that AMBIGCFG is undecidable. (Hint: Use a reduction from PCP. Given an instance

P = { [t1/b1], [t2/b2], ..., [tk/bk] }

of the Post Correspondence Problem, construct a CFG G with the rules

S → T | B
T → t1Ta1 | ··· | tkTak | t1a1 | ··· | tkak
B → b1Ba1 | ··· | bkBak | b1a1 | ··· | bkak,

where a1, ..., ak are new terminal symbols. Prove that this reduction works.)
5.22 Show that A is Turing-recognizable iff A ≤m ATM.
5.23 Show that A is decidable iff A ≤m 0∗1∗.
5.24 Let J = {w | either w = 0x for some x ∈ ATM, or w = 1y for some y ∈ A̅TM}. Show that neither J nor J̅ is Turing-recognizable.
5.25 Give an example of an undecidable language B, where B ≤m B̅.
5.26 Define a two-headed finite automaton (2DFA) to be a deterministic finite automaton that has two read-only, bidirectional heads that start at the left-hand end of the input tape and can be independently controlled to move in either direction. The tape of a 2DFA is finite and is just large enough to contain the input plus two additional blank tape cells, one on the left-hand end and one on the right-hand end, that serve as delimiters. A 2DFA accepts its input by entering a special accept state. For example, a 2DFA can recognize the language {aⁿbⁿcⁿ | n ≥ 0}.
a. Let A2DFA = {⟨M, x⟩ | M is a 2DFA and M accepts x}. Show that A2DFA is decidable.
b. Let E2DFA = {⟨M⟩ | M is a 2DFA and L(M) = ∅}. Show that E2DFA is not decidable.
5.27 A two-dimensional finite automaton (2DIM-DFA) is defined as follows. The input is an m × n rectangle, for any m, n ≥ 2. The squares along the boundary of the rectangle contain the symbol # and the internal squares contain symbols over the input alphabet Σ. The transition function δ: Q × (Σ ∪ {#}) → Q × {L, R, U, D} indicates the next state and the new head position (Left, Right, Up, Down). The machine accepts when it enters one of the designated accept states. It rejects if it tries to move off the input rectangle or if it never halts. Two such machines are equivalent if they accept the same rectangles. Consider the problem of determining whether two of these machines are equivalent. Formulate this problem as a language and show that it is undecidable.
A⋆5.28 Rice's theorem. Let P be any nontrivial property of the language of a Turing machine. Prove that the problem of determining whether a given Turing machine's language has property P is undecidable.
In more formal terms, let P be a language consisting of Turing machine descriptions where P fulfills two conditions. First, P is nontrivial: it contains some, but not all, TM descriptions. Second, P is a property of the TM's language: whenever L(M1) = L(M2), we have ⟨M1⟩ ∈ P iff ⟨M2⟩ ∈ P. Here, M1 and M2 are any TMs. Prove that P is an undecidable language.
5.29 Show that both conditions in Problem 5.28 are necessary for proving that P is undecidable.
5.30 Use Rice's theorem, which appears in Problem 5.28, to prove the undecidability of each of the following languages.
A a. INFINITE_TM = {⟨M⟩ | M is a TM and L(M) is an infinite language}.
b. {⟨M⟩ | M is a TM and 1011 ∈ L(M)}.
c. ALL_TM = {⟨M⟩ | M is a TM and L(M) = Σ*}.
5.31 Let
    f(x) = 3x + 1   for odd x
    f(x) = x/2      for even x
for any natural number x. If you start with an integer x and iterate f, you obtain a sequence, x, f(x), f(f(x)), .... Stop if you ever hit 1. For example, if x = 17, you get the sequence 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1. Extensive computer tests have shown that every starting point between 1 and a large positive integer gives a sequence that ends in 1. But the question of whether all positive starting points end up at 1 is unsolved; it is called the 3x + 1 problem.
Suppose that A_TM were decidable by a TM H. Use H to describe a TM that is guaranteed to state the answer to the 3x + 1 problem.
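The iteration described in the problem is easy to experiment with. A minimal Python sketch (the function name and the step bound are our own choices, not part of the problem) computes the sequence for a given starting point; it is a bounded empirical check, not a decision procedure, since whether the loop terminates for every positive x is exactly the open question:

```python
def collatz_reaches_one(x, max_steps=10_000):
    """Iterate f until the value hits 1, recording the sequence.

    Returns (reached_one, sequence). A bounded empirical check only:
    whether it succeeds for every positive x is the open 3x+1 problem.
    """
    seq = [x]
    while x != 1 and len(seq) <= max_steps:
        x = 3 * x + 1 if x % 2 else x // 2
        seq.append(x)
    return x == 1, seq
```

Running it on 17 reproduces the sequence given in the problem statement.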
5.32 Prove that the following two languages are undecidable.
a. OVERLAP_CFG = {⟨G, H⟩ | G and H are CFGs where L(G) ∩ L(H) ≠ ∅}. (Hint: Adapt the hint in Problem 5.21.)
b. PREFIX-FREE_CFG = {⟨G⟩ | G is a CFG where L(G) is prefix-free}.
5.33 Consider the problem of determining whether a PDA accepts some string of the form {ww | w ∈ {0,1}*}. Use the computation history method to show that this problem is undecidable.
5.34 Let X = {⟨M, w⟩ | M is a single-tape TM that never modifies the portion of the tape that contains the input w}. Is X decidable? Prove your answer.
5.35 Say that a variable A in CFG G is necessary if it appears in every derivation of some string w ∈ L(G). Let NECESSARY_CFG = {⟨G, A⟩ | A is a necessary variable in G}.
a. Show that NECESSARY_CFG is Turing-recognizable.
b. Show that NECESSARY_CFG is undecidable.
⋆5.36 Say that a CFG is minimal if none of its rules can be removed without changing the language generated. Let MIN_CFG = {⟨G⟩ | G is a minimal CFG}.
a. Show that MIN_CFG is Turing-recognizable.
b. Show that MIN_CFG is undecidable.
SELECTED SOLUTIONS
5.5 Suppose for a contradiction that A_TM ≤m E_TM via reduction f. It follows from the definition of mapping reducibility that Ā_TM ≤m Ē_TM via the same reduction function f. However, Ē_TM is Turing-recognizable (see the solution to Exercise 4.5) and Ā_TM is not Turing-recognizable, contradicting Theorem 5.28.
5.6 Suppose A ≤m B and B ≤m C. Then there are computable functions f and g such that x ∈ A ⟺ f(x) ∈ B and y ∈ B ⟺ g(y) ∈ C. Consider the composition function h(x) = g(f(x)). We can build a TM that computes h as follows: First, simulate a TM for f (such a TM exists because we assumed that f is computable) on input x and call the output y. Then simulate a TM for g on y. The output is h(x) = g(f(x)). Therefore, h is a computable function. Moreover, x ∈ A ⟺ h(x) ∈ C. Hence A ≤m C via the reduction function h.
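The composition step in this solution can be illustrated concretely. The toy languages and reductions below are our own invented examples, not from the text: take A = even naturals, B = even-length strings, and C = strings ending in '!'.

```python
# Toy illustration of solution 5.6: composing two mapping reductions.
def f(x):
    """x in A (x even)  <=>  f(x) in B (even-length string)."""
    return 'a' * x

def g(y):
    """y in B (even length)  <=>  g(y) in C (ends in '!')."""
    return y + ('!' if len(y) % 2 == 0 else '?')

def h(x):
    """The composition h = g ∘ f reduces A to C."""
    return g(f(x))
```

Membership is preserved end to end: even inputs land in C, odd ones do not.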
5.7 Suppose that A ≤m Ā. Then Ā ≤m A via the same mapping reduction. Because A is Turing-recognizable, Theorem 5.28 implies that Ā is Turing-recognizable, and then Theorem 4.22 implies that A is decidable.
5.8 You need to handle the case where the head is at the leftmost tape cell and attempts to move left. To do so, add dominos
    [ #qa / #rb ]
for every q, r ∈ Q and a, b ∈ Γ, where δ(q, a) = (r, b, L). Additionally, replace the first domino with
    [ # / #q0w1w2···wn ]
to handle the case where the head attempts to move left in the very first move.
5.10 Let B = {⟨M, w⟩ | M is a two-tape TM that writes a nonblank symbol on its second tape when it is run on w}. Show that A_TM reduces to B. Assume for the sake of contradiction that TM R decides B. Then construct a TM S that uses R to decide A_TM.
S = "On input ⟨M, w⟩:
1. Use M to construct the following two-tape TM T.
   T = "On input x:
   1. Simulate M on x using the first tape.
   2. If the simulation shows that M accepts, write a nonblank symbol on the second tape."
2. Run R on ⟨T, w⟩ to determine whether T on input w writes a nonblank symbol on its second tape.
3. If R accepts, M accepts w, so accept. Otherwise, reject."
5.11 Let C = {⟨M⟩ | M is a two-tape TM that writes a nonblank symbol on its second tape when it is run on some input}. Show that A_TM reduces to C. Assume for the sake of contradiction that TM R decides C. Construct a TM S that uses R to decide A_TM.
S = "On input ⟨M, w⟩:
1. Use M and w to construct the following two-tape TM T_w.
   T_w = "On any input:
   1. Simulate M on w using the first tape.
   2. If the simulation shows that M accepts, write a nonblank symbol on the second tape."
2. Run R on ⟨T_w⟩ to determine whether T_w ever writes a nonblank symbol on its second tape.
3. If R accepts, M accepts w, so accept. Otherwise, reject."
5.28 Assume for the sake of contradiction that P is a decidable language satisfying the properties and let R_P be a TM that decides P. We show how to decide A_TM using R_P by constructing TM S. First, let T∅ be a TM that always rejects, so L(T∅) = ∅. You may assume that ⟨T∅⟩ ∉ P without loss of generality because you could proceed with P̄ instead of P if ⟨T∅⟩ ∈ P. Because P is not trivial, there exists a TM T with ⟨T⟩ ∈ P. Design S to decide A_TM using R_P's ability to distinguish between T∅ and T.
S = "On input ⟨M, w⟩:
1. Use M and w to construct the following TM M_w.
   M_w = "On input x:
   1. Simulate M on w. If it halts and rejects, reject. If it accepts, proceed to stage 2.
   2. Simulate T on x. If it accepts, accept."
2. Use TM R_P to determine whether ⟨M_w⟩ ∈ P. If YES, accept. If NO, reject."
TM M_w simulates T if M accepts w. Hence L(M_w) equals L(T) if M accepts w and ∅ otherwise. Therefore, ⟨M_w⟩ ∈ P iff M accepts w.
5.30 (a) INFINITE_TM is a language of TM descriptions. It satisfies the two conditions of Rice's theorem. First, it is nontrivial because some TMs have infinite languages and others do not. Second, it depends only on the language. If two TMs recognize the same language, either both have descriptions in INFINITE_TM or neither does. Consequently, Rice's theorem implies that INFINITE_TM is undecidable.
6
ADVANCED TOPICS IN COMPUTABILITY THEORY
In this chapter we delve into four deeper aspects of computability theory: (1) the recursion theorem, (2) logical theories, (3) Turing reducibility, and (4) descriptive complexity. The topic covered in each section is mainly independent of the others, except for an application of the recursion theorem at the end of the section on logical theories. Part Three of this book doesn't depend on any material from this chapter.
6.1 THE RECURSION THEOREM
The recursion theorem is a mathematical result that plays an important role in advanced work in the theory of computability. It has connections to mathematical logic, the theory of self-reproducing systems, and even computer viruses.
To introduce the recursion theorem, we consider a paradox that arises in the study of life. It concerns the possibility of making machines that can construct replicas of themselves. The paradox can be summarized in the following manner.
1. Living things are machines.
2. Living things can self-reproduce.
3. Machines cannot self-reproduce.
Statement 1 is a tenet of modern biology. We believe that organisms operate in a mechanistic way. Statement 2 is obvious. The ability to self-reproduce is an essential characteristic of every biological species. For statement 3, we make the following argument that machines cannot self-reproduce. Consider a machine that constructs other machines, such as an automated factory that produces cars. Raw materials go in at one end, the manufacturing robots follow a set of instructions, and then completed vehicles come out the other end.
We claim that the factory must be more complex than the cars produced, in the sense that designing the factory would be more difficult than designing a car. This claim must be true because the factory itself has the car's design within it, in addition to the design of all the manufacturing robots. The same reasoning applies to any machine A that constructs a machine B: A must be more complex than B. But a machine cannot be more complex than itself. Consequently, no machine can construct itself, and thus self-reproduction is impossible.
How can we resolve this paradox? The answer is simple: Statement 3 is incorrect. Making machines that reproduce themselves is possible. The recursion theorem demonstrates how.
SELF-REFERENCE
Let's begin by making a Turing machine that ignores its input and prints out a copy of its own description. We call this machine SELF. To help describe SELF, we need the following lemma.

LEMMA 6.1
There is a computable function q: Σ* → Σ*, where if w is any string, q(w) is the description of a Turing machine P_w that prints out w and then halts.

PROOF Once we understand the statement of this lemma, the proof is easy. Obviously, we can take any string w and construct from it a Turing machine that has w built into a table so that the machine can simply output w when started. The following TM Q computes q(w).
Q = "On input string w:
1. Construct the following Turing machine P_w.
   P_w = "On any input:
   1. Erase input.
   2. Write w on the tape.
   3. Halt."
2. Output ⟨P_w⟩."
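The function q has a direct analogue in any programming language: map a string w to the source text of a program that prints w. A minimal Python sketch (the representation of a "description" as Python source is our own choice):

```python
def q(w):
    """Analogue of Lemma 6.1's computable function q: return the
    description (here, Python source text) of a program P_w that
    ignores its input, writes w to standard output, and halts."""
    # repr(w) embeds w as a quoted literal, so any string round-trips.
    return f"import sys\nsys.stdout.write({w!r})"
```

Executing the returned description reproduces w exactly, which is all the lemma requires.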
The Turing machine SELF is in two parts: A and B. We think of A and B as being two separate procedures that go together to make up SELF. We want SELF to print out ⟨SELF⟩ = ⟨AB⟩.
Part A runs first and upon completion passes control to B. The job of A is to print out a description of B, and conversely the job of B is to print out a description of A. The result is the desired description of SELF. The jobs are similar, but they are carried out differently. We show how to get part A first.
For A we use the machine P_⟨B⟩, described by q(⟨B⟩), which is the result of applying the function q to ⟨B⟩. Thus, part A is a Turing machine that prints out ⟨B⟩. Our description of A depends on having a description of B. So we can't complete the description of A until we construct B.
Now for part B. We might be tempted to define B with q(⟨A⟩), but that doesn't make sense! Doing so would define B in terms of A, which in turn is defined in terms of B. That would be a circular definition of an object in terms of itself, a logical transgression. Instead, we define B so that it prints A by using a different strategy: B computes A from the output that A produces.
We defined ⟨A⟩ to be q(⟨B⟩). Now comes the tricky part: If B can obtain ⟨B⟩, it can apply q to that and obtain ⟨A⟩. But how does B obtain ⟨B⟩? It was left on the tape when A finished! So B only needs to look at the tape to obtain ⟨B⟩. Then after B computes q(⟨B⟩) = ⟨A⟩, it combines A and B into a single machine and writes its description ⟨AB⟩ = ⟨SELF⟩ on the tape. In summary, we have:
A = P_⟨B⟩, and
B = "On input ⟨M⟩, where M is a portion of a TM:
1. Compute q(⟨M⟩).
2. Combine the result with ⟨M⟩ to make a complete TM.
3. Print the description of this TM and halt."
This completes the construction of SELF, for which a schematic diagram is presented in the following figure.

FIGURE 6.2
Schematic of SELF, a TM that prints its own description
If we now run SELF, we observe the following behavior.
1. First A runs. It prints ⟨B⟩ on the tape.
2. B starts. It looks at the tape and finds its input, ⟨B⟩.
3. B calculates q(⟨B⟩) = ⟨A⟩ and combines that with ⟨B⟩ into a TM description, ⟨SELF⟩.
4. B prints this description and halts.
We can easily implement this construction in any programming language to obtain a program that outputs a copy of itself. We can even do so in plain English. Suppose that we want to give an English sentence that commands the reader to print a copy of the same sentence. One way to do so is to say:
    Print out this sentence.
This sentence has the desired meaning because it directs the reader to print a copy of the sentence itself. However, it doesn't have an obvious translation into a programming language because the self-referential word "this" in the sentence usually has no counterpart. But no self-reference is needed to make such a sentence. Consider the following alternative.
    Print out two copies of the following, the second one in quotes:
    "Print out two copies of the following, the second one in quotes:"
In this sentence, the self-reference is replaced with the same construction used to make the TM SELF. Part B of the construction is the clause:
    Print out two copies of the following, the second one in quotes:
Part A is the same, with quotes around it. A provides a copy of B to B so B can process that copy as the TM does.
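The same A/B construction yields a self-printing program (a quine) in an ordinary language. In the Python sketch below, the string literal plays the role of part A (the quoted copy) and the print statement plays the role of part B, which splices A's own quoted form back into the text it emits:

```python
# Part A: a string literal holding the text of part B (the quoted copy).
# Part B: code that prints A's contents with A's own quoted form spliced
# in via %r, mirroring "Print out two copies of the following, the
# second one in quotes:".
a = 'a = %r\nprint(a %% a)'
print(a % a)
```

Running this two-line program prints exactly its own two-line source.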
The recursion theorem provides the ability to implement the self-referential this into any programming language. With it, any program has the ability to refer to its own description, which has certain applications, as you will see. Before getting to that, we state the recursion theorem itself. The recursion theorem extends the technique we used in constructing SELF so that a program can obtain its own description and then go on to compute with it, instead of merely printing it out.

THEOREM 6.3
Recursion theorem. Let T be a Turing machine that computes a function t: Σ* × Σ* → Σ*. There is a Turing machine R that computes a function r: Σ* → Σ*, where for every w,
    r(w) = t(⟨R⟩, w).

The statement of this theorem seems a bit technical, but it actually represents something quite simple. To make a Turing machine that can obtain its own description and then compute with it, we need only make a machine, called T
in the statement, that receives the description of the machine as an extra input. Then the recursion theorem produces a new machine R, which operates exactly as T does but with R's description filled in automatically.

PROOF The proof is similar to the construction of SELF. We construct a TM R in three parts, A, B, and T, where T is given by the statement of the theorem; a schematic diagram is presented in the following figure.

FIGURE 6.4
Schematic of R

Here, A is the Turing machine P_⟨BT⟩ described by q(⟨BT⟩). To preserve the input w, we redesign q so that P_⟨BT⟩ writes its output following any string preexisting on the tape. After A runs, the tape contains w⟨BT⟩.
Again, B is a procedure that examines its tape and applies q to its contents. The result is ⟨A⟩. Then B combines A, B, and T into a single machine and obtains its description ⟨ABT⟩ = ⟨R⟩. Finally, it encodes that description together with w, places the resulting string ⟨R, w⟩ on the tape, and passes control to T.
TERMINOLOGY FOR THE RECURSION THEOREM
The recursion theorem states that Turing machines can obtain their own description and then go on to compute with it. At first glance, this capability may seem to be useful only for frivolous tasks such as making a machine that prints a copy of itself. But, as we demonstrate, the recursion theorem is a handy tool for solving certain problems concerning the theory of algorithms.
You can use the recursion theorem in the following way when designing Turing machine algorithms. If you are designing a machine M, you can include the phrase "obtain own description ⟨M⟩" in the informal description of M's algorithm. Upon having obtained its own description, M can then go on to use it as it would use any other computed value. For example, M might simply print out ⟨M⟩ as happens in the TM SELF, or it might count the number of states in ⟨M⟩, or possibly even simulate ⟨M⟩. To illustrate this method, we use the recursion theorem to describe the machine SELF.
SELF = "On any input:
1. Obtain, via the recursion theorem, own description ⟨SELF⟩.
2. Print ⟨SELF⟩."
The recursion theorem shows how to implement the "obtain own description" construct. To produce the machine SELF, we first write the following machine T.
T = "On input ⟨M, w⟩:
1. Print ⟨M⟩ and halt."
The TM T receives a description of a TM M and a string w as input, and it prints the description of M. Then the recursion theorem shows how to obtain a TM R, which on input w operates like T on input ⟨R, w⟩. Thus, R prints the description of R, which is exactly what is required of the machine SELF.
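The "obtain own description, then compute with it" pattern also has a direct programming analogue. Extending the quine trick, the Python sketch below (our own illustration) reconstructs its comment-free three-line core as a string and then computes with that string, reporting its length, rather than merely printing it:

```python
# Recursion-theorem style: obtain own source via the A/B trick, then
# compute with it. The string src reconstructs the three-line core of
# this program (the bare statements, without these comments), which is
# itself a complete program whose own source is exactly src.
a = 'a = %r\nsrc = a %% a\nprint(len(src))'
src = a % a
print(len(src))
```

Any other computation on src, such as counting a particular character, would work equally well, just as M may count the states in ⟨M⟩.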
APPLICATIONS
A computer virus is a computer program that is designed to spread itself among computers. Aptly named, it has much in common with a biological virus. Computer viruses are inactive when standing alone as a piece of code. But when placed appropriately in a host computer, thereby "infecting" it, they can become activated and transmit copies of themselves to other accessible machines. Various media can transmit viruses, including the Internet and transferable disks. In order to carry out its primary task of self-replication, a virus may contain the construction described in the proof of the recursion theorem.
Let's now consider three theorems whose proofs use the recursion theorem. An additional application appears in the proof of Theorem 6.17 in Section 6.2.
First we return to the proof of the undecidability of A_TM. Recall that we earlier proved it in Theorem 4.11, using Cantor's diagonal method. The recursion theorem gives us a new and simpler proof.
THEOREM 6.5
A_TM is undecidable.

PROOF We assume that Turing machine H decides A_TM, for the purpose of obtaining a contradiction. We construct the following machine B.
B = "On input w:
1. Obtain, via the recursion theorem, own description ⟨B⟩.
2. Run H on input ⟨B, w⟩.
3. Do the opposite of what H says. That is, accept if H rejects and reject if H accepts."
Running B on input w does the opposite of what H declares it does. Therefore, H cannot be deciding A_TM. Done!
The following theorem concerning minimal Turing machines is another application of the recursion theorem.

DEFINITION 6.6
If M is a Turing machine, then we say that the length of the description ⟨M⟩ of M is the number of symbols in the string describing M. Say that M is minimal if there is no Turing machine equivalent to M that has a shorter description. Let
    MIN_TM = {⟨M⟩ | M is a minimal TM}.
THEOREM 6.7
MIN_TM is not Turing-recognizable.

PROOF Assume that some TM E enumerates MIN_TM and obtain a contradiction. We construct the following TM C.
C = "On input w:
1. Obtain, via the recursion theorem, own description ⟨C⟩.
2. Run the enumerator E until a machine D appears with a longer description than that of C.
3. Simulate D on input w."
Because MIN_TM is infinite, E's list must contain a TM with a longer description than C's description. Therefore, step 2 of C eventually terminates with some TM D whose description is longer than C's. Then C simulates D and so is equivalent to it. Because C's description is shorter than D's and C is equivalent to D, D cannot be minimal. But D appears on the list that E produces. Thus, we have a contradiction.
Our final application of the recursion theorem is a type of fixed-point theorem. A fixed point of a function is a value that isn't changed by application of the function. In this case, we consider functions that are computable transformations of Turing machine descriptions. We show that for any such transformation, some Turing machine exists whose behavior is unchanged by the transformation. This theorem is called the fixed-point version of the recursion theorem.
THEOREM 6.8
Let t: Σ* → Σ* be a computable function. Then there is a Turing machine F for which t(⟨F⟩) describes a Turing machine equivalent to F. Here we'll assume that if a string isn't a proper Turing machine encoding, it describes a Turing machine that always rejects immediately.

In this theorem, t plays the role of the transformation, and F is the fixed point.

PROOF Let F be the following Turing machine.
F = "On input w:
1. Obtain, via the recursion theorem, own description ⟨F⟩.
2. Compute t(⟨F⟩) to obtain the description of a TM G.
3. Simulate G on w."
Clearly, ⟨F⟩ and t(⟨F⟩) = ⟨G⟩ describe equivalent Turing machines because F simulates G.
6.2 DECIDABILITY OF LOGICAL THEORIES
Mathematical logic is the branch of mathematics that investigates mathematics
itself. It addresses questions such as: What is a theorem? What is a proof? What
is truth? Can an algorithm decide which statements are true? Are all true state-
ments provable? We’ll touch on a few of these topics in our brief introduction
to this rich and fascinating subject.
We focus on the problem of determining whether mathematical statements
are true or false and investigate the decidability of this problem. The answer
depends on the domain of mathematics from which the statements are drawn.
We examine two domains: one for which we can give an algorithm to decide
truth, and another for which this problem is undecidable.
First, we need to set up a precise language to formulate these problems. Our intention is to be able to consider mathematical statements such as
1. ∀q ∃p ∀x,y [ p > q ∧ (x,y > 1 → xy ≠ p) ],
2. ∀a,b,c,n [ (a,b,c > 0 ∧ n > 2) → aⁿ + bⁿ ≠ cⁿ ], and
3. ∀q ∃p ∀x,y [ p > q ∧ (x,y > 1 → (xy ≠ p ∧ xy ≠ p + 2)) ].
Statement 1 says that infinitely many prime numbers exist, which has been known to be true since the time of Euclid, about 2,300 years ago. Statement 2 is Fermat's last theorem, which has been known to be true only since Andrew Wiles proved it in 1994. Finally, statement 3 says that infinitely many prime pairs¹ exist. Known as the twin prime conjecture, it remains unsolved.

¹Prime pairs are primes that differ by 2.
To consider whether we could automate the process of determining which of these statements are true, we treat such statements merely as strings and define a language consisting of those statements that are true. Then we ask whether this language is decidable.
To make this a bit more precise, let's describe the form of the alphabet of this language:
    {∧, ∨, ¬, (, ), ∀, ∃, x, R1, ..., Rk}.
The symbols ∧, ∨, and ¬ are called Boolean operations; "(" and ")" are the parentheses; the symbols ∀ and ∃ are called quantifiers; the symbol x is used to denote variables;² and the symbols R1, ..., Rk are called relations.
A formula is a well-formed string over this alphabet. For completeness, we'll sketch the technical but obvious definition of a well-formed formula here, but feel free to skip this part and go on to the next paragraph. A string of the form Ri(x1, ..., xj) is an atomic formula. The value j is the arity of the relation symbol Ri. All appearances of the same relation symbol in a well-formed formula must have the same arity. Subject to this requirement, a string φ is a formula if it
1. is an atomic formula,
2. has the form φ1 ∧ φ2 or φ1 ∨ φ2 or ¬φ1, where φ1 and φ2 are smaller formulas, or
3. has the form ∃xi [φ1] or ∀xi [φ1], where φ1 is a smaller formula.
A quantifier may appear anywhere in a mathematical statement. Its scope is the fragment of the statement appearing within the matched pair of parentheses or brackets following the quantified variable. We assume that all formulas are in prenex normal form, where all quantifiers appear in the front of the formula. A variable that isn't bound within the scope of a quantifier is called a free variable. A formula with no free variables is called a sentence or statement.
EXAMPLE 6.9
Among the following examples of formulas, only the last one is a sentence.
1. R1(x1) ∧ R2(x1, x2, x3)
2. ∀x1 [ R1(x1) ∧ R2(x1, x2, x3) ]
3. ∀x1 ∃x2 ∃x3 [ R1(x1) ∧ R2(x1, x2, x3) ]
Having established the syntax of formulas, let's discuss their meanings. The Boolean operations and the quantifiers have their usual meanings. But to determine the meaning of the variables and relation symbols, we need to specify two items. One is the universe over which the variables may take values. The other is an assignment of specific relations to the relation symbols. As we described in Section 0.2 (page 9), a relation is a function from k-tuples over the universe to {TRUE, FALSE}. The arity of a relation symbol must match that of its assigned relation.

²If we need to write several variables in a formula, we use the symbols w, y, z, or x1, x2, x3, and so on. We don't list all the infinitely many possible variables in the alphabet in order to keep the alphabet finite. Instead, we list only the variable symbol x, and use strings of x's to indicate other variables, as in xx for x2, xxx for x3, and so on.

Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

CHAPTER 6 / ADVANCED TOPICS IN COMPUTABILITY THEORY
A universe together with an assignment of relations to relation symbols is called a model.³ Formally, we say that a model M is a tuple (U, P1, ..., Pk), where U is the universe and P1 through Pk are the relations assigned to the symbols R1 through Rk. We sometimes refer to the language of a model as the collection of formulas that use only the relation symbols the model assigns, and that use each relation symbol with the correct arity. If φ is a sentence in the language of a model, φ is either true or false in that model. If φ is true in a model M, we say that M is a model of φ.
If you feel overwhelmed by these definitions, concentrate on our objective in
stating them. We want to set up a precise language of mathematical statements
so that we can ask whether an algorithm can determine which are true and which
are false. The following two examples should be helpful.
EXAMPLE 6.10
Let φ be the sentence ∀x ∀y [ R1(x, y) ∨ R1(y, x) ]. Let model M1 = (N, ≤) be the model whose universe is the natural numbers and that assigns the "less than or equal" relation to the symbol R1. Obviously, φ is true in model M1 because either a ≤ b or b ≤ a for any two natural numbers a and b. However, if M1 assigned "less than" instead of "less than or equal" to R1, then φ would not be true because it fails when x and y are equal.
If we know in advance which relation will be assigned to Ri, we may use the customary symbol for that relation in place of Ri, with infix notation rather than prefix notation if customary for that symbol. Thus, with model M1 in mind, we could write φ as ∀x ∀y [ x ≤ y ∨ y ≤ x ].
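Truth in a model can be made concrete in code. The sketch below (our own, not from the text) evaluates a prenex sentence over a *finite* universe; since N is infinite, we check the quantifiers over a finite initial segment only, which illustrates the idea but is not a decision procedure for Th(N, ≤).

```python
def evaluate(quantifiers, matrix, universe):
    """Evaluate a prenex sentence over a finite universe.
    `quantifiers` is a list of ('A', var) or ('E', var); `matrix`
    maps an assignment dict {var: value} to True/False."""
    def go(i, env):
        if i == len(quantifiers):
            return matrix(env)
        q, var = quantifiers[i]
        branches = (go(i + 1, {**env, var: u}) for u in universe)
        return all(branches) if q == 'A' else any(branches)
    return go(0, {})

# phi = forall x forall y [ x <= y  or  y <= x ], checked over {0,...,9}
le = lambda e: e['x'] <= e['y'] or e['y'] <= e['x']
lt = lambda e: e['x'] < e['y'] or e['y'] < e['x']
quant = [('A', 'x'), ('A', 'y')]
assert evaluate(quant, le, range(10))      # true, as in the text
assert not evaluate(quant, lt, range(10))  # "less than" fails when x = y
```

The second check mirrors the text's observation: with "less than" assigned to R1, the sentence fails exactly when x and y are equal.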
EXAMPLE 6.11
Now let M2 be the model whose universe is the real numbers R and that assigns the relation PLUS to R1, where PLUS(a, b, c) = TRUE whenever a + b = c. Then M2 is a model of ψ = ∀y ∃x [ R1(x, x, y) ]. However, if N were used for the universe instead of R in M2, the sentence would be false.
As in Example 6.10, we may write ψ as ∀y ∃x [ x + x = y ] in place of ∀y ∃x [ R1(x, x, y) ] when we know in advance that we will be assigning the addition relation to R1.
As Example 6.11 illustrates, we can represent functions such as the addition
function by relations. Similarly, we can represent constants such as 0 and 1 by
relations.
³A model is also variously called an interpretation or a structure.
6.2 DECIDABILITY OF LOGICAL THEORIES
Now we give one final definition in preparation for the next section. If M is a model, we let the theory of M, written Th(M), be the collection of true sentences in the language of that model.
A DECIDABLE THEORY
Number theory is one of the oldest branches of mathematics and also one of its most difficult. Many innocent-looking statements about the natural numbers with the plus and times operations have confounded mathematicians for centuries, such as the twin prime conjecture mentioned earlier.
In one of the celebrated developments in mathematical logic, Alonzo Church, building on the work of Kurt Gödel, showed that no algorithm can decide in general whether statements in number theory are true or false. Formally, we write (N, +, ×) to be the model whose universe is the natural numbers⁴ with the usual + and × relations. Church showed that Th(N, +, ×), the theory of this model, is undecidable.
Before looking at this undecidable theory, let's examine one that is decidable. Let (N, +) be the same model, without the × relation. Its theory is Th(N, +). For example, the formula ∀x ∃y [ x + x = y ] is true and is therefore a member of Th(N, +), but the formula ∃y ∀x [ x + x = y ] is false and is therefore not a member.
THEOREM 6.12
Th(N, +) is decidable.
PROOF IDEA This proof is an interesting and nontrivial application of the theory of finite automata that we presented in Chapter 1. One fact about finite automata that we use appears in Problem 1.32 (page 88), where you were asked to show that they are capable of doing addition if the input is presented in a special form. The input describes three numbers in parallel by representing one bit of each number in a single symbol from an eight-symbol alphabet. Here we use a generalization of this method to present i-tuples of numbers in parallel using an alphabet with 2^i symbols.
We give an algorithm that can determine whether its input, a sentence φ in the language of (N, +), is true in that model. Let
φ = Q1x1 Q2x2 ··· Qlxl [ ψ ],
where Q1, ..., Ql each represents either ∃ or ∀, and ψ is a formula without quantifiers that has variables x1, ..., xl. For each i from 0 to l, define the formula φi as
φi = Qi+1xi+1 Qi+2xi+2 ··· Qlxl [ ψ ].
Thus φ0 = φ and φl = ψ.
⁴For convenience in this chapter, we change our usual definition of N to be {0, 1, 2, ...}.
Formula φi has i free variables. For a1, ..., ai ∈ N, write φi(a1, ..., ai) to be the sentence obtained by substituting the constants a1, ..., ai for the variables x1, ..., xi in φi.
For each i from 0 to l, the algorithm constructs a finite automaton Ai that recognizes the collection of strings representing i-tuples of numbers that make φi true. The algorithm begins by constructing Al directly, using a generalization of the method in the solution to Problem 1.32. Then, for each i from l down to 1, it uses Ai to construct Ai−1. Finally, once the algorithm has A0, it tests whether A0 accepts the empty string. If it does, φ is true and the algorithm accepts.
PROOF For i > 0, define the alphabet
Σi = { [0,...,0,0], [0,...,0,1], [0,...,1,0], [0,...,1,1], ..., [1,...,1,1] }.
Hence Σi contains all size-i columns of 0s and 1s (each bracketed symbol here lists one column top to bottom). A string over Σi represents i binary integers (reading across the rows). We also define Σ0 = {[]}, where [] is a symbol.
We now present an algorithm that decides Th(N, +). On input φ, where φ is a sentence, the algorithm operates as follows. Write φ and define φi for each i from 0 to l, as in the proof idea. For each such i, construct a finite automaton Ai from φi that accepts strings over Σi corresponding to i-tuples a1, ..., ai whenever φi(a1, ..., ai) is true, as follows.
To construct the first machine Al, observe that φl = ψ is a Boolean combination of atomic formulas. An atomic formula in the language of Th(N, +) is a single addition. Finite automata can be constructed to compute any of these individual relations corresponding to a single addition and then combined to give the automaton Al. Doing so involves the use of the regular language closure constructions for union, intersection, and complementation to compute Boolean combinations of the atomic formulas.
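The addition automaton of Problem 1.32, which this step generalizes, can be sketched as a two-state DFA whose state is the carry bit. The code below is our own illustration and assumes least-significant-bit-first input order (the book's exact convention may differ):

```python
def accepts_addition(columns):
    """DFA sketch: does x1 + x2 = x3, where `columns` lists one
    (bit-of-x1, bit-of-x2, bit-of-x3) triple per position,
    least significant bit first? The DFA state is the carry."""
    carry = 0
    for b1, b2, b3 in columns:
        s = b1 + b2 + carry
        if s % 2 != b3:          # sum bit must match the x3 track
            return False         # dead (rejecting) state
        carry = s // 2
    return carry == 0            # accept iff no carry is left over

def to_columns(x1, x2, x3, width):
    """Pack three numbers into `width` column symbols, LSB first."""
    return [((x1 >> j) & 1, (x2 >> j) & 1, (x3 >> j) & 1)
            for j in range(width)]

# 13 + 29 = 42, checked over 6-bit columns
assert accepts_addition(to_columns(13, 29, 42, 6))
assert not accepts_addition(to_columns(13, 29, 41, 6))
```

Boolean combinations of such atomic automata are then handled by the product (intersection/union) and complement constructions from Chapter 1.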
Next, we show how to construct Ai from Ai+1. If φi = ∃xi+1 φi+1, we construct Ai to operate as Ai+1 operates, except that it nondeterministically guesses the value of ai+1 instead of receiving it as part of the input.
More precisely, Ai contains a state for each Ai+1 state and a new start state. Every time Ai reads a symbol [b1, ..., bi−1, bi], where every bj ∈ {0,1} is a bit of the number aj, it nondeterministically guesses z ∈ {0,1} and simulates Ai+1 on the input symbol [b1, ..., bi−1, bi, z].
Initially, Ai nondeterministically guesses the leading bits of ai+1 corresponding to suppressed leading 0s in a1 through ai by nondeterministically branching, using ε-transitions from its new start state, to all states that Ai+1 could reach from its start state with input strings of the symbols [0,...,0,0] and [0,...,0,1] in Σi+1. Clearly, Ai accepts its input (a1, ..., ai) if some ai+1 exists where Ai+1 accepts (a1, ..., ai+1).
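The ∃-step can also be simulated directly rather than built as an explicit NFA: run the (i+1)-track automaton on i-track input, guessing the dropped track's bit at each position. The sketch below is our own; it uses a 3-track addition DFA as the base machine, and because the input is LSB-first, trailing zero-padding plays the role of the leading-bit ε-guessing in the construction.

```python
def project_last_track(nfa_delta, starts, accepting, columns):
    """Simulate the exists-construction: run the (i+1)-track automaton
    on i-track input, nondeterministically guessing the missing track's
    bit at each step; accept iff some sequence of guesses accepts."""
    current = set(starts)
    for col in columns:
        nxt = set()
        for q in current:
            for z in (0, 1):                 # nondeterministic guess
                nxt |= nfa_delta(q, col + (z,))
        current = nxt
    return bool(current & accepting)

def add_delta(carry, sym):
    """Transitions of the 3-track addition DFA; empty set = dead state."""
    b1, b2, b3 = sym
    s = b1 + b2 + carry
    return {s // 2} if s % 2 == b3 else set()

# exists x3 [ x1 + x2 = x3 ] on the 2-track input (5, 9) over 5 columns:
# accepted, with the guessed track spelling out 14
cols2 = [((5 >> j) & 1, (9 >> j) & 1) for j in range(5)]
exists_sum = project_last_track(add_delta, {0}, {0}, cols2)
```

This is a simulation of the construction, not the construction itself; the text's Ai is an actual NFA whose states mirror those of Ai+1.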
If φi = ∀xi+1 φi+1, it is equivalent to ¬∃xi+1 ¬φi+1. Thus, we can construct the finite automaton that recognizes the complement of the language of Ai+1, then apply the preceding construction for the ∃ quantifier, and finally apply complementation once again to obtain Ai.
Finite automaton A0 accepts any input iff φ0 is true. So the final step of the algorithm tests whether A0 accepts ε. If it does, φ is true and the algorithm accepts; otherwise, it rejects.
AN UNDECIDABLE THEORY
As we mentioned earlier, Th(N, +, ×) is an undecidable theory. No algorithm exists for deciding the truth or falsity of mathematical statements, even when restricted to the language of (N, +, ×). This theorem has great importance philosophically because it demonstrates that mathematics cannot be mechanized. We state this theorem, but give only a brief sketch of its proof.
THEOREM 6.13
Th(N, +, ×) is undecidable.
Although it contains many details, the proof of this theorem is not difficult conceptually. It follows the pattern of the other proofs of undecidability presented in Chapter 4. We show that Th(N, +, ×) is undecidable by reducing ATM to it, using the computation history method as previously described (page 220). The existence of the reduction depends on the following lemma.
LEMMA 6.14
Let M be a Turing machine and w a string. We can construct from M and w a formula φM,w in the language of (N, +, ×) that contains a single free variable x, whereby the sentence ∃x φM,w is true iff M accepts w.
PROOF IDEA Formula φM,w "says" that x is a (suitably encoded) accepting computation history of M on w. Of course, x actually is just a rather large integer, but it represents a computation history in a form that can be checked by using the + and × operations.
The actual construction of φM,w is too complicated to present here. It extracts individual symbols in the computation history with the + and × operations to check that the start configuration for M on w is correct, that each configuration legally follows from the one preceding it, and that the last configuration is accepting.
PROOF OF THEOREM 6.13 We give a mapping reduction from ATM to Th(N, +, ×). The reduction constructs the formula φM,w from the input ⟨M, w⟩ by using Lemma 6.14. Then it outputs the sentence ∃x φM,w.
Next, we sketch the proof of Kurt Gödel's celebrated incompleteness theorem. Informally, this theorem says that in any reasonable system of formalizing the notion of provability in number theory, some true statements are unprovable.
Loosely speaking, the formal proof π of a statement φ is a sequence of statements, S1, S2, ..., Sl, where Sl = φ. Each Si follows from the preceding statements and certain basic axioms about numbers, using simple and precise rules of implication. We don't have space to define the concept of proof; but for our purposes, assuming the following two reasonable properties of proofs will be enough.
1. The correctness of a proof of a statement can be checked by machine. Formally, {⟨φ, π⟩ | π is a proof of φ} is decidable.
2. The system of proofs is sound. That is, if a statement is provable (i.e., has a proof), it is true.
If a system of provability satisfies these two conditions, the following three theorems hold.
THEOREM 6.15
The collection of provable statements in Th(N, +, ×) is Turing-recognizable.
PROOF The following algorithm P accepts its input φ if φ is provable. Algorithm P tests each string as a candidate for a proof π of φ, using the proof checker assumed in provability property 1. If it finds that any of these candidates is a proof, it accepts.
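The string-by-string search in this proof can be sketched as follows; `is_proof` stands in for the decidable proof checker assumed in property 1 and is hypothetical. The loop never terminates when φ is unprovable, which is exactly why P only *recognizes* the provable statements.

```python
from itertools import count, product

def provable(phi, is_proof, alphabet="01"):
    """Recognizer sketch for provability: enumerate every string in
    length order and accept if some candidate is a proof of phi.
    `is_proof(phi, pi)` is an assumed decidable proof checker."""
    for n in count(0):                      # lengths 0, 1, 2, ...
        for chars in product(alphabet, repeat=n):
            pi = "".join(chars)
            if is_proof(phi, pi):
                return True                 # accept
    # unreachable: loops forever if phi has no proof

# Toy illustration: a "proof" of phi is the string phi itself.
found = provable("101", lambda p, c: c == p)
```

With this toy checker, the search terminates once the enumeration reaches length 3.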
Now we can use the preceding theorem to prove our version of the incompleteness theorem.
THEOREM 6.16
Some true statement in Th(N, +, ×) is not provable.
PROOF We give a proof by contradiction. We assume to the contrary that all true statements are provable. Using this assumption, we describe an algorithm D that decides whether statements are true, contradicting Theorem 6.13.
On input φ, algorithm D operates by running algorithm P given in the proof of Theorem 6.15 in parallel on inputs φ and ¬φ. One of these two statements is true and thus by our assumption is provable. Therefore, P must halt on one of the two inputs. By provability property 2, if φ is provable, then φ is true; and if ¬φ is provable, then φ is false. So algorithm D can decide the truth or falsity of φ.
In the final theorem of this section, we use the recursion theorem to give an explicit sentence in the language of (N, +, ×) that is true but not provable. In Theorem 6.16 we demonstrated the existence of such a sentence but didn't actually describe one, as we do now.
THEOREM 6.17
The sentence ψunprovable, as described in the proof, is unprovable.
PROOF IDEA Construct a sentence that says “This sentence is not provable,”
using the recursion theorem to obtain the self-reference.
PROOF Let S be a TM that operates as follows.
S = "On any input:
1. Obtain own description ⟨S⟩ via the recursion theorem.
2. Construct the sentence ψ = ¬∃x [ φS,0 ], using Lemma 6.14.
3. Run algorithm P from the proof of Theorem 6.15 on input ψ.
4. If stage 3 accepts, accept."
Let ψunprovable be the sentence ψ described in stage 2 of algorithm S. That sentence is true iff S doesn't accept 0 (the string 0 was selected arbitrarily).
If S finds a proof of ψunprovable, S accepts 0, and the sentence would thus be false. A false sentence cannot be provable, so this situation cannot occur. The only remaining possibility is that S fails to find a proof of ψunprovable and so S doesn't accept 0. But then ψunprovable is true, as we claimed.
6.3
TURING REDUCIBILITY
We introduced the reducibility concept in Chapter 5 as a way of using a solution to one problem to solve other problems. Thus, if A is reducible to B, and we find a solution to B, we can obtain a solution to A. Subsequently, we described mapping reducibility, a specific form of reducibility. But does mapping reducibility capture our intuitive concept of reducibility in the most general way? It doesn't.
For example, consider the language ATM and its complement. Intuitively, they are reducible to one another because a solution to either could be used to solve the other by simply reversing the answer. However, we know that the complement of ATM is not mapping reducible to ATM because ATM is Turing-recognizable but its complement isn't.
Here we present a very general form of reducibility, called Turing reducibility, which captures our intuitive concept of reducibility more closely.
DEFINITION 6.18
An oracle for a language B is an external device that is capable of
reporting whether any string w is a member of B. An oracle Turing
machine is a modified Turing machine that has the additional ca-
pability of querying an oracle. We write M^B to describe an oracle
Turing machine that has an oracle for language B.
We aren’t concerned with the way the oracle determines its responses. We use
the term oracle to connote a magical ability and consider oracles for languages
that aren’t decidable by ordinary algorithms, as the following example shows.
EXAMPLE 6.19
Consider an oracle for ATM. An oracle Turing machine with an oracle for ATM can decide more languages than an ordinary Turing machine can. Such a machine can (obviously) decide ATM itself, by querying the oracle about the input. It can also decide ETM, the emptiness testing problem for TMs, with the following procedure called T^ATM.
T^ATM = "On input ⟨M⟩, where M is a TM:
1. Construct the following TM N.
N = "On any input:
1. Run M in parallel on all strings in Σ*.
2. If M accepts any of these strings, accept."
2. Query the oracle to determine whether ⟨N, 0⟩ ∈ ATM.
3. If the oracle answers NO, accept; if YES, reject."
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 285 ---
6.4 A DEFINITION OF INFORMATION 261
If M's language isn't empty, N will accept every input and, in particular, input 0. Hence the oracle will answer YES, and T^ATM will reject. Conversely, if M's language is empty, T^ATM will accept. Thus T^ATM decides ETM. We say that ETM is decidable relative to ATM. That brings us to the definition of Turing reducibility.
DEFINITION 6.20
Language A is Turing reducible to language B, written A ≤T B, if
A is decidable relative to B.
Example 6.19 shows that ETM is Turing reducible to ATM. Turing reducibility satisfies our intuitive concept of reducibility, as shown by the following theorem.
THEOREM 6.21
If A ≤T B and B is decidable, then A is decidable.
PROOF If B is decidable, then we may replace the oracle for B by an actual procedure that decides B. Thus, we may replace the oracle Turing machine that decides A by an ordinary Turing machine that decides A.
Turing reducibility is a generalization of mapping reducibility. If A ≤m B, then A ≤T B because the mapping reduction may be used to give an oracle Turing machine that decides A relative to B.
An oracle Turing machine with an oracle for ATM is very powerful. It can solve many problems that are not solvable by ordinary Turing machines. But even such a powerful machine cannot decide all languages (see Exercise 6.4).
6.4
A DEFINITION OF INFORMATION
The concepts algorithm and information are fundamental in computer science. While the Church–Turing thesis gives a universally applicable definition of algorithm, no equally comprehensive definition of information is known. Instead of a single, universal definition of information, several definitions are used, depending upon the application. In this section we present one way of defining information, using computability theory.
We start with an example. Consider the information content of the following two binary sequences.
A = 0101010101010101010101010101010101010101
B = 1110010110100011101010000111010011010111
Intuitively, sequence A contains little information because it is merely a repetition of the pattern 01 twenty times. In contrast, sequence B appears to contain more information.
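As a quick illustration (ours, not the text's), sequence A is produced by a program far shorter than the string itself, while B, as far as we can tell, admits no comparably short description:

```python
# Sequence A is generated by a description much shorter than itself.
A = "01" * 20   # a 40-bit string from an 8-character expression
B = "1110010110100011101010000111010011010111"
assert len(A) == len(B) == 40
```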
We can use this simple example to illustrate the idea behind the definition of information that we present. We define the quantity of information contained in an object to be the size of that object's smallest representation or description. By a description of an object, we mean a precise and unambiguous characterization of the object so that we may recreate it from the description alone. Thus, sequence A contains little information because it has a small description, whereas sequence B apparently contains more information because it seems to have no concise description.
Why do we consider only the shortest description when determining an object's quantity of information? We may always describe an object, such as a string, by placing a copy of the object directly into the description. Thus, we can obviously describe the preceding string B with a table that is 40 bits long containing a copy of B. This type of description is never shorter than the object itself and doesn't tell us anything about its information quantity. However, a description that is significantly shorter than the object implies that the information contained within it can be compressed into a small volume, and so the amount of information can't be very large. Hence the size of the shortest description determines the amount of information.
Now we formalize this intuitive idea. Doing so isn’t difficult, but we must do
some preliminary work. First, we restrict our attention to objects that are binary
strings. Other objects can be represented as binary strings, so this restriction
doesn’t limit the scope of the theory. Second, we consider only descriptions
that are themselves binary strings. By imposing this requirement, we may easily
compare the length of the object with the length of its description. In the next
section, we consider the type of description that we allow.
MINIMAL LENGTH DESCRIPTIONS
Many types of description language can be used to define information. Selecting
which language to use affects the characteristics of the definition. Our descrip-
tion language is based on algorithms.
One way to use algorithms to describe strings is to construct a Turing machine that prints out the string when it is started on a blank tape and then represent that Turing machine itself as a string. Thus, the string representing the Turing machine is a description of the original string. A drawback to this approach is that a Turing machine cannot represent a table of information concisely with its transition function. To represent a string of n bits, you might use n states and n rows in the transition function table. That would result in a description that is excessively long for our purpose. Instead, we use the following more concise description language.
We describe a binary string x with a Turing machine M and a binary input w to M. The length of the description is the combined length of representing M and w. We write this description with our usual notation for encoding several objects into a single binary string, ⟨M, w⟩. But here we must pay additional attention to the encoding operation ⟨·,·⟩ because we need to produce a concise result. We define the string ⟨M, w⟩ to be ⟨M⟩w, where we simply concatenate the binary string w onto the end of the binary encoding of M. The encoding ⟨M⟩ of M may be done in any standard way, except for the subtlety that we describe in the next paragraph. (Don't worry about this subtle point on your first reading of this material. For now, skip past the next paragraph and the following figure.)
When concatenating w onto the end of ⟨M⟩ to yield a description of x, you might run into trouble if the point at which ⟨M⟩ ends and w begins is not discernible from the description itself. Otherwise, several ways of partitioning the description ⟨M⟩w into a syntactically correct TM and an input may occur, and then the description would be ambiguous and hence invalid. We avoid this problem by ensuring that we can locate the separation between ⟨M⟩ and w in ⟨M⟩w. One way to do so is to write each bit of ⟨M⟩ twice, writing 0 as 00 and 1 as 11, and then follow it with 01 to mark the separation point. We illustrate this idea in the following figure, depicting the description ⟨M, w⟩ of some string x.

⟨M, w⟩ =  11 00 11 11 00 11 00 ··· 11 00   01   101011 ··· 010
          |--- ⟨M⟩, each bit doubled ---| delim |---- w ----|

FIGURE 6.22
Example of the format of the description ⟨M, w⟩ of some string x
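As a sketch (ours, not the book's), the doubling scheme is easy to implement and to invert, which is exactly the unambiguity the text requires:

```python
def encode(m_bits, w):
    """Self-delimiting description <M,w> = <M>w: double each bit of
    the machine encoding, then mark the separation with '01'."""
    return "".join(b + b for b in m_bits) + "01" + w

def decode(desc):
    """Recover (<M>, w) by scanning doubled pairs until the '01'
    delimiter appears."""
    i, m_bits = 0, []
    while desc[i:i + 2] != "01":
        pair = desc[i:i + 2]
        assert pair in ("00", "11"), "malformed description"
        m_bits.append(pair[0])
        i += 2
    return "".join(m_bits), desc[i + 2:]

d = encode("1011", "0110")      # "11001111" + "01" + "0110"
assert decode(d) == ("1011", "0110")
```

Because every pair before the delimiter is `00` or `11`, the first occurrence of `01` on a pair boundary unambiguously marks where ⟨M⟩ ends and w begins.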
Now that we have fixed our description language, we are ready to define our
measure of the quantity of information in a string.
DEFINITION 6.23
Let x be a binary string. The minimal description of x, written
d(x), is the shortest string ⟨M, w⟩ where TM M on input w halts
with x on its tape. If several such strings exist, select the lexi-
cographically first among them. The descriptive complexity⁵ of x,
written K(x), is
K(x) = |d(x)|.
In other words, K(x) is the length of the minimal description of x. The definition of K(x) is intended to capture our intuition for the amount of information in the string x. Next we establish some simple results about descriptive complexity.
THEOREM 6.24
∃c ∀x [ K(x) ≤ |x| + c ]
This theorem says that the descriptive complexity of a string is at most a fixed
constant more than its length. The constant is a universal one, not dependent
on the string.
PROOF To prove an upper bound on K(x) as this theorem claims, we need only demonstrate some description of x that is no longer than the stated bound. Then the minimal description of x may be shorter than the demonstrated description, but not longer.
Consider the following description of the string x. Let M be a Turing machine that halts as soon as it is started. This machine computes the identity function; its output is the same as its input. A description of x is simply ⟨M⟩x. Letting c be the length of ⟨M⟩ completes the proof.
Theorem 6.24 illustrates how we use the input to the Turing machine to represent information that would require a significantly larger description if stored instead by using the machine's transition function. It conforms to our intuition that the amount of information contained by a string cannot be (substantially) more than its length. Similarly, intuition says that the information contained by the string xx is not significantly more than the information contained by x. The following theorem verifies this fact.
⁵Descriptive complexity is called Kolmogorov complexity or Kolmogorov–Chaitin complexity in some treatments.
6.4 A DEFINITION OF INFORMATION 265
THEOREM 6.25
∃c ∀x [ K(xx) ≤ K(x) + c ]
PROOF Consider the following Turing machine M, which expects an input of the form ⟨N,w⟩, where N is a Turing machine and w is an input for it.
M = “On input ⟨N,w⟩, where N is a TM and w is a string:
1. Run N on w until it halts and produces an output string s.
2. Output the string ss.”
A description of xx is ⟨M⟩d(x). Recall that d(x) is a minimal description of x. The length of this description is |⟨M⟩| + |d(x)|, which is c + K(x) where c is the length of ⟨M⟩.
Next we examine how the descriptive complexity of the concatenation xy of two strings x and y is related to their individual complexities. Theorem 6.24 might lead us to believe that the complexity of the concatenation is at most the sum of the individual complexities (plus a fixed constant), but the cost of combining two descriptions leads to a greater bound, as described in the following theorem.
THEOREM 6.26
∃c ∀x,y [ K(xy) ≤ 2K(x) + K(y) + c ]
PROOF We construct a TM M that breaks its input w into two separate descriptions. The bits of the first description d(x) are all doubled and terminated with the string 01 before the second description d(y) appears, as described in the text preceding Figure 6.22. Once both descriptions have been obtained, they are run to obtain the strings x and y, and the output xy is produced.
The length of this description of xy is clearly twice the complexity of x plus the complexity of y plus a fixed constant for describing M. This sum is

2K(x) + K(y) + c,

and the proof is complete.
We may improve this theorem somewhat by using a more efficient method of indicating the separation between the two descriptions. One way avoids doubling the bits of d(x). Instead, we prepend the length of d(x) as a binary integer that has been doubled to differentiate it from d(x). The description still contains enough information to decode it into the two descriptions of x and y, and it now has length at most

2 log2(K(x)) + K(x) + K(y) + c.
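Both separator schemes are concrete enough to run. The sketch below (the function names are my own invention) implements the bit-doubling delimiter used in the proof of Theorem 6.26 and the length-prefixed variant just described, and compares the encoded lengths.

```python
def encode_doubled(dx: str, dy: str) -> str:
    """Double every bit of the first description and terminate it with
    the delimiter 01, then append the second description."""
    return "".join(b + b for b in dx) + "01" + dy

def decode_doubled(s: str):
    """Invert encode_doubled on well-formed input: doubled bits belong
    to d(x); the first unequal pair is the 01 delimiter."""
    dx, i = [], 0
    while s[i] == s[i + 1]:
        dx.append(s[i])
        i += 2
    return "".join(dx), s[i + 2:]

def encode_prefixed(dx: str, dy: str) -> str:
    """Variant: prepend |d(x)| in binary with its bits doubled and a 01
    delimiter; d(x) itself is not doubled."""
    length_bits = format(len(dx), "b")
    return "".join(b + b for b in length_bits) + "01" + dx + dy

dx, dy = "110100101", "0011"
s = encode_doubled(dx, dy)
assert decode_doubled(s) == (dx, dy)      # the encoding is decodable
print(len(s))                             # 2|d(x)| + |d(y)| + 2
print(len(encode_prefixed(dx, dy)))       # ~2 log2|d(x)| + |d(x)| + |d(y)| + 2
```

For long descriptions the prefixed form wins, since the doubled part shrinks from |d(x)| bits to about log2 |d(x)| bits.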
Further small improvements are possible. However, as Problem 6.26 asks you to show, we cannot reach the bound K(x) + K(y) + c.
OPTIMALITY OF THE DEFINITION
Now that we have established some of the elementary properties of descriptive
complexity and you have had a chance to develop some intuition, we discuss
some features of the definitions.
Our definition of K(x) has an optimality property among all possible ways of defining descriptive complexity with algorithms. Suppose that we consider a general description language to be any computable function p: Σ* → Σ* and define the minimal description of x with respect to p, written dp(x), to be the first string s where p(s) = x, in the standard string order. Thus, s is lexicographically first among the shortest descriptions of x. Define Kp(x) = |dp(x)|.
For example, consider a programming language such as Python (encoded into binary) as the description language. Then dPython(x) would be the minimal Python program that outputs x, and KPython(x) would be the length of the minimal program.
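The search implied by this definition can actually be executed for a toy description language. In the sketch below, the language p and all function names are invented for illustration (real Python programs as descriptions would make the search infeasible); candidate descriptions are enumerated in the standard string order, shortest first and lexicographically within each length.

```python
from itertools import product

def p(s: str) -> str:
    """A toy description language p: Σ* → Σ*.  A leading '1' means
    'output the rest verbatim'; a leading '0' means 'output the rest
    twice'.  Purely illustrative; any computable map would do."""
    if not s:
        return ""
    return s[1:] if s[0] == "1" else s[1:] + s[1:]

def d_p(x: str) -> str:
    """Minimal description of x with respect to p: the first string s
    in standard string order with p(s) == x.  Brute force, so only
    practical for very short x."""
    for n in range(len(x) + 2):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if p(s) == x:
                return s
    raise ValueError("no description found")

x = "0110"
print(d_p(x), len(d_p(x)))            # Kp of x under this toy language
print(d_p(x + x), len(d_p(x + x)))    # doubling xx is cheap here
```

The search always halts because "1" + x describes x, so Kp(x) ≤ |x| + 1 in this language, mirroring Theorem 6.24.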
The following theorem shows that any description language of this type is not significantly more concise than the language of Turing machines and inputs that we originally defined.
THEOREM 6.27
For any description language p, a fixed constant c exists that depends only on p, where

∀x [ K(x) ≤ Kp(x) + c ].
PROOF IDEA We illustrate the idea of this proof by using the Python example. Suppose that x has a short description w in Python. Let M be a TM that can interpret Python and use the Python program for x as M's input w. Then ⟨M,w⟩ is a description of x that is only a fixed amount larger than the Python description of x. The extra length is for the Python interpreter M.
PROOF Take any description language p and consider the following Turing machine M.
M = “On input w:
1. Output p(w).”
Then ⟨M⟩dp(x) is a description of x whose length is at most a fixed constant greater than Kp(x). The constant is the length of ⟨M⟩.
INCOMPRESSIBLE STRINGS AND RANDOMNESS
Theorem 6.24 shows that a string’s minimal description is never much longer
than the string itself. Of course for some strings, the minimal description may
be much shorter if the information in the string appears sparsely or redundantly.
Do some strings lack short descriptions? In other words, is the minimal de-
scription of some strings actually as long as the string itself? We show that such
strings exist. These strings can’t be described any more concisely than simply
writing them out explicitly.
DEFINITION 6.28
Let x be a string. Say that x is c-compressible if

K(x) ≤ |x| − c.

If x is not c-compressible, we say that x is incompressible by c. If x is incompressible by 1, we say that x is incompressible.
In other words, if x has a description that is c bits shorter than its length, x is c-compressible. If not, x is incompressible by c. Finally, if x doesn't have any description shorter than itself, x is incompressible. We first show that incompressible strings exist, and then we discuss their interesting properties. In particular, we show that incompressible strings look like strings that are obtained from random coin tosses.
THEOREM 6.29
Incompressible strings of every length exist.
PROOF IDEA The number of strings of length n is greater than the number of descriptions of length less than n. Each description describes at most one string. Therefore, some string of length n is not described by any description of length less than n. That string is incompressible.
PROOF The number of binary strings of length n is 2^n. Each description is a binary string, so the number of descriptions of length less than n is at most the sum of the number of strings of each length up to n−1, or

∑_{0≤i≤n−1} 2^i = 1 + 2 + 4 + 8 + ··· + 2^(n−1) = 2^n − 1.

The number of short descriptions is less than the number of strings of length n. Therefore, at least one string of length n is incompressible.
COROLLARY 6.30
At least 2^n − 2^(n−c+1) + 1 strings of length n are incompressible by c.
PROOF We extend the proof of Theorem 6.29. Every c-compressible string has a description of length at most n − c. No more than 2^(n−c+1) − 1 such descriptions can occur. Therefore, at most 2^(n−c+1) − 1 of the 2^n strings of length n may have such descriptions. The remaining strings, numbering at least 2^n − (2^(n−c+1) − 1), are incompressible by c.
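Both counting facts here are finite computations for any fixed n, so they can be checked exhaustively for small lengths; a quick sketch:

```python
# Check the counts behind Theorem 6.29 and Corollary 6.30 for small n:
# there are 2^n strings of length n, 2^n - 1 descriptions of length
# less than n, and 2^(n-c+1) - 1 descriptions of length at most n - c.
for n in range(1, 12):
    assert sum(2**i for i in range(n)) == 2**n - 1            # Thm 6.29
    for c in range(1, n + 1):
        short = sum(2**i for i in range(n - c + 1))           # len <= n-c
        assert short == 2**(n - c + 1) - 1
        assert 2**n - short == 2**n - 2**(n - c + 1) + 1      # Cor 6.30
print("counting bounds verified for n < 12")
```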
Incompressible strings have many properties that we would expect to find in randomly chosen strings. For example, we can show that any incompressible string of length n has roughly an equal number of 0s and 1s, and that the length of its longest run of 0s is approximately log2 n, as we would expect to find in a random string of that length. Proving such statements would take us too far afield into combinatorics and probability, but we will prove a theorem that forms the basis for these statements.
That theorem shows that any computable property that holds for “almost all” strings also holds for all sufficiently long incompressible strings. As we mentioned in Section 0.2, a property of strings is simply a function f that maps strings to {TRUE, FALSE}. We say that a property holds for almost all strings if the fraction of strings of length n on which it is FALSE approaches 0 as n grows large. A randomly chosen long string is likely to satisfy a computable property that holds for almost all strings. Therefore, random strings and incompressible strings share such properties.
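As a small illustration, one such fraction can be measured exhaustively. The property and the threshold 2·log2 n below are chosen for the example (they are not taken from the text), and exhaustive counting keeps n small:

```python
from math import log2

def longest_zero_run(s: str) -> int:
    """Length of the longest block of consecutive 0s in s."""
    return max((len(run) for run in s.split("1")), default=0)

# Fraction of length-n strings whose longest run of 0s exceeds
# 2*log2(n); this kind of property fails on a shrinking fraction of
# strings as n grows.
frac = {}
for n in (8, 12, 16):
    bad = sum(longest_zero_run(format(i, f"0{n}b")) > 2 * log2(n)
              for i in range(2**n))
    frac[n] = bad / 2**n
    print(n, frac[n])
```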
THEOREM 6.31
Let f be a computable property that holds for almost all strings. Then, for any b > 0, the property f is FALSE on only finitely many strings that are incompressible by b.
PROOF Let M be the following algorithm.
M = “On input i, a binary integer:
1. Find the ith string s where f(s) = FALSE, in the standard string order.
2. Output string s.”
We can use M to obtain short descriptions of strings that fail to have property f as follows. For any such string x, let i_x be the position or index of x on a list of all strings that fail to have property f, in the standard string order (i.e., by length and lexicographically within each length). Then ⟨M, i_x⟩ is a description of x. The length of this description is |i_x| + c, where c is the length of ⟨M⟩. Because few strings fail to have property f, the index of x is small and its description is correspondingly short.
Fix any number b > 0. Select n such that at most a 1/2^(b+c+1) fraction of strings of length n or less fail to have property f. All sufficiently large n satisfy this condition because f holds for almost all strings. Let x be a string of length n that fails to have property f. We have 2^(n+1) − 1 strings of length n or less, so

i_x ≤ (2^(n+1) − 1) / 2^(b+c+1) ≤ 2^(n−b−c).

Therefore, |i_x| ≤ n − b − c, so the length of ⟨M, i_x⟩ is at most (n − b − c) + c = n − b, which implies that

K(x) ≤ n − b.

Thus every sufficiently long x that fails to have property f is compressible by b. Hence only finitely many strings that fail to have property f are incompressible by b, and the theorem is proved.
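The indexing trick in this proof is mechanical enough to run. In the sketch below, the property f and all function names are invented for illustration; this particular f fails far too often to be an "almost all" property, so only the index/decode machinery of M is being demonstrated, not the compression bound.

```python
from itertools import count, product

def f(s: str) -> bool:
    """An example computable property: at least as many 1s as 0s."""
    return s.count("1") >= s.count("0")

def all_strings():
    """All binary strings in standard string order."""
    yield ""
    for n in count(1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def index_of_failing(x: str) -> int:
    """Index i_x of x among the strings failing f (x must fail f)."""
    i = 0
    for s in all_strings():
        if not f(s):
            if s == x:
                return i
            i += 1

def decode(i: int) -> str:
    """The machine M of the proof: output the ith string failing f."""
    for s in all_strings():
        if not f(s):
            if i == 0:
                return s
            i -= 1

x = "0010"
assert decode(index_of_failing(x)) == x   # the index alone describes x
```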
At this point, exhibiting some examples of incompressible strings would be appropriate. However, as Problem 6.23 asks you to show, the K measure of complexity is not computable. Furthermore, no algorithm can decide in general whether strings are incompressible, by Problem 6.24. Indeed, by Problem 6.25, no infinite subset of them is Turing-recognizable. So we have no way to obtain long incompressible strings and would have no way to determine whether a string is incompressible even if we had one. The following theorem describes certain strings that are nearly incompressible, although it doesn't provide a way to exhibit them explicitly.
THEOREM 6.32
For some constant b, for every string x, the minimal description d(x) of x is incompressible by b.
PROOF Consider the following TM M:
M = “On input ⟨R, y⟩, where R is a TM and y is a string:
1. Run R on y and reject if its output is not of the form ⟨S, z⟩.
2. Run S on z and halt with its output on the tape.”
Let b be |⟨M⟩| + 1. We show that b satisfies the theorem. Suppose to the contrary that d(x) is b-compressible for some string x. Then

|d(d(x))| ≤ |d(x)| − b.

But then ⟨M⟩d(d(x)) is a description of x whose length is at most

|⟨M⟩| + |d(d(x))| ≤ (b − 1) + (|d(x)| − b) = |d(x)| − 1.

This description of x is shorter than d(x), contradicting the latter's minimality.
EXERCISES
6.1 Give an example in the spirit of the recursion theorem of a program in a real programming language (or a reasonable approximation thereof) that prints itself out.
6.2 Show that any infinite subset of MINTM is not Turing-recognizable.
A6.3 Show that if A ≤T B and B ≤T C, then A ≤T C.
6.4 Let ATM′ = {⟨M,w⟩ | M is an oracle TM and M^ATM accepts w}. Show that ATM′ is undecidable relative to ATM.
A6.5 Is the statement ∃x ∀y [x + y = y] a member of Th(N, +)? Why or why not? What about the statement ∃x ∀y [x + y = x]?
PROBLEMS
6.6 Describe two different Turing machines, M and N, where M outputs ⟨N⟩ and N outputs ⟨M⟩, when started on any input.
6.7 In the fixed-point version of the recursion theorem (Theorem 6.8), let the transformation t be a function that interchanges the states qaccept and qreject in Turing machine descriptions. Give an example of a fixed point for t.
⋆6.8 Show that EQTM ̸≤m EQ̄TM, where EQ̄TM is the complement of EQTM.
A6.9 Use the recursion theorem to give an alternative proof of Rice's theorem in Problem 5.28.
A6.10 Give a model of the sentence
φeq = ∀x [R1(x,x)]
      ∧ ∀x,y [R1(x,y) ↔ R1(y,x)]
      ∧ ∀x,y,z [(R1(x,y) ∧ R1(y,z)) → R1(x,z)].
⋆6.11 Let φeq be defined as in Problem 6.10. Give a model of the sentence
φlt = φeq
      ∧ ∀x,y [R1(x,y) → ¬R2(x,y)]
      ∧ ∀x,y [¬R1(x,y) → (R2(x,y) ⊕ R2(y,x))]
      ∧ ∀x,y,z [(R2(x,y) ∧ R2(y,z)) → R2(x,z)]
      ∧ ∀x ∃y [R2(x,y)].
A6.12 Let (N, <) be the model with universe N and the “less than” relation. Show that Th(N, <) is decidable.
6.13 For each m > 1, let Zm = {0, 1, 2, ..., m−1}, and let Fm = (Zm, +, ×) be the model whose universe is Zm and that has relations corresponding to the + and × relations computed modulo m. Show that for each m, the theory Th(Fm) is decidable.
6.14 Show that for any two languages A and B, a language J exists, where A ≤T J and B ≤T J.
6.15 Show that for any language A, a language B exists, where A ≤T B and B ̸≤T A.
⋆6.16 Prove that there exist two languages A and B that are Turing-incomparable; that is, where A ̸≤T B and B ̸≤T A.
⋆6.17 Let A and B be two disjoint languages. Say that language C separates A and B if A ⊆ C and B ⊆ C̄, where C̄ is the complement of C. Describe two disjoint Turing-recognizable languages that aren't separable by any decidable language.
6.18 Show that EQ̄TM, the complement of EQTM, is recognizable by a Turing machine with an oracle for ATM.
6.19 In Corollary 4.18, we showed that the set of all languages is uncountable. Use this result to prove that languages exist that are not recognizable by an oracle Turing machine with an oracle for ATM.
6.20 Recall the Post Correspondence Problem that we defined in Section 5.2 and its associated language PCP. Show that PCP is decidable relative to ATM.
6.21 Show how to compute the descriptive complexity of strings K(x) with an oracle for ATM.
6.22 Use the result of Problem 6.21 to give a function f that is computable with an oracle for ATM, where for each n, f(n) is an incompressible string of length n.
6.23 Show that the function K(x) is not a computable function.
6.24 Show that the set of incompressible strings is undecidable.
6.25 Show that the set of incompressible strings contains no infinite subset that is Turing-recognizable.
⋆6.26 Show that for any c, some strings x and y exist, where K(xy) > K(x) + K(y) + c.
6.27 Let S = {⟨M⟩ | M is a TM and L(M) = {⟨M⟩}}. Show that neither S nor its complement S̄ is Turing-recognizable.
6.28 Let R ⊆ N^k be a k-ary relation. Say that R is definable in Th(N, +) if we can give a formula φ with k free variables x1, ..., xk such that for all a1, ..., ak ∈ N, φ(a1, ..., ak) is true exactly when (a1, ..., ak) ∈ R. Show that each of the following relations is definable in Th(N, +).
Aa. R0 = {0}
b. R1 = {1}
c. R= = {(a, a) | a ∈ N}
d. R< = {(a, b) | a, b ∈ N and a < b}
SELECTED SOLUTIONS
6.3 Say that M1^B decides A and M2^C decides B. Use an oracle TM M3, where M3^C decides A. Machine M3 simulates M1. Every time M1 queries its oracle about some string x, machine M3 tests whether x ∈ B and provides the answer to M1. Because machine M3 doesn't have an oracle for B and cannot perform that test directly, it simulates M2 on input x to obtain that information. Machine M3 can obtain the answer to M2's queries directly because these two machines use the same oracle, C.
6.5 The statement ∃x ∀y [x + y = y] is a member of Th(N, +) because that statement is true for the standard interpretation of + over the universe N. Recall that we use N = {0, 1, 2, ...} in this chapter and so we may use x = 0. The statement ∃x ∀y [x + y = x] is not a member of Th(N, +) because that statement isn't true in this model. For any value of x, setting y = 1 causes x + y = x to fail.
6.9 Assume for the sake of contradiction that some TM X decides a property P, and P satisfies the conditions of Rice's theorem. One of these conditions says that TMs A and B exist where ⟨A⟩ ∈ P and ⟨B⟩ ̸∈ P. Use A and B to construct TM R:
R = “On input w:
1. Obtain own description ⟨R⟩ using the recursion theorem.
2. Run X on ⟨R⟩.
3. If X accepts ⟨R⟩, simulate B on w. If X rejects ⟨R⟩, simulate A on w.”
If ⟨R⟩ ∈ P, then X accepts ⟨R⟩ and L(R) = L(B). But ⟨B⟩ ̸∈ P, contradicting ⟨R⟩ ∈ P, because P agrees on TMs that have the same language. We arrive at a similar contradiction if ⟨R⟩ ̸∈ P. Therefore, our original assumption is false. Every property satisfying the conditions of Rice's theorem is undecidable.
6.10 The statement φeq gives the three conditions of an equivalence relation. A model (A, R1), where A is any universe and R1 is any equivalence relation over A, is a model of φeq. For example, let A be the integers Z and let R1 = {(i, i) | i ∈ Z}.
6.12 Reduce Th(N, <) to Th(N, +), which we've already shown to be decidable. Show how to convert a sentence φ1 over the language of (N, <) to a sentence φ2 over the language of (N, +) while preserving truth or falsity in the respective models. Replace every occurrence of i < j in φ1 with the formula ∃k [(i + k = j) ∧ (k + k ̸= k)] in φ2, where k is a different new variable each time.
Sentence φ2 is equivalent to φ1 because “i is less than j” means that we can add a nonzero value to i and obtain j. Putting φ2 into prenex-normal form, as required by the algorithm for deciding Th(N, +), requires a bit of additional work. The new existential quantifiers are brought to the front of the sentence. To do so, these quantifiers must pass through Boolean operations that appear in the sentence. Quantifiers can be brought through the operations of ∧ and ∨ without change. Passing through ¬ changes ∃ to ∀ and vice versa. Thus, ¬∃kψ becomes the equivalent expression ∀k¬ψ, and ¬∀kψ becomes ∃k¬ψ.
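The replacement formula can be sanity-checked over an initial segment of N (a finite check of the intended meaning, not a proof; the bound on k is safe here because k = j − i whenever it exists in this range):

```python
# i < j holds exactly when some k satisfies i + k = j with k nonzero;
# the conjunct k + k != k rules out k = 0.
for i in range(50):
    for j in range(50):
        exists_k = any(i + k == j and k + k != k for k in range(100))
        assert exists_k == (i < j)
print("translation of < into + verified on {0,...,49}")
```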
6.28 (a) R0 is definable in Th(N, +) by φ0(x) = ∀y [x + y = y].
(c) R= is definable in Th(N, +) by φ=(u, v) = ∀x [φ0(x) → x + u = v].
PART THREE
COMPLEXITY THEORY
7
TIME COMPLEXITY
Even when a problem is decidable and thus computationally solvable in principle, it may not be solvable in practice if the solution requires an inordinate amount of time or memory. In this final part of the book, we introduce computational complexity theory, an investigation of the time, memory, or other
resources required for solving computational problems. We begin with time.
Our objective in this chapter is to present the basics of time complexity theory.
First we introduce a way of measuring the time used to solve a problem. Then we
show how to classify problems according to the amount of time required. After
that we discuss the possibility that certain decidable problems require enormous
amounts of time, and how to determine when you are faced with such a problem.
7.1
MEASURING COMPLEXITY
Let’s begin with an example. Take the language A = {0^k 1^k | k ≥ 0}. Obviously, A is a decidable language. How much time does a single-tape Turing machine need to decide A? We examine the following single-tape TM M1 for A. We give
the Turing machine description at a low level, including the actual head motion on the tape so that we can count the number of steps that M1 uses when it runs.
M1 = “On input string w:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Repeat if both 0s and 1s remain on the tape:
3.   Scan across the tape, crossing off a single 0 and a single 1.
4. If 0s still remain after all the 1s have been crossed off, or if 1s still remain after all the 0s have been crossed off, reject. Otherwise, if neither 0s nor 1s remain on the tape, accept.”
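To get a feel for the step count before doing the formal analysis, here is a rough Python sketch of M1's crossing-off strategy (my own approximation, not a transition-level simulation: a list stands in for the tape, and each full scan is charged one step per cell):

```python
def m1_steps(w: str):
    """Approximate the number of steps M1 uses on input w by charging
    one step per tape cell for each full scan of the tape."""
    tape = list(w)
    steps = len(tape)                    # stage 1: verify the form 0*1*
    if "10" in w:
        return steps, False              # a 0 appears to the right of a 1
    while "0" in tape and "1" in tape:   # stages 2-3: repeated scans
        tape[tape.index("0")] = "x"      # cross off a single 0 ...
        tape[tape.index("1")] = "x"      # ... and a single 1
        steps += len(tape)               # each scan walks the whole tape
    steps += len(tape)                   # stage 4: anything left over?
    return steps, ("0" not in tape and "1" not in tape)

for k in (4, 8, 16, 32):
    steps, ok = m1_steps("0" * k + "1" * k)
    print(2 * k, steps, ok)              # step count grows quadratically
```

Doubling the input length roughly quadruples the dominant term of the step count, which foreshadows the quadratic behavior of M1.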
We will analyze the algorithm for TM M1 deciding A to determine how much time it uses. First, we introduce some terminology and notation for this purpose.
The number of steps that an algorithm uses on a particular input may depend on several parameters. For instance, if the input is a graph, the number of steps may depend on the number of nodes, the number of edges, and the maximum degree of the graph, or some combination of these and/or other factors. For simplicity, we compute the running time of an algorithm purely as a function of the length of the string representing the input and don't consider any other parameters. In worst-case analysis, the form we consider here, we consider the longest running time of all inputs of a particular length. In average-case analysis, we consider the average of all the running times of inputs of a particular length.
DEFINITION 7.1
Let M be a deterministic Turing machine that halts on all inputs. The running time or time complexity of M is the function f: N → N, where f(n) is the maximum number of steps that M uses on any input of length n. If f(n) is the running time of M, we say that M runs in time f(n) and that M is an f(n) time Turing machine. Customarily we use n to represent the length of the input.
BIG-O AND SMALL-O NOTATION
Because the exact running time of an algorithm often is a complex expression, we usually just estimate it. In one convenient form of estimation, called asymptotic analysis, we seek to understand the running time of the algorithm when it is run on large inputs. We do so by considering only the highest order term of the expression for the running time of the algorithm, disregarding both the coefficient of that term and any lower order terms, because the highest order term dominates the other terms on large inputs.
For example, the function f(n) = 6n^3 + 2n^2 + 20n + 45 has four terms, and the highest order term is 6n^3. Disregarding the coefficient 6, we say that f is asymptotically at most n^3. The asymptotic notation or big-O notation for describing this relationship is f(n) = O(n^3). We formalize this notion in the following definition. Let R+ be the set of nonnegative real numbers.
DEFINITION 7.2
Let f and g be functions f, g: N → R+. Say that f(n) = O(g(n)) if positive integers c and n0 exist such that for every integer n ≥ n0,

f(n) ≤ c g(n).

When f(n) = O(g(n)), we say that g(n) is an upper bound for f(n), or more precisely, that g(n) is an asymptotic upper bound for f(n), to emphasize that we are suppressing constant factors.
Intuitively, f(n) = O(g(n)) means that f is less than or equal to g if we disregard differences up to a constant factor. You may think of O as representing a suppressed constant. In practice, most functions f that you are likely to encounter have an obvious highest order term h. In that case, write f(n) = O(g(n)), where g is h without its coefficient.
EXAMPLE 7.3
Let f1(n) be the function 5n^3 + 2n^2 + 22n + 6. Then, selecting the highest order term 5n^3 and disregarding its coefficient 5 gives f1(n) = O(n^3).
Let's verify that this result satisfies the formal definition. We do so by letting c be 6 and n0 be 10. Then, 5n^3 + 2n^2 + 22n + 6 ≤ 6n^3 for every n ≥ 10.
In addition, f1(n) = O(n^4) because n^4 is larger than n^3 and so is still an asymptotic upper bound on f1.
However, f1(n) is not O(n^2). Regardless of the values we assign to c and n0, the definition remains unsatisfied in this case.
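The witnesses c = 6 and n0 = 10 can be checked mechanically over a finite range (a finite check illustrates, but of course does not prove, the bound for all n ≥ n0):

```python
def f1(n: int) -> int:
    return 5 * n**3 + 2 * n**2 + 22 * n + 6

# Definition 7.2 with witnesses c = 6, n0 = 10.
assert all(f1(n) <= 6 * n**3 for n in range(10, 10_000))

# No witnesses can work for O(n^2): the ratio f1(n)/n^2 keeps growing.
print([f1(n) // n**2 for n in (10, 100, 1000)])
```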
EXAMPLE 7.4
The big-O interacts with logarithms in a particular way. Usually when we use logarithms, we must specify the base, as in x = log2 n. The base 2 here indicates that this equality is equivalent to the equality 2^x = n. Changing the value of the base b changes the value of log_b n by a constant factor, owing to the identity log_b n = log2 n / log2 b. Thus, when we write f(n) = O(log n), specifying the base is no longer necessary because we are suppressing constant factors anyway.
Let f2(n) be the function 3n log2 n + 5n log2 log2 n + 2. In this case, we have f2(n) = O(n log n) because log n dominates log log n.
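The base-change identity can be observed directly; the ratio log_b n / log2 n is the constant 1/log2 b, independent of n. A quick numerical sketch:

```python
import math

# Changing the base b rescales log_b(n) by the constant 1/log2(b),
# so inside O(log n) the base is irrelevant.
for b in (2, 3, 10, 16):
    ratio = math.log(1000, b) / math.log2(1000)
    assert abs(ratio - 1 / math.log2(b)) < 1e-9
```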
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

Big-O notation also appears in arithmetic expressions such as the expression f(n) = O(n^2) + O(n). In that case, each occurrence of the O symbol represents a different suppressed constant. Because the O(n^2) term dominates the O(n) term, that expression is equivalent to f(n) = O(n^2). When the O symbol occurs in an exponent, as in the expression f(n) = 2^O(n), the same idea applies. This expression represents an upper bound of 2^(cn) for some constant c.
The expression f(n) = 2^O(log n) occurs in some analyses. Using the identity n = 2^(log2 n), and thus n^c = 2^(c log2 n), we see that 2^O(log n) represents an upper bound of n^c for some c. The expression n^O(1) represents the same bound in a different way because the expression O(1) represents a value that is never more than a fixed constant.
Frequently, we derive bounds of the form n^c for c greater than 0. Such bounds are called polynomial bounds. Bounds of the form 2^(n^δ) are called exponential bounds when δ is a real number greater than 0.
Big-O notation has a companion called small-o notation. Big-O notation says that one function is asymptotically no more than another. To say that one function is asymptotically less than another, we use small-o notation. The difference between the big-O and small-o notations is analogous to the difference between ≤ and <.

DEFINITION 7.5
Let f and g be functions f, g: N → R+. Say that f(n) = o(g(n)) if

    lim_{n→∞} f(n)/g(n) = 0.

In other words, f(n) = o(g(n)) means that for any real number c > 0, a number n0 exists, where f(n) < c g(n) for all n ≥ n0.
EXAMPLE 7.6
The following are easy to check.
1. √n = o(n).
2. n = o(n log log n).
3. n log log n = o(n log n).
4. n log n = o(n^2).
5. n^2 = o(n^3).
However, f(n) is never o(f(n)).
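The five ratios f(n)/g(n) above can be watched shrinking numerically; a decreasing ratio at a few sample points is consistent with, though of course no proof of, the limit being 0. A sketch:

```python
import math

# Each (f, g) pair comes from Example 7.6; we check that f(n)/g(n)
# decreases as n grows from 10^4 to 10^6.
pairs = [
    (lambda n: math.sqrt(n), lambda n: n),
    (lambda n: n, lambda n: n * math.log2(math.log2(n))),
    (lambda n: n * math.log2(math.log2(n)), lambda n: n * math.log2(n)),
    (lambda n: n * math.log2(n), lambda n: n**2),
    (lambda n: n**2, lambda n: n**3),
]
for f, g in pairs:
    assert f(10**6) / g(10**6) < f(10**4) / g(10**4)
```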
ANALYZING ALGORITHMS
Let's analyze the TM algorithm we gave for the language A = {0^k 1^k | k ≥ 0}. We repeat the algorithm here for convenience.

M1 = "On input string w:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Repeat if both 0s and 1s remain on the tape:
3.   Scan across the tape, crossing off a single 0 and a single 1.
4. If 0s still remain after all the 1s have been crossed off, or if 1s still remain after all the 0s have been crossed off, reject. Otherwise, if neither 0s nor 1s remain on the tape, accept."

To analyze M1, we consider each of its four stages separately. In stage 1, the machine scans across the tape to verify that the input is of the form 0*1*. Performing this scan uses n steps. As we mentioned earlier, we typically use n to represent the length of the input. Repositioning the head at the left-hand end of the tape uses another n steps. So the total used in this stage is 2n steps. In big-O notation, we say that this stage uses O(n) steps. Note that we didn't mention the repositioning of the tape head in the machine description. Using asymptotic notation allows us to omit details of the machine description that affect the running time by at most a constant factor.
In stages 2 and 3, the machine repeatedly scans the tape and crosses off a 0 and 1 on each scan. Each scan uses O(n) steps. Because each scan crosses off two symbols, at most n/2 scans can occur. So the total time taken by stages 2 and 3 is (n/2) O(n) = O(n^2) steps.
In stage 4, the machine makes a single scan to decide whether to accept or reject. The time taken in this stage is at most O(n).
Thus, the total time of M1 on an input of length n is O(n) + O(n^2) + O(n), or O(n^2). In other words, its running time is O(n^2), which completes the time analysis of this machine.
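The quadratic count can be made concrete with a rough step-count model of M1 in Python. This is a sketch under simplifying assumptions (each crossing-off pass is charged the number of symbols still on the tape; head repositioning is folded into stage 1), not the Turing machine itself:

```python
def m1_steps(k):
    """Approximate step count of M1 on input 0^k 1^k."""
    n = 2 * k                 # input length
    steps = 2 * n             # stage 1: one scan plus head repositioning
    remaining = n
    while remaining > 0:      # stages 2-3: each pass crosses off a 0 and a 1
        steps += remaining
        remaining -= 2
    steps += n                # stage 4: final scan
    return steps

# Doubling k roughly quadruples the count, as expected for O(n^2).
assert m1_steps(200) / m1_steps(100) > 3.5
```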
Let’s set up some notation for classifying languages according to their time
requirements.
DEFINITION 7.7
Let t: N → R+ be a function. Define the time complexity class, TIME(t(n)), to be the collection of all languages that are decidable by an O(t(n)) time Turing machine.

Recall the language A = {0^k 1^k | k ≥ 0}. The preceding analysis shows that A ∈ TIME(n^2) because M1 decides A in time O(n^2) and TIME(n^2) contains all languages that can be decided in O(n^2) time.
Is there a machine that decides A asymptotically more quickly? In other words, is A in TIME(t(n)) for t(n) = o(n^2)? We can improve the running time by crossing off two 0s and two 1s on every scan instead of just one because doing so cuts the number of scans by half. But that improves the running time only by a factor of 2 and doesn't affect the asymptotic running time. The following machine, M2, uses a different method to decide A asymptotically faster. It shows that A ∈ TIME(n log n).

M2 = "On input string w:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Repeat as long as some 0s and some 1s remain on the tape:
3.   Scan across the tape, checking whether the total number of 0s and 1s remaining is even or odd. If it is odd, reject.
4.   Scan again across the tape, crossing off every other 0 starting with the first 0, and then crossing off every other 1 starting with the first 1.
5. If no 0s and no 1s remain on the tape, accept. Otherwise, reject."
Before analyzing M2, let's verify that it actually decides A. On every scan performed in stage 4, the total number of 0s remaining is cut in half and any remainder is discarded. Thus, if we started with 13 0s, after stage 4 is executed a single time, only 6 0s remain. After subsequent executions of this stage, 3, then 1, and then 0 remain. This stage has the same effect on the number of 1s.
Now we examine the even/odd parity of the number of 0s and the number of 1s at each execution of stage 3. Consider again starting with 13 0s and 13 1s. The first execution of stage 3 finds an odd number of 0s (because 13 is an odd number) and an odd number of 1s. On subsequent executions, an even number (6) occurs, then an odd number (3), and an odd number (1). We do not execute this stage on 0 0s or 0 1s because of the condition on the repeat loop specified in stage 2. For the sequence of parities found (odd, even, odd, odd), if we replace the evens with 0s and the odds with 1s and then reverse the sequence, we obtain 1101, the binary representation of 13, or the number of 0s and 1s at the beginning. The sequence of parities always gives the reverse of the binary representation.
When stage 3 checks to determine that the total number of 0s and 1s remaining is even, it actually is checking on the agreement of the parity of the 0s with the parity of the 1s. If all parities agree, the binary representations of the numbers of 0s and of 1s agree, and so the two numbers are equal.
To analyze the running time of M2, we first observe that every stage takes O(n) time. We then determine the number of times that each is executed. Stages 1 and 5 are executed once, taking a total of O(n) time. Stage 4 crosses off at least half the 0s and 1s each time it is executed, so at most 1 + log2 n iterations of the repeat loop occur before all get crossed off. Thus the total time of stages 2, 3, and 4 is (1 + log2 n) O(n), or O(n log n). The running time of M2 is O(n) + O(n log n) = O(n log n).
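M2's logic translates directly into Python if we model "crossing off" by keeping counts of the symbols that remain. This is a sketch of the five stages, not a Turing machine; it accepts exactly the strings of the form 0^k 1^k:

```python
def m2_accepts(w):
    if '10' in w:                         # stage 1: a 0 to the right of a 1
        return False
    zeros, ones = w.count('0'), w.count('1')
    while zeros > 0 and ones > 0:         # stage 2: some 0s and some 1s remain
        if (zeros + ones) % 2 == 1:       # stage 3: odd total -> reject
            return False
        zeros -= (zeros + 1) // 2         # stage 4: cross off every other 0,
        ones -= (ones + 1) // 2           #          starting with the first
    return zeros == 0 and ones == 0       # stage 5

assert m2_accepts('0' * 13 + '1' * 13)    # the text's running example
assert not m2_accepts('001')              # unequal counts caught by parity
```

Note how the loop body runs about log2 n times, matching the O(n log n) analysis: stage 4 halves the remaining symbols on each iteration.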
Earlier we showed that A ∈ TIME(n^2), but now we have a better bound: namely, A ∈ TIME(n log n). This result cannot be further improved on single-tape Turing machines. In fact, any language that can be decided in o(n log n) time on a single-tape Turing machine is regular, as Problem 7.49 asks you to show.
We can decide the language A in O(n) time (also called linear time) if the Turing machine has a second tape. The following two-tape TM M3 decides A in linear time. Machine M3 operates differently from the previous machines for A. It simply copies the 0s to its second tape and then matches them against the 1s.
M3 = "On input string w:
1. Scan across tape 1 and reject if a 0 is found to the right of a 1.
2. Scan across the 0s on tape 1 until the first 1. At the same time, copy the 0s onto tape 2.
3. Scan across the 1s on tape 1 until the end of the input. For each 1 read on tape 1, cross off a 0 on tape 2. If all 0s are crossed off before all the 1s are read, reject.
4. If all the 0s have now been crossed off, accept. If any 0s remain, reject."

This machine is simple to analyze. Each of the four stages uses O(n) steps, so the total running time is O(n) and thus is linear. Note that this running time is the best possible because n steps are necessary just to read the input.
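M3's two-tape idea can be rendered in Python with a counter standing in for tape 2 (a sketch of the logic, not a Turing machine). Each stage corresponds to a single pass over the input, hence linear time:

```python
def m3_accepts(w):
    if '10' in w:                  # stage 1: reject a 0 after a 1
        return False
    tape2 = w.count('0')           # stage 2: copy the 0s onto tape 2
    for _ in range(w.count('1')):  # stage 3: cross off one 0 per 1 read
        if tape2 == 0:
            return False           # 0s exhausted before the 1s
        tape2 -= 1
    return tape2 == 0              # stage 4: accept iff no 0s remain

assert m3_accepts('000111')
assert not m3_accepts('0011 1'.replace(' ', '') + '1')  # unequal counts
```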
Let's summarize what we have shown about the time complexity of A, the amount of time required for deciding A. We produced a single-tape TM M1 that decides A in O(n^2) time and a faster single-tape TM M2 that decides A in O(n log n) time. The solution to Problem 7.49 implies that no single-tape TM can do it more quickly. Then we exhibited a two-tape TM M3 that decides A in O(n) time. Hence the time complexity of A on a single-tape TM is O(n log n), and on a two-tape TM it is O(n). Note that the complexity of A depends on the model of computation selected.
This discussion highlights an important difference between complexity theory and computability theory. In computability theory, the Church–Turing thesis implies that all reasonable models of computation are equivalent; that is, they all decide the same class of languages. In complexity theory, the choice of model affects the time complexity of languages. Languages that are decidable in, say, linear time on one model aren't necessarily decidable in linear time on another.
In complexity theory, we classify computational problems according to their time complexity. But with which model do we measure time? The same language may have different time requirements on different models.
Fortunately, time requirements don't differ greatly for typical deterministic models. So, if our classification system isn't very sensitive to relatively small differences in complexity, the choice of deterministic model isn't crucial. We discuss this idea further in the next several sections.
COMPLEXITY RELATIONSHIPS AMONG MODELS
Here we examine how the choice of computational model can affect the time complexity of languages. We consider three models: the single-tape Turing machine; the multitape Turing machine; and the nondeterministic Turing machine.

THEOREM 7.8
Let t(n) be a function, where t(n) ≥ n. Then every t(n) time multitape Turing machine has an equivalent O(t^2(n)) time single-tape Turing machine.
PROOF IDEA The idea behind the proof of this theorem is quite simple. Recall that in Theorem 3.13, we showed how to convert any multitape TM into a single-tape TM that simulates it. Now we analyze that simulation to determine how much additional time it requires. We show that simulating each step of the multitape machine uses at most O(t(n)) steps on the single-tape machine. Hence the total time used is O(t^2(n)) steps.

PROOF Let M be a k-tape TM that runs in t(n) time. We construct a single-tape TM S that runs in O(t^2(n)) time.
Machine S operates by simulating M, as described in Theorem 3.13. To review that simulation, we recall that S uses its single tape to represent the contents of all k of M's tapes. The tapes are stored consecutively, with the positions of M's heads marked on the appropriate squares.
Initially, S puts its tape into the format that represents all the tapes of M and then simulates M's steps. To simulate one step, S scans all the information stored on its tape to determine the symbols under M's tape heads. Then S makes another pass over its tape to update the tape contents and head positions. If one of M's heads moves rightward onto the previously unread portion of its tape, S must increase the amount of space allocated to this tape. It does so by shifting a portion of its own tape one cell to the right.
Now we analyze this simulation. For each step of M, machine S makes two passes over the active portion of its tape. The first obtains the information necessary to determine the next move and the second carries it out. The length of the active portion of S's tape determines how long S takes to scan it, so we must determine an upper bound on this length. To do so, we take the sum of the lengths of the active portions of M's k tapes. Each of these active portions has length at most t(n) because M uses t(n) tape cells in t(n) steps if the head moves rightward at every step, and even fewer if a head ever moves leftward. Thus, a scan of the active portion of S's tape uses O(t(n)) steps.
To simulate each of M's steps, S performs two scans and possibly up to k rightward shifts. Each uses O(t(n)) time, so the total time for S to simulate one of M's steps is O(t(n)).
Now we bound the total time used by the simulation. The initial stage, where S puts its tape into the proper format, uses O(n) steps. Afterward, S simulates each of the t(n) steps of M, using O(t(n)) steps, so this part of the simulation
uses t(n) × O(t(n)) = O(t^2(n)) steps. Therefore, the entire simulation of M uses O(n) + O(t^2(n)) steps.
We have assumed that t(n) ≥ n (a reasonable assumption because M could not even read the entire input in less time). Therefore, the running time of S is O(t^2(n)) and the proof is complete.
Next, we consider the analogous theorem for nondeterministic single-tape Turing machines. We show that any language that is decidable on such a machine is decidable on a deterministic single-tape Turing machine that requires significantly more time. Before doing so, we must define the running time of a nondeterministic Turing machine. Recall that a nondeterministic Turing machine is a decider if all its computation branches halt on all inputs.

DEFINITION 7.9
Let N be a nondeterministic Turing machine that is a decider. The running time of N is the function f: N → N, where f(n) is the maximum number of steps that N uses on any branch of its computation on any input of length n, as shown in the following figure.

FIGURE 7.10
Measuring deterministic and nondeterministic time

The definition of the running time of a nondeterministic Turing machine is not intended to correspond to any real-world computing device. Rather, it is a useful mathematical definition that assists in characterizing the complexity of an important class of computational problems, as we demonstrate shortly.
THEOREM 7.11
Let t(n) be a function, where t(n) ≥ n. Then every t(n) time nondeterministic single-tape Turing machine has an equivalent 2^O(t(n)) time deterministic single-tape Turing machine.

PROOF Let N be a nondeterministic TM running in t(n) time. We construct a deterministic TM D that simulates N as in the proof of Theorem 3.16 by searching N's nondeterministic computation tree. Now we analyze that simulation.
On an input of length n, every branch of N's nondeterministic computation tree has a length of at most t(n). Every node in the tree can have at most b children, where b is the maximum number of legal choices given by N's transition function. Thus, the total number of leaves in the tree is at most b^t(n).
The simulation proceeds by exploring this tree breadth first. In other words, it visits all nodes at depth d before going on to any of the nodes at depth d + 1. The algorithm given in the proof of Theorem 3.16 inefficiently starts at the root and travels down to a node whenever it visits that node. But eliminating this inefficiency doesn't alter the statement of the current theorem, so we leave it as is. The total number of nodes in the tree is less than twice the maximum number of leaves, so we bound it by O(b^t(n)). The time it takes to start from the root and travel down to a node is O(t(n)). Therefore, the running time of D is O(t(n) b^t(n)) = 2^O(t(n)).
As described in Theorem 3.16, the TM D has three tapes. Converting to a single-tape TM at most squares the running time, by Theorem 7.8. Thus, the running time of the single-tape simulator is (2^O(t(n)))^2 = 2^O(2t(n)) = 2^O(t(n)), and the theorem is proved.
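The proof's breadth-first exploration of the computation tree can be sketched deterministically in Python. The `step` function standing in for N's transition relation and the bit-guessing example are illustrative assumptions, not anything from the text; the point is that a tree with branching factor b and depth t has at most O(b^t) nodes to visit:

```python
from collections import deque

def simulate(start, step, accepting, max_depth):
    """Breadth-first search of a nondeterministic computation tree.

    Visits all nodes at depth d before any node at depth d + 1, and
    returns (accepted, number of configurations explored)."""
    frontier = deque([(start, 0)])
    explored = 0
    while frontier:
        config, depth = frontier.popleft()
        explored += 1
        if accepting(config):
            return True, explored
        if depth < max_depth:
            for nxt in step(config):   # at most b children per node
                frontier.append((nxt, depth + 1))
    return False, explored

# Toy example: nondeterministically guess the bits of a length-3 string
# and accept when the guess matches a hidden target (b = 2, t = 3).
target = "101"
accepted, explored = simulate(
    "",
    lambda c: [c + "0", c + "1"] if len(c) < 3 else [],
    lambda c: c == target,
    max_depth=3)
assert accepted and explored <= 2**4   # at most ~2 * b^t nodes visited
```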
7.2
THE CLASS P

Theorems 7.8 and 7.11 illustrate an important distinction. On the one hand, we demonstrated at most a square or polynomial difference between the time complexity of problems measured on deterministic single-tape and multitape Turing machines. On the other hand, we showed at most an exponential difference between the time complexity of problems on deterministic and nondeterministic Turing machines.

POLYNOMIAL TIME
For our purposes, polynomial differences in running time are considered to be small, whereas exponential differences are considered to be large. Let's look at
why we chose to make this separation between polynomials and exponentials rather than between some other classes of functions.
First, note the dramatic difference between the growth rate of typically occurring polynomials such as n^3 and typically occurring exponentials such as 2^n. For example, let n be 1000, the size of a reasonable input to an algorithm. In that case, n^3 is 1 billion, a large but manageable number, whereas 2^n is a number much larger than the number of atoms in the universe. Polynomial time algorithms are fast enough for many purposes, but exponential time algorithms rarely are useful.
Exponential time algorithms typically arise when we solve problems by exhaustively searching through a space of solutions, called brute-force search. For example, one way to factor a number into its constituent primes is to search through all potential divisors. The size of the search space is exponential, so this search uses exponential time. Sometimes brute-force search may be avoided through a deeper understanding of a problem, which may reveal a polynomial time algorithm of greater utility.
All reasonable deterministic computational models are polynomially equivalent. That is, any one of them can simulate another with only a polynomial increase in running time. When we say that all reasonable deterministic models are polynomially equivalent, we do not attempt to define reasonable. However, we have in mind a notion broad enough to include models that closely approximate running times on actual computers. For example, Theorem 7.8 shows that the deterministic single-tape and multitape Turing machine models are polynomially equivalent.
From here on we focus on aspects of time complexity theory that are unaffected by polynomial differences in running time. Ignoring these differences allows us to develop a theory that doesn't depend on the selection of a particular model of computation. Remember, our aim is to present the fundamental properties of computation, rather than properties of Turing machines or any other special model.
You may feel that disregarding polynomial differences in running time is absurd. Real programmers certainly care about such differences and work hard just to make their programs run twice as quickly. However, we disregarded constant factors a while back when we introduced asymptotic notation. Now we propose to disregard the much greater polynomial differences, such as that between time n and time n^3.
Our decision to disregard polynomial differences doesn't imply that we consider such differences unimportant. On the contrary, we certainly do consider the difference between time n and time n^3 to be an important one. But some questions, such as the polynomiality or nonpolynomiality of the factoring problem, do not depend on polynomial differences and are important, too. We merely choose to focus on this type of question here. Ignoring the trees to see the forest doesn't mean that one is more important than the other; it just gives a different perspective.
Now we come to an important definition in complexity theory.
DEFINITION 7.12
P is the class of languages that are decidable in polynomial time on a deterministic single-tape Turing machine. In other words,

    P = ⋃_k TIME(n^k).
The class P plays a central role in our theory and is important because
1. P is invariant for all models of computation that are polynomially equivalent to the deterministic single-tape Turing machine, and
2. P roughly corresponds to the class of problems that are realistically solvable on a computer.
Item 1 indicates that P is a mathematically robust class. It isn't affected by the particulars of the model of computation that we are using.
Item 2 indicates that P is relevant from a practical standpoint. When a problem is in P, we have a method of solving it that runs in time n^k for some constant k. Whether this running time is practical depends on k and on the application. Of course, a running time of n^100 is unlikely to be of any practical use. Nevertheless, calling polynomial time the threshold of practical solvability has proven to be useful. Once a polynomial time algorithm has been found for a problem that formerly appeared to require exponential time, some key insight into it has been gained and further reductions in its complexity usually follow, often to the point of actual practical utility.
EXAMPLES OF PROBLEMS IN P
When we present a polynomial time algorithm, we give a high-level description of it without reference to features of a particular computational model. Doing so avoids tedious details of tapes and head motions. We follow certain conventions when describing an algorithm so that we can analyze it for polynomiality.
We continue to describe algorithms with numbered stages. Now we must be sensitive to the number of Turing machine steps required to implement each stage, as well as to the total number of stages that the algorithm uses.
When we analyze an algorithm to show that it runs in polynomial time, we need to do two things. First, we have to give a polynomial upper bound (usually in big-O notation) on the number of stages that the algorithm uses when it runs on an input of length n. Then, we have to examine the individual stages in the description of the algorithm to be sure that each can be implemented in polynomial time on a reasonable deterministic model. We choose the stages when we describe the algorithm to make this second part of the analysis easy to do. When both tasks have been completed, we can conclude that the algorithm
runs in polynomial time because we have demonstrated that it runs for a polynomial number of stages, each of which can be done in polynomial time, and the composition of polynomials is a polynomial.
One point that requires attention is the encoding method used for problems. We continue to use the angle-bracket notation ⟨·⟩ to indicate a reasonable encoding of one or more objects into a string, without specifying any particular encoding method. Now, a reasonable method is one that allows for polynomial time encoding and decoding of objects into natural internal representations or into other reasonable encodings. Familiar encoding methods for graphs, automata, and the like all are reasonable. But note that unary notation for encoding numbers (as in the number 17 encoded by the unary string 11111111111111111) isn't reasonable because it is exponentially larger than truly reasonable encodings, such as base k notation for any k ≥ 2.
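The gap between unary and base-k encodings is easy to see concretely; a number n needs n symbols in unary but only about log_k n symbols in base k. A quick sketch using the text's example of 17:

```python
n = 17
unary = "1" * n          # unary encoding: one symbol per unit
binary = format(n, "b")  # base-2 encoding

assert len(unary) == 17
assert binary == "10001" and len(binary) == 5

# The unary length grows exponentially in the binary length:
# n symbols versus roughly log2(n) + 1 symbols.
assert len(unary) >= 2 ** (len(binary) - 1)
```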
Many computational problems you encounter in this chapter contain encodings of graphs. One reasonable encoding of a graph is a list of its nodes and edges. Another is the adjacency matrix, where the (i, j)th entry is 1 if there is an edge from node i to node j and 0 if not. When we analyze algorithms on graphs, the running time may be computed in terms of the number of nodes instead of the size of the graph representation. In reasonable graph representations, the size of the representation is a polynomial in the number of nodes. Thus, if we analyze an algorithm and show that its running time is polynomial (or exponential) in the number of nodes, we know that it is polynomial (or exponential) in the size of the input.
The first problem concerns directed graphs. A directed graph G contains nodes s and t, as shown in the following figure. The PATH problem is to determine whether a directed path exists from s to t. Let

PATH = {⟨G, s, t⟩ | G is a directed graph that has a directed path from s to t}.

FIGURE 7.13
The PATH problem: Is there a path from s to t?
THEOREM 7.14
PATH ∈P.
PROOF IDEA We prove this theorem by presenting a polynomial time algorithm that decides PATH. Before describing that algorithm, let’s observe that a brute-force algorithm for this problem isn’t fast enough.
A brute-force algorithm for PATH proceeds by examining all potential paths in G and determining whether any is a directed path from s to t. A potential path is a sequence of nodes in G having a length of at most m, where m is the number of nodes in G. (If any directed path exists from s to t, one having a length of at most m exists because repeating a node never is necessary.) But the number of such potential paths is roughly m^m, which is exponential in the number of nodes in G. Therefore, this brute-force algorithm uses exponential time.
To get a polynomial time algorithm for PATH, we must do something that avoids brute force. One way is to use a graph-searching method such as breadth-first search. Here, we successively mark all nodes in G that are reachable from s by directed paths of length 1, then 2, then 3, through m. Bounding the running time of this strategy by a polynomial is easy.
PROOF A polynomial time algorithm M for PATH operates as follows.
M = “On input ⟨G, s, t⟩, where G is a directed graph with nodes s and t:
1. Place a mark on node s.
2. Repeat the following until no additional nodes are marked:
3.   Scan all the edges of G. If an edge (a, b) is found going from a marked node a to an unmarked node b, mark node b.
4. If t is marked, accept. Otherwise, reject.”
Now we analyze this algorithm to show that it runs in polynomial time. Obviously, stages 1 and 4 are executed only once. Stage 3 runs at most m times because each time except the last it marks an additional node in G. Thus, the total number of stages used is at most 1 + 1 + m, giving a polynomial in the size of G.
Stages 1 and 4 of M are easily implemented in polynomial time on any reasonable deterministic model. Stage 3 involves a scan of the input and a test of whether certain nodes are marked, which also is easily implemented in polynomial time. Hence M is a polynomial time algorithm for PATH.
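The stages of M translate almost line for line into code. The following is a minimal sketch; the edge-list input format and the name `path_decider` are assumptions of this illustration, not part of the text.

```python
def path_decider(num_nodes, edges, s, t):
    """Decide PATH by the marking algorithm M: repeatedly scan all
    edges, marking b whenever some edge (a, b) leaves a marked node a."""
    marked = {s}                      # stage 1: mark s
    changed = True
    while changed:                    # stage 2: repeat until no new marks
        changed = False
        for a, b in edges:            # stage 3: scan all the edges
            if a in marked and b not in marked:
                marked.add(b)
                changed = True
    return t in marked                # stage 4: accept iff t is marked
```

Each pass over the edges either marks a new node or ends the loop, so the loop body runs at most m + 1 times, matching the analysis above.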
Let’s turn to another example of a polynomial time algorithm. Say that two numbers are relatively prime if 1 is the largest integer that evenly divides them both. For example, 10 and 21 are relatively prime, even though neither of them is a prime number by itself, whereas 10 and 22 are not relatively prime because
both are divisible by 2. Let RELPRIME be the problem of testing whether two numbers are relatively prime. Thus
RELPRIME = {⟨x, y⟩ | x and y are relatively prime}.
THEOREM 7.15
RELPRIME ∈P.
PROOF IDEA One algorithm that solves this problem searches through all possible divisors of both numbers and accepts if none are greater than 1. However, the magnitude of a number represented in binary, or in any other base k notation for k ≥ 2, is exponential in the length of its representation. Therefore, this brute-force algorithm searches through an exponential number of potential divisors and has an exponential running time.
Instead, we solve this problem with an ancient numerical procedure, called the Euclidean algorithm, for computing the greatest common divisor. The greatest common divisor of natural numbers x and y, written gcd(x, y), is the largest integer that evenly divides both x and y. For example, gcd(18, 24) = 6. Obviously, x and y are relatively prime iff gcd(x, y) = 1. We describe the Euclidean algorithm as algorithm E in the proof. It uses the mod function, where x mod y is the remainder after the integer division of x by y.
PROOF The Euclidean algorithm E is as follows.
E = “On input ⟨x, y⟩, where x and y are natural numbers in binary:
1. Repeat until y = 0:
2.   Assign x ← x mod y.
3.   Exchange x and y.
4. Output x.”
Algorithm R solves RELPRIME, using E as a subroutine.
R = “On input ⟨x, y⟩, where x and y are natural numbers in binary:
1. Run E on ⟨x, y⟩.
2. If the result is 1, accept. Otherwise, reject.”
Clearly, if E runs correctly in polynomial time, so does R, and hence we only need to analyze E for time and correctness. The correctness of this algorithm is well known, so we won’t discuss it further here.
To analyze the time complexity of E, we first show that every execution of stage 2 (except possibly the first) cuts the value of x by at least half. After stage 2 is executed, x < y because of the nature of the mod function. After stage 3, x > y because the two have been exchanged. Thus, when stage 2 is subsequently
executed, x > y. If x/2 ≥ y, then x mod y < y ≤ x/2 and x drops by at least half. If x/2 < y, then x mod y = x − y < x/2 and x drops by at least half.
The values of x and y are exchanged every time stage 3 is executed, so each of the original values of x and y is reduced by at least half every other time through the loop. Thus, the maximum number of times that stages 2 and 3 are executed is the lesser of 2 log_2 x and 2 log_2 y. These logarithms are proportional to the lengths of the representations, giving the number of stages executed as O(n). Each stage of E uses only polynomial time, so the total running time is polynomial.
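The stage structure of E and R can be sketched directly; the function names here are illustrative, not from the text.

```python
def euclid_gcd(x, y):
    """Euclidean algorithm E. Every other pass through the loop cuts x
    at least in half, so the loop runs O(log x + log y) times."""
    while y != 0:          # stage 1: repeat until y = 0
        x = x % y          # stage 2: assign x <- x mod y
        x, y = y, x        # stage 3: exchange x and y
    return x               # stage 4: output x

def relprime(x, y):
    """Algorithm R: x and y are relatively prime iff gcd(x, y) = 1."""
    return euclid_gcd(x, y) == 1
```

For example, `euclid_gcd(18, 24)` returns 6, matching the worked example above.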
The final example of a polynomial time algorithm shows that every context-
free language is decidable in polynomial time.
THEOREM 7.16
Every context-free language is a member of P.
PROOF IDEA In Theorem 4.9, we proved that every CFL is decidable. To do so, we gave an algorithm for each CFL that decides it. If that algorithm runs in polynomial time, the current theorem follows as a corollary. Let’s recall that algorithm and find out whether it runs quickly enough.
Let L be a CFL generated by CFG G that is in Chomsky normal form. From Problem 2.26, any derivation of a string w has 2n − 1 steps, where n is the length of w, because G is in Chomsky normal form. The decider for L works by trying all possible derivations with 2n − 1 steps when its input is a string of length n. If any of these is a derivation of w, the decider accepts; if not, it rejects.
A quick analysis of this algorithm shows that it doesn’t run in polynomial time. The number of derivations with k steps may be exponential in k, so this algorithm may require exponential time.
To get a polynomial time algorithm, we introduce a powerful technique called dynamic programming. This technique uses the accumulation of information about smaller subproblems to solve larger problems. We record the solution to any subproblem so that we need to solve it only once. We do so by making a table of all subproblems and entering their solutions systematically as we find them.
In this case, we consider the subproblems of determining whether each variable in G generates each substring of w. The algorithm enters the solution to this subproblem in an n × n table. For i ≤ j, the (i, j)th entry of the table contains the collection of variables that generate the substring w_i w_{i+1} ··· w_j. For i > j, the table entries are unused.
The algorithm fills in the table entries for each substring of w. First it fills in the entries for the substrings of length 1, then those of length 2, and so on.
It uses the entries for the shorter lengths to assist in determining the entries for
the longer lengths.
For example, suppose that the algorithm has already determined which variables generate all substrings up to length k. To determine whether a variable A generates a particular substring of length k + 1, the algorithm splits that substring into two nonempty pieces in the k possible ways. For each split, the algorithm examines each rule A → BC to determine whether B generates the first piece and C generates the second piece, using table entries previously computed. If both B and C generate the respective pieces, A generates the substring and so is added to the associated table entry. The algorithm starts the process with the strings of length 1 by examining the table for the rules A → b.
PROOF The following algorithm D implements the proof idea. Let G be a CFG in Chomsky normal form generating the CFL L. Assume that S is the start variable. (Recall that the empty string is handled specially in a Chomsky normal form grammar. The algorithm handles the special case in which w = ε in stage 1.) Comments appear inside double brackets.
D = “On input w = w_1 ··· w_n:
1. For w = ε, if S → ε is a rule, accept; else, reject. [[ w = ε case ]]
2. For i = 1 to n: [[ examine each substring of length 1 ]]
3.   For each variable A:
4.     Test whether A → b is a rule, where b = w_i.
5.     If so, place A in table(i, i).
6. For l = 2 to n: [[ l is the length of the substring ]]
7.   For i = 1 to n − l + 1: [[ i is the start position of the substring ]]
8.     Let j = i + l − 1. [[ j is the end position of the substring ]]
9.     For k = i to j − 1: [[ k is the split position ]]
10.      For each rule A → BC:
11.        If table(i, k) contains B and table(k + 1, j) contains C, put A in table(i, j).
12. If S is in table(1, n), accept; else, reject.”
Now we analyze D. Each stage is easily implemented to run in polynomial time. Stages 4 and 5 run at most nv times, where v is the number of variables in G and is a fixed constant independent of n; hence these stages run O(n) times. Stage 6 runs at most n times. Each time stage 6 runs, stage 7 runs at most n times. Each time stage 7 runs, stages 8 and 9 run at most n times. Each time stage 9 runs, stage 10 runs r times, where r is the number of rules of G and is another fixed constant. Thus stage 11, the inner loop of the algorithm, runs O(n^3) times. Summing the total shows that D executes O(n^3) stages.
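The table-filling algorithm D can be sketched as follows. The dictionary-based grammar format (a variable maps to a list of bodies, each either a terminal string or a pair of variables) is an assumption of this sketch, not the book's notation.

```python
def cyk_decide(w, rules, start):
    """Dynamic-programming decider D for a CNF grammar.
    table[i][j] holds the variables that generate the substring w[i..j]."""
    n = len(w)
    if n == 0:                               # stage 1: the w = epsilon case
        return "" in rules.get(start, [])
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i in range(n):                       # stages 2-5: substrings of length 1
        for A, bodies in rules.items():
            if w[i] in bodies:               # rule A -> b with b = w[i]
                table[i][i].add(A)
    for l in range(2, n + 1):                # stage 6: substring length
        for i in range(n - l + 1):           # stage 7: start position
            j = i + l - 1                    # stage 8: end position
            for k in range(i, j):            # stage 9: split position
                for A, bodies in rules.items():
                    for body in bodies:      # stages 10-11: rules A -> BC
                        if isinstance(body, tuple):
                            B, C = body
                            if B in table[i][k] and C in table[k + 1][j]:
                                table[i][j].add(A)
    return start in table[0][n - 1]          # stage 12

# Toy CNF grammar: S -> AB, A -> a, B -> b (generates exactly "ab")
rules = {"S": [("A", "B")], "A": ["a"], "B": ["b"]}
```

The three nested position loops plus the constant-size rule loop give the O(n³) stage count derived above.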
7.3
THE CLASS NP
As we observed in Section 7.2, we can avoid brute-force search in many problems
and obtain polynomial time solutions. However, attempts to avoid brute force
in certain other problems, including many interesting and useful ones, haven’t
been successful, and polynomial time algorithms that solve them aren’t known
to exist.
Why have we been unsuccessful in finding polynomial time algorithms for
these problems? We don’t know the answer to this important question. Perhaps
these problems have as yet undiscovered polynomial time algorithms that rest
on unknown principles. Or possibly some of these problems simply cannot be
solved in polynomial time. They may be intrinsically difficult.
One remarkable discovery concerning this question shows that the complexities of many problems are linked. A polynomial time algorithm for one such problem can be used to solve an entire class of problems. To understand this phenomenon, let’s begin with an example.
A Hamiltonian path in a directed graph G is a directed path that goes through each node exactly once. We consider the problem of testing whether a directed graph contains a Hamiltonian path connecting two specified nodes, as shown in the following figure. Let
HAMPATH = {⟨G, s, t⟩ | G is a directed graph with a Hamiltonian path from s to t}.
FIGURE 7.17
A Hamiltonian path goes through every node exactly once
We can easily obtain an exponential time algorithm for the HAMPATH problem by modifying the brute-force algorithm for PATH given in Theorem 7.14. We need only add a check to verify that the potential path is Hamiltonian. No one knows whether HAMPATH is solvable in polynomial time.
The HAMPATH problem has a feature called polynomial verifiability that is
important for understanding its complexity. Even though we don’t know of a fast
(i.e., polynomial time) way to determine whether a graph contains a Hamiltonian
path, if such a path were discovered somehow (perhaps using the exponential
time algorithm), we could easily convince someone else of its existence simply
by presenting it. In other words, verifying the existence of a Hamiltonian path
may be much easier than determining its existence.
Another polynomially verifiable problem is compositeness. Recall that a natural number is composite if it is the product of two integers greater than 1 (i.e., a composite number is one that is not a prime number). Let
COMPOSITES = {x | x = pq, for integers p, q > 1}.
We can easily verify that a number is composite: all that is needed is a divisor of that number. Recently, a polynomial time algorithm for testing whether a number is prime or composite was discovered, but it is considerably more complicated than the preceding method for verifying compositeness.
Some problems may not be polynomially verifiable. For example, take the complement of the HAMPATH problem. Even if we could determine (somehow) that a graph did not have a Hamiltonian path, we don’t know of a way for someone else to verify its nonexistence without using the same exponential time algorithm for making the determination in the first place. A formal definition follows.
DEFINITION 7.18
A verifier for a language A is an algorithm V, where
A = {w | V accepts ⟨w, c⟩ for some string c}.
We measure the time of a verifier only in terms of the length of w, so a polynomial time verifier runs in polynomial time in the length of w. A language A is polynomially verifiable if it has a polynomial time verifier.
A verifier uses additional information, represented by the symbol c in Definition 7.18, to verify that a string w is a member of A. This information is called a certificate, or proof, of membership in A. Observe that for polynomial verifiers, the certificate has polynomial length (in the length of w) because that is all the verifier can access in its time bound. Let’s apply this definition to the languages HAMPATH and COMPOSITES.
For the HAMPATH problem, a certificate for a string ⟨G, s, t⟩ ∈ HAMPATH simply is a Hamiltonian path from s to t. For the COMPOSITES problem, a certificate for the composite number x simply is one of its divisors. In both cases, the verifier can check in polynomial time that the input is in the language when it is given the certificate.
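Both certificate checks are straightforward polynomial time computations. A sketch, assuming nodes numbered 0 through m − 1 and a directed edge list (the function names are hypothetical):

```python
def verify_hampath(num_nodes, edges, s, t, cert):
    """Check a claimed Hamiltonian path from s to t in polynomial time.
    cert is the certificate: a list of nodes."""
    edge_set = set(edges)
    return (len(cert) == num_nodes
            and len(set(cert)) == num_nodes            # each node exactly once
            and cert[0] == s and cert[-1] == t
            and all((a, b) in edge_set                 # consecutive nodes joined
                    for a, b in zip(cert, cert[1:])))

def verify_composite(x, cert):
    """A certificate for COMPOSITES is a nontrivial divisor of x."""
    return 1 < cert < x and x % cert == 0
```

Each check runs in time polynomial in the input length, even though finding such a certificate may be hard.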
DEFINITION 7.19
NP is the class of languages that have polynomial time verifiers.
The class NP is important because it contains many problems of practical interest. From the preceding discussion, both HAMPATH and COMPOSITES are members of NP. As we mentioned, COMPOSITES is also a member of P, which is a subset of NP; but proving this stronger result is much more difficult. The term NP comes from nondeterministic polynomial time and is derived from an alternative characterization by using nondeterministic polynomial time Turing machines. Problems in NP are sometimes called NP-problems.
The following is a nondeterministic Turing machine (NTM) that decides the HAMPATH problem in nondeterministic polynomial time. Recall that in Definition 7.9, we defined the time of a nondeterministic machine to be the time used by the longest computation branch.
N1 = “On input ⟨G, s, t⟩, where G is a directed graph with nodes s and t:
1. Write a list of m numbers, p_1, ..., p_m, where m is the number of nodes in G. Each number in the list is nondeterministically selected to be between 1 and m.
2. Check for repetitions in the list. If any are found, reject.
3. Check whether s = p_1 and t = p_m. If either fail, reject.
4. For each i between 1 and m − 1, check whether (p_i, p_{i+1}) is an edge of G. If any are not, reject. Otherwise, all tests have been passed, so accept.”
To analyze this algorithm and verify that it runs in nondeterministic polynomial time, we examine each of its stages. In stage 1, the nondeterministic selection clearly runs in polynomial time. In stages 2 and 3, each part is a simple check, so together they run in polynomial time. Finally, stage 4 also clearly runs in polynomial time. Thus, this algorithm runs in nondeterministic polynomial time.
THEOREM 7.20
A language is in NP iff it is decided by some nondeterministic polynomial time Turing machine.
PROOF IDEA We show how to convert a polynomial time verifier to an equivalent polynomial time NTM and vice versa. The NTM simulates the verifier by guessing the certificate. The verifier simulates the NTM by using the accepting branch as the certificate.
PROOF For the forward direction of this theorem, let A ∈ NP and show that A is decided by a polynomial time NTM N. Let V be the polynomial time verifier for A that exists by the definition of NP. Assume that V is a TM that runs in time n^k and construct N as follows.
N = “On input w of length n:
1. Nondeterministically select string c of length at most n^k.
2. Run V on input ⟨w, c⟩.
3. If V accepts, accept; otherwise, reject.”
To prove the other direction of the theorem, assume that A is decided by a polynomial time NTM N and construct a polynomial time verifier V as follows.
V = “On input ⟨w, c⟩, where w and c are strings:
1. Simulate N on input w, treating each symbol of c as a description of the nondeterministic choice to make at each step (as in the proof of Theorem 3.16).
2. If this branch of N’s computation accepts, accept; otherwise, reject.”
We define the nondeterministic time complexity class NTIME(t(n)) as analogous to the deterministic time complexity class TIME(t(n)).
DEFINITION 7.21
NTIME(t(n)) = {L | L is a language decided by an O(t(n)) time nondeterministic Turing machine}.
COROLLARY 7.22
NP = ⋃_k NTIME(n^k).
The class NP is insensitive to the choice of reasonable nondeterministic computational model because all such models are polynomially equivalent. When describing and analyzing nondeterministic polynomial time algorithms, we follow the preceding conventions for deterministic polynomial time algorithms. Each stage of a nondeterministic polynomial time algorithm must have an obvious implementation in nondeterministic polynomial time on a reasonable nondeterministic computational model. We analyze the algorithm to show that every branch uses at most polynomially many stages.
EXAMPLES OF PROBLEMS IN NP
A clique in an undirected graph is a subgraph, wherein every two nodes are connected by an edge. A k-clique is a clique that contains k nodes. Figure 7.23 illustrates a graph with a 5-clique.
FIGURE 7.23
A graph with a 5-clique
The clique problem is to determine whether a graph contains a clique of a
specified size. Let
CLIQUE = {⟨G, k⟩ | G is an undirected graph with a k-clique}.
THEOREM 7.24
CLIQUE is in NP.
PROOF IDEA The clique is the certificate.
PROOF The following is a verifier V for CLIQUE.
V = “On input ⟨⟨G, k⟩, c⟩:
1. Test whether c is a subgraph with k nodes in G.
2. Test whether G contains all edges connecting nodes in c.
3. If both pass, accept; otherwise, reject.”
ALTERNATIVE PROOF If you prefer to think of NP in terms of nondeterministic polynomial time Turing machines, you may prove this theorem by giving one that decides CLIQUE. Observe the similarity between the two proofs.
N = “On input ⟨G, k⟩, where G is a graph:
1. Nondeterministically select a subset c of k nodes of G.
2. Test whether G contains all edges connecting nodes in c.
3. If yes, accept; otherwise, reject.”
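The verifier for CLIQUE amounts to checking every pair of nodes in the certificate. A minimal sketch, assuming an undirected edge list (the function name is hypothetical):

```python
from itertools import combinations

def verify_clique(edges, k, cert):
    """Verifier V for CLIQUE: cert is a claimed set of k nodes."""
    if len(set(cert)) != k:
        return False                           # stage 1: c has k distinct nodes
    edge_set = {frozenset(e) for e in edges}   # undirected: ignore edge order
    return all(frozenset((u, v)) in edge_set   # stage 2: every pair is an edge
               for u, v in combinations(cert, 2))

# A triangle is a 3-clique:
triangle = [(0, 1), (1, 2), (0, 2)]
```

Checking all k(k − 1)/2 pairs against the edge set takes time polynomial in the input size.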
Next, we consider the SUBSET-SUM problem concerning integer arithmetic. We are given a collection of numbers x_1, ..., x_k and a target number t. We want to determine whether the collection contains a subcollection that adds up to t.
Thus,
SUBSET-SUM = {⟨S, t⟩ | S = {x_1, ..., x_k}, and for some {y_1, ..., y_l} ⊆ {x_1, ..., x_k}, we have Σ y_i = t}.
For example, ⟨{4, 11, 16, 21, 27}, 25⟩ ∈ SUBSET-SUM because 4 + 21 = 25. Note that {x_1, ..., x_k} and {y_1, ..., y_l} are considered to be multisets and so allow repetition of elements.
THEOREM 7.25
SUBSET-SUM is in NP.
PROOF IDEA The subset is the certificate.
PROOF The following is a verifier V for SUBSET-SUM.
V = “On input ⟨⟨S, t⟩, c⟩:
1. Test whether c is a collection of numbers that sum to t.
2. Test whether S contains all the numbers in c.
3. If both pass, accept; otherwise, reject.”
ALTERNATIVE PROOF We can also prove this theorem by giving a nondeterministic polynomial time Turing machine for SUBSET-SUM as follows.
N = “On input ⟨S, t⟩:
1. Nondeterministically select a subset c of the numbers in S.
2. Test whether c is a collection of numbers that sum to t.
3. If the test passes, accept; otherwise, reject.”
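Because S and the certificate are multisets, the containment test in stage 2 must respect multiplicities. A sketch using counted multisets (the function name is hypothetical):

```python
from collections import Counter

def verify_subset_sum(S, t, cert):
    """Verifier V for SUBSET-SUM. S and cert are lists treated as
    multisets; Counter subtraction checks sub-multiset containment."""
    sums_to_t = sum(cert) == t                    # stage 1: c sums to t
    contained = not (Counter(cert) - Counter(S))  # stage 2: c is inside S
    return sums_to_t and contained                # stage 3
```

For the example above, the certificate [4, 21] witnesses ⟨{4, 11, 16, 21, 27}, 25⟩ ∈ SUBSET-SUM.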
Observe that the complements of these sets, the complement of CLIQUE and the complement of SUBSET-SUM, are not obviously members of NP. Verifying that something is not present seems to be more difficult than verifying that it is present. We make a separate complexity class, called coNP, which contains the languages that are complements of languages in NP. We don’t know whether coNP is different from NP.
THE P VERSUS NP QUESTION
As we have been saying, NP is the class of languages that are solvable in polynomial time on a nondeterministic Turing machine; or, equivalently, it is the class of languages whereby membership in the language can be verified in polynomial
time. P is the class of languages where membership can be tested in polynomial time. We summarize this information as follows, where we loosely refer to polynomial time solvable as solvable “quickly.”
P = the class of languages for which membership can be decided quickly.
NP = the class of languages for which membership can be verified quickly.
We have presented examples of languages, such as HAMPATH and CLIQUE, that are members of NP but that are not known to be in P. The power of polynomial verifiability seems to be much greater than that of polynomial decidability. But, hard as it may be to imagine, P and NP could be equal. We are unable to prove the existence of a single language in NP that is not in P.
The question of whether P = NP is one of the greatest unsolved problems in theoretical computer science and contemporary mathematics. If these classes were equal, any polynomially verifiable problem would be polynomially decidable. Most researchers believe that the two classes are not equal because people have invested enormous effort to find polynomial time algorithms for certain problems in NP, without success. Researchers also have tried proving that the classes are unequal, but that would entail showing that no fast algorithm exists to replace brute-force search. Doing so is presently beyond scientific reach. The following figure shows the two possibilities.
FIGURE 7.26
One of these two possibilities is correct
The best deterministic method currently known for deciding languages in NP uses exponential time. In other words, we can prove that
NP ⊆ EXPTIME = ⋃_k TIME(2^(n^k)),
but we don’t know whether NP is contained in a smaller deterministic time complexity class.
7.4
NP-COMPLETENESS
One important advance on the P versus NP question came in the early 1970s with the work of Stephen Cook and Leonid Levin. They discovered certain problems in NP whose individual complexity is related to that of the entire class. If a polynomial time algorithm exists for any of these problems, all problems in NP would be polynomial time solvable. These problems are called NP-complete. The phenomenon of NP-completeness is important for both theoretical and practical reasons.
On the theoretical side, a researcher trying to show that P is unequal to NP may focus on an NP-complete problem. If any problem in NP requires more than polynomial time, an NP-complete one does. Furthermore, a researcher attempting to prove that P equals NP only needs to find a polynomial time algorithm for an NP-complete problem to achieve this goal.
On the practical side, the phenomenon of NP-completeness may prevent wasting time searching for a nonexistent polynomial time algorithm to solve a particular problem. Even though we may not have the necessary mathematics to prove that the problem is unsolvable in polynomial time, we believe that P is unequal to NP. So proving that a problem is NP-complete is strong evidence of its nonpolynomiality.
The first NP-complete problem that we present is called the satisfiability problem. Recall that variables that can take on the values TRUE and FALSE are called Boolean variables (see Section 0.2). Usually, we represent TRUE by 1 and FALSE by 0. The Boolean operations AND, OR, and NOT, represented by the symbols ∧, ∨, and ¬, respectively, are described in the following list. We use the overbar as a shorthand for the ¬ symbol, so x̄ means ¬x.
0 ∧ 0 = 0   0 ∨ 0 = 0   ¬0 = 1
0 ∧ 1 = 0   0 ∨ 1 = 1   ¬1 = 0
1 ∧ 0 = 0   1 ∨ 0 = 1
1 ∧ 1 = 1   1 ∨ 1 = 1
A Boolean formula is an expression involving Boolean variables and operations. For example,
φ = (x̄ ∧ y) ∨ (x ∧ z̄)
is a Boolean formula. A Boolean formula is satisfiable if some assignment of 0s and 1s to the variables makes the formula evaluate to 1. The preceding formula is satisfiable because the assignment x = 0, y = 1, and z = 0 makes φ evaluate to 1. We say the assignment satisfies φ. The satisfiability problem is to test whether a Boolean formula is satisfiable. Let
SAT = {⟨φ⟩ | φ is a satisfiable Boolean formula}.
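Testing satisfiability by brute force means trying all 2^n assignments; this exponential search is exactly what a polynomial time algorithm for SAT would avoid. A sketch, representing a formula as a Python function over 0/1 values (this encoding is an assumption of the illustration):

```python
from itertools import product

def is_satisfiable(formula, variables):
    """Brute-force SAT test: try all 2^n assignments of 0s and 1s."""
    return any(formula(*bits)
               for bits in product((0, 1), repeat=len(variables)))

# The formula phi = (NOT x AND y) OR (x AND NOT z) from the text,
# satisfied by x = 0, y = 1, z = 0:
phi = lambda x, y, z: ((1 - x) & y) | (x & (1 - z))
```

A contradictory formula such as x ∧ ¬x has no satisfying assignment, so the search fails on all branches.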
Now we state a theorem that links the complexity of the SAT problem to the
complexities of all problems in NP.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

THEOREM 7.27
SAT ∈ P iff P = NP.
Next, we develop the method that is central to the proof of this theorem.
POLYNOMIAL TIME REDUCIBILITY
In Chapter 5, we defined the concept of reducing one problem to another. When
problem A reduces to problem B, a solution to B can be used to solve A. Now
we define a version of reducibility that takes the efficiency of computation into
account. When problem A is efficiently reducible to problem B, an efficient
solution to B can be used to solve A efficiently.
DEFINITION 7.28
A function f : Σ* → Σ* is a polynomial time computable function
if some polynomial time Turing machine M exists that halts with
just f(w) on its tape, when started on any input w.
DEFINITION 7.29
Language A is polynomial time mapping reducible,1 or simply poly-
nomial time reducible, to language B, written A ≤P B, if a polyno-
mial time computable function f : Σ* → Σ* exists, where for every
w,

w ∈ A ⟺ f(w) ∈ B.

The function f is called the polynomial time reduction of A to B.
Polynomial time reducibility is the efficient analog to mapping reducibility
as defined in Section 5.3. Other forms of efficient reducibility are available, but
polynomial time reducibility is a simple form that is adequate for our purposes,
so we won't discuss the others here. Figure 7.30 illustrates polynomial time
reducibility.
1 It is called polynomial time many–one reducibility in some other textbooks.
FIGURE 7.30
Polynomial time function f reducing A to B
As with an ordinary mapping reduction, a polynomial time reduction of A to
B provides a way to convert membership testing in A to membership testing in
B—but now the conversion is done efficiently. To test whether w ∈ A, we use
the reduction f to map w to f(w) and test whether f(w) ∈ B.
If one language is polynomial time reducible to a language already known to
have a polynomial time solution, we obtain a polynomial time solution to the
original language, as in the following theorem.
THEOREM 7.31
If A ≤P B and B ∈ P, then A ∈ P.
PROOF   Let M be the polynomial time algorithm deciding B and f be the
polynomial time reduction from A to B. We describe a polynomial time algo-
rithm N deciding A as follows.

N = “On input w:
1. Compute f(w).
2. Run M on input f(w) and output whatever M outputs.”

We have w ∈ A whenever f(w) ∈ B because f is a reduction from A to B.
Thus, M accepts f(w) whenever w ∈ A. Moreover, N runs in polynomial time
because each of its two stages runs in polynomial time. Note that stage 2 runs in
polynomial time because the composition of two polynomials is a polynomial.
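The two-stage algorithm N is short enough to mirror directly in code. Below is a minimal Python sketch; the particular languages (even-length strings for A, even numbers for B) and the reduction f = len are toy examples of our own choosing, not from the text, used only to make the composition concrete.

```python
def decide_A(w, f, decide_B):
    """Theorem 7.31 as code: to decide whether w is in A,
    compute f(w) and run the decider for B on the result."""
    return decide_B(f(w))

# Toy instance (our own illustration): A = even-length strings,
# B = even numbers, and f = len is a trivial reduction from A to B.
is_even = lambda n: n % 2 == 0
print(decide_A("abab", len, is_even))  # True: "abab" has even length
```

If f and decide_B each run in polynomial time, so does decide_A, since the composition of two polynomials is a polynomial.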
Before demonstrating a polynomial time reduction, we introduce 3SAT, a
special case of the satisfiability problem whereby all formulas are in a special
form. A literal is a Boolean variable or a negated Boolean variable, as in x or
x̄. A clause is several literals connected with ∨s, as in (x1 ∨ x̄2 ∨ x̄3 ∨ x4). A Boolean
formula is in conjunctive normal form, called a cnf-formula, if it comprises
several clauses connected with ∧s, as in

(x1 ∨ x̄2 ∨ x̄3 ∨ x4) ∧ (x3 ∨ x̄5 ∨ x6) ∧ (x3 ∨ x̄6).

It is a 3cnf-formula if all the clauses have three literals, as in

(x1 ∨ x̄2 ∨ x̄3) ∧ (x3 ∨ x̄5 ∨ x6) ∧ (x3 ∨ x̄6 ∨ x4) ∧ (x4 ∨ x5 ∨ x6).

Let 3SAT = {⟨φ⟩ | φ is a satisfiable 3cnf-formula}. If an assignment satisfies a
cnf-formula, each clause must contain at least one literal that evaluates to 1.
The following theorem presents a polynomial time reduction from the 3SAT
problem to the CLIQUE problem.
THEOREM 7.32
3SAT is polynomial time reducible to CLIQUE.

PROOF IDEA   The polynomial time reduction f that we demonstrate from
3SAT to CLIQUE converts formulas to graphs. In the constructed graphs,
cliques of a specified size correspond to satisfying assignments of the formula.
Structures within the graph are designed to mimic the behavior of the variables
and clauses.
PROOF   Let φ be a formula with k clauses such as

φ = (a1 ∨ b1 ∨ c1) ∧ (a2 ∨ b2 ∨ c2) ∧ ··· ∧ (ak ∨ bk ∨ ck).

The reduction f generates the string ⟨G, k⟩, where G is an undirected graph
defined as follows.

The nodes in G are organized into k groups of three nodes each called the
triples, t1, ..., tk. Each triple corresponds to one of the clauses in φ, and each
node in a triple corresponds to a literal in the associated clause. Label each node
of G with its corresponding literal in φ.

The edges of G connect all but two types of pairs of nodes in G. No edge
is present between nodes in the same triple, and no edge is present between
two nodes with contradictory labels, as in x2 and x̄2. Figure 7.33 illustrates this
construction when φ = (x1 ∨ x1 ∨ x2) ∧ (x̄1 ∨ x̄2 ∨ x̄2) ∧ (x̄1 ∨ x2 ∨ x2).
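The construction of G can be sketched in Python. The encoding of literals as signed integers (k for the variable xk, −k for its negation) is our own convention for illustration, not from the text; the graph is returned as explicit node and edge sets.

```python
def clauses_to_graph(clauses):
    """Build the graph of Theorem 7.32 from a list of 3-literal clauses.

    Literals are nonzero ints: k means x_k, -k means its negation.
    Nodes are (triple_index, slot) pairs labeled by their literal.
    Edges connect every pair of nodes EXCEPT pairs within one triple
    and pairs whose labels are contradictory (k vs. -k).
    """
    nodes = {(i, j): lit
             for i, clause in enumerate(clauses)
             for j, lit in enumerate(clause)}
    edges = {(u, v) for u in nodes for v in nodes
             if u < v                        # count each pair once
             and u[0] != v[0]                # not in the same triple
             and nodes[u] != -nodes[v]}      # labels not contradictory
    return nodes, edges

# phi from Figure 7.33: (x1 v x1 v x2)(~x1 v ~x2 v ~x2)(~x1 v x2 v x2)
nodes, edges = clauses_to_graph([(1, 1, 2), (-1, -2, -2), (-1, 2, 2)])
print(len(nodes), len(edges))  # 9 nodes, 17 edges
```

A k-clique in the resulting graph then picks one true literal per clause, exactly as the correctness argument below describes.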
FIGURE 7.33
The graph that the reduction produces from
φ = (x1 ∨ x1 ∨ x2) ∧ (x̄1 ∨ x̄2 ∨ x̄2) ∧ (x̄1 ∨ x2 ∨ x2)
Now we demonstrate why this construction works. We show that φ is satisfi-
able iff G has a k-clique.

Suppose that φ has a satisfying assignment. In that satisfying assignment, at
least one literal is true in every clause. In each triple of G, we select one node
corresponding to a true literal in the satisfying assignment. If more than one
literal is true in a particular clause, we choose one of the true literals arbitrarily.
The nodes just selected form a k-clique. The number of nodes selected is k
because we chose one for each of the k triples. Each pair of selected nodes is
joined by an edge because no pair fits one of the exceptions described previously.
They could not be from the same triple because we selected only one node per
triple. They could not have contradictory labels because the associated literals
were both true in the satisfying assignment. Therefore, G contains a k-clique.

Suppose that G has a k-clique. No two of the clique’s nodes occur in the same
triple because nodes in the same triple aren’t connected by edges. Therefore,
each of the k triples contains exactly one of the k clique nodes. We assign truth
values to the variables of φ so that each literal labeling a clique node is made
true. Doing so is always possible because two nodes labeled in a contradictory
way are not connected by an edge and hence both can’t be in the clique. This
assignment to the variables satisfies φ because each triple contains a clique node
and hence each clause contains a literal that is assigned TRUE. Therefore, φ is
satisfiable.
Theorems 7.31 and 7.32 tell us that if CLIQUE is solvable in polynomial time,
so is 3SAT. At first glance, this connection between these two problems appears
quite remarkable because, superficially, they are rather different. But polynomial
time reducibility allows us to link their complexities. Now we turn to a definition
that will allow us similarly to link the complexities of an entire class of problems.
DEFINITION OF NP-COMPLETENESS
DEFINITION 7.34
A language B is NP-complete if it satisfies two conditions:
1. B is in NP, and
2. every A in NP is polynomial time reducible to B.
THEOREM 7.35
If B is NP-complete and B ∈ P, then P = NP.
PROOF This theorem follows directly from the definition of polynomial time
reducibility.
THEOREM 7.36
If B is NP-complete and B ≤P C for C in NP, then C is NP-complete.
PROOF   We already know that C is in NP, so we must show that every A in
NP is polynomial time reducible to C. Because B is NP-complete, every lan-
guage in NP is polynomial time reducible to B, and B in turn is polynomial
time reducible to C. Polynomial time reductions compose; that is, if A is poly-
nomial time reducible to B and B is polynomial time reducible to C, then A
is polynomial time reducible to C. Hence every language in NP is polynomial
time reducible to C.
THE COOK–LEVIN THEOREM
Once we have one NP-complete problem, we may obtain others by polynomial
time reduction from it. However, establishing the first NP-complete problem is
more difficult. Now we do so by proving that SAT is NP-complete.
THEOREM 7.37
SAT is NP-complete.2
This theorem implies Theorem 7.27.
2 An alternative proof of this theorem appears in Section 9.3.
PROOF IDEA   Showing that SAT is in NP is easy, and we do so shortly. The
hard part of the proof is showing that any language in NP is polynomial time
reducible to SAT.
To do so, we construct a polynomial time reduction for each language A in NP
to SAT. The reduction for A takes a string w and produces a Boolean formula φ
that simulates the NP machine for A on input w. If the machine accepts, φ has
a satisfying assignment that corresponds to the accepting computation. If the
machine doesn’t accept, no assignment satisfies φ. Therefore, w is in A if and
only if φ is satisfiable.

Actually constructing the reduction to work in this way is a conceptually
simple task, though we must cope with many details. A Boolean formula may
contain the Boolean operations AND, OR, and NOT, and these operations form
the basis for the circuitry used in electronic computers. Hence the fact that we
can design a Boolean formula to simulate a Turing machine isn’t surprising. The
details are in the implementation of this idea.
PROOF   First, we show that SAT is in NP. A nondeterministic polynomial
time machine can guess an assignment to a given formula φ and accept if the
assignment satisfies φ.

Next, we take any language A in NP and show that A is polynomial time
reducible to SAT. Let N be a nondeterministic Turing machine that decides A
in nᵏ time for some constant k. (For convenience, we actually assume that N
runs in time nᵏ − 3; but only those readers interested in details should worry
about this minor point.) The following notion helps to describe the reduction.

A tableau for N on w is an nᵏ × nᵏ table whose rows are the configurations of
a branch of the computation of N on input w, as shown in the following figure.
FIGURE 7.38
A tableau is an nᵏ × nᵏ table of configurations
For convenience later, we assume that each configuration starts and ends with
a # symbol. Therefore, the first and last columns of a tableau are all #s. The first
row of the tableau is the starting configuration of N on w, and each row follows
the previous one according to N’s transition function. A tableau is accepting if
any row of the tableau is an accepting configuration.

Every accepting tableau for N on w corresponds to an accepting computation
branch of N on w. Thus, the problem of determining whether N accepts w is
equivalent to the problem of determining whether an accepting tableau for N
on w exists.
Now we get to the description of the polynomial time reduction f from A to
SAT. On input w, the reduction produces a formula φ. We begin by describing
the variables of φ. Say that Q and Γ are the state set and tape alphabet of N,
respectively. Let C = Q ∪ Γ ∪ {#}. For each i and j between 1 and nᵏ and for
each s in C, we have a variable, xi,j,s.

Each of the (nᵏ)² entries of a tableau is called a cell. The cell in row i and
column j is called cell[i, j] and contains a symbol from C. We represent the
contents of the cells with the variables of φ. If xi,j,s takes on the value 1, it
means that cell[i, j] contains an s.
Now we design φ so that a satisfying assignment to the variables does corre-
spond to an accepting tableau for N on w. The formula φ is the AND of four
parts: φcell ∧ φstart ∧ φmove ∧ φaccept. We describe each part in turn.
As we mentioned previously, turning variable xi,j,s on corresponds to placing
symbol s in cell[i, j]. The first thing we must guarantee in order to obtain a cor-
respondence between an assignment and a tableau is that the assignment turns
on exactly one variable for each cell. Formula φcell ensures this requirement by
expressing it in terms of Boolean operations:

φcell = ⋀_{1≤i,j≤nᵏ} [ ( ⋁_{s∈C} xi,j,s ) ∧ ( ⋀_{s,t∈C, s≠t} ( x̄i,j,s ∨ x̄i,j,t ) ) ].
The symbols ⋀ and ⋁ stand for iterated AND and OR. For example, the
expression in the preceding formula

⋁_{s∈C} xi,j,s

is shorthand for

xi,j,s1 ∨ xi,j,s2 ∨ ··· ∨ xi,j,sl

where C = {s1, s2, ..., sl}. Hence φcell is actually a large expression that con-
tains a fragment for each cell in the tableau because i and j range from 1 to nᵏ.
The first part of each fragment says that at least one variable is turned on in the
corresponding cell. The second part of each fragment says that no more than
one variable is turned on (literally, it says that in each pair of variables, at least
one is turned off) in the corresponding cell. These fragments are connected by
∧ operations.
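As a sketch of how these fragments could be generated mechanically, the following Python function emits the φcell clauses for a single cell. The tuple encoding of variables and literals is our own convention for illustration, not from the text.

```python
from itertools import combinations

def cell_clauses(i, j, symbols):
    """Emit the phi_cell clauses for one cell [i, j].

    A variable x_{i,j,s} is the tuple (i, j, s); a literal is
    (variable, flag) with flag True for positive, False for negated.
    One big OR clause says at least one variable is on; one clause
    per pair of symbols says the two variables aren't both on.
    """
    at_least_one = [tuple(((i, j, s), True) for s in symbols)]
    at_most_one = [(((i, j, s), False), ((i, j, t), False))
                   for s, t in combinations(symbols, 2)]
    return at_least_one + at_most_one

clauses = cell_clauses(1, 1, ["a", "b", "#"])
print(len(clauses))  # 1 + C(3,2) = 4 clauses for a 3-symbol alphabet
```

With l symbols in C, each cell contributes a fixed number of clauses, 1 + l(l−1)/2, which is why φcell has size O(n²ᵏ) overall.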
The first part of φcell inside the brackets stipulates that at least one variable
that is associated with each cell is on, whereas the second part stipulates that no
more than one variable is on for each cell. Any assignment to the variables that
satisfies φ (and therefore φcell) must have exactly one variable on for every cell.
Thus, any satisfying assignment specifies one symbol in each cell of the table.
Parts φstart, φmove, and φaccept ensure that these symbols actually correspond to an
accepting tableau as follows.
Formula φstart ensures that the first row of the table is the starting configu-
ration of N on w by explicitly stipulating that the corresponding variables are
on:

φstart = x1,1,# ∧ x1,2,q0 ∧
         x1,3,w1 ∧ x1,4,w2 ∧ ··· ∧ x1,n+2,wn ∧
         x1,n+3,␣ ∧ ··· ∧ x1,nᵏ−1,␣ ∧ x1,nᵏ,#.
Formula φaccept guarantees that an accepting configuration occurs in the
tableau. It ensures that qaccept, the symbol for the accept state, appears in one
of the cells of the tableau by stipulating that one of the corresponding variables
is on:

φaccept = ⋁_{1≤i,j≤nᵏ} xi,j,qaccept.
Finally, formula φmove guarantees that each row of the tableau corresponds to
a configuration that legally follows the preceding row’s configuration according
to N’s rules. It does so by ensuring that each 2×3 window of cells is legal.
We say that a 2×3 window is legal if that window does not violate the actions
specified by N’s transition function. In other words, a window is legal if it might
appear when one configuration correctly follows another.3
For example, say that a, b, and c are members of the tape alphabet, and q1
and q2 are states of N. Assume that when in state q1 with the head reading an a,
N writes a b, stays in state q1, and moves right; and that when in state q1 with
the head reading a b, N nondeterministically either

1. writes a c, enters q2, and moves to the left, or
2. writes an a, enters q2, and moves to the right.

Expressed formally, δ(q1, a) = {(q1, b, R)} and δ(q1, b) = {(q2, c, L), (q2, a, R)}.
Examples of legal windows for this machine are shown in Figure 7.39.
3 We could give a precise definition of legal window here, in terms of the transition func-
tion. But doing so is quite tedious and would distract us from the main thrust of the
proof argument. Anyone desiring more precision should refer to the related analysis in
the proof of Theorem 5.15, the undecidability of the Post Correspondence Problem.
(a) a q1 b / q2 a c    (b) a q1 b / a a q2    (c) a a q1 / a a b
(d) # b a / # b a      (e) a b a / a b q2     (f) b b b / c b b

FIGURE 7.39
Examples of legal windows (each shown as top row / bottom row)
In Figure 7.39, windows (a) and (b) are legal because the transition function
allows N to move in the indicated way. Window (c) is legal because, with q1
appearing on the right side of the top row, we don’t know what symbol the head
is over. That symbol could be an a, and q1 might change it to a b and move to the
right. That possibility would give rise to this window, so it doesn’t violate N’s
rules. Window (d) is obviously legal because the top and bottom are identical,
which would occur if the head weren’t adjacent to the location of the window.
Note that # may appear on the left or right of both the top and bottom rows
in a legal window. Window (e) is legal because state q1 reading a b might have
been immediately to the right of the top row, and it would then have moved to
the left in state q2 to appear on the right-hand end of the bottom row. Finally,
window (f) is legal because state q1 might have been immediately to the left of
the top row, and it might have changed the b to a c and moved to the left.
The windows shown in the following figure aren’t legal for machine N.

(a) a b a / a a a    (b) a q1 b / q2 a a    (c) b q1 b / q2 b q2

FIGURE 7.40
Examples of illegal windows (each shown as top row / bottom row)
In window (a), the central symbol in the top row can’t change because a state
wasn’t adjacent to it. Window (b) isn’t legal because the transition function spec-
ifies that the b gets changed to a c but not to an a. Window (c) isn’t legal because
two states appear in the bottom row.
CLAIM 7.41
If the top row of the tableau is the start configuration and every window in the
tableau is legal, each row of the tableau is a configuration that legally follows the
preceding one.
We prove this claim by considering any two adjacent configurations in the
tableau, called the upper configuration and the lower configuration. In the up-
per configuration, every cell that contains a tape symbol and isn’t adjacent to
a state symbol is the center top cell in a window whose top row contains no
states. Therefore, that symbol must appear unchanged in the center bottom of
the window. Hence it appears in the same position in the bottom configuration.
The window containing the state symbol in the center top cell guarantees that
the corresponding three positions are updated consistently with the transition
function. Therefore, if the upper configuration is a legal configuration, so is the
lower configuration, and the lower one follows the upper one according to N’s
rules. Note that this proof, though straightforward, depends crucially on our
choice of a 2×3 window size, as Problem 7.41 shows.
Now we return to the construction of φmove. It stipulates that all the windows
in the tableau are legal. Each window contains six cells, which may be set in
a fixed number of ways to yield a legal window. Formula φmove says that the
settings of those six cells must be one of these ways, or

φmove = ⋀_{1≤i<nᵏ, 1<j<nᵏ} ( the (i, j)-window is legal ).

The (i, j)-window has cell[i, j] as the upper central position. We replace the
text “the (i, j)-window is legal” in this formula with the following formula. We
write the contents of the six cells of a window as a1, ..., a6.

⋁_{a1,...,a6 is a legal window} ( xi,j−1,a1 ∧ xi,j,a2 ∧ xi,j+1,a3 ∧ xi+1,j−1,a4 ∧ xi+1,j,a5 ∧ xi+1,j+1,a6 )
Next, we analyze the complexity of the reduction to show that it operates in
polynomial time. To do so, we examine the size of φ. First, we estimate the
number of variables it has. Recall that the tableau is an nᵏ × nᵏ table, so it
contains n²ᵏ cells. Each cell has l variables associated with it, where l is the
number of symbols in C. Because l depends only on the TM N and not on the
length of the input n, the total number of variables is O(n²ᵏ).

We estimate the size of each of the parts of φ. Formula φcell contains a fixed-
size fragment of the formula for each cell of the tableau, so its size is O(n²ᵏ).
Formula φstart has a fragment for each cell in the top row, so its size is O(nᵏ).
Formulas φmove and φaccept each contain a fixed-size fragment of the formula for
each cell of the tableau, so their size is O(n²ᵏ). Thus, φ’s total size is O(n²ᵏ).

That bound is sufficient for our purposes because it shows that the size of φ
is polynomial in n. If it were more than polynomial, the reduction wouldn’t
have any chance of generating it in polynomial time. (Actually, our estimates are
low by a factor of O(log n) because each variable has indices that can range up
to nᵏ and so may require O(log n) symbols to write into the formula, but this
additional factor doesn’t change the polynomiality of the result.)
To see that we can generate the formula in polynomial time, observe its highly
repetitive nature. Each component of the formula is composed of many nearly
identical fragments, which differ only at the indices in a simple way. Therefore,
we may easily construct a reduction that produces φ in polynomial time from the
input w.
Thus, we have concluded the proof of the Cook–Levin theorem, showing
that SAT is NP-complete. Showing the NP-completeness of other languages
generally doesn’t require such a lengthy proof. Instead, NP-completeness can be
proved with a polynomial time reduction from a language that is already known
to be NP-complete. We can use SAT for this purpose; but using 3SAT, the
special case of SAT that we defined on page 302, is usually easier. Recall that
the formulas in 3SAT are in conjunctive normal form (cnf) with three literals
per clause. First, we must show that 3SAT itself is NP-complete. We prove this
assertion as a corollary to Theorem 7.37.
COROLLARY 7.42
3SAT is NP-complete.
PROOF   Obviously 3SAT is in NP, so we only need to prove that all languages
in NP reduce to 3SAT in polynomial time. One way to do so is by showing
that SAT polynomial time reduces to 3SAT. Instead, we modify the proof of
Theorem 7.37 so that it directly produces a formula in conjunctive normal form
with three literals per clause.
Theorem 7.37 produces a formula that is already almost in conjunctive nor-
mal form. Formula φcell is a big AND of subformulas, each of which contains a
big OR and a big AND of ORs. Thus, φcell is an AND of clauses and so is already
in cnf. Formula φstart is a big AND of variables. Taking each of these variables
to be a clause of size 1, we see that φstart is in cnf. Formula φaccept is a big OR
of variables and is thus a single clause. Formula φmove is the only one that isn’t
already in cnf, but we may easily convert it into a formula that is in cnf as follows.

Recall that φmove is a big AND of subformulas, each of which is an OR of ANDs
that describes all possible legal windows. The distributive laws, as described in
Chapter 0, state that we can replace an OR of ANDs with an equivalent AND of
ORs. Doing so may significantly increase the size of each subformula, but it can
only increase the total size of φmove by a constant factor because the size of each
subformula depends only on N. The result is a formula that is in conjunctive
normal form.
Now that we have written the formula in cnf, we convert it to one with three
literals per clause. In each clause that currently has one or two literals, we repli-
cate one of the literals until the total number is three. In each clause that has
more than three literals, we split it into several clauses and add additional vari-
ables to preserve the satisfiability or nonsatisfiability of the original.

For example, we replace clause (a1 ∨ a2 ∨ a3 ∨ a4), wherein each ai is a literal,
with the two-clause expression (a1 ∨ a2 ∨ z) ∧ (z̄ ∨ a3 ∨ a4), wherein z is a new
variable. If some setting of the ai’s satisfies the original clause, we can find some
setting of z so that the two new clauses are satisfied, and vice versa. In general, if
the clause contains l literals,

(a1 ∨ a2 ∨ ··· ∨ al),

we can replace it with the l − 2 clauses

(a1 ∨ a2 ∨ z1) ∧ (z̄1 ∨ a3 ∨ z2) ∧ (z̄2 ∨ a4 ∨ z3) ∧ ··· ∧ (z̄l−3 ∨ al−1 ∨ al).
We may easily verify that the new formula is satisfiable iff the original formula
was, so the proof is complete.
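This splitting rule can be sketched in Python. Literals are encoded as nonzero integers, our own convention for illustration (k for a variable, −k for its negation, with fresh variables zi numbered above every variable already in use); clauses with fewer than three literals are padded by replicating a literal, as the proof describes.

```python
def split_clause(literals):
    """Convert one cnf clause into equivalent 3-literal clauses.

    A clause with l > 3 literals becomes l - 2 clauses chained by
    fresh variables z1, z2, ..., preserving (non)satisfiability.
    Shorter clauses are padded by repeating a literal.
    """
    l = len(literals)
    if l <= 3:
        padded = list(literals) + [literals[0]] * (3 - l)
        return [tuple(padded)]
    z = max(abs(x) for x in literals)  # fresh variables start at z + 1
    clauses = [(literals[0], literals[1], z + 1)]
    for m in range(2, l - 2):          # middle clauses (~z_prev, a, z_next)
        clauses.append((-(z + m - 1), literals[m], z + m))
    clauses.append((-(z + l - 3), literals[l - 2], literals[l - 1]))
    return clauses

print(split_clause([1, 2, 3, 4]))  # [(1, 2, 5), (-5, 3, 4)]
```

The four-literal case reproduces the text’s example: (a1 ∨ a2 ∨ z) ∧ (z̄ ∨ a3 ∨ a4).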
7.5
ADDITIONAL NP-COMPLETE PROBLEMS
The phenomenon of NP-completeness is widespread. NP-complete problems
appear in many fields. For reasons that are not well understood, most naturally
occurring NP-problems are known either to be in P or to be NP-complete. If
you seek a polynomial time algorithm for a new NP-problem, spending part of
your effort attempting to prove it NP-complete is sensible because doing so may
prevent you from working to find a polynomial time algorithm that doesn’t exist.
In this section, we present additional theorems showing that various lan-
guages are NP-complete. These theorems provide examples of the techniques
that are used in proofs of this kind. Our general strategy is to exhibit a polyno-
mial time reduction from 3SAT to the language in question, though we some-
times reduce from other NP-complete languages when that is more convenient.
When constructing a polynomial time reduction from 3SAT to a language, we
look for structures in that language that can simulate the variables and clauses in
Boolean formulas. Such structures are sometimes called gadgets. For example,
in the reduction from 3SAT to CLIQUE presented in Theorem 7.32, individual
nodes simulate variables and triples of nodes simulate clauses. An individual
node may or may not be a member of the clique, corresponding to a variable
that may or may not be true in a satisfying assignment. Each clause must contain
a literal that is assigned TRUE. Correspondingly, each triple must contain a
node in the clique (in order to reach the target size). The following corollary to
Theorem 7.32 states that CLIQUE is NP-complete.
COROLLARY 7.43
CLIQUE is NP-complete.
312 CHAPTER 7 / TIME COMPLEXITY
THE VERTEX COVER PROBLEM
If G is an undirected graph, a vertex cover of G is a subset of the nodes where
every edge of G touches one of those nodes. The vertex cover problem asks
whether a graph contains a vertex cover of a specified size:

VERTEX-COVER = {⟨G, k⟩ | G is an undirected graph that has a k-node vertex cover}.
THEOREM 7.44
VERTEX-COVER is NP-complete.
PROOF IDEA To show that VERTEX-COVER is NP-complete, we must
show that it is in NP and that all NP-problems are polynomial time reducible
to it. The first part is easy; a certificate is simply a vertex cover of size k.

To prove the second part, we show that 3SAT is polynomial time reducible to
VERTEX-COVER. The reduction converts a 3cnf-formula φ into a graph G and
a number k, so that φ is satisfiable whenever G has a vertex cover with k nodes.
The conversion is done without knowing whether φ is satisfiable. In effect, G
simulates φ. The graph contains gadgets that mimic the variables and clauses of
the formula. Designing these gadgets requires a bit of ingenuity.

For the variable gadget, we look for a structure in G that can participate in
the vertex cover in either of two possible ways, corresponding to the two possible
truth assignments to the variable. The variable gadget contains two nodes
connected by an edge. That structure works because one of these nodes must
appear in the vertex cover. We arbitrarily associate TRUE and FALSE with these
two nodes.

For the clause gadget, we look for a structure that induces the vertex cover to
include nodes in the variable gadgets corresponding to at least one true literal in
the clause. The gadget contains three nodes and additional edges so that any vertex
cover must include at least two of the nodes, or possibly all three. Only two
nodes would be required if one of the variable gadget nodes helps by covering
an edge, as would happen if the associated literal satisfies that clause. Otherwise,
three nodes would be required. Finally, we choose k so that the sought-after
vertex cover has one node per variable gadget and two nodes per clause gadget.
PROOF Here are the details of a reduction from 3SAT to VERTEX-COVER
that operates in polynomial time. The reduction maps a Boolean formula φ to a
graph G and a value k. For each variable x in φ, we produce an edge connecting
two nodes. We label the two nodes in this gadget x and ¬x. Setting x to be TRUE
corresponds to selecting the node labeled x for the vertex cover, whereas FALSE
corresponds to the node labeled ¬x.
The gadgets for the clauses are a bit more complex. Each clause gadget is a
triple of nodes that are labeled with the three literals of the clause. These three
nodes are connected to each other and to the nodes in the variable gadgets that
have the identical labels. Thus, the total number of nodes that appear in G is
2m + 3l, where φ has m variables and l clauses. Let k be m + 2l.

For example, if φ = (x1 ∨ x1 ∨ x2) ∧ (¬x1 ∨ ¬x2 ∨ ¬x2) ∧ (¬x1 ∨ x2 ∨ x2), the
reduction produces ⟨G, k⟩ from φ, where k = 8 and G takes the form shown in
the following figure.

FIGURE 7.45
The graph that the reduction produces from
φ = (x1 ∨ x1 ∨ x2) ∧ (¬x1 ∨ ¬x2 ∨ ¬x2) ∧ (¬x1 ∨ x2 ∨ x2)
To prove that this reduction works, we need to show that φ is satisfiable if and
only if G has a vertex cover with k nodes. We start with a satisfying assignment.
We first put the nodes of the variable gadgets that correspond to the true literals
in the assignment into the vertex cover. Then, we select one true literal in every
clause and put the remaining two nodes from every clause gadget into the vertex
cover. Now we have a total of k nodes. They cover all edges because every variable
gadget edge is clearly covered, all three edges within every clause gadget are
covered, and all edges between variable and clause gadgets are covered. Hence
G has a vertex cover with k nodes.

Second, if G has a vertex cover with k nodes, we show that φ is satisfiable
by constructing the satisfying assignment. The vertex cover must contain one
node in each variable gadget and two in every clause gadget in order to cover the
edges of the variable gadgets and the three edges within the clause gadgets. That
accounts for all the nodes, so none are left over. We take the nodes of the variable
gadgets that are in the vertex cover and assign TRUE to the corresponding
literals. That assignment satisfies φ because each of the three edges connecting
the variable gadgets with each clause gadget is covered, and only two nodes of
the clause gadget are in the vertex cover. Therefore, one of the edges must be
covered by a node from a variable gadget and so that assignment satisfies the
corresponding clause.
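As a sanity check on this construction, the reduction can be sketched in a few lines of Python (our illustration; the node names and input encoding are our own choices, with clauses given as DIMACS-style integer literals). It builds the edge set of G and verifies the running example by brute force:

```python
from itertools import combinations

def sat3_to_vertex_cover(clauses, num_vars):
    """Build (edges, k) from a 3cnf formula per Theorem 7.44:
    one edge per variable, one triangle per clause, plus edges
    joining clause-gadget nodes to identically labeled variable
    nodes. Nodes are ('v', literal) or ('c', clause, position)."""
    edges = set()
    for x in range(1, num_vars + 1):              # variable gadgets
        edges.add((('v', x), ('v', -x)))
    for j, clause in enumerate(clauses):          # clause gadgets
        triple = [('c', j, p) for p in range(3)]
        for a, b in combinations(triple, 2):      # the triangle
            edges.add((a, b))
        for p, lit in enumerate(clause):          # link identical labels
            edges.add((('c', j, p), ('v', lit)))
    return edges, num_vars + 2 * len(clauses)

def has_cover(edges, k):
    """Brute force: does some set of k nodes touch every edge?"""
    nodes = {n for e in edges for n in e}
    return any(all(a in c or b in c for a, b in edges)
               for c in map(set, combinations(nodes, k)))

# phi = (x1 v x1 v x2) ^ (~x1 v ~x2 v ~x2) ^ (~x1 v x2 v x2) is
# satisfiable, so G (2m + 3l = 13 nodes) has a cover of size k = 8.
edges, k = sat3_to_vertex_cover([[1, 1, 2], [-1, -2, -2], [-1, 2, 2]], 2)
```

The brute-force check is exponential, of course; it is only there to confirm the gadget counting on tiny instances.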
THE HAMILTONIAN PATH PROBLEM
Recall that the Hamiltonian path problem asks whether the input graph contains
a path from s to t that goes through every node exactly once.
THEOREM 7.46
HAMPATH isNP-complete.
PROOF IDEA We showed that HAMPATH is in NP in Section 7.3. To show
that every NP-problem is polynomial time reducible to HAMPATH, we show
that 3SAT is polynomial time reducible to HAMPATH. We give a way to convert
3cnf-formulas to graphs in which Hamiltonian paths correspond to satisfying
assignments of the formula. The graphs contain gadgets that mimic variables
and clauses. The variable gadget is a diamond structure that can be traversed in
either of two ways, corresponding to the two truth settings. The clause gadget
is a node. Ensuring that the path goes through each clause gadget corresponds
to ensuring that each clause is satisfied in the satisfying assignment.
PROOF We previously demonstrated that HAMPATH is in NP, so all that
remains to be done is to show 3SAT ≤P HAMPATH. For each 3cnf-formula φ,
we show how to construct a directed graph G with two nodes, s and t, where a
Hamiltonian path exists between s and t iff φ is satisfiable.

We start the construction with a 3cnf-formula φ containing k clauses,

φ = (a1 ∨ b1 ∨ c1) ∧ (a2 ∨ b2 ∨ c2) ∧ ··· ∧ (ak ∨ bk ∨ ck),

where each a, b, and c is a literal xi or ¬xi. Let x1, ..., xl be the l variables of φ.

Now we show how to convert φ to a graph G. The graph G that we construct
has various parts to represent the variables and clauses that appear in φ.

We represent each variable xi with a diamond-shaped structure that contains
a horizontal row of nodes, as shown in the following figure. Later we specify the
number of nodes that appear in the horizontal row.

FIGURE 7.47
Representing the variable xi as a diamond structure
We represent each clause of φ as a single node, as follows.

FIGURE 7.48
Representing the clause cj as a node

The following figure depicts the global structure of G. It shows all the elements
of G and their relationships, except the edges that represent the relationship
of the variables to the clauses that contain them.

FIGURE 7.49
The high-level structure of G
Next, we show how to connect the diamonds representing the variables to the
nodes representing the clauses. Each diamond structure contains a horizontal
row of nodes connected by edges running in both directions. The horizontal
row contains 3k + 1 nodes in addition to the two nodes on the ends belonging to
the diamond. These nodes are grouped into adjacent pairs, one for each clause,
with extra separator nodes next to the pairs, as shown in the following figure.

FIGURE 7.50
The horizontal nodes in a diamond structure

If variable xi appears in clause cj, we add the following two edges from the
jth pair in the ith diamond to the jth clause node.

FIGURE 7.51
The additional edges when clause cj contains xi

If ¬xi appears in clause cj, we add two edges from the jth pair in the ith
diamond to the jth clause node, as shown in Figure 7.52.

After we add all the edges corresponding to each occurrence of xi or ¬xi in
each clause, the construction of G is complete. To show that this construction
works, we argue that if φ is satisfiable, a Hamiltonian path exists from s to t; and,
conversely, if such a path exists, φ is satisfiable.
FIGURE 7.52
The additional edges when clause cj contains ¬xi

Suppose that φ is satisfiable. To demonstrate a Hamiltonian path from s to
t, we first ignore the clause nodes. The path begins at s, goes through each
diamond in turn, and ends up at t. To hit the horizontal nodes in a diamond,
the path either zig-zags from left to right or zag-zigs from right to left; the
satisfying assignment to φ determines which. If xi is assigned TRUE, the path
zig-zags through the corresponding diamond. If xi is assigned FALSE, the path
zag-zigs. We show both possibilities in the following figure.

FIGURE 7.53
Zig-zagging and zag-zigging through a diamond, as determined by the
satisfying assignment

So far, this path covers all the nodes in G except the clause nodes. We can
easily include them by adding detours at the horizontal nodes. In each clause,
we select one of the literals assigned TRUE by the satisfying assignment.

If we selected xi in clause cj, we can detour at the jth pair in the ith diamond.
Doing so is possible because xi must be TRUE, so the path zig-zags from left to
right through the corresponding diamond. Hence the edges to the cj node are
in the correct order to allow a detour and return.

Similarly, if we selected ¬xi in clause cj, we can detour at the jth pair in the
ith diamond because xi must be FALSE, so the path zag-zigs from right to left
through the corresponding diamond. Hence the edges to the cj node again are
in the correct order to allow a detour and return. (Note that each true literal in a
clause provides an option of a detour to hit the clause node. As a result, if several
literals in a clause are true, only one detour is taken.) Thus, we have constructed
the desired Hamiltonian path.
For the reverse direction, if G has a Hamiltonian path from s to t, we demonstrate
a satisfying assignment for φ. If the Hamiltonian path is normal, that is,
it goes through the diamonds in order from the top one to the bottom one
(except for the detours to the clause nodes), we can easily obtain the satisfying
assignment. If the path zig-zags through the diamond, we assign the corresponding
variable TRUE; and if it zag-zigs, we assign FALSE. Because each clause node
appears on the path, by observing how the detour to it is taken, we may determine
which of the literals in the corresponding clause is TRUE.

All that remains to be shown is that a Hamiltonian path must be normal.
Normality may fail only if the path enters a clause from one diamond but returns
to another, as in the following figure.

FIGURE 7.54
This situation cannot occur
The path goes from node a1 to c; but instead of returning to a2 in the same
diamond, it returns to b2 in a different diamond. If that occurs, either a2 or a3
must be a separator node. If a2 were a separator node, the only edges entering
a2 would be from a1 and a3. If a3 were a separator node, a1 and a2 would be
in the same clause pair, and hence the only edges entering a2 would be from a1,
a3, and c. In either case, the path could not contain node a2. The path cannot
enter a2 from c or a1 because the path goes elsewhere from these nodes. The
path cannot enter a2 from a3 because a3 is the only available node that a2 points
at, so the path must exit a2 via a3. Hence a Hamiltonian path must be normal.
This reduction obviously operates in polynomial time and the proof is complete.
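Although the reduction itself is what matters here, a brute-force HAMPATH decider is handy for checking small reduction outputs. A sketch (ours, exponential time by design):

```python
from itertools import permutations

def ham_path(nodes, edges, s, t):
    """Brute-force HAMPATH decider: is there a directed path from s
    to t visiting every node exactly once? Useful only for
    sanity-checking tiny graphs."""
    e = set(edges)
    inner = [u for u in nodes if u not in (s, t)]
    for order in permutations(inner):
        path = (s,) + order + (t,)
        if all((a, b) in e for a, b in zip(path, path[1:])):
            return True
    return False

# A 4-node example: s -> a -> b -> t is Hamiltonian, but dropping
# the edge (a, b) leaves no Hamiltonian path from s to t.
nodes = ['s', 'a', 'b', 't']
edges = [('s', 'a'), ('a', 'b'), ('b', 't'), ('s', 'b')]
```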
Next, we consider an undirected version of the Hamiltonian path problem,
called UHAMPATH. To show that UHAMPATH is NP-complete, we give a
polynomial time reduction from the directed version of the problem.
THEOREM 7.55
UHAMPATH is NP-complete.
PROOF The reduction takes a directed graph G with nodes s and t, and constructs
an undirected graph G′ with nodes s′ and t′. Graph G has a Hamiltonian
path from s to t iff G′ has a Hamiltonian path from s′ to t′. We describe G′ as
follows.

Each node u of G, except for s and t, is replaced by a triple of nodes u^in, u^mid,
and u^out in G′. Nodes s and t in G are replaced by nodes s^out = s′ and t^in = t′ in
G′. Edges of two types appear in G′. First, edges connect u^mid with u^in and u^out.
Second, an edge connects u^out with v^in if an edge goes from u to v in G. That
completes the construction of G′.
We can demonstrate that this construction works by showing that G has a
Hamiltonian path from s to t iff G′ has a Hamiltonian path from s^out to t^in. To
show one direction, we observe that a Hamiltonian path P in G,

s, u1, u2, ..., uk, t,

has a corresponding Hamiltonian path P′ in G′,

s^out, u1^in, u1^mid, u1^out, u2^in, u2^mid, u2^out, ..., t^in.

To show the other direction, we claim that any Hamiltonian path in G′ from
s^out to t^in must go from a triple of nodes to a triple of nodes, except for the start
and finish, as does the path P′ we just described. That would complete the proof
because any such path has a corresponding Hamiltonian path in G. We prove
the claim by following the path starting at node s^out. Observe that the next node
in the path must be ui^in for some i because only those nodes are connected to s^out.
The next node must be ui^mid because no other way is available to include ui^mid in
the Hamiltonian path. After ui^mid comes ui^out because that is the only other node
to which ui^mid is connected. The next node must be uj^in for some j because no
other available node is connected to ui^out. The argument then repeats until t^in is
reached.
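The node-splitting construction is simple enough to write out directly. Here is a sketch in Python (our own encoding of the triples; graphs are given as directed edge lists):

```python
def directed_to_undirected_ham(nodes, edges, s, t):
    """Split each node u (other than s, t) into u_in, u_mid, u_out,
    as in Theorem 7.55. Returns the undirected edge set of G' plus
    the endpoints s' = s_out and t' = t_in."""
    def into(u):   # the node that converted edges enter
        return (s, 'out') if u == s else (u, 'in')
    def outof(u):  # the node that converted edges leave
        return (t, 'in') if u == t else (u, 'out')
    uedges = set()
    for u in nodes:
        if u not in (s, t):  # internal triple: in -- mid -- out
            uedges.add(frozenset([(u, 'in'), (u, 'mid')]))
            uedges.add(frozenset([(u, 'mid'), (u, 'out')]))
    for u, v in edges:       # u -> v becomes u_out -- v_in
        uedges.add(frozenset([outof(u), into(v)]))
    return uedges, (s, 'out'), (t, 'in')

# The directed path s -> u -> t becomes the undirected path
# s_out -- u_in -- u_mid -- u_out -- t_in.
uedges, s2, t2 = directed_to_undirected_ham(
    ['s', 'u', 't'], [('s', 'u'), ('u', 't')], 's', 't')
```

Each undirected edge is stored as a frozenset, so orientation is genuinely forgotten, mirroring the proof.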
THE SUBSET SUM PROBLEM
Recall the SUBSET-SUM problem defined on page 297. In that problem, we
were given a collection of numbers x1, ..., xk together with a target number t,
and were to determine whether the collection contains a subcollection that adds
up to t. We now show that this problem is NP-complete.
THEOREM 7.56
SUBSET-SUM is NP-complete.
PROOF IDEA We have already shown that SUBSET-SUM is in NP in Theorem
7.25. We prove that all languages in NP are polynomial time reducible
to SUBSET-SUM by reducing the NP-complete language 3SAT to it. Given
a 3cnf-formula φ, we construct an instance of the SUBSET-SUM problem that
contains a subcollection summing to the target t if and only if φ is satisfiable.
Call this subcollection T.

To achieve this reduction, we find structures of the SUBSET-SUM problem
that represent variables and clauses. The SUBSET-SUM problem instance that
we construct contains numbers of large magnitude presented in decimal notation.
We represent variables by pairs of numbers and clauses by certain positions
in the decimal representations of the numbers.

We represent variable xi by two numbers, yi and zi. We prove that either yi
or zi must be in T for each i, which establishes the encoding for the truth value
of xi in the satisfying assignment.

Each clause position contains a certain value in the target t, which imposes a
requirement on the subset T. We prove that this requirement is the same as the
one in the corresponding clause, namely, that one of the literals in that clause
is assigned TRUE.
PROOF We already know that SUBSET-SUM ∈ NP, so we now show that
3SAT ≤P SUBSET-SUM.

Let φ be a Boolean formula with variables x1, ..., xl and clauses c1, ..., ck.
The reduction converts φ to an instance of the SUBSET-SUM problem ⟨S, t⟩,
wherein the elements of S and the number t are the rows in the table in Figure
7.57, expressed in ordinary decimal notation. The rows above the double
line are labeled

y1, z1, y2, z2, ..., yl, zl and g1, h1, g2, h2, ..., gk, hk

and constitute the elements of S. The row below the double line is t.

Thus, S contains one pair of numbers, yi, zi, for each variable xi in φ. The
decimal representation of these numbers is in two parts, as indicated in the table.
The left-hand part comprises a 1 followed by l−i 0s. The right-hand part
contains one digit for each clause, where the digit of yi in column cj is 1 if clause
cj contains literal xi, and the digit of zi in column cj is 1 if clause cj contains
literal ¬xi. Digits not specified to be 1 are 0.

The table is partially filled in to illustrate sample clauses, c1, c2, and ck:

(x1 ∨ ¬x2 ∨ x3) ∧ (x2 ∨ x3 ∨ ···) ∧ ··· ∧ (¬x3 ∨ ··· ∨ ···).

Additionally, S contains one pair of numbers, gj, hj, for each clause cj. These
two numbers are equal and consist of a 1 followed by k−j 0s.
Finally, the target number t, the bottom row of the table, consists of l 1s
followed by k 3s.
        1 2 3 4 ··· l | c1 c2 ··· ck
  y1    1 0 0 0 ··· 0 |  1  0 ···  0
  z1    1 0 0 0 ··· 0 |  0  0 ···  0
  y2      1 0 0 ··· 0 |  0  1 ···  0
  z2      1 0 0 ··· 0 |  1  0 ···  0
  y3        1 0 ··· 0 |  1  1 ···  0
  z3        1 0 ··· 0 |  0  0 ···  1
  ...
  yl                1 |  0  0 ···  0
  zl                1 |  0  0 ···  0
  g1                  |  1  0 ···  0
  h1                  |  1  0 ···  0
  g2                  |     1 ···  0
  h2                  |     1 ···  0
  ...
  gk                  |             1
  hk                  |             1
  --------------------+--------------
  t     1 1 1 1 ··· 1 |  3  3 ···  3

FIGURE 7.57
Reducing 3SAT to SUBSET-SUM
Next, we show why this construction works. We demonstrate that φ is satisfiable
iff some subset of S sums to t.

Suppose that φ is satisfiable. We construct a subset of S as follows. We select
yi if xi is assigned TRUE in the satisfying assignment, and zi if xi is assigned
FALSE. If we add up what we have selected so far, we obtain a 1 in each of the
first l digits because we have selected either yi or zi for each i. Furthermore, each
of the last k digits is a number between 1 and 3 because each clause is satisfied
and so contains between 1 and 3 true literals. We additionally select enough of
the g and h numbers to bring each of the last k digits up to 3, thus hitting the
target.

Suppose that a subset of S sums to t. We construct a satisfying assignment
to φ after making several observations. First, all the digits in members of S are
either 0 or 1. Furthermore, each column in the table describing S contains at
most five 1s. Hence a "carry" into the next column never occurs when a subset
of S is added. To get a 1 in each of the first l columns, the subset must have
either yi or zi for each i, but not both.
Now we make the satisfying assignment. If the subset contains yi, we assign
xi TRUE; otherwise, we assign it FALSE. This assignment must satisfy φ because
in each of the final k columns, the sum is always 3. In column cj, at most 2 can
come from gj and hj, so at least 1 in this column must come from some yi or
zi in the subset. If it is yi, then xi appears in cj and is assigned TRUE, so cj
is satisfied. If it is zi, then ¬xi appears in cj and xi is assigned FALSE, so cj is
satisfied. Therefore, φ is satisfied.

Finally, we must be sure that the reduction can be carried out in polynomial
time. The table has a size of roughly (k+l)^2, and each entry can be easily
calculated for any φ. So the total time is O(n^2) easy stages.
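The whole construction fits in a short function. The sketch below (our encoding, not code from the text; clauses are DIMACS-style integer literals) builds S and t for a formula and confirms on a tiny satisfiable example that some subset hits the target:

```python
def sat3_to_subset_sum(clauses, num_vars):
    """Build (S, t) from a cnf formula per Theorem 7.56. Each number
    has num_vars variable digits followed by len(clauses) clause
    digits, read in decimal. Literal -v negates variable v."""
    l, k = num_vars, len(clauses)
    def digits_to_int(ds):
        return int(''.join(map(str, ds)))
    S = []
    for i in range(1, l + 1):
        for sign in (+1, -1):                  # y_i, then z_i
            ds = [0] * (l + k)
            ds[i - 1] = 1                      # the 1 in variable column i
            for j, clause in enumerate(clauses):
                if sign * i in clause:         # literal occurs in c_j
                    ds[l + j] = 1
            S.append(digits_to_int(ds))
    for j in range(k):                         # g_j and h_j (equal)
        ds = [0] * (l + k)
        ds[l + j] = 1
        S += [digits_to_int(ds)] * 2
    t = digits_to_int([1] * l + [3] * k)
    return S, t

def subset_sums_to(S, t):
    """Brute force over all reachable sums (fine for tiny instances)."""
    sums = {0}
    for x in S:
        sums |= {s + x for s in sums}
    return t in sums

# phi = (x1 v x2 v ~x2) is satisfiable, so some subset must hit t.
S, t = sat3_to_subset_sum([[1, 2, -2]], num_vars=2)
```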
EXERCISES
7.1 Answer each part TRUE or FALSE.
a. 2n = O(n).
b. n^2 = O(n).
ᴬc. n^2 = O(n log^2 n).
ᴬd. n log n = O(n^2).
e. 3^n = 2^O(n).
f. 2^(2^n) = O(2^(2^n)).
7.2 Answer each part TRUE or FALSE.
a. n = o(2n).
b. 2n = o(n^2).
ᴬc. 2^n = o(3^n).
ᴬd. 1 = o(n).
e. n = o(log n).
f. 1 = o(1/n).
7.3 Which of the following pairs of numbers are relatively prime? Show the
calculations that led to your conclusions.
a. 1274 and 10505
b. 7289 and 8029
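The requested calculations come from the Euclidean algorithm. A small sketch (ours) that records each division step along the way:

```python
def euclid(a, b):
    """Euclidean algorithm: returns gcd(a, b) together with the list
    of division steps (a, b, a mod b), which are exactly the
    calculations the exercise asks to be shown."""
    steps = []
    while b:
        steps.append((a, b, a % b))  # a = q*b + r
        a, b = b, a % b
    return a, steps

g1, steps1 = euclid(10505, 1274)
g2, steps2 = euclid(8029, 7289)
```

Two numbers are relatively prime exactly when the returned gcd is 1.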
7.4 Fill out the table described in the polynomial time algorithm for context-free
language recognition from Theorem 7.16 for string w = baba and CFG G:
S→RT
R→TR|a
T→TR|b
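The table in question is the dynamic-programming table of the CYK algorithm from Theorem 7.16. A generic sketch (ours; the grammar is encoded by hand in Chomsky-normal-form style) fills cell (i, j) with the variables that generate the substring from position i to j:

```python
def cyk(w, terminal_rules, binary_rules):
    """CYK table: table[(i, j)] = set of variables generating
    w[i..j] (0-indexed, inclusive)."""
    n = len(w)
    table = {}
    for i, ch in enumerate(w):                  # length-1 substrings
        table[(i, i)] = {A for A, a in terminal_rules if a == ch}
    for span in range(2, n + 1):                # longer substrings
        for i in range(n - span + 1):
            j = i + span - 1
            cell = set()
            for m in range(i, j):               # split point
                for A, (B, C) in binary_rules:
                    if B in table[(i, m)] and C in table[(m + 1, j)]:
                        cell.add(A)
            table[(i, j)] = cell
    return table

# Exercise 7.4's grammar: S -> RT, R -> TR | a, T -> TR | b.
terminals = [('R', 'a'), ('T', 'b')]
binaries = [('S', ('R', 'T')), ('R', ('T', 'R')), ('T', ('T', 'R'))]
table = cyk('baba', terminals, binaries)
```

The string is in the language iff the start variable S appears in the cell for the whole string.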
7.5 Is the following formula satisfiable?

(x ∨ y) ∧ (x ∨ ¬y) ∧ (¬x ∨ y) ∧ (¬x ∨ ¬y)
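Formulas this small can be settled by exhaustive search. A generic brute-force satisfiability checker (ours, with DIMACS-style integer literals):

```python
from itertools import product

def satisfiable(clauses, num_vars):
    """Try all 2^n assignments; a clause is a list of DIMACS-style
    literals, where -v means variable v negated."""
    for bits in product([False, True], repeat=num_vars):
        def val(lit):
            return bits[abs(lit) - 1] ^ (lit < 0)
        if all(any(val(l) for l in clause) for clause in clauses):
            return True
    return False

# Exercise 7.5's formula, with x as variable 1 and y as variable 2.
phi = [[1, 2], [1, -2], [-1, 2], [-1, -2]]
```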
7.6 Show that P is closed under union, concatenation, and complement.

7.7 Show that NP is closed under union and concatenation.
7.8 Let CONNECTED = {⟨G⟩ | G is a connected undirected graph}. Analyze the
algorithm given on page 185 to show that this language is in P.

7.9 A triangle in an undirected graph is a 3-clique. Show that TRIANGLE ∈ P, where
TRIANGLE = {⟨G⟩ | G contains a triangle}.

7.10 Show that ALL_DFA is in P.
7.11 In both parts, provide an analysis of the time complexity of your algorithm.
a. Show that EQ_DFA ∈ P.
b. Say that a language A is star-closed if A = A*. Give a polynomial time
algorithm to test whether a DFA recognizes a star-closed language. (Note
that EQ_NFA is not known to be in P.)

7.12 Call graphs G and H isomorphic if the nodes of G may be reordered so that it is
identical to H. Let ISO = {⟨G, H⟩ | G and H are isomorphic graphs}. Show that
ISO ∈ NP.
PROBLEMS
7.13 Let

MODEXP = {⟨a, b, c, p⟩ | a, b, c, and p are positive binary integers
such that a^b ≡ c (mod p)}.

Show that MODEXP ∈ P. (Note that the most obvious algorithm doesn't run in
polynomial time. Hint: Try it first where b is a power of 2.)
7.14 A permutation on the set {1, ..., k} is a one-to-one, onto function on this set.
When p is a permutation, p^t means the composition of p with itself t times. Let

PERM-POWER = {⟨p, q, t⟩ | p = q^t where p and q are permutations
on {1, ..., k} and t is a binary integer}.

Show that PERM-POWER ∈ P. (Note that the most obvious algorithm doesn't
run within polynomial time. Hint: First try it where t is a power of 2.)
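The same repeated-squaring idea works here, with modular multiplication replaced by composition of permutations. A sketch (ours; permutations are represented as 0-indexed tuples):

```python
def compose(p, q):
    """Composition (p after q) of permutations given as 0-indexed
    tuples: (p . q)(x) = p(q(x))."""
    return tuple(p[q[x]] for x in range(len(p)))

def perm_power(q, t):
    """q^t by repeated squaring; polynomial in len(q) and log t,
    since powers of the same permutation commute."""
    k = len(q)
    result = tuple(range(k))       # identity permutation
    base = q
    while t > 0:
        if t & 1:
            result = compose(result, base)
        base = compose(base, base)
        t >>= 1
    return result

# A 3-cycle has order 3, so its cube is the identity.
q = (1, 2, 0)
```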
7.15 Show that P is closed under the star operation. (Hint: Use dynamic programming.
On input y = y1···yn for yi ∈ Σ, build a table indicating for each i ≤ j whether
the substring yi···yj ∈ A* for any A ∈ P.)

ᴬ7.16 Show that NP is closed under the star operation.

7.17 Let UNARY-SSUM be the subset sum problem in which all numbers are represented
in unary. Why does the NP-completeness proof for SUBSET-SUM fail to
show UNARY-SSUM is NP-complete? Show that UNARY-SSUM ∈ P.
7.18 Show that if P = NP, then every language A ∈ P, except A = ∅ and A = Σ*, is
NP-complete.
⋆7.19 Show that PRIMES = {m | m is a prime number in binary} ∈ NP. (Hint: For
p > 1, the multiplicative group Z*_p = {x | x is relatively prime to p and 1 ≤ x < p}
is both cyclic and of order p−1 iff p is prime. You may use this fact without
justifying it. The stronger statement PRIMES ∈ P is now known to be true, but it
is more difficult to prove.)
7.20 We generally believe that PATH is not NP-complete. Explain the reason behind
this belief. Show that proving PATH is not NP-complete would prove P ≠ NP.
7.21 Let G represent an undirected graph. Also let

SPATH = {⟨G, a, b, k⟩ | G contains a simple path of length at most k from a to b},

and

LPATH = {⟨G, a, b, k⟩ | G contains a simple path of length at least k from a to b}.

a. Show that SPATH ∈ P.
b. Show that LPATH is NP-complete.
7.22 Let DOUBLE-SAT = {⟨φ⟩ | φ has at least two satisfying assignments}. Show that
DOUBLE-SAT is NP-complete.

ᴬ7.23 Let HALF-CLIQUE = {⟨G⟩ | G is an undirected graph having a complete subgraph
with at least m/2 nodes, where m is the number of nodes in G}. Show that
HALF-CLIQUE is NP-complete.
7.24 Let CNF_k = {⟨φ⟩ | φ is a satisfiable cnf-formula where each variable appears in at
most k places}.
a. Show that CNF_2 ∈ P.
b. Show that CNF_3 is NP-complete.

7.25 Let CNF_H = {⟨φ⟩ | φ is a satisfiable cnf-formula where each clause contains any
number of literals, but at most one negated literal}. Show that CNF_H ∈ P.
7.26 Let φ be a 3cnf-formula. An ≠-assignment to the variables of φ is one where
each clause contains two literals with unequal truth values. In other words, an
≠-assignment satisfies φ without assigning three true literals in any clause.
a. Show that the negation of any ≠-assignment to φ is also an ≠-assignment.
b. Let ≠SAT be the collection of 3cnf-formulas that have an ≠-assignment.
Show that we obtain a polynomial time reduction from 3SAT to ≠SAT by
replacing each clause ci

(y1 ∨ y2 ∨ y3)

with the two clauses

(y1 ∨ y2 ∨ zi) and (¬zi ∨ y3 ∨ b),

where zi is a new variable for each clause ci, and b is a single additional new
variable.
c. Conclude that ≠SAT is NP-complete.
7.27 A cut in an undirected graph is a separation of the vertices V into two disjoint subsets S and T. The size of a cut is the number of edges that have one endpoint in S and the other in T. Let
MAX-CUT = {⟨G, k⟩ | G has a cut of size k or more}.
Show that MAX-CUT is NP-complete. You may assume the result of Problem 7.26. (Hint: Show that ≠SAT ≤P MAX-CUT. The variable gadget for variable x is a collection of 3c nodes labeled with x and another 3c nodes labeled with x̄, where c is the number of clauses. All nodes labeled x are connected with all nodes labeled x̄. The clause gadget is a triangle of three edges connecting three nodes labeled with the literals appearing in the clause. Do not use the same node in more than one clause gadget. Prove that this reduction works.)
7.28 You are given a box and a collection of cards as indicated in the following figure. Because of the pegs in the box and the notches in the cards, each card will fit in the box in either of two ways. Each card contains two columns of holes, some of which may not be punched out. The puzzle is solved by placing all the cards in the box so as to completely cover the bottom of the box (i.e., every hole position is blocked by at least one card that has no hole there). Let PUZZLE = {⟨c1, . . . , ck⟩ | each ci represents a card and this collection of cards has a solution}. Show that PUZZLE is NP-complete.
7.29 A coloring of a graph is an assignment of colors to its nodes so that no two adjacent nodes are assigned the same color. Let
3COLOR = {⟨G⟩ | G is colorable with 3 colors}.
Show that 3COLOR is NP-complete. (Hint: Use the following three subgraphs.)
7.30 Let SET-SPLITTING = {⟨S, C⟩ | S is a finite set and C = {C1, . . . , Ck} is a collection of subsets of S, for some k > 0, such that elements of S can be colored red or blue so that no Ci has all its elements colored with the same color}. Show that SET-SPLITTING is NP-complete.
7.31 Consider the following scheduling problem. You are given a list of final exams F1, . . . , Fk to be scheduled, and a list of students S1, . . . , Sl. Each student is taking some specified subset of these exams. You must schedule these exams into slots so that no student is required to take two exams in the same slot. The problem is to determine if such a schedule exists that uses only h slots. Formulate this problem as a language and show that this language is NP-complete.
7.32 This problem is inspired by the single-player game Minesweeper, generalized to an arbitrary graph. Let G be an undirected graph, where each node either contains a single, hidden mine or is empty. The player chooses nodes, one by one. If the player chooses a node containing a mine, the player loses. If the player chooses an empty node, the player learns the number of neighboring nodes containing mines. (A neighboring node is one connected to the chosen node by an edge.) The player wins if and when all empty nodes have been so chosen.
In the mine consistency problem, you are given a graph G along with numbers labeling some of G's nodes. You must determine whether a placement of mines on the remaining nodes is possible, so that any node v that is labeled m has exactly m neighboring nodes containing mines. Formulate this problem as a language and show that it is NP-complete.
A7.33 In the following solitaire game, you are given an m × m board. On each of its m^2 positions lies either a blue stone, a red stone, or nothing at all. You play by removing stones from the board until each column contains only stones of a single color and each row contains at least one stone. You win if you achieve this objective. Winning may or may not be possible, depending upon the initial configuration. Let SOLITAIRE = {⟨G⟩ | G is a winnable game configuration}. Prove that SOLITAIRE is NP-complete.
7.34 Recall, in our discussion of the Church–Turing thesis, that we introduced the language D = {⟨p⟩ | p is a polynomial in several variables having an integral root}. We stated, but didn't prove, that D is undecidable. In this problem, you are to prove a different property of D: namely, that D is NP-hard. A problem is NP-hard if all problems in NP are polynomial time reducible to it, even though it may not be in NP itself. So you must show that all problems in NP are polynomial time reducible to D.
7.35 A subset of the nodes of a graph G is a dominating set if every other node of G is adjacent to some node in the subset. Let
DOMINATING-SET = {⟨G, k⟩ | G has a dominating set with k nodes}.
Show that it is NP-complete by giving a reduction from VERTEX-COVER.
⋆7.36 Show that the following problem is NP-complete. You are given a set of states Q = {q0, q1, . . . , ql} and a collection of pairs {(s1, r1), . . . , (sk, rk)}, where the si are distinct strings over Σ = {0,1}, and the ri are (not necessarily distinct) members of Q. Determine whether a DFA M = (Q, Σ, δ, q0, F) exists where δ(q0, si) = ri for each i. Here, δ(q, s) is the state that M enters after reading s, starting at state q. (Note that F is irrelevant here.)
7.37 Let U = {⟨M, x, #t⟩ | NTM M accepts x within t steps on at least one branch}. Note that M isn't required to halt on all branches. Show that U is NP-complete.
⋆7.38 Show that if P = NP, a polynomial time algorithm exists that produces a satisfying assignment when given a satisfiable Boolean formula. (Note: The algorithm you are asked to provide computes a function; but NP contains languages, not functions. The P = NP assumption implies that SAT is in P, so testing satisfiability is solvable in polynomial time. But the assumption doesn't say how this test is done, and the test may not reveal satisfying assignments. You must show that you can find them anyway. Hint: Use the satisfiability tester repeatedly to find the assignment bit-by-bit.)
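The bit-by-bit search described in the hint to Problem 7.38 can be sketched in Python. Here `is_satisfiable` is a stand-in for the polynomial time tester that the P = NP assumption would provide (a brute-force version is used so the sketch actually runs, though of course not in polynomial time); the CNF representation as lists of signed variable numbers is an assumption of the sketch.

```python
from itertools import product

def is_satisfiable(clauses, n):
    # Stand-in satisfiability tester over variables 1..n (brute force
    # here; under the P = NP assumption this would run in polynomial time).
    return any(all(any(bits[abs(l) - 1] == (l > 0) for l in c)
                   for c in clauses)
               for bits in product([False, True], repeat=n))

def substitute(clauses, var, val):
    # Fix variable `var` to `val`: drop satisfied clauses, prune literals.
    lit = var if val else -var
    return [[l for l in c if abs(l) != var] for c in clauses if lit not in c]

def find_assignment(clauses, n):
    # Recover a satisfying assignment bit by bit using only the tester.
    if not is_satisfiable(clauses, n):
        return None
    assignment = {}
    for var in range(1, n + 1):
        fixed = substitute(clauses, var, True)
        if is_satisfiable(fixed, n):
            assignment[var], clauses = True, fixed
        else:
            assignment[var], clauses = False, substitute(clauses, var, False)
    return assignment
```

Each of the n rounds makes at most two calls to the tester, so a polynomial time tester yields a polynomial time assignment finder.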
⋆7.39 Show that if P = NP, you can factor integers in polynomial time. (See the note in Problem 7.38.)
A⋆7.40 Show that if P = NP, a polynomial time algorithm exists that takes an undirected graph as input and finds a largest clique contained in that graph. (See the note in Problem 7.38.)
7.41 In the proof of the Cook–Levin theorem, a window is a 2 × 3 rectangle of cells. Show why the proof would have failed if we had used 2 × 2 windows instead.
⋆7.42 Consider the algorithm MINIMIZE, which takes a DFA M as input and outputs DFA M′.
MINIMIZE = "On input ⟨M⟩, where M = (Q, Σ, δ, q0, A) is a DFA:
1. Remove all states of M that are unreachable from the start state.
2. Construct the following undirected graph G whose nodes are the states of M.
3. Place an edge in G connecting every accept state with every nonaccept state. Add additional edges as follows.
4. Repeat until no new edges are added to G:
5.   For every pair of distinct states q and r of M and every a ∈ Σ:
6.     Add the edge (q, r) to G if (δ(q, a), δ(r, a)) is an edge of G.
7. For each state q, let [q] be the collection of states [q] = {r ∈ Q | no edge joins q and r in G}.
8. Form a new DFA M′ = (Q′, Σ, δ′, q0′, A′) where
   Q′ = {[q] | q ∈ Q} (if [q] = [r], only one of them is in Q′),
   δ′([q], a) = [δ(q, a)] for every q ∈ Q and a ∈ Σ,
   q0′ = [q0], and
   A′ = {[q] | q ∈ A}.
9. Output ⟨M′⟩."
a. Show that M and M′ are equivalent.
b. Show that M′ is minimal; that is, no DFA with fewer states recognizes the same language. You may use the result of Problem 1.52 without proof.
c. Show that MINIMIZE operates in polynomial time.
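As a rough illustration (not part of the exercise), the stages of MINIMIZE translate to Python roughly as follows, for a DFA given as a state set, alphabet, transition dictionary, start state, and accept set; all names and the input representation are choices of this sketch.

```python
def minimize(Q, Sigma, delta, q0, A):
    # Stage 1: discard states unreachable from the start state.
    reach, stack = {q0}, [q0]
    while stack:
        q = stack.pop()
        for a in Sigma:
            r = delta[(q, a)]
            if r not in reach:
                reach.add(r)
                stack.append(r)
    Q, A = reach, A & reach
    # Stages 2-3: mark every accept/nonaccept pair as an edge of G.
    edges = {frozenset((q, r)) for q in A for r in Q - A}
    # Stages 4-6: add edge (q, r) whenever some symbol leads to an edge.
    changed = True
    while changed:
        changed = False
        for q in Q:
            for r in Q:
                if q != r and frozenset((q, r)) not in edges and any(
                        frozenset((delta[(q, a)], delta[(r, a)])) in edges
                        for a in Sigma):
                    edges.add(frozenset((q, r)))
                    changed = True
    # Stages 7-8: block [q] collects the states not distinguishable from q.
    block = {q: frozenset(r for r in Q
                          if r == q or frozenset((q, r)) not in edges)
             for q in Q}
    Qp = set(block.values())
    deltap = {(block[q], a): block[delta[(q, a)]] for q in Q for a in Sigma}
    return Qp, deltap, block[q0], {block[q] for q in A}
```

The nested loops run at most |Q|^2 · |Σ| times per round and add at least one edge per round, which is the intuition behind part (c).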
7.43 For a cnf-formula φ with m variables and c clauses, show that you can construct in polynomial time an NFA with O(cm) states that accepts all nonsatisfying assignments, represented as Boolean strings of length m. Conclude that P ≠ NP implies that NFAs cannot be minimized in polynomial time.
⋆7.44 A 2cnf-formula is an AND of clauses, where each clause is an OR of at most two literals. Let 2SAT = {⟨φ⟩ | φ is a satisfiable 2cnf-formula}. Show that 2SAT ∈ P.
7.45 Modify the algorithm for context-free language recognition in the proof of Theorem 7.16 to give a polynomial time algorithm that produces a parse tree for a string, given the string and a CFG, if that grammar generates the string.
7.46 Say that two Boolean formulas are equivalent if they have the same set of variables and are true on the same set of assignments to those variables (i.e., they describe the same Boolean function). A Boolean formula is minimal if no shorter Boolean formula is equivalent to it. Let MIN-FORMULA be the collection of minimal Boolean formulas. Show that if P = NP, then MIN-FORMULA ∈ P.
7.47 The difference hierarchy DiP is defined recursively as
a. D1P = NP and
b. DiP = {A | A = B \ C for B in NP and C in Di−1P}.
(Here B \ C = B ∩ C̄.)
For example, a language in D2P is the difference of two NP languages. Sometimes D2P is called DP (and may be written D^P). Let
Z = {⟨G1, k1, G2, k2⟩ | G1 has a k1-clique and G2 doesn't have a k2-clique}.
Show that Z is complete for DP. In other words, show that Z is in DP and every language in DP is polynomial time reducible to Z.
⋆7.48 Let MAX-CLIQUE = {⟨G, k⟩ | a largest clique in G is of size exactly k}. Use the result of Problem 7.47 to show that MAX-CLIQUE is DP-complete.
⋆7.49 Let f : N → N be any function where f(n) = o(n log n). Show that TIME(f(n)) contains only the regular languages.
⋆7.50 Call a regular expression star-free if it does not contain any star operations. Then, let EQ_SF-REX = {⟨R, S⟩ | R and S are equivalent star-free regular expressions}. Show that EQ_SF-REX is in coNP. Why does your argument fail for general regular expressions?
⋆7.51 This problem investigates resolution, a method for proving the unsatisfiability of cnf-formulas. Let φ = C1 ∧ C2 ∧ · · · ∧ Cm be a formula in cnf, where the Ci are its clauses. Let C = {Ci | Ci is a clause of φ}. In a resolution step, we take two clauses Ca and Cb in C, which both have some variable x occurring positively in one of the clauses and negatively in the other. Thus, Ca = (x ∨ y1 ∨ y2 ∨ · · · ∨ yk) and Cb = (x̄ ∨ z1 ∨ z2 ∨ · · · ∨ zl), where the yi and zi are literals. We form the new clause (y1 ∨ y2 ∨ · · · ∨ yk ∨ z1 ∨ z2 ∨ · · · ∨ zl) and remove repeated literals. Add this new clause to C. Repeat the resolution steps until no additional clauses can be obtained. If the empty clause ( ) is in C, then declare φ unsatisfiable.
Say that resolution is sound if it never declares satisfiable formulas to be unsatisfiable. Say that resolution is complete if all unsatisfiable formulas are declared to be unsatisfiable.
a. Show that resolution is sound and complete.
b. Use part (a) to show that 2SAT ∈ P.
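The resolution procedure described above can be sketched in Python, with clauses represented as sets of signed integers (positive for x, negative for x̄); the representation and function name are choices of this sketch, not part of the problem.

```python
def resolution_refutes(clauses):
    # Saturate the clause set C under resolution steps, then check
    # whether the empty clause was derived (i.e., resolution declares
    # the formula unsatisfiable).
    C = {frozenset(c) for c in clauses}
    changed = True
    while changed:
        changed = False
        for ca in list(C):
            for cb in list(C):
                for lit in ca:
                    if -lit in cb:
                        # Resolve on the variable |lit|; the set union
                        # removes repeated literals automatically.
                        new = (ca - {lit}) | (cb - {-lit})
                        if new not in C:
                            C.add(new)
                            changed = True
    return frozenset() in C
```

Since only finitely many clauses exist over a fixed set of literals, the saturation loop terminates; for 2cnf-formulas every resolvent again has at most two literals, so only O(n^2) clauses can ever arise, which is the idea behind part (b).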
⋆7.52 Show that P is closed under homomorphism iff P = NP.
⋆7.53 Let A ⊆ 1* be any unary language. Show that if A is NP-complete, then P = NP. (Hint: Consider a polynomial time reduction f from SAT to A. For a formula φ, let φ0100 be the reduced formula where variables x1, x2, x3, and x4 in φ are set to the values 0, 1, 0, and 0, respectively. What happens when you apply f to all of these exponentially many reduced formulas?)
7.54 In a directed graph, the indegree of a node is the number of incoming edges and the outdegree is the number of outgoing edges. Show that the following problem is NP-complete. Given an undirected graph G and a designated subset C of G's nodes, is it possible to convert G to a directed graph by assigning directions to each of its edges so that every node in C has indegree 0 or outdegree 0, and every other node in G has indegree at least 1?
SELECTED SOLUTIONS
7.1 (c) FALSE; (d) TRUE.
7.2 (c) TRUE; (d) TRUE.
7.16 Let A ∈ NP. Construct NTM M to decide A* in nondeterministic polynomial time.
M = "On input w:
1. Nondeterministically divide w into pieces w = x1x2 · · · xk.
2. For each xi, nondeterministically guess the certificates that show xi ∈ A.
3. Verify all certificates if possible, then accept. Otherwise, if verification fails, reject."
7.23 We give a polynomial time mapping reduction from CLIQUE to HALF-CLIQUE. The input to the reduction is a pair ⟨G, k⟩, and the reduction produces the graph ⟨H⟩ as output, where H is as follows. If G has m nodes and k = m/2, then H = G.
If k < m/2, then H is the graph obtained from G by adding j nodes, each connected to every one of the original nodes and to each other, where j = m − 2k. Thus, H has m + j = 2m − 2k nodes. Observe that G has a k-clique iff H has a clique of size k + j = m − k, and so ⟨G, k⟩ ∈ CLIQUE iff ⟨H⟩ ∈ HALF-CLIQUE.
If k > m/2, then H is the graph obtained by adding j nodes to G without any additional edges, where j = 2k − m. Thus, H has m + j = 2k nodes, and so G has a k-clique iff H has a clique of size k. Therefore, ⟨G, k⟩ ∈ CLIQUE iff ⟨H⟩ ∈ HALF-CLIQUE. We also need to show HALF-CLIQUE ∈ NP. The certificate is simply the clique.
7.33 First, SOLITAIRE ∈ NP because we can verify that a solution works in polynomial time. Second, we show that 3SAT ≤P SOLITAIRE. Given φ with m variables x1, . . . , xm and k clauses c1, . . . , ck, construct the following k × m game G. We assume that φ has no clauses that contain both xi and x̄i, because such clauses may be removed without affecting satisfiability.
If xi is in clause cj, put a blue stone in row cj, column xi. If x̄i is in clause cj, put a red stone in row cj, column xi. We can make the board square by repeating a row or adding a blank column as necessary without affecting solvability. We show that φ is satisfiable iff G has a solution.
(→) Take a satisfying assignment. If xi is true (false), remove the red (blue) stones from the corresponding column. So stones corresponding to true literals remain. Because every clause has a true literal, every row has a stone.
(←) Take a game solution. If the red (blue) stones were removed from a column, set the corresponding variable true (false). Every row has a stone remaining, so every clause has a true literal. Therefore, φ is satisfied.
7.40 If you assume that P = NP, then CLIQUE ∈ P, and you can test whether G contains a clique of size k in polynomial time, for any value of k. By testing whether G contains a clique of each size, from 1 to the number of nodes in G, you can determine the size t of a maximum clique in G in polynomial time. Once you know t, you can find a clique with t nodes as follows. For each node x of G, remove x and calculate the resulting maximum clique size. If the resulting size decreases, replace x and continue with the next node. If the resulting size is still t, keep x permanently removed and continue with the next node. When you have considered all nodes in this way, the remaining nodes are a t-clique.
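This search can be sketched in Python. Here `has_clique` is a brute-force stand-in for the polynomial time CLIQUE test assumed under P = NP (the names and the adjacency-dictionary representation are choices of this sketch):

```python
from itertools import combinations

def has_clique(adj, nodes, k):
    # Stand-in CLIQUE decision procedure (brute force here; under the
    # P = NP assumption it would run in polynomial time).
    if k <= 0:
        return True
    return any(all(v in adj[u] for u, v in combinations(c, 2))
               for c in combinations(sorted(nodes), k))

def find_max_clique(adj):
    # Determine the maximum clique size t, then remove each node whose
    # removal still leaves a t-clique; the survivors form a t-clique.
    nodes = set(adj)
    t = max(k for k in range(len(nodes) + 1) if has_clique(adj, nodes, k))
    for x in sorted(adj):
        if has_clique(adj, nodes - {x}, t):
            nodes.discard(x)     # x is not needed: keep it removed
    return nodes
```

With a polynomial time decision procedure, the whole search makes only polynomially many oracle calls, matching the argument above.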
8
SPACE COMPLEXITY
In this chapter, we consider the complexity of computational problems in terms
of the amount of space, or memory, that they require. Time and space are two
of the most important considerations when we seek practical solutions to many
computational problems. Space complexity shares many of the features of time
complexity and serves as a further way of classifying problems according to their
computational difficulty.
As we did with time complexity, we need to select a model for measuring the space used by an algorithm. We continue with the Turing machine model for the same reason that we used it to measure time. Turing machines are mathematically simple and close enough to real computers to give meaningful results.
DEFINITION 8.1
Let M be a deterministic Turing machine that halts on all inputs. The space complexity of M is the function f : N → N, where f(n) is the maximum number of tape cells that M scans on any input of length n. If the space complexity of M is f(n), we also say that M runs in space f(n).
If M is a nondeterministic Turing machine wherein all branches halt on all inputs, we define its space complexity f(n) to be the maximum number of tape cells that M scans on any branch of its computation for any input of length n.
We typically estimate the space complexity of Turing machines by using
asymptotic notation.
DEFINITION 8.2
Let f : N → R+ be a function. The space complexity classes, SPACE(f(n)) and NSPACE(f(n)), are defined as follows.
SPACE(f(n)) = {L | L is a language decided by an O(f(n)) space deterministic Turing machine}.
NSPACE(f(n)) = {L | L is a language decided by an O(f(n)) space nondeterministic Turing machine}.
EXAMPLE 8.3
In Chapter 7, we introduced the NP-complete problem SAT. Here, we show that SAT can be solved with a linear space algorithm. We believe that SAT cannot be solved with a polynomial time algorithm, much less with a linear time algorithm, because SAT is NP-complete. Space appears to be more powerful than time because space can be reused, whereas time cannot.
M1 = "On input ⟨φ⟩, where φ is a Boolean formula:
1. For each truth assignment to the variables x1, . . . , xm of φ:
2.   Evaluate φ on that truth assignment.
3. If φ ever evaluated to 1, accept; if not, reject."
Machine M1 clearly runs in linear space because each iteration of the loop can reuse the same portion of the tape. The machine needs to store only the current truth assignment, and that can be done with O(m) space. The number of variables m is at most n, the length of the input, so this machine runs in space O(n).
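For concreteness, M1's strategy can be sketched in Python for CNF input. The representation (clauses as lists of signed 1-indexed variable numbers) is a choice of this sketch; the key point mirrors the analysis above: each loop iteration reuses the same O(m)-sized assignment, while the running time is exponential.

```python
from itertools import product

def brute_force_sat(clauses, m):
    # Sketch of machine M1: try every truth assignment in turn,
    # reusing the same O(m) storage for the current assignment.
    for bits in product([False, True], repeat=m):
        # Stage 2: evaluate the formula on the current assignment.
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True      # stage 3: some assignment evaluated to 1
    return False             # no assignment worked: reject
```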
EXAMPLE 8.4
Here, we illustrate the nondeterministic space complexity of a language. In the next section, we show how determining the nondeterministic space complexity of a language can be useful in determining its deterministic space complexity. Consider
the problem of testing whether a nondeterministic finite automaton accepts all strings. Let
ALL_NFA = {⟨A⟩ | A is an NFA and L(A) = Σ*}.
We give a nondeterministic linear space algorithm that decides the complement of this language. The idea behind this algorithm is to use nondeterminism to guess a string that is rejected by the NFA, and to use linear space to keep track of which states the NFA could be in at a particular time. Note that this language is not known to be in NP or in coNP.
N = "On input ⟨M⟩, where M is an NFA:
1. Place a marker on the start state of the NFA.
2. Repeat 2^q times, where q is the number of states of M:
3.   Nondeterministically select an input symbol and change the positions of the markers on M's states to simulate reading that symbol.
4. Accept if stages 2 and 3 reveal some string that M rejects; that is, if at some point none of the markers lie on accept states of M. Otherwise, reject."
If M rejects any strings, it must reject one of length at most 2^q because in any longer string that is rejected, the locations of the markers described in the preceding algorithm would repeat. The section of the string between the repetitions can be removed to obtain a shorter rejected string. Hence N decides the complement of ALL_NFA. (Note that N accepts improperly formed inputs, too.)
The only space needed by this algorithm is for storing the location of the markers and the repeat loop counter, and that can be done with linear space. Hence the algorithm runs in nondeterministic space O(n). Next, we prove a theorem that provides information about the deterministic space complexity of ALL_NFA.
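A deterministic Python sketch of the marker simulation may help. It replaces N's nondeterministic symbol guessing with an explicit search over all reachable marker sets, so unlike N it can use exponential space in the worst case; the point of this example is precisely that nondeterminism gets by with linear space. The NFA representation (a transition dictionary, no ε-transitions) is an assumption of the sketch.

```python
from collections import deque

def rejects_some_string(states, alphabet, delta, start, accepts):
    # An NFA rejects some string iff a marker set containing no accept
    # state is reachable from the initial marker set {start}.
    first = frozenset([start])
    seen, queue = {first}, deque([first])
    while queue:
        markers = queue.popleft()
        if not (markers & accepts):
            return True      # the string read so far is rejected
        for a in alphabet:
            # Move the markers to simulate reading symbol a.
            nxt = frozenset(r for q in markers for r in delta.get((q, a), ()))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False             # every reachable marker set meets an accept state
```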
8.1
SAVITCH’S THEOREM
Savitch’s theorem is one of the earliest results concerning space complexity. It
shows that deterministic machines can simulate nondeterministic machines by
using a surprisingly small amount of space. For time complexity, such a simu-
lation seems to require an exponential increase in time. For space complexity,
Savitch’s theorem shows that any nondeterministic TMthat uses f(n)space can
be converted to a deterministic TMthat uses only f2(n)space.
THEOREM 8.5
Savitch's theorem  For any function f : N → R+, where f(n) ≥ n,
NSPACE(f(n)) ⊆ SPACE(f^2(n)).
PROOF IDEA We need to simulate an f(n) space NTM deterministically. A naive approach is to proceed by trying all the branches of the NTM's computation, one by one. The simulation needs to keep track of which branch it is currently trying so that it is able to go on to the next one. But a branch that uses f(n) space may run for 2^(O(f(n))) steps, and each step may be a nondeterministic choice. Exploring the branches sequentially would require recording all the choices used on a particular branch in order to be able to find the next branch. Therefore, this approach may use 2^(O(f(n))) space, exceeding our goal of O(f^2(n)) space.
Instead, we take a different approach by considering the following more general problem. We are given two configurations of the NTM, c1 and c2, together with a number t, and we test whether the NTM can get from c1 to c2 within t steps using only f(n) space. We call this problem the yieldability problem. By solving the yieldability problem, where c1 is the start configuration, c2 is the accept configuration, and t is the maximum number of steps that the nondeterministic machine can use, we can determine whether the machine accepts its input.
We give a deterministic, recursive algorithm that solves the yieldability problem. It operates by searching for an intermediate configuration cm, and recursively testing whether (1) c1 can get to cm within t/2 steps, and (2) cm can get to c2 within t/2 steps. Reusing the space for each of the two recursive tests allows a significant savings of space.
This algorithm needs space for storing the recursion stack. Each level of the recursion uses O(f(n)) space to store a configuration. The depth of the recursion is log t, where t is the maximum time that the nondeterministic machine may use on any branch. We have t = 2^(O(f(n))), so log t = O(f(n)). Hence the deterministic simulation uses O(f^2(n)) space.
PROOF Let N be an NTM deciding a language A in space f(n). We construct a deterministic TM M deciding A. Machine M uses the procedure CANYIELD, which tests whether one of N's configurations can yield another within a specified number of steps. This procedure solves the yieldability problem described in the proof idea.
Let w be a string considered as input to N. For configurations c1 and c2 of N, and integer t, CANYIELD(c1, c2, t) outputs accept if N can go from configuration c1 to configuration c2 in t or fewer steps along some nondeterministic
(Footnote to Theorem 8.5: On page 351, we show that Savitch's theorem also holds whenever f(n) ≥ log n.)
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 359 ---
8.1 SAVITCH’S THEOREM 335
path. If not, CANYIELD outputs reject. For convenience, we assume that t is a power of 2.
CANYIELD = "On input c1, c2, and t:
1. If t = 1, then test directly whether c1 = c2 or whether c1 yields c2 in one step according to the rules of N. Accept if either test succeeds; reject if both fail.
2. If t > 1, then for each configuration cm of N using space f(n):
3.   Run CANYIELD(c1, cm, t/2).
4.   Run CANYIELD(cm, c2, t/2).
5.   If steps 3 and 4 both accept, then accept.
6. If haven't yet accepted, reject."
Now we define M to simulate N as follows. We first modify N so that when it accepts, it clears its tape and moves the head to the leftmost cell, thereby entering a configuration called c_accept. We let c_start be the start configuration of N on w. We select a constant d so that N has no more than 2^(df(n)) configurations using f(n) tape, where n is the length of w. Then we know that 2^(df(n)) provides an upper bound on the running time of any branch of N on w.
M = "On input w:
1. Output the result of CANYIELD(c_start, c_accept, 2^(df(n)))."
Algorithm CANYIELD obviously solves the yieldability problem, and hence M correctly simulates N. We need to analyze it to verify that M works within O(f^2(n)) space.
Whenever CANYIELD invokes itself recursively, it stores the current stage number and the values of c1, c2, and t on a stack so that these values may be restored upon return from the recursive invocation. Each level of the recursion thus uses O(f(n)) additional space. Furthermore, each level of the recursion divides the size of t in half. Initially t starts out equal to 2^(df(n)), so the depth of the recursion is O(log 2^(df(n))), or O(f(n)). Therefore, the total space used is O(f^2(n)), as claimed.
One technical difficulty arises in this argument because algorithm M needs to know the value of f(n) when it calls CANYIELD. We can handle this difficulty by modifying M so that it tries f(n) = 1, 2, 3, . . . . For each value f(n) = i, the modified algorithm uses CANYIELD to determine whether the accept configuration is reachable. In addition, it uses CANYIELD to determine whether N uses at least space i + 1 by testing whether N can reach any of the configurations of length i + 1 from the start configuration. If the accept configuration is reachable, M accepts; if no configuration of length i + 1 is reachable, M rejects; and otherwise, M continues with f(n) = i + 1. (We could have handled this difficulty in another way by assuming that M can compute f(n) within O(f(n)) space, but then we would need to add that assumption to the statement of the theorem.)
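Abstracting the machine details away, CANYIELD's recursion can be sketched in Python; `successors` and `configs` stand in for N's single-step yield relation and its space-bounded configuration set, and all names are choices of this sketch.

```python
def can_yield(c1, c2, t, successors, configs):
    # Can configuration c1 reach c2 in at most t steps (t a power of 2)?
    # Only O(log t) recursion levels are live at any moment, each holding
    # a constant number of configurations: the source of the space bound.
    if t == 1:
        return c1 == c2 or c2 in successors(c1)
    return any(can_yield(c1, cm, t // 2, successors, configs) and
               can_yield(cm, c2, t // 2, successors, configs)
               for cm in configs)
```

Note the trade-off: the same midpoint search is repeated many times, so the running time is large, but the two recursive tests reuse the same space.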
8.2
THE CLASS PSPACE
By analogy with the class P, we define the class PSPACE for space complexity.
DEFINITION 8.6
PSPACE is the class of languages that are decidable in polynomial
space on a deterministic Turing machine. In other words,
PSPACE = ∪_k SPACE(n^k).
We define NPSPACE, the nondeterministic counterpart to PSPACE, in
terms of the NSPACE classes. However, PSPACE = NPSPACE by virtue of
Savitch's theorem because the square of any polynomial is still a polynomial.
In Examples 8.3 and 8.4, we showed that SAT is in SPACE(n) and that
ALL_NFA is in coNSPACE(n) and hence, by Savitch's theorem, in SPACE(n²)
because the deterministic space complexity classes are closed under complement.
Therefore, both languages are in PSPACE.
Let's examine the relationship of PSPACE with P and NP. We observe that
P ⊆ PSPACE because a machine that runs quickly cannot use a great deal of
space. More precisely, for t(n) ≥ n, any machine that operates in time t(n) can
use at most t(n) space because a machine can explore at most one new cell at each
step of its computation. Similarly, NP ⊆ NPSPACE, and so NP ⊆ PSPACE.
Conversely, we can bound the time complexity of a Turing machine in terms
of its space complexity. For f(n) ≥ n, a TM that uses f(n) space can have at most
f(n)·2^(O(f(n))) different configurations, by a simple generalization of the proof of
Lemma 5.8 on page 222. A TM computation that halts may not repeat a configu-
ration. Therefore, a TM² that uses space f(n) must run in time f(n)·2^(O(f(n))), so
PSPACE ⊆ EXPTIME = ∪_k TIME(2^(n^k)).
We summarize our knowledge of the relationships among the complexity
classes defined so far in the series of containments
P ⊆ NP ⊆ PSPACE = NPSPACE ⊆ EXPTIME.
We don't know whether any of these containments is actually an equality.
Someone may yet discover a simulation like the one in Savitch's theorem that
merges some of these classes into the same class. However, in Chapter 9 we
prove that P ≠ EXPTIME. Therefore, at least one of the preceding contain-
ments is proper, but we are unable to say which! Indeed, most researchers
²The requirement here that f(n) ≥ n is generalized later to f(n) ≥ log n when we
introduce TMs that use sublinear space on page 350.
believe that all the containments are proper. The following diagram depicts
the relationships among these classes, assuming that all are different.
FIGURE 8.7
Conjectured relationships among P, NP, PSPACE, and EXPTIME
8.3
PSPACE-COMPLETENESS
In Section 7.4, we introduced the category of NP-complete languages as rep-
resenting the most difficult languages in NP. Demonstrating that a language is
NP-complete provides strong evidence that the language is not in P. If it were,
P and NP would be equal. In this section, we introduce the analogous notion,
PSPACE-completeness, for the class PSPACE.
DEFINITION 8.8
A language B is PSPACE-complete if it satisfies two conditions:
1. B is in PSPACE, and
2. every A in PSPACE is polynomial time reducible to B.
If B merely satisfies condition 2, we say that it is PSPACE-hard.
In defining PSPACE-completeness, we use polynomial time reducibility as
given in Definition 7.29. Why don't we define a notion of polynomial space
reducibility and use that instead of polynomial time reducibility? To understand
the answer to this important question, consider our motivation for defining com-
plete problems in the first place.
Complete problems are important because they are examples of the most
difficult problems in a complexity class. A complete problem is most difficult
because any other problem in the class is easily reduced into it. So if we find an
easy way to solve the complete problem, we can easily solve all other problems in
the class. The reduction must be easy, relative to the complexity of typical prob-
lems in the class, for this reasoning to apply. If the reduction itself were difficult
to compute, an easy solution to the complete problem wouldn't necessarily yield
an easy solution to the problems reducing to it.
Therefore, the rule is: Whenever we define complete problems for a com-
plexity class, the reduction model must be more limited than the model used for
defining the class itself.
THE TQBF PROBLEM
Our first example of a PSPACE-complete problem involves a generalization of
the satisfiability problem. Recall that a Boolean formula is an expression that
contains Boolean variables, the constants 0 and 1, and the Boolean operations ∧,
∨, and ¬. We now introduce a more general type of Boolean formula.
The quantifiers ∀ (for all) and ∃ (there exists) make frequent appearances in
mathematical statements. Writing the statement ∀x φ means that for every value
for the variable x, the statement φ is true. Similarly, writing the statement ∃x φ
means that for some value of the variable x, the statement φ is true. Sometimes,
∀ is referred to as the universal quantifier and ∃ as the existential quantifier.
We say that the variable x immediately following the quantifier is bound to the
quantifier.
For example, considering the natural numbers, the statement ∀x [x + 1 > x]
means that the successor x + 1 of every natural number x is greater than
the number itself. Obviously, this statement is true. However, the statement
∃y [y + y = 3] obviously is false. When interpreting the meaning of statements
involving quantifiers, we must consider the universe from which the values are
drawn. In the preceding cases, the universe comprised the natural numbers; but
if we took the real numbers instead, the existentially quantified statement would
become true.
Statements may contain several quantifiers, as in ∀x ∃y [y > x]. For the uni-
verse of the natural numbers, this statement says that every natural number has
another natural number larger than it. The order of the quantifiers is impor-
tant. Reversing the order, as in the statement ∃y ∀x [y > x], gives an entirely
different meaning—namely, that some natural number is greater than all others.
Obviously, the first statement is true and the second statement is false.
A quantifier may appear anywhere in a mathematical statement. It applies to
the fragment of the statement appearing within the matched pair of parentheses
or brackets following the quantified variable. This fragment is called the scope
of the quantifier. Often, it is convenient to require that all quantifiers appear at
the beginning of the statement and that each quantifier's scope is everything fol-
lowing it. Such statements are said to be in prenex normal form. Any statement
may be put into prenex normal form easily. We consider statements in this form
only, unless otherwise indicated.
Boolean formulas with quantifiers are called quantified Boolean formulas.
For such formulas, the universe is {0,1}. For example,
φ = ∀x ∃y [(x ∨ y) ∧ (¬x ∨ ¬y)]
is a quantified Boolean formula. Here, φ is true, but it would be false if the
quantifiers ∀x and ∃y were reversed.
When each variable of a formula appears within the scope of some quantifier,
the formula is said to be fully quantified. A fully quantified Boolean formula
is sometimes called a sentence and is always either true or false. For example,
the preceding formula φ is fully quantified. However, if the initial part, ∀x, of
φ were removed, the formula would no longer be fully quantified and would be
neither true nor false.
The TQBF problem is to determine whether a fully quantified Boolean for-
mula is true or false. We define the language
TQBF = {⟨φ⟩ | φ is a true fully quantified Boolean formula}.
THEOREM 8.9
TQBF is PSPACE-complete.
PROOF IDEA To show that TQBF is in PSPACE, we give a straightforward
algorithm that assigns values to the variables and recursively evaluates the truth
of the formula for those values. From that information, the algorithm can deter-
mine the truth of the original quantified formula.
To show that every language A in PSPACE reduces to TQBF in polynomial
time, we begin with a polynomial space-bounded Turing machine for A. Then
we give a polynomial time reduction that maps a string to a quantified Boolean
formula φ that encodes a simulation of the machine on that input. The formula
is true iff the machine accepts.
As a first attempt at this construction, let's try to imitate the proof of the
Cook–Levin theorem, Theorem 7.37. We can construct a formula φ that simu-
lates M on an input w by expressing the requirements for an accepting tableau.
A tableau for M on w has width O(n^k), the space used by M, but its height is
exponential in n^k because M can run for exponential time. Thus, if we were to
represent the tableau with a formula directly, we would end up with a formula
of exponential size. However, a polynomial time reduction cannot produce an
exponential-size result, so this attempt fails to show that A ≤_P TQBF.
Instead, we use a technique related to the proof of Savitch’s theorem to con-
struct the formula. The formula divides the tableau into halves and employs the
universal quantifier to represent each half with the same part of the formula.
The result is a much shorter formula.
PROOF First, we give a polynomial space algorithm deciding TQBF.
T = “On input ⟨φ⟩, a fully quantified Boolean formula:
1. If φ contains no quantifiers, then it is an expression with only
constants, so evaluate φ and accept if it is true; otherwise, reject.
2. If φ equals ∃x ψ, recursively call T on ψ, first with 0 substituted
for x and then with 1 substituted for x. If either result is accept,
then accept; otherwise, reject.
3. If φ equals ∀x ψ, recursively call T on ψ, first with 0 substituted
for x and then with 1 substituted for x. If both results are ac-
cept, then accept; otherwise, reject.”
Algorithm T obviously decides TQBF. To analyze its space complexity, we
observe that the depth of the recursion is at most the number of variables. At
each level we need only store the value of one variable, so the total space used is
O(m), where m is the number of variables that appear in φ. Therefore, T runs
in linear space.
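Algorithm T translates almost directly into code. Below is a minimal sketch under an encoding of my choosing (nested tuples, not from the text) for quantified Boolean formulas.

```python
# Hypothetical formula encoding:
#   ("exists", "x", body), ("forall", "x", body),
#   ("and", f, g), ("or", f, g), ("not", f),
#   ("var", "x"), ("const", 0 or 1).

def evaluate(phi, env=None):
    """Return True iff the quantified Boolean formula phi is true."""
    env = env or {}
    op = phi[0]
    if op == "const":
        return bool(phi[1])
    if op == "var":
        return env[phi[1]]          # value bound by an enclosing quantifier
    if op == "not":
        return not evaluate(phi[1], env)
    if op == "and":
        return evaluate(phi[1], env) and evaluate(phi[2], env)
    if op == "or":
        return evaluate(phi[1], env) or evaluate(phi[2], env)
    if op == "exists":              # step 2 of T: accept if either branch accepts
        _, x, body = phi
        return any(evaluate(body, {**env, x: v}) for v in (False, True))
    if op == "forall":              # step 3 of T: accept if both branches accept
        _, x, body = phi
        return all(evaluate(body, {**env, x: v}) for v in (False, True))
    raise ValueError(f"unknown node {op}")
```

The recursion depth equals the number of quantifiers, one variable binding per level, matching the O(m) analysis above (this sketch copies the environment at each level for simplicity, so its literal space usage is somewhat larger than the in-place algorithm T).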
Next, we show that TQBF is PSPACE-hard. Let A be a language decided by
a TM M in space n^k for some constant k. We give a polynomial time reduction
from A to TQBF.
The reduction maps a string w to a quantified Boolean formula φ that is true
iff M accepts w. To show how to construct φ, we solve a more general problem.
Using two collections of variables denoted c1 and c2 representing two configu-
rations and a number t > 0, we construct a formula φ_{c1,c2,t}. If we assign c1 and
c2 to actual configurations, the formula is true iff M can go from c1 to c2 in at
most t steps. Then we can let φ be the formula φ_{c_start,c_accept,h}, where h = 2^(df(n)) for
a constant d, chosen so that M has no more than 2^(df(n)) possible configurations
on an input of length n. Here, let f(n) = n^k. For convenience, we assume that
t is a power of 2.
The formula encodes the contents of configuration cells as in the proof of the
Cook–Levin theorem. Each cell has several variables associated with it, one for
each tape symbol and state, corresponding to the possible settings of that cell.
Each configuration has n^k cells and so is encoded by O(n^k) variables.
If t = 1, we can easily construct φ_{c1,c2,t}. We design the formula to say that
either c1 equals c2, or c2 follows from c1 in a single step of M. We express
the equality by writing a Boolean expression saying that each of the variables
representing c1 contains the same Boolean value as the corresponding variable
representing c2. We express the second possibility by using the technique pre-
sented in the proof of the Cook–Levin theorem. That is, we can express that
c1 yields c2 in a single step of M by writing Boolean expressions stating that
the contents of each triple of c1's cells correctly yields the contents of the corre-
sponding triple of c2's cells.
If t > 1, we construct φ_{c1,c2,t} recursively. As a warm-up, let's try one idea that
doesn't quite work and then fix it. Let
φ_{c1,c2,t} = ∃m1 [ φ_{c1,m1,t/2} ∧ φ_{m1,c2,t/2} ].
The symbol m1 represents a configuration of M. Writing ∃m1 is shorthand for
∃x1, ..., x_l, where l = O(n^k) and x1, ..., x_l are the variables that encode m1.
So this construction of φ_{c1,c2,t} says that M can go from c1 to c2 in at most t steps
if some intermediate configuration m1 exists, whereby M can go from c1 to m1
in at most t/2 steps and then from m1 to c2 in at most t/2 steps. Then we construct
the two formulas φ_{c1,m1,t/2} and φ_{m1,c2,t/2} recursively.
The formula φ_{c1,c2,t} has the correct value; that is, it is TRUE whenever M
can go from c1 to c2 within t steps. However, it is too big. Every level of the
recursion involved in the construction cuts t in half but roughly doubles the size
of the formula. Hence we end up with a formula of size roughly t. Initially
t = 2^(df(n)), so this method gives an exponentially large formula.
To reduce the size of the formula, we use the ∀ quantifier in addition to the ∃
quantifier. Let
φ_{c1,c2,t} = ∃m1 ∀(c3,c4) ∈ {(c1,m1), (m1,c2)} [ φ_{c3,c4,t/2} ].
The introduction of the new variables representing the configurations c3 and c4
allows us to “fold” the two recursive subformulas into a single subformula, while
preserving the original meaning. By writing ∀(c3,c4) ∈ {(c1,m1), (m1,c2)}, we
indicate that the variables representing the configurations c3 and c4 may take the
values of the variables of c1 and m1 or of m1 and c2, respectively, and that the
resulting formula φ_{c3,c4,t/2} is true in either case. We may replace the construct
∀x ∈ {y,z} [...] with the equivalent construct ∀x [(x = y ∨ x = z) → ...] to
obtain a syntactically correct quantified Boolean formula. Recall that in Sec-
tion 0.2, we showed that Boolean implication (→) and Boolean equality (=) can
be expressed in terms of AND and NOT. Here, for clarity, we use the symbol =
for Boolean equality instead of the equivalent symbol ↔ used in Section 0.2.
To calculate the size of the formula φ_{c_start,c_accept,h}, where h = 2^(df(n)), we note
that each level of the recursion adds a portion of the formula that is linear in the
size of the configurations and is thus of size O(f(n)). The number of levels of
the recursion is log(2^(df(n))), or O(f(n)). Hence the size of the resulting formula
is O(f²(n)).
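The two size recurrences behind this analysis can be checked with a quick back-of-the-envelope computation (mine, not the text's): writing out both recursive subformulas gives size(t) = 2·size(t/2) + O(f(n)), whereas the ∀-folded construction gives size(t) = size(t/2) + O(f(n)).

```python
def naive_size(t, f):
    """Formula size when both recursive subformulas are written out:
    doubles at every level, solving to about f * (2t - 1)."""
    return f if t == 1 else 2 * naive_size(t // 2, f) + f

def folded_size(t, f):
    """Formula size with the forall quantifier folding both copies:
    adds f per level, solving to f * (log2(t) + 1)."""
    return f if t == 1 else folded_size(t // 2, f) + f
```

For t = 2^(df(n)) the naive count is on the order of f(n)·t, exponential in n, while the folded count is f(n)·(df(n) + 1) = O(f²(n)), matching the bound above.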
WINNING STRATEGIES FOR GAMES
For the purposes of this section, a game is loosely defined to be a competition
in which opposing parties attempt to achieve some goal according to prespec-
ified rules. Games appear in many forms, from board games such as chess to
economic and war games that model corporate or societal conflict.
Games are closely related to quantifiers. A quantified statement has a corre-
sponding game; conversely, a game often has a corresponding quantified state-
ment. These correspondences are helpful in several ways. For one, expressing a
mathematical statement that uses many quantifiers in terms of the correspond-
ing game may give insight into the statement’s meaning. For another, expressing
a game in terms of a quantified statement aids in understanding the complexity
of the game. To illustrate the correspondence between games and quantifiers,
we turn to an artificial game called the formula game .
Let φ = ∃x1 ∀x2 ∃x3 ··· Qxk [ψ] be a quantified Boolean formula in prenex
normal form. Here, Q represents either a ∀ or an ∃ quantifier. We associate a
game with φ as follows. Two players, called Player A and Player E, take turns
selecting the values of the variables x1, ..., xk. Player A selects values for the
variables that are bound to ∀ quantifiers, and Player E selects values for the
variables that are bound to ∃ quantifiers. The order of play is the same as that
of the quantifiers at the beginning of the formula. At the end of play, we use the
values that the players have selected for the variables and declare that Player E
has won the game if ψ, the part of the formula with the quantifiers stripped off,
is now TRUE. Player A has won if ψ is now FALSE.
EXAMPLE 8.10
Say that φ1 is the formula
∃x1 ∀x2 ∃x3 [(x1 ∨ x2) ∧ (x2 ∨ x3) ∧ (¬x2 ∨ ¬x3)].
In the formula game for φ1, Player E picks the value of x1, then Player A picks
the value of x2, and finally Player E picks the value of x3.
To illustrate a sample play of this game, we begin by representing the Boolean
value TRUE with 1 and FALSE with 0, as usual. Let's say that Player E picks
x1 = 1, then Player A picks x2 = 0, and finally Player E picks x3 = 1. With
these values for x1, x2, and x3, the subformula
(x1 ∨ x2) ∧ (x2 ∨ x3) ∧ (¬x2 ∨ ¬x3)
is 1, so Player E has won the game. In fact, Player E may always win this game by
selecting x1 = 1 and then selecting x3 to be the negation of whatever Player A
selects for x2. We say that Player E has a winning strategy for this game. A
player has a winning strategy for a game if that player wins when both sides play
optimally.
Now let's change the formula slightly to get a game in which Player A has a
winning strategy. Let φ2 be the formula
∃x1 ∀x2 ∃x3 [(x1 ∨ x2) ∧ (x2 ∨ x3) ∧ (x2 ∨ ¬x3)].
Player A now has a winning strategy because no matter what Player E selects
for x1, Player A may select x2 = 0, thereby falsifying the part of the formula
appearing after the quantifiers, whatever Player E's last move may be.
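Play of these games can be simulated mechanically. The sketch below uses a hypothetical encoding of my choosing, not the text's: the quantifier prefix is a list of (player, variable) pairs, and the quantifier-free part ψ is a Python function from an assignment to a truth value. Player E's moves are existential choices and Player A's are universal ones.

```python
def player_E_wins(prefix, psi, env=None):
    """True iff Player E has a winning strategy in the formula game."""
    env = env or {}
    if not prefix:
        return bool(psi(env))   # end of play: E has won iff psi is TRUE
    (player, x), rest = prefix[0], prefix[1:]
    outcomes = (player_E_wins(rest, psi, {**env, x: v}) for v in (0, 1))
    # Player E needs only one winning move; against Player A's moves,
    # every choice must leave E winning.
    return any(outcomes) if player == "E" else all(outcomes)
```

With ψ taken from φ1 this returns True, and with ψ taken from φ2 it returns False, matching the winning strategies described above and illustrating that the game's outcome is exactly the formula's truth value.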
We next consider the problem of determining which player has a winning
strategy in the formula game associated with a particular formula. Let
FORMULA-GAME = {⟨φ⟩ | Player E has a winning strategy in
the formula game associated with φ}.
THEOREM 8.11
FORMULA-GAME is PSPACE-complete.
PROOF IDEA FORMULA-GAME is PSPACE-complete for a simple reason.
It is the same as TQBF. To see that FORMULA-GAME = TQBF, observe that a
formula is TRUE exactly when Player E has a winning strategy in the associated
formula game. The two statements are different ways of saying the same thing.
PROOF The formula φ = ∃x1 ∀x2 ∃x3 ··· [ψ] is TRUE when some setting
for x1 exists such that for any setting of x2, a setting of x3 exists such that, and so
on ..., where ψ is TRUE under the settings of the variables. Similarly, Player E
has a winning strategy in the game associated with φ when Player E can make
some assignment to x1 such that for any setting of x2, Player E can make an
assignment to x3 such that, and so on ..., where ψ is TRUE under these settings
of the variables.
The same reasoning applies when the formula doesn't alternate between ex-
istential and universal quantifiers. If φ has the form ∀x1, x2, x3 ∃x4, x5 ∀x6 [ψ],
Player A would make the first three moves in the formula game to assign values
to x1, x2, and x3; then Player E would make two moves to assign x4 and x5; and
finally Player A would assign a value to x6.
Hence φ ∈ TQBF exactly when φ ∈ FORMULA-GAME, and the theorem
follows from Theorem 8.9.
GENERALIZED GEOGRAPHY
Now that we know that the formula game is PSPACE -complete, we can es-
tablish the PSPACE -completeness or PSPACE -hardness of some other games
more easily. We’ll begin with a generalization of the game geography and later
discuss games such as chess, checkers, and GO.
Geography is a child’s game in which players take turns naming cities from
anywhere in the world. Each city chosen must begin with the same letter that
ended the previous city’s name. Repetition isn’t permitted. The game starts with
some designated starting city and ends when some player loses because he or she
is unable to continue. For example, if the game starts with Peoria, then Amherst
might legally follow (because Peoria ends with the letter a, and Amherst begins
with the letter a), then Tucson, then Nashua, and so on until one player gets
stuck and thereby loses.
We can model this game with a directed graph whose nodes are the cities of
the world. We draw an arrow from one city to another if the first can lead to the
second according to the game rules. In other words, the graph contains an edge
from a city X to a city Y if city X ends with the same letter that begins city Y. We
illustrate a portion of the geography graph in Figure 8.12.
FIGURE 8.12
Portion of the graph representing the geography game
When the rules of geography are interpreted for this graphic representation,
one player starts by selecting the designated start node and then the players take
turns picking nodes that form a simple path in the graph. The requirement that
the path be simple (i.e., doesn’t use any node more than once) corresponds to the
requirement that a city may not be repeated. The first player unable to extend
the path loses the game.
In generalized geography, we take an arbitrary directed graph with a des-
ignated start node instead of the graph associated with the actual cities. For
example, the following graph is an example of a generalized geography game.
FIGURE 8.13
A sample generalized geography game
Say that Player I is the one who moves first and Player II second. In this
example, Player I has a winning strategy as follows. Player I starts at node 1,
the designated start node. Node 1 points only at nodes 2 and 3, so Player I’s
first move must be one of these two choices. He chooses 3. Now Player II must
move, but node 3 points only to node 5, so she is forced to select node 5. Then
Player I selects 6, from choices 6, 7, and 8. Now Player II must play from node 6,
but it points only to node 3, and 3 was previously played. Player II is stuck and
thus Player I wins.
If we change the example by reversing the direction of the edge between
nodes 3 and 6, Player II has a winning strategy. Can you see it? If Player I starts
out with node 3 as before, Player II responds with 6 and wins immediately, so
Player I’s only hope is to begin with 2. In that case, however, Player II responds
with 4. If Player I now takes 5, Player II wins with 6. If Player I takes 7, Player II
wins with 9. No matter what Player I does, Player II can find a way to win, so
Player II has a winning strategy.
The problem of determining which player has a winning strategy in a gener-
alized geography game is PSPACE -complete. Let
GG = {⟨G, b⟩ | Player I has a winning strategy for the generalized
geography game played on graph G starting at node b}.
THEOREM 8.14
GG is PSPACE-complete.
PROOF IDEA A recursive algorithm similar to the one used for TQBF in
Theorem 8.9 determines which player has a winning strategy. This algorithm
runs in polynomial space, and so GG ∈ PSPACE.
To prove that GG is PSPACE-hard, we give a polynomial time reduction
from FORMULA-GAME to GG. This reduction converts a formula game to
a generalized geography graph so that play on the graph mimics play in the
formula game. In effect, the players in the generalized geography game are
really playing an encoded form of the formula game.
PROOF The following algorithm decides whether Player I has a winning
strategy in instances of generalized geography; in other words, it decides GG.
We show that it runs in polynomial space.
M = “On input ⟨G, b⟩, where G is a directed graph and b is a node of G:
1. If b has outdegree 0, reject because Player I loses immediately.
2. Remove node b and all connected arrows to get a new graph G′.
3. For each of the nodes b1, b2, ..., bk that b originally pointed at,
recursively call M on ⟨G′, bi⟩.
4. If all of these accept, Player II has a winning strategy in the
original game, so reject. Otherwise, Player II doesn't have a
winning strategy, so Player I must; therefore, accept.”
The only space required by this algorithm is for storing the recursion stack.
Each level of the recursion adds a single node to the stack, and at most m levels
occur, where m is the number of nodes in G. Hence the algorithm runs in linear
space.
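Algorithm M can be sketched directly, assuming a graph encoded as a dict mapping each node to the list of nodes it points at (an encoding of my choosing, not the text's).

```python
def player_to_move_wins(graph, b):
    """Decide GG: True iff the player who must move from node b
    has a winning strategy in generalized geography."""
    successors = graph.get(b, [])
    if not successors:
        return False  # step 1: outdegree 0, so the player to move loses
    # steps 2-3: remove b and its arrows, then recurse on each b_i,
    # asking whether the opponent wins from there.
    rest = {u: [v for v in vs if v != b]
            for u, vs in graph.items() if u != b}
    # step 4: the current player wins iff some move puts the
    # opponent in a losing position.
    return any(not player_to_move_wins(rest, bi) for bi in successors)
```

On the three-node path a→b→c, for instance, the player who moves first loses: both moves are forced and that player ends up stuck at c. Note this sketch copies a shrinking graph at each of the at most m recursion levels; a faithful rendering of M would store just one node per level, as in the analysis above.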
To establish the PSPACE-hardness of GG, we show that FORMULA-GAME
is polynomial time reducible to GG. The reduction maps the formula
φ = ∃x1 ∀x2 ∃x3 ··· Qxk [ψ]
to an instance ⟨G, b⟩ of generalized geography. Here we assume for simplicity
that φ's quantifiers begin and end with ∃, and that they strictly alternate between
∃ and ∀. A formula that doesn't conform to this assumption may be converted
to a slightly larger one that does by adding extra quantifiers binding otherwise
unused or “dummy” variables. We assume also that ψ is in conjunctive normal
form (see Problem 8.12).
The reduction constructs a geography game on a graph G where optimal play
mimics optimal play of the formula game on φ. Player I in the geography game
takes the role of Player E in the formula game, and Player II takes the role of
Player A.
The structure of graph G is partially shown in the following figure. Play starts
at node b, which appears at the top left-hand side of G. Underneath b, a sequence
of diamond structures appears, one for each of the variables of φ. Before getting
to the right-hand side of G, let's see how play proceeds on the left-hand side.
FIGURE 8.15
Partial structure of the geography game simulating the formula game
Play starts at b. Player I must select one of the two edges going from b. These edges correspond to Player E's possible choices at the beginning of the formula game. The left-hand choice for Player I corresponds to TRUE for Player E in the formula game and the right-hand choice to FALSE. After Player I has selected one of these edges (say, the left-hand one), Player II moves. Only one outgoing edge is present, so this move is forced. Similarly, Player I's next move is forced and play continues from the top of the second diamond. Now two edges again are present, but Player II gets the choice. This choice corresponds to Player A's first move in the formula game. As play continues in this way, Players I and II choose a rightward or leftward path through each of the diamonds.
After play passes through all the diamonds, the head of the path is at the bottom node in the last diamond, and it is Player I's turn because we assumed that the last quantifier is ∃. Player I's next move is forced. Then they are at node c in Figure 8.15 and Player II makes the next move.
This point in the geography game corresponds to the end of play in the formula game. The chosen path through the diamonds corresponds to an assignment to φ's variables. Under that assignment, if ψ is TRUE, Player E wins the formula game; and if ψ is FALSE, Player A wins. The structure on the right-hand side of the following figure guarantees that Player I can win if Player E has won, and that Player II can win if Player A has won.
FIGURE 8.16
Full structure of the geography game simulating the formula game, where
φ = ∃x1 ∀x2 ··· ∃xk [(x1 ∨ ¬x2 ∨ x3) ∧ (¬x2 ∨ ¬x3 ∨ ···) ∧ ··· ∧ ( )]
At node c, Player II may choose a node corresponding to one of ψ's clauses. Then Player I may choose a node corresponding to a literal in that clause. The nodes corresponding to unnegated literals are connected to the left-hand (TRUE) sides of the diamonds for associated variables, and similarly for negated literals and right-hand (FALSE) sides, as shown in Figure 8.16.
If ψ is FALSE, Player II may win by selecting the unsatisfied clause. Any literal that Player I may then pick is FALSE and is connected to the side of the diamond that hasn't yet been played. Thus Player II may play the node in the diamond, but then Player I is unable to move and loses. If ψ is TRUE, any clause that Player II picks contains a TRUE literal. Player I selects that literal after Player II's move. Because the literal is TRUE, it is connected to the side of the diamond that has already been played, so Player II is unable to move and loses.
In Theorem 8.14, we showed that no polynomial time algorithm exists for optimal play in generalized geography unless P = PSPACE. We'd like to prove a similar theorem regarding the difficulty of computing optimal play in board games such as chess, but an obstacle arises. Only a finite number of different game positions may occur on the standard 8×8 chess board. In principle, all these positions may be placed in a table, along with a best move in each position. The table would be too large to fit inside our galaxy but, being finite, could be stored in the control of a Turing machine (or even that of a finite automaton!). Thus, the machine would be able to play optimally in linear time, using table lookup. Perhaps at some time in the future, methods that can quantify the complexity of finite problems will be developed. But current methods are asymptotic and hence apply only to the rate of growth of the complexity as the problem size increases, not to any fixed size. Nevertheless, we can give some evidence for the difficulty of computing optimal play for many board games by generalizing them to an n×n board. Such generalizations of chess, checkers, and GO have been shown to be PSPACE-hard or hard for even larger complexity classes, depending on the details of the generalization.
8.4
THE CLASSES L AND NL
Until now, we have considered only time and space complexity bounds that are at least linear, that is, bounds where f(n) is at least n. Now we examine smaller, sublinear space bounds. In time complexity, sublinear bounds are insufficient for reading the entire input, so we don't consider them here. In sublinear space complexity, the machine is able to read the entire input but it doesn't have enough space to store the input. To consider this situation meaningfully, we must modify our computational model.
We introduce a Turing machine with two tapes: a read-only input tape, and a read/write work tape. On the read-only tape, the input head can detect symbols but not change them. We provide a way for the machine to detect when the head is at the left-hand and right-hand ends of the input. The input head must remain on the portion of the tape containing the input. The work tape may be read and written in the usual way. Only the cells scanned on the work tape contribute to the space complexity of this type of Turing machine.
Think of a read-only input tape as a CD-ROM, a device used for input on many personal computers. Often, the CD-ROM contains more data than the computer can store in its main memory. Sublinear space algorithms allow the computer to manipulate the data without storing all of it in main memory.
For space bounds that are at least linear, the two-tape TM model is equivalent to the standard one-tape model (see Exercise 8.1). For sublinear space bounds, we use only the two-tape model.
DEFINITION 8.17
L is the class of languages that are decidable in logarithmic space on a deterministic Turing machine. In other words,
L = SPACE(log n).
NL is the class of languages that are decidable in logarithmic space on a nondeterministic Turing machine. In other words,
NL = NSPACE(log n).
We focus on log n space instead of, say, √n or log² n space, for several reasons that are similar to those for our selection of polynomial time and space bounds. Logarithmic space is just large enough to solve a number of interesting computational problems, and it has attractive mathematical properties such as robustness even when the machine model and input encoding method change. Pointers into the input may be represented in logarithmic space, so one way to think about the power of log space algorithms is to consider the power of a fixed number of input pointers.
EXAMPLE 8.18
The language A = {0^k 1^k | k ≥ 0} is a member of L. In Section 7.1, on page 275, we described a Turing machine that decides A by zig-zagging back and forth across the input, crossing off the 0s and 1s as they are matched. That algorithm uses linear space to record which positions have been crossed off, but it can be modified to use only log space.
The log space TM for A cannot cross off the 0s and 1s that have been matched on the input tape because that tape is read-only. Instead, the machine counts the number of 0s and, separately, the number of 1s in binary on the work tape. The only space required is that used to record the two counters. In binary, each counter uses only logarithmic space and hence the algorithm runs in O(log n) space. Therefore, A ∈ L.
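As a concrete illustration of the two-counter idea, here is a Python sketch (the names and input convention are assumptions, not the book's): it reads the input once, left to right, storing only two counters and a flag, which stand in for the work-tape content of the log space machine.

```python
def in_A(w: str) -> bool:
    """Decide A = { 0^k 1^k | k >= 0 } storing only two counters."""
    zeros = ones = 0
    seen_one = False
    for ch in w:                 # one left-to-right pass over the read-only input
        if ch == '0':
            if seen_one:
                return False     # a 0 after a 1: wrong shape
            zeros += 1
        elif ch == '1':
            seen_one = True
            ones += 1
        else:
            return False         # symbol outside {0, 1}
    # Each counter is at most n, so it fits in O(log n) bits on the work tape.
    return zeros == ones

print(in_A("000111"), in_A("0101"))  # True False
```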
EXAMPLE 8.19
Recall the language
PATH = {⟨G, s, t⟩ | G is a directed graph that has a directed path from s to t}
defined in Section 7.2. Theorem 7.14 shows that PATH is in P, but the algorithm given there uses linear space. We don't know whether PATH can be solved in logarithmic space deterministically, but we do know a nondeterministic log space algorithm for PATH.
The nondeterministic log space Turing machine deciding PATH operates by starting at node s and nondeterministically guessing the nodes of a path from s to t. The machine records only the position of the current node at each step on the work tape, not the entire path (which would exceed the logarithmic space requirement). The machine nondeterministically selects the next node from among those pointed at by the current node. It repeats this action until it reaches node t and accepts, or until it has gone on for m steps and rejects, where m is the number of nodes in the graph. Thus, PATH is in NL.
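One computation branch of this nondeterministic machine can be mimicked by replacing each guess with a coin flip, as in the Python sketch below (an illustration under an assumed adjacency-list encoding, not a deterministic decider): the branch stores only the current node and a step counter, and some branch accepts exactly when a path exists.

```python
import random

def one_branch(graph, s, t):
    """Simulate one branch of the NL machine for PATH, with random guesses."""
    m = len(graph)               # number of nodes
    current = s                  # the only node stored: O(log n) bits
    for _ in range(m):           # give up (reject) after m steps
        if current == t:
            return True          # this branch accepts
        successors = graph.get(current, [])
        if not successors:
            return False         # dead end: this branch rejects
        current = random.choice(successors)   # the nondeterministic guess
    return current == t

G = {1: [2, 3], 2: [4], 3: [], 4: []}
# Some branch accepts <G, 1, 4>; repeated random branches find it with high probability.
print(any(one_branch(G, 1, 4) for _ in range(200)))
```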
Our earlier claim that any f(n) space bounded Turing machine also runs in time 2^O(f(n)) is no longer true for very small space bounds. For example, a Turing machine that uses O(1) (i.e., constant) space may run for n steps. To obtain a bound on the running time that applies for every space bound f(n), we give the following definition.
DEFINITION 8.20
If M is a Turing machine that has a separate read-only input tape and w is an input, a configuration of M on w is a setting of the state, the work tape, and the positions of the two tape heads. The input w is not a part of the configuration of M on w.
If M runs in f(n) space and w is an input of length n, the number of configurations of M on w is n·2^O(f(n)). To explain this result, let's say that M has c states and g tape symbols. The number of strings that can appear on the work tape is
g^f(n). The input head can be in one of n positions, and the work tape head can be in one of f(n) positions. Therefore, the total number of configurations of M on w, which is an upper bound on the running time of M on w, is c·n·f(n)·g^f(n), or n·2^O(f(n)).
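The count can be made concrete with a small calculation. The Python fragment below (illustrative numbers only; the values of c, g, and f are made-up parameters) evaluates the product c·n·f(n)·g^f(n); notice that with f(n) = log₂ n the result grows only polynomially in n, since g^(log₂ n) = n^(log₂ g).

```python
import math

def config_bound(c, g, n, f):
    """Upper bound c * n * f(n) * g**f(n) on the number of configurations."""
    fn = f(n)
    return c * n * fn * g ** fn

f = lambda n: max(1, math.ceil(math.log2(n)))   # f(n) = log n: a log space machine
for n in (16, 256, 4096):
    print(n, config_bound(c=10, g=4, n=n, f=f))
```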
We focus almost exclusively on space bounds f(n) that are at least log n. Our earlier claim that the time complexity of a machine is at most exponential in its space complexity remains true for such bounds because n·2^O(f(n)) is 2^O(f(n)) when f(n) ≥ log n.
Recall that Savitch's theorem shows that we can convert nondeterministic TMs to deterministic TMs and increase the space complexity f(n) by only a squaring, provided that f(n) ≥ n. We can extend Savitch's theorem to hold for sublinear space bounds down to f(n) ≥ log n. The proof is identical to the original one we gave on page 334, except that we use Turing machines with a read-only input tape; and instead of referring to configurations of N, we refer to configurations of N on w. Storing a configuration of N on w uses log(n·2^O(f(n))) = log n + O(f(n)) space. If f(n) ≥ log n, the storage used is O(f(n)) and the remainder of the proof remains the same.
8.5
NL-COMPLETENESS
As we mentioned in Example 8.19, the PATH problem is known to be in NL but isn't known to be in L. We believe that PATH doesn't belong to L, but we don't know how to prove this conjecture. In fact, we don't know of any problem in NL that can be proven to be outside L. Analogous to the question of whether P = NP, we have the question of whether L = NL.
As a step toward resolving the L versus NL question, we can exhibit certain languages that are NL-complete. As with complete languages for other complexity classes, the NL-complete languages are examples of languages that are, in a certain sense, the most difficult languages in NL. If L and NL are different, all NL-complete languages don't belong to L.
As with our previous definitions of completeness, we define an NL-complete language to be one that is in NL and to which any other language in NL is reducible. However, we don't use polynomial time reducibility here because, as you will see, all problems in NL are solvable in polynomial time. Therefore, every two problems in NL except ∅ and Σ∗ are polynomial time reducible to one another (see the discussion of polynomial time reducibility in the definition of PSPACE-completeness on page 337). Hence polynomial time reducibility is too strong to differentiate problems in NL from one another. Instead we use a new type of reducibility called log space reducibility.
DEFINITION 8.21
A log space transducer is a Turing machine with a read-only input tape, a write-only output tape, and a read/write work tape. The head on the output tape cannot move leftward, so it cannot read what it has written. The work tape may contain O(log n) symbols.
A log space transducer M computes a function f: Σ∗ → Σ∗, where f(w) is the string remaining on the output tape after M halts when it is started with w on its input tape. We call f a log space computable function. Language A is log space reducible to language B, written A ≤L B, if A is mapping reducible to B by means of a log space computable function f.
Now we are ready to define NL-completeness.
DEFINITION 8.22
A language B is NL-complete if
1. B ∈ NL, and
2. every A in NL is log space reducible to B.
If one language is log space reducible to another language already known to be in L, the original language is also in L, as the following theorem demonstrates.
THEOREM 8.23
If A ≤L B and B ∈ L, then A ∈ L.
PROOF A tempting approach to the proof of this theorem is to follow the model presented in Theorem 7.31, the analogous result for polynomial time reducibility. In that approach, a log space algorithm for A first maps its input w to f(w), using the log space reduction f, and then applies the log space algorithm for B. However, the storage required for f(w) may be too large to fit within the log space bound, so we need to modify this approach.
Instead, A's machine MA computes individual symbols of f(w) as requested by B's machine MB. In the simulation, MA keeps track of where MB's input head would be on f(w). Every time MB moves, MA restarts the computation of f on w from the beginning and ignores all the output except for the desired location of f(w). Doing so may require occasional recomputation of parts of f(w) and so is inefficient in its time complexity. The advantage of this method is that only a single symbol of f(w) needs to be stored at any point, in effect trading time for space.
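The recomputation trick can be sketched in Python (a toy model; the generator-based reducer and all names are assumptions): the consumer never holds f(w), only the single symbol it asked for, and every request reruns the reducer from the start.

```python
def symbol_of_f(reducer, w, i):
    """Recompute f(w) from scratch and keep only its i-th symbol."""
    for pos, ch in enumerate(reducer(w)):  # generator: f(w) is never stored whole
        if pos == i:
            return ch
    return None                            # request past the end of f(w)

def double_each(w):
    """A toy stand-in for a log space reduction: f(w) doubles every symbol."""
    for ch in w:
        yield ch
        yield ch

# Each head movement of M_B becomes a fresh call: time is wasted, space is not.
print(''.join(symbol_of_f(double_each, "ab", i) for i in range(4)))  # aabb
```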
COROLLARY 8.24
If any NL-complete language is in L, then L = NL.
SEARCHING IN GRAPHS
THEOREM 8.25
PATH is NL-complete.
PROOF IDEA Example 8.19 shows that PATH is in NL, so we only need to show that PATH is NL-hard. In other words, we must show that every language A in NL is log space reducible to PATH.
The idea behind the log space reduction from A to PATH is to construct a graph that represents the computation of the nondeterministic log space Turing machine for A. The reduction maps a string w to a graph whose nodes correspond to the configurations of the NTM on input w. One node points to a second node if the corresponding first configuration can yield the second configuration in a single step of the NTM. Hence the machine accepts w whenever some path from the node corresponding to the start configuration leads to the node corresponding to the accepting configuration.
PROOF We show how to give a log space reduction from any language A in NL to PATH. Let's say that NTM M decides A in O(log n) space. Given an input w, we construct ⟨G, s, t⟩ in log space, where G is a directed graph that contains a path from s to t if and only if M accepts w.
The nodes of G are the configurations of M on w. For configurations c1 and c2 of M on w, the pair (c1, c2) is an edge of G if c2 is one of the possible next configurations of M starting from c1. More precisely, if M's transition function indicates that c1's state together with the tape symbols under its input and work tape heads can yield the next state and head actions to make c1 into c2, then (c1, c2) is an edge of G. Node s is the start configuration of M on w. Machine M is modified to have a unique accepting configuration, and we designate this configuration to be node t.
This mapping reduces A to PATH because whenever M accepts its input, some branch of its computation accepts, which corresponds to a path from the start configuration s to the accepting configuration t in G. Conversely, if some path exists from s to t in G, some computation branch accepts when M runs on input w, and M accepts w.
To show that the reduction operates in log space, we give a log space transducer that outputs ⟨G, s, t⟩ on input w. We describe G by listing its nodes and edges. Listing the nodes is easy because each node is a configuration of M on w and can be represented in c log n space for some constant c. The transducer sequentially goes through all possible strings of length c log n, tests whether each
is a legal configuration of M on w, and outputs those that pass the test. The transducer lists the edges similarly. Log space is sufficient for verifying that a configuration c1 of M on w can yield configuration c2 because the transducer only needs to examine the actual tape contents under the head locations given in c1 to determine that M's transition function would give configuration c2 as a result. The transducer tries all pairs (c1, c2) in turn to find which qualify as edges of G. Those that do are added to the output tape.
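The transducer's enumerate-and-test strategy can be sketched abstractly in Python (the legality test and one-step relation below are hypothetical placeholders, not a real machine): list every string of the fixed length, keep the legal configurations, then emit every pair related by a single step.

```python
from itertools import product

def configuration_graph(alphabet, length, is_config, yields):
    """Enumerate candidate configurations and the one-step edges among them."""
    nodes = [''.join(p) for p in product(alphabet, repeat=length)
             if is_config(''.join(p))]
    edges = [(c1, c2) for c1 in nodes for c2 in nodes if yields(c1, c2)]
    return nodes, edges

# Toy stand-ins: "configurations" are 2-bit strings, and a step increments
# the string read as a binary counter.
is_config = lambda s: True
yields = lambda a, b: int(b, 2) == int(a, 2) + 1
nodes, edges = configuration_graph("01", 2, is_config, yields)
print(edges)  # [('00', '01'), ('01', '10'), ('10', '11')]
```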
One immediate spinoff of Theorem 8.25 is the following corollary, which states that NL is a subset of P.
COROLLARY 8.26
NL ⊆ P.
PROOF Theorem 8.25 shows that any language in NL is log space reducible to PATH. Recall that a Turing machine that uses space f(n) runs in time n·2^O(f(n)), so a reducer that runs in log space also runs in polynomial time. Therefore, any language in NL is polynomial time reducible to PATH, which in turn is in P, by Theorem 7.14. We know that every language that is polynomial time reducible to a language in P is also in P, so the proof is complete.
Though log space reducibility appears to be highly restrictive, it is adequate for most reductions in complexity theory because these are usually computationally simple. For example, in Theorem 8.9 we showed that every PSPACE problem is polynomial time reducible to TQBF. The highly repetitive formulas that these reductions produce may be computed using only log space, and therefore we may conclude that TQBF is PSPACE-complete with respect to log space reducibility. This conclusion is important because Corollary 9.6 shows that NL ⊊ PSPACE. This separation and log space reducibility imply that TQBF ∉ NL.
8.6
NL EQUALS coNL
This section contains one of the most surprising results known concerning the relationships among complexity classes. The classes NP and coNP are generally believed to be different. At first glance, the same appears to hold for the classes NL and coNL. The fact that NL equals coNL, as we are about to prove, shows that our intuition about computation still has many gaps in it.
THEOREM 8.27
NL = coNL.
PROOF IDEA We show that PATH̄, the complement of PATH, is in NL, and thereby establish that every problem in coNL is also in NL, because PATH is NL-complete. The NL algorithm M that we present for PATH̄ must have an accepting computation whenever the input graph G does not contain a path from s to t.
First, let's tackle an easier problem. Let c be the number of nodes in G that are reachable from s. We assume that c is provided as an input to M and show how to use c to solve PATH̄. Later we show how to compute c.
Given G, s, t, and c, the machine M operates as follows. One by one, M goes through all the m nodes of G and nondeterministically guesses whether each one is reachable from s. Whenever a node u is guessed to be reachable, M attempts to verify this guess by guessing a path of length m or less from s to u. If a computation branch fails to verify this guess, it rejects. In addition, if a branch guesses that t is reachable, it rejects. Machine M counts the number of nodes that have been verified to be reachable. When a branch has gone through all of G's nodes, it checks that the number of nodes that it verified to be reachable from s equals c, the number of nodes that actually are reachable, and rejects if not. Otherwise, this branch accepts.
In other words, if M nondeterministically selects exactly c nodes reachable from s, not including t, and proves that each is reachable from s by guessing the path, M knows that the remaining nodes, including t, are not reachable, so it can accept.
Next, we show how to calculate c, the number of nodes reachable from s. We describe a nondeterministic log space procedure whereby at least one computation branch has the correct value for c and all other branches reject.
For each i from 0 to m, we define Ai to be the collection of nodes that are at a distance of i or less from s (i.e., that have a path of length at most i from s). So A0 = {s}, each Ai ⊆ Ai+1, and Am contains all nodes that are reachable from s. Let ci be the number of nodes in Ai. We next describe a procedure that calculates ci+1 from ci. Repeated application of this procedure yields the desired value of c = cm.
We calculate ci+1 from ci, using an idea similar to the one presented earlier in this proof sketch. The algorithm goes through all the nodes of G, determines whether each is a member of Ai+1, and counts the members.
To determine whether a node v is in Ai+1, we use an inner loop to go through all the nodes of G and guess whether each node is in Ai. Each positive guess is verified by guessing the path of length at most i from s. For each node u verified to be in Ai, the algorithm tests whether (u, v) is an edge of G. If it is an edge, v is in Ai+1. Additionally, the number of nodes verified to be in Ai is counted. At the completion of the inner loop, if the total number of nodes verified to be in Ai is not ci, then not all of Ai has been found, so this computation branch rejects. If the count equals ci and v has not yet been shown to be in Ai+1, we conclude that it isn't in Ai+1. Then we go on to the next v in the outer loop.
PROOF Here is an algorithm for PATH̄. Let m be the number of nodes of G.
M = “On input ⟨G, s, t⟩:
1. Let c0 = 1. [[A0 = {s} has 1 node]]
2. For i = 0 to m − 1: [[compute ci+1 from ci]]
3.   Let ci+1 = 1. [[ci+1 counts nodes in Ai+1]]
4.   For each node v ≠ s in G: [[check if v ∈ Ai+1]]
5.     Let d = 0. [[d re-counts Ai]]
6.     For each node u in G: [[check if u ∈ Ai]]
7.       Nondeterministically either perform or skip these steps:
8.         Nondeterministically follow a path of length at most i from s and reject if it doesn’t end at u.
9.         Increment d. [[verified that u ∈ Ai]]
10.        If (u, v) is an edge of G, increment ci+1 and go to stage 5 with the next v. [[verified that v ∈ Ai+1]]
11.     If d ≠ ci, then reject. [[check whether found all of Ai]]
12. Let d = 0. [[cm now known; d re-counts Am]]
13. For each node u in G: [[check if u ∈ Am]]
14.   Nondeterministically either perform or skip these steps:
15.     Nondeterministically follow a path of length at most m from s and reject if it doesn’t end at u.
16.     If u = t, then reject. [[found a path from s to t]]
17.     Increment d. [[verified that u ∈ Am]]
18. If d ≠ cm, then reject. [[check whether found all of Am]]
    Otherwise, accept.”
This algorithm only needs to store m, u, v, ci, ci+1, d, i, and a pointer to the head of a path at any given time. Hence it runs in log space. (Note that M accepts improperly formed inputs, too.)
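To make the invariants of this counting argument concrete, here is a deterministic Python rendition (illustrative only: it stores the sets Ai explicitly and so uses linear space, unlike the log space algorithm it mirrors): Ai is the set of nodes within distance i of s, ci = |Ai|, and membership of t in Am settles ⟨G, s, t⟩.

```python
def reachable_sets(graph, s):
    """Yield A_0, A_1, ..., A_{m-1}: the nodes within distance i of s."""
    A = {s}                                  # A_0 = {s}
    for _ in range(len(graph)):
        yield A
        A = A | {v for u in A for v in graph.get(u, [])}   # A_{i+1} from A_i

def non_path(graph, s, t):
    """Accept <G, s, t> iff G has NO path from s to t (the complement of PATH)."""
    *_, A_m = reachable_sets(graph, s)       # A_m holds every node reachable from s
    return t not in A_m

G = {1: [2], 2: [3], 3: [], 4: [3]}
print([len(A) for A in reachable_sets(G, 1)])  # the counts c_0, c_1, ...: [1, 2, 3, 3]
print(non_path(G, 1, 4), non_path(G, 1, 3))    # True False
```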
We summarize our present knowledge of the relationships among several complexity classes as follows:
L ⊆ NL = coNL ⊆ P ⊆ NP ⊆ PSPACE.
We don't know whether any of these containments are proper, although we prove NL ⊊ PSPACE in Corollary 9.6. Consequently, either coNL ⊊ P or P ⊊ PSPACE must hold, but we don't know which one does! Most researchers conjecture that all these containments are proper.
EXERCISES
8.1 Show that for any function f: N → R+, where f(n) ≥ n, the space complexity class SPACE(f(n)) is the same whether you define the class by using the single-tape TM model or the two-tape read-only input TM model.
8.2 Consider the following position in the standard tic-tac-toe game.
[The board diagram shown here, a partially filled 3×3 tic-tac-toe position, did not survive extraction.]
Let’s say that it is the ×-player’s turn to move next. Describe a winning strategy
for this player. (Recall that a winning strategy isn’t merely the best move to make
in the current position. It also includes all the responses that this player must make
in order to win, however the opponent moves.)
8.3 Consider the following generalized geography game wherein the start node is the one with the arrow pointing in from nowhere. Does Player I have a winning strategy? Does Player II? Give reasons for your answers.
8.4 Show that PSPACE is closed under the operations union, complementation, and
star.
A8.5 Show that ADFA ∈ L.
8.6 Show that any PSPACE-hard language is also NP-hard.
A8.7 Show that NL is closed under the operations union, concatenation, and star.
PROBLEMS
8.8 Let EQ_REX = {⟨R, S⟩ | R and S are equivalent regular expressions}. Show that EQ_REX ∈ PSPACE.
8.9 A ladder is a sequence of strings s1, s2, ..., sk, wherein every string differs from the preceding one by exactly one character. For example, the following is a ladder of English words, starting with “head” and ending with “free”:
head, hear, near, fear, bear, beer, deer, deed, feed, feet, fret, free.
Let LADDER_DFA = {⟨M, s, t⟩ | M is a DFA and L(M) contains a ladder of strings, starting with s and ending with t}. Show that LADDER_DFA is in PSPACE.
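The ladder property itself is straightforward to check. A quick Python sketch, not from the text, that verifies the example ladder above (the function name is ours); the intended PSPACE algorithm instead stores only the current rung, nondeterministically guesses the next one, and then applies Savitch's theorem:

```python
def is_ladder(strings):
    # Consecutive strings must have equal length and differ in exactly one position.
    return all(len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1
               for a, b in zip(strings, strings[1:]))

ladder = ["head", "hear", "near", "fear", "bear", "beer",
          "deer", "deed", "feed", "feet", "fret", "free"]
```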
8.10 The Japanese game go-moku is played by two players, “X” and “O,” on a 19×19
grid. Players take turns placing markers, and the first player to achieve five of her
markers consecutively in a row, column, or diagonal is the winner. Consider this
game generalized to an n×n board. Let
GM = {⟨B⟩ | B is a position in generalized go-moku, where player “X” has a winning strategy}.
By a position we mean a board with markers placed on it, such as may occur in the
middle of a play of the game, together with an indication of which player moves
next. Show that GM ∈ PSPACE.
8.11 Show that if every NP-hard language is also PSPACE-hard, then PSPACE = NP.
8.12 Show that TQBF restricted to formulas where the part following the quantifiers is in conjunctive normal form is still PSPACE-complete.
8.13 Define A_LBA = {⟨M, w⟩ | M is an LBA that accepts input w}. Show that A_LBA is PSPACE-complete.
⋆8.14 The cat-and-mouse game is played by two players, “Cat” and “Mouse,” on an arbitrary undirected graph. At a given point, each player occupies a node of the graph. The players take turns moving to a node adjacent to the one that they currently occupy. A special node of the graph is called “Hole.” Cat wins if the two players ever occupy the same node. Mouse wins if it reaches the Hole before the preceding happens. The game is a draw if a situation repeats (i.e., the two players simultaneously occupy positions that they simultaneously occupied previously, and it is the same player's turn to move).
HAPPY-CAT = {⟨G, c, m, h⟩ | G, c, m, h are respectively a graph, and positions of the Cat, Mouse, and Hole, such that Cat has a winning strategy if Cat moves first}.
Show that HAPPY-CAT is in P. (Hint: The solution is not complicated and doesn't depend on subtle details in the way the game is defined. Consider the entire game tree. It is exponentially big, but you can search it in polynomial time.)
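One way to realize the hint is retrograde analysis over the polynomially many positions (cat node, mouse node, whose turn): a position is Cat-winning if it is a capture, or if the mover can (Cat) or must (Mouse) step into a Cat-winning position; iterating this to a fixed point takes polynomial time. A Python sketch under those assumptions, not from the text (names and representation are ours):

```python
def cat_wins(adj, cat, mouse, hole):
    # adj: node -> list of neighbors in the undirected graph G.
    # A position is (cat node, mouse node, turn); turn 0 means Cat moves next.
    nodes = list(adj)
    # Seed: cat and mouse on the same node is a capture, so Cat has won.
    win = {(c, c, t) for c in nodes for t in (0, 1)}
    changed = True
    while changed:
        changed = False
        for c in nodes:
            for m in nodes:
                if c == m or m == hole:
                    continue              # already decided: capture, or Mouse is safe
                for t in (0, 1):
                    if (c, m, t) in win:
                        continue
                    if t == 0:            # Cat to move: one good move suffices
                        ok = any((c2, m, 1) in win for c2 in adj[c])
                    else:                 # Mouse to move: every move must lose
                        ok = all((c, m2, 0) in win for m2 in adj[m])
                    if ok:
                        win.add((c, m, t))
                        changed = True
    return (cat, mouse, 0) in win
```

Positions never added to the winning set are draws or Mouse wins, which matches the convention that infinite play favors Mouse here.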
8.15 Consider the following two-person version of the language PUZZLE that was described in Problem 7.28. Each player starts with an ordered stack of puzzle cards. The players take turns placing the cards in order in the box and may choose which side faces up. Player I wins if all hole positions are blocked in the final stack, and Player II wins if some hole position remains unblocked. Show that the problem of determining which player has a winning strategy for a given starting configuration of the cards is PSPACE-complete.
8.16 Read the definition of MIN-FORMULA in Problem 7.46.
a. Show that MIN-FORMULA ∈ PSPACE.
b. Explain why this argument fails to show that MIN-FORMULA ∈ coNP: If φ ∉ MIN-FORMULA, then φ has a smaller equivalent formula. An NTM can verify that φ ∉ MIN-FORMULA by guessing that formula.
8.17 Let A be the language of properly nested parentheses. For example, (()) and (()(()))() are in A, but )( is not. Show that A is in L.
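A single counter of O(log n) bits suffices for this check, which is exactly what places A in L. A Python sketch, not from the text:

```python
def balanced(s):
    # One counter holds the current nesting depth: O(log n) bits of state.
    depth = 0
    for ch in s:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:     # a close with nothing open
                return False
    return depth == 0         # everything opened was closed
```

Exercise 8.18 is starred because this trick alone does not settle bracket/parenthesis interleavings such as ([)]; a single depth counter cannot remember which kind of delimiter is open.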
⋆8.18 Let B be the language of properly nested parentheses and brackets. For example, ([()()]()[]) is in B but ([)] is not. Show that B is in L.
⋆8.19 The game of Nim is played with a collection of piles of sticks. In one move, a player may remove any nonzero number of sticks from a single pile. The players alternately take turns making moves. The player who removes the very last stick loses. Say that we have a game position in Nim with k piles containing s1, ..., sk sticks. Call the position balanced if each column of bits contains an even number of 1s when each of the numbers si is written in binary, and the binary numbers are written as rows of a matrix aligned at the low order bits. Prove the following two facts.
a. Starting in an unbalanced position, a single move exists that changes the position into a balanced one.
b. Starting in a balanced position, every single move changes the position into an unbalanced one.
Let NIM = {⟨s1, ..., sk⟩ | each si is a binary number and Player I has a winning strategy in the Nim game starting at this position}. Use the preceding facts about balanced positions to show that NIM ∈ L.
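The balance condition is equivalent to saying that the bitwise XOR of the pile sizes is 0, so both the balance test and the balancing move of fact (a) are easy to compute. A Python sketch, not from the text (function names are ours; the facts concern position structure only, independent of the last-stick convention):

```python
def balanced_position(piles):
    # Each bit column has an even number of 1s iff the XOR of all piles is 0.
    x = 0
    for s in piles:
        x ^= s
    return x == 0

def balancing_move(piles):
    # Fact (a): from an unbalanced position, some pile can be reduced
    # to rebalance. Returns (pile index, number of sticks removed), or None.
    x = 0
    for s in piles:
        x ^= s
    for i, s in enumerate(piles):
        target = s ^ x           # what pile i must become to zero the XOR
        if target < s:           # legal: removes a nonzero number of sticks
            return i, s - target
    return None                  # position was already balanced
```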
8.20 Let MULT = {a#b#c | a, b, c are binary natural numbers and a×b = c}. Show that MULT ∈ L.
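The logspace idea is to verify the product one output bit at a time, storing only a carry and loop counters rather than any intermediate product. A Python sketch over machine integers, not from the text:

```python
def mult_check(a, b, c):
    # Verify a*b == c bit by bit; the state is just `carry` and the loop
    # indices, which is the heart of the O(log n) space algorithm.
    bit = lambda x, i: (x >> i) & 1
    width = max(a.bit_length() + b.bit_length(), c.bit_length()) + 1
    carry = 0
    for k in range(width):
        # Column k of the grade-school multiplication: sum a_i * b_{k-i}.
        total = carry + sum(bit(a, i) * bit(b, k - i) for i in range(k + 1))
        if total & 1 != bit(c, k):
            return False
        carry = total >> 1
    return carry == 0
```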
8.21 For any positive integer x, let x^R be the integer whose binary representation is the reverse of the binary representation of x. (Assume no leading 0s in the binary representation of x.) Define the function R+: N → N where R+(x) = x + x^R.
a. Let A2 = {⟨x, y⟩ | R+(x) = y}. Show A2 ∈ L.
b. Let A3 = {⟨x, y⟩ | R+(R+(x)) = y}. Show A3 ∈ L.
8.22 a. Let ADD = {⟨x, y, z⟩ | x, y, z > 0 are binary integers and x + y = z}. Show that ADD ∈ L.
b. Let PAL-ADD = {⟨x, y⟩ | x, y > 0 are binary integers where x + y is an integer whose binary representation is a palindrome}. (Note that the binary representation of the sum is assumed not to have leading zeros. A palindrome is a string that equals its reverse.) Show that PAL-ADD ∈ L.
⋆8.23 Define UCYCLE = {⟨G⟩ | G is an undirected graph that contains a simple cycle}. Show that UCYCLE ∈ L. (Note: G may be a graph that is not connected.)
⋆8.24 For each n, exhibit two regular expressions, R and S, of length poly(n), where L(R) ≠ L(S), but where the first string on which they differ is exponentially long. In other words, L(R) and L(S) must be different, yet agree on all strings of length up to 2^{εn} for some constant ε > 0.
8.25 An undirected graph is bipartite if its nodes may be divided into two sets so that all edges go from a node in one set to a node in the other set. Show that a graph is bipartite if and only if it doesn't contain a cycle that has an odd number of nodes. Let BIPARTITE = {⟨G⟩ | G is a bipartite graph}. Show that BIPARTITE ∈ NL.
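The odd-cycle characterization yields a simple 2-coloring test in deterministic polynomial time (the exercise's NL bound instead works with the odd-cycle witness directly, using NL = coNL). A Python sketch of the 2-coloring view, not from the text:

```python
from collections import deque

def is_bipartite(adj):
    # 2-color each connected component with BFS; an edge whose endpoints
    # get the same color closes an odd cycle.
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False          # odd cycle found
    return True
```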
8.26 Define UPATH to be the counterpart of PATH for undirected graphs. Show that BIPARTITE ≤_L UPATH. (Note: In fact, we can prove UPATH ∈ L, and therefore BIPARTITE ∈ L, but the algorithm [62] is too difficult to present here.)
8.27 Recall that a directed graph is strongly connected if every two nodes are connected
by a directed path in each direction. Let
STRONGLY-CONNECTED = {⟨G⟩ | G is a strongly connected graph}.
Show that STRONGLY-CONNECTED is NL-complete.
8.28 Let BOTH_NFA = {⟨M1, M2⟩ | M1 and M2 are NFAs where L(M1) ∩ L(M2) ≠ ∅}. Show that BOTH_NFA is NL-complete.
8.29 Show that A_NFA is NL-complete.
8.30 Show that E_DFA is NL-complete.
⋆8.31 Show that 2SAT is NL-complete.
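For intuition on 8.31, recall the implication-graph view of 2SAT: each clause (a ∨ b) contributes the edges ¬a → b and ¬b → a, and the formula is unsatisfiable exactly when some variable x reaches ¬x and ¬x reaches x. Membership in NL rests on such reachability queries. A Python sketch using explicit BFS, not from the text (literal encoding is ours: variable v is the literal v, its negation -v):

```python
from collections import defaultdict, deque

def two_sat_satisfiable(n_vars, clauses):
    # Build the implication graph: (a or b) gives -a -> b and -b -> a.
    graph = defaultdict(list)
    for a, b in clauses:
        graph[-a].append(b)
        graph[-b].append(a)

    def reachable(src, dst):
        # Plain BFS reachability; NL machines do this with two pointers.
        seen, queue = {src}, deque([src])
        while queue:
            u = queue.popleft()
            if u == dst:
                return True
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return False

    # Unsatisfiable iff some x and -x are mutually reachable.
    return not any(reachable(v, -v) and reachable(-v, v)
                   for v in range(1, n_vars + 1))
```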
8.32 Let CNF_H1 = {⟨φ⟩ | φ is a satisfiable cnf-formula where each clause contains any number of positive literals and at most one negated literal. Furthermore, each negated literal has at most one occurrence in φ}. Show that CNF_H1 is NL-complete.
⋆8.33 Give an example of an NL-complete context-free language.
A⋆8.34 Define CYCLE = {⟨G⟩ | G is a directed graph that contains a directed cycle}. Show that CYCLE is NL-complete.
SELECTED SOLUTIONS
8.5 Construct a TM M to decide A_DFA. When M receives input ⟨A, w⟩, a DFA and a string, M simulates A on w by keeping track of A's current state and its current head location, and updating them appropriately. The space required to carry out this simulation is O(log n) because M can record each of these values by storing a pointer into its input.
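The two pointers in this solution correspond directly to the only mutable state in a simulation loop. A Python sketch, not from the text (encoding the DFA as a transition dictionary is our choice):

```python
def dfa_accepts(delta, start, accepting, w):
    # The loop state is just `state` and the head position `pos`:
    # two pointer-sized values, i.e., O(log n) space.
    state = start
    for pos in range(len(w)):
        state = delta[(state, w[pos])]
    return state in accepting
```

For example, a two-state DFA over {0,1} accepting strings with an even number of 1s uses delta = {('e','0'): 'e', ('e','1'): 'o', ('o','0'): 'o', ('o','1'): 'e'} with start state 'e' and accepting set {'e'}.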
8.7 Let A1 and A2 be languages that are decided by NL-machines N1 and N2. Construct three Turing machines: N∪ deciding A1 ∪ A2; N◦ deciding A1 ◦ A2; and N∗ deciding A1∗. Each of these machines operates as follows.
Machine N∪ nondeterministically branches to simulate N1 or to simulate N2. In either case, N∪ accepts if the simulated machine accepts.
Machine N◦ nondeterministically selects a position on the input to divide it into two substrings. Only a pointer to that position is stored on the work tape; insufficient space is available to store the substrings themselves. Then N◦ simulates N1 on the first substring, branching nondeterministically to simulate N1's nondeterminism. On any branch that reaches N1's accept state, N◦ simulates N2 on the second substring. On any branch that reaches N2's accept state, N◦ accepts.
Machine N∗ has a more complex algorithm, so we describe its stages.
N∗ = “On input w:
1. Initialize two input position pointers p1 and p2 to 0, the position immediately preceding the first input symbol.
2. Accept if no input symbols occur after p2.
3. Move p2 forward to a nondeterministically selected position.
4. Simulate N1 on the substring of w from the position following p1 to the position at p2, branching nondeterministically to simulate N1's nondeterminism.
5. If this branch of the simulation reaches N1's accept state, copy p2 to p1 and go to stage 2. If N1 rejects on this branch, reject.”
8.34 Reduce PATH to CYCLE. The idea behind the reduction is to modify the PATH problem instance ⟨G, s, t⟩ by adding an edge from t to s in G. If a path exists from s to t in G, a directed cycle will exist in the modified G. However, other cycles may exist in the modified G because they may already be present in G. To handle that problem, first change G so that it contains no cycles. A leveled directed graph is one where the nodes are divided into groups, A1, A2, ..., Ak, called levels, and only edges from one level to the next higher level are permitted. Observe that a leveled graph is acyclic. The PATH problem for leveled graphs is still NL-complete, as the following reduction from the unrestricted PATH problem shows. Given a graph G with two nodes s and t, and m nodes in total, produce the leveled graph G′ whose levels are m copies of G's nodes. Draw an edge from node i at each level to node j in the next level if G contains an edge from i to j. Additionally, draw an edge from node i in each level to node i in the next level. Let s′ be the node s in the first level and let t′ be the node t in the last level. Graph G contains a path from s to t iff G′ contains a path from s′ to t′. If you modify G′ by adding an edge from t′ to s′, you obtain a reduction from PATH to CYCLE. The reduction is computationally simple, and its implementation in logspace is routine. Furthermore, a straightforward procedure shows that CYCLE ∈ NL. Hence CYCLE is NL-complete.
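The construction of G′ and the final back edge can be written out directly. A Python sketch, not from the text (representation choices are ours; nodes of G′ are pairs (node, level), and the DFS cycle check stands in for the NL cycle detector):

```python
from collections import defaultdict

def path_to_cycle(edges, nodes, s, t):
    # Build the leveled graph G' with m = len(nodes) levels, then add the
    # back edge t' -> s'; G has an s-t path iff the output has a cycle.
    m = len(nodes)
    new_edges = []
    for level in range(m - 1):
        for u in nodes:
            new_edges.append(((u, level), (u, level + 1)))   # i -> i copy edge
        for (u, v) in edges:
            new_edges.append(((u, level), (v, level + 1)))   # edge of G, lifted
    new_edges.append(((t, m - 1), (s, 0)))                   # t' -> s'
    return new_edges

def has_cycle(edges):
    # Standard DFS with a gray/black coloring to detect a directed cycle.
    g = defaultdict(list)
    for u, v in edges:
        g[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    state = defaultdict(int)
    def dfs(u):
        state[u] = GRAY
        for v in g[u]:
            if state[v] == GRAY:
                return True                # back edge: cycle
            if state[v] == WHITE and dfs(v):
                return True
        state[u] = BLACK
        return False
    return any(state[u] == WHITE and dfs(u) for u in list(g))
```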
9
INTRACTABILITY
Certain computational problems are solvable in principle, but the solutions require so much time or space that they can't be used in practice. Such problems are called intractable.
In Chapters 7 and 8, we introduced several problems thought to be intractable but none that have been proven to be intractable. For example, most people believe the SAT problem and all other NP-complete problems are intractable, although we don't know how to prove that they are. In this chapter, we give examples of problems that we can prove to be intractable.
In order to present these examples, we develop several theorems that relate the power of Turing machines to the amount of time or space available for computation. We conclude the chapter with a discussion of the possibility of proving that problems in NP are intractable and thereby solving the P versus NP question. First, we introduce the relativization technique and use it to argue that certain methods won't allow us to achieve this goal. Then, we discuss circuit complexity theory, an approach taken by researchers that has shown some promise.
9.1
HIERARCHY THEOREMS
Common sense suggests that giving a Turing machine more time or more space should increase the class of problems that it can solve. For example, Turing machines should be able to decide more languages in time n^3 than they can in time n^2. The hierarchy theorems prove that this intuition is correct, subject to certain conditions described below. We use the term hierarchy theorem because these theorems prove that the time and space complexity classes aren't all the same; they form a hierarchy whereby the classes with larger bounds contain more languages than do the classes with smaller bounds.
The hierarchy theorem for space complexity is slightly simpler than the one
for time complexity, so we present it first. We begin with the following technical
definition.
DEFINITION 9.1
A function f: N → N, where f(n) is at least O(log n), is called space constructible if the function that maps the string 1^n to the binary representation of f(n) is computable in space O(f(n)).¹
In other words, f is space constructible if some O(f(n)) space TM exists that always halts with the binary representation of f(n) on its tape when started on input 1^n. Fractional functions such as n log2 n and √n are rounded down to the next lower integer for the purposes of time and space constructibility.
EXAMPLE 9.2
All commonly occurring functions that are at least O(log n) are space constructible, including the functions log2 n, n log2 n, and n^2.
For example, n^2 is space constructible because a machine may take its input 1^n, obtain n in binary by counting the number of 1s, and output n^2 by using any standard method for multiplying n by itself. The total space used is O(n), which is certainly O(n^2).
When showing functions f(n) that are o(n) to be space constructible, we use a separate read-only input tape, as we did when we defined sublinear space complexity in Section 8.4. For example, such a machine can compute the function that maps 1^n to the binary representation of log2 n as follows. It first counts the number of 1s in its input in binary, using its work tape as it moves its head along the input tape. Then, with n in binary on its work tape, it can compute log2 n by counting the number of bits in the binary representation of n.
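Transcribed directly, the two-stage procedure just described (count the 1s, then count the bits of the count) looks as follows; this is a Python sketch, not from the text, returning ⌊log2 n⌋ as “number of bits minus one”:

```python
def floor_log2_of_unary(unary):
    # Stage 1: sweep the input, maintaining a binary counter (the work tape).
    n = 0
    for ch in unary:
        assert ch == '1', "input must be of the form 1^n"
        n += 1
    # Stage 2: count the bits of n; floor(log2 n) is that count minus one.
    return n.bit_length() - 1
```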
¹Recall that 1^n means a string of n 1s.
The role of space constructibility in the space hierarchy theorem may be understood from the following situation. If f(n) and g(n) are two space bounds, where f(n) is asymptotically larger than g(n), we would expect a machine to be able to decide more languages in f(n) space than in g(n) space. However, suppose that f(n) exceeds g(n) by only a very small and hard-to-compute amount. Then, the machine may not be able to use the extra space profitably because even computing the amount of extra space may require more space than is available. In this case, a machine may not be able to compute more languages in f(n) space than it can in g(n) space. Stipulating that f(n) is space constructible avoids this situation and allows us to prove that a machine can compute more than it would be able to in any asymptotically smaller bound, as the following theorem shows.
THEOREM 9.3
Space hierarchy theorem For any space constructible function f: N → N, a language A exists that is decidable in O(f(n)) space but not in o(f(n)) space.
PROOF IDEA We must demonstrate a language A that has two properties. The first says that A is decidable in O(f(n)) space. The second says that A isn't decidable in o(f(n)) space.
We describe A by giving an algorithm D that decides it. Algorithm D runs in O(f(n)) space, thereby ensuring the first property. Furthermore, D guarantees that A is different from any language that is decidable in o(f(n)) space, thereby ensuring the second property. Language A is different from languages we have discussed previously in that it lacks a nonalgorithmic definition. Therefore, we cannot offer a simple mental picture of A.
In order to ensure that A not be decidable in o(f(n)) space, we design D to implement the diagonalization method that we used to prove the unsolvability of the acceptance problem A_TM in Theorem 4.11 on page 202. If M is a TM that decides a language in o(f(n)) space, D guarantees that A differs from M's language in at least one place. Which place? The place corresponding to a description of M itself.
Let's look at the way D operates. Roughly speaking, D takes its input to be the description of a TM M. (If the input isn't the description of any TM, then D's action is inconsequential on this input, so we arbitrarily make D reject.) Then, D runs M on the same input, namely ⟨M⟩, within the space bound f(n). If M halts within that much space, D accepts iff M rejects. If M doesn't halt, D just rejects. So if M runs within space f(n), D has enough space to ensure that its language is different from M's. If not, D doesn't have enough space to figure out what M does. But fortunately D has no requirement to act differently from machines that don't run in o(f(n)) space, so D's action on this input is inconsequential.
This description captures the essence of the proof but omits several important details. If M runs in o(f(n)) space, D must guarantee that its language is
different from M's language. But even when M runs in o(f(n)) space, it may use more than f(n) space for small n, when the asymptotic behavior hasn't “kicked in” yet. Possibly, D might not have enough space to run M to completion on input ⟨M⟩, and hence D will miss its one opportunity to avoid M's language. So, if we aren't careful, D might end up deciding the same language that M decides, and the theorem wouldn't be proved.
We can fix this problem by modifying D to give it additional opportunities to avoid M's language. Instead of running M only when D receives input ⟨M⟩, it runs M whenever it receives an input of the form ⟨M⟩10∗; that is, an input of the form ⟨M⟩ followed by a 1 and some number of 0s. Then, if M really is running in o(f(n)) space, D will have enough space to run it to completion on input ⟨M⟩10^k for some large value of k because the asymptotic behavior must eventually kick in.
One last technical point arises. When D runs M on some string, M may get into an infinite loop while using only a finite amount of space. But D is supposed to be a decider, so we must ensure that D doesn't loop while simulating M. Any machine that runs in space o(f(n)) uses only 2^{o(f(n))} time. We modify D so that it counts the number of steps used in simulating M. If this count ever exceeds 2^{f(n)}, then D rejects.
PROOF The following O(f(n)) space algorithm D decides a language A that is not decidable in o(f(n)) space.
D = “On input w:
1. Let n be the length of w.
2. Compute f(n) using space constructibility and mark off this much tape. If later stages ever attempt to use more, reject.
3. If w is not of the form ⟨M⟩10∗ for some TM M, reject.
4. Simulate M on w while counting the number of steps used in the simulation. If the count ever exceeds 2^{f(n)}, reject.
5. If M accepts, reject. If M rejects, accept.”
In stage 4, we need to give additional details of the simulation in order to determine the amount of space used. The simulated TM M has an arbitrary tape alphabet and D has a fixed tape alphabet, so we represent each cell of M's tape with several cells on D's tape. Therefore, the simulation introduces a constant factor overhead in the space used. In other words, if M runs in g(n) space, then D uses dg(n) space to simulate M for some constant d that depends on M.
Machine D is a decider because each of its stages can run for a limited time. Let A be the language that D decides. Clearly, A is decidable in space O(f(n)) because D does so. Next, we show that A is not decidable in o(f(n)) space.
Assume to the contrary that some Turing machine M decides A in space g(n), where g(n) is o(f(n)). As mentioned earlier, D can simulate M, using space dg(n) for some constant d. Because g(n) is o(f(n)), some constant n0 exists, where dg(n) < f(n) for all n ≥ n0. Therefore, D's simulation of M will run to completion so long as the input has length n0 or more. Consider what happens
when D is run on input ⟨M⟩10^{n0}. This input is longer than n0, so the simulation in stage 4 will complete. Therefore, D will do the opposite of M on the same input. Hence M doesn't decide A, which contradicts our assumption. Therefore, A is not decidable in o(f(n)) space.
COROLLARY 9.4
For any two functions f1, f2: N → N, where f1(n) is o(f2(n)) and f2 is space constructible, SPACE(f1(n)) ⊊ SPACE(f2(n)).²
This corollary allows us to separate various space complexity classes. For example, we can show that the function n^c is space constructible for any natural number c. Hence for any two natural numbers c1 < c2, we can prove that SPACE(n^{c1}) ⊊ SPACE(n^{c2}). With a bit more work, we can show that n^c is space constructible for any rational number c > 0 and thereby extend the preceding containment to hold for any rational numbers 0 ≤ c1 < c2. Observing that two rational numbers c1 and c2 always exist between any two real numbers ε1 < ε2 such that ε1 < c1 < c2 < ε2, we obtain the following additional corollary demonstrating a fine hierarchy within the class PSPACE.
COROLLARY 9.5
For any two real numbers 0 ≤ ε1 < ε2,
SPACE(n^{ε1}) ⊊ SPACE(n^{ε2}).
We can also use the space hierarchy theorem to separate two space complexity
classes we previously encountered.
COROLLARY 9.6
NL ⊊ PSPACE.
PROOF Savitch's theorem shows that NL ⊆ SPACE(log² n), and the space hierarchy theorem shows that SPACE(log² n) ⊊ SPACE(n). Hence the corollary follows.
As we observed on page 354, this separation shows that TQBF ∉ NL because TQBF is PSPACE-complete with respect to log space reducibility.
²Recall that A ⊊ B means A is a proper (i.e., not equal) subset of B.
Now we establish the main objective of this chapter: proving the existence of problems that are decidable in principle but not in practice; that is, problems that are decidable but intractable. Each of the SPACE(n^k) classes is contained within the class SPACE(n^{log n}), which in turn is strictly contained within the class SPACE(2^n). Therefore, we obtain the following additional corollary separating PSPACE from EXPSPACE = ∪_k SPACE(2^{n^k}).
COROLLARY 9.7
PSPACE ⊊ EXPSPACE.
This corollary establishes the existence of decidable problems that are intractable, in the sense that their decision procedures must use more than polynomial space. The languages themselves are somewhat artificial, interesting only for the purpose of separating complexity classes. We use these languages to prove the intractability of other, more natural, languages after we discuss the time hierarchy theorem.
DEFINITION 9.8
A function t: N → N, where t(n) is at least O(n log n), is called time constructible if the function that maps the string 1^n to the binary representation of t(n) is computable in time O(t(n)).
In other words, t is time constructible if some O(t(n)) time TM exists that always halts with the binary representation of t(n) on its tape when started on input 1^n.
EXAMPLE 9.9
All commonly occurring functions that are at least n log n are time constructible, including the functions n log n, n√n, n^2, and 2^n.
For example, to show that n√n is time constructible, we first design a TM to count the number of 1s in binary. To do so, the TM moves a binary counter along the tape, incrementing it by 1 for every input position, until it reaches the end of the input. This part uses O(n log n) steps because O(log n) steps are used for each of the n input positions. Then, we compute ⌊n√n⌋ in binary from the binary representation of n. Any reasonable method of doing so will work in O(n log n) time because the length of the numbers involved is O(log n).
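Since ⌊n√n⌋ = ⌊√(n³)⌋, the second step reduces to an integer square root once n has been counted. A Python sketch of the whole computation, not from the text:

```python
from math import isqrt

def n_sqrt_n_of_unary(unary):
    # Count the 1s of the input 1^n, then compute floor(n * sqrt(n))
    # as the integer square root of n cubed.
    n = sum(1 for ch in unary if ch == '1')
    return isqrt(n ** 3)
```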
The time hierarchy theorem is an analog for time complexity to Theorem 9.3.
For technical reasons that will appear in its proof, the time hierarchy theorem
is slightly weaker than the one we proved for space. Whereas any space constructible asymptotic increase in the space bound enlarges the class of languages decidable therein, for time we must further increase the time bound by a logarithmic factor in order to guarantee that we can obtain additional languages. Conceivably, a tighter time hierarchy theorem is true; but at present, we don't know how to prove it. This aspect of the time hierarchy theorem arises because we measure time complexity with single-tape Turing machines. We can prove tighter time hierarchy theorems for other models of computation.
THEOREM 9.10
Time hierarchy theorem For any time constructible function t: N → N, a language A exists that is decidable in O(t(n)) time but not decidable in time o(t(n)/log t(n)).
PROOF IDEA This proof is similar to the proof of Theorem 9.3. We construct a TM D that decides a language A in time O(t(n)), whereby A cannot be decided in o(t(n)/log t(n)) time. Here, D takes an input w of the form ⟨M⟩10∗ and simulates M on input w, making sure not to use more than t(n) time. If M halts within that much time, D gives the opposite output.
The important difference in the proof concerns the cost of simulating M while, at the same time, counting the number of steps that the simulation is using. Machine D must perform this timed simulation efficiently so that D runs in O(t(n)) time while accomplishing the goal of avoiding all languages decidable in o(t(n)/log t(n)) time. For space complexity, the simulation introduced a constant factor overhead, as we observed in the proof of Theorem 9.3. For time complexity, the simulation introduces a logarithmic factor overhead. The larger overhead for time is the reason for the appearance of the 1/log t(n) factor in the statement of this theorem. If we had a way of simulating a single-tape TM by another single-tape TM for a prespecified number of steps, using only a constant factor overhead in time, we would be able to strengthen this theorem by changing o(t(n)/log t(n)) to o(t(n)). No such efficient simulation is known.
PROOF The following O(t(n)) time algorithm D decides a language A that is not decidable in o(t(n)/log t(n)) time.

D = “On input w:
1. Let n be the length of w.
2. Compute t(n) using time constructibility and store the value ⌈t(n)/log t(n)⌉ in a binary counter. Decrement this counter before each step used to carry out stages 4 and 5. If the counter ever hits 0, reject.
3. If w is not of the form ⟨M⟩10* for some TM M, reject.
4. Simulate M on w.
5. If M accepts, then reject. If M rejects, then accept.”
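The control structure of D can be sketched in code. The sketch below is an illustrative toy, not the single-tape machine of the proof: the dictionary encoding of a TM, the helper `simulate_tm`, and the way the budget is computed in the host language are all assumptions for the sketch (the real D parses ⟨M⟩10* itself and keeps the counter on its own tape).

```python
import math

def simulate_tm(tm, w, budget):
    """Run a TM (dict-based encoding, assumed for this sketch) on input w
    for at most `budget` steps. Returns 'accept', 'reject', or 'timeout'."""
    tape = dict(enumerate(w))
    state, head = tm["start"], 0
    for _ in range(budget):
        if state == tm["accept"]:
            return "accept"
        if state == tm["reject"]:
            return "reject"
        sym = tape.get(head, "_")            # '_' stands for the blank symbol
        state, write, move = tm["delta"][(state, sym)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return "timeout"

def D(tm, w, t):
    """Toy diagonalizing decider: budget about t(n)/log t(n) steps,
    then give the opposite of the simulated machine's answer."""
    n = len(w)
    budget = math.ceil(t(n) / max(1, math.log2(max(2, t(n)))))
    result = simulate_tm(tm, w, budget)
    if result == "timeout":
        return "reject"                      # counter hit 0 mid-simulation
    return "accept" if result == "reject" else "reject"
```

Note how the budget, not the simulated machine, controls D's running time: whatever M does, D stops within its O(t(n)) allowance.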
We examine each of the stages of this algorithm to determine the running time. Stages 1, 2, and 3 can be performed within O(t(n)) time.

In stage 4, every time D simulates one step of M, it takes M's current state together with the tape symbol under M's tape head and looks up M's next action in its transition function so that it can update M's tape appropriately. All three of these objects (state, tape symbol, and transition function) are stored on D's tape somewhere. If they are stored far from each other, D will need many steps to gather this information each time it simulates one of M's steps. Instead, D always keeps this information close together.

We can think of D's single tape as organized into tracks. One way to get two tracks is by storing one track in the odd positions and the other in the even positions. Alternatively, the two-track effect may be obtained by enlarging D's tape alphabet to include each pair of symbols: one from the top track and the second from the bottom track. We can get the effect of additional tracks similarly. Note that multiple tracks introduce only a constant factor overhead in time, provided that only a fixed number of tracks are used. Here, D has three tracks.

One of the tracks contains the information on M's tape, and a second contains its current state and a copy of M's transition function. During the simulation, D keeps the information on the second track near the current position of M's head on the first track. Every time M's head position moves, D shifts all the information on the second track to keep it near the head. Because the size of the information on the second track depends only on M and not on the length of the input to M, the shifting adds only a constant factor to the simulation time. Furthermore, because the required information is kept close together, the cost of looking up M's next action in its transition function and updating its tape is only a constant. Hence if M runs in g(n) time, D can simulate it in O(g(n)) time.
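The multi-track encoding is easy to picture as tuples. The helpers below are a small illustrative sketch (the names and the list-of-lists representation are assumptions, not the book's construction): each tape cell holds one symbol per track, and shifting a single track to follow the head touches only that track's fixed-size contents.

```python
# Three tracks on one tape: each cell is a [track1, track2, track3] triple,
# which corresponds to enlarging the tape alphabet to triples of symbols.
def make_tracked_tape(contents, width):
    """Track 1 holds M's tape; tracks 2 and 3 are reserved for the state +
    transition-table copy and the binary step counter, respectively."""
    tape = [["_", "_", "_"] for _ in range(width)]
    for i, sym in enumerate(contents):
        tape[i][0] = sym
    return tape

def shift_track(tape, track, delta):
    """Shift one track by `delta` cells so its data stays near the simulated
    head. The cost is proportional to that track's (fixed) data length."""
    col = [cell[track] for cell in tape]
    if delta > 0:
        col = ["_"] * delta + col[:-delta]
    elif delta < 0:
        col = col[-delta:] + ["_"] * (-delta)
    for cell, sym in zip(tape, col):
        cell[track] = sym
    return tape
```

Because tracks 2 and 3 carry only O(1) and O(log t(n)) symbols, shifting them per simulated step costs a constant and a log t(n) factor, respectively, mirroring the accounting in the proof.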
At every step in stage 4, D must decrement the step counter it originally set in stage 2. Here, D can do so without adding excessively to the simulation time by keeping the counter in binary on a third track and moving it to keep it near the present head position. This counter has a magnitude of about t(n)/log t(n), so its length is log(t(n)/log t(n)), which is O(log t(n)). Hence the cost of updating and moving it at each step adds a log t(n) factor to the simulation time, thus bringing the total running time to O(t(n)). Therefore, A is decidable in time O(t(n)).
To show that A is not decidable in o(t(n)/log t(n)) time, we use an argument similar to one used in the proof of Theorem 9.3. Assume to the contrary that TM M decides A in time g(n), where g(n) is o(t(n)/log t(n)). Here, D can simulate M, using time d·g(n) for some constant d. If the total simulation time (not counting the time to update the step counter) is at most t(n)/log t(n), the simulation will run to completion. Because g(n) is o(t(n)/log t(n)), some constant n0 exists where d·g(n) < t(n)/log t(n) for all n ≥ n0. Therefore, D's simulation of M will run to completion as long as the input has length n0 or more. Consider what happens when we run D on input ⟨M⟩10^n0. This input is longer than n0, so the simulation in stage 4 will complete. Therefore, D will do the
opposite of M on the same input. Hence M doesn't decide A, which contradicts our assumption. Therefore, A is not decidable in o(t(n)/log t(n)) time.
We establish analogs to Corollaries 9.4, 9.5, and 9.7 for time complexity.
COROLLARY 9.11
For any two functions t1, t2: N → N, where t1(n) is o(t2(n)/log t2(n)) and t2 is time constructible, TIME(t1(n)) ⊊ TIME(t2(n)).

COROLLARY 9.12
For any two real numbers 1 ≤ ε1 < ε2, we have TIME(n^ε1) ⊊ TIME(n^ε2).

COROLLARY 9.13
P ⊊ EXPTIME.
EXPONENTIAL SPACE COMPLETENESS
We can use the preceding results to demonstrate that a specific language is actually intractable. We do so in two steps. First, the hierarchy theorems tell us that a Turing machine can decide more languages in EXPSPACE than it can in PSPACE. Then, we show that a particular language concerning generalized regular expressions is complete for EXPSPACE and hence can't be decided in polynomial time or even in polynomial space.

Before getting to their generalization, let's briefly review the way we introduced regular expressions in Definition 1.52. They are built up from the atomic expressions ∅, ε, and members of the alphabet, by using the regular operations union, concatenation, and star, denoted ∪, ∘, and *, respectively. From Problem 8.8, we know that we can test the equivalence of two regular expressions in polynomial space.
We show that by allowing regular expressions with more operations than the usual regular operations, the complexity of analyzing the expressions may grow dramatically. Let ↑ be the exponentiation operation. If R is a regular expression and k is a nonnegative integer, writing R↑k is equivalent to the concatenation of R with itself k times. We also write R^k as shorthand for R↑k. In other words,

R^k = R↑k = R ∘ R ∘ ··· ∘ R   (k copies of R).
Generalized regular expressions allow the exponentiation operation in addition
to the usual regular operations. Obviously, these generalized regular expressions
still generate the same class of regular languages as do the standard regular expressions because we can eliminate the exponentiation operation by repeating the base expression. Let

EQREX↑ = {⟨Q, R⟩ | Q and R are equivalent regular expressions with exponentiation}.
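Eliminating exponentiation by repetition can be sketched directly. The ASCII syntax `(R)^{k}` with k in binary is an assumption for illustration (the book writes R↑k), and only a single, non-nested level is handled; the point is that a binary exponent of length l can expand the expression by a factor of up to 2^l.

```python
import re

def expand_exponentiation(expr):
    """Rewrite every (R)^{k} (k in binary, no nested parentheses in R)
    as k explicit copies of R, repeating until no exponent remains."""
    pattern = re.compile(r"\(([^()]*)\)\^\{([01]+)\}")
    while True:
        m = pattern.search(expr)
        if m is None:
            return expr
        base, k = m.group(1), int(m.group(2), 2)
        expr = expr[:m.start()] + base * k + expr[m.end():]
```

For example, `(ab)^{11}` (exponent 3 in binary) expands to `ababab`; an exponent of a few dozen bits would already produce an astronomically long expansion, which is why algorithm E below needs exponential space.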
To show that EQREX↑ is intractable, we demonstrate that it is complete for the class EXPSPACE. Any EXPSPACE-complete problem cannot be in PSPACE, much less in P. Otherwise, EXPSPACE would equal PSPACE, contradicting Corollary 9.7.
DEFINITION 9.14
A language B is EXPSPACE-complete if
1. B ∈ EXPSPACE, and
2. every A in EXPSPACE is polynomial time reducible to B.
THEOREM 9.15
EQREX↑ is EXPSPACE-complete.

PROOF IDEA In measuring the complexity of deciding EQREX↑, we assume that all exponents are written as binary integers. The length of an expression is the total number of symbols that it contains.

We sketch an EXPSPACE algorithm for EQREX↑. To test whether two expressions with exponentiation are equivalent, we first use repetition to eliminate exponentiation, then convert the resulting expressions to NFAs. Finally, we use an NFA equivalence testing procedure similar to the one used for deciding the complement of ALL_NFA in Example 8.4.

To show that a language A in EXPSPACE is polynomial time reducible to EQREX↑, we utilize the technique of reductions via computation histories that we introduced in Section 5.1. The construction is similar to the construction given in the proof of Theorem 5.13.

Given a TM M for A, we design a polynomial time reduction mapping an input w to a pair of expressions, R1 and R2, that are equivalent exactly when M accepts w. The expressions R1 and R2 simulate the computation of M on w. Expression R1 simply generates all strings over the alphabet consisting of symbols that may appear in computation histories. Expression R2 generates all strings that are not rejecting computation histories. So if the TM accepts its input, no rejecting computation histories exist, and expressions R1 and R2 generate the same language. Recall that a rejecting computation history is the sequence of configurations that the machine enters in a rejecting computation on the input. See page 220 in Section 5.1 for a review of computation histories.
The difficulty in this proof is that the size of the expressions constructed must be polynomial in n (so that the reduction can run in polynomial time), whereas the simulated computation may have exponential length. The exponentiation operation is useful here to represent the long computation with a relatively short expression.
PROOF First, we present a nondeterministic algorithm for testing whether two NFAs are inequivalent.

N = “On input ⟨N1, N2⟩, where N1 and N2 are NFAs:
1. Place a marker on each of the start states of N1 and N2.
2. Repeat 2^(q1+q2) times, where q1 and q2 are the numbers of states in N1 and N2:
3.   Nondeterministically select an input symbol and change the positions of the markers on the states of N1 and N2 to simulate reading that symbol.
4. If at any point a marker was placed on an accept state of one of the finite automata and not on any accept state of the other finite automaton, accept. Otherwise, reject.”
If automata N1 and N2 are equivalent, N clearly rejects because it only accepts when it determines that one machine accepts a string that the other does not accept. If the automata are not equivalent, some string is accepted by one machine and not by the other. Some such string must be of length at most 2^(q1+q2). Otherwise, consider using the shortest such string as the sequence of nondeterministic choices. Only 2^(q1+q2) different ways exist to place markers on the states of N1 and N2; so in a longer string, the positions of the markers would repeat. By removing the portion of the string between the repetitions, a shorter such string would be obtained. Hence algorithm N would guess this string among its nondeterministic choices and would accept. Thus, N operates correctly.
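A deterministic version of N amounts to tracking, for each NFA, the set of states a marker could occupy and searching over pairs of marker sets. The sketch below uses an assumed encoding (each NFA is a tuple `(states, delta, start, accepts)` with `delta[(q, a)]` a set of successor states) and explores the pairs breadth-first; since only 2^(q1+q2) pairs exist, the search terminates.

```python
from collections import deque

def nfas_inequivalent(n1, n2, alphabet):
    """Return True iff some string is accepted by exactly one of the NFAs.
    Marker sets play the role of the markers in algorithm N."""
    def step(delta, marked, a):
        return frozenset(q2 for q in marked for q2 in delta.get((q, a), ()))
    def differ(m1, m2):
        # One machine has a marker on an accept state and the other doesn't.
        return bool(m1 & n1[3]) != bool(m2 & n2[3])
    start = (frozenset({n1[2]}), frozenset({n2[2]}))
    seen, queue = {start}, deque([start])
    while queue:
        m1, m2 = queue.popleft()
        if differ(m1, m2):
            return True                     # a distinguishing string exists
        for a in alphabet:
            nxt = (step(n1[1], m1, a), step(n2[1], m2, a))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False                            # same language
```

This brute-force search uses exponential space in the worst case; the nondeterministic N uses only linear space, which is what the appeal to Savitch's theorem below exploits.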
Algorithm N runs in nondeterministic linear space. Thus, Savitch's theorem provides a deterministic O(n^2) space algorithm for this problem. Next, we use the deterministic form of this algorithm to design the following algorithm E that decides EQREX↑.

E = “On input ⟨R1, R2⟩, where R1 and R2 are regular expressions with exponentiation:
1. Convert R1 and R2 to equivalent regular expressions B1 and B2 that use repetition instead of exponentiation.
2. Convert B1 and B2 to equivalent NFAs N1 and N2, using the conversion procedure given in the proof of Lemma 1.55.
3. Use the deterministic version of algorithm N to determine whether N1 and N2 are equivalent.”
Algorithm E obviously is correct. To analyze its space complexity, we observe that using repetition to replace exponentiation may increase the length of an expression by a factor of 2^l, where l is the sum of the lengths of the exponents. Thus, expressions B1 and B2 have a length of at most n·2^n, where n is the input length. The conversion procedure of Lemma 1.55 increases the size linearly, and hence NFAs N1 and N2 have at most O(n·2^n) states. Thus, with input size O(n·2^n), the deterministic version of algorithm N uses space O((n·2^n)^2) = O(n^2·2^(2n)). Hence EQREX↑ is decidable in exponential space.
Next, we show that EQREX↑ is EXPSPACE-hard. Let A be a language that is decided by TM M running in space 2^(n^k) for some constant k. The reduction maps an input w to a pair of regular expressions, R1 and R2. Expression R1 is ∆*, where, if Γ and Q are M's tape alphabet and states, ∆ = Γ ∪ Q ∪ {#} is the alphabet consisting of all symbols that may appear in a computation history. We construct expression R2 to generate all strings that aren't rejecting computation histories of M on w. Of course, M accepts w iff M on w has no rejecting computation histories. Therefore, the two expressions are equivalent iff M accepts w. The construction is as follows.
A rejecting computation history for M on w is a sequence of configurations separated by # symbols. We use our standard encoding of configurations whereby a symbol corresponding to the current state is placed to the left of the current head position. We assume that all configurations have length 2^(n^k) and are padded on the right by blank symbols if they otherwise would be too short. The first configuration in a rejecting computation history is the start configuration of M on w. The last configuration is a rejecting configuration. Each configuration must follow from the preceding one according to the rules specified in the transition function.
A string may fail to be a rejecting computation in several ways: It may fail to start or end properly, or it may be incorrect somewhere in the middle. Expression R2 equals Rbad-start ∪ Rbad-window ∪ Rbad-reject, where each subexpression corresponds to one of the three ways a string may fail.

We construct expression Rbad-start to generate all strings that fail to start with the start configuration C1 of M on w, as follows. Configuration C1 looks like q0 w1 w2 ··· wn ␣␣···␣ #. We write Rbad-start as the union of several subexpressions to handle each part of C1:

Rbad-start = S0 ∪ S1 ∪ ··· ∪ Sn ∪ Sb ∪ S#.
Expression S0 generates all strings that don't start with q0. We let S0 be the expression (∆−q0)∆*. The notation ∆−q0 is shorthand for writing the union of all symbols in ∆ except q0.

Expression S1 generates all strings that don't contain w1 in the second position. We let S1 be ∆(∆−w1)∆*. In general, for 1 ≤ i ≤ n, expression Si is ∆^i(∆−wi)∆*. Thus, Si generates all strings that contain any symbols in the first i positions, any symbol except wi in position i+1, and any string of symbols following position i+1. Note that we have used the exponentiation operation here. Actually, at this point, exponentiation is more of a convenience than a necessity because we could have instead repeated the symbol ∆ i times without excessively increasing the length of the expression. But in the next subexpression, exponentiation is crucial to keeping the size polynomial.
Expression Sb generates all strings that fail to contain a blank symbol in some position n+2 through 2^(n^k). We could introduce subexpressions S_{n+2} through S_{2^(n^k)} for this purpose, but then expression Rbad-start would have exponential length. Instead, we let

Sb = ∆^(n+1) (∆ ∪ ε)^(2^(n^k)−n−2) (∆−␣) ∆*.

Thus, Sb generates strings that contain any symbols in the first n+1 positions, any symbols in the next t positions, where t can range from 0 to 2^(n^k)−n−2, and any symbol except blank in the next position.

Finally, S# generates all strings that don't have a # symbol in position 2^(n^k)+1. Let S# be ∆^(2^(n^k)) (∆−#) ∆*.
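Assembled as strings, the pieces of Rbad-start stay short because exponents are written in binary. The sketch below is illustrative only: `D` stands for ∆, `e` for ε, `_` for the blank symbol, and `delta_minus(x)` for the union of all symbols except x; none of this notation is from the book.

```python
def build_bad_start(w, conf_len, delta_minus):
    """Assemble R_bad-start = S0 U S1 U ... U Sn U Sb U S# as a string.
    Binary exponents keep the result short even when conf_len is huge."""
    n = len(w)
    def dexp(e):                              # D^{e} with e written in binary
        return "D^{%s}" % bin(e)[2:]
    pieces = [delta_minus("q0") + "D*"]                       # S0
    for i, wi in enumerate(w, start=1):                       # S1 .. Sn
        pieces.append(dexp(i) + delta_minus(wi) + "D*")
    pieces.append(dexp(n + 1) + "(D|e)^{%s}" % bin(conf_len - n - 2)[2:]
                  + delta_minus("_") + "D*")                  # Sb
    pieces.append(dexp(conf_len) + delta_minus("#") + "D*")   # S#
    return " U ".join(pieces)
```

Even with a configuration length of 2^16, the assembled expression is only a few dozen symbols long, which is the polynomial-size property the reduction depends on.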
Now that we have completed the construction of Rbad-start, we turn to the next piece, Rbad-reject. It generates all strings that don't end properly; that is, strings that fail to contain a rejecting configuration. Any rejecting configuration contains the state qreject, so we let

Rbad-reject = (∆−qreject)*.

Thus, Rbad-reject generates all strings that don't contain qreject.
Finally, we construct Rbad-window, the expression that generates all strings whereby one configuration does not properly lead to the next configuration. Recall that in the proof of the Cook–Levin theorem, we determined that one configuration legally yields another whenever every three consecutive symbols in the first configuration correctly yield the corresponding three symbols in the second configuration according to the transition function. Hence, if one configuration fails to yield another, the error will be apparent from an examination of the appropriate six symbols. We use this idea to construct Rbad-window:

Rbad-window = ⋃_{bad(abc, def)} ∆* abc ∆^(2^(n^k)−2) def ∆*,

where bad(abc, def) means that abc doesn't yield def according to the transition function. The union is taken only over such symbols a, b, c, d, e, and f in ∆. The following figure illustrates the placement of these symbols in a computation history.

FIGURE 9.16
Corresponding places in adjacent configurations
To calculate the length of R2, we determine the length of the exponents that appear in it. Several exponents of magnitude roughly 2^(n^k) appear, and their total length in binary is O(n^k). Therefore, the length of R2 is polynomial in n.
9.2
RELATIVIZATION
The proof that EQREX↑ is intractable rests on the diagonalization method. Why don't we show that SAT is intractable in the same way? Possibly we could use diagonalization to show that a nondeterministic polynomial time TM can decide a language that is provably not in P. In this section, we introduce the method of relativization to give strong evidence against the possibility of solving the P versus NP question by using a proof by diagonalization.

In the relativization method, we modify our model of computation by giving the Turing machine certain information essentially for “free.” Depending on which information is actually provided, the TM may be able to solve some problems more easily than before.
For example, suppose that we grant the TM the ability to solve the satisfiability problem in a single step, for any size Boolean formula. Never mind how this feat is accomplished; imagine an attached “black box” that gives the machine this capability. We call the black box an oracle to emphasize that it doesn't necessarily correspond to any physical device. Obviously, the machine could use the oracle to solve any NP problem in polynomial time, regardless of whether P equals NP, because every NP problem is polynomial time reducible to the satisfiability problem. Such a TM is said to be computing relative to the satisfiability problem; hence the term relativization.
In general, an oracle can correspond to any particular language, not just the satisfiability problem. The oracle allows the TM to test membership in the language without actually having to compute the answer itself. We formalize this notion shortly. You may recall that we introduced oracles in Section 6.3. There, we defined them for the purpose of classifying problems according to the degree of unsolvability. Here, we use oracles to understand better the power of the diagonalization method.
DEFINITION 9.17
An oracle for a language A is a device that is capable of reporting whether any string w is a member of A. An oracle Turing machine M^A is a modified Turing machine that has the additional capability of querying an oracle for A. Whenever M^A writes a string on a special oracle tape, it is informed whether that string is a member of A in a single computation step.

Let P^A be the class of languages decidable with a polynomial time oracle Turing machine that uses oracle A. Define the class NP^A similarly.
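In code, an oracle TM is simply a decision procedure parameterized by a membership-query callable that counts as one step per call. The sketch below is a toy model (the names and the stand-in oracle are assumptions; a real oracle need not be computable at all, let alone cheap):

```python
def with_oracle(decider, oracle):
    """Model an oracle TM M^A: `decider(w, oracle)` may call
    `oracle(query) -> bool` as if it were a single computation step."""
    return lambda w: decider(w, oracle)

# Stand-in oracle for a toy language A = {strings of even length}.
even_length_oracle = lambda s: len(s) % 2 == 0

def decider(w, oracle):
    # One oracle query settles membership; the surrounding machine
    # contributes only polynomial (here constant) extra work.
    return oracle(w)

M_A = with_oracle(decider, even_length_oracle)
```

Swapping in a SAT oracle for `even_length_oracle` gives the machine of the preceding discussion: any NP problem then reduces, in polynomial time, to preparing one query.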
EXAMPLE 9.18
As we mentioned earlier, polynomial time computation relative to the satisfiability problem contains all of NP. In other words, NP ⊆ P^SAT. Furthermore, coNP ⊆ P^SAT because P^SAT, being a deterministic complexity class, is closed under complementation.
EXAMPLE 9.19
Just as P^SAT contains languages that we believe are not in P, the class NP^SAT contains languages that we believe are not in NP. The complement of the language MIN-FORMULA that we defined in Problem 7.46 on page 328 provides one such example.

The complement of MIN-FORMULA doesn't seem to be in NP (though whether it actually belongs to NP is not known). However, the complement of MIN-FORMULA is in NP^SAT because a nondeterministic polynomial time oracle Turing machine with a SAT oracle can test whether φ is a member, as follows. First, the inequivalence problem for two Boolean formulas is solvable in NP, and hence the equivalence problem is in coNP, because a nondeterministic machine can guess the assignment on which the two formulas have different values. Then, the nondeterministic oracle machine for the complement of MIN-FORMULA nondeterministically guesses the smaller equivalent formula, tests whether it actually is equivalent using the SAT oracle, and accepts if it is.
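The one nontrivial step in this example, settling formula equivalence with a single SAT query, can be sketched as follows. The brute-force `brute_force_sat` merely stands in for the free oracle; the XOR trick is the point: two formulas disagree on some assignment iff f ⊕ g is satisfiable.

```python
from itertools import product

def brute_force_sat(f, nvars):
    """Stand-in for the SAT oracle (the oracle answers in one step;
    this brute-force version exists only so the sketch runs)."""
    return any(f(*bits) for bits in product([False, True], repeat=nvars))

def equivalent(f, g, nvars):
    """f and g are equivalent iff f XOR g is unsatisfiable, so one
    oracle query decides the (coNP) equivalence question."""
    return not brute_force_sat(lambda *x: f(*x) != g(*x), nvars)
```

The guessed smaller formula from the example would be checked with exactly this kind of single equivalence query.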
LIMITS OF THE DIAGONALIZATION METHOD
The next theorem demonstrates oracles A and B for which P^A and NP^A are provably different, and P^B and NP^B are provably equal. These two oracles are important because their existence indicates that we are unlikely to resolve the P versus NP question by using the diagonalization method.

At its core, the diagonalization method is a simulation of one Turing machine by another. The simulation is done so that the simulating machine can determine the behavior of the other machine and then behave differently. Suppose that both of these Turing machines were given identical oracles. Then, whenever the simulated machine queries the oracle, so can the simulator; and therefore, the simulation can proceed as before. Consequently, any theorem proved about Turing machines by using only the diagonalization method would still hold if both machines were given the same oracle.

In particular, if we could prove that P and NP were different by diagonalizing, we could conclude that they are different relative to any oracle as well. But P^B and NP^B are equal, so that conclusion is false. Hence diagonalization isn't sufficient to separate these two classes. Similarly, no proof that relies on a simple simulation could show that the two classes are the same because that would
show that they are the same relative to any oracle; but in fact, P^A and NP^A are different.
THEOREM 9.20
1. An oracle A exists whereby P^A ≠ NP^A.
2. An oracle B exists whereby P^B = NP^B.

PROOF IDEA Exhibiting oracle B is easy. Let B be any PSPACE-complete problem such as TQBF.

We exhibit oracle A by construction. We design A so that a certain language L_A in NP^A provably requires brute-force search, and so L_A cannot be in P^A. Hence we can conclude that P^A ≠ NP^A. The construction considers every polynomial time oracle machine in turn and ensures that each fails to decide the language L_A.
PROOF Let B be TQBF. We have the series of containments

NP^TQBF ⊆ NPSPACE ⊆ PSPACE ⊆ P^TQBF.

The first containment holds because we can convert the nondeterministic polynomial time oracle TM to a nondeterministic polynomial space machine that computes the answers to queries regarding TQBF instead of using the oracle. The second follows from Savitch's theorem. The third holds because TQBF is PSPACE-complete. Hence we conclude that P^TQBF = NP^TQBF.

Next, we show how to construct oracle A. For any oracle A, let L_A be the collection of all strings for which a string of equal length appears in A. Thus,

L_A = {w | ∃x ∈ A [|x| = |w|]}.

Obviously, for any A, the language L_A is in NP^A.
To show L_A is not in P^A, we design A as follows. Let M1, M2, ... be a list of all polynomial time oracle TMs. We may assume for simplicity that Mi runs in time n^i. The construction proceeds in stages, where stage i constructs a part of A, which ensures that Mi^A doesn't decide L_A. We construct A by declaring that certain strings are in A and others aren't in A. Each stage determines the status of only a finite number of strings. Initially, we have no information about A. We begin with stage 1.

Stage i. So far, a finite number of strings have been declared to be in or out of A. We choose n greater than the length of any such string and large enough that 2^n is greater than n^i, the running time of Mi. We show how to extend our information about A so that Mi^A accepts 1^n whenever that string is not in L_A.
We run Mi on input 1^n and respond to its oracle queries as follows. If Mi queries a string y whose status has already been determined, we respond consistently. If y's status is undetermined, we respond NO to the query and declare y to be out of A. We continue the simulation of Mi until it halts.

Now consider the situation from Mi's perspective. If it finds a string of length n in A, it should accept because it knows that 1^n is in L_A. If Mi determines that all strings of length n aren't in A, it should reject because it knows that 1^n is not in L_A. However, it doesn't have enough time to ask about all strings of length n, and we have answered NO to each of the queries it has made. Hence when Mi halts and must decide whether to accept or reject, it doesn't have enough information to be sure that its decision is correct.
Our objective is to ensure that its decision is notcorrect. We do so by observ-
ing its decision and then extending Aso that the reverse is true. Specifically, if
Miaccepts 1n,w ed e c l a r ea l lt h er e m a i n i n gs t r i n g so fl e n g t h nto be out of A
and so determine that 1nis not in LA.I fMirejects 1n,w efi n das t r i n go fl e n g t h
nthatMihasn’t queried and declare that string to be in Ato guarantee that 1n
is in LA.S u c h a s t r i n g m u s t e x i s t b e c a u s e Miruns for nisteps, which is fewer
than 2n,t h et o t a ln u m b e ro fs t r i n g so fl e n g t h n.E i t h e r w a y , w e h a v e e n s u r e d
thatMA
idoesn’t decide LA.
We finish stage iby arbitrarily declaring that any string of length at most n,
whose status remains undetermined at this point, is out of A.S t a g e iis com-
pleted and we proceed with stage i+1.
We have shown that no polynomial time oracle TM decides L_A with oracle A, thereby proving the theorem.

In summary, the relativization method tells us that to solve the P versus NP question, we must analyze computations, not just simulate them. In Section 9.3, we introduce one approach that may lead to such an analysis.
9.3
CIRCUIT COMPLEXITY
Computers are built from electronic devices wired together in a design called a digital circuit. We can also simulate theoretical models, such as Turing machines, with the theoretical counterpart to digital circuits, called Boolean circuits. Two purposes are served by establishing the connection between TMs and Boolean circuits. First, researchers believe that circuits provide a convenient computational model for attacking the P versus NP and related questions. Second, circuits provide an alternative proof of the Cook–Levin theorem that SAT is NP-complete. We cover both topics in this section.
DEFINITION 9.21
A Boolean circuit is a collection of gates and inputs connected by wires. Cycles aren't permitted. Gates take three forms: AND gates, OR gates, and NOT gates, as shown schematically in the following figure.
FIGURE 9.22
An AND gate, an OR gate, and a NOT gate
The wires in a Boolean circuit carry the Boolean values 0 and 1. The gates are simple processors that compute the Boolean functions AND, OR, and NOT. The AND function outputs 1 if both of its inputs are 1 and outputs 0 otherwise. The OR function outputs 0 if both of its inputs are 0 and outputs 1 otherwise. The NOT function outputs the opposite of its input; in other words, it outputs a 1 if its input is 0 and a 0 if its input is 1. The inputs are labeled x_1, ..., x_n. One of the gates is designated the output gate. The following figure depicts a Boolean circuit.
FIGURE 9.23
An example of a Boolean circuit
A Boolean circuit computes an output value from a setting of the inputs by propagating values along the wires and computing the function associated with the respective gates until the output gate is assigned a value. The following figure shows a Boolean circuit computing a value from a setting of its inputs.
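The propagation just described can be sketched concretely. The encoding below is an illustrative assumption, not notation from the text: each gate is named and listed after the wires it reads, which is always possible because cycles aren't permitted.

```python
# Minimal sketch of Boolean-circuit evaluation. Each gate is a pair
# (name, (op, input wire names)), listed so every gate follows its inputs.

def evaluate(inputs, gates, output):
    """Propagate values along the wires until the output gate is set."""
    values = dict(inputs)                  # wire name -> 0 or 1
    for name, (op, args) in gates:
        a = [values[x] for x in args]
        if op == "AND":
            values[name] = 1 if all(a) else 0
        elif op == "OR":
            values[name] = 1 if any(a) else 0
        elif op == "NOT":
            values[name] = 1 - a[0]
    return values[output]

# A circuit for (x1 AND x2) OR (NOT x3):
gates = [("g1", ("AND", ["x1", "x2"])),
         ("g2", ("NOT", ["x3"])),
         ("g3", ("OR",  ["g1", "g2"]))]
print(evaluate({"x1": 1, "x2": 1, "x3": 1}, gates, "g3"))  # 1
```

The single pass works precisely because the gate list respects the wiring order; a topological sort would produce such an order for any acyclic circuit.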
FIGURE 9.24
An example of a Boolean circuit computing
We use functions to describe the input/output behavior of Boolean circuits. To a Boolean circuit C with n input variables, we associate a function f_C: {0,1}^n → {0,1}, where if C outputs b when its inputs x_1, ..., x_n are set to a_1, ..., a_n, we write f_C(a_1, ..., a_n) = b. We say that C computes the function f_C. We sometimes consider Boolean circuits that have multiple output gates. A circuit with k output gates computes a function whose range is {0,1}^k.
EXAMPLE 9.25
The n-input parity function parity_n: {0,1}^n → {0,1} outputs 1 if an odd number of 1s appear in the input variables. The circuit in Figure 9.26 computes parity_4, the parity function on 4 variables.
FIGURE 9.26
A Boolean circuit that computes the parity function on 4 variables
We plan to use circuits to test membership in languages once they have been suitably encoded into {0,1}. One problem that occurs is that any particular circuit can handle only inputs of some fixed length, whereas a language may contain strings of different lengths. So instead of using a single circuit to test language membership, we use an entire family of circuits, one for each input length, to perform this task. We formalize this notion in the following definition.
DEFINITION 9.27
A circuit family C is an infinite list of circuits, (C_0, C_1, C_2, ...), where C_n has n input variables. We say that C decides a language A over {0,1} if for every string w,

w ∈ A iff C_n(w) = 1,

where n is the length of w.
The size of a circuit is the number of gates that it contains. Two circuits are equivalent if they have the same input variables and output the same value on every input assignment. A circuit is size minimal if no smaller circuit is equivalent to it. The problem of minimizing circuits has obvious engineering applications but is very difficult to solve in general. Even the problem of testing whether a particular circuit is minimal does not appear to be solvable in P or in NP. A circuit family is minimal if every C_i on the list is a minimal circuit. The size complexity of a circuit family (C_0, C_1, C_2, ...) is the function f: N → N, where f(n) is the size of C_n. We may simply refer to the complexity of a circuit family, instead of the size complexity, when it is clear that we are speaking about size.
The depth of a circuit is the length (number of wires) of the longest path
from an input variable to the output gate. We define depth minimal circuits
and circuit families, and the depth complexity of circuit families, as we did with
circuit size. Circuit depth complexity is of particular interest in Section 10.5
concerning parallel computation.
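With a circuit encoded as a list of gates in dependency order (an illustrative assumption, not the text's notation), the size and depth measures just defined are easy to compute: size is the gate count, and depth is the number of wires on the longest path from an input variable to the output gate.

```python
def size_and_depth(inputs, gates, output):
    """Size = number of gates; depth = longest path, counted in wires,
    from an input variable to the output gate."""
    depth = {x: 0 for x in inputs}          # input variables sit at depth 0
    for name, (_, args) in gates:           # gates listed after their inputs
        depth[name] = 1 + max(depth[x] for x in args)
    return len(gates), depth[output]

# (x1 AND x2) OR (NOT x3): three gates; the longest path x1 -> g1 -> g3
# crosses two wires.
gates = [("g1", ("AND", ["x1", "x2"])),
         ("g2", ("NOT", ["x3"])),
         ("g3", ("OR",  ["g1", "g2"]))]
print(size_and_depth(["x1", "x2", "x3"], gates, "g3"))  # (3, 2)
```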
DEFINITION 9.28
The circuit complexity of a language is the size complexity of a min-
imal circuit family for that language. The circuit depth complexity
of a language is defined similarly, using depth instead of size.
EXAMPLE 9.29
We can easily generalize Example 9.25 to give circuits that compute the parity function on n variables with O(n) gates. One way to do so is to build a binary tree of gates that compute the XOR function, where the XOR function is the
same as the parity_2 function, and then implement each XOR gate with two NOTs, two ANDs, and one OR, as we did in that earlier example.
Let A be the language of strings that contain an odd number of 1s. Then A has circuit complexity O(n).
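The O(n) gate count can be checked concretely. The sketch below (illustrative, not from the text) evaluates parity through a binary tree of XORs while tallying gates, charging five gates per XOR (two NOTs, two ANDs, and one OR) as in the construction above; an n-input tree uses n − 1 XORs, hence 5(n − 1) gates.

```python
def xor_gate(a, b, count):
    # One XOR built from two NOTs, two ANDs, and one OR, as in Example 9.25:
    # a XOR b = (a AND (NOT b)) OR ((NOT a) AND b)
    count[0] += 5
    return (a & (1 - b)) | ((1 - a) & b)

def parity_circuit(bits, count):
    """Binary tree of XOR gates computing the parity of the input bits."""
    if len(bits) == 1:
        return bits[0]
    mid = len(bits) // 2
    left = parity_circuit(bits[:mid], count)
    right = parity_circuit(bits[mid:], count)
    return xor_gate(left, right, count)

count = [0]
print(parity_circuit([1, 0, 1, 1], count))  # three 1s (odd) -> 1
print(count[0])  # n - 1 = 3 XORs, i.e. 15 gates for n = 4
```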
The circuit complexity of a language is related to its time complexity. Any
language with small time complexity also has small circuit complexity, as the
following theorem shows.
THEOREM 9.30
Let t: N → N be a function, where t(n) ≥ n. If A ∈ TIME(t(n)), then A has circuit complexity O(t^2(n)).
This theorem gives an approach to proving that P ≠ NP whereby we attempt to show that some language in NP has more than polynomial circuit complexity.
PROOF IDEA Let M be a TM that decides A in time t(n). (For simplicity, we ignore the constant factor in O(t(n)), the actual running time of M.) For each n, we construct a circuit C_n that simulates M on inputs of length n. The gates of C_n are organized in rows, one for each of the t(n) steps in M's computation on an input of length n. Each row of gates represents the configuration of M at the corresponding step. Each row is wired into the previous row so that it can calculate its configuration from the previous row's configuration. We modify M so that the input is encoded into {0,1}. Moreover, when M is about to accept, it moves its head onto the leftmost tape cell and writes the ␣ symbol on that cell prior to entering the accept state. That way, we can designate a gate in the final row of the circuit to be the output gate.
PROOF Let M = (Q, Σ, Γ, δ, q_0, q_accept, q_reject) decide A in time t(n), and let w be an input of length n to M. Define a tableau for M on w to be a t(n) × t(n) table whose rows are configurations of M. The top row of the tableau contains the start configuration of M on w. The ith row contains the configuration at the ith step of the computation.
For convenience, we modify the representation format for configurations in this proof. Instead of the old format, described on page 168, where the state appears to the left of the symbol that the head is reading, we represent both the state and the tape symbol under the tape head by a single composite character. For example, if M is in state q and its tape contains the string 1011 with the head reading the second symbol from the left, the old format would be 1q011 and the new format would be 1(q,0)11, where the composite character (q,0) represents both q, the state, and 0, the symbol under the head.
Each entry of the tableau can contain a tape symbol (member of Γ) or a combination of a state and a tape symbol (member of Q × Γ). The entry at the ith row and jth column of the tableau is cell[i, j]. The top row of the tableau then is cell[1,1], ..., cell[1, t(n)] and contains the starting configuration.
We make two assumptions about TM M in defining the notion of a tableau. First, as we mentioned in the proof idea, M accepts only when its head is on the leftmost tape cell and that cell contains the ␣ symbol. Second, once M has halted, it stays in the same configuration for all future time steps. So by looking at the leftmost cell in the final row of the tableau, cell[t(n), 1], we can determine whether M has accepted. The following figure shows part of a tableau for M on the input 0010.
FIGURE 9.31
A tableau for M on input 0010
The content of each cell is determined by certain cells in the preceding row. If we know the values at cell[i−1, j−1], cell[i−1, j], and cell[i−1, j+1], we can obtain the value at cell[i, j] with M's transition function. For example, the following figure magnifies a portion of the tableau in Figure 9.31. The three top symbols, 0, 0, and 1, are tape symbols without states, so the middle symbol must remain a 0 in the next row, as shown.
Now we can begin to construct the circuit C_n. It has several gates for each cell in the tableau. These gates compute the value at a cell from the values of the three cells that affect it.
To make the construction easier to describe, we add lights that show the output of some of the gates in the circuit. The lights are for illustrative purposes only and don't affect the operation of the circuit.

Let k be the number of elements in Γ ∪ (Q × Γ). We create k lights for each cell in the tableau, one light for each member of Γ and one light for each member of (Q × Γ), for a total of k·t^2(n) lights. We call these lights light[i, j, s], where 1 ≤ i, j ≤ t(n) and s ∈ Γ ∪ (Q × Γ). The condition of the lights in a cell indicates the contents of that cell. If light[i, j, s] is on, cell[i, j] contains the symbol s. Of course, if the circuit is constructed properly, only one light would be on per cell.
Let's pick one of the lights, say, light[i, j, s] in cell[i, j]. This light should be on if that cell contains the symbol s. We consider the three cells that can affect cell[i, j] and determine which of their settings cause cell[i, j] to contain s. This determination can be made by examining the transition function δ.
Suppose that if the cells cell[i−1, j−1], cell[i−1, j], and cell[i−1, j+1] contain a, b, and c, respectively, then cell[i, j] contains s, according to δ. We wire the circuit so that if light[i−1, j−1, a], light[i−1, j, b], and light[i−1, j+1, c] are on, then so is light[i, j, s]. We do so by connecting the three lights at the i−1 level to an AND gate whose output is connected to light[i, j, s].

In general, several different settings (a_1, b_1, c_1), (a_2, b_2, c_2), ..., (a_l, b_l, c_l) of cell[i−1, j−1], cell[i−1, j], and cell[i−1, j+1] may cause cell[i, j] to contain s. In this case, we wire the circuit so that for each setting a_i, b_i, c_i, the respective lights are connected with an AND gate, and all the AND gates are connected with an OR gate. This circuitry is illustrated in the following figure.
FIGURE 9.32
Circuitry for one light
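The OR-of-ANDs wiring for one light can be mirrored in code. In this sketch (a hypothetical encoding, not from the text), each of the three predecessor cells is given as a map from symbols to whether that symbol's light is on, and the list of settings is assumed to have been extracted from δ beforehand.

```python
def light_value(settings, left, mid, right):
    """One light's circuitry: each setting (a, b, c) feeds an AND gate over
    light[i-1,j-1,a], light[i-1,j,b], light[i-1,j+1,c]; all the AND gates
    feed a single OR gate."""
    ands = [left[a] & mid[b] & right[c] for (a, b, c) in settings]
    return 1 if any(ands) else 0

# Hypothetical example: suppose delta dictates that the cell holds '0'
# exactly when the predecessors read ('0', '0', '1').
settings = [("0", "0", "1")]
left  = {"0": 1, "1": 0}
mid   = {"0": 1, "1": 0}
right = {"0": 0, "1": 1}
print(light_value(settings, left, mid, right))  # 1
```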
The circuitry just described is repeated for each light, with a few exceptions at the boundaries. Each cell at the left boundary of the tableau, that is, cell[i, 1] for 1 ≤ i ≤ t(n), has only two preceding cells that affect its contents. The cells at the right boundary are similar. In these cases, we modify the circuitry to simulate the behavior of TM M in this situation.

The cells in the first row have no predecessors and are handled in a special way. These cells contain the start configuration, and their lights are wired to the input variables. Thus, light[1, 1, (q_0, 1)] is connected to input w_1 because the start configuration begins with the start state symbol q_0 and the head starts over w_1. Similarly, light[1, 1, (q_0, 0)] is connected through a NOT gate to input w_1. Furthermore, light[1, 2, 1], ..., light[1, n, 1] are connected to inputs w_2, ..., w_n, and light[1, 2, 0], ..., light[1, n, 0] are connected through NOT gates to inputs w_2, ..., w_n because the input string w determines these values. Additionally, light[1, n+1, ␣], ..., light[1, t(n), ␣] are on because the remaining cells in the first row correspond to positions on the tape that initially are blank (␣). Finally, all other lights in the first row are off.
So far, we have constructed a circuit that simulates M through its t(n)th step. All that remains to be done is to assign one of the gates to be the output gate of the circuit. We know that M accepts w if it is in an accept state q_accept on a cell containing ␣ at the left-hand end of the tape at step t(n). So we designate the output gate to be the one attached to light[t(n), 1, (q_accept, ␣)]. This completes the proof of the theorem.
Besides linking circuit complexity and time complexity, Theorem 9.30 yields an alternative proof of Theorem 7.27, the Cook–Levin theorem, as follows. We say that a Boolean circuit is satisfiable if some setting of the inputs causes the circuit to output 1. The circuit-satisfiability problem tests whether a circuit is satisfiable. Let

CIRCUIT-SAT = {⟨C⟩ | C is a satisfiable Boolean circuit}.

Theorem 9.30 shows that Boolean circuits are capable of simulating Turing machines. We use that result to show that CIRCUIT-SAT is NP-complete.
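Before turning to the NP-completeness proof, note that CIRCUIT-SAT has an obvious exponential-time decider: try every setting of the inputs. The sketch below is illustrative only; it treats a circuit as a Python function on its n Boolean inputs.

```python
from itertools import product

def circuit_sat(circuit, n):
    """Satisfiable iff some setting of the n inputs makes the circuit
    output 1. Tries all 2^n settings, so this takes exponential time."""
    return any(circuit(*bits) == 1 for bits in product((0, 1), repeat=n))

# (x1 AND x2) AND (NOT x1) is unsatisfiable; x1 OR x2 is satisfiable.
print(circuit_sat(lambda x1, x2: (x1 & x2) & (1 - x1), 2))  # False
print(circuit_sat(lambda x1, x2: x1 | x2, 2))               # True
```

A guess-and-verify version of the same loop, checking one nondeterministically chosen setting, is what places CIRCUIT-SAT in NP.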
THEOREM 9.33
CIRCUIT-SAT is NP-complete.
PROOF To prove this theorem, we must show that CIRCUIT-SAT is in NP, and that any language A in NP is reducible to CIRCUIT-SAT. The first is obvious. To do the second, we must give a polynomial time reduction f that maps
strings to circuits, where

f(w) = ⟨C⟩

implies that

w ∈ A ⟺ Boolean circuit C is satisfiable.

Because A is in NP, it has a polynomial time verifier V whose input has the form ⟨x, c⟩, where c may be the certificate showing that x is in A. To construct f, we obtain the circuit simulating V using the method in Theorem 9.30. We fill in the inputs to the circuit that correspond to x with the symbols of w. The only remaining inputs to the circuit correspond to the certificate c. We call this circuit C and output it.

If C is satisfiable, a certificate exists, so w is in A. Conversely, if w is in A, a certificate exists, so C is satisfiable.

To show that this reduction runs in polynomial time, we observe that in the proof of Theorem 9.30, the construction of the circuit can be done in time that is polynomial in n. The running time of the verifier is n^k for some k, so the size of the circuit constructed is O(n^{2k}). The structure of the circuit is quite simple (actually, it is highly repetitious), so the running time of the reduction is O(n^{2k}).
Now we show that 3SAT is NP-complete, completing the alternative proof of the Cook–Levin theorem.
THEOREM 9.34
3SAT is NP-complete.
PROOF IDEA 3SAT is obviously in NP. We show that all languages in NP reduce to 3SAT in polynomial time. We do so by reducing CIRCUIT-SAT to 3SAT in polynomial time. The reduction converts a circuit C to a formula φ, whereby C is satisfiable iff φ is satisfiable. The formula contains one variable for each variable and each gate in the circuit.

Conceptually, the formula simulates the circuit. A satisfying assignment for φ contains a satisfying assignment to C. It also contains the values at each of C's gates in C's computation on its satisfying assignment. In effect, φ's satisfying assignment "guesses" C's entire computation on its satisfying assignment, and φ's clauses check the correctness of that computation. In addition, φ contains a clause stipulating that C's output is 1.
PROOF We give a polynomial time reduction f from CIRCUIT-SAT to 3SAT. Let C be a circuit containing inputs x_1, ..., x_l and gates g_1, ..., g_m. The
reduction builds from C a formula φ with variables x_1, ..., x_l, g_1, ..., g_m. Each of φ's variables corresponds to a wire in C. The x_i variables correspond to the input wires, and the g_i variables correspond to the wires at the gate outputs. We relabel φ's variables as w_1, ..., w_{l+m}.
Now we describe φ's clauses. We write φ's clauses more intuitively using implications. Recall that we can convert the implication operation (P → Q) to the clause (¬P ∨ Q). Each NOT gate in C with input wire w_i and output wire w_j is equivalent to the expression

(¬w_i → w_j) ∧ (w_i → ¬w_j),

which in turn yields the two clauses

(w_i ∨ w_j) ∧ (¬w_i ∨ ¬w_j).
Observe that both clauses are satisfied iff an assignment is made to the variables w_i and w_j corresponding to the correct functioning of the NOT gate.
Each AND gate in C with inputs w_i and w_j and output w_k is equivalent to

((¬w_i ∧ ¬w_j) → ¬w_k) ∧ ((¬w_i ∧ w_j) → ¬w_k) ∧ ((w_i ∧ ¬w_j) → ¬w_k) ∧ ((w_i ∧ w_j) → w_k),

which in turn yields the four clauses

(w_i ∨ w_j ∨ ¬w_k) ∧ (w_i ∨ ¬w_j ∨ ¬w_k) ∧ (¬w_i ∨ w_j ∨ ¬w_k) ∧ (¬w_i ∨ ¬w_j ∨ w_k).
Similarly, each OR gate in C with inputs w_i and w_j and output w_k is equivalent to

((¬w_i ∧ ¬w_j) → ¬w_k) ∧ ((¬w_i ∧ w_j) → w_k) ∧ ((w_i ∧ ¬w_j) → w_k) ∧ ((w_i ∧ w_j) → w_k),

which in turn yields the four clauses

(w_i ∨ w_j ∨ ¬w_k) ∧ (w_i ∨ ¬w_j ∨ w_k) ∧ (¬w_i ∨ w_j ∨ w_k) ∧ (¬w_i ∨ ¬w_j ∨ w_k).
In each case, all four clauses are satisfied when an assignment is made to the variables w_i, w_j, and w_k corresponding to the correct functioning of the gate. Additionally, we add the clause (w_m) to φ, where w_m is C's output gate.

Some of the clauses described contain fewer than three literals. We expand such clauses to the desired size by repeating literals. For example, we expand the clause (w_m) to the equivalent clause (w_m ∨ w_m ∨ w_m). That completes the construction.
We briefly argue that the construction works. If a satisfying assignment for C exists, we obtain a satisfying assignment for φ by assigning the g_i variables according to C's computation on this assignment. Conversely, if a satisfying assignment for φ exists, it gives an assignment for C because it describes C's entire computation where the output value is 1. The reduction can be done in polynomial time because it is simple to compute and the output size is polynomial (actually linear) in the size of the input.
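The gate-to-clause translation above is mechanical and easy to check by machine. The sketch below uses an illustrative encoding, not the text's notation: positive integers stand for literals w_i and negative integers for their negations. It emits exactly the clauses derived above (padding short clauses by repeating literals) and verifies the AND case against its truth table.

```python
def gate_clauses(op, i, j, k=None):
    """Clauses forcing the wire variables to match one gate's behavior."""
    if op == "NOT":          # input i, output j; padded to three literals
        return [(i, j, j), (-i, -j, -j)]
    if op == "AND":          # inputs i, j, output k
        return [(i, j, -k), (i, -j, -k), (-i, j, -k), (-i, -j, k)]
    if op == "OR":
        return [(i, j, -k), (i, -j, k), (-i, j, k), (-i, -j, k)]

def satisfied(clauses, assignment):
    """assignment maps variable -> 0/1; each clause needs one true literal."""
    lit = lambda x: assignment[abs(x)] if x > 0 else 1 - assignment[abs(x)]
    return all(any(lit(x) for x in clause) for clause in clauses)

# The AND clauses hold exactly when w3 equals w1 AND w2:
for w1 in (0, 1):
    for w2 in (0, 1):
        for w3 in (0, 1):
            ok = satisfied(gate_clauses("AND", 1, 2, 3), {1: w1, 2: w2, 3: w3})
            assert ok == (w3 == (w1 & w2))
print("AND gate clauses verified")
```

This per-gate translation is what keeps the output linear in the size of the circuit: each gate contributes a constant number of clauses.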
EXERCISES
A9.1 Prove that TIME(2^n) = TIME(2^{n+1}).
A9.2 Prove that TIME(2^n) ⊊ TIME(2^{2n}).
A9.3 Prove that NTIME(n) ⊊ PSPACE.
9.4 Show how the circuit depicted in Figure 9.26 computes on input 0110 by showing
the values computed by all of the gates, as we did in Figure 9.24.
9.5 Give a circuit that computes the parity function on three input variables and show
how it computes on input 011.
9.6 Prove that if A ∈ P, then P^A = P.
9.7 Give regular expressions with exponentiation that generate the following languages over the alphabet {0,1}.
Aa. All strings of length 500
Ab. All strings of length 500 or less
Ac. All strings of length 500 or more
Ad. All strings of length different than 500
e. All strings that contain exactly 500 1s
f. All strings that contain at least 500 1s
g. All strings that contain at most 500 1s
h. All strings of length 500 or more that contain a 0 in the 500th position
i. All strings that contain two 0s that have at least 500 symbols between them
9.8 If R is a regular expression, let R{m,n} represent the expression

R^m ∪ R^{m+1} ∪ ··· ∪ R^n.

Show how to implement the R{m,n} operator, using the ordinary exponentiation operator, but without "···".
9.9 Show that if NP = P^SAT, then NP = coNP.
9.10 Problem 8.13 showed that A_LBA is PSPACE-complete.
a. Do we know whether A_LBA ∈ NL? Explain your answer.
b. Do we know whether A_LBA ∈ P? Explain your answer.
9.11 Show that the language MAX-CLIQUE from Problem 7.48 is in P^SAT.
PROBLEMS
9.12 Describe the error in the following fallacious "proof" that P ≠ NP. Assume that P = NP and obtain a contradiction. If P = NP, then SAT ∈ P and so for some k, SAT ∈ TIME(n^k). Because every language in NP is polynomial time reducible to SAT, you have NP ⊆ TIME(n^k). Therefore, P ⊆ TIME(n^k). But by the time hierarchy theorem, TIME(n^{k+1}) contains a language that isn't in TIME(n^k), which contradicts P ⊆ TIME(n^k). Therefore, P ≠ NP.
9.13 Consider the function pad: Σ* × N → Σ*#* that is defined as follows. Let pad(s, l) = s#^j, where j = max(0, l − m) and m is the length of s. Thus, pad(s, l) simply adds enough copies of the new symbol # to the end of s so that the length of the result is at least l. For any language A and function f: N → N, define the language pad(A, f) as

pad(A, f) = {pad(s, f(m)) | where s ∈ A and m is the length of s}.

Prove that if A ∈ TIME(n^6), then pad(A, n^2) ∈ TIME(n^3).
9.14 Prove that if NEXPTIME ≠ EXPTIME, then P ≠ NP. You may find the function pad, defined in Problem 9.13, to be helpful.
A9.15 Define pad as in Problem 9.13.
a. Prove that for every A and natural number k, A ∈ P iff pad(A, n^k) ∈ P.
b. Prove that P ≠ SPACE(n).
9.16 Prove that TQBF ∉ SPACE(n^{1/3}).
⋆9.17 Read the definition of a 2DFA (two-headed finite automaton) given in Problem 5.26. Prove that P contains a language that is not recognizable by a 2DFA.
9.18 Let EREX↑ = {⟨R⟩ | R is a regular expression with exponentiation and L(R) = ∅}. Show that EREX↑ ∈ P.
9.19 Define the unique-sat problem to be

USAT = {⟨φ⟩ | φ is a Boolean formula that has a single satisfying assignment}.

Show that USAT ∈ P^SAT.
9.20 Prove that an oracle C exists for which NP^C ≠ coNP^C.
9.21 A k-query oracle Turing machine is an oracle Turing machine that is permitted to make at most k queries on each input. A k-query oracle Turing machine M with an oracle for A is written M^{A,k}. Define P^{A,k} to be the collection of languages that are decidable by polynomial time k-query oracle Turing machines with an oracle for A.
a. Show that NP ∪ coNP ⊆ P^{SAT,1}.
b. Assume that NP ≠ coNP. Show that NP ∪ coNP ⊊ P^{SAT,1}.
9.22 Suppose that A and B are two oracles. One of them is an oracle for TQBF, but you don't know which. Give an algorithm that has access to both A and B, and that is guaranteed to solve TQBF in polynomial time.
9.23 Recall that you may consider circuits that output strings over {0,1} by designating several output gates. Let add_n: {0,1}^{2n} → {0,1}^{n+1} take two n-bit binary integers and produce the (n+1)-bit sum. Show that you can compute the add_n function with O(n) size circuits.
9.24 Define the function majority_n: {0,1}^n → {0,1} as

majority_n(x_1, ..., x_n) = 0 if Σ x_i < n/2, and 1 if Σ x_i ≥ n/2.

Thus, the majority_n function returns the majority vote of the inputs. Show that majority_n can be computed with:
a. O(n^2) size circuits.
b. O(n log n) size circuits. (Hint: Recursively divide the number of inputs in half and use the result of Problem 9.23.)
⋆9.25 Define the function majority_n as in Problem 9.24. Show that it may be computed with O(n) size circuits.
SELECTED SOLUTIONS
9.1 The time complexity classes are defined in terms of the big-O notation, so constant factors have no effect. The function 2^{n+1} is O(2^n), and thus A ∈ TIME(2^n) iff A ∈ TIME(2^{n+1}).
9.2 The containment TIME(2^n) ⊆ TIME(2^{2n}) holds because 2^n ≤ 2^{2n}. The containment is proper by virtue of the time hierarchy theorem. The function 2^{2n} is time constructible because a TM can write the number 1 followed by 2n 0s in O(2^{2n}) time. Hence the theorem guarantees that a language A exists that can be decided in O(2^{2n}) time but not in o(2^{2n}/log 2^{2n}) = o(2^{2n}/2n) time. Therefore, A ∈ TIME(2^{2n}) but A ∉ TIME(2^n).
9.3 NTIME(n) ⊆ NSPACE(n) because any Turing machine that operates in time t(n) on every computation branch can use at most t(n) tape cells on every branch. Furthermore, NSPACE(n) ⊆ SPACE(n^2) due to Savitch's theorem. However, SPACE(n^2) ⊊ SPACE(n^3) because of the space hierarchy theorem. The result follows because SPACE(n^3) ⊆ PSPACE.
9.7 (a) Σ^500; (b) (Σ ∪ ε)^500; (c) Σ^500 Σ*; (d) (Σ ∪ ε)^499 ∪ Σ^501 Σ*.
9.15 (a) Let A be any language and k ∈ N. If A ∈ P, then pad(A, n^k) ∈ P because you
     can determine whether w ∈ pad(A, n^k) by writing w as s#^l where s doesn't contain
     the # symbol; then testing whether |w| = |s|^k; and finally testing whether s ∈ A.
     Implementing the first test in polynomial time is straightforward. The second test
     runs in time poly(|s|), and because |s| ≤ |w|, the test runs in time poly(|w|) and
     hence is in polynomial time. If pad(A, n^k) ∈ P, then A ∈ P because you can
     determine whether w ∈ A by padding w with # symbols until it has length |w|^k
     and then testing whether the result is in pad(A, n^k). Both of these actions require
     only polynomial time.
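The membership test in part (a) can be sketched in Python. The helper names `pad` and `in_pad_A` are mine, and the sample language A (even-length strings) is an arbitrary illustration, not from the text:

```python
# Sketch of part (a): write w as s#^l where s is #-free, check |w| = |s|^k,
# then check s in A.  The decider for A is passed in as a function.
def pad(s, m):
    """Append # symbols to s until it has length m (assumes m >= len(s))."""
    return s + "#" * (m - len(s))

def in_pad_A(w, k, in_A):
    s = w.split("#", 1)[0]                    # maximal #-free prefix of w
    if w != s + "#" * (len(w) - len(s)):      # the rest must be all padding
        return False
    return len(w) == len(s) ** k and in_A(s)  # |w| = |s|^k and s in A

in_A = lambda s: len(s) % 2 == 0              # toy language: even-length strings
w = pad("abcd", len("abcd") ** 2)             # pad "abcd" to length 4^2 = 16
assert in_pad_A(w, 2, in_A)
assert not in_pad_A("abcd", 2, in_A)          # unpadded string: |w| != |s|^2
```

Both checks clearly run in time polynomial in |w|, matching the argument above.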
(b) Assume that P = SPACE(n). Let A be a language in SPACE(n^2) but not
in SPACE(n), as shown to exist in the space hierarchy theorem. The language
pad(A, n^2) ∈ SPACE(n) because you have enough space to run the O(n^2) space
algorithm for A, using space that is linear in the padded language. Because of the
assumption, pad(A, n^2) ∈ P, hence A ∈ P by part (a), and hence A ∈ SPACE(n),
due to the assumption once again. But that is a contradiction.
10
ADVANCED TOPICS IN
COMPLEXITY THEORY
In this chapter, we briefly introduce a few additional topics in complexity theory.
This subject is an active field of research, and it has an extensive literature. This
chapter is a sample of more advanced developments, but is not a comprehensive
survey. In particular, two important topics that are beyond the scope of this book
are quantum computation and probabilistically checkable proofs. The Handbook
of Theoretical Computer Science [77] presents a survey of earlier work in complexity
theory.
This chapter contains sections on approximation algorithms, probabilistic
algorithms, interactive proof systems, parallel computation, and cryptography.
These sections are independent except that probabilistic algorithms are used in
the sections on interactive proof systems and cryptography.
10.1
APPROXIMATION ALGORITHMS
In certain problems called optimization problems, we seek the best solution
among a collection of possible solutions. For example, we may want to find a
largest clique in a graph, a smallest vertex cover, or a shortest path connecting
two nodes. When an optimization problem is NP-hard, as is the case with the
first two of these types of problems, no polynomial time algorithm exists that
finds the best solution unless P=N P .
In practice, we may not need the absolute best or optimal solution to a prob-
lem. A solution that is nearly optimal may be good enough and may be much
easier to find. As its name implies, an approximation algorithm is designed to
find such approximately optimal solutions.
For example, take the vertex cover problem that we introduced in Section 7.5.
There we presented the problem as the language VERTEX-COVER representing
a decision problem—one that has a yes/no answer. In the optimization ver-
sion of this problem, called MIN-VERTEX-COVER, we aim to produce one of
the smallest vertex covers among all possible vertex covers in the input graph.
The following polynomial time algorithm approximately solves this optimiza-
tion problem. It produces a vertex cover that is never more than twice the size
of one of the smallest vertex covers.
A = “On input ⟨G⟩, where G is an undirected graph:
  1. Repeat the following until all edges in G touch a marked edge:
  2.   Find an edge in G untouched by any marked edge.
  3.   Mark that edge.
  4. Output all nodes that are endpoints of marked edges.”
THEOREM 10.1
A is a polynomial time algorithm that produces a vertex cover of G that is no
more than twice as large as a smallest vertex cover.
PROOF  A obviously runs in polynomial time. Let X be the set of nodes that
it outputs. Let H be the set of edges that it marks. We know that X is a vertex
cover because H contains or touches every edge in G, and hence X touches all
edges in G.
To prove that X is at most twice as large as a smallest vertex cover Y, we
establish two facts: X is twice as large as H, and H is not larger than Y. First,
every edge in H contributes two nodes to X, so X is twice as large as H. Second,
Y is a vertex cover, so every edge in H is touched by some node in Y. No such
node touches two edges in H because the edges in H do not touch each other.
Therefore, vertex cover Y is at least as large as H because Y contains a different
node that touches every edge in H. Hence X is no more than twice as large as Y.
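A compact way to realize algorithm A is a single greedy pass over the edge list, since marking edges that no marked edge touches is the same as building a maximal matching. A sketch in Python (the graph encoding and names are my own):

```python
# 2-optimal approximation for MIN-VERTEX-COVER, following algorithm A:
# whenever an edge is untouched by any marked edge (i.e., neither endpoint
# is already in the cover), "mark" it by taking both of its endpoints.
def approx_vertex_cover(edges):
    cover = set()
    for (u, v) in edges:
        if u not in cover and v not in cover:  # edge untouched so far
            cover.add(u)
            cover.add(v)
    return cover

edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5)]
cover = approx_vertex_cover(edges)
assert all(u in cover or v in cover for (u, v) in edges)  # X is a cover
assert len(cover) <= 2 * 2   # a smallest cover here is {1, 4}, size 2
```

Each marked edge contributes two nodes, which is exactly the "X is twice as large as H" step in the proof.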
MIN-VERTEX-COVER is an example of a minimization problem because
we aim to find a smallest solution among the collection of possible solutions. In a maxi-
mization problem, we seek a largest solution. An approximation algorithm for a
minimization problem is k-optimal if it always finds a solution that is not more
than k times optimal. The preceding algorithm is 2-optimal for the vertex cover
problem. For a maximization problem, a k-optimal approximation algorithm
always finds a solution that is at least 1/k times the size of the optimal.
The following is an approximation algorithm for a maximization problem
called MAX-CUT. A cut in an undirected graph is a separation of the vertices V
into two disjoint subsets S and T. A cut edge is an edge that goes between a node
in S and a node in T. An uncut edge is an edge that is not a cut edge. The size
of a cut is the number of cut edges. The MAX-CUT problem asks for a largest
cut in a graph G. As we showed in Problem 7.27, this problem is NP-complete.
The following algorithm approximates MAX-CUT within a factor of 2.
B = “On input ⟨G⟩, where G is an undirected graph with nodes V:
  1. Let S = ∅ and T = V.
  2. If moving a single node, either from S to T or from T to S,
     increases the size of the cut, make that move and repeat this
     stage.
  3. If no such node exists, output the current cut and halt.”
This algorithm starts with a (presumably) bad cut and makes local improve-
ments until no further local improvement is possible. Although this procedure
won’t give an optimal cut in general, we show that it does give one that is at least
half the size of an optimal one.
THEOREM 10.2
B is a polynomial time, 2-optimal approximation algorithm for MAX-CUT.
PROOF  B runs in polynomial time because every execution of stage 2 in-
creases the size of the cut, to a maximum of the total number of edges in G.
Now we show that B's cut is at least half optimal. Actually, we show some-
thing stronger: B's cut edges are at least half of all edges in G. Observe that at
every node of G, the number of cut edges is at least as large as the number of
uncut edges, or B would have shifted that node to the other side. We add up
the numbers of cut edges at every node. That sum is twice the total number of
cut edges because every cut edge is counted once for each of its two endpoints.
By the preceding observation, that sum must be at least the corresponding sum
of the numbers of uncut edges at every node. Thus, G has at least as many cut
edges as uncut edges. Therefore, the cut contains at least half of all edges.
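The local-search procedure of algorithm B can be sketched directly in Python (the encoding of sides and the helper names are mine, not from the text):

```python
# Local-search 2-approximation for MAX-CUT, following algorithm B: start
# with S empty and T = V, then move any single node across the cut as
# long as doing so increases the number of cut edges.
def approx_max_cut(nodes, edges):
    side = {v: 1 for v in nodes}           # 1 means "in T", 0 means "in S"
    def cut_size():
        return sum(1 for (u, v) in edges if side[u] != side[v])
    improved = True
    while improved:                        # terminates: each kept move
        improved = False                   # strictly grows the cut, <= |E|
        for v in nodes:
            before = cut_size()
            side[v] ^= 1                   # try moving v to the other side
            if cut_size() > before:
                improved = True            # keep the improving move
            else:
                side[v] ^= 1               # undo a non-improving move
    S = {v for v in nodes if side[v] == 0}
    T = {v for v in nodes if side[v] == 1}
    return S, T, cut_size()

nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]   # a 4-cycle
S, T, size = approx_max_cut(nodes, edges)
assert size >= len(edges) / 2              # Theorem 10.2's guarantee
```

At termination no single move helps, which is exactly the condition used in the proof that every node sees at least as many cut edges as uncut edges.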
10.2
PROBABILISTIC ALGORITHMS
A probabilistic algorithm is an algorithm designed to use the outcome of a ran-
dom process. Typically, such an algorithm would contain an instruction to “flip a
coin” and the result of that coin flip would influence the algorithm's subsequent
execution and output. Certain types of problems seem to be more easily solvable
by probabilistic algorithms than by deterministic algorithms.
How can making a decision by flipping a coin ever be better than actually
calculating, or even estimating, the best choice in a particular situation? Some-
times, calculating the best choice may require excessive time, and estimating it
may introduce a bias that invalidates the result. For example, statisticians use
random sampling to determine information about the individuals in a large pop-
ulation, such as their tastes or political preferences. Querying all the individuals
might take too long, and querying a nonrandomly selected subset might tend to
give erroneous results.
THE CLASS BPP
We begin our formal discussion of probabilistic computation by defining a model
of a probabilistic Turing machine. Then we give a complexity class associated
with efficient probabilistic computation and a few examples.
DEFINITION 10.3
A probabilistic Turing machine M is a type of nondeterministic
Turing machine in which each nondeterministic step is called a
coin-flip step and has two legal next moves. We assign a proba-
bility to each branch b of M's computation on input w as follows.
Define the probability of branch b to be

    Pr[b] = 2^{-k},

where k is the number of coin-flip steps that occur on branch b.
Define the probability that M accepts w to be

    Pr[M accepts w] = Σ_{b is an accepting branch} Pr[b].

In other words, the probability that M accepts w is the probability that we
would reach an accepting configuration if we simulated M on w by flipping a
coin to determine which move to follow at each coin-flip step. We let

    Pr[M rejects w] = 1 − Pr[M accepts w].
When a probabilistic Turing machine decides a language, it must accept all
strings in the language and reject all strings out of the language as usual, except
that now we allow the machine a small probability of error. For 0 ≤ ε < 1/2, we
say that M decides language A with error probability ε if
1. w ∈ A implies Pr[M accepts w] ≥ 1 − ε, and
2. w ∉ A implies Pr[M rejects w] ≥ 1 − ε.
In other words, the probability that we would obtain the wrong answer by sim-
ulating M is at most ε. We also consider error probability bounds that depend
on the input length n. For example, error probability ε = 2^{-n} indicates an
exponentially small probability of error.
We are interested in probabilistic algorithms that run efficiently in time
and/or space. We measure the time and space complexity of a probabilistic Tur-
ing machine in the same way we do for a nondeterministic Turing machine: by
using the worst case computation branch on each input.
DEFINITION 10.4
BPP is the class of languages that are decided by probabilistic poly-
nomial time Turing machines with an error probability of 1/3.

We defined this class with an error probability of 1/3, but any constant error
probability would yield an equivalent definition as long as it is strictly between 0
and 1/2, by virtue of the following amplification lemma. It gives a simple way of
making the error probability exponentially small. Note that a probabilistic algo-
rithm with an error probability of 2^{-100} is far more likely to give an erroneous
result because the computer on which it runs has a hardware failure than because
of an unlucky toss of its coins.
LEMMA 10.5
Let ε be a fixed constant strictly between 0 and 1/2. Then for any polynomial p(n),
a probabilistic polynomial time Turing machine M_1 that operates with error
probability ε has an equivalent probabilistic polynomial time Turing machine
M_2 that operates with an error probability of 2^{-p(n)}.
PROOF IDEA  M_2 simulates M_1 by running it a polynomial number of times
and taking the majority vote of the outcomes. The probability of error decreases
exponentially with the number of runs of M_1 made.
Consider the case where ε = 1/3. It corresponds to a box that contains many
red and blue balls. We know that 2/3 of the balls are of one color and that the
remaining 1/3 are of the other color, but we don't know which color is predomi-
nant. We can test for that color by sampling several—say, 100—balls at random
to determine which color comes up most frequently. Almost certainly, the pre-
dominant color in the box will be the most frequent one in the sample.
The balls correspond to branches of M_1's computation: red to accepting and
blue to rejecting. M_2 samples the color by running M_1. A calculation shows that
M_2 errs with exponentially small probability if it runs M_1 a polynomial number
of times and outputs the result that comes up most often.
PROOF  Given TM M_1 deciding a language with an error probability of ε < 1/2
and a polynomial p(n), we construct a TM M_2 that decides the same language
with an error probability of 2^{-p(n)}.
M_2 = “On input x:
  1. Calculate k (see analysis below).
  2. Run 2k independent simulations of M_1 on input x.
  3. If most runs of M_1 accept, then accept; otherwise, reject.”
We bound[1] the probability that M_2 gives the wrong answer on an input x.
Stage 2 yields a sequence of 2k results from simulating M_1, each result either
correct or wrong. If most of these results are correct, M_2 gives the correct an-
swer. We bound the probability that at least half of these results are wrong.
Let S be any sequence of results that M_2 might obtain in stage 2. Let P_S
be the probability M_2 obtains S. Say that S has c correct results and w wrong
results, so c + w = 2k. If c ≤ w and M_2 obtains S, then M_2 outputs incorrectly.
We call such an S a bad sequence. Let ε_x be the probability that M_1 is wrong on x.
If S is any bad sequence, then P_S ≤ (ε_x)^w (1 − ε_x)^c, which is at most ε^w (1 − ε)^c
because ε_x ≤ ε < 1/2, so ε_x(1 − ε_x) ≤ ε(1 − ε), and because c ≤ w. Furthermore,
ε^w (1 − ε)^c is at most ε^k (1 − ε)^k because k ≤ w and ε < 1 − ε.
Summing P_S for all bad sequences S gives the probability that M_2 outputs
incorrectly. We have at most 2^{2k} bad sequences because 2^{2k} is the number of all
sequences. Hence

    Pr[M_2 outputs incorrectly on input x] = Σ_{bad S} P_S ≤ 2^{2k} · ε^k (1 − ε)^k = (4ε(1 − ε))^k.

We've assumed ε < 1/2, so 4ε(1 − ε) < 1. Therefore, the above probability
decreases exponentially in k and so does M_2's error probability. To calculate a
specific value of k that allows us to bound M_2's error probability by 2^{-t} for any
t ≥ 1, we let α = −log_2(4ε(1 − ε)) and choose k ≥ t/α. Then we obtain an
error probability of 2^{-p(n)} within polynomial time.
[1] The analysis of the error probability follows from the Chernoff bound, a standard result
in probability theory. Here we give an alternative, self-contained calculation that avoids
any dependence on that result.
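The final calculation in the proof is easy to check numerically. A sketch (the function name is mine) of choosing k from t and ε, and verifying that (4ε(1−ε))^k does fall below 2^{-t}:

```python
import math

# Choosing the number of repetitions in Lemma 10.5: with error probability
# eps < 1/2, running 2k trials and taking a majority vote gives error at
# most (4*eps*(1-eps))**k, so k >= t/alpha with alpha = -log2(4*eps*(1-eps))
# pushes the error below 2**-t.
def runs_needed(eps, t):
    """Smallest integer k with (4*eps*(1-eps))**k <= 2**-t."""
    alpha = -math.log2(4 * eps * (1 - eps))
    return math.ceil(t / alpha)

eps = 1 / 3
k = runs_needed(eps, 100)                     # drive the error below 2**-100
assert (4 * eps * (1 - eps)) ** k <= 2 ** -100
```

For ε = 1/3 the base 4ε(1−ε) is 8/9, so a few hundred repetitions already give a 2^{-100} error bound.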
PRIMALITY
A prime number is an integer greater than 1 that is not divisible by positive
integers other than 1 and itself. A nonprime number greater than 1 is called
composite. The ancient problem of testing whether an integer is prime or com-
posite has been the subject of extensive research. A polynomial time algorithm
for this problem is now known [4], but it is too difficult to include here. In-
stead, we describe a much simpler probabilistic polynomial time algorithm for
primality testing.
One way to determine whether a number is prime is to try all possible integers
less than that number and see whether any are divisors, also called factors. That
algorithm has exponential time complexity because the magnitude of a number
is exponential in its length. The probabilistic primality testing algorithm that
we describe operates in a different manner entirely. It doesn't search for factors.
Indeed, no probabilistic polynomial time algorithm for finding factors is known
to exist.
Before discussing the algorithm, we mention some notation from number
theory. All numbers in this section are integers. For any p greater than 1, we
say that two numbers are equivalent modulo p if they differ by a multiple of p.
If numbers x and y are equivalent modulo p, we write x ≡ y (mod p). We let
x mod p be the smallest nonnegative y where x ≡ y (mod p). Every number
is equivalent modulo p to some member of the set Z_p = {0, ..., p−1}. For
convenience, we let Z_p^+ = {1, ..., p−1}. We may refer to the elements of these
sets by other numbers that are equivalent modulo p, as when we refer to p−1
by −1.
The main idea behind the algorithm stems from the following result, called
Fermat's little theorem.
THEOREM 10.6
If p is prime and a ∈ Z_p^+, then a^{p−1} ≡ 1 (mod p).
For example, if p = 7 and a = 2, the theorem says that 2^{(7−1)} mod 7 should be
1 because 7 is prime. The simple calculations

    2^{(7−1)} = 2^6 = 64  and  64 mod 7 = 1

confirm this result. Suppose that we try p = 6 instead. Then

    2^{(6−1)} = 2^5 = 32  and  32 mod 6 = 2

give a result different from 1, implying by the theorem that 6 is not prime. Of
course, we already knew that. However, this method demonstrates that 6 is
composite without finding its factors. Problem 10.15 asks you to provide a proof
of this theorem.
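Both calculations are easy to reproduce with Python's built-in three-argument pow, which performs modular exponentiation:

```python
# Checking the two Fermat-test calculations above: pow(a, n, m) computes
# a**n mod m efficiently (without building the huge intermediate power).
assert pow(2, 7 - 1, 7) == 1   # 2^6 = 64, and 64 mod 7 = 1: 7 passes at a = 2
assert pow(2, 6 - 1, 6) == 2   # 2^5 = 32, and 32 mod 6 = 2: 6 fails, so 6 is composite
```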
Think of the preceding theorem as providing a type of “test” for primality,
called a Fermat test. When we say that p passes the Fermat test at a, we mean
that a^{p−1} ≡ 1 (mod p). The theorem states that primes pass all Fermat tests for
a ∈ Z_p^+. We observed that 6 fails some Fermat test, so 6 isn't prime.
Can we use the Fermat tests to give an algorithm for determining primality?
Almost. Call a number pseudoprime if it passes Fermat tests for all smaller a's
relatively prime to it. With the exception of the infrequent Carmichael num-
bers, which are composite yet pass all Fermat tests, the pseudoprime numbers
are identical to the prime numbers. We begin by giving a very simple probabilis-
tic polynomial time algorithm that distinguishes primes from composites except
for the Carmichael numbers. Afterwards, we present and analyze the complete
probabilistic primality testing algorithm.
A pseudoprimality algorithm that goes through all Fermat tests would require
exponential time. The key to the probabilistic polynomial time algorithm is that
if a number is not pseudoprime, it fails at least half of all tests. (Just accept this
assertion for now. Problem 10.16 asks you to prove it.) The algorithm works by
trying several tests chosen at random. If any fail, the number must be composite.
The algorithm contains a parameter k that determines the error probability.
The algorithm contains a parameter kthat determines the error probability.
PSEUDOPRIME = “On input p:
  1. Select a_1, ..., a_k randomly in Z_p^+.
  2. Compute a_i^{p−1} mod p for each i.
  3. If all computed values are 1, accept; otherwise, reject.”
If p is pseudoprime, it passes all tests and the algorithm accepts with certainty.
If p isn't pseudoprime, it passes at most half of all tests. In that case, it passes each
randomly selected test with probability at most 1/2. The probability that it passes
all k randomly selected tests is thus at most 2^{-k}. The algorithm operates in
polynomial time because modular exponentiation is computable in polynomial
time (see Problem 7.13).
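The three stages translate directly into Python (the function name and default k are mine):

```python
import random

# Sketch of the PSEUDOPRIME algorithm: pick k random bases a in Z+_p and
# accept only if p passes the Fermat test at every one of them.  For a
# number that is not pseudoprime, the error probability is at most 2**-k.
def pseudoprime(p, k=20):
    for _ in range(k):
        a = random.randint(1, p - 1)    # stage 1: random a in Z+_p
        if pow(a, p - 1, p) != 1:       # stage 2: the Fermat test at a
            return False                # a is a Fermat witness: reject
    return True                         # stage 3: all tests passed

assert pseudoprime(101)                 # 101 is prime, so it always passes
assert not pseudoprime(15)              # 15 fails most Fermat tests
```

As the text warns, this sketch is fooled by Carmichael numbers; the PRIME algorithm below repairs that defect.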
To convert the preceding algorithm to a primality algorithm, we introduce a
more sophisticated test that avoids the problem with the Carmichael numbers.
The underlying principle is that the number 1 has exactly two square roots,
1 and −1, modulo any prime p. For many composite numbers, including all
the Carmichael numbers, 1 has four or more square roots. For example, ±1 and
±8 are the four square roots of 1, modulo 21. If a number passes the Fermat test
at a, the algorithm finds one of its square roots of 1 at random and determines
whether that square root is 1 or −1. If it isn't, we know that the number isn't
prime.
We can obtain square roots of 1 if p passes the Fermat test at a because
a^{p−1} mod p = 1, and so a^{(p−1)/2} mod p is a square root of 1. If that value is
still 1, we may repeatedly divide the exponent by 2, so long as the resulting
exponent remains an integer, and see whether the first number that is different
from 1 is −1 or some other number. We give a formal proof of the correctness of
the algorithm immediately following its description. Select k ≥ 1 as a parameter
that determines the maximum error probability to be 2^{-k}.
PRIME = “On input p:
  1. If p is even, accept if p = 2; otherwise, reject.
  2. Select a_1, ..., a_k randomly in Z_p^+.
  3. For each i from 1 to k:
  4.   Compute a_i^{p−1} mod p and reject if different from 1.
  5.   Let p − 1 = s · 2^l where s is odd.
  6.   Compute the sequence a_i^{s·2^0}, a_i^{s·2^1}, a_i^{s·2^2}, ..., a_i^{s·2^l} modulo p.
  7.   If some element of this sequence is not 1, find the last element
       that is not 1 and reject if that element is not −1.
  8. All tests have passed at this point, so accept.”
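Stages 1 through 8 can be sketched almost line for line in Python (variable names are mine; −1 modulo p appears as p − 1):

```python
import random

# Sketch of the PRIME algorithm: for each random base a, apply the Fermat
# test, then examine the square-root sequence a^(s*2^0), ..., a^(s*2^l)
# mod p, rejecting if the last element different from 1 is not -1.
def prime(p, k=20):
    if p < 2:
        return False
    if p % 2 == 0:
        return p == 2                        # stage 1
    s, l = p - 1, 0
    while s % 2 == 0:                        # stage 5: p - 1 = s * 2^l, s odd
        s //= 2
        l += 1
    for _ in range(k):
        a = random.randint(1, p - 1)         # stage 2
        if pow(a, p - 1, p) != 1:
            return False                     # stage 4: Fermat witness
        seq = [pow(a, s * 2 ** i, p) for i in range(l + 1)]  # stage 6
        last_not_1 = next((x for x in reversed(seq) if x != 1), None)
        if last_not_1 is not None and last_not_1 != p - 1:
            return False                     # stage 7: a bad square root of 1
    return True                              # stage 8

assert not prime(561)       # 561 = 3 * 11 * 17, a Carmichael number, is caught
assert prime(104729)        # a large prime always passes
```

Each modular power uses the polynomial time exponentiation of Problem 7.13, so the whole test runs in polynomial time.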
The following two lemmas show that algorithm PRIME works correctly. Obvi-
ously the algorithm is correct when p is even, so we only consider the case when
p is odd. Say that a_i is a (compositeness) witness if the algorithm rejects at either
stage 4 or 7, using a_i.
LEMMA 10.7
If p is an odd prime number, Pr[PRIME accepts p] = 1.
PROOF  We first show that if p is prime, no witness exists and so no branch
of the algorithm rejects. If a were a stage 4 witness, (a^{p−1} mod p) ≠ 1 and
Fermat's little theorem implies that p is composite. If a were a stage 7 witness,
some b exists in Z_p^+, where b ≢ ±1 (mod p) and b^2 ≡ 1 (mod p).
Therefore, b^2 − 1 ≡ 0 (mod p). Factoring b^2 − 1 yields

    (b − 1)(b + 1) ≡ 0 (mod p),

which implies that

    (b − 1)(b + 1) = cp

for some positive integer c. Because b ≢ ±1 (mod p), both b − 1 and b + 1 are
strictly between 0 and p. Therefore, p is composite because a multiple of a prime
number cannot be expressed as a product of numbers that are smaller than it is.
The next lemma shows that the algorithm identifies composite numbers with
high probability. First, we present an important elementary tool from number
theory. Two numbers are relatively prime if they have no common divisor other
than 1. The Chinese remainder theorem says that a one-to-one correspondence
exists between Z_pq and Z_p × Z_q if p and q are relatively prime. Each number
r ∈ Z_pq corresponds to a pair (a, b), where a ∈ Z_p and b ∈ Z_q, such that

    r ≡ a (mod p), and
    r ≡ b (mod q).
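A small illustration of this correspondence for the relatively prime pair p = 3, q = 5 (numbers chosen by me): distinct elements of Z_15 map to distinct remainder pairs, so the map is one-to-one.

```python
# The Chinese remainder correspondence for p = 3, q = 5: each r in Z_15
# corresponds to the pair (r mod 3, r mod 5), and no pair repeats.
p, q = 3, 5
pairs = {(r % p, r % q) for r in range(p * q)}
assert len(pairs) == p * q    # all 15 pairs are distinct: one-to-one onto Z_3 x Z_5
```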
LEMMA 10.8
If p is an odd composite number, Pr[PRIME accepts p] ≤ 2^{-k}.
PROOF  We show that if p is an odd composite number and a is selected ran-
domly in Z_p^+,

    Pr[a is a witness] ≥ 1/2

by demonstrating that at least as many witnesses as nonwitnesses exist in Z_p^+.
We do so by finding a unique witness for each nonwitness.
In every nonwitness, the sequence computed in stage 6 is either all 1s or con-
tains −1 at some position, followed by 1s. For example, 1 itself is a nonwitness
of the first kind, and −1 is a nonwitness of the second kind because s is odd and
(−1)^{s·2^0} ≡ −1 and (−1)^{s·2^1} ≡ 1. Among all nonwitnesses of the second kind,
find a nonwitness for which the −1 appears in the largest position in the se-
quence. Let h be that nonwitness and let j be the position of −1 in its sequence,
where the sequence positions are numbered starting at 0. Hence h^{s·2^j} ≡ −1
(mod p).
Because p is composite, either p is the power of a prime or we can write p as
the product of q and r—two numbers that are relatively prime. We consider the
latter case first. The Chinese remainder theorem implies that some number t
exists in Z_p whereby

    t ≡ h (mod q) and
    t ≡ 1 (mod r).

Therefore,

    t^{s·2^j} ≡ −1 (mod q) and
    t^{s·2^j} ≡ 1 (mod r).

Hence t is a witness because t^{s·2^j} ≢ ±1 (mod p) but t^{s·2^{j+1}} ≡ 1 (mod p).
Now that we have one witness, we can get many more. We prove that
dt mod p is a unique witness for each nonwitness d by making two observations.
First, d^{s·2^j} ≡ ±1 (mod p) and d^{s·2^{j+1}} ≡ 1 (mod p) owing to the way j was cho-
sen. Therefore, dt mod p is a witness because (dt)^{s·2^j} ≢ ±1 and (dt)^{s·2^{j+1}} ≡ 1
(mod p).
Second, if d_1 and d_2 are distinct nonwitnesses, d_1 t mod p ≠ d_2 t mod p. The
reason is that t^{s·2^{j+1}} mod p = 1. Hence t · t^{s·2^{j+1}−1} mod p = 1. Therefore, if
t d_1 mod p = t d_2 mod p, then

    d_1 = t · t^{s·2^{j+1}−1} d_1 mod p = t · t^{s·2^{j+1}−1} d_2 mod p = d_2.

Thus, the number of witnesses must be as large as the number of nonwitnesses,
and we have completed the analysis for the case where p is not a prime power.
For the prime power case, we have p = q^e where q is prime and e > 1. Let
t = 1 + q^{e−1}. Expanding t^p using the binomial theorem, we obtain

    t^p = (1 + q^{e−1})^p = 1 + p · q^{e−1} + multiples of higher powers of q^{e−1},

which is equivalent to 1 mod p. Hence t is a stage 4 witness because if t^{p−1} ≡ 1
(mod p), then t^p ≡ t ≢ 1 (mod p). As in the previous case, we use this one
witness to get many others. If d is a nonwitness, we have d^{p−1} ≡ 1 (mod p),
but then dt mod p is a witness. Moreover, if d_1 and d_2 are distinct nonwitnesses,
then d_1 t mod p ≠ d_2 t mod p. Otherwise,

    d_1 = d_1 · t · t^{p−1} mod p = d_2 · t · t^{p−1} mod p = d_2.

Thus, the number of witnesses must be as large as the number of nonwitnesses
and the proof is complete.
The preceding algorithm and its analysis establish the following theorem.
Let PRIMES = {n | n is a prime number in binary}.
THEOREM 10.9
PRIMES ∈ BPP.
Note that the probabilistic primality algorithm has one-sided error. When
the algorithm outputs reject, we know that the input must be composite. When
the output is accept, we know only that the input could be prime or composite.
Thus, an incorrect answer can only occur when the input is a composite number.
The one-sided error feature is common to many probabilistic algorithms, so the
special complexity class RP is designated for it.
DEFINITION 10.10
RP is the class of languages that are decided by probabilistic poly-
nomial time Turing machines where inputs in the language are
accepted with a probability of at least 1/2, and inputs not in the lan-
guage are rejected with a probability of 1.
We can make the error probability exponentially small and maintain a poly-
nomial running time by using a probability amplification technique similar to
(actually simpler than) the one we used in Lemma 10.5. Our earlier algorithm
shows that COMPOSITES ∈ RP.
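The amplification idea for one-sided error can be sketched directly: repeat the test k times and accept if any run accepts, so a non-member is still never accepted while a member is missed with probability at most 2^(−k). The names `amplify` and `toy_test` below are hypothetical illustrations, not from the text:

```python
import random

def amplify(rp_test, x, k):
    """Amplify an RP-style one-sided-error test: members are accepted with
    probability >= 1/2 per run, non-members are never accepted, so accepting
    when any of k independent runs accepts leaves error below 2**-k."""
    return any(rp_test(x) for _ in range(k))

# Toy RP-style test: accepts a member with probability exactly 1/2,
# never accepts a non-member.
def toy_test(x):
    return x == "member" and random.random() < 0.5
```

After 50 repetitions, a member is rejected with probability at most 2^(−50), while a non-member is still rejected with certainty.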
404 CHAPTER 10 / ADVANCED TOPICS IN COMPLEXITY THEORY
READ-ONCE BRANCHING PROGRAMS
A branching program is a model of computation used in complexity theory and in
certain practical areas such as computer-aided design. This model represents a
decision process that queries the values of input variables and determines how
to proceed based on the answers to those queries. We represent this decision
process as a graph whose nodes correspond to the particular variable queried at
that point in the process.
In this section, we investigate the complexity of testing whether two branch-
ing programs are equivalent. In general, that problem is coNP-complete. If
we place a certain natural restriction on the class of branching programs, we
can give a probabilistic polynomial time algorithm for testing equivalence. This
algorithm is especially interesting for two reasons. First, no polynomial time
algorithm is known for this problem, so it provides an example of probabilism
apparently expanding the class of languages whereby membership can be tested
efficiently. Second, this algorithm introduces the technique of assigning non-
Boolean values to normally Boolean variables in order to analyze the behavior of
some Boolean function of those variables. That technique is used to great effect
in interactive proof systems, as we show in Section 10.4.
DEFINITION 10.11
A branching program is a directed acyclic² graph where all nodes
are labeled by variables, except for two output nodes labeled 0 or 1.
The nodes that are labeled by variables are called query nodes.
Every query node has two outgoing edges: one labeled 0 and the
other labeled 1. Both output nodes have no outgoing edges. One
of the nodes in a branching program is designated the start node.
A branching program determines a Boolean function as follows. Take any
assignment to the variables appearing on its query nodes and, beginning at the
start node, follow the path determined by taking the outgoing edge from each
query node according to the value assigned to the indicated variable until one
of the output nodes is reached. The output is the label of that output node.
Figure 10.12 gives two examples of branching programs.
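As a concrete illustration (our own representation, not the book's), a branching program can be stored as a map from each query node to a triple (variable, 0-edge target, 1-edge target), with the integers 0 and 1 serving as the output nodes. The example program computes x1 XOR x2 and is not one of the programs in Figure 10.12:

```python
def evaluate(bp, start, assignment):
    """Follow edges from the start node according to the assignment
    until an output node (0 or 1) is reached; return its label."""
    node = start
    while node not in (0, 1):
        var, on0, on1 = bp[node]
        node = on1 if assignment[var] else on0
    return node

# Example: a read-once branching program computing x1 XOR x2.
xor_bp = {
    "n1": ("x1", "n2", "n3"),
    "n2": ("x2", 0, 1),   # x1 = 0: output x2
    "n3": ("x2", 1, 0),   # x1 = 1: output NOT x2
}
```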
Branching programs are related to the class L in a way that is analogous to
the relationship between Boolean circuits and the class P. Problem 10.17 asks
you to show that a branching program with polynomially many nodes can test
membership in any language over {0,1} that is in L.
²A directed graph is acyclic if it has no directed cycles.
FIGURE 10.12
Two read-once branching programs
Two branching programs are equivalent if they determine equal functions.
Problem 10.21 asks you to show that the problem of testing equivalence for
branching programs is coNP-complete. Here we consider a restricted form of
branching programs. A read-once branching program is one that can query each
variable at most one time on every directed path from the start node to an output
node. Both branching programs in Figure 10.12 have the read-once feature. Let
EQ_ROBP = {⟨B1, B2⟩ | B1 and B2 are equivalent read-once branching programs}.
THEOREM 10.13
EQ_ROBP is in BPP.
PROOF IDEA First, let's try assigning random values to the variables x1
through xm that appear in B1 and B2, and evaluate these branching programs
on that setting. We accept if B1 and B2 agree on the assignment and reject oth-
erwise. However, this strategy doesn't work because two inequivalent read-once
branching programs may disagree only on a single assignment out of the 2^m
possible Boolean assignments to the variables. The probability that we would
select that assignment is exponentially small. Hence we would accept with high
probability even when B1 and B2 are not equivalent, and that is unsatisfactory.
Instead, we modify this strategy by randomly selecting a non-Boolean assign-
ment to the variables, and evaluate B1 and B2 in a suitably defined manner. We
can then show that if B1 and B2 are not equivalent, the random evaluations will
likely be unequal.
PROOF We assign polynomials over x1, ..., xm to the nodes and to the edges
of a read-once branching program B as follows. The constant function 1 is
assigned to the start node. If a node labeled x has been assigned polynomial
p, assign the polynomial xp to its outgoing 1-edge, and assign the polynomial
(1−x)p to its outgoing 0-edge. If the edges incoming to some node have been
assigned polynomials, assign the sum of those polynomials to that node. Fi-
nally, the polynomial that has been assigned to the output node labeled 1 is also
assigned to the branching program itself. Now we are ready to present the prob-
abilistic polynomial time algorithm for EQ_ROBP. Let F be a finite field with at
least 3m elements.
D = “On input ⟨B1, B2⟩, two read-once branching programs:
1. Select elements a1 through am at random from F.
2. Evaluate the assigned polynomials p1 and p2 at a1 through am.
3. If p1(a1, ..., am) = p2(a1, ..., am), accept; otherwise, reject.”
This algorithm runs in polynomial time because we can evaluate the polyno-
mial corresponding to a branching program without actually constructing the
polynomial. We show that the algorithm decides EQ_ROBP with an error proba-
bility of at most 1/3.
Let's examine the relationship between a read-once branching program B
and its assigned polynomial p. Observe that for any Boolean assignment to B's
variables, all polynomials assigned to its nodes evaluate to either 0 or 1. The
polynomials that evaluate to 1 are those on the computation path for that as-
signment. Hence B and p agree when the variables take on Boolean values.
Similarly, because B is read-once, we may write p as a sum of product terms
y1 y2 ··· ym, where each yi is xi, (1−xi), or 1, and where each product term
corresponds to a path in B from the start node to the output node labeled 1.
The case of yi = 1 occurs when a path doesn't contain variable xi.
Take each such product term of p containing a yi that is 1 and split it into the
sum of two product terms, one where yi = xi and the other where yi = (1−xi).
Doing so yields an equivalent polynomial because 1 = xi + (1−xi). Continue
splitting product terms until each yi is either xi or (1−xi). The end result
is an equivalent polynomial q that contains a product term for each assignment
on which B evaluates to 1. Now we are ready to analyze the behavior of the
algorithm D.
First, we show that if B1 and B2 are equivalent, D always accepts. If the
branching programs are equivalent, they evaluate to 1 on exactly the same assign-
ments. Consequently, the polynomials q1 and q2 are equal because they contain
identical product terms. Therefore, p1 and p2 are equal on every assignment.
Second, we show that if B1 and B2 aren't equivalent, D rejects with a proba-
bility of at least 2/3. This conclusion follows immediately from Lemma 10.15.
The preceding proof relies on the following lemmas concerning the proba-
bility of randomly finding a root of a polynomial as a function of the number of
variables it has, the degrees of its variables, and the size of the underlying field.
LEMMA 10.14
For every d ≥ 0, a degree-d polynomial p on a single variable x either has at
most d roots, or is everywhere equal to 0.
PROOF We use induction on d.
Basis: Prove for d = 0. A polynomial of degree 0 is constant. If that constant is
not 0, the polynomial clearly has no roots.
Induction step: Assume true for d−1 and prove true for d. If p is a nonzero
polynomial of degree d with a root at a, the polynomial x − a divides p evenly.
Then p/(x−a) is a nonzero polynomial of degree d−1, and it has at most d−1
roots by virtue of the induction hypothesis.
LEMMA 10.15
Let F be a finite field with f elements and let p be a nonzero polynomial on the
variables x1 through xm, where each variable has degree at most d. If a1 through
am are selected randomly in F, then Pr[ p(a1, ..., am) = 0 ] ≤ md/f.
PROOF We use induction on m.
Basis: Prove for m = 1. By Lemma 10.14, p has at most d roots, so the proba-
bility that a1 is one of them is at most d/f.
Induction step: Assume true for m−1 and prove true for m. Let x1 be one of
p's variables. For each i ≤ d, let pi be the polynomial comprising the terms of p
containing x1^i, but where x1^i has been factored out. Then
p = p0 + x1·p1 + x1²·p2 + ··· + x1^d·pd.
If p(a1, ..., am) = 0, one of two cases arises. Either all pi evaluate to 0, or some
pi doesn't evaluate to 0 and a1 is a root of the single variable polynomial obtained
by evaluating p0 through pd on a2 through am.
To bound the probability that the first case occurs, observe that one of the pj
must be nonzero because p is nonzero. Then the probability that all pi evaluate
to 0 is at most the probability that pj evaluates to 0. By the induction hypothesis,
that is at most (m−1)d/f because pj has at most m−1 variables.
To bound the probability that the second case occurs, observe that if some pi
doesn't evaluate to 0, then on the assignment of a2 through am, p reduces to a
nonzero polynomial in the single variable x1. The basis already shows that a1 is
a root of such a polynomial with a probability of at most d/f.
Therefore, the probability that a1 through am is a root of the polynomial is
at most (m−1)d/f + d/f = md/f.
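The bound of Lemma 10.15 can be checked exhaustively on a toy instance. The polynomial and field below are our own example: p(x1, x2) = x1·x2 − 1 over Z_7, so m = 2 variables each of degree d = 1:

```python
from itertools import product

def root_fraction(poly, m, f):
    """Exact fraction of points of (Z_f)^m at which poly vanishes mod f."""
    points = list(product(range(f), repeat=m))
    roots = sum(1 for pt in points if poly(*pt) % f == 0)
    return roots / len(points)

f, m, d = 7, 2, 1
# x1*x2 = 1 mod 7 has one solution x2 for each nonzero x1: 6 roots of 49 points.
frac = root_fraction(lambda x1, x2: x1 * x2 - 1, m, f)
```

Here the true root fraction 6/49 ≈ 0.12 sits comfortably under the guaranteed bound md/f = 2/7 ≈ 0.29.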
We conclude this section with one important point concerning the use of
randomness in probabilistic algorithms. In our analyses, we assume that these
algorithms are implemented using true randomness. T rue randomness may be
difficult (or impossible) to obtain, so it is usually simulated with pseudorandom
generators ,w h i c ha r ed e t e r m i n i s t i ca l g o r i t h m sw h o s eo u t p u ta p p e a r sr a n d o m .
Although the output of any deterministic procedure can never be truly random,
some of these procedures generate results that have certain characteristics of
randomly generated results. Algorithms that are designed to use randomness
may work equally well with these pseudorandom generators, but proving that
they do is generally more difficult. Indeed, sometimes probabilistic algorithms
may not work well with certain pseudorandom generators. Sophisticated pseu-
dorandom generators have been devised that produce results indistinguishable
from truly random results by any test that operates in polynomial time, under
the assumption that a one-way function exists. (See Section 10.6 for a discussion
of one-way functions.)
10.3
ALTERNATION
Alternation is a generalization of nondeterminism that has proven to be useful in
understanding relationships among complexity classes, and in classifying specific
problems according to their complexity. Using alternation, we may simplify
various proofs in complexity theory and exhibit a surprising connection between
the time and space complexity measures.
An alternating algorithm may contain instructions to branch a process into
multiple child processes, just as in a nondeterministic algorithm. The difference
between the two lies in the mode of determining acceptance. A nondeterministic
computation accepts if any one of the initiated processes accepts. When an alter-
nating computation divides into multiple processes, two possibilities arise. The
algorithm can designate that the current process accepts if any of the children
accept, or it can designate that the current process accepts if all of the children
accept.
Picture the difference between alternating and nondeterministic computation
with trees that represent the branching structure of the spawned processes. Each
node represents a configuration in a process. In a nondeterministic computa-
tion, each node computes the OR operation of its children. That corresponds
to the usual nondeterministic acceptance mode whereby a process is accepting
if any of its children are accepting. In an alternating computation, the nodes
may compute the AND or OR operations as determined by the algorithm. That
corresponds to the alternating acceptance mode whereby a process is accepting
if all or any of its children accept. We define an alternating Turing machine as
follows.
DEFINITION 10.16
An alternating Turing machine is a nondeterministic Turing ma-
chine with an additional feature. Its states, except for q_accept and
q_reject, are divided into universal states and existential states. When
we run an alternating Turing machine on an input string, we label
each node of its nondeterministic computation tree with ∧ or ∨,
depending on whether the corresponding configuration contains a
universal state or an existential state. We designate a node to be
accepting if it is labeled with ∧ and all of its children are accepting,
or if it is labeled with ∨ and any of its children are accepting. The
input is accepted if the start node is designated accepting.
The following figure shows nondeterministic and alternating computation
trees. We label the nodes of the alternating computation tree with ∧ or ∨ to
indicate which function of their children they compute.
FIGURE 10.17
Nondeterministic and alternating computation trees
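The labeling rule of Definition 10.16 amounts to evaluating an AND/OR tree bottom-up. A minimal sketch (our own representation: internal nodes are ("AND", children) or ("OR", children), leaves are True for accept and False for reject):

```python
def accepts(node):
    """Evaluate an alternating computation tree: AND nodes require all
    children to accept, OR nodes require any child to accept."""
    if isinstance(node, bool):
        return node                      # leaf: accepting or rejecting
    op, children = node
    results = [accepts(c) for c in children]
    return all(results) if op == "AND" else any(results)

# A purely nondeterministic tree is the special case with only OR nodes.
tree = ("AND", [("OR", [False, True]), ("OR", [True, False])])
```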
ALTERNATING TIME AND SPACE
We define the time and space complexity of these machines in the same way that
we did for nondeterministic Turing machines: by taking the maximum time or
space used by any computation branch. We define the alternating time and space
complexity classes as follows.
DEFINITION 10.18
ATIME(t(n)) = {L | L is decided by an O(t(n)) time alternating Turing machine}.
ASPACE(f(n)) = {L | L is decided by an O(f(n)) space alternating Turing machine}.
We define AP, APSPACE, and AL to be the classes of languages that are
decided by alternating polynomial time, alternating polynomial space, and alter-
nating logarithmic space Turing machines, respectively.
EXAMPLE 10.19
A tautology is a Boolean formula that evaluates to 1 on every assignment to
its variables. Let TAUT = {⟨φ⟩ | φ is a tautology}. The following alternating
algorithm shows that TAUT is in AP.
“On input ⟨φ⟩:
1. Universally select all assignments to the variables of φ.
2. For a particular assignment, evaluate φ.
3. If φ evaluates to 1, accept; otherwise, reject.”
Stage 1 of this algorithm nondeterministically selects every assignment to φ’s
variables with universal branching. That requires all branches to accept in order
for the entire computation to accept. Stages 2 and 3 deterministically check
whether the assignment that was selected on a particular computation branch
satisfies the formula. Hence this algorithm accepts its input if it determines that
all assignments are satisfying.
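Since universal branching accepts exactly when every branch accepts, a deterministic simulation simply checks every assignment. A sketch under our own conventions (`phi` is a hypothetical formula given as a Python predicate over a dict of variable values):

```python
from itertools import product

def is_tautology(phi, variables):
    """Deterministic simulation of the alternating TAUT algorithm:
    universal branching over assignments means phi must evaluate
    to 1 (true) on every one of them."""
    return all(phi(dict(zip(variables, bits)))
               for bits in product([0, 1], repeat=len(variables)))

# x OR NOT x is a tautology; x AND y is not.
law_of_excluded_middle = lambda a: a["x"] or not a["x"]
```

The exponential blow-up in this simulation is exactly what the alternating machine's parallel universal branches avoid charging to its running time.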
Observe that TAUT is a member of coNP. In fact, any problem in coNP can
easily be shown to be in AP by using an algorithm similar to the preceding one.
EXAMPLE 10.20
This example features a language in AP that isn't known to be in NP or in
coNP. Recall the language MIN-FORMULA that we defined in Problem 7.46
on page 328. The following algorithm shows that MIN-FORMULA is in AP.
“On input ⟨φ⟩:
1. Universally select all formulas ψ that are shorter than φ.
2. Existentially select an assignment to the variables of φ.
3. Evaluate both φ and ψ on this assignment.
4. Accept if the formulas evaluate to different values.
Reject if they evaluate to the same value.”
This algorithm starts with universal branching to select all shorter formulas
in stage 1 and then switches to existential branching to select an assignment
in stage 2. The term alternation stems from the ability to alternate, or switch,
between universal and existential branching.
Alternation allows us to make a remarkable connection between the time
and space measures of complexity. Roughly speaking, the following theorem
demonstrates an equivalence between alternating time and deterministic space
for polynomially related bounds, and another equivalence between alternating
space and deterministic time when the time bound is exponentially more than
the space bound.
THEOREM 10.21
For f(n) ≥ n, we have ATIME(f(n)) ⊆ SPACE(f(n)) ⊆ ATIME(f²(n)).
For f(n) ≥ log n, we have ASPACE(f(n)) = TIME(2^O(f(n))).
Consequently, AL = P, AP = PSPACE, and APSPACE = EXPTIME. The
proof of this theorem is in the following four lemmas.
LEMMA 10.22
For f(n) ≥ n, we have ATIME(f(n)) ⊆ SPACE(f(n)).
PROOF We convert an alternating time O(f(n)) machine M to a determin-
istic space O(f(n)) machine S that simulates M as follows. On input w, the
simulator S performs a depth-first search of M's computation tree to determine
which nodes in the tree are accepting. Then S accepts if it determines that the
root of the tree, corresponding to M's starting configuration, is accepting.
Machine S requires space for storing the recursion stack that is used in the
depth-first search. Each level of the recursion stores one configuration. The
recursion depth is M's time complexity. Each configuration uses O(f(n)) space,
and M's time complexity is O(f(n)). Hence S uses O(f²(n)) space.
We can improve the space complexity by observing that S does not need to
store the entire configuration at each level of the recursion. Instead it records
only the nondeterministic choice that M made to reach that configuration from
its parent. Then S can recover this configuration by replaying the computation
from the start and following the recorded “signposts.” Making this change re-
duces the space usage to a constant at each level of the recursion. The total used
now is thus O(f(n)).
LEMMA 10.23
For f(n) ≥ n, we have SPACE(f(n)) ⊆ ATIME(f²(n)).
PROOF We start with a deterministic space O(f(n)) machine M and con-
struct an alternating machine S that uses time O(f²(n)) to simulate it. The
approach is similar to that used in the proof of Savitch's theorem (Theorem 8.5),
where we constructed a general procedure for the yieldability problem.
In the yieldability problem, we are given configurations c1 and c2 of M and
a number t. We must test whether M can get from c1 to c2 within t steps.
An alternating procedure for this problem first branches existentially to guess a
configuration cm midway between c1 and c2. Then it branches universally into
two processes: one that recursively tests whether c1 can get to cm within t/2
steps, and the other whether cm can get to c2 within t/2 steps.
Machine S uses this recursive alternating procedure to test whether the start
configuration can reach an accepting configuration within 2^(df(n)) steps. Here,
d is selected so that M has no more than 2^(df(n)) configurations within its space
bound.
The maximum time used on any branch of this alternating procedure is
O(f(n)) to write a configuration at each level of the recursion, times the depth
of the recursion, which is log 2^(df(n)) = O(f(n)). Hence this algorithm runs in
alternating time O(f²(n)).
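Run deterministically, with a loop over all candidate midpoints in place of the existential guess and both recursive calls performed in place of the universal branch, the yieldability procedure looks like this sketch on an explicit graph of configurations (representation and names are our own illustration):

```python
def can_yield(graph, nodes, c1, c2, t):
    """Can configuration c2 be reached from c1 in at most t steps?
    graph maps each configuration to the set of configurations it
    can yield in one step; nodes lists all configurations."""
    if t == 0:
        return c1 == c2
    if t == 1:
        return c1 == c2 or c2 in graph.get(c1, ())
    # Try every midpoint cm; the alternating machine guesses one
    # existentially and checks both halves with a universal branch.
    return any(can_yield(graph, nodes, c1, cm, t // 2)
               and can_yield(graph, nodes, cm, c2, t - t // 2)
               for cm in nodes)

# Tiny example "configuration graph": a -> b -> c -> d.
g = {"a": {"b"}, "b": {"c"}, "c": {"d"}}
```

Each recursion level halves t, so the depth is logarithmic in the step bound, which is the source of the f(n) · f(n) time analysis above.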
LEMMA 10.24
For f(n) ≥ log n, we have ASPACE(f(n)) ⊆ TIME(2^O(f(n))).
PROOF We construct a deterministic time 2^O(f(n)) machine S to simulate an
alternating space O(f(n)) machine M. On input w, the simulator S constructs
the following graph of the computation of M on w. The nodes are the con-
figurations of M on w that use at most df(n) space, where d is the appropriate
constant factor for M. Edges go from a configuration to those configurations it
can yield in a single move of M. After constructing the graph, S repeatedly scans
it and marks certain configurations as accepting. Initially, only the actual accept-
ing configurations of M are marked this way. A configuration that performs
universal branching is marked accepting if all of its children are so marked, and
an existential configuration is marked if any of its children are marked. Machine
S continues scanning and marking until no additional nodes are marked on a
scan. Finally, S accepts if the start configuration of M on w is marked.
The number of configurations of M on w is 2^O(f(n)) because f(n) ≥ log n.
Therefore, the size of the configuration graph is 2^O(f(n)) and constructing it may
be done in 2^O(f(n)) time. Scanning the graph once takes roughly the same time.
The total number of scans is at most the number of nodes in the graph because
each scan except for the final one marks at least one additional node. Hence the
total time used is 2^O(f(n)).
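The marking procedure is a fixed-point computation on an AND/OR graph, and can be sketched directly on an explicit configuration graph (the representation, with a quantifier tag per node, is our own illustration):

```python
def marked_accepting(edges, kind, accepting, start):
    """Repeatedly scan the configuration graph, marking a universal ("AND")
    node when all of its successors are marked and an existential ("OR")
    node when any successor is marked; stop when a scan adds nothing."""
    marked = set(accepting)          # actual accepting configurations
    changed = True
    while changed:
        changed = False
        for n in edges:
            if n in marked:
                continue
            kids = edges[n]
            ok = (all(k in marked for k in kids) if kind[n] == "AND"
                  else any(k in marked for k in kids))
            if kids and ok:
                marked.add(n)
                changed = True
    return start in marked

# Example: start is universal over two existential configurations.
ex_edges = {"s": ["u1", "u2"], "u1": ["acc"], "u2": ["rej"]}
ex_kind = {"s": "AND", "u1": "OR", "u2": "OR"}
```

Each full scan marks at least one new node or terminates, bounding the number of scans by the number of nodes, exactly as in the time analysis above.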
LEMMA 10.25
For f(n) ≥ log n, we have ASPACE(f(n)) ⊇ TIME(2^O(f(n))).
PROOF We show how to simulate a deterministic time 2^O(f(n)) machine M
by an alternating Turing machine S that uses space O(f(n)). This simulation
is tricky because the space available to S is so much less than the size of M's
computation. In this case, S has only enough space to store pointers into a
tableau for M on w, as depicted in the following figure.
FIGURE 10.26
A tableau for M on w
We use the representation for configurations as given in the proof of Theo-
rem 9.30, whereby a single symbol may represent both the state of the machine
and the content of the tape cell under the head. The contents of cell d in Fig-
ure 10.26 are then determined by the contents of its parents a, b, and c. (A cell
on the left or right boundary has only two parents.)
Simulator S operates recursively to guess and then verify the contents of the
individual cells of the tableau. To verify the contents of a cell d outside the first
row, simulator S existentially guesses the contents of the parents, checks whether
their contents would yield d's contents according to M's transition function, and
then universally branches to verify these guesses recursively. If d were in the first
row, S verifies the answer directly because it knows M's starting configuration.
We assume that M moves its head to the left-hand end of the tape on acceptance,
so S can determine whether M accepts w by checking the contents of the lower
leftmost cell of the tableau. Hence S never needs to store more than a single
pointer to a cell in the tableau, so it uses space log 2^O(f(n)) = O(f(n)).
THE POLYNOMIAL TIME HIERARCHY
Alternating machines provide a way to define a natural hierarchy of classes within
the class PSPACE.
DEFINITION 10.27
Let i be a natural number. A Σi-alternating Turing machine is
an alternating Turing machine that on every input and on every
computation branch contains at most i runs of universal or existen-
tial steps, starting with existential steps. A Πi-alternating Turing
machine is similar except that it starts with universal steps.
Define ΣiTIME(f(n)) to be the class of languages that a Σi-alternating
TM can decide in O(f(n)) time. Similarly, define the class ΠiTIME(f(n)) for
Πi-alternating Turing machines, and define the classes ΣiSPACE(f(n)) and
ΠiSPACE(f(n)) for space bounded alternating Turing machines. We define
the polynomial time hierarchy to be the collection of classes
ΣiP = ⋃_k ΣiTIME(n^k)   and   ΠiP = ⋃_k ΠiTIME(n^k).
Define the class PH = ⋃_i ΣiP = ⋃_i ΠiP. Clearly, NP = Σ1P and coNP = Π1P.
Additionally, MIN-FORMULA ∈ Π2P.
10.4
INTERACTIVE PROOF SYSTEMS
Interactive proof systems provide a way to define a probabilistic analog of the
class NP, much like probabilistic polynomial time algorithms provide a prob-
abilistic analog to P. The development of interactive proof systems has pro-
foundly affected complexity theory and has led to important advances in the
fields of cryptography and approximation algorithms. To get a feel for this new
concept, let's revisit our intuition about NP.
The languages in NP are those whose members all have short certificates of
membership that can be easily checked. If you need to, go back to page 294
and review this formulation of NP. Let's rephrase this formulation by creating
two entities: a Prover that finds the proofs of membership, and a Verifier that
checks them. Think of the Prover as if it were convincing the Verifier of w's
membership in A. We require the Verifier to be a polynomial time bounded
machine; otherwise, it could figure out the answer itself. We don't impose any
computational bound on the Prover because finding the proof may be time-
consuming.
Take the SAT problem, for example. A Prover can convince a polynomial
time Verifier that a formula φ is satisfiable by supplying a satisfying assignment.
Can a Prover similarly convince a computationally limited Verifier that a for-
mula is not satisfiable? The complement of SAT is not known to be in NP, so
we can't rely on the certificate idea. Nonetheless, the surprising answer is yes,
provided we give the Prover and Verifier two additional features. First, they are
permitted to engage in a two-way dialog. Second, the Verifier may be a prob-
abilistic polynomial time machine that reaches the correct answer with a high
degree of, but not absolute, certainty. Such a Prover and Verifier constitute an
interactive proof system.
GRAPH NONISOMORPHISM
We illustrate the interactive proof concept through the elegant example of the
graph isomorphism problem. Call graphs G and H isomorphic if the nodes of G
may be reordered so that it is identical to H. Let

    ISO = {⟨G, H⟩ | G and H are isomorphic graphs}.

Although ISO is obviously in NP, extensive research has so far failed to demon-
strate either a polynomial time algorithm for this problem or a proof that it is
NP-complete. It is one of a relatively small number of naturally occurring lan-
guages in NP that haven't been placed in either category.

Here, we consider the language that is complementary to ISO, namely the
language NONISO = {⟨G, H⟩ | G and H are not isomorphic graphs}. NONISO is
not known to be in NP because we don't know how to provide short certificates
that graphs aren't isomorphic. Nonetheless, when two graphs aren't isomorphic,
a Prover can convince a Verifier of this fact, as we will show.
416 CHAPTER 10 / ADVANCED TOPICS IN COMPLEXITY THEORY
Suppose that we have two graphs: G1 and G2. If they are isomorphic, the
Prover can convince the Verifier of this fact by presenting the isomorphism or
reordering. But if they aren't isomorphic, how can the Prover convince the
Verifier of that fact? Don't forget: The Verifier doesn't necessarily trust the
Prover, so it isn't enough for the Prover to declare that they aren't isomorphic.
The Prover must convince the Verifier. Consider the following short protocol.

The Verifier randomly selects either G1 or G2 and then randomly reorders its
nodes to obtain a graph H. The Verifier sends H to the Prover. The Prover must
respond by declaring whether G1 or G2 was the source of H. That concludes
the protocol.

If G1 and G2 were indeed nonisomorphic, the Prover could always carry out
the protocol because the Prover could identify whether H came from G1 or G2.
However, if the graphs were isomorphic, H might have come from either G1
or G2. So even with unlimited computational power, the Prover would have no
better than a 50–50 chance of getting the correct answer. Thus, if the Prover is
able to answer correctly consistently (say in 100 repetitions of the protocol), the
Verifier has convincing evidence that the graphs are actually nonisomorphic.
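The protocol above can be simulated directly. In the sketch below (an illustration, not part of the text), the Prover's unlimited computational power is played by a brute-force isomorphism search over all relabelings, which is feasible only for tiny graphs.

```python
import random
from itertools import permutations

def relabel(edges, perm):
    # Apply a node relabeling (perm maps node i to perm[i]) to an edge set.
    return frozenset(frozenset((perm[u], perm[v])) for u, v in edges)

def isomorphic(e1, e2, n):
    # Brute force over all n! relabelings (fine only for tiny graphs).
    return any(relabel(e1, p) == e2 for p in permutations(range(n)))

def run_protocol(g1, g2, n, rounds=20):
    # Returns True if the Prover answers correctly in every round, i.e.,
    # the Verifier gathers convincing evidence of nonisomorphism.
    for _ in range(rounds):
        src = random.randrange(2)                 # Verifier's secret coin
        perm = list(range(n))
        random.shuffle(perm)
        h = relabel([g1, g2][src], perm)          # scrambled copy H
        # Unlimited-power Prover: identify which graph H came from.
        guess = 0 if isomorphic(h, g1, n) else 1
        if guess != src:
            return False
    return True

# Path 0-1-2 versus a triangle: nonisomorphic, so the Prover never errs.
path = frozenset({frozenset({0, 1}), frozenset({1, 2})})
tri = frozenset({frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2})})
```

If g1 and g2 were isomorphic instead, H would be isomorphic to both, and any Prover could do no better than guessing the Verifier's coin, exactly as the text argues.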
DEFINITION OF THE MODEL
To define the interactive proof system model formally, we describe the Verifier,
the Prover, and their interaction. You'll find it helpful to keep the graph non-
isomorphism example in mind. We define the Verifier to be a function V that
computes its next transmission to the Prover from the message history sent so
far. The function V has three inputs:

1. Input string. The objective is to determine whether this string is a mem-
ber of some language. In the NONISO example, the input string encoded
the two graphs.

2. Random input. For convenience in making the definition, we provide
the Verifier with a randomly chosen input string instead of the equivalent
capability to make probabilistic moves during its computation.

3. Partial message history. A function has no memory of the dialog that
has been sent so far, so we provide the memory externally via a string
representing the exchange of messages up to the present point. We use
the notation m1#m2# ··· #mi to represent the exchange of messages m1
through mi.

The Verifier's output is either the next message mi+1 in the sequence or accept
or reject, designating the conclusion of the interaction. Thus, V has the func-
tional form

    V: Σ* × Σ* × Σ* ⟶ Σ* ∪ {accept, reject}.

V(w, r, m1# ··· #mi) = mi+1 means that the input string is w, the random
input is r, the current message history is m1 through mi, and the Verifier's next
message to the Prover is mi+1.
The Prover is a party with unlimited computational ability. We define it to
be a function P with two inputs:
1. Input string
2. Partial message history
The Prover's output is the next message to the Verifier. Formally, P has the form

    P: Σ* × Σ* ⟶ Σ*.

P(w, m1# ··· #mi) = mi+1 means that the Prover sends mi+1 to the Verifier
after having exchanged messages m1 through mi so far.

Next, we define the interaction between the Prover and the Verifier. For par-
ticular strings w and r, we write (V↔P)(w, r) = accept if a message sequence
m1 through mk exists for some k whereby

1. for 0 ≤ i < k, where i is an even number, V(w, r, m1# ··· #mi) = mi+1;
2. for 0 < i < k, where i is an odd number, P(w, m1# ··· #mi) = mi+1; and
3. the final message mk in the message history is accept.
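As a sketch of these conventions, the exchange (V↔P)(w, r) can be driven by a small loop in which V speaks on even turns and P on odd turns. The function names, the toy V and P, and the cap on the number of messages are illustrative assumptions.

```python
ACCEPT, REJECT = "accept", "reject"

def interact(V, P, w, r, max_msgs=100):
    # Run (V <-> P)(w, r): V produces m1, m3, ... and P produces m2, m4, ...,
    # each from the history m1#...#mi, until V ends with accept or reject.
    history = []
    for i in range(max_msgs):
        joined = "#".join(history)
        if i % 2 == 0:
            m = V(w, r, joined)      # Verifier computes m_{i+1}
        else:
            m = P(w, joined)         # Prover computes m_{i+1}
        if m in (ACCEPT, REJECT):
            return m
        history.append(m)
    return REJECT

# Toy parties: V asks one question and accepts iff P's reply ends in "42".
def toy_V(w, r, hist):
    if hist == "":
        return "what?"
    return ACCEPT if hist.endswith("42") else REJECT

def toy_P(w, hist):
    return "42"
```

Note that, matching the definitions, V sees (w, r, history) while P sees only (w, history); the randomness r belongs to the Verifier alone.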
To simplify the definition of the class IP, we assume that the lengths of the
Verifier's random input and each of the messages exchanged between the Verifier
and the Prover are p(n) for some polynomial p that depends only on the Verifier.
Furthermore, we assume that the total number of messages exchanged is at most
p(n). The following definition gives the probability that an interactive proof
system accepts an input string w. For any string w of length n, we define

    Pr[V↔P accepts w] = Pr[(V↔P)(w, r) = accept],

where r is a randomly selected string of length p(n).
DEFINITION 10.28

Say that language A is in IP if some polynomial time computable
function V exists such that for some function P and for every
function P̃ and for every string w,

1. w ∈ A implies Pr[V↔P accepts w] ≥ 2/3, and
2. w ∉ A implies Pr[V↔P̃ accepts w] ≤ 1/3.

In other words, if w ∈ A then some Prover P (an "honest" Prover) causes the
Verifier to accept with high probability; but if w ∉ A, then no Prover (not even
a "crooked" Prover P̃) causes the Verifier to accept with high probability.
We may amplify the success probability of an interactive proof system by
repetition, as we did in Lemma 10.5, to make the error probability exponentially
small. Obviously, IP contains both the classes NP and BPP. We have also shown
that it contains the language NONISO, which is not known to be in either NP
or BPP. As we will next show, IP is a surprisingly large class, equal to the class
PSPACE.
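The repetition argument can be made quantitative. The sketch below (illustrative; the function name is an assumption) computes the exact probability that a majority vote over k independent runs errs, starting from the 1/3 error bound of Definition 10.28.

```python
from math import comb

def majority_error(p_wrong, k):
    # Probability that a strict majority of k independent runs err,
    # when each run errs independently with probability p_wrong.
    return sum(comb(k, i) * p_wrong**i * (1 - p_wrong)**(k - i)
               for i in range(k // 2 + 1, k + 1))
```

With per-run error 1/3, three runs already push the error below 1/3, and around a hundred runs drive it to a tiny fraction of a percent; the Chernoff bound used in Lemma 10.5 shows the decrease is in fact exponential in k.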
IP = PSPACE

In this section, we prove one of the more remarkable theorems in complexity
theory: the equality of the classes IP and PSPACE. Thus, for any language in
PSPACE, a Prover can convince a probabilistic polynomial time Verifier about
the membership of a string in the language, even though a conventional proof
of membership might be exponentially long.

THEOREM 10.29

IP = PSPACE.

We divide the proof of this theorem into lemmas that establish containment
in each direction. The first lemma shows IP ⊆ PSPACE. Though a bit tech-
nical, the proof of this lemma is a standard simulation of an interactive proof
system by a polynomial space machine.
LEMMA 10.30
IP ⊆ PSPACE.
PROOF  Let A be a language in IP. Assume that A's Verifier V exchanges
exactly p = p(n) messages when the input w has length n. We construct a
PSPACE machine that simulates V. First, for any string w, we define

    Pr[V accepts w] = max_P Pr[V↔P accepts w].

This value is at least 2/3 if w is in A, and is at most 1/3 if not. We show how
to calculate this value in polynomial space. Let Mj denote a message history
m1# ··· #mj. We generalize the definition of the interaction of V and P to start
with an arbitrary message stream Mj. We write (V↔P)(w, r, Mj) = accept if
we can extend Mj with messages mj+1 through mp so that

1. for 0 ≤ i < p, where i is an even number, V(w, r, m1# ··· #mi) = mi+1;
2. for j ≤ i < p, where i is an odd number, P(w, m1# ··· #mi) = mi+1; and
3. the final message mp in the message history is accept.

Observe that these conditions require that V's messages be consistent with the
messages already present in Mj. Further generalizing our earlier definitions, we
define

    Pr[V↔P accepts w starting at Mj] = Pr_r[(V↔P)(w, r, Mj) = accept].

Here, and for the remainder of this proof, the notation Pr_r means that the prob-
ability is taken over all strings r that are consistent with Mj. If no such r exist,
then define the probability to be 0. We then define

    Pr[V accepts w starting at Mj] = max_P Pr[V↔P accepts w starting at Mj].
For every 0 ≤ j ≤ p and every message stream Mj, let N_Mj be defined
inductively for decreasing j, starting from the base cases at j = p. For a message
stream Mp that contains p messages, let N_Mp = 1 if Mp is consistent with V's
messages for some string r and mp = accept. Otherwise, let N_Mp = 0.

For j < p and a message stream Mj, define N_Mj as follows:

    N_Mj = max over mj+1 of N_Mj+1        if j < p is odd,
    N_Mj = wt-avg over mj+1 of N_Mj+1     if j < p is even.

Here, wt-avg_mj+1 N_Mj+1 means

    Σ_mj+1 ( Pr_r[V(w, r, Mj) = mj+1] · N_Mj+1 ).

The expression is the average of N_Mj+1, weighted by the probability that the
Verifier sent message mj+1.

Let M0 be the empty message stream. We make two claims about the value
N_M0. First, we can calculate N_M0 in polynomial space. We do so recursively by
calculating N_Mj for every j and Mj. Calculating max_mj+1 is straightforward. To
calculate wt-avg_mj+1, we go through all strings r of length p, and eliminate those
that cause the Verifier to produce an output that is inconsistent with Mj. If no
strings r remain, then wt-avg_mj+1 is 0. If some strings remain, we determine the
fraction of the remaining strings r that cause the Verifier to output mj+1. Then
we weight N_Mj+1 by that fraction to compute the average value. The depth of
the recursion is p, and therefore only polynomial space is needed.
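The recursive calculation of N_M0 can be sketched directly. The toy code below (an illustration, not from the text) maximizes over Prover messages on odd turns and takes the probability-weighted average over Verifier messages on even turns, enumerating the random strings r explicitly. The tiny message alphabet is an assumption; message histories are tuples.

```python
from fractions import Fraction
from itertools import product

MSGS = ("0", "1", "accept", "reject")   # toy message alphabet (assumption)

def consistent(V, w, r, hist):
    # r is consistent with hist if V's past messages (even positions)
    # match what V(w, r, .) would have sent.
    return all(V(w, r, hist[:i]) == hist[i] for i in range(0, len(hist), 2))

def N(V, w, hist, p, r_len):
    """N_Mj from the proof: max over Prover messages on odd turns,
    probability-weighted average over Verifier messages on even turns."""
    j = len(hist)
    if j == p:                                        # base case at j = p
        ok = any(consistent(V, w, "".join(bits), hist)
                 for bits in product("01", repeat=r_len))
        return Fraction(1) if ok and hist[-1] == "accept" else Fraction(0)
    if j % 2 == 1:                                    # Prover's turn: max
        return max(N(V, w, hist + (m,), p, r_len) for m in MSGS)
    # Verifier's turn: average over V's messages, weighted by the fraction
    # of still-consistent random strings r that produce each message.
    rs = ["".join(bits) for bits in product("01", repeat=r_len)
          if consistent(V, w, "".join(bits), hist)]
    if not rs:
        return Fraction(0)
    total = Fraction(0)
    for m in MSGS:
        frac = Fraction(sum(V(w, r, hist) == m for r in rs), len(rs))
        if frac:
            total += frac * N(V, w, hist + (m,), p, r_len)
    return total
```

The recursion stores only the current history at each of its p levels, so it uses little memory, but it explores every message sequence and every r, so it takes exponential time; that trade-off is exactly what the lemma exploits.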
Second, N_M0 equals Pr[V accepts w], the value needed in order to deter-
mine whether w is in A. We prove this second claim by induction as follows.
CLAIM 10.31

For every 0 ≤ j ≤ p and every Mj,

    N_Mj = Pr[V accepts w starting at Mj].

We prove this claim by induction on j, where the basis occurs at j = p and the
induction proceeds from p down to 0.

Basis: Prove the claim for j = p. We know that mp is either accept or reject. If
mp is accept, N_Mp is defined to be 1, and Pr[V accepts w starting at Mp] = 1
because the message stream already indicates acceptance, so the claim is true.
The case when mp is reject is similar.

Induction step: Assume that the claim is true for some j+1 ≤ p and any message
stream Mj+1. Prove that it is true for j and any message stream Mj. If j is even,
mj+1 is a message from V to P. We then have the series of equalities:

    N_Mj (1)= Σ_mj+1 ( Pr_r[V(w, r, Mj) = mj+1] · N_Mj+1 )
         (2)= Σ_mj+1 ( Pr_r[V(w, r, Mj) = mj+1] · Pr[V accepts w starting at Mj+1] )
         (3)= Pr[V accepts w starting at Mj].
Equality 1 is the definition of N_Mj. Equality 2 is based on the induction hypoth-
esis. Equality 3 follows from the definition of Pr[V accepts w starting at Mj].
Thus, the claim holds if j is even. If j is odd, mj+1 is a message from P to V.
We then have the series of equalities:

    N_Mj (1)= max_mj+1 N_Mj+1
         (2)= max_mj+1 Pr[V accepts w starting at Mj+1]
         (3)= Pr[V accepts w starting at Mj].

Equality 1 is the definition of N_Mj. Equality 2 uses the induction hypothesis.
We break equality 3 into two inequalities. We have ≤ because the Prover that
maximizes the lower line could send the message mj+1 that maximizes the up-
per line. We have ≥ because that same Prover cannot do any better than send
that same message. Sending anything other than a message that maximizes the
upper line would lower the resulting value. That proves the claim for odd j and
completes one direction of the proof of Theorem 10.29.
Now we prove the other direction of the theorem. The proof of this lemma
introduces a novel algebraic method of analyzing computation.
LEMMA 10.32
PSPACE ⊆ IP.

Before getting to the proof of this lemma, we prove a weaker result that il-
lustrates the technique. Define the counting problem for satisfiability to be the
language

    #SAT = {⟨φ, k⟩ | φ is a cnf-formula with exactly k satisfying assignments}.
THEOREM 10.33
#SAT ∈ IP.
PROOF IDEA  This proof presents a protocol whereby the Prover persuades
the Verifier that k is the actual number of satisfying assignments of a given cnf-
formula φ. Before getting to the protocol itself, let's consider another protocol
that has some of the flavor of the correct one but is unsatisfactory because it
requires an exponential time Verifier. Say that φ has variables x1 through xm.
Let fi be the function where for 0 ≤ i ≤ m and a1, ..., ai ∈ {0,1}, we set
fi(a1, ..., ai) equal to the number of satisfying assignments of φ such that each
xj = aj for j ≤ i. The constant function f0() is the number of satisfying assign-
ments of φ. The function fm(a1, ..., am) is 1 if those ai's satisfy φ; otherwise, it
is 0. An easy identity holds for every i < m and a1, ..., ai:

    fi(a1, ..., ai) = fi+1(a1, ..., ai, 0) + fi+1(a1, ..., ai, 1).
The protocol for #SAT begins with phase 0 and ends with phase m+1. The
input is the pair ⟨φ, k⟩.

Phase 0. P sends f0() to V.
V checks that k = f0() and rejects if not.

Phase 1. P sends f1(0) and f1(1) to V.
V checks that f0() = f1(0) + f1(1) and rejects if not.

Phase 2. P sends f2(0,0), f2(0,1), f2(1,0), and f2(1,1) to V.
V checks that f1(0) = f2(0,0) + f2(0,1) and f1(1) = f2(1,0) + f2(1,1) and rejects
if not.
...

Phase m. P sends fm(a1, ..., am) for each assignment to the ai's.
V checks the 2^{m−1} equations linking fm−1 with fm and rejects if any fail.

Phase m+1. V checks that the values fm(a1, ..., am) are correct for each
assignment to the ai's by evaluating φ on each assignment. If all assignments are
correct, V accepts; otherwise, V rejects. That completes the description of the
protocol.
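The phase-by-phase checks can be simulated for a tiny formula. In the sketch below (illustrative; the clause encoding as (variable, polarity) pairs is an assumption), the honest Prover's messages are the brute-force values fi(a1, ..., ai), and the Verifier performs the phase 0, consistency, and phase m+1 checks exactly as described; the exponential blowup shows up as the dictionary of roughly 2^m values.

```python
from itertools import product

def count_sat(phi, n, assign):
    # f_i(a1..ai): satisfying assignments of phi extending the partial
    # assignment `assign`; a clause is a list of (var_index, polarity).
    total = 0
    for ext in product([0, 1], repeat=n - len(assign)):
        full = list(assign) + list(ext)
        if all(any(full[v] == pol for v, pol in c) for c in phi):
            total += 1
    return total

def protocol(phi, n, k):
    # Honest Prover sends every f_i value; Verifier runs phases 0..m+1.
    f = {a: count_sat(phi, n, list(a))
         for i in range(n + 1) for a in product([0, 1], repeat=i)}
    if f[()] != k:                                   # phase 0
        return "reject"
    for i in range(n):                               # phases 1..m
        for a in product([0, 1], repeat=i):
            if f[a] != f[a + (0,)] + f[a + (1,)]:
                return "reject"
    for a in product([0, 1], repeat=n):              # phase m+1: direct check
        expected = int(all(any(a[v] == pol for v, pol in c) for c in phi))
        if f[a] != expected:
            return "reject"
    return "accept"

# phi = (x0 OR x1) AND (NOT x0 OR x1); satisfied by (0,1) and (1,1), so k = 2.
phi = [[(0, 1), (1, 1)], [(0, 0), (1, 1)]]
```

Running the checks with the true count accepts; any other claimed k fails at phase 0, and (as the correctness argument explains) a Prover who lies about some fi to survive one phase is caught by the consistency equations or the final direct evaluation.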
This protocol doesn't provide a proof that #SAT is in IP because the Verifier
must spend exponential time just to read the exponentially long messages that
the Prover sends. Let's examine it for correctness anyway because that helps us
understand the next, more efficient protocol.

Intuitively, a protocol decides a language A if a Prover can convince the Ver-
ifier of the membership of strings in A. In other words, if a string is a member
of A, some Prover can cause the Verifier to accept with high probability. If the
string isn't a member of A, no Prover (not even a crooked or devious one) can
cause the Verifier to accept with more than low probability. We use the symbol
P to designate the Prover that correctly follows the protocol, and that thereby
makes V accept with high probability when the input is in A. We use the sym-
bol P̃ to designate any Prover that interacts with the Verifier when the input
isn't in A. Think of P̃ as an adversary, as though P̃ were attempting to make
V accept when V should reject. The notation P̃ is suggestive of a "crooked"
Prover.

In the #SAT protocol we just described, the Verifier ignores its random input
and operates deterministically once the Prover has been selected. To prove the
protocol is correct, we establish two facts. First, if k is the correct number of sat-
isfying assignments for φ in the input ⟨φ, k⟩, some Prover P causes V to accept.
The Prover that gives accurate responses at every phase does the job. Second, if
k isn't correct, every Prover P̃ causes V to reject. We argue this case as follows.
If k is not correct and P̃ gives accurate responses, V rejects outright in phase 0
because f0() is the number of φ's satisfying assignments and therefore f0() ≠ k.
To prevent V from rejecting in phase 0, P̃ must send an incorrect value for f0(),
denoted f̃0(). Intuitively, f̃0() is a lie about the value of f0(). As in real life, lies
beget lies, and P̃ is forced to continue lying about other values of fi in order to
avoid being caught during later phases. Eventually these lies catch up with P̃ in
phase m+1, where V checks the values of fm directly.

More precisely, because f̃0() ≠ f0(), at least one of the values f1(0) and f1(1)
that P̃ sends in phase 1 must be incorrect; otherwise, V rejects when it checks
whether f0() = f1(0) + f1(1). Let's say that f1(0) is incorrect and call the value
that is sent instead f̃1(0). Continuing in this way, we see that at every phase
P̃ must end up sending some incorrect value f̃i(a1, ..., ai), or V would have
rejected by that point. But when V checks the incorrect value f̃m(a1, ..., am)
in phase m+1, it rejects anyway. Thus, we have shown that if k is incorrect, V
rejects no matter what P̃ does. Therefore, the protocol is correct.

The problem with this protocol is that the number of messages doubles with
every phase. This doubling occurs because the Verifier requires the two values
fi+1(..., 0) and fi+1(..., 1) to confirm the one value fi(...). If we could find
a way for the Verifier to confirm a value of fi with only a single value of fi+1,
the number of messages wouldn't grow at all. We can do so by extending the
functions fi to non-Boolean inputs and confirming the single value fi+1(..., z)
for some z selected at random from a finite field.
PROOF  Let φ be a cnf-formula with variables x1 through xm. In a technique
called arithmetization, we associate with φ a polynomial p(x1, ..., xm) where
p mimics φ by simulating the Boolean ∧, ∨, and ¬ operations with the arith-
metic operations + and × as follows. If α and β are subformulas, we replace
expressions

    α ∧ β  with  αβ,
    ¬α     with  1 − α, and
    α ∨ β  with  α ∗ β = 1 − (1 − α)(1 − β).
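These replacement rules translate directly into code. The following sketch (an illustration; the nested-tuple formula encoding and the tiny modulus are assumptions) evaluates the arithmetized polynomial p over a field, and agrees with the Boolean formula on 0/1 inputs.

```python
Q = 101  # a prime modulus standing in for the field F (tiny demo value)

def arith(node, x):
    # Evaluate the arithmetization of a formula mod Q.  `node` is a nested
    # tuple: ("var", i), ("not", f), ("and", f, g), or ("or", f, g).
    op = node[0]
    if op == "var":
        return x[node[1]] % Q
    if op == "not":
        return (1 - arith(node[1], x)) % Q           # not a  ->  1 - a
    a, b = arith(node[1], x), arith(node[2], x)
    if op == "and":
        return (a * b) % Q                           # a AND b -> ab
    return (1 - (1 - a) * (1 - b)) % Q               # a OR b -> 1-(1-a)(1-b)

# (x0 OR x1) AND (NOT x0): agrees with the Boolean formula on 0/1 inputs,
# and is also defined on arbitrary field elements.
phi = ("and", ("or", ("var", 0), ("var", 1)), ("not", ("var", 0)))
```

Feeding non-Boolean field elements into arith is perfectly legal arithmetic even though it has no Boolean reading; that freedom is what the protocol exploits.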
One observation regarding p that will be important to us later is that the
degree of any of its variables is not large. The operations αβ and α ∗ β each
produce a polynomial whose degree is at most the sum of the degrees of the
polynomials for α and β. Thus, the degree of any variable is at most n, the
length of φ.

If p's variables are assigned Boolean values, it agrees with φ on that assign-
ment. Evaluating p when the variables are assigned non-Boolean values has no
obvious interpretation in φ. However, the proof uses such assignments anyway
to analyze φ, much as the proof of Theorem 10.13 uses non-Boolean assign-
ments to analyze read-once branching programs. The variables range over a
finite field F with q elements, where q is at least 2^n.
We use p to redefine the functions fi that we defined in the proof idea section.
For 0 ≤ i ≤ m and for a1, ..., ai ∈ F, let

    fi(a1, ..., ai) = Σ over ai+1, ..., am ∈ {0,1} of p(a1, ..., am).

Observe that this redefinition extends the original definition because the two
agree when a1 through ai take on Boolean values. Thus, f0() is still the num-
ber of satisfying assignments of φ. Each of the functions fi(x1, ..., xi) can be
expressed as a polynomial in x1 through xi. The degree of each of these poly-
nomials is at most that of p.
Next, we present the protocol for #SAT. Initially, V receives input ⟨φ, k⟩ and
arithmetizes φ to obtain polynomial p. All arithmetic is done in the field F with
q elements, where q is a prime that is larger than 2^n. (Finding such a prime q
requires an extra step, but we ignore this point here because the proof we give
shortly of the stronger result IP = PSPACE doesn't require it.) A comment in
double brackets appears at the start of the description of each phase.

Phase 0. [[ P sends f0(). ]]
P→V: P sends f0() to V.
V checks that k = f0(). V rejects if that fails.

Phase 1. [[ P persuades V that f0() is correct if f1(r1) is correct. ]]
P→V: P sends the coefficients of f1(z) as a polynomial in z.
V uses these coefficients to evaluate f1(0) and f1(1).
V checks that f0() = f1(0) + f1(1) and rejects if not.
(Remember that all calculations are done over F.)
V→P: V selects r1 at random from F and sends it to P.

Phase 2. [[ P persuades V that f1(r1) is correct if f2(r1, r2) is correct. ]]
P→V: P sends the coefficients of f2(r1, z) as a polynomial in z.
V uses these coefficients to evaluate f2(r1, 0) and f2(r1, 1).
V checks that f1(r1) = f2(r1, 0) + f2(r1, 1) and rejects if not.
V→P: V selects r2 at random from F and sends it to P.
...

Phase i. [[ P persuades V that fi−1(r1, ..., ri−1) is correct if fi(r1, ..., ri) is correct. ]]
P→V: P sends the coefficients of fi(r1, ..., ri−1, z) as a polynomial in z.
V uses these coefficients to evaluate fi(r1, ..., ri−1, 0) and fi(r1, ..., ri−1, 1).
V checks that fi−1(r1, ..., ri−1) = fi(r1, ..., ri−1, 0) + fi(r1, ..., ri−1, 1) and
rejects if not.
V→P: V selects ri at random from F and sends it to P.
...

Phase m+1. [[ V checks directly that fm(r1, ..., rm) is correct. ]]
V evaluates p(r1, ..., rm) to compare with the value V has for fm(r1, ..., rm).
If they are equal, V accepts; otherwise, V rejects.

That completes the description of the protocol.
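The whole protocol can be run end to end on a toy formula. In the sketch below (illustrative; the names, the tiny field, and the degree bound D are assumptions), the honest Prover transmits each univariate polynomial fi(r1, ..., ri−1, z) as D+1 sample values, which the Verifier interpolates; for a polynomial of degree at most D this carries the same information as the coefficients the text has the Prover send.

```python
import random
from itertools import product

Q = 257  # a prime larger than 2^n for this tiny demo (an assumption)
M = 2    # number of variables in the toy formula below
D = 4    # generous bound on the degree of the Prover's polynomials

# phi = (x0 OR x1) AND (NOT x0 OR x1), clauses as (variable, polarity) pairs.
PHI = [[(0, 1), (1, 1)], [(0, 0), (1, 1)]]

def p(x):
    # Arithmetization of phi mod Q: AND -> product, OR -> 1-(1-a)(1-b).
    val = 1
    for clause in PHI:
        miss = 1
        for v, pol in clause:
            lit = x[v] % Q if pol else (1 - x[v]) % Q
            miss = miss * (1 - lit) % Q
        val = val * (1 - miss) % Q
    return val

def f(prefix):
    # f_i extended to F: sum of p over Boolean settings of the free variables.
    return sum(p(list(prefix) + list(t))
               for t in product([0, 1], repeat=M - len(prefix))) % Q

def lagrange_eval(ys, x):
    # Value at x of the unique degree-<len(ys) polynomial with ys[j] at j.
    total = 0
    for j in range(len(ys)):
        num, den = 1, 1
        for k in range(len(ys)):
            if k != j:
                num = num * (x - k) % Q
                den = den * (j - k) % Q
        total = (total + ys[j] * num * pow(den, Q - 2, Q)) % Q
    return total

def sumcheck(k, rng):
    # Phases 0..m+1 of the #SAT protocol with an honest Prover.
    if f(()) != k % Q:                               # phase 0
        return "reject"
    r, claim = [], f(())
    for _ in range(M):                               # phases 1..m
        ys = [f(tuple(r) + (t,)) for t in range(D + 1)]  # Prover's polynomial
        if (lagrange_eval(ys, 0) + lagrange_eval(ys, 1)) % Q != claim:
            return "reject"
        ri = rng.randrange(Q)                        # Verifier's challenge
        r.append(ri)
        claim = lagrange_eval(ys, ri)
    return "accept" if p(r) == claim else "reject"   # phase m+1: evaluate p
```

With the true count (k = 2 here) the honest Prover always wins; with any other k, phase 0 already fails, and a crooked Prover who lies later is caught at a random challenge except with the small probability the correctness argument bounds.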
Now we show that this protocol decides #SAT. First, if φ has k satisfying
assignments, V obviously accepts with certainty if Prover P follows the protocol.
Second, we show that if φ doesn't have k assignments, no Prover can make it
accept with more than a low probability. Let P̃ be any Prover.

To prevent V from rejecting outright, P̃ must send an incorrect value f̃0()
for f0() in phase 0. Therefore, in phase 1, one of the values that V calculates
for f1(0) and f1(1) must be incorrect. Thus, the coefficients that P̃ sent for
f1(z) as a polynomial in z must be wrong. Let f̃1(z) be the function that these
coefficients represent instead. Next comes a key step of the proof.

When V picks a random r1 in F, we claim that f̃1(r1) is unlikely to equal
f1(r1). For n ≥ 10, we show that

    Pr[ f̃1(r1) = f1(r1) ] < n^{−2}.

That bound on the probability follows from Lemma 10.14: A polynomial in
a single variable of degree at most d can have no more than d roots, unless it
always evaluates to 0. Therefore, any two polynomials in a single variable of
degree at most d can agree in at most d places, unless they agree everywhere.
Recall that the degree of the polynomial for f1 is at most n, and that V rejects
if the degree of the polynomial P̃ sends for f̃1 is greater than n. We have al-
ready determined that these functions don't agree everywhere, so Lemma 10.14
implies they can agree in at most n places. The size of F is greater than 2^n. The
chance that r1 happens to be one of the places where the functions agree is at
most n/2^n, which is less than n^{−2} for n ≥ 10.
To recap what we've shown so far, if f̃0() is wrong, f̃1's polynomial must be
wrong, and then f̃1(r1) would likely be wrong by virtue of the preceding claim.
In the unlikely event that f̃1(r1) agrees with f1(r1), P̃ was "lucky" at this phase,
and it will be able to make V accept (even though V should reject) by following
the instructions for P in the rest of the protocol.

Continuing further with the argument, if f̃1(r1) were wrong, at least one of
the values V computes for f2(r1, 0) and f2(r1, 1) in phase 2 must be wrong, so
the coefficients that P̃ sent for f2(r1, z) as a polynomial in z must be wrong. Let
f̃2(r1, z) be the function these coefficients represent instead. The polynomials
for f2(r1, z) and f̃2(r1, z) have degree at most n. So as before, the probability
that they agree at a random r2 in F is at most n^{−2}. Thus, when V picks r2 at
random, f̃2(r1, r2) is likely to be wrong.

The general case follows in the same way to show that for each 1 ≤ i ≤ m, if

    f̃i−1(r1, ..., ri−1) ≠ fi−1(r1, ..., ri−1),

then for n ≥ 10 and for ri chosen at random in F,

    Pr[ f̃i(r1, ..., ri) = fi(r1, ..., ri) ] ≤ n^{−2}.

Thus, by giving an incorrect value for f0(), P̃ is probably forced to give incor-
rect values for f1(r1), f2(r1, r2), and so on to fm(r1, ..., rm). The probability
that P̃ gets lucky because V selects an ri where f̃i(r1, ..., ri) = fi(r1, ..., ri)
even though f̃i and fi are different in some phase, is at most the number of
phases mtimes n−2or at most 1/n.I f ⎪tildewidePnever gets lucky, it eventually sends
an incorrect value for fm(r1,...,r m).B u t Vchecks that value of fmdirectly
in phase m+1and will catch any error at that point. So if kis not the num-
ber of satisfying assignments of φ,n oP r o v e rc a nm a k et h eV e r i fi e ra c c e p tw i t h
probability greater than 1/n.
To complete the proof of the theorem, we need only show that the Verifier operates in probabilistic polynomial time, which is obvious from its description.
Next, we return to the proof of Lemma 10.32, that PSPACE $\subseteq$ IP. The proof is similar to that of Theorem 10.33 except for an additional idea used here to lower the degrees of polynomials that occur in the protocol.
PROOF IDEA Let's first try the idea we used in the preceding proof and determine where the difficulty occurs. To show that every language in PSPACE is in IP, we need only show that the PSPACE-complete language TQBF is in IP. Let $\psi$ be a quantified Boolean formula of the form
$$\psi = Q_1 x_1\, Q_2 x_2 \cdots Q_m x_m\,[\phi],$$
where $\phi$ is a cnf-formula and each $Q_i$ is $\exists$ or $\forall$. We define functions $f_i$ as before, except that now we take the quantifiers into account. For $0 \le i \le m$ and $a_1, \ldots, a_i \in \{0,1\}$, let
$$f_i(a_1, \ldots, a_i) = \begin{cases} 1 & \text{if } Q_{i+1} x_{i+1} \cdots Q_m x_m\,[\phi(a_1, \ldots, a_i)] \text{ is true};\\ 0 & \text{otherwise}, \end{cases}$$
where $\phi(a_1, \ldots, a_i)$ is $\phi$ with $a_1$ through $a_i$ substituted for $x_1$ through $x_i$. Thus, $f_0()$ is the truth value of $\psi$. We then have the arithmetic identities
$$Q_{i+1} = \forall:\quad f_i(a_1, \ldots, a_i) = f_{i+1}(a_1, \ldots, a_i, 0) \cdot f_{i+1}(a_1, \ldots, a_i, 1), \quad\text{and}$$
$$Q_{i+1} = \exists:\quad f_i(a_1, \ldots, a_i) = f_{i+1}(a_1, \ldots, a_i, 0) \ast f_{i+1}(a_1, \ldots, a_i, 1).$$
Recall that we defined $x \ast y$ to be $1 - (1-x)(1-y)$.
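These identities can be checked exhaustively on Boolean inputs. A minimal Python sketch (the helper name `star` is ours, not the text's):

```python
# Check the arithmetic identities for the quantifiers on Boolean inputs.
def star(x, y):
    """The x * y operation from the text: 1 - (1-x)(1-y), arithmetized OR."""
    return 1 - (1 - x) * (1 - y)

for b0 in (0, 1):
    for b1 in (0, 1):
        # FORALL: true iff both branches are true -> ordinary product
        assert (b0 and b1) == b0 * b1
        # EXISTS: true iff some branch is true -> the star operation
        assert (b0 or b1) == star(b0, b1)
print("identities hold on all Boolean inputs")
```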
A natural variation of the protocol for #SAT suggests itself, where we extend the $f_i$'s to a finite field and use the identities for quantifiers instead of the identities for summation. The problem with this idea is that, when arithmetized, every quantifier may double the degree of the resulting polynomial. The degrees of the polynomials might then grow exponentially large, which would require the Verifier to run for exponential time to process the exponentially many coefficients that the Prover would need to send to describe the polynomials.
To keep the degrees of the polynomials small, we introduce a reduction operation $R$ that reduces the degrees of polynomials without changing their behavior on Boolean inputs.
PROOF Let $\psi = Q x_1 \cdots Q x_m\,[\phi]$ be a quantified Boolean formula, where $\phi$ is a cnf-formula. To arithmetize $\psi$, we introduce the expression
$$\psi' = Q x_1\, R x_1\, Q x_2\, R x_1 R x_2\, Q x_3\, R x_1 R x_2 R x_3 \cdots Q x_m\, R x_1 \cdots R x_m\,[\phi].$$
Don't worry about the meaning of $R x_i$ for now. It is useful only for defining the functions $f_i$. We rewrite $\psi'$ as
$$\psi' = S_1 y_1\, S_2 y_2 \cdots S_k y_k\,[\phi],$$
where each $S_i \in \{\forall, \exists, R\}$ and $y_i \in \{x_1, \ldots, x_m\}$.
For each $i \le k$, we define the function $f_i$. We define $f_k(x_1, \ldots, x_m)$ to be the polynomial $p(x_1, \ldots, x_m)$ obtained by arithmetizing $\phi$. For $i < k$, we define $f_i$ in terms of $f_{i+1}$:
$$S_{i+1} = \forall:\quad f_i(\ldots) = f_{i+1}(\ldots, 0) \cdot f_{i+1}(\ldots, 1);$$
$$S_{i+1} = \exists:\quad f_i(\ldots) = f_{i+1}(\ldots, 0) \ast f_{i+1}(\ldots, 1);$$
$$S_{i+1} = R:\quad f_i(\ldots, a) = (1-a) f_{i+1}(\ldots, 0) + a f_{i+1}(\ldots, 1).$$
If $S_{i+1}$ is $\forall$ or $\exists$, $f_i$ has one fewer input variable than $f_{i+1}$ does. If $S_{i+1}$ is $R$, the two functions have the same number of input variables. Thus, function $f_i$ will not, in general, depend on $i$ variables. To avoid cumbersome subscripts, we use "$\ldots$" in place of $a_1$ through $a_j$ for the appropriate values of $j$. Furthermore, we reorder the inputs to the functions so that input variable $y_{i+1}$ is the last argument.
Note that the $R x$ operation on polynomials doesn't change their values on Boolean inputs. Therefore, $f_0()$ is still the truth value of $\psi$. However, note that the $R x$ operation produces a result that is linear in $x$. We added $R x_1 \cdots R x_i$ after $Q_i x_i$ in $\psi'$ in order to reduce the degree of each variable to $1$ prior to the squaring due to arithmetizing $Q_i$.
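The degree-reduction operation can be seen on a single polynomial. In the sketch below (our own illustration), $R$ replaces $x^3$ with the linear polynomial that agrees with it on $\{0,1\}$:

```python
# The R (degree-reduction) operation: given f, define
#   (Rf)(a) = (1 - a) * f(0) + a * f(1),
# which is linear in a and matches f on Boolean inputs.
def R(f):
    return lambda a: (1 - a) * f(0) + a * f(1)

f = lambda x: x ** 3          # a high-degree polynomial
g = R(f)                      # g(a) = (1-a)*0 + a*1 = a, linear in a

assert g(0) == f(0) and g(1) == f(1)   # unchanged on Boolean inputs
assert g(5) == 5                        # linear: g(a) = a
print(f(5), g(5))                       # 125 differs from 5 off {0, 1}
```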
Now we are ready to describe the protocol. All arithmetic operations in this protocol are over a field $\mathbb{F}$ of size at least $n^4$, where $n$ is the length of $\psi$. $V$ can find a prime of this size on its own, so $P$ doesn't need to provide one.
Phase 0. [$P$ sends $f_0()$.]
$P \to V$: $P$ sends $f_0()$ to $V$.
$V$ checks that $f_0() = 1$ and rejects if not.
...
Phase $i$. [$P$ persuades $V$ that $f_{i-1}(r_1 \cdots)$ is correct if $f_i(r_1 \cdots, r)$ is correct.]
$P \to V$: $P$ sends the coefficients of $f_i(r_1 \cdots, z)$ as a polynomial in $z$.
(Here $r_1 \cdots$ denotes a setting of the variables to the previously selected random values $r_1, r_2, \ldots$.)
$V$ uses these coefficients to evaluate $f_i(r_1 \cdots, 0)$ and $f_i(r_1 \cdots, 1)$.
$V$ checks that these identities hold:
$$f_{i-1}(r_1 \cdots) = \begin{cases} f_i(r_1 \cdots, 0) \cdot f_i(r_1 \cdots, 1) & S_i = \forall,\\ f_i(r_1 \cdots, 0) \ast f_i(r_1 \cdots, 1) & S_i = \exists, \end{cases}$$
and
$$f_{i-1}(r_1 \cdots, r) = (1-r) f_i(r_1 \cdots, 0) + r f_i(r_1 \cdots, 1) \quad S_i = R.$$
If not, $V$ rejects.
$V \to P$: $V$ picks a random $r$ in $\mathbb{F}$ and sends it to $P$.
(When $S_i = R$, this $r$ replaces the previous $r$.)
Go to Phase $i+1$, where $P$ must persuade $V$ that $f_i(r_1 \cdots, r)$ is correct.
...
Phase $k+1$. [$V$ checks directly that $f_k(r_1, \ldots, r_m)$ is correct.]
$V$ evaluates $p(r_1, \ldots, r_m)$ to compare with the value $V$ has for $f_k(r_1, \ldots, r_m)$. If they are equal, $V$ accepts; otherwise, $V$ rejects.
That completes the description of the protocol.
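One phase of the verifier's work can be sketched concretely. All numbers below are invented for illustration; in the protocol, the coefficients come from the prover's message and the value for $f_{i-1}$ from the previous phase:

```python
import random

# Verifier-side check for a single phase with S_i = FORALL, over F_p.
p = 10007

def poly_eval(coeffs, x):
    """Horner evaluation of a polynomial (low-to-high coefficients) mod p."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % p
    return result

coeffs = [3, 4]                       # prover claims f_i(r1..., z) = 3 + 4z
v0 = poly_eval(coeffs, 0)             # f_i(r1..., 0) = 3
v1 = poly_eval(coeffs, 1)             # f_i(r1..., 1) = 7
claimed_prev = (v0 * v1) % p          # value V holds for f_{i-1}(r1...) = 21

assert claimed_prev == (v0 * v1) % p  # S_i = FORALL identity; else V rejects

r = random.randrange(p)               # V's random challenge for the next phase
next_claim = poly_eval(coeffs, r)     # P must now defend f_i(r1..., r)
print("phase check passed; next claim:", next_claim)
```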
Proving the correctness of this protocol is similar to proving the correctness of the #SAT protocol. Clearly, if $\psi$ is true, $P$ can follow the protocol and $V$ will accept. If $\psi$ is false, $\tilde{P}$ must lie at phase 0 by sending an incorrect value for $f_0()$. At phase $i$, if $V$ has an incorrect value for $f_{i-1}(r_1 \cdots)$, one of the values $f_i(r_1 \cdots, 0)$ and $f_i(r_1 \cdots, 1)$ must be incorrect, and the polynomial for $f_i$ must be incorrect. Consequently, for a random $r$, the probability that $\tilde{P}$ gets lucky at this phase because $f_i(r_1 \cdots, r)$ is correct is at most the polynomial degree divided by the field size, or $n/n^4$. The protocol proceeds for $O(n^2)$ phases, so the probability that $\tilde{P}$ gets lucky at some phase is at most $1/n$. If $\tilde{P}$ is never lucky, $V$ will reject at phase $k+1$.
10.5
PARALLEL COMPUTATION
A parallel computer is one that can perform multiple operations simultaneously. Parallel computers may solve certain problems much faster than sequential computers, which can only do a single operation at a time. In practice, the distinction between the two is slightly blurred because most real computers (including "sequential" ones) are designed to use some parallelism as they execute individual instructions. We focus here on massive parallelism, whereby a huge number (think of millions or more) of processing elements are actively participating in a single computation.
In this section, we briefly introduce the theory of parallel computation. We
describe one model of a parallel computer and use it to give examples of cer-
tain problems that lend themselves well to parallelization. We also explore the
possibility that parallelism may not be suitable for certain other problems.
UNIFORM BOOLEAN CIRCUITS
One of the most popular models in theoretical work on parallel algorithms is called the Parallel Random Access Machine, or PRAM. In the PRAM model, idealized processors with a simple instruction set patterned on actual computers interact via a shared memory. In this short section we can't describe PRAMs in detail. Instead, we use an alternative model of parallel computer that we introduced for another purpose in Chapter 9: Boolean circuits.
Boolean circuits have certain advantages and disadvantages as a parallel com-
putation model. On the positive side, the model is simple to describe, which
makes proofs easier. Circuits also bear an obvious resemblance to actual hard-
ware designs, and in that sense the model is realistic. On the negative side,
circuits are awkward to “program” because the individual processors are so weak.
Furthermore, we disallow cycles in our definition of Boolean circuits, in contrast
to circuits that we can actually build.
In the Boolean circuit model of a parallel computer, we take each gate to be an individual processor, so we define the processor complexity of a Boolean circuit to be its size. We consider each processor to compute its function in a single time step, so we define the parallel time complexity of a Boolean circuit to be its depth, or the longest distance from an input variable to the output gate.
Any particular circuit has a fixed number of input variables, so we use cir-
cuit families as defined in Definition 9.27 for deciding languages. We need to
impose a technical requirement on circuit families so that they correspond to
parallel computation models such as PRAMs, where a single machine is capable
of handling all input lengths. That requirement states that we can easily ob-
tain all members in a circuit family. This uniformity requirement is reasonable
because knowing that a small circuit exists for deciding certain elements of a
language isn’t very useful if the circuit itself is hard to find. That leads us to the
following definition.
DEFINITION 10.34
A family of circuits $(C_0, C_1, C_2, \ldots)$ is uniform if some log space transducer $T$ outputs $\langle C_n \rangle$ when $T$'s input is $1^n$.
Recall that Definition 9.28 defined the size and depth complexity of languages in terms of families of circuits of minimal size and depth. Here, we consider the simultaneous size and depth of a single circuit family in order to identify how many processors we need in order to achieve a particular parallel time complexity, or vice versa. Say that a language has simultaneous size–depth circuit complexity at most $(f(n), g(n))$ if a uniform circuit family exists for that language with size complexity $f(n)$ and depth complexity $g(n)$.
EXAMPLE 10.35
Let $A$ be the language over $\{0,1\}$ consisting of all strings with an odd number of 1s. We can test membership in $A$ by computing the parity function. We can implement the two-input parity gate $x \oplus y$ with the standard AND, OR, and NOT operations as $(x \wedge \neg y) \vee (\neg x \wedge y)$. Let the inputs to the circuit be $x_1, \ldots, x_n$. One way to get a circuit for the parity function is to construct gates $g_i$ whereby $g_1 = x_1$ and $g_i = x_i \oplus g_{i-1}$ for $2 \le i \le n$. This construction uses $O(n)$ size and depth.
Example 9.29 described another circuit for the parity function with $O(n)$ size and $O(\log n)$ depth by constructing a binary tree of $\oplus$ gates. This construction is a significant improvement because it uses exponentially less parallel time than does the preceding construction. Thus, the size–depth complexity of $A$ is $(O(n), O(\log n))$.
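Both constructions can be simulated directly. The sketch below (our own code, not from the text) computes parity with the linear chain and with a balanced tree, and tracks the tree's depth:

```python
# Compare the linear XOR chain (depth ~ n) with a balanced XOR tree
# (depth ~ log n); both compute the parity of the input bits.
import math

def chain_parity(bits):
    """g_i = x_i XOR g_{i-1}: O(n) gates arranged in O(n) depth."""
    g = bits[0]
    for x in bits[1:]:
        g ^= x
    return g

def tree_parity(bits):
    """Binary tree of XOR gates: O(n) gates in O(log n) depth.
    Returns (parity value, depth of the tree)."""
    if len(bits) == 1:
        return bits[0], 0
    mid = len(bits) // 2
    left, dl = tree_parity(bits[:mid])
    right, dr = tree_parity(bits[mid:])
    return left ^ right, 1 + max(dl, dr)

bits = [1, 0, 1, 1, 0, 1, 0, 0]        # eight input bits, four 1s: parity 0
value, depth = tree_parity(bits)
assert value == chain_parity(bits) == sum(bits) % 2
assert depth == math.ceil(math.log2(len(bits)))   # 3 for n = 8
print(value, depth)
```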
EXAMPLE 10.36
Recall that we may use circuits to compute functions that output strings. Consider the Boolean matrix multiplication function. The input has $2m^2 = n$ variables representing two $m \times m$ matrices $A = \{a_{ik}\}$ and $B = \{b_{ik}\}$. The output is $m^2$ values representing the $m \times m$ matrix $C = \{c_{ik}\}$, where
$$c_{ik} = \bigvee_j \bigl(a_{ij} \wedge b_{jk}\bigr).$$
The circuit for this function has gates $g_{ijk}$ that compute $a_{ij} \wedge b_{jk}$ for each $i$, $j$, and $k$. Additionally, for each $i$ and $k$, the circuit contains a binary tree of $\vee$ gates to compute $\bigvee_j g_{ijk}$. Each such tree contains $m-1$ OR gates and has $\log m$ depth. Consequently, these circuits for Boolean matrix multiplication have size $O(m^3) = O(n^{3/2})$ and depth $O(\log n)$.
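The formula for $c_{ik}$ translates line-for-line into code. A small sketch (plain Python standing in for the circuit; the matrices are invented):

```python
# Boolean matrix multiplication: c_ik = OR over j of (a_ij AND b_jk).
def bool_matmul(A, B):
    m = len(A)
    return [
        [any(A[i][j] and B[j][k] for j in range(m)) for k in range(m)]
        for i in range(m)
    ]

A = [[1, 0], [0, 1]]          # identity, so A * B = B
B = [[0, 1], [1, 1]]
C = bool_matmul(A, B)
assert C == [[False, True], [True, True]]

# Gate count: one AND gate g_ijk per (i, j, k) triple -> m^3 of them,
# plus m - 1 OR gates per (i, k) pair arranged in a log(m)-depth tree.
m = len(A)
print("AND gates:", m ** 3, "OR gates:", m * m * (m - 1))
```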
EXAMPLE 10.37
If $A = \{a_{ij}\}$ is an $m \times m$ matrix, we let the transitive closure of $A$ be the matrix
$$A \vee A^2 \vee \cdots \vee A^m,$$
where $A^i$ is the matrix product of $A$ with itself $i$ times and $\vee$ is the bitwise OR of the matrix elements. The transitive closure operation is closely related to the PATH problem and hence to the class NL. If $A$ is the adjacency matrix of a directed graph $G$, $A^i$ is the adjacency matrix of the graph with the same nodes in which an edge indicates the presence of a path of length $i$ in $G$. The transitive closure of $A$ is the adjacency matrix of the graph in which an edge indicates the presence of a path of any length in $G$.
We can represent the computation of $A^i$ with a binary tree of size $i$ and depth $\log i$, wherein a node computes the product of the two matrices below it. Each node is computed by a circuit of $O(n^{3/2})$ size and logarithmic depth. Hence the circuit computing $A^m$ has size $O(n^2)$ and depth $O(\log^2 n)$. We make circuits for each $A^i$, which adds another factor of $m$ to the size and an additional layer of $O(\log n)$ depth. Hence the size–depth complexity of transitive closure is $(O(n^{5/2}), O(\log^2 n))$.
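The definition $A \vee A^2 \vee \cdots \vee A^m$ can be computed naively as a sanity check. A sketch (our own code; an actual NC circuit would use the tree of matrix products described above rather than this sequential loop):

```python
# Transitive closure of an m x m adjacency matrix: A OR A^2 OR ... OR A^m.
def bool_matmul(A, B):
    m = len(A)
    return [[any(A[i][j] and B[j][k] for j in range(m)) for k in range(m)]
            for i in range(m)]

def transitive_closure(A):
    m = len(A)
    result = [row[:] for row in A]
    power = [row[:] for row in A]
    for _ in range(m - 1):
        power = bool_matmul(power, A)        # A^2, A^3, ..., A^m in turn
        result = [[result[i][k] or power[i][k] for k in range(m)]
                  for i in range(m)]
    return [[int(v) for v in row] for row in result]

# Directed path graph 0 -> 1 -> 2: the closure adds the edge 0 -> 2.
A = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
T = transitive_closure(A)
assert T == [[0, 1, 1],
             [0, 0, 1],
             [0, 0, 0]]
print(T)
```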
THE CLASS NC
Many interesting problems have size–depth complexity $(O(n^k), O(\log^k n))$ for some constant $k$. Such problems may be considered highly parallelizable with a moderate number of processors. That prompts the following definition.
DEFINITION 10.38
Fori≥1,l e t NCibe the class of languages that can be decided
by a uniform3family of circuits with polynomial size and O(login)
depth. Let NCbe the class of languages that are in NCifor some i.
Functions that are computed by such circuit families are called NCi
computable orNCcomputable .4
We explore the relationship of these complexity classes with other classes of
languages we have encountered. First, we make a connection between Turing
machine space and circuit depth. Problems that are solvable in logarithmic depth
are also solvable in logarithmic space. Conversely, problems that are solvable in
logarithmic space, even nondeterministically, are solvable in logarithmic squared
depth.
THEOREM 10.39
$\mathrm{NC}^1 \subseteq \mathrm{L}$.
PROOF We sketch a log space algorithm to decide a language $A$ in $\mathrm{NC}^1$. On input $w$ of length $n$, the algorithm can construct the description, as needed, of the $n$th circuit in the uniform circuit family for $A$. Then the algorithm can evaluate the circuit by using a depth-first search from the output gate. Memory for this search is needed only to record the path to the currently explored gate, and to record any partial results that have been obtained along that path. The circuit has logarithmic depth; hence only logarithmic space is required by the simulation.
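The depth-first evaluation described in this proof can be made concrete. The sketch below uses a gate encoding of our own devising; the explicit stack holds only the path from the output gate down to the current gate, together with the partial results along that path, mirroring the space analysis:

```python
# Evaluate a circuit by depth-first search from the output gate, storing
# only the path to the current gate plus partial results along it; for a
# depth-d circuit, the stack never exceeds d + 1 frames.
def eval_circuit(gates, out, inputs):
    # Each gate: ("AND", g1, g2), ("OR", g1, g2), ("NOT", g1), or ("IN", i).
    stack = [[out, []]]                   # frames: [gate name, child values]
    while True:
        name, vals = stack[-1]
        op, *args = gates[name]
        if op == "IN":
            value = inputs[args[0]]
        elif len(vals) < len(args):
            stack.append([args[len(vals)], []])   # descend to next child
            continue
        else:
            value = ({"AND": lambda a, b: a & b,
                      "OR": lambda a, b: a | b,
                      "NOT": lambda a: 1 - a}[op])(*vals)
        stack.pop()
        if not stack:
            return value
        stack[-1][1].append(value)        # partial result kept on the path

gates = {"g1": ("IN", 0), "g2": ("IN", 1),
         "g3": ("NOT", "g2"), "out": ("AND", "g1", "g3")}
print(eval_circuit(gates, "out", [1, 0]))   # x0 AND (NOT x1) -> 1
```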
³Defining uniformity in terms of log space transducers is standard for $\mathrm{NC}^i$ when $i \ge 2$, but gives a nonstandard result for $\mathrm{NC}^1$ (which contains the standard class $\mathrm{NC}^1$ as a subset). We give this definition anyway because it is simpler and adequate for our purposes.
⁴Steven Cook coined the name NC for "Nick's class" because Nick Pippenger was the first person to recognize its importance.
THEOREM 10.40
$\mathrm{NL} \subseteq \mathrm{NC}^2$.
PROOF IDEA Compute the transitive closure of the graph of configurations
of an NL-machine. Output the position corresponding to the presence of a path
from the start configuration to the accept configuration.
PROOF Let $A$ be a language that is decided by an NL machine $M$, where $A$ has been encoded into the $\{0,1\}$ alphabet. We construct a uniform circuit family $(C_0, C_1, \ldots)$ for $A$. To get $C_n$, we construct a graph $G$ that is similar to the computation graph for $M$ on an input $w$ of length $n$. We do not know the input $w$ when we construct the circuit, only its length $n$. The inputs to the circuit are variables $w_1$ through $w_n$, each corresponding to a position in the input.
Recall that a configuration of $M$ on $w$ describes the state, the contents of the work tape, and the positions of both the input and the work tape heads, but does not include $w$ itself. Hence the collection of configurations of $M$ on $w$ does not actually depend on $w$, only on $w$'s length $n$. These polynomially many configurations form the nodes of $G$.
The edges of $G$ are labeled with the input variables $w_i$. If $c_1$ and $c_2$ are two nodes of $G$, and $c_1$ indicates input head position $i$, we put edge $(c_1, c_2)$ in $G$ with label $w_i$ (or $\overline{w_i}$) if $c_1$ can yield $c_2$ in a single step when the input head is reading a 1 (or 0), according to $M$'s transition function. If $c_1$ can yield $c_2$ in a single step whatever the input head is reading, we put that edge in $G$ unlabeled.
If we set the edges of $G$ according to a string $w$ of length $n$, a path exists from the start configuration to the accepting configuration if and only if $M$ accepts $w$. Hence a circuit that computes the transitive closure of $G$ and outputs the position indicating the presence of such a path accepts exactly those strings in $A$ of length $n$. That circuit has polynomial size and $O(\log^2 n)$ depth.
A log space transducer is capable of constructing $G$, and therefore $C_n$, on input $1^n$. See Theorem 8.25 for a more detailed description of a similar log space transducer.
The class of problems solvable in polynomial time includes all the problems solvable in NC, as the following theorem shows.
THEOREM 10.41
$\mathrm{NC} \subseteq \mathrm{P}$.
PROOF A polynomial time algorithm can run the log space transducer to generate circuit $C_n$ and simulate it on an input of length $n$.
P-COMPLETENESS
Now we consider the possibility that all problems in P are also in NC. Equality between these classes would be surprising because it would imply that all polynomial time solvable problems are highly parallelizable. We introduce the phenomenon of P-completeness to give theoretical evidence that some problems in P are inherently sequential.
DEFINITION 10.42
A language $B$ is P-complete if
1. $B \in \mathrm{P}$, and
2. every $A$ in P is log space reducible to $B$.
The next theorem follows in the spirit of Theorem 8.23 and has a similar proof because NC circuit families can compute log space reductions. We leave its proof as Exercise 10.3.
THEOREM 10.43
If $A \le_{\mathrm{L}} B$ and $B$ is in NC, then $A$ is in NC.
We show that the problem of circuit evaluation is P-complete. For a circuit $C$ and input setting $x$, we write $C(x)$ to be the value of $C$ on $x$. Let
$$\textit{CIRCUIT-VALUE} = \{\langle C, x \rangle \mid C \text{ is a Boolean circuit and } C(x) = 1\}.$$
THEOREM 10.44
CIRCUIT-VALUE is P-complete.
PROOF The construction given in Theorem 9.30 shows how to reduce any language $A$ in P to CIRCUIT-VALUE. On input $w$, the reduction produces a circuit that simulates the polynomial time Turing machine for $A$. The input to the circuit is $w$ itself. The reduction can be carried out in log space because the circuit it produces has a simple and repetitive structure.
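Membership in CIRCUIT-VALUE itself is easy to decide in polynomial time, which is requirement 1 of P-completeness. A sketch (our own gate encoding, not the book's) evaluates the gates in topological order:

```python
# A decider for CIRCUIT-VALUE: given an encoding of C and an input setting x,
# evaluate the gates in order and accept iff the output gate is 1.
def circuit_value(circuit, x):
    # circuit is a list of gates in topological order; each gate is
    # ("IN", i), ("AND", j, k), ("OR", j, k), or ("NOT", j), where j and k
    # index earlier gates. The last gate is the output.
    values = []
    for gate in circuit:
        op, *args = gate
        if op == "IN":
            values.append(x[args[0]])
        elif op == "AND":
            values.append(values[args[0]] & values[args[1]])
        elif op == "OR":
            values.append(values[args[0]] | values[args[1]])
        else:  # NOT
            values.append(1 - values[args[0]])
    return values[-1] == 1

# C computes (x0 OR x1) AND NOT x2.
C = [("IN", 0), ("IN", 1), ("IN", 2), ("OR", 0, 1), ("NOT", 2), ("AND", 3, 4)]
assert circuit_value(C, [1, 0, 0])       # accepted: C(x) = 1
assert not circuit_value(C, [1, 0, 1])   # rejected: C(x) = 0
```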
10.6
CRYPTOGRAPHY
The practice of encryption, using secret codes for private communication, dates
back thousands of years. During Roman times, Julius Caesar encoded mes-
sages to his generals to protect against the possibility of interception. More
recently, Alan Turing, the inventor of the Turing machine, led a group of British
mathematicians who broke the German code used in World War II for sending
instructions to U-boats patrolling the Atlantic Ocean. Governments still depend
on secret codes and invest a great deal of effort in devising codes that are hard to
break, and in finding weaknesses in codes that others use. These days, corpora-
tions and individuals use encryption to increase the security of their information.
Soon, nearly all electronic communication will be cryptographically protected.
In recent years, computational complexity theory has led to a revolution in
the design of secret codes. The field of cryptography, as this area is known,
now extends well beyond secret codes for private communication and addresses
a broad range of issues concerning the security of information. For example, we
now have the technology to digitally “sign” messages to authenticate the identity
of the sender; to allow electronic elections whereby participants can vote over a
network and the results can be publicly tallied without revealing any individual’s
vote, while preventing multiple voting and other violations; and to construct new
kinds of secret codes that do not require the communicators to agree in advance
on the encryption and decryption algorithms.
Cryptography is an important practical application of complexity theory.
Digital cellular telephones, direct satellite television broadcast, and electronic
commerce over the Internet all depend on cryptographic measures to protect
information. Such systems will soon play a role in most people’s lives. Indeed,
cryptography has stimulated much research in complexity theory and in other
mathematical fields.
SECRET KEYS
Traditionally, when a sender wants to encrypt a message so that only a certain recipient can decrypt it, the sender and receiver share a secret key. The secret key is a piece of information that is used by the encrypting and decrypting algorithms. Maintaining the secrecy of the key is crucial to the security of the code because any person with access to the key can encrypt and decrypt messages.
A key that is too short may be discovered through a brute-force search of the entire space of possible keys. Even a somewhat longer key may be vulnerable to certain kinds of attack; we say more about that shortly. The only way to get perfect cryptographic security is with keys that are as long as the combined length of all messages sent.
A key that is as long as the combined message length is called a one-time pad. Essentially, every bit of a one-time pad key is used just once to encrypt a bit of the message, and then that bit of the key is discarded. The main problem with one-time pads is that they may be rather large if a significant amount of communication is anticipated. For most purposes, one-time pads are too cumbersome to be considered practical.
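A one-time pad is simple to sketch in code. The XOR-based version below is a standard illustration (the message is invented); encryption and decryption are the same operation, and each key byte must never be reused:

```python
# One-time pad: each key byte encrypts exactly one message byte via XOR.
import secrets

def xor_bytes(data, pad):
    """XOR two equal-length byte strings; used for both encrypt and decrypt."""
    return bytes(d ^ p for d, p in zip(data, pad))

message = b"ATTACK AT DAWN"
pad = secrets.token_bytes(len(message))   # key as long as the message

ciphertext = xor_bytes(message, pad)
recovered = xor_bytes(ciphertext, pad)    # XOR with the same pad undoes it
assert recovered == message
print(recovered.decode())
```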
A cryptographic code that allows an unlimited amount of secure communication with keys of only moderate length is preferable. Interestingly, such codes can't exist in principle but paradoxically are used in practice. This type of code can't exist in principle because a key that is significantly shorter than the combined message length can be found by a brute-force search through the space of possible keys. Therefore, a code that is based on such keys is breakable in principle. But therein lies the solution to the paradox. A code could provide adequate security in practice anyway because brute-force search is extremely slow when the key is moderately long, say in the range of 100 bits. Of course, if the code could be broken in some other, fast way, it is insecure and shouldn't be used. The difficulty lies in being sure that the code can't be broken quickly.
We currently have no way of ensuring that a code with moderate-length keys is actually secure. To guarantee that a code can't be broken quickly, we'd need a mathematical proof that, at the very least, finding the key can't be done quickly. However, such proofs seem beyond the capabilities of contemporary mathematics! The reason is that once a key is discovered, verifying its correctness is easily done by inspecting the messages that have been decrypted with it. Therefore, the key verification problem can be formulated so as to be in P. If we could prove that keys can't be found in polynomial time, we would achieve a major mathematical advance by proving that P is different from NP.
Because we are unable to prove mathematically that codes are unbreakable,
we rely instead on circumstantial evidence. In the past, evidence of a code’s qual-
ity was obtained by hiring experts who tried to break it. If they were unable to do
so, confidence in its security increased. That approach has obvious deficiencies.
If someone has better experts than ours, or if we can’t trust our own experts, the
integrity of our code may be compromised. Nonetheless, this approach was the
only one available until recently and was used to support the reliability of widely
used codes such as the Data Encryption Standard (DES) that was sanctioned by
the U.S. National Institute of Standards and Technology.
Complexity theory provides another way to gain evidence for a code’s secu-
rity. We may show that the complexity of breaking the code is linked to the
complexity of some other problem for which compelling evidence of intractabil-
ity is already available. Recall that we have used NP-completeness to provide
evidence that certain problems are intractable. Reducing an NP-complete prob-
lem to the code-breaking problem would show that the code-breaking problem
was itself NP-complete. However, that doesn’t provide sufficient evidence of
security because NP-completeness concerns worst-case complexity. A problem
may be NP-complete, yet easy to solve most of the time. Codes must almost al-
ways be difficult to break, so we need to measure average-case complexity rather
than worst-case complexity.
One problem that is generally believed to be difficult for the average case is
the problem of integer factorization. Top mathematicians have been interested
in factorization for centuries, but no one has yet discovered a fast procedure for
doing so. Certain modern codes have been built around the factoring problem
so that breaking the code corresponds to factoring a number. That constitutes
convincing evidence for the security of these codes because an efficient way of
breaking such a code would lead to a fast factoring algorithm, which would be a
remarkable development in computational number theory.
PUBLIC-KEY CRYPTOSYSTEMS
Even when cryptographic keys are moderately short, their management still
presents an obstacle to their widespread use in conventional cryptography. One
problem is that every pair of parties that desires private communication needs to
establish a joint secret key for this purpose. Another problem is that each indi-
vidual needs to keep a secret database of all keys that have been so established.
The recent development of public-key cryptography provides an elegant solution to both problems. In a conventional or private-key cryptosystem, the same key is used for both encryption and decryption. Compare that with the novel public-key cryptosystem, for which the decryption key is different from, and not easily computed from, the encryption key.
Although it is a deceptively simple idea, separating the two keys has profound
consequences. Now each individual only needs to establish a single pair of keys:
an encryption key Eand a decryption key D.T h e i n d i v i d u a l k e e p s Dsecret
but publicizes E.I f a n o t h e r i n d i v i d u a l w a n t s t o s e n d h i m a m e s s a g e , s h e l o o k s
upEin the public directory, encrypts the message with it, and sends it to him.
The first individual is the only one who knows D,s oo n l yh ec a nd e c r y p tt h a t
message.
Certain public-key cryptosystems can also be used for digital signatures. If
an individual applies his secret decryption algorithm to a message before send-
ing it, anyone can check that it actually came from him by applying the public
encryption algorithm. He has thus effectively “signed” that message. This ap-
plication assumes that the encryption and decryption functions may be applied
in either order, as is the case with the RSA cryptosystem.
ONE-WAY FUNCTIONS
Now we briefly investigate some of the theoretical underpinnings of the modern
theory of cryptography, called one-way functions and trapdoor functions. One of the
advantages of using complexity theory as a foundation for cryptography is that
it helps to clarify the assumptions being made when we argue about security. By
assuming the existence of a one-way function, we may construct secure private-
key cryptosystems. Assuming the existence of trapdoor functions allows us to
construct public-key cryptosystems. Both assumptions have additional theoret-
ical and practical consequences. We define these types of functions after some
preliminaries.
436 CHAPTER 10 / ADVANCED TOPICS IN COMPLEXITY THEORY
A function f: Σ* → Σ* is length-preserving if the lengths of w and f(w) are equal for every w. A length-preserving function is a permutation if it never maps two strings to the same place; that is, if f(x) ≠ f(y) whenever x ≠ y.
Recall the definition of a probabilistic Turing machine given in Section 10.2. Let's say that a probabilistic Turing machine M computes a probabilistic function M: Σ* → Σ*, where if w is an input and x is an output, we assign
   Pr[ M(w) = x ]
to be the probability that M halts in an accept state with x on its tape when it is started on input w. Note that M may sometimes fail to accept on input w, so
   Σ_{x∈Σ*} Pr[ M(w) = x ] ≤ 1.
Next, we get to the definition of a one-way function. Roughly speaking, a function is one-way if it is easy to compute but nearly always hard to invert. In the following definition, f denotes the easily computed one-way function and M denotes the probabilistic polynomial time algorithm that we may think of as trying to invert f. We define one-way permutations first because that case is somewhat simpler.
DEFINITION 10.45
A one-way permutation is a permutation f with the following two properties.
1. It is computable in polynomial time.
2. For every probabilistic polynomial time TM M, every k, and sufficiently large n, if we pick a random w of length n and run M on input f(w),
   Pr_{M,w}[ M(f(w)) = w ] ≤ n^(−k).
Here, Pr_{M,w} means that the probability is taken over the random choices made by M and the random selection of w.
A one-way function is a length-preserving function f with the following two properties.
1. It is computable in polynomial time.
2. For every probabilistic polynomial time TM M, every k, and sufficiently large n, if we pick a random w of length n and run M on input f(w),
   Pr_{M,w}[ M(f(w)) = y, where f(y) = f(w) ] ≤ n^(−k).
For one-way permutations, any probabilistic polynomial time algorithm has only a small probability of inverting f; that is, it is unlikely to compute w from f(w). For one-way functions, any probabilistic polynomial time algorithm is unlikely to be able to find any y that maps to f(w).
EXAMPLE 10.46
The multiplication function mult is a candidate for a one-way function. We let Σ = {0,1}; and for any w ∈ Σ*, we let mult(w) be the string representing the product of the first and second halves of w. Formally,
   mult(w) = w1 · w2,
where w = w1w2 such that |w1| = |w2|, or |w1| = |w2| + 1 if |w| is odd. The strings w1 and w2 are treated as binary numbers. We pad mult(w) with leading 0s so that it has the same length as w. Despite a great deal of research into the integer factorization problem, no probabilistic polynomial time algorithm is known that can invert mult, even on a polynomial fraction of inputs.
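For concreteness, the candidate function mult can be sketched in a few lines of Python; the function name and the handling of the two halves are our own illustrative choices, not the text's:

```python
def mult(w: str) -> str:
    """Candidate one-way function: the product of the two halves of w.

    The halves are read as binary numbers (the first half gets the extra
    bit when |w| is odd), and the product is padded with leading 0s so
    that mult(w) has the same length as w.  The product of a |w1|-bit
    and a |w2|-bit number fits in |w1| + |w2| = |w| bits, so no overflow
    is possible.
    """
    half = (len(w) + 1) // 2
    w1 = int(w[:half], 2)
    w2 = int(w[half:] or "0", 2)   # treat an empty second half as 0
    return format(w1 * w2, "b").zfill(len(w))
```

Computing mult is clearly easy; inverting it on most inputs amounts to factoring, for which no probabilistic polynomial time algorithm is known.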
If we assume the existence of a one-way function, we may construct a private-
key cryptosystem that is provably secure. That construction is too complicated
to present here. Instead, we illustrate how to implement a different cryptographic application with a one-way function.
One simple application of a one-way function is a provably secure password
system. In a typical password system, a user must enter a password to gain ac-
cess to some resource. The system keeps a database of users’ passwords in an
encrypted form. The passwords are encrypted to protect them if the database
is left unprotected either by accident or design. Password databases are often
left unprotected so that various application programs can read them and check
passwords. When a user enters a password, the system checks it for validity by
encrypting it to determine whether it matches the version stored in the database.
Obviously, an encryption scheme that is difficult to invert is desirable because it
makes the unencrypted password difficult to obtain from the encrypted form. A
one-way function is a natural choice for a password encryption function.
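The scheme can be sketched in a few lines of Python. Here SHA-256 stands in for the one-way function f; that choice and the helper names are our own illustrative assumptions (a real system would also add per-user salts and use a deliberately slow hash):

```python
import hashlib

def store_password(db: dict, user: str, password: str) -> None:
    # The database keeps only the image f(password), never the password.
    db[user] = hashlib.sha256(password.encode()).hexdigest()

def check_password(db: dict, user: str, password: str) -> bool:
    # Validity check: re-apply f and compare against the stored image.
    return db.get(user) == hashlib.sha256(password.encode()).hexdigest()
```

Even if the database leaks, recovering a password from its stored image requires inverting f.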
TRAPDOOR FUNCTIONS
We don’t know whether the existence of a one-way function alone is enough to
allow the construction of a public-key cryptosystem. T o get such a construction,
we use a related object called a trapdoor function ,w h i c hc a nb ee f fi c i e n t l yi n v e r t e d
in the presence of special information.
First, we need to discuss the notion of a function that indexes a family of
functions. If we have a family of functions {fi}foriinΣ∗,w ec a nr e p r e s e n t
them by the single function f:Σ∗×Σ∗−→Σ∗,w h e r e f(i, w)=fi(w)for any i
andw.W ec a l l fan indexing function. Say that fis length-preserving if each of
the indexed functions fiis length preserving.
DEFINITION 10.47
A trapdoor function f: Σ* × Σ* → Σ* is a length-preserving indexing function that has an auxiliary probabilistic polynomial time TM G and an auxiliary function h: Σ* × Σ* → Σ*. The trio f, G, and h satisfy the following three conditions.
1. Functions f and h are computable in polynomial time.
2. For every probabilistic polynomial time TM E, and every k and sufficiently large n, if we take a random output ⟨i, t⟩ of G on 1^n and a random w ∈ Σ^n, then
   Pr_{E,w}[ E(i, f_i(w)) = y, where f_i(y) = f_i(w) ] ≤ n^(−k).
3. For every n, every w of length n, and every output ⟨i, t⟩ of G that occurs with nonzero probability for some input to G,
   h(t, f_i(w)) = y, where f_i(y) = f_i(w).
The probabilistic TM G generates an index i of a function in the index family while simultaneously generating a value t that allows f_i to be inverted quickly. Condition 2 says that f_i is hard to invert in the absence of t. Condition 3 says that f_i is easy to invert when t is known. Function h is the inverting function.
EXAMPLE 10.48
Here, we describe the trapdoor function that underlies the well-known RSA cryptosystem. We give its associated trio f, G, and h. The generator machine G operates as follows. On input 1^n, it selects two numbers of size n at random and tests them for primality. If they aren't prime, it repeats the selection until it succeeds or until it reaches a prespecified timeout limit and reports failure. After finding p and q, it computes N = pq and the value φ(N) = (p − 1)(q − 1). It selects a random number e between 1 and φ(N), and checks whether that number is relatively prime to φ(N). If not, the algorithm selects another number and repeats the check. Next, the algorithm computes the multiplicative inverse d of e modulo φ(N). Doing so is possible because the set of numbers in {1, ..., φ(N)} that are relatively prime to φ(N) forms a group under the operation of multiplication modulo φ(N). Finally, G outputs ((N, e), d). The index to the function f consists of the two numbers N and e. Let
   f_{N,e}(w) = w^e mod N.
The inverting function h is
   h(d, x) = x^d mod N.
Function h properly inverts because h(d, f_{N,e}(w)) = w^(ed) mod N = w.
We can use a trapdoor function such as the RSA trapdoor function to construct a public-key cryptosystem as follows. The public key is the index i generated by the probabilistic machine G. The secret key is the corresponding value t. The encryption algorithm breaks the message m into blocks of size at most log N. For each block w, the sender computes f_i(w). The resulting sequence of strings is the encrypted message. The receiver uses the function h to obtain the original message from its encryption.
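A toy version of this construction can be written in Python. The fixed small primes and the choice e = 17 are our illustrative assumptions (the text's G instead samples random n-bit primes and a random e), and real RSA encrypts padded blocks rather than raw integers:

```python
from math import gcd

def generate_keys(p: int, q: int, e: int = 17):
    """Toy stand-in for the generator G: returns the public index (N, e)
    and the trapdoor value d, the inverse of e modulo phi(N)."""
    N, phi = p * q, (p - 1) * (q - 1)
    assert gcd(e, phi) == 1, "e must be relatively prime to phi(N)"
    d = pow(e, -1, phi)        # modular inverse (Python 3.8+)
    return (N, e), d

def f(index, w: int) -> int:
    # f_{N,e}(w) = w^e mod N: easy to compute, hard to invert without d.
    N, e = index
    return pow(w, e, N)

def h(d: int, x: int, N: int) -> int:
    # h(d, x) = x^d mod N: inverts f when the trapdoor d is known.
    return pow(x, d, N)

index, d = generate_keys(61, 53)   # N = 3233, small enough to inspect
block = 42                         # one message block w < N
cipher = f(index, block)
assert h(d, cipher, index[0]) == block
```

Each message block round-trips through f and h exactly as the identity w^(ed) mod N = w promises.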
EXERCISES
10.1 Show that a circuit family with depth O(log n) is also a polynomial size circuit family.
10.2 Show that 12 is not pseudoprime because it fails some Fermat test.
10.3 Prove that if A ≤_L B and B is in NC, then A is in NC.
10.4 Show that the parity function with n inputs can be computed by a branching program that has O(n) nodes.
10.5 Show that the majority function with n inputs can be computed by a branching program that has O(n^2) nodes.
10.6 Show that any function with n inputs can be computed by a branching program that has O(2^n) nodes.
A10.7 Show that BPP ⊆ PSPACE.
PROBLEMS
10.8 Let A be a regular language over {0,1}. Show that A has size–depth complexity (O(n), O(log n)).
⋆10.9 A Boolean formula is a Boolean circuit wherein every gate has only one output wire. The same input variable may appear in multiple places of a Boolean formula. Prove that a language has a polynomial size family of formulas iff it is in NC^1. Ignore uniformity considerations.
⋆10.10 A k-head pushdown automaton (k-PDA) is a deterministic pushdown automaton with k read-only, two-way input heads and a read/write stack. Define the class PDA_k = {A | A is recognized by a k-PDA}. Show that P = ∪_k PDA_k. (Hint: Recall that P equals alternating log space.)
10.11 Let M be a probabilistic polynomial time Turing machine, and let C be a language where for some fixed 0 < ε_1 < ε_2 < 1,
   a. w ∉ C implies Pr[M accepts w] ≤ ε_1, and
   b. w ∈ C implies Pr[M accepts w] ≥ ε_2.
Show that C ∈ BPP. (Hint: Use the result of Lemma 10.5.)
10.12 Show that if P = NP, then P = PH.
10.13 Show that if PH = PSPACE, then the polynomial time hierarchy has only finitely many distinct levels.
10.14 Recall that NP^SAT is the class of languages that are decided by nondeterministic polynomial time Turing machines with an oracle for the satisfiability problem. Show that NP^SAT = Σ_2P.
⋆10.15 Prove Fermat's little theorem, which is given in Theorem 10.6. (Hint: Consider the sequence a^1, a^2, .... What must happen, and how?)
A⋆10.16 Prove that for any integer p > 1, if p isn't pseudoprime, then p fails the Fermat test for at least half of all numbers in Z^+_p.
10.17 Prove that if A is a language in L, a family of branching programs (B_1, B_2, ...) exists wherein each B_n accepts exactly the strings in A of length n and is bounded in size by a polynomial in n.
10.18 Prove that if A is a regular language, a family of branching programs (B_1, B_2, ...) exists wherein each B_n accepts exactly the strings in A of length n and is bounded in size by a constant times n.
10.19 Show that if NP ⊆ BPP, then NP = RP.
10.20 Define a ZPP-machine to be a probabilistic Turing machine that is permitted three types of output on each of its branches: accept, reject, and ?. A ZPP-machine M decides a language A if M outputs the correct answer on every input string w (accept if w ∈ A and reject if w ∉ A) with probability at least 2/3, and M never outputs the wrong answer. On every input, M may output ? with probability at most 1/3. Furthermore, the average running time over all branches of M on w must be bounded by a polynomial in the length of w. Show that RP ∩ coRP = ZPP, where ZPP is the collection of languages that are recognized by ZPP-machines.
10.21 Let EQ_BP = {⟨B_1, B_2⟩ | B_1 and B_2 are equivalent branching programs}. Show that EQ_BP is coNP-complete.
10.22 Let BPL be the collection of languages that are decided by probabilistic log space Turing machines with error probability 1/3. Prove that BPL ⊆ P.
10.23 Let CNF_H = {⟨φ⟩ | φ is a satisfiable cnf-formula where each clause contains any number of literals, but at most one negated literal}. Problem 7.25 asked you to show that CNF_H ∈ P. Now give a log-space reduction from CIRCUIT-VALUE to CNF_H to conclude that CNF_H is P-complete.
SELECTED SOLUTIONS
10.7 If M is a probabilistic TM that runs in polynomial time, we can modify M so that it makes exactly n^r coin tosses on each branch of its computation, for some constant r. Thus, the problem of determining the probability that M accepts its input string reduces to counting how many branches are accepting and comparing this number with (2/3) · 2^(n^r). This count can be performed by using polynomial space.
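The counting argument can be illustrated by brute force. The driver below is our own sketch: `run` plays a single branch of the machine from a fixed tuple of coin-toss results, and the loop visits the 2^(n^r) branches one at a time, reusing the same space for each:

```python
from itertools import product

def accepting_branches(run, num_tosses: int) -> int:
    """Count accepting branches, one coin-toss sequence at a time."""
    count = 0
    for tosses in product((0, 1), repeat=num_tosses):
        if run(tosses):          # replay the branch determined by `tosses`
            count += 1
    return count

def machine_accepts(run, num_tosses: int) -> bool:
    # Accept iff at least a 2/3 fraction of the branches accept.
    return 3 * accepting_branches(run, num_tosses) >= 2 * 2 ** num_tosses
```

Only the loop state and the current toss sequence are stored, so the space used is polynomial even though the running time is exponential.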
10.16 Call a a witness if it fails the Fermat test for p; that is, if a^(p−1) ≢ 1 (mod p).
Let Z*_p be all numbers in {1, ..., p−1} that are relatively prime to p. If p isn't pseudoprime, it has a witness a in Z*_p.
Use a to get many more witnesses. Find a unique witness in Z*_p for each nonwitness. If d ∈ Z*_p is a nonwitness, you have d^(p−1) ≡ 1 (mod p). Hence (da mod p)^(p−1) ≢ 1 (mod p), and so da mod p is a witness. If d_1 and d_2 are distinct nonwitnesses in Z*_p, then d_1·a mod p ≠ d_2·a mod p. Otherwise, (d_1 − d_2)a ≡ 0 (mod p), and thus (d_1 − d_2)a = cp for some integer c. But d_1 and d_2 are in Z*_p, and thus (d_1 − d_2) < p, so a = cp/(d_1 − d_2) and p have a factor greater than 1 in common, which is impossible because a and p are relatively prime. Thus, the number of witnesses in Z*_p must be as large as the number of nonwitnesses in Z*_p, and consequently at least half of the members of Z*_p are witnesses.
Next, show that every member b of Z^+_p that is not relatively prime to p is a witness. If b and p share a factor, then b^e and p share that factor for any e > 0. Hence b^(p−1) ≢ 1 (mod p). Therefore, you can conclude that at least half of the members of Z^+_p are witnesses.
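The witness bound is easy to check numerically. The sketch below is our own; it counts Fermat witnesses for a non-pseudoprime such as 12, the subject of Exercise 10.2:

```python
def fermat_witnesses(p: int) -> list:
    # a is a witness if a^(p-1) is not congruent to 1 modulo p.
    return [a for a in range(1, p) if pow(a, p - 1, p) != 1]

p = 12
witnesses = fermat_witnesses(p)
# 12 isn't pseudoprime, so at least half of {1, ..., 11} must be witnesses.
assert 2 * len(witnesses) >= p - 1
```

For p = 12, every a except 1 turns out to be a witness, comfortably meeting the at-least-half bound; for a true prime such as 7, the list is empty.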
Selected Bibliography
1. A DLEMAN ,L . T w ot h e o r e m so nr a n d o mp o l y n o m i a lt i m e . I n Proceedings
of the Nineteenth IEEE Symposium on Foundations of Computer Science (1978),
75–83.
2. A DLEMAN ,L .M . , AND HUANG ,M .A . R e c o g n i z i n gp r i m e si nr a n d o m
polynomial time. In Proceedings of the Nineteenth Annual ACM Symposium on
the Theory of Computing (1987), 462–469.
3. A DLEMAN ,L .M . ,P OMERANCE , C., AND RUMELY ,R .S .O nd i s t i n g u i s h -
ing prime numbers from composite numbers. Annals of Mathematics 117
(1983), 173–206.
4. A GRAWAL ,M . ,K AYAL ,N . , AND SAXENA ,N .P R I M E Si si nP . The Annals
of Mathematics ,S e c o n dS e r i e s ,v o l .1 6 0 ,n o .2( 2 0 0 4 ) ,7 8 1 – 7 9 3 .
5. A HO,A .V . ,H OPCROFT ,J .E . , AND ULLMAN ,J .D . Data Structures and
Algorithms .A d d i s o n - W e s l e y ,1 9 8 2 .
6. A HO,A .V . ,S ETHI ,R . , AND ULLMAN ,J .D . Compilers: Principles, T ech-
niques, T ools .A d d i s o n - W e s l e y ,1 9 8 6 .
7. A KL,S .G . The Design and Analysis of Parallel Algorithms .P r e n t i c e - H a l l
International, 1989.
8. A LON ,N . ,E RD¨OS,P . , AND SPENCER ,J .H . The Probabilistic Method .J o h n
Wiley & Sons, 1992.
9. A NGLUIN ,D . , AND VALIANT ,L .G . F a s tp r o b a b i l i s t i ca l g o r i t h m sf o r
Hamiltonian circuits and matchings. Journal of Computer and System Sciences
18(1979), 155–193.
10. A RORA ,S . ,L UND , C., M OTWANI ,R . ,S UDAN ,M . , AND SZEGEDY ,M .
Proof verification and hardness of approximation problems. In Proceedings
of the Thirty-third IEEE Symposium on Foundations of Computer Science (1992),
14–23.
11. B AASE ,S .Computer Algorithms: Introduction to Design and Analysis .A d d i s o n -
Wesley, 1978.
12. B ABAI ,L .E - m a i la n dt h eu n e x p e c t e dp o w e ro fi n t e r a c t i o n .I n Proceedings of
the Fifth Annual Conference on Structure in Complexity Theory (1990), 30–44.
13. B ACH,E . , AND SHALLIT ,J .Algorithmic Number Theory, Vol. 1 .M I TP r e s s ,
1996.
443
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 468 ---
444
14. B ALC´AZAR ,J .L . ,D ´IAZ,J . , AND GABARR ´O,J .Structural Complexity I, II .
EATCS Monographs on Theoretical Computer Science. Springer Verlag,
1988 (I) and 1990 (II).
15. B EAME ,P .W . ,C OOK ,S .A . , AND HOOVER ,H .J . L o gd e p t hc i r c u i t sf o r
division and related problems. SIAM Journal on Computing 15 ,4( 1 9 8 6 ) ,
994–1003.
16. B LUM ,M . ,C HANDRA ,A . , AND WEGMAN ,M .E q u i v a l e n c eo ff r e eb o o l e a n
graphs can be decided probabilistically in polynomial time. Information Pro-
cessing Letters 10 (1980), 80–82.
17. B RASSARD ,G . , AND BRATLEY ,P .Algorithmics: Theory and Practice .P r e n -
tice-Hall, 1988.
18. C ARMICHAEL ,R .D . O nc o m p o s i t en u m b e r s pwhich satisfy the Fermat
congruence aP−1≡Pmod P.American Mathematical Monthly 19 (1912),
22–27.
19. C HOMSKY ,N .T h r e em o d e l sf o rt h ed e s c r i p t i o no fl a n g u a g e . IRE Trans. on
Information Theory 2 (1956), 113–124.
20. C OBHAM ,A . T h ei n t r i n s i cc o m p u t a t i o n a ld i f fi c u l t yo ff u n c t i o n s . I n Pro-
ceedings of the International Congress for Logic, Methodology, and Philosophy of
Science ,Y .B a r - H i l l e l ,E d . ,N o r t h - H o l l a n d ,1 9 6 4 ,2 4 – 3 0 .
21. C OOK ,S .A .T h ec o m p l e x i t yo ft h e o r e m - p r o v i n gp r o c e d u r e s .I n Proceedings
of the Third Annual ACM Symposium on the Theory of Computing (1971), 151–
158.
22. C ORMEN ,T . ,L EISERSON , C., AND RIVEST ,R .Introduction to Algorithms .
MIT Press, 1989.
23. E DMONDS ,J .P a t h s ,t r e e s ,a n dfl o w e r s . Canadian Journal of Mathematics 17
(1965), 449–467.
24. E NDERTON ,H .B . AM a t h e m a t i c a lI n t r o d u c t i o nt oL o g i c .A c a d e m i c P r e s s ,
1972.
25. E VEN,S .Graph Algorithms .P i t m a n ,1 9 7 9 .
26. F ELLER ,W . An Introduction to Probability Theory and Its Applications, Vol. 1 .
John Wiley & Sons, 1970.
27. F EYNMAN ,R .P . ,H EY,A .J .G . , AND ALLEN ,R .W . Feynman lectures on
computation .A d d i s o n - W e s l e y ,1 9 9 6 .
28. G AREY ,M .R . , AND JOHNSON ,D .S . Computers and Intractability—A Guide
to the Theory of NP-completeness .W .H .F r e e m a n ,1 9 7 9 .
29. G ILL, J. T. Computational complexity of probabilistic T uring machines.
SIAM Journal on Computing 6 ,4( 1 9 7 7 ) ,6 7 5 – 6 9 5 .
30. G ¨ODEL , K. On formally undecidable propositions in Principia Mathematica
and related systems I. In The Undecidable ,M .D a v i s ,E d . ,R a v e nP r e s s ,1 9 6 5 ,
4–38.
31. G OEMANS ,M .X . , AND WILLIAMSON ,D .P . . 8 7 8 - a p p r o x i m a t i o na l g o -
rithms for MAX CUT and MAX 2SAT. In Proceedings of the Twenty-sixth
Annual ACM Symposium on the Theory of Computing (1994), 422–431.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 469 ---
445
32. G OLDWASSER ,S . , AND MICALI ,S . P r o b a b i l i s t i ce n c r y p t i o n . Journal of
Computer and System Sciences (1984), 270–299.
33. G OLDWASSER ,S . ,M ICALI ,S . , AND RACKOFF , C. The knowledge com-
plexity of interactive proof-systems. SIAM Journal on Computing (1989),
186–208.
34. G REENLAW ,R . ,H OOVER ,H .J . , AND RUZZO ,W .L . Limits to Parallel
Computation: P-completeness Theory .O x f o r dU n i v e r s i t yP r e s s ,1 9 9 5 .
35. H ARARY ,F .Graph Theory ,2 de d .A d d i s o n - W e s l e y ,1 9 7 1 .
36. H ARTMANIS ,J . , AND STEARNS ,R .E . O nt h ec o m p u t a t i o n a lc o m p l e x i t y
of algorithms. Transactions of the American Mathematical Society 117 (1965),
285–306.
37. H ILBERT ,D . M a t h e m a t i c a lp r o b l e m s .L e c t u r ed e l i v e r e db e f o r et h eI n -
ternational Congress of Mathematicians at Paris in 1900. In Mathematical
Developments Arising from Hilbert Problems ,v o l .2 8 .A m e r i c a nM a t h e m a t i c a l
Society, 1976, 1–34.
38. H OFSTADTER ,D .R . Goedel, Escher, Bach: An Eternal Golden Braid .B a s i c
Books, 1979.
39. H OPCROFT ,J .E . , AND ULLMAN ,J .D . Introduction to Automata Theory,
Languages and Computation .A d d i s o n - W e s l e y , 1 9 7 9 .
40. I MMERMAN ,N .N o n d e t e r m i n s t i cs p a c ei sc l o s e du n d e rc o m p l e m e n t . SIAM
Journal on Computing 17 (1988), 935–938.
41. J OHNSON ,D .S .T h eN P - c o m p l e t e n e s sc o l u m n :I n t e r a c t i v ep r o o fs y s t e m s
for fun and profit. Journal of Algorithms 9 ,3( 1 9 8 8 ) ,4 2 6 – 4 4 4 .
42. K ARP,R .M . R e d u c i b i l i t ya m o n gc o m b i n a t o r i a lp r o b l e m s . I n Complexity
of Computer Computations (1972), R. E. Miller and J. W. Thatcher, Eds.,
Plenum Press, 85–103.
43. K ARP,R .M . , AND LIPTON ,R .J . T u r i n gm a c h i n e st h a tt a k ea d v i c e . EN-
SEIGN: L’Enseignement Mathematique Revue Internationale 28 (1982).
44. K NUTH ,D .E .O nt h et r a n s l a t i o no fl a n g u a g e sf r o ml e f tt or i g h t . Informa-
tion and Control (1965), 607–639.
45. L AWLER ,E .L . Combinatorial Optimization: Networks and Matroids .H o l t ,
Rinehart and Winston, 1991.
46. L AWLER ,E .L . ,L ENSTRA , J. K., R INOOY KAN,A .H .G . , AND SHMOYS ,
D. B. The Traveling Salesman Problem .J o h nW i l e y&S o n s ,1 9 8 5 .
47. L EIGHTON ,F .T . Introduction to Parallel Algorithms and Architectures: Array,
Trees, Hypercubes . Morgan Kaufmann, 1991.
48. L EVIN ,L .U n i v e r s a ls e a r c hp r o b l e m s( i nR u s s i a n ) . Problemy Peredachi Infor-
matsii 9 ,3( 1 9 7 3 ) ,1 1 5 – 1 1 6 .
49. L EWIS ,H . , AND PAPADIMITRIOU , C.Elements of the Theory of Computation .
Prentice-Hall, 1981.
50. L I,M . , AND VITANYI ,P .Introduction to Kolmogorov Complexity and its Appli-
cations. Springer-Verlag, 1993.
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

File: --- Introduction to the theory of computation_third edition - Michael Sipser.pdf --- Page 470 ---
446
51. L ICHTENSTEIN ,D . , AND SIPSER , M. GO is PSPACE hard. Journal of the
ACM (1980), 393–401.
52. L UBY,M . Pseudorandomness and Cryptographic Applications .P r i n c e t o n U n i -
versity Press, 1996.
53. L UND , C., F ORTNOW ,L . ,K ARLOFF ,H . , AND NISAN ,N .A l g e b r a i cm e t h -
ods for interactive proof systems. Journal of the ACM 39 ,4( 1 9 9 2 ) ,8 5 9 – 8 6 8 .
54. M ILLER ,G .L . R i e m a n n ’ sh y p o t h e s i sa n dt e s t sf o rp r i m a l i t y . Journal of
Computer and System Sciences 13 (1976), 300–317.
55. N IVEN ,I . , AND ZUCKERMAN ,H .S . An Introduction to the Theory of Num-
bers,4 t he d .J o h nW i l e y&S o n s ,1 9 8 0 .
56. P APADIMITRIOU , C. H. Computational Complexity. Addison-Wesley, 1994.
57. P APADIMITRIOU , C. H., AND STEIGLITZ , K. Combinatorial Optimization
(Algorithms and Complexity) .P r e n t i c e - H a l l ,1 9 8 2 .
58. P APADIMITRIOU , C. H., AND YANNAKAKIS ,M . O p t i m i z a t i o n ,a p p r o x i -
mation, and complexity classes. Journal of Computer and System Sciences 43 ,3
(1991), 425–440.
59. P OMERANCE , C. On the distribution of pseudoprimes. Mathematics of Com-
putation 37 ,1 5 6( 1 9 8 1 ) ,5 8 7 – 5 9 3 .
60. P RATT ,V .R .E v e r yp r i m eh a sas u c c i n c tc e r t i fi c a t e . SIAM Journal on Com-
puting 4 ,3( 1 9 7 5 ) ,2 1 4 – 2 2 0 .
61. R ABIN ,M .O . P r o b a b i l i s t i ca l g o r i t h m s . I n Algorithms and Complexity: New
Directions and Recent Results ,J .F .T r a u b ,E d . ,A c a d e m i cP r e s s( 1 9 7 6 )2 1 – 3 9 .
62. R EINGOLD ,O .U n d i r e c t e ds t - c o n n e c t i v i t yi nl o g - s p a c e . Journal of the ACM
55,4( 2 0 0 8 ) ,1 – 2 4 .
63. R IVEST ,R .L . ,S HAMIR ,A . , AND ADLEMAN ,L . Am e t h o df o ro b t a i n i n g
digital signatures and public key cryptosystems. Communications of the ACM
21,2( 1 9 7 8 ) ,1 2 0 – 1 2 6 .
64. R OCHE ,E . , AND SCHABES ,Y .Finite-State Language Processing .M I TP r e s s ,
1997.
65. S CHAEFER ,T .J . O nt h ec o m p l e x i t yo fs o m et w o - p e r s o np e r f e c t - i n f o r -
mation games. Journal of Computer and System Sciences 16 ,2( 1 9 7 8 ) ,1 8 5 – 2 2 5 .
66. S EDGEWICK ,R .Algorithms ,2 de d .A d d i s o n - W e s l e y ,1 9 8 9 .
67. S HAMIR , A. IP = PSPACE. Journal of the ACM 39 ,4( 1 9 9 2 ) ,8 6 9 – 8 7 7 .
68. S HEN , A. IP = PSPACE: Simplified proof. Journal of the ACM 39 ,4( 1 9 9 2 ) ,
878–880.
69. S HOR ,P .W . P o l y n o m i a l - t i m ea l g o r i t h m sf o rp r i m ef a c t o r i z a t i o na n dd i s -
crete logarithms on a quantum computer. SIAM Journal on Computing 26 ,
(1997), 1484–1509.
70. S IPSER ,M . L o w e rb o u n d so nt h es i z eo fs w e e p i n ga u t o m a t a . Journal of
Computer and System Sciences 21 ,2( 1 9 8 0 ) ,1 9 5 – 2 0 2 .
Copyright 2012 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

71. Sipser, M. The history and status of the P versus NP question. In Proceedings of the Twenty-fourth Annual ACM Symposium on the Theory of Computing (1992), 603–618.
72. Stinson, D. R. Cryptography: Theory and Practice. CRC Press, 1995.
73. Szelepczényi, R. The method of forced enumeration for nondeterministic automata. Acta Informatica 26 (1988), 279–284.
74. Tarjan, R. E. Data Structures and Network Algorithms, vol. 44 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, 1983.
75. Turing, A. M. On computable numbers, with an application to the Entscheidungsproblem. In Proceedings, London Mathematical Society (1936), 230–265.
76. Ullman, J. D., Aho, A. V., and Hopcroft, J. E. The Design and Analysis of Computer Algorithms. Addison-Wesley, 1974.
77. van Leeuwen, J., Ed. Handbook of Theoretical Computer Science A: Algorithms and Complexity. Elsevier, 1990.
Index
N (natural numbers), 4, 255
R (real numbers), 185, 205
R+ (nonnegative real numbers), 277
∅ (empty set), 4
∈ (element of), 4
∉ (not element of), 4
⊆ (subset), 4
⊊ (proper subset), 4
∪ (union operation), 4, 44
∩ (intersection operation), 4
× (Cartesian or cross product), 6
Z (integers), 4
ε (empty string), 14
wR (reverse of w), 14
¬ (negation operation), 14
∧ (conjunction operation), 14
∨ (disjunction operation), 14
⊕ (exclusive OR operation), 15
→ (implication operation), 15
↔ (equality operation), 15
⇐ (reverse implication), 18
⇒ (implication), 18
⇐⇒ (logical equivalence), 18
◦ (concatenation operation), 44
∗ (star operation), 44
+ (plus operation), 65
P(Q) (power set), 53
Σ (alphabet), 53
Σε (Σ ∪ {ε}), 53
⟨·⟩ (encoding), 185, 287
␣ (blank), 168
≤m (mapping reduction), 235
≤T (Turing reduction), 261
≤L (log space reduction), 352
≤P (polynomial time reduction), 300
d(x) (minimal description), 264
Th(M) (theory of model), 255
K(x) (descriptive complexity), 264
∀ (universal quantifier), 338
∃ (existential quantifier), 338
↑ (exponentiation), 371
O(f(n)) (big-O notation), 277–278
o(f(n)) (small-o notation), 278
Accept state, 34, 35
Acceptance problem
    for CFG, 198
    for DFA, 194
    for LBA, 222
    for NFA, 195
    for TM, 202
Accepting computation history, 221
Accepting configuration, 169
Accepts a language, meaning of, 36
ACFG, 198
Acyclic graph, 404
ADFA, 194
Adjacency matrix, 287
Adleman, Leonard M., 443, 446
Agrawal, Manindra, 443
Aho, Alfred V., 443, 447
Akl, Selim G., 443
ALBA, 222
Algorithm
    complexity analysis, 276–281
    decidability and undecidability, 193–210
    defined, 182–184
    describing, 184–187
    Euclidean, 289
    polynomial time, 284–291
    running time, 276
ALLCFG, 225
Allen, Robin W., 444
Alon, Noga, 443
Alphabet, defined, 13
Alternating Turing machine, 409
Alternation, 408–414
Ambiguity, 107–108
Ambiguous
    grammar, 107, 240
    NFA, 212
Amplification lemma, 397
AND operation, 14
ANFA, 195
Angluin, Dana, 443
Anti-clique, 28
Approximation algorithm, 393–395
AREX, 196
Argument, 8
Arithmetization, 422
Arity, 8, 253
Arora, Sanjeev, 443
ASPACE(f(n)), 410
Asymptotic analysis, 276
Asymptotic notation
    big-O notation, 277–278
    small-o notation, 278
Asymptotic upper bound, 277
ATIME(t(n)), 410
ATM, 202
Atomic formula, 253
Automata theory, 3, see also Context-free language; Regular language
Average-case analysis, 276
Baase, Sara, 443
Babai, Laszlo, 443
Bach, Eric, 443
Balcázar, José Luis, 444
Basis of induction, 23
Beame, Paul W., 444
Big-O notation, 276–278
Bijective function, 203
Binary function, 8
Binary operation, 44
Binary relation, 9
Bipartite graph, 360
Blank symbol ␣, 168
Blum, Manuel, 444
Boolean circuit, 379–387
    depth, 428
    gate, 380
    size, 428
    uniform family, 428
    wire, 380
Boolean formula, 299, 338
    minimal, 328, 377, 411, 414
    quantified, 339
Boolean logic, 14–15
Boolean matrix multiplication, 429
Boolean operation, 14, 253, 299
Boolean variable, 299
Bound variable, 338
Branching program, 404
    read-once, 405
Brassard, Gilles, 444
Bratley, Paul, 444
Breadth-first search, 284
Brute-force search, 285, 288, 292, 298
Cantor, Georg, 202
Carmichael number, 400
Carmichael, R. D., 444
Cartesian product, 6, 46
CD-ROM, 349
Certificate, 293
CFG, see Context-free grammar
CFL, see Context-free language
Chaitin, Gregory J., 264
Chandra, Ashok, 444
Characteristic sequence, 206
Checkers, game of, 348
Chernoff bound, 398
Chess, game of, 348
Chinese remainder theorem, 401
Chomsky normal form, 108–111, 157, 198, 291
Chomsky, Noam, 444
Church, Alonzo, 3, 183, 255
Church–Turing thesis, 183–184, 281
CIRCUIT-SAT, 386
Circuit-satisfiability problem, 386
CIRCUIT-VALUE, 432
Circular definition, 65
Clause, 302
Clique, 28, 296
CLIQUE, 296
Closed under, 45
Closure under complementation
    context-free languages, non-, 154
    deterministic context-free languages, 133
    P, 322
    regular languages, 85
Closure under concatenation
    context-free languages, 156
    NP, 322
    P, 322
    regular languages, 47, 60
Closure under intersection
    context-free languages, non-, 154
    regular languages, 46
Closure under star
    context-free languages, 156
    NP, 323
    P, 323
    regular languages, 62
Closure under union
    context-free languages, 156
    NP, 322
    P, 322
    regular languages, 45, 59
CNF-formula, 302
Co-Turing-recognizable language, 209
Cobham, Alan, 444
Coefficient, 183
Coin-flip step, 396
Complement operation, 4
Completed rule, 140
Complexity class
    ASPACE(f(n)), 410
    ATIME(t(n)), 410
    BPP, 397
    coNL, 354
    coNP, 297
    EXPSPACE, 368
    EXPTIME, 336
    IP, 417
    L, 349
    NC, 430
    NL, 349
    NP, 292–298
    NPSPACE, 336
    NSPACE(f(n)), 332
    NTIME(f(n)), 295
    P, 284–291, 297–298
    PH, 414
    PSPACE, 336
    RP, 403
    SPACE(f(n)), 332
    TIME(f(n)), 279
    ZPP, 440
Complexity theory, 2
Composite number, 293, 399
Compositeness witness, 401
COMPOSITES, 293
Compressible string, 267
Computability theory, 3
    decidability and undecidability, 193–210
    recursion theorem, 245–252
    reducibility, 215–239
    Turing machines, 165–182
Computable function, 234
Computation history
    context-free languages, 225–226
    defined, 220
    linear bounded automata, 221–225
    Post Correspondence Problem, 227–233
    reducibility, 220–233
Computational model, 31
Computer virus, 250
Concatenation of strings, 14
Concatenation operation, 44, 47, 60–61
Configuration, 168, 169, 350
Conjunction operation, 14
Conjunctive normal form, 302
coNL, 354
Connected graph, 12, 185
coNP, 297
Context-free grammar
    ambiguous, 107, 240
    defined, 104
Context-free language
    decidability, 198–200
    defined, 103
    deterministic, 131
    efficient decidability, 290–291
    inherently ambiguous, 108
    pumping lemma, 125–130
Cook, Stephen A., 299, 387, 430, 444
Cook–Levin theorem, 299–388
Cormen, Thomas, 444
Corollary, 17
Correspondence, 203
Countable set, 203
Counterexample, 18
Counting problem, 420
Cross product, 6
Cryptography, 433–439
Cut edge, 395
Cut, in a graph, 325, 395
Cycle, 12
Davis, Martin, 183
DCFG, see Deterministic context-free grammar
Decidability, see also Undecidability
    context-free language, 198–200
    of ACFG, 198
    of ADFA, 194
    of AREX, 196
    of ECFG, 199
    of EQDFA, 197
    regular language, 194–198
Decidable language, 170
Decider
    deterministic, 170
    nondeterministic, 180
Decision problem, 394
Definition, 17
Degree of a node, 10
DeMorgan's laws, example of proof, 20
Depth complexity, 428
Derivation, 102
    leftmost, 108
Derives, 104
Descriptive complexity, 264
Deterministic computation, 47
Deterministic context-free grammar, 139
Deterministic context-free language
    defined, 131
    properties, 133
Deterministic finite automaton
    acceptance problem, 194
    defined, 35
    emptiness testing, 196
    minimization, 327
Deterministic pushdown automaton, 131
    defined, 130
DFA, see Deterministic finite automaton
Diagonalization method, 202–209
Díaz, Josep, 444
Difference hierarchy, 328
Digital signatures, 435
Directed graph, 12
Directed path, 13
Disjunction operation, 14
Distributive law, 15
DK-test, 143
DK1-test, 152
Domain of a function, 7
Dotted rule, 140
DPDA, see Deterministic pushdown automaton
Dynamic programming, 290
ECFG, 199
EDFA, 196
Edge of a graph, 10
Edmonds, Jack, 444
ELBA, 223
Element distinctness problem, 175
Element of a set, 3
Emptiness testing
    for CFG, 199
    for DFA, 196
    for LBA, 223
    for TM, 217
Empty set, 4
Empty string, 14
Encoding, 185, 287
Enderton, Herbert B., 444
Endmarked language, 134
Enumerator, 180–181
EQCFG, 200
EQDFA, 197
EQREX↑, 372
EQTM
    Turing-unrecognizability, 238
    undecidability, 220
Equality operation, 15
Equivalence relation, 9
Equivalent machines, 54
Erdős, Paul, 443
Error probability, 397
ETM, 211
ETM, undecidability, 217
Euclidean algorithm, 289
Even, Shimon, 444
EXCLUSIVE OR operation, 15
Existential state, 409
Exponential bound, 278
Exponential, versus polynomial, 285
EXPSPACE, 368
EXPSPACE-completeness, 371–376
EXPTIME, 336
Factor of a number, 399
Feller, William, 444
Fermat test, 400
Fermat's little theorem, 399
Feynman, Richard P., 444
Final state, 35
Finite automaton
    automatic door example, 32
    computation of, 40
    decidability, 194–198
    defined, 35
    designing, 41–44
    transition function, 35
    two-dimensional, 241
    two-headed, 240
Finite state machine, see Finite automaton
Finite state transducer, 87
Fixed point theorem, 251
Forced handle, 138
Formal proof, 258
Formula, 253, 299
FORMULA-GAME, 342
Fortnow, Lance, 446
Free variable, 253
FST, see Finite state transducer
Function, 7–10
    argument, 8
    binary, 8
    computable, 234
    domain, 7
    one-to-one, 203
    one-way, 436
    onto, 7, 203
    polynomial time computable, 300
    range, 7
    space constructible, 364
    time constructible, 368
    transition, 35
    unary, 8
Gabarró, Joaquim, 444
Gadget in a completeness proof, 311
Game, 341
Garey, Michael R., 444
Gate in a Boolean circuit, 380
Generalized geography, 344
Generalized nondeterministic finite automaton, 70–76
    converting to a regular expression, 71
    defined, 70, 73
Geography game, 343
GG (generalized geography), 345
Gill, John T., 444
GNFA, see Generalized nondeterministic finite automaton
GO, game of, 348
Go-moku, game of, 358
Gödel, Kurt, 3, 255, 258, 444
Goemans, Michel X., 444
Goldwasser, Shafi, 445
Graph
    acyclic, 404
    coloring, 325
    cycle in, 12
    degree, 10
    directed, 12
    edge, 10
    isomorphism problem, 323, 415
    k-regular, 21
    labeled, 11
    node, 10
    strongly connected, 13
    sub-, 11
    undirected, 10
    vertex, 10
Greenlaw, Raymond, 445
Halting configuration, 169
Halting problem, 216–217
    unsolvability of, 216
HALTTM, 216
Hamiltonian path problem, 292
    exponential time algorithm, 292
    NP-completeness of, 314–319
    polynomial time verifier, 293
HAMPATH, 292, 314
Handle, 136
    forced, 138
Harary, Frank, 445
Hartmanis, Juris, 445
Hey, Anthony J. G., 444
Hierarchy theorem, 364–371
    space, 365
    time, 369
High-level description of a Turing machine, 185
Hilbert, David, 182, 445
Hofstadter, Douglas R., 445
Hoover, H. James, 444, 445
Hopcroft, John E., 443, 445, 447
Huang, Ming-Deh A., 443
iff, 18
Immerman, Neil, 445
Implementation description of a Turing machine, 185
Implication operation, 15
Incompleteness theorem, 258
Incompressible string, 267
Indegree of a node, 12
Independent set, 28
Induction
    basis, 23
    proof by, 22–25
    step, 23
Induction hypothesis, 23
Inductive definition, 65
Infinite set, 4
Infix notation, 8
Inherent ambiguity, 108
Inherently ambiguous context-free language, 108
Injective function, 203
Integers, 4
Interactive proof system, 415–427
Interpretation, 254
Intersection operation, 4
IP, 417
ISO, 415
Isomorphic graphs, 323
Johnson, David S., 444, 445
k-ary function, 8
k-ary relation, 9
k-clique, 295
k-optimal approximation algorithm, 395
k-tuple, 6
Karloff, Howard, 446
Karp, Richard M., 445
Kayal, Neeraj, 443
Knuth, Donald E., 139, 445
Kolmogorov, Andrei N., 264
L, 349
Labeled graph, 11
Ladder, 358
Language
    co-Turing-recognizable, 209
    context-free, 103
    decidable, 170
    defined, 14
    deterministic context-free, 131
    endmarked, 134
    of a grammar, 103
    recursively enumerable, 170
    regular, 40
    Turing-decidable, 170
    Turing-recognizable, 170
    Turing-unrecognizable, 209
Lawler, Eugene L., 445
LBA, see Linear bounded automaton
Leaf in a tree, 12
Leeuwen, Jan van, 447
Leftmost derivation, 108
Leighton, F. Thomson, 445
Leiserson, Charles E., 444
Lemma, 17
Lenstra, Jan Karel, 445
Leveled graph, 361
Levin, Leonid A., 299, 387, 445
Lewis, Harry, 445
Lexical analyzer, 66
Lexicographic order, 14
Li, Ming, 445
Lichtenstein, David, 446
Linear bounded automaton, 221–225
Linear time, 281
Lipton, Richard J., 445
LISP, 182
Literal, 302
Log space computable function, 352
Log space reduction, 352, 432
Log space transducer, 352
Lookahead, 152
LR(k) grammar, 152
Luby, Michael, 446
Lund, Carsten, 443, 446
Majority function, 391
Many–one reducibility, 234
Mapping, 7
Mapping reducibility, 234–239
    polynomial time, 300
Markov chain, 33
Match, 227
Matijasevič, Yuri, 183
MAX-CLIQUE, 328, 389
MAX-CUT, 325
Maximization problem, 395
Member of a set, 3
Micali, Silvio, 445
Miller, Gary L., 446
MIN-FORMULA, 328, 359, 377, 411, 414
Minesweeper, 326
Minimal description, 264
Minimal formula, 328, 359, 377, 411, 414
Minimization of a DFA, 327
Minimization problem, 394
Minimum pumping length, 91
MINTM, 251, 270
Model, 254
MODEXP, 323
Modulo operation, 8
Motwani, Rajeev, 443
Multiset, 4, 297
Multitape Turing machine, 176–178
Myhill–Nerode theorem, 91
Natural numbers, 4
NC, 430
Negation operation, 14
NFA, see Nondeterministic finite automaton
Nim, game of, 359
Nisan, Noam, 446
Niven, Ivan, 446
NL, 349
NL-complete problem
    PATH, 350
NL-completeness
    defined, 352
Node of a graph, 10
    degree, 10
    indegree, 12
    outdegree, 12
Nondeterministic computation, 47
Nondeterministic finite automaton, 47–58
    computation by, 48
    defined, 53
    equivalence with deterministic finite automaton, 55
    equivalence with regular expression, 66
Nondeterministic polynomial time, 294
Nondeterministic Turing machine, 178–180
    space complexity of, 332
    time complexity of, 283
NONISO, 415
NOT operation, 14
NP, 292–298
NP-complete problem
    3SAT, 302, 387
    CIRCUIT-SAT, 386
    HAMPATH, 314
    SUBSET-SUM, 320
    3COLOR, 325
    UHAMPATH, 319
    VERTEX-COVER, 312
NP-completeness, 299–322
    defined, 304
NP-hard, 326
NP-problem, 294
NPA, 376
NPSPACE, 336
NSPACE(f(n)), 332
NTIME(f(n)), 295
NTM, see Nondeterministic Turing machine
o(f(n)) (small-o notation), 278
One-sided error, 403
One-time pad, 434
One-to-one function, 203
One-way function, 436
One-way permutation, 436
Onto function, 7, 203
Optimal solution, 394
Optimization problem, 393
OR operation, 14
Oracle, 260, 376
Oracle tape, 376
Ordered pair, 6
Outdegree of a node, 12
P, 284–291, 297–298
P-complete problem
    CIRCUIT-VALUE, 432
P-completeness, 432
PA, 376
Pair
    ordered, 6
    unordered, 4
Palindrome, 90, 155
Papadimitriou, Christos H., 445, 446
Parallel computation, 427–432
Parallel random access machine, 428
Parity function, 381
Parse tree, 102
Parser, 101
Pascal, 182
Path
    Hamiltonian, 292
    in a graph, 12
    simple, 12
PATH, 287, 350
PCP, see Post Correspondence Problem
PDA, see Pushdown automaton
Perfect shuffle operation, 89, 158
PH, 414
Pigeonhole principle, 78, 79, 126
Pippenger, Nick, 430
Polynomial, 182
Polynomial bound, 278
Polynomial time
    algorithm, 284–291
    computable function, 300
    hierarchy, 414
    verifier, 293
Polynomial verifiability, 293
Polynomial, versus exponential, 285
Polynomially equivalent models, 285
Pomerance, Carl, 443, 446
Popping a symbol, 112
Post Correspondence Problem (PCP), 227–233
    modified, 228
Power set, 6, 53
PRAM, 428
Pratt, Vaughan R., 446
Prefix notation, 8
Prefix of a string, 14, 89
Prefix-free language, 14, 212
Prenex normal form, 253, 339
Prime number, 293, 324, 399
Private-key cryptosystem, 435
Probabilistic algorithm, 396–408
Probabilistic function, 436
Probabilistic Turing machine, 396
Processor complexity, 428
Production, 102
Proof, 17
    by construction, 21
    by contradiction, 21–22
    by induction, 22–25
    finding, 17–20
    necessity for, 77
Proper subset, 4
Prover, 416
Pseudoprime, 400
PSPACE, 336
PSPACE-complete problem
    FORMULA-GAME, 342
    GG, 345
    TQBF, 339
PSPACE-completeness, 337–348
    defined, 337
PSPACE-hard, 337
Public-key cryptosystem, 435
Pumping lemma
    for context-free languages, 125–130
    for regular languages, 77–82
Pumping length, 77, 91, 125
Pushdown automaton, 111–124
    context-free grammars, 117–124
    defined, 113
    deterministic, 131
    examples, 114–116
    schematic of, 112
Pushing a symbol, 112
Putnam, Hilary, 183
PUZZLE, 325, 359
Quantified Boolean formula, 339
Quantifier, 338
    in a logical sentence, 253
Query node in a branching program, 404
Rabin, Michael O., 446
Rackoff, Charles, 445
Ramsey's theorem, 28
Range of a function, 7
Read-once branching program, 405
Real number, 204
Recognizes a language, meaning of, 36, 40
Recursion theorem, 245–252
    fixed-point version, 251
    terminology for, 249
Recursive language, see Decidable language
Recursively enumerable, see Turing-recognizable
Recursively enumerable language, 170
Reduce step, 135
Reducibility, 215–239
    mapping, 234–239
    polynomial time, 300
    via computation histories, 220–233
Reducing string, 135
Reduction
    between problems, 215
    function, 235
    mapping, 235
    reversed derivation, 135
    Turing, 261
Reflexive relation, 9
Regular expression, 63–76
    defined, 64
    equivalence to finite automaton, 66–76
    examples of, 65
Regular language, 31–82
    closure under concatenation, 47, 60
    closure under intersection, 46
    closure under star, 62
    closure under union, 45, 59
    decidability, 194–198
    defined, 40
Regular operation, 44
REGULARTM, 218
Reingold, Omer, 446
Rejecting computation history, 221
Rejecting configuration, 169
Relation, 9, 253
    binary, 9
Relatively prime, 288
Relativization, 376–379
RELPRIME, 289
Reverse of a string, 14
Rice's theorem, 219, 241, 243, 270, 272
Rinooy Kan, A. H. G., 445
Rivest, Ronald L., 444, 446
Robinson, Julia, 183
Roche, Emmanuel, 446
Root
    in a tree, 12
    of a polynomial, 183
Rule in a context-free grammar, 102, 104
Rumely, Robert S., 443
Ruzzo, Walter L., 445
SAT, 304, 336
#SAT, 420
Satisfiability problem, 299
Satisfiable formula, 299
Savitch's theorem, 333–335
Saxena, Nitin, 443
Schabes, Yves, 446
Schaefer, Thomas J., 446
Scope, 338
Scope, of a quantifier, 253
Secret key, 433
Sedgewick, Robert, 446
Self-loop, 10
Self-reference, 246
Sentence, 339
Sequence, 6
Sequential computer, 427
Set, 3
    countable, 203
    uncountable, 204
Sethi, Ravi, 443
Shallit, Jeffrey, 443
Shamir, Adi, 446
Shen, Alexander, 446
Shmoys, David B., 445
Shor, Peter W., 446
Shortlex order, 14
Shuffle operation, 89, 158
Simple path, 12
Singleton set, 4
Sipser, Michael, 446, 447
Size complexity, 428
Small-o notation, 278
SPACE(f(n)), 332
Space complexity, 331–361
Space complexity class, 332
Space complexity of nondeterministic Turing machine, 332
Space constructible function, 364
Space hierarchy theorem, 365
Spencer, Joel H., 443
Stack, 111
Star operation, 44, 62–63, 323
Start configuration, 169
Start state, 34
Start variable, in a context-free grammar, 102, 104
State diagram
    finite automaton, 34
    pushdown automaton, 114
    Turing machine, 172
Stearns, Richard E., 445
Steiglitz, Kenneth, 446
Stinson, Douglas R., 447
String, 14
String order, 14
Strongly connected graph, 13, 360
Structure, 254
Subgraph, 11
Subset of a set, 4
SUBSET-SUM, 296, 320
Substitution rule, 102
Substring, 14
Sudan, Madhu, 443
Surjective function, 203
Symmetric difference, 197
Symmetric relation, 9
Synchronizing sequence, 92
Szegedy, Mario, 443
Szelepczényi, Róbert, 447
Tableau, 383
Tarjan, Robert E., 447
Tautology, 410
Term, in a polynomial, 182
Terminal, 102
Terminal in a context-free grammar, 104
Th(M), 255
Theorem, 17
Theory, of a model, 255
3COLOR, 325
3SAT, 302, 387
Tic-tac-toe, game of, 357
TIME(f(n)), 279
Time complexity, 275–322
    analysis of, 276–281
    of nondeterministic Turing machine, 283
Time complexity class, 295
Time constructible function, 368
Time hierarchy theorem, 369
TM, see Turing machine
TQBF, 339
Transducer
    finite state, 87
    log space, 352
Transition, 34
Transition function, 35
Transitive closure, 429
Transitive relation, 9
Trapdoor function, 438
Tree, 12
    leaf, 12
    parse, 102
    root, 12
Triangle in a graph, 323
Tuple, 6
Turing machine, 165–182
    alternating, 409
    comparison with finite automaton, 166
    defined, 168
    describing, 184–187
    examples of, 170–175
    marking tape symbols, 174
    multitape, 176–178
    nondeterministic, 178–180
    oracle, 260, 376
    schematic of, 166
    universal, 202
Turing reducibility, 260–261
Turing, Alan M., 3, 165, 183, 447
Turing-decidable language, 170
Turing-recognizable language, 170
Turing-unrecognizable language, 209–210
    EQTM, 238
Two-dimensional finite automaton, 241
Two-headed finite automaton, 240
2DFA, see Two-headed finite automaton
2DIM-DFA, see Two-dimensional finite automaton
2SAT, 327
Ullman, Jeffrey D., 443, 445, 447
Unary
    alphabet, 52, 82, 240
    function, 8
    notation, 287, 323
    operation, 44
Uncountable set, 204
Undecidability
    diagonalization method, 202–209
    of ATM, 202
    of ELBA, 223
    of EQCFG, 200
    of EQTM, 220
    of ETM, 217
    of HALTTM, 216
    of REGULARTM, 219
    of Post Correspondence Problem, 228
    via computation histories, 220–233
Undirected graph, 10
Union operation, 4, 44, 45, 59–60
Unit rule, 109
Universal quantifier, 338
Universal state, 409
Universal Turing machine, 202
Universe, 253, 338
Unordered pair, 4
Useless state
    in PDA, 212
    in TM, 239
Valiant, Leslie G., 443
Valid string, 136
Variable
    Boolean, 299
    bound, 338
    in a context-free grammar, 102, 104
    start, 102, 104
Venn diagram, 4
Verifier, 293, 416
Vertex of a graph, 10
VERTEX-COVER, 312
Virus, 250
Vitanyi, Paul, 445
Wegman, Mark, 444
Well-formed formula, 253
Williamson, David P., 444
Window, in a tableau, 307
Winning strategy, 342
Wire in a Boolean circuit, 380
Worst-case analysis, 276
XOR operation, 15, 383
Yannakakis, Mihalis, 446
Yields
    for configurations, 169
    for context-free grammars, 104
ZPP, 440
Zuckerman, Herbert S., 446