Instructor: Shai Ben-David

Time: MW 4:00PM - 5:20PM MC4063

.1A * 4+ + .2MT + .4FE

+: best 4 of 5

Week 1. Jan 6

countability of sets

notation. set operations

notation. multisets: multiplicity matters.

notation. sequence sets: multiplicity and order matter.

defn. relations are subsets of cartesian products.

eg. in a graph G=(V,E)G=(V,E), edges EV2E\subseteq V^2 are relation.

defn. an order relation is set of pairs in a relation such that the first element is less than or equal to the second. it has

defn. a function relation of f:ABf:A\rightarrow B is RfA×B={(x,f(x)):xA,f(x)B}R_f\subseteq A\times B=\{(x,f(x)):x\in A,f(x)\in B\}.

defn. a function f:ABf:A\rightarrow B is one-to-one if for all x1,x2Ax_1,x_2\in A, x1x2x_1\neq x_2 imples f(x1)f(x2)f(x_1)\neq f(x_2).

defn. a function f:ABf:A\rightarrow B is onto BB if for every yBy\in B there exists xAx\in A such that f(x)=yf(x)=y.

defn. an equivalent relation is a relation with

defn. AB|A|\leq|B| if there exists a one-to-one function f:ABf:A\rightarrow B.

defn. A=B|A|=|B| if any of following true

theorem. (Cantor–Bernstein) two definitions above are equivalent.

eg. let E={2m:mN}E=\{2m:m\in \mathbb{N}\}. then EN|E|\leq|\mathbb{N}| because we have f(m)=m2f(m)=\frac{m}{2}, and NE|\mathbb{N}|\leq|E| because we have f(m)=2mf(m)=2m. so E=N|E|=|\mathbb{N}|.

eg. let A=(0,1)R,B=[0,1]RA=(0,1)\subseteq \mathbb{R},B=[0,1]\subseteq\mathbb{R}. then AB|A|\subseteq|B| because we have f(x)=xf(x)=x, and BA|B|\leq|A| because we have f(x)=x2+14f(x)=\frac{x}{2}+\frac{1}{4} (shrink B to fit in A). so A=B|A|=|B|. but it is not obvious to find a both one-to-one and onto function.

eg. show Z=N|\mathbb{Z}|=|\mathbb{N}|.
define g:ZNg:\mathbb{Z}\rightarrow\mathbb{N}

g(x)={2xx02x+1x<0g(x)=\begin{cases} 2x & \text{, } x\geq 0 \\ -2x+1 & \text{, } x<0 \end{cases}

it is both one-to-one and onto. \square

examples of infinite sets of same cardinality:

theorem. (Cantor) NZE≉(0,1)[0,1]R\mathbb{N}\approx\mathbb{Z}\approx E\not\approx (0,1)\approx[0,1]\approx\mathbb{R}.
proof. show there does not exist an onto function from N\mathbb{N} to (0,1)(0,1) (or equivalently (0,1)(0,1) to N\mathbb{N}). let f:N(0,1)f: \mathbb{N}\rightarrow(0,1) be any function, write

f(0)=0.x00x10x20...xn0...f(1)=0.x01x11x21...xn1......f(m)=0.x0mx1mx2m...xnm......f(0)=0.x_0^0x_1^0x_2^0...x_n^0...\\ f(1)=0.x_0^1x_1^1x_2^1...x_n^1...\\ ...\\ f(m)=0.x_0^mx_1^mx_2^m...x_n^m...\\ ...\\

choose the diagonal d=0.x00x11...xnn...d=0.x_0^0x_1^1...x_n^n..., and let rf=0.y0y1...ynr_f=0.y_0y_1...y_n where for all ii we have yixiiy_i\neq x_i^i. we claim rfr_f is not in the range of ff. so ff is not onto. \square

theorem. (generalized Cantor) for every set AA, A≉2AA\not\approx 2^A.
pf. let f:A2Af:A\rightarrow2^A and we want to show it cannot be onto. define Zf={xA:xf(x)}AZ_f=\{x\in A:x\notin f(x)\}\subseteq A, we have Zf2A,Z_f\in2^A, we want to show for no xAx\in A we get f(x)=Zff(x)=Z_f (ff must miss this subset in 2A2^A). assume for contradiction that we have x0x_0 such that f(x0)=Zff(x_0)=Z_f, if x0Zfx_0\in Z_f then x0Zfx_0\notin Z_f, and vice versa... \square

Week 2. Jan 10

eg. show for all AA we have AP(A)A\leq \mathcal{P}(A).
define one-to-one mapping f(a)={a}aAf(a)=\{a\}\forall a\in A. further by generalized Cantor we have A<P(A)A<\mathcal{P}(A). \square

eg. show RP(N)\mathbb{R}\approx\mathcal{P}(\mathbb{N}).
start by R(0,1)\mathbb{R}\approx(0,1). note we can use binary expansion to represent numbers in (0,1), eg 0.10110100... but for each number, this series of 0 and 1's will represent a subset of natural numbers if 1 represents chosen and 0 otherwise for iNi\in\mathbb{N}-th digit. eg f(0.10101) = {0,2,4}. \square

corollary. N<R|\mathbb{N}|<|\mathbb{R}|.

puzzle. suppose we have A<P(A)<P(P(A))<...A<\mathcal{P}(A)<\mathcal{P}(\mathcal{P}(A))<... what is the size of these sets altogether? the 'number' of sizes is larger than the 'size' of any set.

puzzle. is there any set AA such that N<A<R|\mathbb{N}|<|A|<|\mathbb{R}|, ie size between that of N\mathbb{N} and R\mathbb{R}? there is no way to answer this question.

notation.

defn. set of size 0\alef_0 are called countable sets.

defn. set of size greater than that of N\mathbb{N} is uncountable.

theorem. (closure properties of family of countable sets)

  1. if A,BA,B are both countable, so is ABA\cup B.
  2. if there are finite countable sets A1,A2,...A_1,A_2,... that are countable, then so is iNAi\bigcup_{i\in\mathbb{N}}A_i.

corollary. Q\mathbb{Q} is countable.
because you can use a pairs of naturals to represent rationals, then union them.

theorem. some further notes about sizes of infinite sets:

  1. every infinite set AA has a proper subset BB such that A=B|A|=|B|.
  2. for pairs of infinite sets A,BA,B, then AB=max{A,B}|A\cup B|=\max\{|A|,|B|\}
  3. for every infinite set AA and natural number nn, An=A|A^n|=|A|
    1. it is not the case if nn is infinite, eg NN>N|\mathbb{N}^\mathbb{N}|>|\mathbb{N}|

regular languages

computation devices tasks
DFA given some language and a word,
does this word belong to the language?

defn.

  1. an alphabet Σ\Sigma is any finite set of symbols
  2. for all nNn\in\mathbb{N}, Σn:=Σ×...×Σ\Sigma^n:=\Sigma\times...\times\Sigma (n times) is the set of all sequences a1...ana_1...a_n such that for all ini\leq n we have aiΣa_i\in\Sigma
  3. Σ:=nNΣn\Sigma^*:=\bigcup_{n\in\mathbb{N}}\Sigma^n, ie the set of all finite sequence of members of Σ\Sigma
  4. a language is a subset of Σ\Sigma^*.

every alphabet Σ\Sigma that we discuss will be finite.

theorem. Σn=Σn|\Sigma^n|=|\Sigma|^n.

theorem. ΣN\Sigma^*\approx\mathbb{N}, ie it is countable.

theorem. the set of all languages over Σ\Sigma^* is not countable.
proof. P(Σ)P(N)R\mathcal{P}(\Sigma^*)\approx\mathcal{P}(\mathbb{N})\approx\mathbb{R}. \square

defn. a deterministic automaton (DFA) is a 5-tuple:

defn. the language LL is accepted by AA, ie w=σ1...σnL(A)w=\sigma_1...\sigma_n\in L(A) iff x1,...,xnQ\exists x_1,...,x_n\in Q such that x0=q0x_0=q_0 and ixi+1=δ(xi,Σi+1)\forall i\, x_{i+1}=\delta(x_i,\Sigma_{i+1}) and xnFx_n\in F.

eg. construct a DFA that accepts a string w{0,1}w\in\{0,1\}^* iff the number of 1's in ww is even.

q0 q1
0 q0 q1
1 q1 q0

accepting state is q0.

eg. describe a DFA AA such that L(A)={w=w1...wn:nN,wn=1}L(A)=\{w=w_1...w_n:n\in\mathbb{N},w_n=1\} (it it ends with 1).

img

eg. find a DFA such that L(A)=L(A)= all strings that starts and ends with the same symbol.
start from q0, if the first symbol is 1, then go to the same DFA above; if first symbol is 0, ...

eg. find AA such that L(A)={w:#0(w)1modk}L(A)=\{w:\#_0(w)\equiv1\mod k\} where #0=\#_0= number of symbols 0 in ww.
we need to store how many 0's we have seen so far.

img

eg. describe a DFA such that L(A)={w=w11111w2:w1,w2{0,1}}L(A)=\{w=w_11111w_2:w_1,w_2\in\{0,1\}^*\}, ie it has four 1's.

img

how can we prove that the language is accepted by A? one way of providing such statement is, for every state qiQq_i\in Q, let L(qi)={w:upon reading w we end in state qi}L(q_i)=\{w:\text{upon reading }w\text{ we end in state }q_i\}. describe each language L(qi)L(q_i) and prove by induction on the length of ww that wL(qi)w\in L(q_i) iff it satisfies properties.

theorem. there are languages LΣL\subseteq\Sigma^* such that no DFA accepts it.
proof. for every nonempty alphabet Σ\Sigma we have Σ=N=0\Sigma^*=|\mathbb{N}|=\alef_0, then {L:LΣ}=P(N)>0\{L:L\subseteq\Sigma^*\}=|\mathcal{P}(\mathbb{N})|>\alef_0, so the number of languages uncountable. we claim the number of DFAs over a given Σ\Sigma is countable. let Wn={A:A is DFA with at most n states}W_n=\{A:A\text{ is DFA with at most }n\text{ states}\}, then we have at most nnΣn^{n|\Sigma|} possible transition tables. recall that A=(q0,{q0,...,qn1},δ,F)A=(q_0,\{q_0,...,q_{n-1}\},\delta,F), the number of possible FF is 2n2^n, so WnnnnΣ2n|W_n|\leq nn^{n|\Sigma|}2^n is finite. hence the set DFAs is countable, but the number of languages are not. we cannot cover all the languages. \square

definition. a regular language is a language that is accepted by a DFA.

corollary. the number of regular languages based on Σ\Sigma is countable.

theorem. (closure properties of the set of regular languages)

  1. closure under complement: if LL is a regular language then so is L={w:wL}\overline{L}=\{w:w\notin L\}

  2. if both L1,L2L_1,L_2 are regular, then so is L1L2L_1\cup L_2

    use induction to show it satisfies. \square

  3. if both L1,L2L_1,L_2 are regular, then so is L1L2L_1\cap L_2.

Week 3. Jan 17

defn. a non-deterministic automaton (NFA) is a tuple N=(Σ,Q,δ,F,q0)N=(\Sigma,Q,\delta,F,q_0) where

  1. δ:Q×(Σ{ϵ})P(Q)\delta:Q\times(\Sigma\cup\{\epsilon\})\rightarrow\mathcal{P}(Q)

or informally, NFA is a DFA plus following relaxations:

  1. allowing multiple same-label transition out of state
  2. allowing $\epsilon$-transition (switching between states without reading an input symbol)

defn. the language LL is accepted by NN, ie wL(N)w\in L(N) iff w\exists\overline{w} obtained from ww by inserting the symbol ϵ\epsilon in as many places and x1,...,xm\exists x_1,...,x_m such that x1=q0x_1=q_0 ixi+1δ(xi,yi)\forall i\,x_{i+1}\in\delta(x_i,y_i) where yiy_i is ith symbol in w\overline{w} and ymFy_m\in F.

eg. given DFA's A1,A2A_1,A_2 construct A3A_3 accepting L(A1)L(A2)L(A_1)\cup L(A_2)

img

eg. let L={w{0,1}:third-last symbol in w is 1}L=\{w\in\{0,1\}^*:\text{third-last symbol in }w\text{ is }1\} = {111, 000100, 00101, ...}.

img

we can choose many paths. if any path leads us to an accepting state then w is accepted.

using DFA, we would remember the last 3 letters and each correspond to a state.

theorem. (closure properties of the set of languages accepted by NFAs)

  1. closure under concatenation: if L1,L2L_1,L_2 are accepted by N1,N2N_1,N_2, respectively, then L1L2={w1w2:w1L1,w2L2}L_1\circ L_2=\{w_1w_2:w_1\in L_1,w_2\in L_2\} is accepted by an NFA.
  2. closure under * operation: given language LL accepted by NN, let L0=,L1=L,Ln+1=LnLL_0=\varnothing,L_1=L,L_{n+1}=L_nL, then L=nNLn={w1...wn:wiΣ,nN}L^*=\bigcup_{n\in\mathbb{N}}L_n=\{w_1...w_n:w_i\in\Sigma,n\in\mathbb{N}\} is accepted by an NFA.

remark. if we do not add this new qq_* but reuse q0q_0 we would have error: a word finishes scanning and land on non-accepting state which can transit to q0q_0, then this word is suddenly accepted.

theorem. a language LL is accepted by an DFA iff is accepted by a NFA.
proof.

corollary. converting an NFA into a DFA, the DFA may have 2n2^n states.

corollary. a language is regular is regular iff it is accepted by an NFA.

corollary. the set of all regular languages is closed by union, intersection, concatenation and star.

theorem. every finite language is regular.
proof. let L={w1,...,wn}L=\{w_1,...,w_n\}. we know for every word wi=ai1...aikiw_i=a_{i1}...a_{ik_i}, the individual letters {ai1},...{aiki}\{a_{i1}\},...\{a_{ik_i}\} are regular, by closure of concatenation, we know for every word wiw_i, Lwi={wi}={ai1...aiki}L_{w_i}=\{w_i\}=\{a_{i1}...a_{ik_i}\} is regular. by closure of union we know L=Lw1...LwnL=L_{w_1}\cup...\cup L_{w_n} is regular. \square

theorem. every regular language can be derived from these basic languages by a finite sequence of closure operations.

inductive definition of regular languages

we can define sets in 3 ways:

Week 4. Jan 24

eg. X = all human beings, A = {me}, O = {parent of, child of, spouse of}. my blood relatives are I(X,A,O).

theorem. if W1,W2W_1,W_2 are two sets that contain AA and is closed under OO, then W1W2W_1\cap W_2 also contains AA and is closed under OO.

theorem. (minimality) define W=iWiW=\bigcap_iW_i where WiW_i's are all sets that contain AA and is closed under OO, then not only WW contain AA and is closed under OO, but also is minimal in this respect.
proof. because if any set W\overline{W} satisfies the properties, then it is among the sets intersected in the definition of WW, so WWW\subseteq\overline{W}. \square

theorem. such W=iWiW=\bigcap_i W_i is unique.

eg. assume we know real numbers, define natural numbers: I(R,{0},{xx+1})=NI(\mathbb{R},\{0\},\{x\mapsto x+1\})=\mathbb{N}.

eg. define set of all polynomials with integer coefficients.

eg. set of all propositional formulas.

inductive sets have the ability to prove properties by structural induction.

eg. have 3 cups initially positioned as ∪∩∪, one can flip two cups each time. can i finally flip all of them upright?

we can use inductive sets to revisit regular expressions.

defn. set of regular expressions is I=(alphabet,A,O)I=(\text{alphabet}^*,A,O) where:

  1. alphabet: Σ{ϵ,,(,),,,}\Sigma\cup\{\epsilon,\varnothing,(,),\circ,*,\cup\}
  2. core: A=Σ{ϵ,}A=\Sigma\cup\{\epsilon,\varnothing\}
  3. operations: O={R1,R2R1R2,R1,R2R1R2,R1R1}O=\{\frac{R_1,R_2}{R_1\cup R_2},\frac{R_1,R_2}{R_1\circ R_2},\frac{R_1}{R_1^*}\}

we can generate ϵ,a,b,ab,a,(ab)\epsilon,a,b,a\circ b,a^*,(a\cup b)^*, etc.

for simplicity we omit "\circ", and delete brackets according to priorities: >>* >\circ>\cup. it is also common to use ++ instead of \cup.

theorem. for alphabet Σ\Sigma, we define a function LL that maps regular expressions to languages for every regular expression rr such that L(r)L(r) is subset of Σ\Sigma^*

  1. L()=L(\varnothing)=\varnothing (empty expression -> empty language)
  2. L(ϵ)={ϵ}L(\epsilon)=\{\epsilon\}
  3. L(a)={a}L(a)=\{a\} for every aΣa\in\Sigma
  4. L(R1R2)=L(R1)L(R2)L(R_1\cup R_2)=L(R_1)\cup L(R_2)
  5. L(R1R2)=L(R1)L(R2)L(R_1R_2)=L(R_1)L(R_2)
  6. L(R)=(L(R))L(R^*)=(L(R))^*

such LL is well-defined.
idea. this is well defined whenever I(X,A,O) has a unique readability property, ie for every rI(X,A,O)r\in I(X,A,O), either rAr\in A or there are uniquely defined fOf\in O and r1,r2I(X,A,O)r_1,r_2\in I(X,A,O) such that f(r1,r2)=rf(r_1,r_2)=r.

remark. why do we need \varnothing in definition of regular languages? because there are languages that do not accept anything, and we need a mapping from regular languages to it.

eg. assume Σ={a,b}\Sigma=\{a,b\}, let L1=L_1= all strings containing aa, r=(ab)a(ab)r=(a\cup b)^*a(a\cup b)^*, then L(r)=L1L(r)=L_1.

eg. let r=(a+b)r=(a+b)\varnothing, what is L(r)L(r)? it is \varnothing because it is followed by empty set => cannot generate.

eg. find rr such that L(r)=L(r)= all strings where every aa is immediately followed by a bb. r=((ϵa)b)r=((\epsilon\cup a)b)^*.

defn. a generalized non deterministic automaton (GNFA) is a tuple (Σ,Q,q0,qF,δ)(\Sigma,Q,q_0,q_F,\delta)

  1. Σ\Sigma: finite alphabet
  2. QQ: finite states
  3. starting state: q0q_0
  4. accepting states: qFq_F
  5. transitions δ:(Q{q0})×(Q{qF})(Q{q0})\delta:(Q-\{q_0\})\times(Q-\{q_F\})\rightarrow (Q-\{q_0\}): δ\delta defines the labeling of edges in the GNFA, where edges are regular expressions

or informally an NFA where,

  1. there is a unique start state q0q_0 and no arrows go into it
  2. there is a unique end state qFq_F and no arrows go out of it
  3. every arrow is labeled by a regular expression

defn. a string ww is accepted by GNFA DD, ie wL(D)w\in L(D), iff there exists a path q0...qkqFq_0...q_kq_F such that w=w0w1...wkw=w_0w_1...w_k and for every iki\leq k we have wiL(δ(qi,qi+1))w_i\in L(\delta(q_i,q_{i+1})) and wkL(δ(qk,qF))w_k\in L(\delta(q_k,q_F)). ie we are matching the input word by regexes part by part.

aaabb: a* b* q0 ------> q1 ------> qF

theorem. for every DFA DD, there exists a NGFA GG accepting same language, ie L(D)=L(G)L(D)=L(G).
proof. we need to remove labels entering q0q_0: add a new nonaccepting starting state q0\overline{q}_0 that goes to the original q0q_0 by ϵ. then need to remove labels leaving accepting states: add a new accepting state qF\overline{q}_F and any previous accepting states go to it by ϵ. make the previous accepting states non-accepting.

if we have self-loops q1q2q_1\rightarrow q_2 by aa, we can star it: aa^*.

then we relabel using regexes and eliminate useless states. given any GNFA G=(Σ,Q,q0,qF,δ),Q={q0,...,qk,qF}G=(\Sigma,Q,q_0,q_F,\delta),Q=\{q_0,...,q_k,q_F\}, we will construct GNFA's G1,...,GkG_1,...,G_k such that Q(Gi+1)=Q(Gi){qi}Q(G_{i+1})=Q(G_i)-\{q_i\} and L(Gi+1)=L(Gi)L(G_{i+1})=L(G_i), Q(Gk)={q0,qF}Q(G_k)=\{q_0,q_F\} (?). connect any qq with an edge leading into qiq_i to any node with an arrow from qiq_i to that state

img

(if state q1q_1 goes to q2q_2 by two arrows r1,r2r_1,r_2, then replace two arrows by a single one r1+r2r_1+r_2)

we shrink edges until we only have two states from q0q_0 to qFq_F. \square

we now have the third type of automaton aside from DFA and NFA, we we can show the following theorem.

theorem. for every Σ\Sigma, {L(r):r is regular expression over Σ}={L(A):A is a DFA over Σ}=\{L(r):r\text{ is regular expression over }\Sigma\}=\{L(A):A\text{ is a DFA over }\Sigma\}= all regular languages over Σ\Sigma^*.
proof. we want to show for a regular expression rr, there exists some DFA ArA_r such that L(Ar)=L(r)L(A_r)=L(r). we know regular expressions = I(X=strings, A={∅,ϵ}∪Σ, O={...}), we want to show it has equivalent automaton. by induction, base: we just show each member AA has a DFA:

induction: let r1,r2r_1,r_2 be regular expressions, assume there are DFAs D1,D2D_1,D_2 s.t. L(D1)=L(r1),L(D2)=L(r2)L(D_1)=L(r_1),L(D_2)=L(r_2) (IH), we want to show after operations IH still holds:

so for every regular expression rr there exists some DFA DD s.t. L(D)=L(r)L(D)=L(r), and left ⊆ right.

the remaining direction is for every DFA, there is a same regular language. we just convert it into a GNFA with only two states. the regex on the edge is the regular expression we want. \square

Week 5. Jan 31

the pumping lemma

corollary. for every Σ\Sigma there exists LΣL\subseteq\Sigma^* that is not regular.
proof. we know the set of languages is uncountable and set of DFAs over Σ\Sigma^* is countable (so set of regular expressions over Σ\Sigma^* is countable). \square

can we point to a non-regular language?

theorem. (the pumping lemma) for every regular language LL there exist some number nLn_L such that for every wLw\in L such that w>nL|w|>n_L there is a partition of ww into sections w=xyzw=xyz satisfying:

  1. xynL|xy|\leq n_L,
  2. yϵy\neq\epsilon,
  3. iNxyizL\forall i\in\N\in xy^iz\in L.

proof. let AA be a DFA accepting LL and let nLn_L denote number of states in AA. let wLw\in L with w>nL|w|>n_L, consider the sequence of states q0,δ(q0σ1),,δ(q0σ1σ2),...,δ(q0σ1...σm)q_0,\delta(q_0\sigma_1),,\delta(q_0\sigma_1\sigma_2),...,\delta(q_0\sigma_1...\sigma_m) where w=σ1...σmw=\sigma_1...\sigma_m, then there exists σr,σs\sigma_r,\sigma_s such that δ(q0σ1...σn)=δ(q0σ1...σnσr...σs)\delta(q_0\sigma_1...\sigma_n)=\delta(q_0\sigma_1...\sigma_n\sigma_r...\sigma_s) (we are repeating some states).

now define x=σ1...σn,y=σr+1...σs,z=σ=δs+1...σmx=\sigma_1...\sigma_n,y=\sigma_{r+1}...\sigma_s,z=\sigma=\delta_{s+1}...\sigma_m by picking rr such that δ(q0σ1...σr)\delta(q_0\sigma_1...\sigma_r) is different than any δ(q0...qi)\delta(q_0...q_i) for any i<ni<n and that δ(q0σ1...σr+1),σ(q0σ1...σr+2),...,δ(q0σ1...σs)\delta(q_0\sigma_1...\sigma_{r+1}),\sigma(q_0\sigma_1...\sigma_{r+2}),...,\delta(q_0\sigma_1...\sigma_s) are all different, we are guaranteed that xynL|xy|\leq n_L. then δ(q0σ1...σr(σr+1...σs)iσs+1...σm)=δ(q0σ1...σr(σr+1...σs)jσs+1...σm)\delta(q_0\sigma_1...\sigma_r(\sigma_{r+1}...\sigma_s)^i\sigma_{s+1}...\sigma_m)=\delta(q_0\sigma_1...\sigma_r(\sigma_{r+1}...\sigma_s)^j\sigma_{s+1}...\sigma_m) for all i,jNi,j\in\N. \square

eg. given L={anbn:nN}L=\{a^nb^n:n\in\N\}, show it is not regular.
pick any nLn_L, given a word w=anL+1bnL+1Lw=a^{n_L+1}b^{n_L+1}\in L, since we want to have xynL|xy|\leq n_L we pick partition xyxy to be a sequence of aa's. then xy2zxy^2z will have nL+1+yn_L+1+|y| aa's and nL+1n_L+1 bb's, which is not in LL, failing the pumping lemma (3). \square

eg. show palindromes L={wwR:w{a,b}}L=\{ww^R:w\in\{a,b\}^*\} is not regular.
consider w=anL+1bbanL+1w=a^{n_L+1}bba^{n_L+1}.

eg. (languages over an alphabet with just one symbol) let L{a},Ln={wL:w=n}L\subseteq\{a\}^*,L_n=\{w\in L:|w|=n\}, show every infinite language KK where for arbitrarily large numbers kk there exists nkn_k such that for all nkink+kn_k\leq i\leq n_k+k such that Li=L_i=\varnothing is not regular.

given any number there is a hole of that length a aa ... a^5 a^6 a^7 ... a^15 a^16

pick any nLn_L, and pick any k>nLk>n_L such that w:=akKw:=a^k\in K and for all inL+1i\leq n_L+1 we have ak+iKa^{k+i}\notin K. then let xyzxyz be any partition of this word such that xynL|xy|\leq n_L, then xy2z=w+yk+nL<k+nL+1|xy^2z|=|w|+|y|\leq k+n_L<k+n_L+1 and it is not in KK. \square

the machine will lose track of states and 'sleep' after rejecting too many.

eg. {an2:nN}\{a^{n^2}:n\in\N\} is not regular.
because the gap between n2n^2 and (n+1)2(n+1)^2 can be arbitrarily large.

eg. {ap:p is prime}\{a^p:p\text{ is prime}\} is not regular.
it is an infinite language but for every kk there exists a prime number pp such that p+1,...,p+kp+1,...,p+k is not in the language.

eg. show L={w{a,b}:#a(w)#b(w)}L=\{w\in\{a,b\}^*:\#_a(w)\neq\#_b(w)\} is not regular.
use closure property. consider {ab}L=anbn\{ab\}^*-L=a^nb^n, we know it not is regular either by last example, but it should if LL were regular. \square

combinatorial definition of regular languages

the pumping lemma is not a characteristic of regular languages, but MN theorem is.

defn. given a language LΣL\subseteq\Sigma^*, we say w,wΣw,w'\in\Sigma^* are distinguishable with respect to LL if there exists some zΣz\in\Sigma^* such that one of wz,wzwz,w'z is in LL and the other is not.

claim. given LL we define an equivalence relation over Σ\Sigma^* by wLww\equiv_Lw' if w,ww,w' are not distinguishable via LL.
proof.

using this relation we can partition Σ\Sigma^* into equivalence classes.

eg. let Le={w:w2N}L_e=\{w:|w|\in2\N\}. then abab and abbabb are distinguishable (a simplest empty string as zz will work). we can also partition Σ\Sigma^* into two equivalence classes (even length and odd).

eg. let L={anbk:n,kN}L=\{a^nb^k:n,k\in\N\}. then there are infinitely many equivalent classes {w:#a(w)#b(w)=i}\{w:\#_a(w)-\#_b(w)=i\} for all ii...

eg. let Lk={wΣ:#a(w)k}L_k=\{w\in\Sigma^*:\#_a(w)\leq k\}. depending on how many aa's there are k+2k+2 equivalent classes.

theorem. (Myhill Nerode) a language LL is regular iff the number of equivalence classes of L\equiv_L is finite. furthermore, the number of equivalence classes is the number the minimal number of states in any DFA for LL.

proof.

eg. show L={w=w1...wn{a,b}:nk,wnk+1=a}L=\{w=w_1...w_n\in\{a,b\}^*:n\geq k,w_{n-k+1}=a\} is regular.
we see wLww\equiv_L w' iff the last k-long substring in ww is the same as the last k-long substring in ww'. so we have 2k2^k equivalence classes depending on the available combinations. \square

claim. any nonmempty language LL accepted by a DFA with kk states must accept some word of length less than or equal to kk.

context-free languages

every CFL is defined by a context-free grammar. while regular languages are defined by either a DFA/NFA or a regular expression (description), CFL are defined by a method generating the language.

defn. a context-free grammar (CFG) is a tuple G=(Σ,V,S,R)G=(\Sigma,V,S,R) where

defn. a context-free language for GG is L=I((ΣV),C,O)ΣL=I((\Sigma\cup V)^*,C,O)\cap\Sigma^*, where C=SC=S, OO is a collection of operations www\mapsto w' where w=uAv,w=uwvw=uAv,w'=u\overline{w}v and AwRA\rightarrow\overline{w}\in R.

eg. show the language for the following grammar is L={0n1n:nN}L=\{0^n1^n:n\in\N\}.

Σ: {0,1} V: {s} S: s R: s -> 0s1, s -> ϵ words before intersection: 0s1, 00ss1, ..., 0000s1111, 00001111, ...

eg. find a CFG to generate L={anbk:nk}L=\{a^nb^k:n\leq k\}.

Σ: {a,b} V: {S,T} S: S R: S -> aSbT | ϵ T -> bT | ϵ

eg. find a CFG to generate L={w{a,b}:#a(w)=#b(w)}L=\{w\in\{a,b\}^*:\#_a(w)=\#_b(w)\}.

Σ: {a,b} V: {S} S: S R: S -> aSbS | bSaS | ϵ

showing LL includes this is easy. proving this includes LL: use induction on w|w|, assume every 'balanced' w=w1...wnw=w_1...w_n of length nn can be generated by the grammar. let mm be minimum index from which w1...wmw_1...w_m is balanced, then we claim w1wnw_1\neq w_n. so w2...wm1w_2...w_{m-1} is balanced, so my grammar will generate based on this portion. \square

theorem. every regular language is a context-free language.
proof. use induction on all regular languages: I(P(Σ),{,ϵ}Σ,{,,})I(\mathcal{P}(\Sigma^*),\{\varnothing,\epsilon\}\cup\Sigma,\{\cup,\circ,*\})

Week 6. Feb 7

parse tree: internal nodes are in VV. leaves are in Σ\Sigma^*. it describes the generating sequence too.

a parse tree for a given wL(G)w\in L(G) is not necessarily unique.

eg. set of well formed propositional logic expressions

Σ: {(,),∧,∨,¬,→,p1,...,pn} V = {S} R = S -> p1 | ... | pn | ¬S | (S ∧ S) | (S ∨ S) | (S → S) | ϵ

parse tree for (p1p2)p3(p_1\lor p_2)\rightarrow p_3:

img

the brackets are necessary to avoid ambiguity (multiple possible pares trees for same expression).

the pumping lemma

the set of context-free languages over an alphabet is countable because derivation rules are finite objects.

defn. the degree of a tree is the max number of down-going edges from any vertex; the height of a tree is the length of the longest branch of the tree.

lemma. for any tree TT, the number of leaves of TdhT\leq d^h where d=deg(T),h=height(T)d=\text{deg}(T),h=\text{height}(T).

note that if T is a derivation tree for a word ww in some LL, then # leavesw\text{\# leaves}\geq|w|, and d(T)maxw{AwR}d(T)\leq\max_{|w|}\{A\rightarrow w\in R\} (longest word produced by one rule), so hlogdwh\geq\log_{d}|w|.

theorem. (the CFL pumping lemma) for every CFL LL, there is some number pNp\in N such that for every wLw\in L if w>p|w|>p then there is a partition of ww into 5 parts w=xyzuvw=xyzuv and

  1. yzup|yzu|\leq p,
  2. yu1|yu|\geq1,
  3. xyizuivLxy^izu^iv\in L for all iNi\in\N.

proof. let GG be the grammar and let p=dV+1p=d^{|V|+1} where dd is the degree of any derivation tree and V|V| be number of variables. let wL(G)w\in L(G) be a word longer than pp. pick a derivation tree for ww with minimal possible number of vertices so we have the 'height' of derivation steps hV+1h\geq|V|+1, which means we must have repeated some variables. consider the two lowest occurrences of such variables, call these nodes A,BA,B and AA is on the left of BB. as the leaves of the tree are the word itself, we pick

  1. the leftmost subtree of AA as xx,
  2. to the leftmost subtree of BB as yy,
  3. to the rightmost subtree of BB as zz,
  4. to the rightmost subtree of AA as uu,
  5. the rightmost subtree of AA as vv.

we can generate powers of yy and uu because we can mimic the reuse of variable seen in AA and BB in the leaf level. \square

img

eg. let L={anbncn{a,b,c}:nN}L=\{a^nb^nc^n\in\{a,b,c\}^*:n\in\N\}, show it is not context-free.
given any pNp\in\N, pick w=apbpcpLw=a^pb^pc^p\in L. for every partition w=xyzuvw=xyzuv, we have yzup|yzu|\leq p (1), there exists a letter in a,b,c that does not appear in it. now pick i=2i=2 and consider xy2zu2vxy^2zu^2v, then apply (2) that yu1|yu|\geq 1 (it is not empty). we found that we are only increasing the count of possibly one or two letters, but not for the third letter, so the count for all three letters do not equal, failing (3). \square

eg. let L={ww:w{a,b}}L=\{ww:w\in\{a,b\}^*\} is not context-free (in contrast, set of palindromes are using S -> aSa|bSb|ϵ).
given any pNp\in\N, consider w=apbpapbpLw=a^pb^pa^pb^p\in L. a division yzuyzu can only touch 2 parts at most. \square

theorem. (closure property of CFL) if L1,L2L_1,L_2 are context-free, then

  1. L1L2L_1L_2 is context-free
  2. L1L2L_1\cup L_2 is context-free
  3. L1L_1^* is context-free.

defn. LL is a boolean combination of L1,L2L_1,L_2 if there is a function :{T,F}2{T,F}*:\{T,F\}^2\rightarrow\{T,F\} such that for all wΣw\in\Sigma^* we have wLw\in L iff (wL1,wL2)=T*(w\in L_1,w\in L_2)=T.

the set of CFL is however not closed under boolean operations.

eg. show L1={akbncn:n,kN},L2={anbnck:n,kN}L_1=\{a^kb^nc^n:n,k\in\N\},L_2=\{a^nb^nc^k:n,k\in\N\} are CFL, but their intersection is not.
the rules for L1L_1 is:

S -> AB | ϵ A -> Aa | ϵ B -> bBc | ϵ

for L2L_2 it is similar. we have L1L2={anbncn:nN}L_1\cap L_2=\{a^nb^nc^n:n\in\N\} which is not context-free. \square

we also know the complement of a CFL can also be not closed because complement can be represented by union/intersection and negations.

pushdown automata

we use pushdown automata to compute CFL. it is an NFA endowed with a stack memory. whenever we scan a new letter, we can compare it with the top elements on the stack, and push it to the stack.

Week 7. Feb 14

defn. a pushdown automata is a tuple P=(Σ,Q,q0,F,δ,Γ)P=(\Sigma,Q,q_0,F,\delta,\Gamma) where

defn. a word ww is accepted by PP, ie wL(P)w\in L(P) iff there exist w0w1...wm(Σ{ϵ})w_0w_1...w_m\in(\Sigma\cup\{\epsilon\})^*, r0r1...rmQr_0r_1...r_m\in Q, s0s1...sm(Γ{ϵ})s_0s_1...s_m\in(\Gamma\cup\{\epsilon\})^* (ie some live contents of letter, state and stack) such that for all ii we have (ri+1,b)δ(ri,wi,a)(r_{i+1},b)\in\delta(r_i,w_i,a) such that si=ats_i=at (curr stack) and si+1=bts_{i+1}=bt (new stack) for tΓt\in\Gamma^* (sequence of stack contents), and last state rmFr_m\in F.

we can use a 3d matrix to represent the transition function...

eg. construct a PDA accepting language L={anbn:nN}L=\{a^nb^n:n\in\N\}.
whenever we see aa, we push a symbol, when we see bb, we pop a symbol. if we reach the bottom when we finish the word, we accept it.

let P=(Σ={a,b},Q={q0,q1,q2,q3},q0,F={q0,q3},δ,Γ={$,a})P=(\Sigma=\{a,b\},Q=\{q_0,q_1,q_2,q_3\}, q_0, F=\{q_0,q_3\}, \delta, \Gamma=\{\$,a\}), where the transition is

Q\ΣQ\backslash\Sigma ϵ\epsilon aa bb
q0q_0 ϵ(q1,$)\epsilon\mapsto(q_1,\$)
q1q_1 ϵ(q1,a)\epsilon\mapsto(q_1,a) a(q2,ϵ)a\mapsto(q_2,\epsilon)
q2q_2 b(q3,ϵ)b\mapsto(q_3,\epsilon) a(q2,ϵ)a\mapsto(q_2,\epsilon)
q3q_3

(incomplete)

can we read/write multiple symbols on stack? nothing changes as long as we bound the length of word, just split the operation into some auxiliary states.

eg. construct PDA for L={anbick:n=ii=k,n,i,kN}L=\{a^nb^ic^k:n=i\lor i=k,n,i,k\in\N\}.

theorem. a language LL is context-free iff there exists a PDA PP such that L=L(P)L=L(P).
proof.

corollary. if LL is a CFL and LL' is regular, then LLL\cap L' is a CFL.
proof. let PP be a PDA for LL, and DD be a DFA for LL'. define a new PDA such that it is a product of PP and DD:

eg. given LL is CFL and RR is regular, is LRL-R a CFL?
note LR=LRL-R=L\cap\overline{R}, so it is CFL. \square

eg. show L={w{a,b,c}:#a(w)=#b(w)=#c(w)}L=\{w\in\{a,b,c\}^*:\#_a(w)=\#_b(w)=\#c(w)\} is not context-free.
assume for contradiction LL is context-free, then consider L=LL(abc)={anbncn:nN}L'=L\cap L(a^*b^*c^*)=\{a^nb^nc^n:n\in\N\} which is not an CFL. but since L(abb)L(a^*b^*b^*) is CFL (regular) it should. \square

Week 9. Feb 28

turing machines

church thesis: all reasonable ways of defining what are computable tasks, are equivalent.

eg.

defn. a turing machine is a tuple T=(Σ,Γ,Q,q0,qacc,qrej,δ)T=(\Sigma,\Gamma,Q,q_0,q_\text{acc},q_\text{rej},\delta) where

defn. a configuration of a turing machine is C=(u,q,v)C=(u,q,v) where uu is the tape locations before the current location, uu is the locations after the current location, qq is the current state.

_________|a|_______ <---u---> <--v--->

defn. CCC\overset{*}{\to} C' if allows going from CC to CC' in finitely many steps.

eg. (uaqbv)(uacqv)(uaqbv)\to(uacq'v) means δ(q,b)=(q,c,R)\delta(q,b)=(q',c,R); (uaqbr)(uqacv)(uaqbr)\to(uq'acv) means δ(q,b)=(q,c,L)\delta(q,b)=(q',c,L).

defn. a word wΣw\in\Sigma^* is accepted by TT, ie wL(T)w\in L(T) iff there exist some u,vΓu,v\in\Gamma^*, such that (ϵq0w)(uqaccv)(\epsilon q_0w)\overset{*}{\to}(uq_\text{acc}v).

defn. a language LL is turing recognizable if there exists some turing machine TT such that L=L(T)L=L(T).

defn. a language is decidable if there exists a turing machine for it and the machine gets to either accepting or rejecting states.

eg. fix some first order logic, let L={α: α}L=\{\alpha:~\vdash\alpha\} (any proof that proves α\alpha). it is recognizable as TT will go over all finite sequence of formulas, for each it will 1) check whether it is legal proof 2) is the last formula α\alpha. it is not decidable however.

countable uncountable decidable ⊆ recognizable ⊆ unrecognizable

note

robustness of TM model of computation

we can consider some variations of turing machines that are equivalent.

eg. allowing the reading head not move left or right: Q×ΓQ×Γ×{L,R,S}Q\times\Gamma\rightarrow Q\times\Gamma\times\{L,R,S\}.
can be simulated by a standard TM: go left then back right without changing state

eg. have kk tapes simultaneously: δ:Q×ΓkQ×Γk×{L,R}k\delta:Q\times\Gamma^k\rightarrow Q\times\Gamma^k\times\{L,R\}^k.
clearly the k-tape machine is as powerful as a standard TM. create a tape that has kk segments, and for each of them duplicate the γi\gamma_i's. each has a special letter γj\overline{\gamma}_j where jj's are different across segments.

-----+----------------------+----- -----+----------------------+----- tape 1 ... tape k

eg. two-stack PDA.
one stack for left side, one for right side. moving head to right => moving top element from right stack to left stack, ...

eg. non-deterministic turing machines: δ:Q×ΓP(Q×Γ×{L,R})\delta:Q\times\Gamma\rightarrow\mathcal{P}(Q\times\Gamma\times\{L,R\}). a word ww is accepted by it iff there is a way to get (uqaccv)(uq_\text{acc}v) from (q0w)(q_0w).
use a 3-tape TM to simulate it. start from C0=(q0w)C_0=(q_0w), we can go to three configurations C11,C12,C13C_{11},C_{12},C_{13}, which can each have three configurations... using BFS, if there is a way to get to accepting state, then it is accepted. for the 3 tapes

numerating machines

add a new state qprintq_\text{print}. when the machine reaches qprintq_\text{print}, we view the current content of the tape as the output.

defn. a word ww is accepted by the enumerator EE, ie wL(E)w\in L(E), iff it can be printed.

defn. LL is recursively enumerable if there exists some enumerator EE such that L=L(E)L=L(E).

claim. LL is recursively enumerable iff it is recognizable.
proof.

theorem. a language LL is decidable iff both LL and ΣL\Sigma^*-L are recursively enumerable.
proof.

remark. language is decidable if enumerator can print strings in lexicographical order.

how can one show a language is not decidable?

hilbert's 10th problem: given a polynomial of several variables P(x,y,z,...)=0P(x,y,z,...)=0 decide if it has integer solutions.

theorem. Lp={P:P is a polynomial with integer solution}L_p=\{\lang P\rang:P\text{ is a polynomial with integer solution}\} is undecidable. (angle bracket: encode object as a binary string)

eg. {G:G is undirected graph that is connected}\{\lang G\rang:G\text{ is undirected graph that is connected}\} is decidable.
try design a TM to scan the encoded string..

eg. LDFA={w,D:D is DFA that accepts w}L_\text{DFA}=\{\lang w,D\rang:D\text{ is DFA that accepts }w\} is decidable.
just run the DFA with w...

eg. EDFA={D:D is DFA and L(D)=}E_\text{DFA}=\{\lang D\rang:D\text{ is DFA and }L(D)=\varnothing\} is decidable.
use pumping lemma, if DD accepts no word of length less than Q+1|Q|+1 then it does not accept other words (assume otherwise, choose a word > this number in language, pump by power of 0 to create contradiction).

eg. LDFA-eq={D,D:D,D are DFAs and L(D)=L(D)}L_\text{DFA-eq}=\{\lang D,D'\rang:D,D'\text{ are DFAs and }L(D)=L(D')\} is decidable.
note L(D)=L(D)L(D)=L(D') iff both L(D)L(D)L(D)-L(D') and L(D)L(D)L(D')-L(D) are empty. construct the difference DFA (product DFA) and go to the previous eg.

eg. LPDA={w,G:wL(G)}L_\text{PDA}=\{\lang w,G\rang:w\in L(G)\} is decidable.
turn the grammar G into a PDA...

eg. ECFG={G:L(G)=}E_\text{CFG}=\{\lang G\rang:L(G)=\varnothing\} is decidable.
pumping... the threshold is sV+1|s|^{|V|+1} where s=max{s:AsR}|s|=\max\{|s|:A\rightarrow s\in R\}

eg. LCFG-eq={G,G:L(G)=L(G)}L_\text{CFG-eq}=\{\lang G,G'\rang:L(G)=L(G')\} is undecidable.
construct G\overline{G} such that L(G)=L(G)L(G)L(\overline{G})=L(G)-L(G'). turns out that it is undecidable...

Week 10. Mar 7

eg. ATM={T,w:wL(T)}A_{\text{TM}}=\{\lang T,w\rang:w\in L(T)\} is undecidable.
assume for a contradiction that it is decidable and let AA be the turing machine/algorithm. define a new algorithm DD as follows:

then feed D\lang D\rang into DD. if D(D)=acceptD(\lang D\rang)=\text{accept}, then A(D,w)=rejectA(\lang D\rang,w)=\text{reject}, then it means D(D)=acceptD(\lang D\rang)=\text{accept}. same for the other direction. so we get contradiction. \square

however it is recognizable. given word of and the machine, just run it.

the problem for rejecting a word is it can possibly run forever.

eg. (halting problem) H={M,w:M halts on input w}H=\{\lang M,w\rang:M\text{ halts on input }w\} is undecidable.
we use proof by reduction. suppose otherwise it has decider HH. given T,w\lang T\rang,w be input for ATMA_{\text{TM}} problem. then feed this into HH, then it can either accept or reject and run in finite steps. then we have solved ATMA_\text{TM} but we should not. \square

eg. ETM={M:L(M)=}E_\text{TM}=\{\lang M\rang:L(M)=\varnothing\} is undecidable.
suppose otherwise it has decider ETMETM. given T,w\lang T\rang,w be input for ATMA_\text{TM} problem, fix ww and define MM' such that

feed MM' into ETMETM, as it runs in finitely many steps we can solve ATMA_{\text{TM}} problem. \square

eg. EQTM={M,M:L(M)=L(M)}\text{EQ}_\text{TM}=\{\lang M,M'\rang:L(M)=L(M')\} is undecidable.
suppose otherwise it has decider EQEQ. let M0M_0 be some machine that accepts nothing. then for an input M\lang M\rang for the ETME_{\text{TM}} problem, L(M)=L(M0)L(M)=L(M_0) iff L(M)=L(M)=\varnothing so we can use EQEQ to solve it. \square

eg. All={M:L(M)=everything}\text{All}=\{\lang M\rang:L(M)=\text{everything}\} is undecidable.

examples of undecidable problems that do not mention turing machines

word complexity

how to measure complexity of a words? try do this by its 'description length', ie how long is the shortest way to describe the word.

the Barry paradox: consider a number has a short definition if there is an English sentence of words less than 200 characters that defines it. is the set of all shortly described numbers finite?

the problem is the description must be some fixed 'description language'.

eg. a Python program that given empty inputs it returns a string. then the program is the description.

defn. kolmogorov complexity is the length of the shortest description Kd(s):=min{p:d(p)=s}K_d(s):=\min\{|p|:d(p)=s\}.

theorem. some properties:

  1. for every dd, there exists cNc\in\N such that for all cc we have Kd(s)s+cK_d(s)\leq|s|+c.
  2. there exists description that is much shorter than the word.
  3. there are infinitely many string ss such that Kd(s)sK_d(s)\geq|s|.

defn. we say string ss is b-compressible if Kd(s)sbK_d(s)\leq|s|-b.

claim. for every b>1b>1 and for all large enough nn, most of the string of length nn are not b-compressible.
proof. there are at most i=1sb2i=2sb1\sum_{i=1}^{|s|-b}2^i=2^{|s|-b}-1 descriptions of size below sb|s|-b and there are 2s2^{|s|} strings of length s|s|. even b = 1 we cannot compress half of them. \square

how to say a string is random? can you use probability? no because strings have same probability to be generated

10101010101010101010 p=1/2^20 10110101000101010111 p=1/2^20

Chaitin defines a string to be 'random' if it is not compressible.

defn. a property PP of a string is rare if limn{s:s=n,s has P}2n=0\lim_{n\rightarrow\infty}\frac{|\{s:|s|=n,s\text{ has }P\}|}{2^n}=0.

claim. for every rare property PP, for every bb there exists nn such that all strings of length nn that satisfy PP are b-compressible.

pf. 'the i-th string that satisfies P' is a compressed description of that string once PP is random and ii is larger (c+logic+\log i).

theorem. some straightforward 'positive' results (upperbound of KdK_d):

  1. x c Kd(xx)c+Kd(x)\forall x~\exists c~K_d(xx)\leq c+K_d(x)
  2. xy c Kd(xy)c+Kd(x)+Kd(y)+log(Kd(x))\forall x\forall y~\exists c~K_d(xy)\leq c+K_d(x)+K_d(y)+\log(K_d(x))
  3. for every two programming languages d,dd,d' and every string ss, there is some constant bd,db_{d,d'} such that Kd(s)Kd(s)+bddK_d(s)\leq K_{d'}(s)+b_{dd'}.
  4. for every pairs of strings s1,s2s_1,s_2 there exist programming languages d1,d2d_1,d_2 such that Kd1(s1)<<Kd2(s2)K_{d_1}(s_1)<<K_{d_2}(s_2) and vice versa.

theorem. for any given programming language dd there is no algorithm such that for every ss it computes Kd(s)K_d(s) (KdK_d is not computable).
proof. following the argument of the Barry paradox. assume there is some algo AA that computes correctly Kd(s)K_d(s) for all ss. what is the length of the a d-program computing 'the first string s of Kd(s)>LK_d(s)>L for LL? then it is

for each string in lexicographical order: for each si: run A to find si return first si such that A(si) > L

then the length of this program is Kd(s)=c+logLK_d'(s)=c+\log L (L is the only variable and we need logL\log L bits). if LL is sufficiently large, then we can always find some c+logL<Lc+\log L<L. we get contradiction.

another proof halting problem is undecidable: given program that decides HP, we could then compute Kd(s)K_d(s) for every ss: by going over all programs in lexi order. given empty input, check if any program of size <s<|s| outputs ss. before we run program, use HP to check whether it halts on empty input.

we know that for every length LL there are only finitely many strings ss with Kd(s)LK_d(s)\leq L. can we prove a lower bound?

theorem. for any proof system there is some L<L<\infty such that this system cannot proof Kd(s)>LK_d(s)>L for any ss.
again use Barry paradox.

Week 11. Mar 14

computable functions

defn. a function f:ΣΣf:\Sigma^*\rightarrow\Sigma^* is computable if there is a turing machine MM such that for every wΣw\in\Sigma^* as input, MM eventually halts with f(w)f(w) as its tape content.

we know recognizable and recursively enumerable are equivalent notions. it is also equivalent to LL being the range of some computable function ff (ie L={f(n):nN}L=\{f(n):n\in\N\}).

defn. for languages A,BA,B, we say AA is mapping reducible to BB, denoted AmBA\leq_m B, if there is computable function f:ΣΣf:\Sigma^*\rightarrow\Sigma^* such that for every wΣw\in\Sigma^*, wAw\in A iff f(w)Bf(w)\in B.

theorem. if AmBA\leq_mB and BB is decidable, then AA is decidable.

theorem. if AmBA\leq_mB and BB is recognizable, then AA is recognizable.

we know already recognizable language does not imply complement is recognizable. but can there be language such that itself and its complement are both not recognizable?

claim. both EQTM\text{EQ}_{\text{TM}} and EQTM\overline{\text{EQ}_{\text{TM}}} are not recognizable.
proof.

defn. we say f:NNf:\N\rightarrow N dominates g:NNg:\N\rightarrow\N if there exists m0m_0 such that for all n>m0n>m_0, f(n)g(n)f(n)\geq g(n).

lemma. for every countable set of such functions FF, there exists a function gFg_F that dominates every member of FF.
proof. since FF is countable we can find some indexing such that F={fn:nN}F=\{f_n:n\in\N\}. define gFg_F by for every nn, =maxi,jnfi(j)+1=\max_{i,j\leq n} f_i(j)+1. \square

corollary. there exists gcg_c that dominates every computable function.
this is because set of all computable functions is countable (turing machine is countable (finite codes)).

note for every nn there are only finitely many functions computable by a turing machine with at most nn states. given any number kk, define gc(k)g_c(k) by maxM with k states,ik s.t. M halts on input i{# computation steps of M on input i}\max_{M\text{ with }\leq k\text{ states},i\leq k\text{ s.t. }M\text{ halts on input }i}\{\text{\# computation steps of }M\text{ on input }i\}. then this gcg_c is not computable. if it were, we could decide ATMA_\text{TM} as we have a maximum of steps to take to scan any input M,w\lang M,w\rang.

theorem. (Ramsey) there exists a number r(m)r(m) such that for every undirected graph G=(V,E)G=(V,E) where Vr(m)|V|\geq r(m) there is either a click of size mm, or an isolated set of size mm, ie there exists AVA\subseteq V such that Am|A|\geq m and either, for all x,sA,xyEx,s\in A,xy\in E, or for all x,yA,xyEx,y\in A,xy\notin E.

or more generally, for every m,b,lm,b,l there is a number r(m,k,l)r(m,k,l) such that for every WW, if Wr(m,k,l)|W|\geq r(m,k,l), for all f:Wk[l]f:W^k\rightarrow[l], there exists AWA\subseteq W such that ff on AkA^k is constant and Am|A|\geq m.

the function rr is not computable...

if LL is such that fi(i)={i:ith w that is in L}f_i(i)=\{|i|:i\text{th }w\text{ that is in }L\}. ff is fast growing function, and LL is not recognizable.

logic

first order logic:

eg. for M=(N,{a,,+,<})M=(\N,\{a,*,+,<\}), we have language L={...,a,g(,),f(,),R(,)}L=\{...,a,g(,),f(,),R(,)\} to represent the relations and functions. we can use these to write formulas eg xy(R(x,y)st(g(s,t)=y    s=yt=y))\forall x\exist y(R(x,y)\land\forall s\forall t(g(s,t)=y\implies s=y\lor t=y)) ('there are infinitely many primes').

defn. a formula is called a sentence if all of its variables are quantifiers.

defn. given structure MM, its theorems is Th(M)={x:Mx}\text{Th}(M)=\{x:M\vDash x\} (true sentences).

theorem. for a language LL and structure MM for that language, then for every sentence xx in the language, then either xTh(M)x\in\text{Th}(M) or xTh(M)x\notin\text{Th}(M).

question: is there an algo that on any given α\alpha it can decide whether it belongs to the theory of the structure MM? in other words, given language LL, is ThL(M)\text{Th}_L(M) decidable?

claim. let L+={P(,,)}L_+=\{P(,,)\} with only one three-place relation, and M=(N,{x,y,z:x+y=z})M=(\N,\{x,y,z:x+y=z\}). then ThL+(M)\text{Th}_{L_+}(M) is decidable.
show by induction by the number of quantifiers...

claim. let L+={P(,,),G(,,)}L_{+*}=\{P(,,),G(,,)\} and M=(N,{x,y,z:x+y=z},{x,y,z:xy=z})M=(\N,\{x,y,z:x+y=z\},\{x,y,z:xy=z\}). then ThL+(M)\text{Th}_{L_{+*}}(M) is undecidable.
proof. note for this language we do not explicitly define ordering, 0, 1 etc. as they can be expressed in the form of formulas, so it describes properties of N\N.

by reduction to ATMA_\text{TM}. given T,v\lang T,v\rang as input for ATMA_\text{TM}, we want to construct a sentence α\alpha such that T,wATM\lang T,w\rang\in A_\text{TM} iff αThL+(M)\alpha\in\text{Th}_{L_{+*}}(M). we can encode sequence of natural numbers as natural numbers.

and wATMw\in A_\text{TM} iff there exists some s1,...,sks_1,...,s_k such that each sis_i is a pair (qQ,wΓ)(q\in Q,w\in \Gamma^*), s0=(q0,ϵ)s_0=(q_0,\epsilon) and sk=(qacc,w)s_k=(q_\text{acc},w). note each of these steps can be written as a natural number, and then one natural number. if we can decide which number satisfies theorem, then we can decide the ATMA_\text{TM}. \square

it is not recognizable either. if it were, then its complement recognizable (due to definition of theory), then it is decidable.

defn. a proof system PP is set of rules that defines which formula a 'formal theorem' is provable (Pα\vdash_P\alpha), ie satisfying.

  1. there is a certificate 'proof' in the proof system and the set of all legal proofs is decidable;
  2. sound: for some structure MM, for all α\alpha we have Pα    Mα\vdash_P\alpha\implies M\vDash\alpha

claim. if PP is a proof system satisfying property LL, then {α: Pα}\{\alpha:~\vdash_P\alpha\} is recognizable.
proof. go over all strings in lexi order and given string ww apply the proof system decider to check whether ww is legal proof in LL, if so, print out what ww proves.

claim. {α: Pα}Th(N)\{\alpha:~\vdash_P\alpha\}\subseteq\text{Th}(\N).

corollary. for every proof system PP that is sound for Th+(N)\text{Th}_{+*}(\N) there is a sentence α\alpha such that Nα\N\vDash\alpha and ̸Pα\not\vdash_P\alpha.

Week 12. Mar 21

we know the set {α:Nα}\{\alpha:\N\vDash\alpha\} is 'complete' in the sense that either for every α\alpha, either αTh(N)\alpha\in\text{Th}(\N) or ¬αTh(N)\lnot\alpha\in\text{Th}(\N). it is recursively enumerable but not decidable.

observe Th(P)\text{Th}(P) for some system PP is not complete, namely there exists some α\alpha such that ̸Pα,̸P¬α\not\vdash_P\alpha,\not\vdash_P\lnot\alpha.

how to define natural numbers? we cannot prove anything but we can try focusing on natural numbers only. Peano arithmetic is an accepted proof system for N\N.

ZFC (Zermelo-Fraenkel set theory) can prove every axiom of PA.

if ϕ(x,y)\phi(x,y) denotes a function over N\N, then we need

a function is computable iff it can be expressed by a formula pp such that PAtwo properties\text{PA}\vdash\text{two properties}.

as corollary, there exist functions such that ZFC can prove both properties but they grow faster than any computable function (can cannot be proved by PA).

example of ZFC is not complete:

eg. (continuous hypothesis) ZFC cannot prove the statement "R=0|\R|=\alef_0" nor its negation.

eg. (Goldel's 2nd incompleteness) for any consistent proof system PP, P⊬Cons(P)P\not\vdash\text{Cons}(P) ie PP cannot prove PP is consistent.

however ZFCCons(PA)\text{ZFC}\vdash\text{Cons}(\text{PA}).

complexity

once a language/problem is computable, now we can consider its time complexity and space complexity.

defn. for functions f,g:NNf,g:\N\rightarrow\N, we say f(n)=O(g(n))f(n)=O(g(n)) if there exists some cNc\in\N such that for all n>n0n>n_0 such that f(n)<cg(n)f(n)<cg(n).

defn. the time complexity up to a polynomial of a problem is the time it takes for the most efficient algorithm.

eg. find the time complexity of {akbk:kN}{a,b}\{a^kb^k:k\in\N\}\subset\{a,b\}^*. given string of length nn. after checking aa's are followed by bb's, given a one-tape machine we can do

if we have two-tape machine

refined church thesis: for every problem and evert two 'reasonable' models of computation, the time complexity of solving the problem on those machines is within a polynomial of each other.

eg. for every problem solvable in any time f(n)f(n) by a two-tape machine, the problem can be solved by a 1-tape machine in time O(f(n)2)O(f(n)^2).
for the 1-tape machine, put a barrier so that it has two parts corresponding to the 2 tapes. if on the 2-tape machine we cross the tapes at most f(n)f(n) times, then the 1-tape machine has to move O(f(n)2)O(f(n)^2) times to account for this. \square

defn. the class P is the family of all problems solvable in polynomial time (in terms of input size nn).

P=kNT(nk)P=\bigcup_{k\in\N}T(n^k)

the class P is commonly accepted as modeling 'feasible problems'.

issues of this interpretation:

eg. for problems over graphs, we represent a graph by its adjacency matrix MGM_G. then the size of this representation GV2|\lang G\rang|\approx|V|^2. we also know MGM_G is symmetric iff GG is undirected.

eg. for the problem Lp={n:n is prime}L_p=\{n:n\text{ is prime}\}. we represent number by its binary representation nlogn|\lang n\rang|\approx\log n. can we solve this in polynomial time?

defn. a language LL belongs to NP if there exists a polynomial time (in the size of ww) deciding (verification) algorithm VV such that L={w:c V(w,c)=accept}L=\{w:\exist c ~V(w,c)=\text{accept}\}.

observation. PNPP\subseteq NP.
just take ϵ\epsilon as the certificate for every ww and let VV be the algorithm deciding LL.

eg. (clique) given input G=(V,E),k\lang G=(V,E),k\rang, LCl={G,k:G contains a clique of size k}L_\text{Cl}=\{\lang G,k\rang:G\text{ contains a clique of size }k\}.
given such an input, a certificate cc will be a list of kk vertices cVc\subseteq V and the verifier just check for every x,yc,xyEx,y\in c,xy\in E.

eg. LHamPath={G:G has a Hamiltonian path}L_\text{HamPath}=\{\lang G\rang:G\text{ has a Hamiltonian path}\}.
the certificate is the path.

eg. Lcomposite={n:n is not prime}L_\text{composite}=\{\lang n\rang:n\text{ is not prime}\}.
the certificate is two numbers such that they are not 1 and multiply to make nn. in terms of input size this is poly.

prop. the class P is closed under complementation.
use same algorithm and flip answer.

however it is not clear if NP is closed under complementation (NP = CoNP?).

defn. CoNP={L:LNP}\text{CoNP}=\{L:\overline{L}\in NP\}.

eg. SAT={φ:φ is a satisfiable formula in propositional logic}SAT=\{\varphi:\varphi\text{ is a satisfiable formula in propositional logic}\}. formula is satisfiable if there exists a truth assignment such that formula gets true.

why is this class called NP?

defn. an algorithm/turing machine is nondeterministic if at every step it may have more than one next step configurations. such algorithm is polynomial time nondeterministic if no matter which choices are picked, the run ends in some polynomial time.

defn. a language LL is accepted by a nondeterministic turing machine iff at least one of the possible paths of the TM ends in accept state.

theorem. a language LL is in NP iff it can be decided by a nondeterministic polynomial time algorithm.

remark. PNPkNUT(2nk)P\subseteq NP\subseteq\bigcup_{k\in\N}UT(2^{n^k}).

it is not clear whether P=NPP=NP.

eg. how to credit card store pincode?
it uses one way functions that are easy to compute forward but hard to compute backward. the card has f(pin)f(\text{pin}), the ATM computes f(userPin)f(\text{userPin}) and compare. if P=NP then we can inverse easily.

defn. given L1,L2L_1,L_2, f:ΣΣf:\Sigma^*\rightarrow\Sigma^* is a polynomial time reduction if ff runs in polynomial time and w:wL    f(w)L2\forall w:w\in L\iff f(w)\in L_2.

observation. given L1<pL2L_1<_p L_2, if L2PL_2\in P, then L1PL_1\in P.
proof. if ff is O(nk)O(n^k) computable then for every ww, f(w)f(w) has size at most wk|w|^k. so if AA solves L2L_2 in time nk+1n^{k+1} and ff runs in time nk2n^{k_2}, we get a solve for L1L_1 in time (nk2)k1(n^{k_2})^{k_1}. \square

defn. LL is NP-complete if LNPL\in NP and LNP:L<pL\forall \overline{L}\in NP:\overline{L}<_p L.

Week 13. Mar 28

defn. the class S(nk)S(n^k) is all languages decidable in space O(nk)O(n^k), ie there exists deciding MM such that input ww requires space (tape size) O(wk)O(|w|^k). then

PSPACE=kS(nk)P_{SPACE}=\bigcup_{k}S(n^k)

defn. a function ff is time constructable if ff on input 0n0^n, ff computes f(n)f(n) in time O(f(n))O(f(n)).

every 'reasonable' function (n.n2,2n,...n.n^2,2^n,...) is time constructable.

theorem. (time hierarchy) for every 'reasonable' f,g:NNf,g:\N\rightarrow\N, if f(n)=O(g(n)logg(n))f(n)=O\left(\frac{g(n)}{\log g(n)}\right), there is a language LT(g(n))T(f(n))L\in T(g(n))-T(f(n)) (it is in the middle of the two).
proof. define L={M01k:M is code of a TM,M(M01k) outputs reject in time <f(M01k)}L=\{\lang M\rang01^k:M\text{ is code of a TM},M(\lang M\rang01^k)\text{ outputs reject in time }<f(|\lang M\rang01^k|)\}. diagonalization argument can show LT(f(n))L\notin T(f(n)). by the term 'reasonable', we can show LT(g(n))L\in T(g(n)). \square

this means containing PP, we can have more classes, eg

we can do the same for PSPACEESPACE...P_{SPACE}\subset E_{SPACE}\subset...

claim. T(f(n))S(f(n))T(f(n))\subseteq S(f(n)).
because we can use more space than time.

but we do not know if T(f(n))=S(f(n))T(f(n))=S(f(n)).

claim. PSPACEEXPTIMEP_{SPACE}\subseteq EXPTIME.
simulate all possible computations, there is only exponential of them.

but we do not know if P=PSPACEP=P_{\text{SPACE}} or PSPACE=EXPTIMEP_\text{SPACE}=EXPTIME. but by time hierarchy, one of them must be unequal.

P ⊂? PSPACE ⊂? EXPTIME

claim. for every RE language LL, there is a decidable language LL' such that L={w:v (w,v)L}L=\{w:\exists v~(w,v)\in L'\}

img

we can generate more complex languages based on LL by using this 'projection' operation.

we do not know if similarly NPCoNP=PNP\cap CoNP=P.

poly time hierarchy each step we get something new

img

Σn={L:LΓn1 v1:(w1,v1)L}\Sigma_n=\{L':\exist L'\in \Gamma_{n-1}~\exist v_1:(w_1,v_1)\in L'\}, where

we do not know whether the intersection of all this hierarchy is PP.

there are lots of connections between P/NP/coNP and logics.

defn. we say AA is reducible to BB, AmCBA\leq_m^C B if there is a reduction f:ABf:A\rightarrow B within the resources class CC.

observations.

  1. A:AmPA\forall A:A\leq_m^P A
  2. A,B,C:AmPBBmPC    AmPC\forall A,B,C:A\leq_m^P B\land B\leq_m^P C\implies A\leq_m^P C

corollaries.

  1. A,BP:AmPB\forall A,B\in P:A\leq_m^P B
  2. AP,BNP:AmpB\forall A\in P,B\in NP:A\leq_m^p B
  3. BP:AmPB    AP\forall B\in P:A\leq_m^P B\implies A\in P

some examples of reduction:

defn. a 3-SAT variable is any propositional variable, a 3-SAT literal is any variable or its negation, a 3-SAT clause is ,\land,\lor of 3-SAT literals, a CNF (conjunctive normal form) formula has clauses that have ,¬\lor,\lnot, and clauses are joined using \land.

in fact for every boolean function f:1,0n{0,1}f:{1,0}^n\rightarrow\{0,1\} there exists a CNF formula such that the truth table of ff equals the truth table of that formula.

3-SAT: the set of all CNF formulas in which every clause has 3 literals.

claim. CNFSATmp3SATCNF-SAT\leq_m^p 3SAT.
split clause into pieces, eg p1p2p3p4(p1p2p3)(¬p1p3p4)p_1\lor p_2\lor p_3\lor p_4\equiv(p_1\lor p_2\lor p_3)\land(\lnot p_1\lor p_3\lor p_4).

claim. 3SATmPK-Clique3SAT\leq_m^P \text{K-Clique}.
proof. given φ=C1C2...Ck=(l11l12l12)...(lk1lk2lk2)\varphi=C_1\cap C_2\cap...\cap C_k=(l_{11}\lor l_{12}\lor l_{12})\land...\land(l_{k1}\lor l_{k2}\lor l_{k2}) as input of 3SAT. for each clause, spawn 3 vertices. we connect every pair of vertices by an edge except if either the pair belongs to the same triplet, or labeled with contradictory literals. then if φ\varphi is satisfiable iff GG has a k-clique. \square

defn. a language LL is NP-hard if SATmPLSAT\leq_m^P L.

defn. LL is NP-complete if LNPL\in NP and for all VNPV\in NP, VmPV\leq_m^P to LL.

theorem. (cook-levin) for every LL in NPNP, LmPSATL\leq_m^PSAT, ie SATSAT is NP-complete.

corollary. K-Clique\text{K-Clique} is NP-complete.

corollary. if SATSAT can be solved by poly time, then so can any problem in NPNP.

corollary. for every LL such that SATmPLSAT\leq_m^P L, if LL is solvable in poly time, then so is every problem in NPNP.

after this theorem, we know P=NP    3SATPP=NP\iff 3SAT\in P (or any other NPC problem).

many practical problems can be cast as instances of SAT (hardware verification).

there are lots of SAT solvers.

Week 14. Apr 4

outline of proof for cook-levin: given an NP language LL, we want to find a poly time reduction ff such that wLw\in L iff f(w)f(w) is satisfiable formula. we know there is some non-deterministic TM M={Σ,Γ,Q,q0,qacc,δ}M=\{\Sigma,\Gamma,Q,q_0,q_{\text{acc}},\delta\} with wLw\in L iff some computation of MM that starts with input ww and ends in qaccq_\text{acc}. let kk be such that each run of MM on ww consists of at most wk|w|^k steps.

we now construct a prop formula φM,w\varphi_{M,w} that encodes a run of MM on ww.

img

the number of cols is roughly wk|w|^k. we want our formula to express the following:

for each i,jnk,sγQi,j\leq n^k,s\in\gamma\cup Q, we will have a prop variable pijsp_{ijs}. we give this variable true iff ss is the content of the (i,j)(i,j) entry of the matrix. then the first row can be described by:

p11#p12q0...pijwj2p_{11\#}\land p_{12q_0}\land...\land p_{ijw_{j-2}}

and more...

i,j;qQpijq,i,j;rΓpijr,i;jj;q,qQpijqpijq\bigvee_{i,j;q\in Q}p_{ijq}, \bigvee_{i,j;r\in \Gamma}p_{ijr}, \bigwedge_{i; j\neq j';q,q'\in Q}p_{ijq}\rightarrow p_{ij'q'}

and continue in this fashion we will get a formula satisfiable iff we have a valid run on the input.

are there languages LL such that LNPL\in NP and LPL\notin P? if all LNPL\in NP except empty and universe is NPCNPC then P=NPP=NP.

eg. we say a graph G=(V,E)G=(V,E) has a k-size vertex cover if there is some WV,W=kW\subseteq V,|W|=k such that every eEe\in E contains one vertex in WW. let LVC={G,k:G has vertex cover of size k}L_{VC}=\{\lang G,k\rang:G\text{ has vertex cover of size }k\}. then it is NP-complete.
it suffices to show 3SATmPLVC3SAT\leq_m^P L_{VC}. given 3SAT formula φ\varphi, construct a graph GG and a number kk.

then the total number of vertices is 2m+3l2m+3l where mm is number of variables and ll is number of clauses. choose k=m+2lk=m+2l.

img

...

roughly speaking, we view problems in P as feasible (solvable in measurable time) and NP-hard problems as infeasible. the difference is not so sharp

  1. a problem can be outside P due to just very rare hard instances
  2. being in P does not yield easy solutions since the poly time exponent may be to large

consider following scenario: SATPSAT\notin P but there is algo that solves SATSAT in time ns(n)n^{s(n)} where ss is a very slow growing function (inverse of difficult-to-compute fast growing function) and we cannot notice the difference. still cannot prove.

aside

oracle computation: endow a turing machine with the ability to query some 'oracle language' LL. the MLM^L machine has a new query tape, and when it writes some ww on that tape, it gets bit 1 iff wLw\in L.

circuit complexity: