Page: https://cs.uwaterloo.ca/~lapchi/cs341/index.html

Instructor: Lap Chi LAU

Time: W,F 11:20am-12:50pm EST

Week 1. May 11

computation model

eg. graph problem on n vertices -> use \log_2n bits to label vertices (assume they fit into a word)

eg. binary search done in O(\log_2n) assumes constant-time random access

eg. (3-sum) have n numbers a_1,a_2,...,a_n and a target number c, find indices i, j, k such that a_i+a_j+a_k=c or report that no such triple exists.

algo1: observe this can be rewritten as c-a_i-a_j=a_k. idea: enumerate all pairs a_i,a_j and check whether c-a_i-a_j=a_k for some k.

sort(a)
for i = 1, n:
    for j = 1, n:
        if binary_search(a, c - a[i] - a[j]) is not nullptr:
            report i, j and the found index

runtime: O(n\log n)+O(n^2\log n)=O(n^2\log n)

(in worst case, hashing does not guarantee constant retrieval)

algo2: this can be rewritten as a_i+a_j=c-a_k. idea: enumerate each a_k and check whether a_i+a_j=c-a_k for some pair i, j.

sort(a)
for k = 1, n:
    // solve the 2-sum problem with target c - a[k] in O(n) given a sorted array

runtime: O(n^2)

eg. (2-sum) have n numbers a_1\le a_2\le...\le a_n (sorted!) and a target number b, find indices i, j such that a_i+a_j=b or report that no such pair exists.

b = 19: |1 |4 |6 |8 |9 |11|14|20|
         ^                    ^
         L                    R

auto L = 1
auto R = n
while L <= R:    // allow equal
    auto sum = a[L] + a[R]
    if sum == b:
        report L, R
    else if sum > b:
        --R
    else:
        ++L
report no such pair

runtime: each iteration R-L decreases by 1, hence at most n iterations => O(n).
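the two-pointer scan, and algo2's use of it for 3-sum, can be sketched in runnable python (a sketch; function names and the tuple-return convention are mine, indices are 0-based into the sorted copy, and repeated indices are allowed per the "allow equal" note):

```python
def two_sum(a, b):
    """a is sorted; return (i, j) with i <= j and a[i] + a[j] == b, or None."""
    L, R = 0, len(a) - 1
    while L <= R:                # allow L == R, matching the notes
        s = a[L] + a[R]
        if s == b:
            return (L, R)
        elif s > b:
            R -= 1               # sum too big: shrink from the right
        else:
            L += 1               # sum too small: grow from the left
    return None

def three_sum(a, c):
    """Return indices (i, j, k) into the sorted copy with sum c, or None."""
    a = sorted(a)                # O(n log n) once
    for k in range(len(a)):
        pair = two_sum(a, c - a[k])   # O(n) per choice of k -> O(n^2) total
        if pair is not None:
            return (pair[0], pair[1], k)
    return None
```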

proof of correctness:

solving recurrence

mergesort

see CS240 (recursion tree)

lecture this time says T(n) = cnlogn (tree has logn levels) - why?

eg. T(n)=4T(\frac{n}{2})+n=O(n^2)
in the recursion tree, the levels have n, 4\cdot\frac{n}{2}=2n, 16\cdot\frac{n}{4}=4n, 64\cdot\frac{n}{8}=8n, ..., 4^{\log n}\cdot1=n^2 (increasing). the order is dominated by the leaf level. \square

eg. T(n)=2T(\frac{n}{2})+n^2=O(n^2)
in the recursion tree, the levels have n^2, 2(\frac{n}{2})^2=\frac{n^2}{2}, 4(\frac{n}{4})^2=\frac{n^2}{4}, ... (decreasing). the order is dominated by the root level. \square

eg. mergesort T(n)=2T(\frac{n}{2})+n: the level sums are all equal (n each), so the order is the sum over all \log n levels, O(n\log n). \square

Week 2. May 19

recursion tree in general setting

theorem. (master theorem) consider T(n)=aT(\frac{n}{b})+n^c where constants a\ge1,b>1,c\ge0 (a is an integer), and let r:=\frac{a}{b^c}. then T(n)=O(n^c) if r<1, T(n)=O(n^c\log n) if r=1, and T(n)=O(n^{\log_ba}) if r>1.

proof. at the root level, work done is n^c; next level: a(\frac{n}{b})^c; next level: a^2(\frac{n}{b^2})^c; ....
at the leaf level, the problem size should be bounded by a constant, ie \frac{n}{b^i}\le K for the level number i. we then have i\ge\log_bn-\log_bK; eg we can let i=\log_bn. the number of leaves is a^{\#\text{ levels}}=a^i=a^{\log_bn}=n^{\log_ba}.
the work done on each level is a geometric sequence with ratio \frac{a}{b^c}=r.

  1. if \frac{a}{b^c}=1, then every level has the same work. total work = (# levels)(work at each level) = n^c\log_bn.
  2. if \frac{a}{b^c}<1, then the root level dominates: total work = \sum_{i=0}^{\log_bn}n^c(\frac{a}{b^c})^i<n^c\sum_{i\ge0}(\frac{a}{b^c})^i=n^c\frac{1}{1-\frac{a}{b^c}}=O(n^c).
  3. if \frac{a}{b^c}>1, the sequence grows geometrically and the leaf level dominates: total work = O(n^{\log_ba}). \square

eg. T(n)=T(\sqrt{n})+1.
note T(n)=(T(n^{\frac{1}{4}})+1)+1=.... the work done on each level is 1, so the total work = # levels. at the bottom level i we need n^{\frac{1}{2^i}}\le K\Rightarrow\frac{1}{2^i}\log n\le\log K\Rightarrow\log n\le2^i\log K\Rightarrow i\ge\log\log n-\log\log K for the bound K. so once i is this large, the problem size is bounded by the constant, and T(n)=O(\#\text{ levels})=O(\log\log n). \square

eg. T(n)=T(n-1)+T(n-2)+n^c=O(1.618...^n\cdot n^c).

divide & conquer

counting inversion pairs

eg. given n distinct numbers a_1,...,a_n, count the number of "inversion" pairs with i<j, a_i>a_j.

// idea 2
count_sort(A[1 : n]):
    if n == 1: return 0
    int n1 = count_sort(A[1 : n/2])      // T(n/2)
    int n2 = count_sort(A[n/2+1 : n])    // T(n/2)
    int nm = merge_count(A)              // O(n)
    return n1 + nm + n2

merge_count(A[1 : n]):
    int sum = 0
    // same as merge_sort's merge function:
    // two ptrs l and r start at the beginning of the two sorted halves.
    // whenever we take the value at r from the right half, every remaining
    // element from l to the end of the left half is bigger than it, so we
    // add the # of those elements to sum.
    return sum

idea 1 in total uses T(n)=2T(\frac{n}{2})+O(n\log n)=O(n\log^2n).

idea 2 in total uses T(n)=2T(\frac{n}{2})+O(n)=O(n\log n).
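a runnable python sketch of idea 2 (names mine; returns the sorted copy together with the count):

```python
def count_inversions(a):
    """Return (sorted copy of a, # of pairs i < j with a[i] > a[j])."""
    n = len(a)
    if n <= 1:
        return a[:], 0
    left, n1 = count_inversions(a[:n // 2])
    right, n2 = count_inversions(a[n // 2:])
    merged, nm = [], 0
    l = r = 0
    while l < len(left) or r < len(right):
        if r == len(right) or (l < len(left) and left[l] <= right[r]):
            merged.append(left[l]); l += 1
        else:
            nm += len(left) - l   # everything left in `left` is bigger
            merged.append(right[r]); r += 1
    return merged, n1 + n2 + nm
```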

maximum sum subarray

eg. given n numbers a_1,...,a_n, find i, j that maximize \sum_{k=i}^ja_k.

runtime: T(n)=2T(\frac{n}{2})+O(n)=O(n\log n). (this is not the best)
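a python sketch of the divide & conquer version (names mine; the subarray crossing the middle is found by two linear scans out from the midpoint):

```python
def max_subarray(a):
    """Maximum sum of a non-empty contiguous subarray, T(n)=2T(n/2)+O(n)."""
    def solve(lo, hi):                 # half-open range [lo, hi)
        if hi - lo == 1:
            return a[lo]
        mid = (lo + hi) // 2
        best = max(solve(lo, mid), solve(mid, hi))
        # best crossing subarray = best suffix sum of the left half
        # + best prefix sum of the right half
        s, best_left = 0, float('-inf')
        for i in range(mid - 1, lo - 1, -1):
            s += a[i]
            best_left = max(best_left, s)
        s, best_right = 0, float('-inf')
        for i in range(mid, hi):
            s += a[i]
            best_right = max(best_right, s)
        return max(best, best_left + best_right)
    return solve(0, len(a))
```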

closest pair

eg. given n points (x_1,y_1),...,(x_n,y_n) on the 2d plane, find a pair 1\le i<j\le n that minimizes \sqrt{(x_i-x_j)^2+(y_i-y_j)^2}.

the algo:

  1. base case: if there is 1 point, minimum distance is inf; if there are 2 points, distance is their distance
  2. find the dividing line L by computing the median using the x-values. O(n)
  3. solve two subproblems. 2T(n/2)
  4. using a linear scan, remove all points not within the narrow region. O(n)
  5. sort points in non-decreasing order by their y-values. O(nlogn)
  6. for each point, compute its distance to the next 11 points in this y-ordering (imagine flattening the boxes). O(n)
  7. return the minimum distance found, and the two points

runtime: T(n)=2T(\frac{n}{2})+O(n\log n)=O(n\log^2n)

where do we use the assumption of the x-values being distinct?

integer multiplication

eg. given two n-bit binary numbers a=a_1...a_n, b=b_1...b_n, find ab.

Karatsuba's algorithm

matrix multiplication

eg. given matrices A,B\in\mathbb{R}^{n\times n}, find AB.

Strassen's algorithm: to multiply two matrices \in\mathbb{R}^{2n\times 2n}, we divide each into 4 submatrices \in\mathbb{R}^{n\times n}, and

AB=\begin{bmatrix} P_5+P_4-P_2+P_6 & P_1+P_2\\ P_3+P_4 & P_1+P_5-P_3-P_7 \end{bmatrix}

where

\begin{aligned} P_1&=A_{11}(B_{12}-B_{22})\\ P_2&=(A_{11}+A_{12})B_{22}\\ P_3&=(A_{21}+A_{22})B_{11}\\ P_4&=A_{22}(B_{21}-B_{11})\\ P_5&=(A_{11}+A_{22})(B_{11}+B_{22})\\ P_6&=(A_{12}-A_{22})(B_{21}+B_{22})\\ P_7&=(A_{11}-A_{21})(B_{11}+B_{12}) \end{aligned}

runtime: T(2n)=7T(n)+O(n^2)\Rightarrow T(n)=O(n^{\log_27}). if n\ge5000 or so, it is faster than the naive algorithm.

puzzle: determine whether a graph contains a triangle. hint: each entry of Adj^i is the # of length-i walks between the corresponding vertices. check Adj against Adj^2.

Week 3. May 26

median-finding

eg. have n distinct numbers a_1,...,a_n, find the median.

eg. relaxed: have n distinct numbers, find the k-th smallest number.

  1. find a pivot aia_i, put all smaller elements on the left, bigger elements on the right
  2. let r be the rank of aia_i

runtime: T(n)\le\max\{T(r-1),T(n-r)\}+O(n).

we use this method to find the pivot in linear time:

median of medians

  1. divide the n elements into \frac{n}{5} groups of 5 numbers each (O(n))
  2. compute the median of each group, call them b_1,b_2,...,b_{\frac{n}{5}} (O(n))
  3. return the median of these group medians, computed recursively
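combining the pivoting steps above with the median-of-medians pivot gives a worst-case linear quickselect; a python sketch (names mine; assumes distinct numbers and a 1-indexed rank k):

```python
def select(a, k):
    """Return the k-th smallest (1-indexed) of distinct numbers a."""
    if len(a) <= 5:
        return sorted(a)[k - 1]                   # constant-size base case
    # median of medians: medians of groups of 5, then their median as pivot
    groups = [a[i:i + 5] for i in range(0, len(a), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    pivot = select(medians, (len(medians) + 1) // 2)
    smaller = [x for x in a if x < pivot]
    bigger = [x for x in a if x > pivot]
    r = len(smaller) + 1                          # rank of the pivot
    if k == r:
        return pivot
    elif k < r:
        return select(smaller, k)
    else:
        return select(bigger, k - r)
```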

puzzle: what if we divide into groups of 3, 7, 9, ..., \sqrt{n}?

graph connectivity

adjacency matrix vs. adjacency list

defn. two vertices u,v are connected if there is a path from u to v.

defn. a subset of vertices S\subseteq V is connected if u,v are connected for all u,v\in S.

defn. a graph is connected if u,v are connected for all u,v\in V.

defn. a connected component is a maximally connected subset of vertices.

eg. determine whether a graph is connected.
pick s, run bfs. then graph is connected iff visited[v] = true for all v.

eg. find all connected components of a graph.
pick s, run bfs to find reachable vertices. this is one component. then do the same for remaining vertices. (O(n + m))

eg. determine whether two vertices u, v are connected.
run bfs from u, check visited[v] = true.

eg. find shortest path between two vertices u, v. see next!

breadth first search

motivation: ask friend networks eg.

// G = (V, E), s ∈ V
// output: vertices reachable from s
reachable_vertices(G, s):
    auto visited = {v: false for v in G.V}
    auto Q = new queue
    Q.enqueue(s)
    visited[s] = true
    while Q is not empty:
        auto u = Q.dequeue()
        for each neighbor v of u:
            if not visited[v]:
                Q.enqueue(v)
                visited[v] = true
    return [v for v, b in visited if b]

runtime: each vertex is enqueued at most once; when we dequeue a vertex u, the for loop runs deg(u) steps. total time: O(n+\sum_{v\in V}\mathrm{deg}(v))=O(n+m). (recall handshake lemma: sum of degrees = 2 * number of edges)

correctness:

BFS tree

how to trace out a path from s to v?

proof. use induction. base case is one vertex: clear. then in every for-loop iteration we already have a tree (the parent dict); when we encounter a new unvisited vertex, we are just extending from a visited vertex (as parent) to this new child, hence we maintain a tree. (|E|=|V|-1)
for shortest path: when we enqueue & dequeue vertices, the distance-1 vertices are always in the front (because we discovered them early, and we are using a queue), then the distance-2 vertices, ... (closer vertices always in the front). we build the tree in this order, so the tree paths reflect the shortest paths. \square

eg.

// G = (V, E), s ∈ V
bfst(G, s):
    auto visited = {v: false for v in G.V}
    auto Q = new queue
    auto parent = {}
    auto distance = {}
    Q.enqueue(s)
    visited[s] = true
    parent[s] = nullptr
    distance[s] = 0    // not necessary
    while Q is not empty:
        auto u = Q.dequeue()
        for each neighbor v of u:
            if visited[v]: continue
            Q.enqueue(v)
            visited[v] = true
            parent[v] = u
            distance[v] = distance[u] + 1

bipartite graphs

defn. a graph G=(V,E) is bipartite if there is a bipartition (L,R) of V such that every edge has one endpoint in L and one endpoint in R.

eg. check whether a graph is bipartite.

  1. assume G is connected (otherwise run the algo on each component)
  2. let L=\{v\mid\mathrm{dist}(s,v)\text{ is even}\}, R=\{v\mid\mathrm{dist}(s,v)\text{ is odd}\} (use bfst)
  3. if there is an edge with both endpoints in L, or both in R, then G is not bipartite.
  4. otherwise it is.

runtime: O(|V|+|E|).
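a python sketch of this bfs-layer check (names mine; the graph is assumed to be a dict of neighbor lists, and all components are handled by restarting the bfs):

```python
from collections import deque

def is_bipartite(adj):
    """adj: {vertex: [neighbors]}. 2-color by bfs distance parity."""
    dist = {}
    for s in adj:                    # restart bfs in every component
        if s in dist:
            continue
        dist[s] = 0
        Q = deque([s])
        while Q:
            u = Q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    Q.append(v)
    # an edge with both endpoints even (L) or both odd (R) kills bipartiteness
    return all(dist[u] % 2 != dist[v] % 2 for u in adj for v in adj[u])
```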

correctness:

remark.

motivation: find path from s to t in a maze

dfs(G, s):
    auto visited = {v: false for v in G.V}
    visited[s] = true
    explore(u):
        for each neighbor v of u:
            if not visited[v]:
                visited[v] = true
                explore(v)
    explore(s)
    return [v for v, b in visited if b]

DFS tree

how to trace out a path from s to v?

defn. usual root, parent, ancestor, descendant

defn. a non-tree edge uv is called a back edge if u is an ancestor or descendant of v in the dfs tree.

theorem. in an undirected graph, all non-tree edges are back edges.
proof. consider a non-tree edge uv, and wlog suppose u is discovered first. since uv is an edge, it is scanned at some point while exploring u; if v were still unvisited at that moment, uv would become a tree edge. so v is visited after u is discovered and before u finishes, hence v is a descendant of u and uv is a back edge. \square

Week 4. June 2

starting & finishing time

we can record the time when a vertex is first visited and when a vertex finished exploring:

auto visited = {v: false for v in G.V}
visited[s] = true
auto start = {}
auto finish = {}
auto time = 1
explore(u):
    start[u] = time++
    for each neighbor v of u:
        if not visited[v]:
            visited[v] = true
            explore(v)
    finish[u] = time++
explore(s)

theorem. (parenthesis) the intervals [start[u], finish[u]] and [start[v], finish[v]] are either disjoint or one interval contains the other.
proof.

cut vertices and cut edges

suppose GG is connected,

find all cut vertices and cut edges.

claim. in the dfs tree, a subtree T_i below v is a connected component in G-v iff there are no edges with one endpoint in T_i and the other endpoint at an ancestor of v.
proof:


hence we only need to consider a few cases.

lemma. for a non-root vertex v in a dfs tree, v is a cut vertex iff there is a subtree below v with no edges going to an ancestor of v.
proof:

lemma. for the root vertex v of a dfs tree, v is a cut vertex iff v has at least two children.
proof:

idea: for each subtree T_v rooted at v, find the vertex reached by back edges from T_v that is closest to the root; record its start time as low[v].

how to find cut edges? a tree edge uv is a cut edge iff low[v] > start[u]; a back edge can never be a cut edge (it lies on a cycle).
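a python sketch of the low-value computation for both cut vertices and cut edges (names mine; assumes a connected simple undirected graph as a dict of neighbor lists; the parent check would need edge ids to handle parallel edges):

```python
def cut_vertices_and_edges(adj):
    """Return (set of cut vertices, set of tree cut edges (u, v))."""
    start, low, time = {}, {}, [1]
    cut_vs, cut_es = set(), set()

    def explore(u, parent):
        start[u] = low[u] = time[0]
        time[0] += 1
        children = 0
        for v in adj[u]:
            if v not in start:                  # tree edge u -> v
                children += 1
                explore(v, u)
                low[u] = min(low[u], low[v])
                if parent is not None and low[v] >= start[u]:
                    cut_vs.add(u)               # subtree at v can't skip u
                if low[v] > start[u]:
                    cut_es.add((u, v))          # tree edge uv is a cut edge
            elif v != parent:                   # back edge
                low[u] = min(low[u], start[v])
        if parent is None and children >= 2:
            cut_vs.add(u)                       # root rule: >= 2 children

    explore(next(iter(adj)), None)
    return cut_vs, cut_es
```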

directed graphs

defn. if uv is a directed edge, then u is the tail and v is the head.

defn. the in-degree of a vertex v, \mathrm{indeg}(v), is the number of edges with v being the head. v is a source if its indeg is 0.

defn. the out-degree of a vertex v, \mathrm{outdeg}(v), is the number of edges with v being the tail. v is a sink if its outdeg is 0.

defn. a directed graph is a directed acyclic graph if there is no directed cycle.

defn. t is reachable from s if there is a directed path from s to t.

defn. a directed graph G=(V,E) is strongly connected if for every pair of vertices u,v\in V, u is reachable from v and v is reachable from u.

defn. a subset S\subseteq V is called strongly connected if for every pair of vertices u,v\in S, u is reachable from v and v is reachable from u (paths can go outside S)

defn. a subset S\subseteq V is a strongly connected component if S is a maximally strongly connected subset, ie S is strongly connected but S+v is not for any v\in V-S.

eg. determine if a graph is strongly connected.

eg. determine if a graph is a DAG.

eg. find all strongly connected components of a directed graph.

both DFS and BFS are same as undirected graphs, except that we only explore out-neighbors.

BFS tree

DFS tree

strongly connected graphs

given a graph G=(V,E), return yes if G is strongly connected.

claim. G is strongly connected iff every vertex v\in V is reachable from s and s is reachable from every vertex u\in V, where s is arbitrary.
proof.

claim. given G=(V,E), we reverse the direction of all edges to obtain G^R=(V,E^R). then there is a path from v to s in G iff there is a path from s to v in G^R.

algo:

runtime: O(n+m)
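a python sketch of this check (names mine; one search from an arbitrary s in G, one in G^R):

```python
def is_strongly_connected(adj):
    """adj: {v: [out-neighbors]}. Check reachability from s in G and G^R."""
    def reachable(graph, s):
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in graph[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen
    rev = {v: [] for v in adj}          # build G^R
    for u in adj:
        for v in adj[u]:
            rev[v].append(u)
    s = next(iter(adj))                 # arbitrary start vertex
    return len(reachable(adj, s)) == len(adj) and \
           len(reachable(rev, s)) == len(adj)
```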

eg. find shortest walk from s to t with odd length.
copy the graph and have odd vertices and even vertices. find path from even s to odd t.

DAG

given a graph G=(V,E), return yes if it is DAG.

prop. if a directed graph is acyclic, then one can find a vertex with indegree 0.
proof. suppose otherwise that every vertex in G has indegree at least 1. pick any vertex and repeatedly walk backward along an incoming edge (always possible since every indegree is \ge1). after n+1 steps some vertex repeats, and the walk between its two occurrences is a directed cycle, a contradiction. \square

defn. a topological ordering is an ordering of vertices such that all edges go forward.

prop. a directed graph is acyclic iff there exists a topological ordering of the vertices.
proof.

algo:

runtime: O(n+m) (kahn's algo)
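kahn's algo in runnable python (a sketch; the function name and the dict-of-out-neighbors representation are mine):

```python
from collections import deque

def topological_order(adj):
    """adj: {v: [out-neighbors]}. Return a topological ordering,
    or None if the graph has a directed cycle."""
    indeg = {v: 0 for v in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1
    Q = deque(v for v in adj if indeg[v] == 0)   # current sources
    order = []
    while Q:
        u = Q.popleft()
        order.append(u)
        for v in adj[u]:                 # deleting u frees some vertices
            indeg[v] -= 1
            if indeg[v] == 0:
                Q.append(v)
    return order if len(order) == len(adj) else None
```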

DAG (2)

theorem. (parenthesis) the intervals [start[u], finish[u]] and [start[v], finish[v]] are either disjoint or one interval contains the other.
proof. same as undirected. \square

lemma. if G is a DAG, then for any edge uv, finish[v] < finish[u] in any dfs.
proof.:

algo:

runtime: O(n+m)

Week 5. June 9

strongly connected components

find all strongly connected components of a directed graph.

claim. two strongly connected components must be vertex-disjoint.
proof. otherwise the common vertex reaches and is reachable from all vertices in both components, then the two 'components' are not maximal, and we actually only have one strongly connected component. \square

observation. contracting each strongly connected component results in a DAG (otherwise components are not maximal).

observation. suppose we start a bfs/dfs at a vertex of a 'sink component'; then we identify exactly the vertices in that sink component.

lemma. if C_1,C_2 are strongly connected components and there are directed edges from C_1 to C_2, then the largest finishing time in C_1 is larger than the largest finishing time in C_2.
proof.

observation. if we have the topological ordering (decreasing finishing time), then do dfs again using this ordering, then we will always first visit an 'ancestor component' before visiting a 'descendant component'.

algo

  1. run dfs on whole graph and sort vertices by decreasing finishing time
  2. let c = components
  3. for i=1,n do:

runtime: O(m+n)
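a python sketch of this two-pass (kosaraju-style) algorithm, with the second pass peeling components on the reversed graph in decreasing finishing order (names mine):

```python
def strongly_connected_components(adj):
    """adj: {v: [out-neighbors]}. Return a list of SCCs (as sets)."""
    order, seen = [], set()
    def dfs1(u):
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                dfs1(v)
        order.append(u)              # postorder = increasing finishing time
    for u in adj:
        if u not in seen:
            dfs1(u)
    rev = {v: [] for v in adj}       # G^R
    for u in adj:
        for v in adj[u]:
            rev[v].append(u)
    comps, assigned = [], set()
    for u in reversed(order):        # decreasing finishing time
        if u in assigned:
            continue
        comp, stack = set(), [u]     # everything reachable in G^R and not
        assigned.add(u)              # yet assigned is exactly u's component
        while stack:
            x = stack.pop()
            comp.add(x)
            for y in rev[x]:
                if y not in assigned:
                    assigned.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps
```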

greedy algorithms

work by using simple and/or local rules to make decisions and then commit on them.

minimizing total completion time

have n jobs with processing times p_1,...,p_n. find an ordering of the jobs so as to minimize the total completion time, where the completion time of a job is the time when it is finished.

correctness: if the ordering is not non-decreasing, we show it is not optimal. there exists an adjacent inversion pair, p[i] > p[i+1]. if we swap these two jobs, the completion times of all other jobs stay the same (jobs before are untouched, and jobs after start at the same time because p[i]+p[i+1] does not change). for these two jobs, the old ordering contributes p[i] + (p[i]+p[i+1]) to the total, while the new ordering contributes p[i+1] + (p[i+1]+p[i]). the new ordering is better. \square

minimizing total completion time (2)

have n jobs, each with processing time p_i and weight w_i. find an ordering of the jobs so as to minimize the total weighted completion time, where the weighted completion time of a job is its weight times its completion time.

interval scheduling

given n intervals [s_1,f_1],...,[s_n,f_n], find a maximum subset of disjoint intervals.

there are multiple natural greedy algos:

algo:

  1. sort the intervals so that f_1\le...\le f_n. let the initial solution S=\varnothing
  2. for 1\le i\le n: if interval [s_i,f_i] is disjoint from all intervals in S, add it to S
  3. return S

runtime: O(nlogn + n)

claim. there exists an optimal solution with I_1=[s_1,f_1] chosen.
proof. suppose there exists an optimal solution S that does not include I_1. only the first interval of S can conflict with I_1: since I_1 has the smallest finishing time, it finishes before the second interval of S starts. so we can swap the first interval of S with I_1 and the result is still feasible and optimal. \square

lemma. there is an optimal solution using the first k intervals of the greedy solution for any k.
proof. use induction on k. when k=1, this is the claim above. for larger k, remove the first k-1 (disjoint) intervals and reuse the same exchange argument on the remaining subproblem. \square

if the number of intervals in solution is l, we apply the lemma with k=l.
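the greedy algo in runnable python (a sketch; names mine; intervals are treated as closed, so touching endpoints count as overlapping):

```python
def max_disjoint_intervals(intervals):
    """intervals: list of (s, f). Greedy by earliest finishing time."""
    S = []
    for s, f in sorted(intervals, key=lambda itv: itv[1]):
        if not S or s > S[-1][1]:   # starts after the last chosen finish
            S.append((s, f))
    return S
```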

interval coloring

given n intervals [s_1,f_1],...,[s_n,f_n], find a coloring of the intervals using the minimum # of colors possible, so that each interval gets one color and any two overlapping intervals get different colors.

algo:

  1. sort the intervals so that s_1\le...\le s_n.
  2. for 1\le i\le n: use the minimum available color to color the i-th interval

correctness: it is sufficient to show if the algo returns k colors, it is because at one moment there are k intervals pairwise overlapping with each other.

(figure: the current interval I_i together with k intervals 1, 2, 3, ..., k on a time axis, all containing the start time s_i)

since all these intervals contain the start time s_i (we sorted by start time), there are k pairwise-overlapping intervals, so k colors are necessary.
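a python sketch of the greedy coloring using a heap of released colors (names mine; closed intervals, so a shared endpoint counts as overlap):

```python
import heapq

def color_intervals(intervals):
    """intervals: list of (s, f). Return {interval: color}, colors 0-based."""
    coloring = {}
    free = []        # min-heap of released colors
    busy = []        # min-heap of (finish, color) for active intervals
    next_color = 0
    for s, f in sorted(intervals):           # non-decreasing start time
        while busy and busy[0][0] < s:       # strictly before s: released
            _, c = heapq.heappop(busy)
            heapq.heappush(free, c)
        if free:
            c = heapq.heappop(free)          # minimum available color
        else:
            c = next_color                   # open a new color
            next_color += 1
        coloring[(s, f)] = c
        heapq.heappush(busy, (f, c))
    return coloring
```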

Week 6. June 16

huffman coding

defn. we construct a prefix code so that no encoded string is a prefix of another encoded string.

given n symbols with frequencies f_1,...,f_n, \sum_{i=1}^nf_i=1. find a binary decoding tree T with n leaves that minimizes \sum_{i=1}^nf_i\cdot\mathrm{depth}_T(i) (the average length of encoded strings).

observation. any optimal binary decoding tree is full (every internal node has two children).
proof. because we can contract any internal node with a single child (splicing the child upward) without affecting decodability. \square

(figure: an internal node with a single child w is contracted so that w takes its place under the root r)

corollary. in the decoding tree there are at least two leaves of maximum depth that are siblings.

observation. (simple claim) there is an optimal solution in which the two symbols with the lowest frequencies are assigned to leaves of maximum depth, and furthermore they are siblings.
proof. if we have f_i>f_j but \mathrm{depth}(i)>\mathrm{depth}(j), note:

\begin{aligned}&\quad\,\,(f_i\cdot\mathrm{depth}(i)+f_j\cdot\mathrm{depth}(j))-(f_i\cdot\mathrm{depth}(j)+f_j\cdot\mathrm{depth}(i))\\&=(f_i-f_j)(\mathrm{depth}(i)-\mathrm{depth}(j))\\&>0\end{aligned}

so swapping the two nodes always decreases the objective; hence the two lowest-frequency symbols can be assumed to sit at leaves of maximum depth, and by the corollary they can be made siblings. \square

algo:

  1. if |S|=2, encode one symbol using 0 and the other using 1, and return the tree
          R
       0 / \ 1
        s1  s2
  2. otherwise let y, z be the two symbols with the lowest frequencies.
  3. delete y and z from S. add a new pseudo-symbol w with frequency f_y+f_z
  4. solve this reduced problem (n-1 symbols) recursively and get a tree T'
  5. in T', find the leaf associated with w, add two leaves below it (so that w becomes an internal node) and associate y and z with the two new leaves.

(see cs240 - how are they different?)

correctness: use induction. base case: 2 symbols, correct. if there are more than 2 symbols and our T' is optimal for the reduced problem, then for the new tree T after adding the two symbols, \mathrm{obj}(T)=\mathrm{obj}(T')-f_w\cdot\mathrm{depth}(w)+f_y\cdot(\mathrm{depth}(w)+1)+f_z\cdot(\mathrm{depth}(w)+1)=\mathrm{obj}(T')+(-f_w+f_y+f_z)\cdot\mathrm{depth}(w)+f_y+f_z=\mathrm{obj}(T')+f_y+f_z. we want to show any other tree has obj \ge this.
suppose an optimal solution is T^*. removing y and z from T^* gives a tree T^*{}' that is a feasible solution to the reduced problem. since T' is optimal, we have \mathrm{obj}(T')\le\mathrm{obj}(T^*{}'). using the same calculation we have \mathrm{obj}(T^*)=\mathrm{obj}(T^*{}')+f_y+f_z (1). so \mathrm{obj}(T^*)\ge\mathrm{obj}(T')+f_y+f_z=\mathrm{obj}(T). \square

(1) how can we use the same calculation when T^* might not have w? by the simple claim, y and z can be assumed to be siblings in T^*; use their common parent as w.

runtime: O(nlogn) (use heap for get min)
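the merging loop with a heap, in python (a sketch; names mine; codes are built by prepending a bit at each merge, which is equivalent to reading the final tree down from the root):

```python
import heapq

def huffman(freqs):
    """freqs: {symbol: frequency}. Return {symbol: bit string} prefix code."""
    # (frequency, tie-breaker, partial code dict) triples on a min-heap
    heap = [(f, i, {s: ''}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {next(iter(freqs)): '0'}
    tie = len(heap)
    while len(heap) > 1:
        fy, _, y = heapq.heappop(heap)       # two lowest frequencies
        fz, _, z = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in y.items()}   # y under the 0-branch
        merged.update({s: '1' + c for s, c in z.items()})
        heapq.heappush(heap, (fy + fz, tie, merged))  # pseudo-symbol w
        tie += 1
    return heap[0][2]
```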

single source shortest path

given a directed graph G=(V,E), a non-negative length l_e for each edge e\in E, and a vertex s\in V. find a shortest path from s to v for every vertex v\in V.

using reduction:

we want to simulate this process, but we don't want to follow the subdivided edges step by step. we keep track of an upper bound on the time for each vertex to be found.

    5          1
f <------- s ------> g
                     |
                     | 2
                     v
                     b

algo (dijkstra):

dijkstra(G, s):
    auto parents = {}
    auto dist = {v: ∞ for v in G.V}
    auto Q = new priority_queue
    dist[s] = 0
    for v, bound in dist:
        Q.insert(v, key=bound)
    while Q is not empty:
        auto u = Q.delete_min()
        for each out-neighbor out_v of u:
            auto l = /* length of edge u -> out_v */
            if dist[u] + l < dist[out_v]:
                dist[out_v] = dist[u] + l
                Q.set_key(out_v, dist[out_v])   // "decrease_key()"
                parents[out_v] = u              // may be set multiple times
    return ...

runtime:

puzzle. do not trust this; there are better formulations than it:

Instead of filling the priority queue with all nodes in the initialization phase, it is also possible to initialize it to contain only source; then, inside the if dist[u] + l < dist[out_v] block, the decrease_priority() becomes an add_with_priority() operation if the node is not already in the queue.
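that lazy variant in python with heapq (a sketch; names mine; stale queue entries are skipped on pop instead of using decrease_key):

```python
import heapq

def dijkstra(adj, s):
    """adj: {u: [(v, length)]} with non-negative lengths.
    Return (dist, parent) for vertices reachable from s."""
    dist = {s: 0}
    parent = {s: None}
    Q = [(0, s)]                 # (bound, vertex); may hold duplicates
    done = set()
    while Q:
        d, u = heapq.heappop(Q)
        if u in done:
            continue             # stale entry: u was already finalized
        done.add(u)
        for v, l in adj[u]:
            if v not in dist or d + l < dist[v]:
                dist[v] = d + l
                parent[v] = u
                heapq.heappush(Q, (dist[v], v))   # add_with_priority
    return dist, parent
```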

single source shortest path (2)

idea:

algo:

  1. let dist = {v: ∞ for v in G.V}, dist[s] = 0, R = ∅
  2. while R != G.V:
    1. pick the vertex u \notin R with smallest dist[u]
    2. R = R U {u}
    3. for each edge uv out of u:
      1. if dist[u] + l_{uv} < dist[v]:
        1. dist[v] = dist[u] + l_{uv}

correctness: we want to show the following invariant holds: for any v in R, dist[v] is the shortest path distance from s to v.
proceed by induction. base case R=\{s\}: dist[s] = 0 is shortest. inductive step: suppose R is computed correctly; when we add the next vertex u (reached from some v\in R), we want to show the shortest path from s to u has length dist[v] + l_{vu}.

shortest path tree:

remark. algo breaks when length is negative. it fails at the ... in the proof.

Week 7. June 23

dynamic programming

on a high level, we can solve a problem by dynamic programming if there is a recurrence relation with only a polynomial number of subproblems.

eg. how to compute F(n)=F(n-1)+F(n-2), F(1)=F(2)=1? naive recursion computes the same term too many times and the runtime is O(1.618^n).

in principle, to come up with bottom-up implementation, we just need to use a topological ordering of the 'subproblem dependency graph' to have correct order to solve subproblems.
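a minimal python illustration with the fibonacci recurrence (names mine; the topological ordering of the dependency graph is simply 1, 2, ..., n):

```python
def fib_bottom_up(n):
    """F(1) = F(2) = 1; fill the subproblems in dependency order."""
    F = {1: 1, 2: 1}
    for i in range(3, n + 1):       # each term computed exactly once: O(n)
        F[i] = F[i - 1] + F[i - 2]
    return F[n]
```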

weighted interval scheduling

given n intervals ([s_i,f_i])_{i=1}^n and a weight w_i for each interval i. find a subset of disjoint intervals that maximizes the total weight.

(figure: an example instance with 11 weighted intervals drawn on a timeline)


note: if we already picked 1 and 3 and wish to extend this solution, we are solving the same subproblem as if we had picked 2 and 3 - what really matters is the last interval of the current partial solution.

we use a good ordering of the intervals and precompute some useful info.

runtime (top-down):

we can use bottom-up computation too, from n to 1.

puzzle. how to print the intervals? (linked list to maintain linear time)
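a python sketch of the DP (names mine; intervals as (s, f, w) triples sorted by finishing time, with the compatible prefix found by binary search over the finish times):

```python
import bisect

def max_weight_schedule(intervals):
    """intervals: list of (s, f, w). Max total weight of disjoint intervals."""
    intervals = sorted(intervals, key=lambda t: t[1])
    finishes = [f for _, f, _ in intervals]
    best = [0] * (len(intervals) + 1)   # best[i]: using the first i intervals
    for i, (s, f, w) in enumerate(intervals, start=1):
        # p = # of intervals finishing strictly before s (all compatible)
        p = bisect.bisect_left(finishes, s)
        best[i] = max(best[i - 1],      # skip interval i
                      best[p] + w)      # take interval i
    return best[-1]
```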

subset sum

given n positive integers (a_i)_{i=1}^n and an integer K. find a subset S\subseteq[n] with \sum_{i\in S}a_i=K or report that no such subset exists.

a[1] = 3, a[2] = 2, a[3] = 7, a[4] = 8, ...

note after picking a[1] and a[3], we solve the similar subproblem as having picked a[2] and a[4].

bottom-up:

for i = 1, n:
    for L = 0, K:
        D[i][L] = false
D[n][0] = D[n][a[n]] = true
for i = 1, n:
    D[i][0] = true
for i = n-1, 1, -1:
    for L = 1, K:
        D[i][L] ||= D[i+1][L]
        if L - a[i] >= 0:
            D[i][L] ||= D[i+1][L - a[i]]
return D[1][K]

runtime: nK subproblems; in each subproblem we look up two values => O(nK) (pseudo-polynomial).

remark. if decision problem, we can keep only two rows. space complexity becomes O(K) instead of O(nK).

remark. if using top-down, we do not necessarily compute all subproblems unlike bottom-up.

remark. D[1][K] is true iff there exists a directed path from (1, K) to some nodes with 'true' base case.


knapsack

given n items, each of weight w_i and value v_i, and a positive integer W. find a subset S\subseteq[n] such that \sum_{i\in S}w_i\le W that maximizes \sum_{i\in S}v_i.

approach 1:

runtime: O(nW\cdot\sum_{i}a_i).

approach 2:

runtime: O(nW)
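a python sketch of the O(nW) table, with the rows compressed into a single array (names mine; iterating capacities downward makes each item usable at most once):

```python
def knapsack(items, W):
    """items: list of (weight, value); W: capacity. Max achievable value."""
    best = [0] * (W + 1)        # best[w] = max value within capacity w
    for wt, val in items:
        for w in range(W, wt - 1, -1):   # downward so item is not reused
            best[w] = max(best[w], best[w - wt] + val)
    return best[W]
```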

longest increasing subsequence

given n numbers (a_i)_{i=1}^n, a subsequence is a subset a_{i_1},...,a_{i_k} with i_1<...<i_k. a subsequence is increasing if a_{i_1}<...<a_{i_k}. find an increasing subsequence of maximum length.

[5, 1, 9, 8, 8, 8, 4, 5, 6, 7] => 1 4 5 6 7

runtime: O(n^2)

how to print the sequences?

i | 1  2  3  4  5  6  7  8  9
--+---------------------------
a | 3  8  7  2  6  4 12 14  9
L | 4  3  3  4  3  3  2  1  1
    ^           ^    ^   ^

we know the answer has length 4. trace L from 4 down to 1.

or use parent array.
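a python sketch of the O(n^2) DP with a parent array for printing (names mine; L[i] is the length of a longest increasing subsequence starting at i, as in the table above):

```python
def longest_increasing_subsequence(a):
    """Return one longest strictly increasing subsequence of a."""
    n = len(a)
    L = [1] * n             # L[i]: longest increasing subsequence from i
    parent = [None] * n     # next index in one such subsequence
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            if a[j] > a[i] and L[j] + 1 > L[i]:
                L[i] = L[j] + 1
                parent[i] = j
    i = max(range(n), key=lambda k: L[k])   # start of a longest one
    out = []
    while i is not None:
        out.append(a[i])
        i = parent[i]
    return out
```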

remark. if runtime == # of subproblems, then algo is likely to be optimal. otherwise there might be better algo.

Week 8. June 30

longest increasing subsequence (2)

idea:

define \mathrm{pos}[k] to be the best position to extend a subsequence of length k, ie the position j>i maximizing a_j over \{j:L(j)=k\}, and let m=\max_{i+1\le j\le n}L(j) be the length of a longest increasing subsequence we computed so far. then we only need L(\mathrm{pos}[1]), ..., L(\mathrm{pos}[m]) for future computations.

i | 1  2  3  4  5  6  7  8
--+------------------------
a | 2  7  6  1  4  8  5  3
L | 3  2  2  3  2  1  1  1
    ^  ^           ^

pos[1] = 6, pos[2] = 2, pos[3] = 1

claim. (monotone) a[pos[1]] > ... > a[pos[m]]: it is harder to extend a longer increasing subsequence than a shorter one.
proof. suppose by contradiction there exists j such that a[pos[j]] >= a[pos[j-1]]. take a longest increasing subsequence a_{i_1}=a[pos[j]] < a_{i_2} < ... < a_{i_j} of length j starting at pos[j]. then i_2 starts an increasing subsequence of length j-1, and a_{i_2} > a[pos[j]] >= a[pos[j-1]], contradicting that pos[j-1] is the best (largest-value) position to start a subsequence of length j-1. \square

algo:

auto m = 1     // length of the longest subsequence found so far
pos[1] = n     // base
for i = n-1, 1, -1:
    if a[i] < a[pos[m]]:
        ++m
        pos[m] = i
    else:
        auto j = /* binary search: smallest j st a[pos[j]] <= a[i] */
                 // then a[i] < a[pos[j-1]]
        pos[j] = i
return m

runtime: O(nlogn)

longest common subsequence

given two strings a_1a_2...a_n and b_1b_2...b_m, find the largest k such that there exist i_1<...<i_k, j_1<...<j_k with a_{i_l}=b_{j_l} for 1\le l\le k.

longest increasing subsequence is a special case of this problem, where the second array is the sorted version.

S1 = 3 1 2 8 9 7
S2 = 1 2 3 7 8 9
answer = 1 2 8 9

let C[i][j] be the length of a longest common subsequence of a[i..n] and b[j..m].

runtime: there are O(nm) subproblems, each takes constant time => total O(nm).

bottom-up: count i from n to 1, and count j from m to 1.
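a python sketch of the table (names mine; 0-based, so C[i][j] covers the suffixes a[i:] and b[j:]):

```python
def lcs(a, b):
    """Length of a longest common subsequence of strings/sequences a, b."""
    n, m = len(a), len(b)
    C = [[0] * (m + 1) for _ in range(n + 1)]   # empty-suffix borders = 0
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                C[i][j] = 1 + C[i + 1][j + 1]   # match a[i] with b[j]
            else:
                C[i][j] = max(C[i + 1][j], C[i][j + 1])
    return C[0][0]
```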

edit distance

given two strings a_1a_2...a_n and b_1b_2...b_m, find the minimum k such that we can do k add/delete/change operations to transform a_1a_2...a_n to b_1b_2...b_m.

let D[i][j] be the edit distance of a[i..n] and b[j..m].

runtime: O(nm).

bottom-up: dependency points bottom-right. we only have one corner base case, fill in the bottom row and rightmost column first.
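a python sketch (names mine; same suffix convention as the LCS table, with the bottom row and rightmost column as the base cases):

```python
def edit_distance(a, b):
    """Min # of add/delete/change operations turning a into b."""
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                D[i][j] = m - j               # insert the rest of b
            elif j == m:
                D[i][j] = n - i               # delete the rest of a
            elif a[i] == b[j]:
                D[i][j] = D[i + 1][j + 1]     # free match
            else:
                D[i][j] = 1 + min(D[i + 1][j + 1],   # change
                                  D[i + 1][j],       # delete a[i]
                                  D[i][j + 1])       # add b[j]
    return D[0][0]
```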


puzzle. given a DAG, what is the length of the longest path from s to t? order V topologically, let d[i] be the length of the longest path from s to i, then d[i] = (i == s ? 0 : \max\{d[j] : ji\in E\} + 1), and the answer is d[t].

Week 9. July 7

maximum independent sets on trees

defn. given a graph G=(V,E), a subset of vertices S\subseteq V is an independent set if uv\notin E for all u,v\in S.

given a tree T=(V,E), find an independent set of maximum cardinality.

A -- B -- C -- D sol: A and C, or A and D, or B and D. size 2.

we define subproblem as I[v] be the size of a maximum independent set in the subtree rooted at vertex v.

runtime: there are n subproblems. in total, we look at \sum_v(\#\text{children of }v+\#\text{grandchildren of }v) vertices. this is the same sum as \sum_v(\#\text{parents of }v+\#\text{grandparents of }v)\le\sum_v2=2n.
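a python sketch of this DP (names mine; it uses the equivalent two-state include/exclude formulation rather than the grandchildren recurrence, processed iteratively in reverse preorder so children come before parents):

```python
def max_independent_set_tree(children, root):
    """children: {v: [children]} of a rooted tree. Max independent set size."""
    incl, excl = {}, {}   # best size in v's subtree with v taken / not taken
    order, stack = [], [root]
    while stack:                       # preorder: parent before children
        u = stack.pop()
        order.append(u)
        stack.extend(children[u])
    for u in reversed(order):          # so children are already solved
        incl[u] = 1 + sum(excl[c] for c in children[u])
        excl[u] = sum(max(incl[c], excl[c]) for c in children[u])
    return max(incl[root], excl[root])
```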

puzzle. how to solve max weighted independent set problem?

maximum independent sets on trees (2)

runtime: O(vdeg(v))=O(m+n)=O(n)O(\sum_v\mathrm{deg}(v))=O(m+n)=O(n).
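the two-state recurrence (split I[v] into 'v excluded' and 'v included') can be sketched as follows; the `children` adjacency-list input format is an assumption of this sketch:

```python
def max_independent_set(children, root=0):
    """Max independent set size in a rooted tree.

    Two states per vertex v: max IS size in v's subtree excluding v,
    and including v.  children[v] lists v's children."""
    def solve(v):
        exclude, include = 0, 1
        for c in children[v]:
            ex_c, in_c = solve(c)
            exclude += max(ex_c, in_c)   # child subtree is unconstrained
            include += ex_c              # children of a chosen vertex must be excluded
        return exclude, include
    return max(solve(root))
```

on the path A -- B -- C -- D (rooted at A) this returns 2, matching the example above.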

optimal binary search tree

given n keys k_1<...<k_n and non-negative frequencies f_1,...,f_n with \sum_{i=1}^nf_i=1. find a binary search tree T that minimizes \sum_{i=1}^nf_i\cdot\mathrm{depth}_T(i).

note optimal solution does not necessarily require key with highest frequency at the top - have to respect bst structure.

let C[i][j] be the objective value of an optimal bst with keys ki<...<kjk_i<...<k_j.

runtime: n^2 subproblems, for each we take O(n) possibilities. total: O(n^3).

bottom-up: compute smaller width (j-i) first then larger subproblems.

img

better: Knuth's O(n^2).
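the interval DP can be sketched in python (a minimal sketch: integer weights are used instead of frequencies summing to 1, which only scales the objective, and the root is counted at depth 1 so a one-key tree costs f[0]):

```python
def optimal_bst_cost(f):
    """Minimum of sum f[i] * depth(i) over BSTs on keys 1..n (root at depth 1).

    C[i][j] = optimal cost for the consecutive keys i..j (0-indexed)."""
    n = len(f)
    pre = [0] * (n + 1)                   # prefix sums so F(i, j) is O(1)
    for i, x in enumerate(f):
        pre[i + 1] = pre[i] + x
    F = lambda i, j: pre[j + 1] - pre[i]  # f[i] + ... + f[j]
    C = [[0] * n for _ in range(n)]
    for width in range(n):                # solve smaller widths j - i first
        for i in range(n - width):
            j = i + width
            if width == 0:
                C[i][j] = f[i]
                continue
            best = float("inf")
            for r in range(i, j + 1):     # try every key k_r as the root
                left = C[i][r - 1] if r > i else 0
                right = C[r + 1][j] if r < j else 0
                best = min(best, left + right)
            C[i][j] = best + F(i, j)      # every key gains one level under the root
    return C[0][n - 1]
```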

shortest paths with negative edges

given a directed graph G=(V,E), a length l_e for each edge e\in E, and a source vertex s\in V. find a shortest path from s to v for every vertex v\in V, assuming no negative cycles exist.

what is wrong with dijkstra's: if we suddenly find a negative edge that decreases the distance to a visited vertex m, then all vertices that are reached by m have to be updated.

    1        5
s -----> a -----> b -----> c
|        ^
v        | -1000
g -----> h
   100

if there are no negative edges, after deciding sa and dropping sg, we instantly know sa is shortest to reach a. however, if negative edges exist, we cannot drop sg, as the path s-g-h-a makes dist(a) smaller. update a, b and c!

negative cycles: shortest path distance is not well-defined if a cycle of negative total length appears in the middle of a path.

   -1      -1
s ----> a ----> b ----> t
        ^       |
     -1 |       v -1
        +---c---+

we can keep walking in the cycle - the dist is not bounded below

idea: dijkstra will compute distance to some vertices correctly eg. first vertex on a path

let D[v][i] be the shortest path distance from s to v using at most i edges.

runtime: O(nm) - n rounds, each computing D[·][i+1] from D[·][i] by scanning all m edges.

space: O(n^2) (O(n) if just want distance, only keep D[v][i+1] and D[v][i])

algo (bellman-ford):

dist[s] = 0
for v in V - {s}: dist[v] = inf
for i = 1, n-1:
    for each edge uv in E:
        if dist[u] + L[uv] < dist[v]: // relaxation step
            dist[v] = dist[u] + L[uv]
            parent[v] = u

lemma. if there is a directed cycle C in the edges (parent[v], v), then C must be a negative cycle.
proof. assume there is a directed cycle v1v2...vkv1v_1v_2...v_kv_1 where the final edge is vkv1v_kv_1 (parent of v1v_1 is vkv_k). then for 2ik2\leq i\leq k, the for loop tells us dist[vi]dist[vi1]+lvi1vi\mathrm{dist}[v_i] \geq \mathrm{dist}[v_{i-1}] + l_{v_{i-1}v_i}. if we added the edge vkv1v_kv_1, we also have dist[v1]>dist[vk]+lvkv1\mathrm{dist}[v_1] > \mathrm{dist}[v_k] + l_{v_kv_1}. adding the inequalities we get j=1kdist[vj]>j=1kdist[vj]+e in cyclele0>e in cyclele\sum_{j=1}^k\mathrm{dist}[v_j]>\sum_{j=1}^k\textrm{dist}[v_j]+\sum_{e\text{ in cycle}}l_e\Rightarrow 0>\sum_{e\text{ in cycle}}l_e, which only happens for negative cycles. \square

negative cycles

given a directed graph G=(V,E), a length lel_e for each edge eEe\in E. find a negative cycle.

note D[v][i] is computed correctly even though the graph can have negative cycles (the no-negative-cycle assumption was only used in the previous problem's stopping condition).

idea:

  1. assume every vertex can be reached from vertex s (otherwise do it in every component)
  2. if there are no negative cycles, D[v][k] is finite for all v as k -> ∞
  3. if there is, D[v][k] tends to -∞

claim. D[v][n] = D[v][n - 1] for all vVv\in V iff G has no negative cycles.
proof.

remark. the early termination rule is: stop once D[v][k+1] = D[v][k] for all v.

hence for checking, using the above claim we can do bellman-ford with an extra iteration step.

claim. if D[v][n] < D[v][n - 1] for some v, then in the walk (by tracing parent) between two repeated vertices we find a negative cycle.
proof. then we know the shortest path using at most n edges to get to v must have exactly n edges (otherwise D[v][n] = D[v][n-1]). so there is at least one repeated vertex, meaning there is a cycle in the walk. we suppose otherwise that this cycle is not negative, then D[v][n-1] <= length(W') <= length(W) = D[v][n], where W' is done by using n-1 edges and W is done using n edges. this is a contradiction. \square
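the pseudocode above, made runnable with the extra checking round from the claims (a sketch; it returns None when the extra round still relaxes an edge, i.e. a negative cycle is detected, rather than tracing the cycle itself):

```python
def bellman_ford(n, edges, s):
    """edges: list of (u, v, length).  Returns (dist, parent), or None if a
    negative cycle reachable from s is detected by the extra n-th round."""
    INF = float("inf")
    dist = [INF] * n
    parent = [None] * n
    dist[s] = 0
    for i in range(n):                  # n - 1 rounds + 1 extra checking round
        changed = False
        for u, v, l in edges:
            if dist[u] + l < dist[v]:   # relaxation step
                dist[v] = dist[u] + l
                parent[v] = u
                changed = True
        if not changed:                 # early termination: D[v][i+1] = D[v][i] for all v
            return dist, parent
    return None                         # still changing in round n => negative cycle
```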

all-pairs shortest paths

given a directed graph G=(V,E), a length lel_e for each edge eEe\in E. find shortest path distances from s to t for all s,tVs,t\in V. assume no negative cycles exist.

naive: run bellman-ford from every s. time is O(n\cdot nm)=O(n^2m) (account for # edges!).

define D[i][j][k] to be the shortest path distance from viv_i to vjv_j using {v1,...,vk}\{v_1,...,v_k\} as intermediate vertices.

algo (floyd-warshall):

for i = 1, n: for j = 1, n: D[i][j][0] = (i == j) ? 0 : inf
for each edge v[i]v[j] in E: D[i][j][0] = l[ij]
for k = 1, n:
    for i = 1, n:
        for j = 1, n:
            D[i][j][k] = min(D[i][j][k-1], D[i][k][k-1] + D[k][j][k-1])

runtime: O(n^3)

puzzle. reduce one dimension of D by using adj matrix?
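for the puzzle: updating a single 2-d matrix in place drops the k dimension, since round k never changes D[i][k] or D[k][j] (D[k][k] = 0). a runnable sketch:

```python
def floyd_warshall(n, edges):
    """edges: list of (u, v, length).  Returns the n x n matrix of shortest
    path distances; the k dimension of D is reused in place."""
    INF = float("inf")
    D = [[INF] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = 0
    for u, v, l in edges:
        D[u][v] = min(D[u][v], l)       # parallel edges: keep the shorter
    for k in range(n):                  # allow v_k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D
```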

traveling salesman

given a directed graph G=(V,E), a length lel_e for each edge eEe\in E. find a directed cycle C that visits every vertex exactly once that minimizes eCle\sum_{e\in C}l_e.

naive: try every ordering of the vertices and compute the total length (O(n!\cdot n)).

define C[i][S] to be the shortest path distance from 1 to viv_i with vertex set SVS\subseteq V on the path.

runtime: O(2^n\cdot n) subproblems, each taking O(n) => total O(2^nn^2). exponential, but much better than n!.
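the C[i][S] table can be sketched as a bitmask DP (a sketch of the Held-Karp recurrence; the vertex set S is stored as a bitmask, and vertex 0 is taken as the fixed start):

```python
def tsp(dist):
    """Held-Karp: C[S][i] = shortest path from vertex 0 to vertex i visiting
    exactly the vertices in bitmask S (which contains 0 and i).

    dist: n x n matrix of edge lengths.  Returns the minimum length of a
    cycle visiting every vertex exactly once."""
    n = len(dist)
    INF = float("inf")
    C = [[INF] * n for _ in range(1 << n)]
    C[1][0] = 0                              # start at vertex 0, S = {0}
    for S in range(1 << n):
        if not (S & 1):
            continue                         # every subset must contain the start
        for i in range(n):
            if C[S][i] == INF:
                continue
            for j in range(n):
                if S & (1 << j):
                    continue                 # j already visited
                T = S | (1 << j)             # extend the path by edge i -> j
                cost = C[S][i] + dist[i][j]
                if cost < C[T][j]:
                    C[T][j] = cost
    full = (1 << n) - 1
    return min(C[full][i] + dist[i][0] for i in range(1, n))
```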

Week 10. July 14

bipartite matching

defn. a subset of edges MEM\subseteq E is a matching if edges in M are pairwise vertex disjoint.

defn. given matching M, a vertex v is matched if v is the endpoint of some edge eMe\in M, otherwise v is unmatched/free.

defn. a matching M is a perfect matching if every vertex is matched in M.

bipartite matching

given a bipartite graph G=(X,Y;E), find a maximum cardinality subset of edges that are vertex disjoint.

defn. a path v_1v_2...v_{2k} is an augmenting path with respect to a matching M if v_1 and v_{2k} are both free, and the edges alternate: v_{2i-1}v_{2i}\notin M and v_{2i}v_{2i+1}\in M.

we always have a partial solution (greedily), if we find an augmenting path, then we can remove middle (even) edges and add (odd) edges to increase size of matching by 1.

img

prop. there is augmenting path with respect to M iff M is not a maximum matching.
proof.

algo:

  1. let M := empty
  2. while there is an augmenting path v1...v2kv_1...v_{2k} of M
    1. M = M-\{v_{2i}v_{2i+1}:1\leq i\leq k-1\}\cup\{v_{2i-1}v_{2i}:1\leq i\leq k\}
  3. return M

runtime: each augmentation increases |M| by 1, so at most n iterations; total O(n\cdot T(\text{find augmenting path}))=O(n(m+n)).

faster: O(m\sqrt{n}) (hopcroft and karp)

finding augmenting path

given a bipartite graph G=(X,Y;E) and a matching M. find an augmenting path of M in G or report it does not exist.

because we have bipartite graph, we use directions to encode matching information. we assign one direction (from right to left) for all edges in M, and the other direction for unmatched edges (GMG_M^\rightarrow). then we can follow the directed alternating path.

claim. there exists an augmenting path of M in G iff there exists a directed path from a free vertex on the left to a free vertex on the right in GMG_M^\rightarrow.
proof. clear. \square

naive: do bfs on each free vertex on left (O(nm)).

we can add pseudo source and target nodes connected to free vertices to avoid enumerating all pairs. source in X, sink in Y.

img

claim. there is an augmenting path of M in G iff there exists directed path from s to t in GMG_M^\rightarrow{}'.
proof.: clear, just remove s and t from final answer. \square

algo:

  1. create GMG_M^\rightarrow{}' with directions and super source and sink
  2. use bfs to find a path P from s to t
    1. if yes, return P without s and t
    2. if no, return no

runtime: O(m+n).

puzzle. how to find maximum matching in general graphs? finding aug path step will break. (edmonds)
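the whole pipeline (direct the edges by matched/unmatched, implicit super source and sink, BFS for an augmenting path, flip it) in one python sketch; the (nx, ny, edge-list) input format is an assumption of this sketch:

```python
from collections import deque

def max_bipartite_matching(nx, ny, edges):
    """Max matching size in G=(X,Y;E).  edges: list of (x, y) pairs.

    Each round does one BFS in the directed graph: unmatched edges point
    X -> Y, matched edges Y -> X, starting from all free X vertices at once
    (the implicit super source s)."""
    adj = [[] for _ in range(nx)]
    for x, y in edges:
        adj[x].append(y)
    match_x = [None] * nx                    # match_x[x] = matched y, or None if free
    match_y = [None] * ny

    def augment():
        parent = [None] * ny                 # parent[y] = x used to reach y
        seen_y = [False] * ny
        q = deque(x for x in range(nx) if match_x[x] is None)
        while q:
            x = q.popleft()
            for y in adj[x]:
                if seen_y[y] or match_y[y] == x:
                    continue                 # skip x's own matched edge
                seen_y[y] = True
                parent[y] = x
                if match_y[y] is None:       # free right vertex: augmenting path found
                    while y is not None:     # flip matched/unmatched along the path
                        x = parent[y]
                        prev_y = match_x[x]
                        match_x[x], match_y[y] = y, x
                        y = prev_y
                    return True
                q.append(match_y[y])         # follow the matched edge back into X
        return False

    size = 0
    while augment():
        size += 1
    return size
```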

bipartite vertex cover

defn. given graph G=(V,E), a subset of vertices SVS\subseteq V is a vertex cover if {u,v}S\{u,v\}\cap S\neq\varnothing for all uvEuv\in E.

given a bipartite graph G=(X,Y;E), find a vertex cover of minimum cardinality.

observation. (lower bound of vertex cover) if we have any matching M, and any vertex cover S, then SM|S|\geq |M|.

x   z   u   s
|   |   |
y   t   v

matching of size 3, use at least 3 vertices to cover

theorem. (könig) in a bipartite graph G(X,Y;E), max size of a matching is equal to the min size of a vertex cover.
proof. only need to consider the upper bound of |S|, ie suppose we run the matching algo and find a maximum matching M; we want to exhibit a vertex cover S with |S|\leq|M|. since M is maximum, in G_M^\rightarrow{}', we cannot find a path from the super source s (attached to X) to the super sink t (attached to Y). let R be the set of vertices reachable from s, then there is no edge with tail in R and head in X\cup Y-R (otherwise that edge would extend R). note also R contains one vertex of a matched pair iff it contains both - ie x_l\in R\lor y_l\in R\iff x_l,y_l\in R. this is because the only in-neighbour of a matched x_l is y_l (its matching edge), so reaching x_l requires reaching y_l first, and the edge y_lx_l gives the converse.

img

then set (XR)(YR)={x1,...,xj1,yj,...,yk}(X-R)\cup(Y\cap R)=\{x_1,...,x_{j-1},y_j,...,y_k\} is a vertex cover, where {x1,...,xj1}\{x_1,...,x_{j-1}\} will cover all edges with endpoint in {x1,...,xj1}\{x_1,...,x_{j-1}\}, and {yj,...,yk}\{y_j,...,y_k\} will cover all edges with endpoint in {xj,...,xX}\{x_j,...,x_{|X|}\} (this is all types of edges in bipartite graph). (how to show it is minimal?) \square

remark. how to show perfect matching does not exist? vertex cover size is less than n/2.

algo:

  1. find a max matching M, and GMG_M^\rightarrow{}' with directions and super source and sink
  2. do bfs on GMG_M^\rightarrow{}' to get reachable vertices RR
  3. output (XR)(YR)(X-R)\cup(Y\cap R)

runtime: O(T(matching) + m+n)

theorem. (hall) a bipartite graph G=(X,Y;E) with |X|=|Y| has perfect matching iff for all SXS\subseteq X, we have N(S)S|N(S)|\geq|S| where N(S) is the neighbor set of S in Y.
pf. use könig. \square

corollary. every d-regular bipartite graph G=(X,Y;E) has perfect matching.
pf. use könig. \square

remark. randomized O(nlogn) algo exists to find it.

puzzle. capacitated job assignment (each vertex in Y has capacity)? create copies of vertices according to its capacity. number of vertices becomes |X|+O(|X||Y|), number of edges becomes O(|X||E|).

basketball league winner

given the current standing and the remaining schedule, tell whether it is possible that a team can still win in the league.

standing                      remaining schedule

              W    L          Miami - Boston x3
Boston        41   17         New York - Boston x2
New York      40   18         ...
Philadelphia  38   19         Toronto - New York x2
...                           ...
Toronto       33   25

idea: assume toronto wins all its remaining games, for a total of w^* wins. we want every other team to end with fewer than w^* wins. say team i currently has w_i wins, then we want it to win at most w^*-w_i-1 of its remaining games.

create bipartite graph, one side is the teams with their capacity = their max wins, one side is games (each with two edges going to two teams). we will want to find a way to assign each game a winner. if such assignment exists, then toronto is possible to win.

remark. it is hard for football games (draws)

max-flow min-cut

find the maximum number of edge-disjoint paths between two vertices s and t (different paths cannot use the same edge).

if we find a 'cut' of k edges whose removal disconnects s from t, then the number of edge-disjoint paths from s to t is at most k.

duality

replace integral constraints x\in\{0,1\} with linear constraints x\geq 0 (0 means not picked).

augmenting path method can be understood as the simplex method of solving linear programs.

Week 11. July 21

polynomial time reduction

it is more convenient to restrict to decision problems so that every problem has the same output format. if we know how to solve the decision version of the problem in polynomial time, then we can use the decision algorithm as a blackbox/subroutine to solve the search version of the problem in polynomial time.

eg. find a max bipartite matching.

  1. decision version: does G have a matching of size at least k?
  2. use binary search on all k to find an optimal value k
  3. try deleting an edge e from G. does G-e have a matching of size at least k?
    1. if yes: delete e
    2. if no: keep e and try other edge
    3. (reduces to |E| decision problems)
  4. what is left is the matching

defn. a decision problem A is polynomial time reducible to decision problem B if there exists a polynomial time algorithm F that maps any input instance IAI_A of A into an input instance IBI_B of B such that IAI_A is a YES-instance of problem A iff IBI_B is a YES-instance of problem B. we write ApBA\leq_p B when such a polynomial time reduction exists, saying A is not more difficult than B in terms of polynomial time solvability.

algo (solving problem A by reduction):

  1. use reduction algorithm F to transform IAI_A into IBI_B
  2. return algB(IB)\texttt{alg}_B(I_B)

runtime: F maps I_A to I_B in p(|I_A|) time, and \texttt{alg}_B solves I_B in q(|I_B|) time. since |I_B|\leq p(|I_A|), the total is O(p(|I_A|)+q(p(|I_A|))).

corollary. if A cannot be solved in polynomial time, then B cannot be solved in polynomial time.

prop. (transitivity) if ApBA\leq_p B and BpCB\leq_p C, then ApCA\leq_p C.
proof. transform IAI_A to IBI_B then to ICI_C. \square

three problems

maximum clique (Clique):

defn. a subset SVS\subseteq V is a clique if uvEuv\in E for all u,vSu,v\in S.

decision version: given graph G=(V,E), is there a clique with at least k vertices?

maximum independent set (IS):

decision version: given graph G=(V,E), is there an independent set with at least k vertices?

minimum vertex cover (VC):

decision version: given graph G=(V,E), is there a vertex cover with at most k vertices?

prop. CliquepIS\texttt{Clique}\leq_p\texttt{IS} and ISpClique\texttt{IS}\leq_p\texttt{Clique}.
proof. S is a clique in G iff S is an independent set in the complement graph \overline{G}, so the reduction F(G,k)=(\overline{G},k) works in both directions. \square

prop. VCpIS\texttt{VC}\leq_p\texttt{IS} and ISpVC\texttt{IS}\leq_p\texttt{VC}
proof. S is a vertex cover of G iff V-S is an independent set of G. so G has a vertex cover of size at most k iff G has an independent set of size at least n-k, and the reduction F(G,k)=(G,n-k) works in both directions. \square

so all three problems are equivalent in terms of polynomial time solvability.

three more problems

hamiltonian cycle (HC):

defn. a cycle is hamiltonian if it touches every vertex exactly once.

decision version: given graph G=(V,E), does G have a hamiltonian cycle?

hamiltonian path (HP):

defn. a path is hamiltonian if it touches every vertex exactly once.

decision version: given graph G=(V,E), does G have a hamiltonian path?

traveling salesman problem (TSP):

decision version: given graph G=(V,E) with a length l_e for each edge e\in E, is there a hamiltonian cycle with total length at most L?

prop. HPpHC\mathtt{HP}\leq_p\texttt{HC}.
proof. define reduction F(G=(V,E))=G'=(V\cup\{s\},E\cup\{sv:v\in V\}), where s is an additional vertex adjacent to every existing vertex. claim G has a HP iff G' has a HC.

prop. HCpHP\mathtt{HC}\leq_p\texttt{HP}
proof. define reduction F(G=(V,E))=G'=(V\cup\{x',t_1,t_2\},E\cup\{x'v:v\in N(x)\}\cup\{t_1x',t_2x\}) where x\in V is arbitrary (x' is a copy of x, and t_1,t_2 are pendant vertices). claim G has a HC iff G' has a HP.

img

prop. HCpTSP\texttt{HC}\leq_p\texttt{TSP}.
proof. define transform F(G=(V,E))=G'=(V,\binom{V}{2}) such that G' is a complete graph, with l_{uv}=1 for uv\in E and l_{uv}=2 for uv\notin E. claim G has a HC iff G' has a hamiltonian cycle of total length at most L=|V|. \square

note (in hw) also need to show these transformations are done in polynomial time.

remark. it is not clear how to reduce TSP to HC.

3-satisfiability

given n boolean variables (xi)i=1n(x_i)_{i=1}^n which can be either True or False, and a formula in conjunctive normal form (CNF) where it is an AND of the clauses where each clause is an OR of the literals where a literal is either xix_i or xi\overline{x_i}.

(x1x2x3)(x2x3x4)(x1x3x4)(x2x3)(x_1\lor x_2\lor \overline{x_3})\land(\overline{x_2} \lor x_3 \lor \overline{x_4})\land(\overline{x_1}\lor \overline{x_3}\lor x_4)\land(\overline{x_2}\lor x_3)

if each clause has at most 3 literals, is there a truth assignment to the variables that satisfies all the clauses?

theorem. 3SATpIS\texttt{3SAT}\leq_p\texttt{IS}.
proof. given a 3SAT formula, we would like to construct a graph G so that the formula is satisfiable iff the graph has an independent set of size m where m is the number of clauses. for each clause, we connect at most three vertices corresponding to the literals (so single vertex/pair/triangle) (ensure in each clause we satisfy one literal). we also connect vertices xixix_i\overline{x_i} (to prevent both of them appearing in the independent set at same time).

img

remark. it is not important how many literals in a clause.
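the construction in the proof can be sketched in python (a minimal sketch; the literal encoding +i for x_i and -i for \overline{x_i} is an assumption of this sketch, and the graph is returned rather than solved):

```python
from itertools import combinations

def sat_to_is(clauses):
    """3SAT -> IS reduction.  clauses: list of clauses, each a list of
    literals; literal +i means x_i, literal -i means NOT x_i.

    Returns (vertices, edges, k): the formula is satisfiable iff the graph
    has an independent set of size k = number of clauses."""
    vertices = []           # one vertex per literal occurrence: (clause index, literal)
    edges = set()
    for c, clause in enumerate(clauses):
        occ = [(c, lit) for lit in clause]
        vertices.extend(occ)
        for u, v in combinations(occ, 2):   # clause gadget: clique on its literals
            edges.add((u, v))
    for u, v in combinations(vertices, 2):  # conflict edges: x_i vs NOT x_i
        if u[1] == -v[1]:
            edges.add((u, v))
    return vertices, sorted(edges), len(clauses)
```

for the formula (x_1 ∨ x_2 ∨ ¬x_3) ∧ (¬x_1 ∨ x_3) this builds 5 vertices, a triangle plus an edge for the clause gadgets, and 2 conflict edges, with k = 2.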

NP-completeness

it is not easy to compare every pair of problems (O(n^2) pairs) to see if they can be reduced to each other. instead we can identify the 'hardest' problems in a large class of problems.

a general feature of problems is that there is a short 'proof/solution' of a YES-instance - it is easier to verify.

defn. for a problem X, each input instance of X is represented by a binary string s. a problem X is in the class NP if there is a polynomial time verification algorithm BXB_X s.t. the input s is a YES-instance iff there is a proof t which is a binary string of length poly(s)\mathrm{poly}(|s|) so that BX(s,t)B_X(s,t) returns YES.

input s --> +----+
            | Bx | --> YES/NO
proof t --> +----+

eg. show vertex cover is in NP.
we are given graph G and an integer k. we encode (G, k) into a binary string s, and the proof t encodes a subset S\subseteq V. let B_X be the algo that takes s and t as input and checks that |S|\leq k and every edge is covered by S. then B_X says YES iff t corresponds to a vertex cover of size <= k. \square

eg. show 3SAT is in NP.
we are given a truth assignment (size <= n). algo: check if it is satisfying (runtime O(m)) where m is number of clauses. iff is clear. \square

remark. Clique, IS, HC, HP, subset-sum are all in NP.

remark. (non-HC) we do not know a short proof how to show graph has no HC.

defn. X is in co-NP if there is short proof and an efficient verification for NO-instances.

eg. bipartite matching (does G have a matching of size at least k?) is in both NP and co-NP.
for NO-instances (does G have no matching of size >= k?), by könig, it is the same as asking 'does G have a vertex cover of size < k?', which is easy to verify.

remark. it is common belief NPcoNP\mathrm{NP}\neq\mathrm{co-NP}.

remark. PNP\mathrm{P}\subseteq\mathrm{NP}. if problem can be solved in poly time, it is trivially in NP (do not even need another t and BXB_X).

remark. our NP definition is equivalent to saying "NP is class of problems solvable by a non-deterministic turing machine in poly time".

remark. it is common belief PNP\mathrm{P}\neq\mathrm{NP}.

defn. a problem XNPX\in\mathrm{NP} is NP-complete if YpXY\leq_p X for all YNPY\in\mathrm{NP}.

prop. P=NP iff an NP-complete problem can be solved in polynomial time.

to prove a problem X is NP-complete, we first show X is in NP, then pick a known NP-complete problem Y and show Y\leq_p X. (a common mistake is showing X\leq_p Y)

Week 12. July 28

cook-levin theorem

we introduce an intermediate problem to show 3SAT is NP-complete.

circuit-SAT: given a circuit with AND/OR/NOT gates, some known input gates and some unknown input gates, assuming the circuit is a DAG and each AND/OR gate has only two incoming edges: is there a truth assignment on the unknown input gates so that the output is True?

img

theorem. circuit-SAT is NP-complete.
pf. we want to show for every NP problem X we have X\leq_p\texttt{Circuit-SAT}. we start from the definition of NP (there is a poly-time verification algorithm), and use a poly-time 'compiler' to turn the verification algo into a circuit of poly size: the input s is hard-wired into the known input gates and the proof t becomes the unknown input gates, so that the circuit says YES for some t iff s is a YES-instance, and NO for every t if the answer is NO. then input s is a YES-instance iff there is a satisfying assignment for Circuit-SAT. \square

theorem. Circuit-SATp3-SAT\texttt{Circuit-SAT}\leq_p\texttt{3-SAT}.
pf. given a circuit of n gates, we will construct a 3SAT formula with O(n) variables so that the circuit is satisfiable iff the formula is satisfiable:

img

it is clear we can turn the circuit into a 3SAT formula in poly time in terms of size of circuit (# of gates/edges). plugging in x, y, z into the circuit, and simulate the circuit behavior regarding gates, then the resultant variables satisfy the formula too. if there is assignment of the formula, plugging them into the circuit will make the circuit happy as well. \square

theorem. (cook-levin) 3-SAT is NP-complete.

hamiltonian cycle

defn. a directed cycle is a hamiltonian cycle if it touches every vertex exactly once.

directed hamiltonian cycle (DHC): given directed graph G=(V,E), does G have a DHC?

theorem. DHC is NP-complete.
proof. it is easy to check DHC is in NP. we now will show 3SATpDHC\texttt{3SAT}\leq_p\texttt{DHC}, ie given a 3SAT formula with n variables (xi)i=1n(x_i)_{i=1}^n and m clauses (Ci)i=1m(C_i)_{i=1}^m, we would like to construct a directed graph G so that formula is satisfiable iff G has a DHC. we create a long 'two-way' path for each variable, so that going the path from left to right corresponds to setting variable to True, while from right to left corresponds to setting it to False.

img

now a hamilton cycle in G has one-to-one correspondence with an assignment. now we want to add clause structures so that only satisfying assignments 'survive'. for each clause, we create a separate vertex. depending on whether literal is negated, we assign different directions for edges that go to the clause vertex (thus forcing the left/right direction of each PiP_i). note clause vertex only has edges to the portions reserved for the clause.

img

prop. DHCpHC\texttt{DHC}\leq_p\texttt{HC}.
proof. given input directed graph G, we use a length 3 path to simulate a directed edge. for each vertex vv in G, we create 3 vertices vin,vmid,voutv_{\textrm{in}},v_\textrm{mid},v_\textrm{out} connected with two middle edges, and vinv_\textrm{in} has original in-edges in G, voutv_\textrm{out} has out-edges. we claim G has a DHC iff G' has a HC.

img

it is clear if G has a DHC, we can follow it in G' to form a HC. suppose we have a HC in G' starting from a_\textrm{in}: it must go to a_\textrm{mid}, then to a_\textrm{out}, then to some other b_\textrm{in}, .... if we contract each triple of vertices we get a DHC in G. (having the mid vertex a_\textrm{mid} forces the path to go to it once it touches some in-vertex a_\textrm{in}; otherwise the path might just leave if some other c_\textrm{out} goes to a_\textrm{in}, which is not what we want) \square

corollary. HC is NP-complete.

graph coloring

given undirected graph G=(V,E) and positive integer k, is it possible to use k colors to color all vertices so that every vertex receives one color and any two adjacent vertices receive different colors?

theorem. 3-coloring is NP-complete.
proof. it is easy to check 3-coloring is in NP. we want to show 3SATp3-coloring\texttt{3SAT}\leq_p\texttt{3-coloring}, ie given a 3SAT formula with n variables (xi)i=1n(x_i)_{i=1}^n and m clauses (Ci)i=1m(C_i)_{i=1}^m, we would like to construct a graph G so that formula is satisfiable iff G is 3-colorable. we associate True and False as two colors. for each variable xix_i, we create two vertices vi,viv_i,\overline{v_i} and add an edge between them so they get different colors. to enforce they get True/False colors, we connect every literal vertex to a common vertex called base vertex colored using color B.

img

so there is 1-1 correspondence between truth assignments and the 3-colorings.

we now involve clauses so that only satisfying assignments remain. we can use a 'gadget' connected to the three literal vertices of the clause so that there is a 3-coloring of the gadget iff at least one of them gets the color T. say we have clause (x_1\lor\overline{x_2}\lor x_3), after trial-and-error, we can attach the following gadget to the base graph:

img

vertices with no labels mean they are added only for this clause. then suppose all v1,v2,v3v_1,\overline{v_2},v_3 are colored F => meaning clause is not satisfied. then all colors of vertices are forced, we cannot color the top node using any color. if any of them is colored T, then the top vertex can be colored using the 3 colors (check).

3-dimensional matching

given disjoint sets X, Y, Z each of size n, a set TX×Y×ZT\subseteq X\times Y\times Z. does there exist a subset of n disjoint triples in T (cover every element exactly once)?

a subset of n disjoint triples is called perfect 3d-matching.

x1 -- y1 -- z1
x2 -- y2 -- z2
x3 -- y3 -- z3

T = {(x1,y1,z1), (x2,y2,z2), (x3,y3,z3)}

theorem. 3DM is NP-complete.
proof. it is easy to check 3DM is in NP. we want to show 3SATp3DM\texttt{3SAT}\leq_p\texttt{3DM}, ie given 3SAT formula with n variables (xi)i=1n(x_i)_{i=1}^n and m clauses (Ci)i=1m(C_i)_{i=1}^m, we would like to construct a 3DM instance (hypergraph) so that the formula is satisfiable iff there is a perfect 3d-matching. for each variable vv, we create (xi)i=1m,(yi)i=1m,(zi)i=12m(x_i)_{i=1}^m,(y_i)_{i=1}^m,(z_i)_{i=1}^{2m} enclosed in a gadget. the x,yx,y's are private to this variable and zz's might be open. we have to cover a triple at a time and we try to cover all private nodes (cores).

img

if three nodes are in a circle, it means such combination is possible (in T). note we have 2m zz's as we reserve 2 zz's for each clause, half of them is used to cover triples and half is open.

so we have a 1-1 correspondence between the value of a variable and how its private nodes are covered. we have n such flowers corresponding to n variables.

we now add some clause structure to the 3dm instance so that only satisfying assignments survive. the clause structure for each clause contains xc,ycx_c,y_c and can contain a free zz from the existing flowers, reserved for that clause.

img

at this point we have 2mn tips in total, and the # of tips covered by clauses is m. we have (2n-1)m tips left to cover (how about already-covered z's in each variable? should you minus mn?). we add this number of 'dummy' clauses which can form triples connecting two dummy nodes and one free z, like they were clauses.

subset sum

given n positive integers (ai)i=1n(a_i)_{i=1}^n and an integer K, does there exist a subset S[n]S\subseteq[n] with iSai=K\sum_{i\in S}a_i=K?

theorem. Subset-Sum is NP-complete.
proof. it is easy to check it is in NP. we want to show \texttt{3DM}\leq_p\texttt{Subset-Sum}, ie given a 3DM instance, we would like to construct a subset-sum instance so that there is a perfect 3d-matching iff there is a subset of a certain sum K. we naturally represent each triple as a vector: if a node is used in the triple, the vector has a 1 in that coordinate, else 0.

img

vector Subset-Sum: given a set of n 0-1 vectors, does there exist a subset of vectors that sum to 1\mathbf{1}? we can see there is a perfect 3DM iff there is a subset of vectors summing to 1\mathbf{1}. so this vector Subset-Sum is also NP-complete.

to turn this vector Subset-Sum into Subset-Sum, we map each 0-1 vector to a base-(m+1) number: (x_i,y_j,z_k)\mapsto(m+1)^i+(m+1)^{n+j}+(m+1)^{2n+k}, where m is the number of available triples in 3DM (|T|).

remark. to turn this vector Subset-Sum into Subset-Sum, we could interpret the 0-1 vector as the binary representation of a number, then (xi,yj,zk)2i+2n+j+22n+k(x_i,y_j,z_k)\mapsto 2^i+2^{n+j}+2^{2n+k}.

but the reverse is not true - carrying may happen, eg. 01+01+01=11=01+10. since at most m vectors are summed, base m+1 avoids carries.

corollary. knapsack is NP-complete.

map

img

techniques for doing reductions:

2 vs 3