page: https://www.student.cs.uwaterloo.ca/~cs240/s20/

Instructor: Eric Schost, Armin Jamshidpey, Gautam Kamath

Week 1. May 11

running time simplifications

random access machine (RAM) model

defn. (O-notation) f(n)\in O(g(n)) if there exist constants c>0 and n_0>0 such that |f(n)|\leq c|g(n)| for all n\geq n_0. (upper bound)

eg. f(n)=75n+500, g(n)=5n^2 (c=1, n_0=20): for n\geq 20, 75n+500\leq 75n+25n=100n\leq 5n\cdot n=5n^2. (we ignore smaller inputs)

eg. show 2n^2+3n+11\in O(n^2) from first principles.
we need to find c and n_0 so that we have 0\leq 2n^2+3n+11\leq cn^2, \forall n\geq n_0.
can let n_0=1, so 1\leq n\longrightarrow 1\leq n^2\longrightarrow 11\leq 11n^2, 1\leq n\longrightarrow n\leq n^2\longrightarrow 3n\leq 3n^2, 2n^2\leq 2n^2. so 2n^2+3n+11\leq 11n^2+3n^2+2n^2=16n^2, so c=16. \square

defn. (Omega-notation) f(n)\in\Omega(g(n)) if there exist constants c>0 and n_0>0 such that 0\leq c|g(n)|\leq|f(n)| for all n\geq n_0. (lower bound)

defn. (Theta-notation) f(n)\in\Theta(g(n)) if f(n)\in O(g(n)) and f(n)\in\Omega(g(n)). (tight bound)

eg. show \frac{1}{2}n^2-5n\in\Omega(n^2) from first principles.
let n_0=20, have n\geq 20\longrightarrow n^2\geq 20n\longrightarrow \frac{1}{4}n^2\geq 5n\longrightarrow \frac{1}{4}n^2-5n\geq 0. so \frac{1}{2}n^2-5n=\frac{1}{4}n^2+(\frac{1}{4}n^2-5n)\geq \frac{1}{4}n^2. let c=\frac{1}{4}. \square

defn. (o-notation) f(n)\in o(g(n)) if for all constants c>0, there exists a constant n_0>0 such that |f(n)|<c|g(n)|,\forall n\geq n_0.

defn. (\omega-notation) f(n)\in\omega(g(n)) if for all constants c>0, there exists a constant n_0>0 such that 0\leq c|g(n)|<|f(n)|,\forall n\geq n_0.

notation       meaning
Big O          asymptotically not bigger
Big Omega      asymptotically not smaller
Big Theta      asymptotically the same
small o        asymptotically strictly smaller
small omega    asymptotically strictly greater

theorem. f(n)\in\Theta(g(n)) \Longleftrightarrow g(n)\in\Theta(f(n))
proof. there exist n_0,c_1,c_2 with c_1|g|\leq|f|\leq c_2|g|,\forall n\geq n_0, so \frac{1}{c_2}|f|\leq|g|\leq\frac{1}{c_1}|f|,\forall n\geq n_0. \square

theorem. f(n)\in O(g(n)) \Longleftrightarrow g(n)\in\Omega(f(n))

theorem. f(n)\in o(g(n)) \Longleftrightarrow g(n)\in\omega(f(n))

theorem. f(n)\in o(g(n)) \Longrightarrow f(n)\in O(g(n))

theorem. f(n)\in o(g(n)) \Longrightarrow f(n)\notin\Omega(g(n))

theorem. f(n)\in\omega(g(n)) \Longrightarrow f(n)\in\Omega(g(n))

theorem. f(n)\in\omega(g(n)) \Longrightarrow f(n)\notin O(g(n))

theorem. (identity rule) f(n)\in\Theta(f(n))

theorem. (max rules) suppose f(n)>0 and g(n)>0 for all n\geq n_0, then

  1. O(f(n)+g(n))=O(\max\{f(n),g(n)\})
  2. \Omega(f(n)+g(n))=\Omega(\max\{f(n),g(n)\})

proof. (1) let h\in O(f+g), then \exists c,n_0>0 with |h|\leq c|f+g|\leq 2c|\max\{f,g\}|,\forall n\geq n_0, so h\in O(\max\{f,g\}). conversely let h\in O(\max\{f,g\}), then \exists c,n_0>0 with |h|\leq c|\max\{f,g\}|\leq c|f+g|,\forall n\geq n_0 (since f,g>0 implies \max\{f,g\}\leq f+g), so h\in O(f+g). \square

theorem. (transitivity) if f(n)\in O(g(n)) and g(n)\in O(h(n)), then f(n)\in O(h(n)); the same holds for \Omega and \Theta.

proof. (1) there exist c_1,n_1>0 with |f|\leq c_1|g|,\forall n\geq n_1, and there exist c_2,n_2>0 with |g|\leq c_2|h|,\forall n\geq n_2. take N=\max\{n_1,n_2\}; we have |f|\leq c_1c_2|h|,\forall n\geq N. \square

theorem. suppose f(n)>0 and g(n)>0 for all n\geq n_0, and suppose L=\lim_{n\rightarrow\infty}\frac{f(n)}{g(n)} exists, then

f(n) \in\left\{\begin{array}{ll}o(g(n)) & \text { if } L=0 \\ \Theta(g(n)) & \text { if } 0<L<\infty \\ \omega(g(n)) & \text { if } L=\infty\end{array}\right.

the converse does not hold.

eg. let f(n) be a polynomial f(n)=c_dn^d+c_{d-1}n^{d-1}+\cdots+c_1n+c_0 with c_d>0, then f(n)\in\Theta(n^d).
proof.

\begin{aligned} \lim _{n \rightarrow \infty} \frac{f(n)}{n^{d}} &= \lim _{n \rightarrow \infty} \frac{c_{d} n^{d}+c_{d-1} n^{d-1}+\cdots+c_{1} n+c_{0}}{n^{d}}\\ &= \lim_{n\rightarrow\infty}\frac{c_dn^d}{n^d}+\lim_{n\rightarrow\infty}\frac{c_{d-1}n^{d-1}}{n^{d}}+...+\lim_{n\rightarrow\infty}\frac{c_0}{n^d} \\ &= c_d+0+...+0\\ &= c_d >0 \end{aligned}

by the limit rule we have f(n)\in\Theta(n^d). \square

eg. show n(2+\sin n\pi/2) is \Theta(n).
proof. note \lim_{n\rightarrow \infty}\frac{n(2+\sin n\frac{\pi}{2})}{n}=\lim_{n\rightarrow\infty}(2+\sin n\frac{\pi}{2}) does not exist, so the limit rule does not apply; argue from first principles instead.

note if n_0=1, then for all n\geq n_0 we have -1\leq\sin n\frac{\pi}{2}\leq 1; adding 2 to all sides and multiplying by n gives n\leq n(2+\sin n\frac{\pi}{2})\leq 3n. \square

we write \log n to represent \log_2n=\frac{\ln n}{\ln 2}

eg. compare the growth rates of logn and n.
\lim_{n\rightarrow\infty}\frac{\log n}{n}=\lim_{n\rightarrow\infty}\frac{\frac{1}{n\ln 2}}{1}=\lim_{n\rightarrow\infty}\frac{1}{n\ln 2}=0 by L'Hôpital, so \log n\in o(n).

eg. compare the growth rates of (\log n)^c and n^d where c,d>0 are arbitrary.
repeatedly applying L'Hôpital we have \lim_{n\rightarrow\infty}\frac{(\log n)^c}{n^d}=\lim_{n\rightarrow\infty}\frac{c!}{(\ln 2)^cd^cn^d}=0, so (\log n)^c\in o(n^d).

common growth rates from best to worst: \Theta(1) (constant), \Theta(\log n) (logarithmic), \Theta(n) (linear), \Theta(n\log n) (linearithmic), \Theta(n^2) (quadratic), \Theta(n^3) (cubic), \Theta(2^n) (exponential)

eg. double the input size for a linearithmic algorithm; what is the resulting running time?
have T(n)=cn\log n, then T(2n)=2cn\log 2n=2cn(\log n+\log 2)=2cn\log n+2cn=2T(n)+2cn. not bad

techniques for analysis

eg.

Test1(n):
    auto sum = 0                  // Theta(1) --> c_1
    for i = 1, n:                 // \sum_{i=1,n} A
        for j = i, n:             // \sum_{j=i,n} c_2 --> A
            sum = sum + (i-j)^2   // Theta(1) --> c_2
    return sum                    // Theta(1) --> c_3

method1:

\begin{aligned} T(n) &= c_1+c_3+\sum_{i=1}^n\sum_{j=i}^nc_2 \\ &= c_0+\sum_{i=1}^nc_2(n-i+1)\\ &= c_0+c_2\sum_{k=1}^nk \quad (k=n-i+1)\\ &= c_0+c_2\frac{n(n+1)}{2}\\ &= c_0+\frac{c_2}{2}(n^2+n)\in\Theta(n^2) \end{aligned}

method2:

\begin{aligned} T(n) &= c_1+c_3+\sum_{i=1}^n\sum_{j=i}^nc_2 \\ &\leq c_0+\sum_{i=1}^n\sum_{j=1}^nc_2\\ &= c_0+c_2\sum_{i=1}^n\sum_{j=1}^n1\\ &= c_0+c_2n^2\in O(n^2)\\ T(n) &= c_1+c_3+\sum_{i=1}^n\sum_{j=i}^nc_2\\ &\geq c_0+\sum_{i=1}^{n/2}\sum_{j=i}^nc_2\\ &\geq c_0+\sum_{i=1}^{n/2}\sum_{j=n/2+1}^nc_2\\ &= c_0+c_2\frac{n}{2}\sum_{i=1}^{n/2}1\\ &= c_0+c_2\frac{n^2}{4}\in\Omega(n^2) \end{aligned}

eg. show f(n)=\sum_{i=1}^n\sum_{j=i}^n\sum_{k=j}^n1\in\Theta(n^3)
f(n)\leq\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^n1=n^3\in O(n^3)
f(n)\geq\sum_{i=1}^{n/3}\sum_{j=n/3+1}^{2n/3}\sum_{k=2n/3+1}^n1=\left(\frac{n}{3}\right)^3\in\Omega(n^3)

Week 2. May 18

The running time may depend on characteristics of the input

defn. let T_A(I) denote the running time of an algorithm A on instance I of length n. the worst-case running time of A is T_A(n)=\max\{T_A(I):|I|=n\}; the average-case running time is the average of T_A(I) over all instances I with |I|=n.

we assume the worst-case complexity if not specified.

eg.

Test(A, n):
    for i = 1, n-1:                     // \sum_{i=1,n-1} M
        auto j = i
        while j > 0 and A[j] > A[j-1]:  // M (worst case: A[j] always > A[j-1],
            swap(A[j], A[j-1])          //  so we loop from i down to 0)
            j = j-1

worst case: \Theta(n^2)

mergesort

design

  1. split A into two subarrays: AL consists of the first \lceil\frac{n}{2}\rceil elements and AR consists of the last \lfloor\frac{n}{2}\rfloor elements.
  2. recursively call mergesort on AL and AR
  3. AL and AR are sorted, merge them into a single sorted array (merge two sorted arrays)
mergesort(A[n], l = 0, r = n-1):
    if r <= l:                  // \Theta(1)
        return                  // \Theta(1)
    else:
        auto m = (r+l)/2        // \Theta(1)
        mergesort(A, l, m)      // T(n/2)
        mergesort(A, m+1, r)    // T(n/2)
        merge(A, l, m, r)       // ?

// assume A[l:m], A[m+1:r] are sorted
merge(A[n], l, m, r):
    auto S = new array of size n   // auxiliary array
    S[l:r] = A[l:r]                // step 1: copy A[l,...,r] to S[l,...,r], \Theta(n)
    int iL = l
    int iR = m+1
    for k = l, r:                  // \sum_{k=l,r}
        if iL > m:               A[k] = S[iR++]
        else if iR > r:          A[k] = S[iL++]
        else if S[iL] <= S[iR]:  A[k] = S[iL++]
        else:                    A[k] = S[iR++]

merge takes \Theta(r-l+1), which is \Theta(n) (!)
so the total time for mergesort should be T(n)=2T(\frac{n}{2})+\Theta(n)

analysis

let T(n) denote the time to run mergesort on an array of length n

the recurrence relation for T(n) is:

T(n)=\begin{cases} T(\Bigl\lceil\dfrac{n}{2}\Bigr\rceil)+T(\Bigl\lfloor\dfrac{n}{2}\Bigr\rfloor)+\Theta(n) & \text{ if } n>1 \\ \Theta(1) & \text{ if } n= 1 \end{cases}

it suffices to consider the exact relation, with constant factor c replacing the \Theta's

T(n)=\begin{cases} T(\Bigl\lceil\dfrac{n}{2}\Bigr\rceil)+T(\Bigl\lfloor\dfrac{n}{2}\Bigr\rfloor)+cn & \text{ if } n>1 \\ c & \text{ if } n= 1 \end{cases}

the sloppy recurrence (with floors and ceilings removed) is:

T(n)=\begin{cases} 2T(\frac{n}{2})+cn & \text{ if } n>1 \\ c & \text{ if } n= 1 \end{cases}

solve with method 1

\begin{aligned} T(n) &= 2T(\frac{n}{2})+cn\\ &= 2(2T(\frac{n}{2^2})+c\frac{n}{2})+cn=2^2T(\frac{n}{2^2})+2cn \\ &= 2^2(2T(\frac{n}{2^3})+c\frac{n}{2^2})+2cn=2^3T(\frac{n}{2^3})+3cn \\ &= ... \\ &= 2^iT(\frac{n}{2^i})+icn \\ \end{aligned}

at the i-th level. since T(1)=c, we can solve 1=\frac{n}{2^j} and get j=\log n, so

T(n)=2^jT(1)+jcn=2^jc+jcn=cn+cn\log n\in\Theta(n\log n)

solve with method 2

+----+cn +---+ level 0: cn v v +------+cn/2 cn/2+-----+ level 1: 2*cn/2=cn | + + | v v v v +--+ cn/4 cn/4 cn/4 cn/4 +-+ level 2: 4*cn/4=cn | | ..... ...... ...... ..... | | ... v v c c ....... c c level logn: n*c=cn

the first mergesort call takes time cn (for merge), then it calls two mergesorts of size n/2, each of which takes time cn/2, ...
each row in fact sums to cn, and there are \log n+1 rows, so T(n)=(\log n+1)cn\in\Theta(n\log n)

useful formulas

some recurrence relations

recursion                                  resolves to                     eg
T(n)=T(\frac{n}{2})+\Theta(1)              T(n)\in\Theta(\log n)           binary search
T(n)=2T(\frac{n}{2})+\Theta(n)             T(n)\in\Theta(n\log n)          mergesort
T(n)=2T(\frac{n}{2})+\Theta(\log n)        T(n)\in\Theta(n)                heapify
T(n)=T(cn)+\Theta(n), some 0<c<1           T(n)\in\Theta(n)                selection
T(n)=2T(\frac{n}{4})+\Theta(1)             T(n)\in\Theta(\sqrt{n})         range search
T(n)=T(\sqrt{n})+\Theta(1)                 T(n)\in\Theta(\log\log n)       interpolation search

useful sums

useful formulas

eg. \log(n!)=\log n+\log(n-1)+...+\log 1\in\Theta(n\log n)
proof. upper bound: \log(n!)\leq\log(n^n)=n\log n. lower bound: \log(n!)\geq\log n+\log(n-1)+...+\log(\frac{n}{2})\geq\frac{n}{2}\log\frac{n}{2}=\frac{n}{2}\log n-\frac{n}{2}\in\Omega(n\log n). \square

eg. show T(n)=2T(\frac{n}{2})+c resolves to \Theta(n).
T(n)=2^iT(\frac{n}{2^i})+c(2^0+2^1+...+2^{i-1}),\,\forall i. let i=\log n; we have T(n)=nc+c\sum_{k=0}^{\log n-1}2^k=nc+c\frac{2^{\log n}-1}{2-1}=nc+c(n-1)\in\Theta(n). \square

priority queues

review of ADTs

stack

queue

binary tree

pqueue

use pqueue to sort

PQsort(A[n]):
    auto PQ = new priority_queue
    for k = 0, n-1:
        PQ.insert(A[k], key=A[k])   // fill elements into pqueue
    for k = n-1, 0, -1:
        A[k] = PQ.deleteMax()

run-time: O(\sum_{i=0}^n\texttt{insert}(i)+\sum_{i=0}^n\texttt{deleteMax}(i)), depending on how we implement the pqueue
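for comparison (not course pseudocode), a minimal C++ sketch of PQ-sort using the standard library's std::priority_queue, whose binary-heap backing gives O(\log n) insert and deleteMax and hence O(n\log n) overall:

#include <queue>
#include <vector>

// PQ-sort sketch: items are their own keys, as in PQsort above
void pq_sort(std::vector<int>& A) {
    std::priority_queue<int> pq;             // max-heap
    for (int x : A) pq.push(x);              // n inserts, O(log n) each
    for (int k = (int)A.size() - 1; k >= 0; --k) {
        A[k] = pq.top();                     // current maximum
        pq.pop();                            // deleteMax, O(log n)
    }
}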

realizations of pqueues

realization 1: use unsorted array

realization 2: use sorted arrays

binary heap

lemma. the height of a heap with n nodes is \Theta(\log n)
proof. a heap of height h has at least (levels 0 through h-1 full, level h with one node) 1+2+4+...+2^{h-1}+1=(2^h-1)+1=2^h nodes, and at most (all levels full) 1+2+4+...+2^h=2^{h+1}-1 nodes.
hence 2^h\leq n\leq 2^{h+1}-1\leq 2^{h+1}, which gives h\leq\log n\leq h+1. rearranging, \log n-1\leq h\leq\log n, so h\in\Theta(\log n). \square

storing heaps in arrays

img

Week 3. May 25

insert in heaps

// k is an index corresponding to a node
fix_up(A[], k):
    while parent(k) exists and A[parent(k)] < A[k]:   // worst case: swap until it becomes root
        swap(A[k], A[parent(k)])
        k = parent(k)

insert(A[], n, item):
    increase size of A by 1
    auto l = last(n)   // n-1
    A[l] = item
    fix_up(A, l)

worst-case time = O(height of heap) = O(\log n)

deleteMax in heaps

// k is an index corresponding to a node
fix_down(A[], n, k):
    while k is not leaf:
        // find child with larger key
        // (there cannot be a right child without a left child)
        auto j = leftchild(k)
        if j != last(n) and A[j+1] > A[j]:
            j = j+1
        if A[k] >= A[j]:
            break
        swap(A[j], A[k])
        k = j

deleteMax(A[], n):
    auto res = A[root()]
    swap(A[root()], A[last(n)])
    decrease size by 1           // remove original root (now last)
    fix_down(A, n-1, root())
    return res

worst-case time = O(height of heap) = O(\log n)

pqsort

PQsortWithHeap(A[n]):
    auto hp = new heap
    for k = 0, n-1:
        hp.insert(key=A[k])   // just insert keys, no items
    for k = n-1, 0, -1:
        A[k] = hp.deleteMax()

heapsort

problem: given n items all at once, build a heap containing all of them

heapify(A[n] /* A is regular array */):
    for i = parent(last(n)), 0, -1:
        fix_down(A, n, i)
    // total worst-case time: O(n)

the running time of heapify T(n) is \Theta(n).
proof for n=2^{h+1}-1 (complete heap).

\begin{aligned} T(n) &\in \Theta(\text{worst case number of swaps}) \\ &= \Theta(\text{level h, swap 0 times each + level h-1, 1 time each, + ... + level 0}) \\ &= \Theta(0\cdot 2^{h-0}+1\cdot2^{h-1}+2\cdot2^{h-2}+...+h\cdot2^{h-h}) \\ &= \Theta(2^h\sum_{i=0}^{h}\frac{i}{2^i}) \\ &= \Theta(2^h) = \Theta(n) \end{aligned}

where level h-i has 2^{h-i} nodes, and \sum_i\frac{i}{2^i}\leq 2
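the bound on the sum follows by swapping the order of summation (a standard identity, spelled out here for completeness):

\sum_{i\geq 1}\frac{i}{2^i}=\sum_{i\geq 1}\sum_{k=1}^i\frac{1}{2^i}=\sum_{k\geq 1}\sum_{i\geq k}\frac{1}{2^i}=\sum_{k\geq 1}\frac{1}{2^{k-1}}=2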

we could instead use fix_ups, however: "The running time is at least nlogn. You can see this by noting there are n over two elements in the bottom layer, each of which are at a depth logn, which is the cost of a fixup operation."

heapsort(A[], n):
    heapify(A)                        // O(n)
    // repeatedly find max
    while n > 1:                      // O(n log n)
        // delete max (put it at the end; fix_down skips it)
        swap(A[root()], A[last(n)])
        n--
        fix_down(A, n, root())

takes O(n\log n) time
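the same two phases (heapify, then repeated deleteMax) exist in the C++ standard library; a sketch using library calls rather than the course pseudocode:

#include <algorithm>
#include <vector>

void heapsort_std(std::vector<int>& A) {
    std::make_heap(A.begin(), A.end());   // heapify, O(n)
    std::sort_heap(A.begin(), A.end());   // repeated deleteMax, O(n log n)
}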

find smallest item

problem: find the k-th smallest item in an array A of n distinct numbers

sorting and randomized algorithms

selection problem: given an array A of n numbers and 0\leq k<n, find the element that would be at position k of the sorted array

quick-select

  1. choose_pivot(A): return an index p in A. we use the pivot to rearrange the array
    // simplest: choose_pivot(A[]) => A.size - 1
  2. partition(A, p): rearrange A and return pivot index i so that all items before position i are <= A[i] and all items after position i are >= A[i]

easy partition: create smaller, equal, larger arrays separately, then concatenate them. O(n) time and space

efficient in-place partition

partition(A[n], p):
    swap(A[n-1], A[p])
    auto i = -1       // begin from left
    auto j = n-1      // begin from right
    auto v = A[n-1]   // pivot value
    while true:
        // stop when A[i] is bigger than pivot
        do: i++ while i < n and A[i] < v
        // stop when A[j] is smaller than pivot
        do: j-- while j > 0 and A[j] > v
        // break if the two pointers cross: all elems are on correct sides
        if i >= j: break
        // swap wrongly positioned ones
        else: swap(A[i], A[j])
    swap(A[n-1], A[i])
    return i

takes O(n).

quick-select 1

quick_select1(A[], k):   // selects kth element as if A were sorted
    auto p = choose_pivot(A)
    auto i = partition(A, p)
    if i == k:
        return A[i]
    else if i > k:
        return quick_select1(A[0:i-1], k)
    else:
        return quick_select1(A[i+1:n-1], k-i-1)   // note index relative to new subarray

running time analysis:

worst: list is sorted, eg [1,2,3,4], and we want to select the 1st elem. the pivot chosen is 4, then after partition the array is unchanged, and qs recurses on [1,2,3], ...: recursive calls always have size n-1.

we have

T(n)=\begin{cases} T(n-1)+cn & \text{ if } n\geq 2 \\ c & \text{ if } n=1 \end{cases} =cn+c(n-1)+c(n-2)+...+c\cdot 2+c\in\Theta(n^2)

best: the chosen pivot is exactly the kth elem. no recursive call (just one partition): O(n).

average:

define T(n,k) as the average cost for selecting the kth item from an array of size n; conditioning on the pivot index i gives T(n,k)\leq cn+\frac{1}{n}\left(\sum_{i=0}^{k-1}T(n-i-1,\,k-i-1)+\sum_{i=k+1}^{n-1}T(i,\,k)\right).

theorem. T(n,k)\leq 4cn.
proof. use induction. base: if n=1, then T(1,k)=c\leq 4c\cdot 1.
inductive:

\begin{aligned} T(n,k)&\leq cn+\frac{1}{n}\left(\sum_{i=0}^{k-1}4c(n-i-1)+\sum_{i=k+1}^{n-1}4ci\right)\\ &=cn+\frac{4c}{n}\left(\sum_{i=0}^{k-1}n-\sum_{i=0}^{k-1}(i+1)+\sum_{i=k+1}^{n-1}i\right)\\ &=cn+\frac{4c}{n}\left(nk-\frac{(1+k)k}{2}+\frac{(k+1+n-1)(n-1-k)}{2}\right)\\ &\leq cn+\frac{4c}{n}(nk-k^2+\frac{n^2}{2})\\ &=cn+\frac{4c}{n}\frac{n^2}{2}+\frac{4c}{n}(nk-k^2) \,\,(\text{maximized when }k=\frac{n}{2})\\ &\leq cn+2cn+\frac{4c}{n}(n\frac{n}{2}-\frac{n^2}{4})\\ &=cn+2cn+\frac{4c}{n}(\frac{n^2}{4})\\ &=4cn \end{aligned}

\square

that is, T\in O(n).

Week 4. June 1

randomized algorithm

defn. the expected running time is

T^{\text{exp}}(I)=E[T(I,R)]=\sum_RT(I,R)\cdot Pr[R]

where T(I,R) is the running time of a randomized algorithm A for an input instance I and the sequence of random numbers R.

randomized quick-select

first idea: randomly shuffle the input

shuffle(A[n]):
    for i = 0, n-2:
        swap(A[i], A[i + random(n-i)])

assuming random(n) returns integers uniformly from 0, 1, ..., n-1.

second idea: random pivot by changing pivot selection

choose_pivot2(A[n]):
    return random(n)

quick_select2(A[], k):
    auto p = choose_pivot2(A)
    ...

the probability of choosing pivot index i, for each 0\leq i\leq n-1, is \frac{1}{n}, so the analysis is the same as the average case: expected runtime is \Theta(n).

quick-sort

quick_sort1(A[n]):
    if n <= 1: return
    auto p = choose_pivot1(A)   // last element
    auto i = partition(A, p)
    quick_sort1(A[0:i-1])
    quick_sort1(A[i+1:n-1])

let T(n) be the runtime for quick_sort1; it depends on the pivot index i. we have

T(n)=\Theta(n)+T(i)+T(n-i-1)

worst-case: we always have i=0 or n-1

T(n)=\begin{cases} T(n-1)+cn & \text{ if } n\geq 2 \\ c & \text{ if } n=1 \end{cases}

similar to worst-case quick-select, T(n)\in\Theta(n^2).

best-case: we always have i=\lfloor\frac{n}{2}\rfloor or \lceil\frac{n}{2}\rceil

T(n)={T(n12)+T(n12)+cn,n>1c,n=1T(n)=\begin{cases} T(\Bigl\lfloor\dfrac{n-1}{2}\Bigr\rfloor)+T(\Bigl\lceil\dfrac{n-1}{2}\Bigr\rceil)+cn &, n>1 \\ c &, n= 1 \end{cases}

resolves to \Theta(n\log n).

average-case: for each pivot index i, there are (n-1)! permutations of size n that lead to that pivot

\begin{aligned} T(n)&=\frac{1}{n!}\sum_{i=0}^{n-1}\sum_{\text{size of I is n, I has pivot i}}\text{running time for instance I} \\ &\leq\frac{1}{n!}\sum_{i=0}^{n-1}(n-1)!(cn+T(i)+T(n-i-1))\\ &=cn+\frac{1}{n}\sum_{i=0}^{n-1}(T(i)+T(n-i-1)) \end{aligned}

theorem. T(n)\in\Theta(n\log n).
proof. rewrite

\begin{aligned} T(n)&=cn+\frac{2}{n}\sum_{i=0}^{n-1}T(i)\\ nT(n)&=cn^2+2\sum_{i=0}^{n-1}T(i)\\ (n-1)T(n-1)&=c(n-1)^2+2\sum_{i=0}^{n-2}T(i)\\ nT(n)-(n-1)T(n-1)&=c(2n-1)+2T(n-1)\\ nT(n)&=(n+1)T(n-1)+c(2n-1)\\ \frac{T(n)}{n+1}&=\frac{T(n-1)}{n}+\frac{c(2n-1)}{n(n+1)} \end{aligned}

define A(n)=\frac{T(n)}{n+1}; we obtain

\begin{aligned} A(n)&=A(n-1)+\frac{c(2n-1)}{n(n+1)}\\ &=A(n-2)+\frac{c(2(n-1)-1)}{(n-1)n}+\frac{c(2n-1)}{n(n+1)}\\ &...\\ &=c\sum_{i=1}^n\frac{2i-1}{i(i+1)}\\ &=2c\sum_{i=1}^n\frac{1}{i+1}-c\sum_{i=1}^n\frac{1}{i(i+1)}\\ &\in\Theta(\log n)+\Theta(1) \end{aligned}

so T(n)=(n+1)A(n)\in\Theta(n\log n). \square

optimizations:

quick_sort3(A[n]):
    auto S = new stack<pair<int, int>>
    S.push({0, n-1})
    while S is not empty:
        auto [l, r] = S.pop()
        while r-l+1 > 10:
            auto p = choose_pivot2(A, l, r)
            auto i = partition(A, l, r, p)
            // push the larger side for later, keep iterating on the smaller side
            if i-l > r-i:
                S.push({l, i-1})
                l = i+1
            else:
                S.push({i+1, r})
                r = i-1
    insertion_sort(A)

lower bounds for sorting

comparison-based sorting lower bound is \Omega(n\log n)

theorem. any correct comparison-based sorting algorithm requires at least \Omega(n\log n) comparison operations to sort n distinct items.
proof. let h be the max number of comparisons performed by the algorithm. the decision tree has at least n! leaves (one per permutation) and at most 2^h leaves. so n!\leq 2^h\Longrightarrow h\geq\log(n!)=\log n+\log(n-1)+...+\log(1)\geq\log n+\log(n-1)+...+\log(\frac{n}{2}+1)\geq\frac{n}{2}\log(\frac{n}{2})=\frac{n}{2}\log n-\frac{n}{2}\in\Omega(n\log n). \square

non-comparison-based sorting can be O(n) under assumptions

bucket sort (single digit)

A B A +-----+ +-----+ +-----+ | 123 | |B[0] = 230 -> 320 -> 210 | 230 | +-----+ +-----+ +-----+ | 230 | |B[1] = 021 -> 101 | 320 | +-----+ +-----+ +-----+ | 021 | |B[2] = 232 | 210 | +-----+ +-----| +-----+ | 320 | +--> |B[3] = 123 +--> | 021 | +-----+ +-----+ +-----+ | 210 | | 101 | +-----+ +-----+ | 232 | | 232 | +-----+ +-----+ | 101 | | 123 | |-----| +-----+ ^ sort by this digit
// A: array of size n, contains numbers with digits in {0, ..., R-1}
// d: index of the digit by which we wish to sort
bucket_sort(A[n], d):
    auto B = new array<list<int>>(length=R)   // space: R+n
    for i = 0, n-1:
        B[dth digit of A[i]].append(A[i])
    auto i = 0
    for j = 0, R-1:
        while B[j] is not empty:
            A[i++] = B[j].front
            B[j].front = B[j].next

this is stable: equal items stay in their original order. runtime: O(n+R), auxiliary space: O(n+R)

count sort:

key_indexed_count_sort(A[n], d):
    // count how many of each kind there are
    auto count = new array(size=R, fill=0)
    for i = 0, n-1:
        count[dth digit of A[i]]++
    // find left boundary of each kind
    auto idx = new array(size=R)
    idx[0] = 0
    for i = 1, R-1:
        idx[i] = idx[i-1] + count[i-1]
    // move to new array in sorted order, then copy back
    auto aux = new array(size=n)
    for i = 0, n-1:
        aux[ idx[dth digit of A[i]] ] = A[i]
        idx[dth digit of A[i]]++
    A = aux

MSD-radix-sort (most significant digit)

// sort array of m-digit radix-R numbers recursively
msd_radix_sort(A[n], l=0, r=n-1, d=1 /* digit */):
    if l < r:
        key_indexed_count_sort(A[l:r], d)
        if d < m:
            for i = 0, R-1:
                auto [li, ri] = // boundaries of ith bin (A[li:ri] all have dth digit i)
                msd_radix_sort(A, li, ri, d+1)

LSD-radix-sort

lsd_radix_sort(A[n]):
    for d = m, 1, -1:
        key_indexed_count_sort(A, d)

Week 5. June 9

dictionary ADT

operations:

elementary implementations

common assumptions

using unordered array/linked list:

using ordered array:

binary search trees

BST_search(k)

BST_search(root, k):
    if root == NULL:       return /* not found */
    else if k == root.key: return root
    else if k < root.key:  return BST_search(root.left, k)
    else:                  return BST_search(root.right, k)

BST_insert(k, v)

BST_insert(root, k, v):
    if root == NULL: return new node(k, v)
    if k < root.key: root.left = BST_insert(root.left, k, v)
    else:            root.right = BST_insert(root.right, k, v)
    return root

BST_delete(k)

BST_delete(root, k):
    /* locate k */
    if root == NULL: return root
    if k < root.key:
        root.left = BST_delete(root.left, k)
        return root
    else if k > root.key:
        root.right = BST_delete(root.right, k)
        return root
    /* k has at most one child */
    if root.left == NULL:
        auto &tmp = root.right
        delete root
        return tmp
    else if root.right == NULL:
        auto &tmp = root.left
        delete root
        return tmp
    /* k has two children */
    else:
        auto succ_parent = root
        /* find successor */
        auto succ = root.right
        while succ.left != NULL:
            succ_parent = succ
            succ = succ.left
        if succ_parent != root: succ_parent.left = succ.right
        else:                   succ_parent.right = succ.right
        swap(root, succ)
        delete succ
        return root

all three operations cost \Theta(h), where h is the height of the tree: \Theta(\log n) for a balanced tree, but \Theta(n) in the worst case.

AVL tree

eg.
img

theorem. any AVL tree with n nodes has height h\in\Theta(\log n).
proof. define N(h) to be the least number of nodes in a height-h AVL tree. we have N(0)=1, N(1)=2, N(2)=4, .... suppose a tree has height h; wlog assume its left subtree has height h-1, then by the defn of AVL tree its right subtree has height h-1 or h-2. since we are finding a lower bound, take h-2. hence we have N(h)=N(h-1)+N(h-2)+1.

will show by induction that N(h)\geq(\sqrt{2})^h. base: h=0, we have 1\geq 1. inductive: N(h)=N(h-1)+N(h-2)+1\geq 2N(h-2)\overset{\text{IH}}{\geq}(\sqrt{2})^2(\sqrt{2})^{h-2}=(\sqrt{2})^h.

so we have n\geq N(h)\geq(\sqrt{2})^h\Longrightarrow h\leq\log_{\sqrt{2}}n=2\log n\in O(\log n). the other direction holds since any binary tree with n nodes has height \Omega(\log n). \square

AVL_insert(k, v)

AVL_insert(root, k, v):
    auto &z = BST_insert(root, k, v)
    z.height = 0
    while z != NULL:
        setHeightFromChildren(z)
        if |z.left.height - z.right.height| == 2:
            AVL_fix(z)   // later
            break
        else:
            z = z.parent

setHeightFromChildren(u):
    u.height = 1 + max(u.left.height, u.right.height)

eg.

+---+ +---+ +---+ | 31| | 31| | 31| | 2 | | 2 | | 2 | /+---+\ /+---+\ /+---+\ / \ / \ / \ / \ / \ / \ +--+ +--+ +--+ +--+ +--+ +--+ |28| |37| |28| |37| |28| |46| |0 | |1 | --> |0 | 2|1 | --> |0 | /|1 | +--+ +--+\ +--+ +--+\ +--+ / +--+\ \ \ / \ \ \ / \ +--+ +--+ +--+ +--+ |46| |46| |37| |50| |0 | 1|0 | |0 | |0 | +--+ +--+\ +--+ +--+ \ \ +--+ |50| 0|0 | +--+ not balanced! rotated left

AVL_fix(z)

AVL_fix(z):
    // find child and grand-child that go deepest
    Node &x, &y
    if z.right.height > z.left.height: y = z.right
    else:                              y = z.left
    if y.right.height > y.left.height: x = y.right
    else:                              x = y.left
    // apply appropriate rotation to restructure at x, y, z
    if x.key < y.key < z.key:
        rotate_right(z)
    else if y.key < x.key < z.key:
        rotate_left(y)
        rotate_right(z)
    else if z.key < x.key < y.key:
        rotate_right(y)
        rotate_left(z)
    else /* if z.key < y.key < x.key */:
        rotate_left(z)

// used for left-left imbalance
rotate_right(z):
    // rotate current subtree
    auto &y = z.left
    /* make y.right the new left child of z */
    /* make y the new root of the subtree */
    /* make z the new right child of y */
    setHeightFromChildren(z)
    setHeightFromChildren(y)
    // update the parent that originally points to z
    auto p = z.parent
    if p != NULL:
        if z == p.left: p.left = y
        else:           p.right = y
    else:
        /* make y the overall root of the tree */

// used for right-right imbalance
rotate_left(z):
    /* same as rotate_right, swap left and right */

AVL_delete(k)

AVL_delete(root, k):
    auto &z = BST_delete(root, k)
    while z != NULL:
        setHeightFromChildren(z)
        if |z.left.height - z.right.height| == 2:
            AVL_fix(z)   // no break here - may have to call more than once
        z = z.parent

Week 6. June 16

dictionary (continued)

using binary search trees:

using AVL trees:

skip lists

img

search(k)

get_predecessors(k):   // finds all nodes before where k would be
    auto &p = topmost left sentinel
    auto P = new stack<node>
    P.push(p)
    // go down & right as far as possible
    while p.below != NULL:
        p = p.below
        while p.after.key < k:
            p = p.after
        P.push(p)
    return P

skiplist.search(k):
    auto P = get_predecessors(k)
    auto &p0 = P.top()
    if p0.after.key == k: return p0.after
    else: /* not found */

takes O(\log n) expected time.

insert(k, v)

eg. insert k=100. coin tosses are HHHT, so i = 3. first increase the height so h=4; we insert one node for k at the base level, then insert 3 more in the tower
img

delete(k)

skiplist.delete(k):
    auto P = get_predecessors(k)
    while P:
        auto &p = P.pop()
        if p.after.key == k:
            // remove
            auto &tmp = p.after
            p.after = p.after.after
            delete tmp
        else:
            break
    auto &p = topmost left sentinel
    // remove "empty" levels until there is only one "empty" level at top
    while p.below.after is inf-sentinel:
        // the two top lists are both only sentinels, remove one
        auto &tmp1 = p.below, &tmp2 = p.after.below
        p.below = p.below.below
        p.after.below = p.after.below.below
        delete tmp1; delete tmp2

theorem. the average total number of nodes in a skip list containing n keys is 2n.
proof. for one tower j (above a key k), we have P(\text{tower } j\text{ has height}\geq i)=\frac{1}{2^i}, so E(\#\text{ nodes at level }i\text{ in tower }j)=1\cdot\frac{1}{2^i}+0\cdot(1-\frac{1}{2^i})=\frac{1}{2^i}. so E(\#\text{ nodes at level }i)=\sum_{j=1}^n\frac{1}{2^i}=\frac{n}{2^i}. for all towers, E(\#\text{ nodes in all towers})=\sum_{i\geq 0}\frac{n}{2^i}=2n. \square

theorem. the average height of a skip list containing n keys is at most \log n+1.
proof. define V_i=\begin{cases} 0 &,\text{level } i \text{ is empty} \\ 1 &,\text{else} \end{cases} for i\geq 1. then \text{height}=\sum_{i\geq 1}V_i, so E(\text{height})=E(\sum_{i\geq 1}V_i)=\sum_{i\geq 1}E(V_i).
note since V_i\in\{0,1\} we have E(V_i)\leq 1.
also V_i is at most the number of nodes at level i, hence E(V_i)\leq E(\#\text{ nodes at level }i)=\frac{n}{2^i}.
combining, E(V_i)\leq\min(1,\frac{n}{2^i}). let i_0=\log n; then \sum_{i\geq 1}E(V_i)=\sum_{i=1}^{i_0}E(V_i)+\sum_{i>i_0}E(V_i)\leq\sum_{i=1}^{\log n}1+(\frac{1}{2}+\frac{1}{4}+\frac{1}{8}+...)=\log n+1. \square

reordering items

optimal static ordering

eg.

key                    A      B      C      D      E
# times we access it   2      8      1      10     5
access probability     2/26   8/26   1/26   10/26  5/26

if use regular linear search,

linear_search(A[n], k):
    for i = 0, n-1:
        if A[i].key == k: return true
    return false

to access ith element we make i comparisons

eg. suppose we have keys k_1,...,k_n with access probabilities p_1+...+p_n=1.
use regular linear search. (1) suppose the distribution is uniform, then the average cost of a search is 1\cdot\frac{1}{n}+2\cdot\frac{1}{n}+...+n\cdot\frac{1}{n}=\frac{n+1}{2}.

(2) suppose p_1=\frac{1}{2},p_2=\frac{1}{2^2},...,p_{n-1}=\frac{1}{2^{n-1}},p_n=\frac{1}{2^{n-1}}. then the average cost is 1\cdot\frac{1}{2}+2\cdot\frac{1}{2^2}+...+(n-1)\frac{1}{2^{n-1}}+n\cdot\frac{1}{2^{n-1}}<2. (constant time)

(3) suppose the array is as in (2), but in reverse order. the average cost is b_n=1\cdot\frac{1}{2^{n-1}}+2\cdot\frac{1}{2^{n-1}}+...+(n-1)\cdot\frac{1}{2^2}+n\cdot\frac{1}{2}. note b_n\leq n(\frac{1}{2^{n-1}}+\frac{1}{2^{n-1}}+...+\frac{1}{2})=n, and b_n\geq n\cdot\frac{1}{2}, so b_n\in\Theta(n).

claim. over all possible static orderings, the one that sorts items by non-increasing access probability minimizes the expected cost.
let L=k_1...k_n, where p_1,...,p_n are the access probabilities. suppose L minimizes the average cost among all n! permutations; we claim p_1\geq p_2\geq...\geq p_n.
proceed by contradiction: suppose there exists 1\leq i\leq n-1 such that p_i<p_{i+1}. the average cost is c(L)=\sum_{j=1}^np_jj=\sum_{j\neq i,\,i+1}p_jj+p_ii+p_{i+1}(i+1). let L' be obtained by swapping k_i and k_{i+1} in L, then c(L')=\sum_{j\neq i,\,i+1}p_jj+p_{i+1}i+p_i(i+1). then c(L)-c(L')=p_{i+1}-p_i>0, contradicting minimality of L. \square

dynamic ordering

if we do not know the access probabilities ahead of time

MTF works well in practice; can show MTF is "2-competitive": no more than twice as bad as the optimal static ordering

Week 7. June 23

theorem. in the comparison model, \Omega(\log n) comparisons (on keys) are required to search a size-n dictionary.
proof. the number of distinct answers is n+1 (n indices + not found), corresponding to leaves of the decision tree, so the tree must have at least n+1 leaves. each node has at most three children (<, =, >), so the tree has height at least \log_3(n+1). \square

interpolation_search(A[n], k):
    auto l = 0
    auto r = n-1
    while A[r] != A[l] and k >= A[l] and k <= A[r]:
        auto m = l + floor( (k-A[l]) / (A[r]-A[l]) * (r-l) )
        if A[m] < k:      l = m + 1
        else if A[m] > k: r = m - 1
        else:             return m
    if A[l] == k: return l
    return /* not found */

works well if keys are uniformly distributed:

worst case: \Theta(n), eg searching for n-1 in [1,2,3,...,n-1,n^{100}]

tries

standard tries

defns.

we use \Sigma=\{0,1\} and words are bit-strings; $ denotes end-of-word and is not counted in the length.

eg. a trie for S={00$,0001$,01001$,011$,01101$,110$,1101$,111$}
img

search(x)

trie_search(root, x, d = 0):   // d: level of root, x: word
    if root is leaf: return root
    else:
        auto c = root.children[x[d]]
        if c != NULL: return trie_search(c, x, d+1)
        else: /* not found */

insert(x)

delete(x)

time complexity of all operations: \Theta(|x|) (length of the string).

variation 1. no leaf labels

do not store actual keys at the leaves

variation 2. allow proper prefixes

allow prefixes in dictionary

img

variation 3. remove chains to labels (pruned trie)

stop adding nodes to trie as soon as the key is unique

img

compressed tries (Patricia tries)

img

search(x)

patricia_trie_search(root, x):
    if root is leaf:   // lexical comparison
        return x == root.key ? root : /* not found */
    else:
        auto d = root.index
        if x.len < d: /* not found */
        auto c = root.children[x[d]]
        if c != NULL: return patricia_trie_search(c, x)
        else: /* not found */

insert(x)

delete(x)

time complexity of all operations: O(|x|).

multiway tries: larger alphabet

Week 8. June 29

hashing

eg. suppose we pick n values from \{0,...,M-1\}, independently and uniformly distributed. what is the probability of having a collision?
p(n,M)=1-\frac{M}{M}\cdot\frac{M-1}{M}\cdots\frac{M-(n-1)}{M}.
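a quick numeric check of this formula in C++ (collision_probability is a hypothetical helper, not course code); with M=365 it reproduces the birthday paradox:

#include <cstdio>

// p(n, M) = 1 - prod_{i=0}^{n-1} (M-i)/M
double collision_probability(int n, int M) {
    double no_collision = 1.0;
    for (int i = 0; i < n; ++i)
        no_collision *= double(M - i) / M;
    return 1.0 - no_collision;
}

int main() {
    // 23 values from {0, ..., 364} already collide with probability > 1/2
    std::printf("%f\n", collision_probability(23, 365));   // ~0.507
}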

method 1: separate chaining

eg. M = 11, h(k) = k mod 11
inserting 79
img

insert takes O(1); search and delete take O(1 + size of bucket T[h(k)])

re-hashing:

method 2: probe sequencing

probe_sequence_insert(T, k, v):
    for i = 0, M-1:   // h is the probe sequence
        if T[h(k, i)] == NULL or T[h(k, i)] is "deleted":
            T[h(k, i)] = {k, v}
            return
    throw   // fail to insert, need to rehash

probe_sequence_search(T, k):
    for i = 0, M-1:
        if T[h(k, i)] == NULL:
            return /* not found */
        else if T[h(k, i)] has key k:
            return T[h(k, i)]
        // ignore "deleted" and keep searching
    return /* not found */

independent hash functions

method 3: double hashing

why coprime: we want the probe sequence to cycle through the whole table when inserting items; if M is a multiple of h_1(k) (more generally, if they share a common factor), the probes only reach part of the table.

eg. M=11, h_0(k)=k\bmod 11, h_1(k)=1+\lfloor 10(0.618k-\lfloor 0.618k\rfloor)\rfloor
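a small C++ sketch of the resulting probe sequence, assuming the standard double-hashing form h(k,i)=(h_0(k)+i\cdot h_1(k))\bmod M; since h_1(k)\in\{1,...,10\} is coprime to M=11, the probes visit every slot:

#include <cstdio>

int h0(int k) { return k % 11; }
int h1(int k) {                         // in {1, ..., 10}, never 0
    double x = 0.618 * k;
    return 1 + int(10 * (x - int(x)));
}

int main() {
    int k = 42;
    // gcd(h1(k), 11) = 1, so all 11 slots appear exactly once
    for (int i = 0; i < 11; ++i)
        std::printf("%d ", (h0(k) + i * h1(k)) % 11);
}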

method 4: cuckoo hashing

use two independent hash functions h_0,h_1 and two tables T_0,T_1

cuckoo_insert(k, v):
    auto i = 0
    do:
        if T_i[h_i(k)] == NULL:
            T_i[h_i(k)] = {k, v}
            return
        swap({k, v}, T_i[h_i(k)])   // kick out the current occupant
        i = 1-i
    while repeated less than 2n times
    throw /* fail to insert, need rehashing */

summary of open addressing strategies

for any open addressing scheme, must have load factor \alpha<1; cuckoo hashing requires \alpha<\frac{1}{2}

average case cost   search (unsuccessful)     search (successful)                       insert
linear probing      \frac{1}{(1-\alpha)^2}    \frac{1}{1-\alpha}                        \frac{1}{(1-\alpha)^2}
double hashing      \frac{1}{1-\alpha}        \frac{1}{\alpha}\log\frac{1}{1-\alpha}    \frac{1}{1-\alpha}
cuckoo hashing      1 (worst case)            1 (worst case)                            \frac{\alpha}{(1-2\alpha)^2}

all operations take O(1) average-case time if the hash function is uniform and the load factor is kept small, but the worst-case time is (usually) \Theta(n).

universal hashing

advantages of balanced search trees:

advantages of hash tables:

Week 9. July 6

using unsorted list/array/hash table: requires \Omega(n) time; check explicitly for each item whether it is in the range

using sorted array: done in O(\log n+s) time

using BST: done in O(height+s) more on this later

(orthogonal) multidimensional range search: given a rectangle A, find all points that lie within A

quadtrees

we have n points S=\{(x_0,y_0),...,(x_{n-1},y_{n-1})\} in the plane

structure:

eg.
img

search: analogous to BST and tries

qtree_rangesearch(root, A):   // A: query rectangle
    // can determine result immediately
    if root.region ⊆ A:
        yield all points below root
        return
    else if root.region ⋂ A == Ø:
        return
    // node is a boundary node, recurse
    if root is leaf:
        auto &p = point stored in root
        if p in A: yield p
        else: return
    else:
        for child in root.children:   // 4 quadrants
            yield from qtree_rangesearch(child, A)

we can compute the region each time instead of storing it.

insert:

delete:

analysis

summary

when dimension is 1, quadtree degenerates to a trie

kd-trees

building tree

build kd-tree with initial split by x on points S:

runtime: \Theta(nh) expected time. can be reduced by presorting to \Theta(nh+n\log n) worst case.

problem: after inserting/deleting, the split might no longer be at exact median and the height is no longer guaranteed to be O(logn). can be remedied by allowing a certain imbalance and re-building the tree when it becomes too unbalanced.

eg.
img

analysis

assume points are in general position: no two points have the same x-coordinate or y-coordinate, for building the kd-tree

for range seach:

higher dimensions

(assume general position and d is a constant):

range search of 1d BST

first consider 1d range search:

BST_rangesearch(root, x1, x2):   // returns keys in tree that are in range [x1, x2]
    if root == NULL: return {}
    else if x1 <= root.key <= x2:
        auto L = BST_rangesearch(root.left, x1, x2)
        auto R = BST_rangesearch(root.right, x1, x2)
        return L ∪ {root} ∪ R
    else if root.key < x1:
        return BST_rangesearch(root.right, x1, x2)
    else if root.key > x2:
        return BST_rangesearch(root.left, x1, x2)

observation: keys are reported in original sorted order; copies of duplicates are reported

eg. we can partition them into 3 parts

img

range tree

eg.
img

space analysis

dictionary operations

range search:

range search run time:

higher dimensions

faster search, slower construction and more space than kd-trees.

Week 10. July 14

string matching

defn. the empty string \Lambda is considered a substring, prefix, and suffix of any string.

pattern matching algorithms consist of guesses and checks:

brute force algorithm

brute_force_PM(T[n], P[m]):
    for i = 0, n-m:
        if T[i:i+m-1] == P: return i
    return FAIL

worst possible input: T=a^n, P=a^{m-1}b; worst-case running time: \Theta((n-m+1)m)=\Theta(mn).

preprocessing pattern: Rabin-Karp fingerprint algorithm

use hashing to eliminate gueses

rabin_karp_simple(T[n], P[m]):
    auto hp = h(P)
    for k = 0, n-m:
        auto ht = h(T[k:k+m-1])
        if ht == hp:
            if T[k:k+m-1] == P: return k
    return FAIL

using efficient rehash:

rabin_karp_fast(T[n], P[m]):
    auto M = suitable prime number
    // h maps a digit string s_{m-1}...s_1s_0 to its value mod M
    auto hp = h(P)
    auto ht = h(T[0:m-1])
    auto s = 10^(m-1) % M
    for i = 0, n-m:
        if ht == hp:
            if T[i:i+m-1] == P: return i
        // compute next ht from the previous ht:
        // the T segment loses one char at the front, gains the next char at the end
        // 1. ht = ((ht - T[i]*s)*10 + T[i+m]) % M
    return FAIL

remark. to explain 1., suppose h(x)=x\operatorname{mod}M and
T_i=t_i\cdot 10^{m-1}+...+t_{i+m-2}\cdot 10+t_{i+m-1}
in the ith iteration. then for the next iteration,
T_{i+1}=t_{i+1}\cdot 10^{m-1}+...+t_{i+m-1}\cdot 10+t_{i+m}=(T_i-t_i\cdot10^{m-1})\cdot 10+t_{i+m}
so we can compute the hash of the (i+1)th iteration from the previous hash:
\begin{aligned} h(T_{i+1})&=((T_i-t_i\cdot10^{m-1})\cdot 10+t_{i+m})\operatorname{mod}M\\ &=((T_i\operatorname{mod}M-t_i\cdot 10^{m-1}\operatorname{mod}M)\cdot 10+t_{i+m})\operatorname{mod}M\\ &=((h(T_i)-t_i\cdot (10^{m-1}\operatorname{mod}M))\cdot 10+t_{i+m})\operatorname{mod}M \end{aligned}
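a self-contained C++ sketch of this rolling update over a decimal string (M=97 is an arbitrary illustrative prime, not from the course):

#include <cstdio>
#include <string>

int main() {
    std::string T = "31415926535";
    int m = 5;
    long long M = 97, s = 1, ht = 0;
    for (int i = 0; i < m - 1; ++i) s = s * 10 % M;                  // s = 10^{m-1} mod M
    for (int i = 0; i < m; ++i) ht = (ht * 10 + (T[i] - '0')) % M;   // h(T[0:m-1])
    for (int i = 0; i + m < (int)T.size(); ++i) {
        // drop the leading digit, shift, append the next digit (all mod M)
        ht = ((ht - (T[i] - '0') * s % M + M) * 10 + (T[i + m] - '0')) % M;
        std::printf("h(T[%d:%d]) = %lld\n", i + 1, i + m, ht);
    }
}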

preprocessing pattern: Boyer-Moore (-Horspool) algorithm

brute-force search with changes:

boyer_moore(T[n], P[m]):
    auto L = last occurrences
    auto k = 0
    while k <= n-m:
        auto j = m-1
        while j >= 0:                     // match backwards from the end of P
            if T[k+j] != P[j]: break
            --j
        if j == -1: return k
        k = k + max(1, j - L(T[k+j]))     // shift by the last-occurrence rule
    return FAIL

eg.
img

preprocessing pattern: Knuth-Morris-Pratt algorithm

eg. to search ababaca, use this automation:
img

eg. compute a failure array for P = ababaca
img

kmp(T[n], P[m]):
    auto F = failure_array(P)
    auto i = 0   // current character of T to parse
    auto j = 0   // current state
    while i < n:
        if P[j] == T[i]:
            if j == m-1:
                return i-m+1
            else:
                i = i+1
                j = j+1
        else:   // failure
            if j > 0:
                j = F[j-1]
            else:
                i = i+1
    return FAIL

failure_array(P[m]):
    auto F = new array(m)
    F[0] = 0
    auto i = 1
    auto j = 0
    while i < m:
        if P[i] == P[j]:
            j = j+1
            F[i] = j
            i = i+1
        else if j > 0:
            j = F[j-1]
        else:
            F[i] = 0
            i = i+1
    return F

preprocessing text: suffix trees

assume we have a suffix tree of text T, to search for pattern P of length m:

eg. search reaches a leaf with remaining parts of P, compare P with T to test whether P is suffix. in this case it is not so P does not exist in T

img

summary

              brute-force   RK                  BM                    DFA              KMP     suffix trees
preprocess    —             O(m)                O(m+|\Sigma|)         O(m|\Sigma|)     O(m)    O(n^2)\rightarrow O(n)
search        O(nm)         O(n+m) (expected)   O(n) (often faster)   O(n)             O(n)    O(m)
extra space   —             O(1)                O(m+|\Sigma|)         O(m|\Sigma|)     O(m)    O(n)

Week 11. July 21

data compression

defn. compression ratio: \frac{|C|\log|\Sigma_C|}{|S|\log|\Sigma_S|} (length of compressed text times log size of compressed alphabet, over length of original text times log size of original alphabet)

types of compression:

for media files, lossy, local compression is useful. we concentrate on physical, lossless compression as it is general.

character-by-character encoding

we map each character in the source alphabet to a string in the coded alphabet. define E:\Sigma_S\mapsto\Sigma^*_C; for c\in\Sigma_S, we call E(c) the codeword of c.

have 2 possibilities:

encode(E, S[n]):   // E: encoding dictionary, S: text with characters in ΣS
    string C
    for i = 0, n-1:
        C.append(E.search(S[i]))
    return C

for decoding

eg.
img

prefixfree_decode(T, C[n]):   // T: trie of a prefix-free code, C: text with characters in ΣC
    string S
    auto i = 0
    while i < n:
        auto &r = T.root
        while r is not leaf:
            if i == n: /* invalid encoding */
            auto c = child of r that is labelled with C[i]
            i = i+1
            r = c
        S.append(character stored at r)
    return S

runtime is \Theta(|C|).

we can also encode directly from the trie:

prefixfree_encode(T, S[n]):   // T: trie of a prefix-free code, S: text with characters in ΣS
    auto L = array of nodes in T indexed by ΣS
    for leaf in T.leaves:
        L[character at leaf] = leaf
    string C
    for i = 0, n-1:
        auto w = ""
        auto v = L[S[i]]
        // walk up the tree from the leaf to build the codeword
        while v is not root:
            w.prepend(character from v to its parent)
            v = v.parent
        C.append(w)
    return C

runtime is O(|T|+|C|), which is O(|\Sigma_S|+|C|) if T is full

Huffman code

for a given source text S, determine the best trie (shortest coded text):

  1. determine the frequency of each character c\in\Sigma_S in S
  2. for each c, create a height-0 trie holding c
  3. tries have a weight: sum of frequencies of all letters in trie. initially these are just the character frequencies
  4. find the two tries with the minimum weight
  5. merge these tries with new interior node; new weight is the sum
  6. repeat last two steps until there is only one trie left
  7. encode text using this trie
huffman_encoding(S[n]):
    auto f = array indexed by ΣS, initially all 0          // O(|ΣS|)
    for i = 0, n-1:                                        // O(n)
        f[S[i]] += 1
    auto Q = min pq
    for c in ΣS:                                           // O(|ΣS| log |ΣS|)
        if f[c] > 0:
            Q.insert(single-node trie for c with weight f[c])
    while Q.size > 1:                                      // O(|ΣS| log |ΣS|)
        auto &T1 = Q.deleteMin()
        auto &T2 = Q.deleteMin()
        Q.insert(trie with T1, T2 as subtries and weight = T1.weight + T2.weight)
    auto &T = Q.deleteMin()
    return prefixfree_encode(T, S), T                      // O(|ΣS| + |C|)

run time of encoding is O(|\Sigma_S|\log|\Sigma_S|+|C|)
run time of decoding is O(|C|)

eg.
img
compression ratio: \frac{25\cdot\log 2}{11\cdot\log 5}\approx 0.97

run-length encoding

idea:

Elias gamma coding: to encode k, write \lfloor\log k\rfloor copies of 0, followed by the binary representation of k (which always starts with 1)

k     \lfloor\log k\rfloor    k in binary    encoding
1     0                       1              1
2     1                       10             010
3     1                       11             011
4     2                       100            00100
5     2                       101            00101
...

eg. 00000 111 0000 encodes to (0; 5, 3, 4), which converts to 0 00101 011 00100 (for decoding, x consecutive 0's mean we read the following x+1 bits as a length)

remark. a binary number x has \lfloor\log x\rfloor+1 digits.
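a minimal C++ encoder for Elias gamma codes (an illustrative sketch; elias_gamma is a hypothetical helper), reproducing the table above:

#include <cstdio>
#include <string>

// floor(log k) zeros, then k in binary
std::string elias_gamma(unsigned k) {   // requires k >= 1
    std::string bits;
    for (unsigned t = k; t > 0; t >>= 1)
        bits.insert(bits.begin(), char('0' + (t & 1)));
    return std::string(bits.size() - 1, '0') + bits;
}

int main() {
    for (unsigned k = 1; k <= 5; ++k)
        std::printf("%u -> %s\n", k, elias_gamma(k).c_str());
}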

RLE_encoding(S[n]):
    string C
    C.append(S[0])
    auto i = 0
    while i < n:
        auto k = 1
        while i+k < n and S[i+k] == S[i]:
            k = k+1
        i = i+k
        // compute and append the Elias gamma code of k
        string K
        while k > 1:
            C.append(0)
            K.prepend(k % 2)
            k = floor(k/2)
        K.prepend(1)
        C.append(K)
    return C

RLE_decoding(C):
    string S
    auto b = C.pop_front()            // first bit of the source text
    do:
        auto l = 0
        while C.front() == 0:         // l = length of the base-2 number - 1
            l = l+1
            C.pop_front()
        auto k = 1
        for _ = 1, l:
            k = k*2 + C.pop_front()   // reconstruct the count k
        for _ = 1, k:
            S.append(b)
        b = 1-b
    while C is not empty
    return S
    // if pop is called when there are no bits, then C was not valid input

bzip2

uses text-transform: change input to different text that is not necessarily shorter, but has desirable qualities

  1. T_0\rightarrow T_1 Burrows-Wheeler transform
  2. T_1\rightarrow T_2 move-to-front transform
  3. T_2\rightarrow T_3 modified RLE
  4. T_3\rightarrow T_4 Huffman encoding

move-to-front transform:

MTF_encode(S):
    auto L = array with ΣS in some pre-agreed, fixed order (usually ASCII)
    for char c in S:
        yield index i such that L[i] == c
        for j = i-1, 0, -1:
            swap(L[j], L[j+1])

MTF_decode(C):
    auto L = array with ΣS in some pre-agreed, fixed order (usually ASCII)
    for int i in C:
        yield L[i]
        for j = i-1, 0, -1:
            swap(L[j], L[j+1])
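a small worked trace (assuming a pre-agreed order [a,b,n] rather than full ASCII): encoding S = banana gives

b -> 1, L = [b,a,n]
a -> 1, L = [a,b,n]
n -> 2, L = [n,a,b]
a -> 1, L = [a,n,b]
n -> 1, L = [n,a,b]
a -> 1, L = [a,n,b]

so C = 1 1 2 1 1 1: characters recurring in similar contexts become runs of small numbers, which the later RLE and Huffman stages compress well.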

Burrows-Wheeler transform

idea:

detail:

eg. encode example
img
observe substring alf occurs three times and causes runs lll and aaa in C.

eg. decode example
img

BWT_decode(C[n]):
    auto A = array(n)
    for i = 0, n-1:
        A[i] = (C[i], i)   // store character and index
    stable_sort(A)
    for j = 0, n-1:
        if C[j] == $: break
    string S
    do:
        j = A[j].second
        S.append(C[j])
    while C[j] != $
    return S

encoding costs O(n(n+|\Sigma_S|)) using radix sort

decoding costs O(n+|\Sigma_S|). both use O(n) space.

they need all the text (no streaming possible). BWT is a block compression method.

Week 12. July 28

Lempel-Ziv-Welch

eg. encoding example
img

LZW_encode(S):
    auto D = dictionary with ASCII in a trie
    auto idx = 128
    while S is not empty:
        auto &v = D.root
        auto K = S.front()
        // find longest existing prefix
        while v has child c labelled K:
            v = c
            S.pop_front()
            if S is empty: break
            K = S.front()
        // output list of numbers (idx); this is usually converted to a
        // bit-string with fixed-length encoding using 12 bits (if codes
        // run out, start from the beginning, not shown)
        yield codenumber stored at v
        if S is not empty:
            create child of v labelled K with codenumber idx in D
            idx++

LZW_decode(C):
    auto D = dictionary mapping from 0:127 to ASCII
    auto idx = 128
    string S
    auto code = C.pop_front()
    auto s = D[code]
    S.append(s)
    while C is not empty:
        auto sprev = s
        code = C.pop_front()
        if code < idx:
            s = D[code]
        else if code == idx:   // special situation: code not yet in D
            s = sprev + sprev[0]
        else: /* invalid */
        S.append(s)
        D.insert(idx, sprev + s[0])
        idx++
    return S

summary

img

external memory

observation: accessing a single location in external memory automatically loads a whole block (page).

objective: minimize page loads

tree-based dictionaries have poor memory locality: if an operation accesses m nodes, it must access m spaced-out memory locations. in an AVL tree, \Theta(\log_2 n) pages are loaded in the worst case. better solution: do more in a single node
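to make the gap concrete (numbers chosen for illustration): with n=2^{30} keys, an AVL search loads about \log_2 n=30 pages, while a tree whose nodes branch B=1024=2^{10} ways loads only about \log_B n=\frac{30}{10}=3 pages.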

2-3 trees

is a balanced search tree.

eg.
img

search(k)

twothreetree_search(k, v = root):
    auto [c0, k1, ..., kd, cd] = keys and children at v in order
    if k >= k1:
        auto i = max index such that ki <= k
        if ki == k: return ki
    else:
        auto i = 0
    return twothreetree_search(k, ci)   // recurse into subtree ci (not found if v is a leaf)

insert(k, v)

delete(k)

(a, b)-trees

if a\leq\frac{b+1}{2}, then search, insert, delete work like for 2-3 trees.

the height of a tree with n KVPs is O(\log_an) and \Omega(\log_bn).
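a sketch of where these bounds come from (using the convention that a tree storing n KVPs has n+1 empty subtrees as leaves, all on the same level): the root has at least 2 children and every other internal node at least a, so a tree of height h has at least 2a^{h-1} leaves; every node has at most b children, so it has at most b^h leaves. hence

2a^{h-1}\leq n+1\leq b^h \Longrightarrow \log_b(n+1)\leq h\leq 1+\log_a\frac{n+1}{2}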

B-trees

a B-tree of order m is a (\lceil\frac{m}{2}\rceil,m)-tree. a 2-3 tree is a B-tree of order 3.

search, insert, delete each require \Theta(\text{height})=\Theta(\log_mn) node operations / page loads.

Week 13. August 3

external sorting

given an array A of n numbers, sort them. n is huge and A is stored in blocks in external memory.

d_way_merge(S1, ..., Sd):   // S1, ..., Sd are sorted sets
    auto P = new min_priority_queue
    auto S = new set
    for i = 1, d:
        P.insert(key=Si.first(), val=i)
    while P is not empty:
        auto (x, i) = P.deleteMin()
        remove x from Si and append it to S
        if Si is not empty:
            P.insert(Si.first(), i)
    return S

to sort a long array on the disk

observations: