Website: https://student.cs.uwaterloo.ca/~cs488/Winter2022/
http://pedrinho.cs.uwaterloo.ca/~gvgbaran/CS488/Winter22/CS488-W22.html CS488


Instructor: Gladimir Baranoski

Time: TTh 8:30AM - 9:50AM

marking: 5 assignments at 10% each + 5 quizzes at 4% each + final exam 30% (.1A * 5 + .04Q * 5 + .3FE)

Week 2. Jan 11

forward rendering: rendering primitives are transformed from model to device (hardware and OpenGL)

backward rendering: start with a point in image and work out what model primitives project to it (raytracing)

the graphics pipeline for forward projection:

model (MCS) --[modelling transformations M1, M2, ...]--> 3D world scene (WCS) --[viewing transformation V]--> 3D view scene (VCS) --[projection P -> clip -> normalize]--> 2D/3D device scene (NDCS) --[rasterization]--> 2D image (DCS/SCS)

MCS: modelling coordinate system; WCS: world coordinate system; VCS: viewer coordinate system; NDCS: normalized device coordinate system; DCS/SCS: device/screen coordinate system

primitives:

algorithms:

available apis:

devices

calligraphic display devices: draw polygon and line segments directly

raster display devices: draw pixels

raster cathode ray tube (CRT):

liquid crystal display (LCD):

maths

points and vectors are different objects:

defn. a vector space VV is a set of vectors that

  1. has addition: u,vV    u+vV\mathbf{u},\mathbf{v}\in V\implies \mathbf{u}+\mathbf{v}\in V
  2. has scalar multiplication: uV    αuV\mathbf{u}\in V\implies\alpha\mathbf{u}\in V where α\alpha is member of some field
  3. axioms of vector spaces:
    1. addition commutes: u+v=v+u\mathbf{u}+\mathbf{v}=\mathbf{v}+\mathbf{u}
    2. addition associates: (u+v)+w=u+(v+w)(\mathbf{u}+\mathbf{v})+\mathbf{w}=\mathbf{u}+(\mathbf{v}+\mathbf{w})
    3. scalar multiplication distributes: α(u+v)=αu+αv\alpha(\mathbf{u}+\mathbf{v})=\alpha\mathbf{u}+\alpha\mathbf{v}
    4. unique zero element: 0+u=u\mathbf{0}+\mathbf{u}=\mathbf{u}
    5. field unit element: 1u=u1\mathbf{u}=\mathbf{u}

defn. set B={v1,...,vn}B=\{\mathbf{v}_1,...,\mathbf{v}_n\} spans VV iff any vV\mathbf{v}\in V can be written as a linear combination v=i[n]αivi\mathbf{v}=\sum_{i\in[n]}\alpha_{i}\mathbf{v}_i.

defn. a basis is any minimal spanning set. all bases are the same size.

defn. dimension is the number of vectors in a basis (we work in 2d and 3d spaces).

defn. an affine space is a set of vectors VV and a set of points P\mathcal{P} where

  1. vectors VV form a vector space
  2. points can be combined with vectors to make new points, ie P+vQP,QP,vVP+\mathbf{v}\Rightarrow Q\,\,\forall P,Q\in \mathcal{P},\mathbf{v}\in V

remark. affine spaces do not have an origin or angles

defn. a frame F=(v1,...,vn,O)F=(\mathbf{v_1},...,\mathbf{v}_n,O) is a vector basis plus an origin point OO.

the dimension of an affine space is same as that of VV.

defn. an inner product space for vector space VV is a binary operator V2RV^2\rightarrow\mathbb{R} with

  1. uv=vu\mathbf{u}\cdot\mathbf{v}=\mathbf{v}\cdot\mathbf{u}
  2. (u+v)w=uw+vw(\mathbf{u}+\mathbf{v})\cdot\mathbf{w}=\mathbf{u}\cdot\mathbf{w}+\mathbf{v}\cdot\mathbf{w}
  3. uu0\mathbf{u}\cdot\mathbf{u}\geq0
  4. (αu)v=α(uv)(\alpha\mathbf{u})\cdot\mathbf{v}=\alpha(\mathbf{u}\cdot\mathbf{v})

defn. a metric space is any space with a distance metric d(P,Q)d(P,Q) defined on its elements.

defn. a euclidean space is one where the distance metric is based on the dot product:

uv=uxvx+uyvy+uzvz=uvcos(uv)\begin{aligned} \mathbf{u}\cdot\mathbf{v}&=u_xv_x+u_yv_y+u_zv_z\\ &=||\mathbf{u}|||\mathbf{v}||\cos(\angle\mathbf{u}\mathbf{v}) \end{aligned}

defn. the norm of vector is u=uu|\mathbf{u}|=\sqrt{\mathbf{u}\cdot\mathbf{u}}.

defn. the scalar projection of vector u\mathbf{u} on v\mathbf{v} is uv=ucos(uv)=uvv\mathbf{u}\rightarrow\mathbf{v} =||\mathbf{u}||\cos(\angle\mathbf{u}\mathbf{v})=\frac{\mathbf{u}\cdot\mathbf{v}}{||\mathbf{v}||}.

defn. (perpendicularity) uv=0    uv\mathbf{u}\cdot\mathbf{v}=0\implies \mathbf{u}\perp\mathbf{v}.

defn. (cross product)

u×v=(uyvzuzvy,uzvxuxvz,uxvyuyvx)u×v=uvsin(uv)\begin{aligned} \mathbf{u}\times\mathbf{v}&=(u_yv_z-u_zv_y,u_zv_x-u_xv_z,u_xv_y-u_yv_x)\\ ||\mathbf{u}\times\mathbf{v}||&=||\mathbf{u}||||\mathbf{v}||\sin(\angle\mathbf{u}\mathbf{v}) \end{aligned}

defn. a cartesian space is a euclidean space with an orthonormal frame (i,j,k,O)(\mathbf{\mathbf{i},\mathbf{j},\mathbf{k}},O).

  1. orthogonal: ij=jk=ki=0\mathbf{i}\cdot\mathbf{j}=\mathbf{j}\cdot\mathbf{k}=\mathbf{k}\cdot\mathbf{i}=0
  2. normal: i=j=k=1|\mathbf{i}|=|\mathbf{j}|=|\mathbf{k}|=1

notation. the standard frame is FS=(i,j,k,O)F_S=(\mathbf{\mathbf{i},\mathbf{j},\mathbf{k}},O).

we use an extra coordinate to distinguish vectors and points:

Week 3. Jan 18

basic geometric transformations

assume the standard frame.

2d translation:

[10Δx01Δy001]T(Δx,Δy)[xy1]=[x+Δxy+Δy1]\overbrace{\left[\begin{array}{ccc} 1 & 0 & \Delta x \\ 0 & 1 & \Delta y \\ 0 & 0 & 1 \end{array}\right]}^{T(\Delta x, \Delta y)}\left[\begin{array}{l} x \\ y \\ 1 \end{array}\right]=\left[\begin{array}{c} x+\Delta x \\ y+\Delta y \\ 1 \end{array}\right]

2d scale about origin:

[sx000sy0001]S(sx,sy)[xy0 or 1]=[sxxsyy0 or 1]\overbrace{\left[\begin{array}{ccc} s_{x} & 0 & 0 \\ 0 & s_{y} & 0 \\ 0 & 0 & 1 \end{array}\right]}^{S\left(s_{x}, s_{y}\right)}\left[\begin{array}{c} x \\ y \\ 0 \text { or } 1 \end{array}\right]=\left[\begin{array}{c} s_{x} x \\ s_{y} y \\ 0 \text { or } 1 \end{array}\right]

2d rotation: counterclockwise about origin by angle θ\theta

[cosθsinθ0sinθcosθ0001]R(θ)[xy0 or 1]=[xy0 or 1]\overbrace{\left[\begin{array}{ccc} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{array}\right]}^{R(\theta)}\left[\begin{array}{c} x \\ y \\ 0 \text { or } 1 \end{array}\right]=\left[\begin{array}{c} x^{\prime} \\ y^{\prime} \\ 0 \text { or } 1 \end{array}\right]

2d shear: intermixes coordinates according to α,βR\alpha,\beta\in\mathbb{R}

[1β0α10001]Sh(α,β)[xy0 or 1]=[x+βyαx+y0 or 1]\overbrace{\left[\begin{array}{lll} 1 & \beta & 0 \\ \alpha & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]}^{S h(\alpha, \beta)}\left[\begin{array}{c} x \\ y \\ 0 \text { or } 1 \end{array}\right]=\left[\begin{array}{c} x+\beta y \\ \alpha x+y \\ 0 \text { or } 1 \end{array}\right]

eg. horizontal (x-axis) shear (45 degrees)

img

reflection: through a line

eg. reflect through x-axis:

[100010001][xy0 or 1]=[xy0 or 1]\left[\begin{array}{ccc} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right]\left[\begin{array}{c} x \\ y \\ 0 \text { or } 1 \end{array}\right]=\left[\begin{array}{c} x \\ -y \\ 0 \text { or } 1 \end{array}\right]

reflect through y-axis:

[100010001][xy0 or 1]=[xy0 or 1]\left[\begin{array}{ccc} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right]\left[\begin{array}{c} x \\ y \\ 0 \text { or } 1 \end{array}\right]=\left[\begin{array}{c} -x \\ y \\ 0 \text { or } 1 \end{array}\right]

remark. an arbitrary sequence of rotation, translation, and scale transformations results in an affine transformation, which preserves the parallelism of lines but not lengths and angles.

eg. how to rotate around an arbitrary point P=[x0,y0,1]P=[x_0,y_0,1]^\intercal?

  1. translate PP to origin: T(x0,y0)T(-x_0,-y_0)
  2. rotate around origin: R(θ)R(\theta)
  3. translate origin back to PP: T(x0,y0)T(x_0,y_0)

the desired transformation is T(x0,y0)R(θ)T(x0,y0)T(x_0,y_0)\circ R(\theta)\circ T(-x_0,-y_0) = matrix multiplication

remark. read from right to left.
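as a concrete sketch of "read from right to left", here is a small C++ fragment (my own helper types, not course code) that builds T(x0,y0) R(θ) T(-x0,-y0) from plain 3x3 homogeneous matrices:

#include <array>
#include <cmath>

using Mat3 = std::array<std::array<double, 3>, 3>;

Mat3 multiply(const Mat3& a, const Mat3& b) {
    Mat3 c{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            for (int k = 0; k < 3; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

Mat3 translate(double dx, double dy) {
    return {{{1, 0, dx}, {0, 1, dy}, {0, 0, 1}}};
}

Mat3 rotate(double theta) {
    double c = std::cos(theta), s = std::sin(theta);
    return {{{c, -s, 0}, {s, c, 0}, {0, 0, 1}}};
}

// read right to left: translate P to the origin, rotate about the origin, translate back
Mat3 rotateAbout(double x0, double y0, double theta) {
    return multiply(translate(x0, y0), multiply(rotate(theta), translate(-x0, -y0)));
}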

basic 3d geometric transformations

assume the coordinate system is right-handed.

translation:

T(Δx,Δy,Δz)=[100Δx010Δy001Δz0001]T(\Delta x,\Delta y,\Delta z)= \left[\begin{array}{cccc} 1&0&0&\Delta x\\ 0&1&0&\Delta y\\ 0&0&1&\Delta z\\ 0&0&0&1 \end{array}\right]

scale: about origin

S(sx,sy,sz)=[sx0000sy0000sz00001]S\left(s_{x},s_{y},s_{z}\right)= \left[\begin{array}{cccc} s_{x}&0&0&0\\ 0&s_{y}&0&0\\ 0&0&s_{z}&0\\ 0&0&0&1 \end{array}\right]

rotation: about an axis

Rz(θ)=[cosθsinθ00sinθcosθ0000100001]Rx(θ)=[10000cosθsinθ00sinθcosθ00001]Ry(θ)=[cosθ0sinθ00100sinθ0cosθ00001]\begin{aligned} &R_{z}(\theta)=\left[\begin{array}{cccc} \cos \theta & -\sin \theta & 0 & 0 \\ \sin \theta & \cos \theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \\ &R_{x}(\theta)=\left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta & 0 \\ 0 & \sin \theta & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \\ &R_{y}(\theta)=\left[\begin{array}{cccc} \cos \theta & 0 & \sin \theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin \theta & 0 & \cos \theta & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \end{aligned}

shear with fixed coordinate system:

Sh=[1000α1100α20100001][1β10001000β2100001][10γ1001γ2000100001]Sh=\left[\begin{array}{cccc} 1 & 0 & 0 & 0 \\ \alpha_{1} & 1 & 0 & 0 \\ \alpha_{2} & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right]\left[\begin{array}{cccc} 1 & \beta_{1} & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & \beta_{2} & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right]\left[\begin{array}{llll} 1 & 0 & \gamma_{1} & 0 \\ 0 & 1 & \gamma_{2} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right]

shear in 'w-subspace' in direction n\mathbf{n} by angle θ\theta:

Sh=[I3×3tanθn01]Sh=\left[\begin{array}{cc} I_{3\times3}&\tan\theta\mathbf{n}\\ 0&1 \end{array}\right]

img

eg. how to rotate around an arbitrary axis by θ\theta?

  1. translate that axis so it passes the coordinate origin: TT
  2. rotate the translated axis such that it coincides with one of x, y, z axes
  3. perform specified rotation on selected axis by θ\theta: Rz(θ)R_z(\theta)
  4. apply inverse rotations
  5. apply inverse translation

the desired transformation is R(θ)=T1Rx1(α)Ry1(β)Rz(θ)Ry(β)Rx(α)TR(\theta)=T^{-1}R_x^{-1}(\alpha)R_y^{-1}(\beta)\circ R_z(\theta)\circ R_y(\beta)R_x(\alpha)T.

eg. assume we rotate along vector v=(x,y,z)\mathbf{v}=(x,y,z) (ie line that crosses origin and v\mathbf{v}) by θ\theta, and v=1|\mathbf{v}|=1. determine the rotation matrix.

  1. pick closest axis to v\mathbf{v} by maxieiv=max(x,y,z)\max_i\mathbf{e}_i\cdot\mathbf{v}=\max(x,y,z).
  2. project v\mathbf{v} onto a:=(x,0,z)\mathbf{a}:=(x,0,z) in xz-plane
  3. compute cosα,sinα\cos\alpha,\sin\alpha where α\alpha is angle of a\mathbf{a} with x-axis
  4. use them to create Ry(α)R_y(-\alpha)
  5. rotate v\mathbf{v} onto xy-plane by Ry(α)R_y(-\alpha) to create b:=Ry(α)v=(x2+z2,y,0)\mathbf{b}:=R_y(-\alpha)\mathbf{v}=(\sqrt{x^2+z^2},y,0)
  6. compute cosβ,sinβ\cos\beta,\sin\beta where β\beta is angle of b\mathbf{b} with x-axis
  7. use them to create Rz(β)R_z(-\beta)
  8. rotate b\mathbf{b} onto x-axis using Rz(β)R_z(-\beta)
  9. rotate object about x-axis by θ\theta using Rx(θ)R_x(\theta)
  10. reverse x-axis rotation: Rz(β)R_z(\beta)
  11. reverse y-axis rotation: Ry(α)R_y(\alpha)

the overall transformation is R(θ,v)=Ry(α)Rz(β)Rx(θ)Rz(β)Ry(α)R(\theta,\mathbf{v})=R_y(\alpha)R_z(\beta)R_x(\theta)R_z(-\beta)R_y(-\alpha).

img

changing basis

given frames F1=(w1,w2,OW)F_1=(\mathbf{w}_1,\mathbf{w}_2,O_W) and F2=(v1,v2,OV)F_2=(\mathbf{v}_1,\mathbf{v}_2,O_V). we have a 2d point Pp=[x,y,1]P\equiv\mathbf{p}=[x,y,1]^\intercal relative to F1F_1, how to change it to F2F_2?

the matrix that maps from F1F_1 to F2F_2 is MM. if F2F_2 is orthogonal we have

Mi,j=wjvivi2Mi,3=(OWOV)vivi2M3,=[0,0,1]\begin{aligned} M_{i,j}&=\frac{\mathbf{w}_j\cdot\mathbf{v}_i}{\mathbf{v}_i^2}\\ M_{i,3}&=\frac{(O_W-O_V)\cdot\mathbf{v}_i}{\mathbf{v}_i^2}\\ M_{3,}&=[0,0,1] \end{aligned}

otherwise, we need to compute the top-left portion M1:2,1:2=[[w1]F2,[w2]F2]M_{1:2,1:2}=[[\mathbf{w}_1]_{F_2},[\mathbf{w}_2]_{F_2}] ie solve F1=F2M1:2,1:2F_1=F_2M_{1:2,1:2}, so that MpM\mathbf{p} is the coordinates desired. (does not work for non-ortho?)

the last column is used for offset!

note:

eg.

FW=(w1,w2,w3,OW)=(i,j,k,[0001])FV=(v1,v2,v3,OV)=([2/22/200],[0010],[2/22/200],[1031])\begin{aligned} F_{W} =\left(\mathbf{w}_{1}, \mathbf{w}_{2}, \mathbf{w}_{3}, \mathcal{O}_{W}\right) &=\left(\mathbf{i},\mathbf{j},\mathbf{k},\left[\begin{array}{l} 0 \\ 0 \\ 0 \\ 1 \end{array}\right]\right) \\ F_{V} =\left(\mathbf{v}_{1}, \mathbf{v}_{2}, \mathbf{v}_{3}, \mathcal{O}_{V}\right) &=\left(\left[\begin{array}{c} \sqrt{2} / 2 \\ \sqrt{2} / 2 \\ 0 \\ 0 \end{array}\right],\left[\begin{array}{l} 0 \\ 0 \\ 1 \\ 0 \end{array}\right],\left[\begin{array}{c} \sqrt{2} / 2 \\ -\sqrt{2} / 2 \\ 0 \\ 0 \end{array}\right],\left[\begin{array}{l} 1 \\ 0 \\ 3 \\ 1 \end{array}\right]\right) \end{aligned}

the transformation matrix is

M=[2/22/202/200132/22/202/20001]M=\left[\begin{array}{cccc} \sqrt{2} / 2 & \sqrt{2} / 2 & 0 & -\sqrt{2} / 2 \\ 0 & 0 & 1 & -3 \\ \sqrt{2} / 2 & -\sqrt{2} / 2 & 0 & -\sqrt{2} / 2 \\ 0 & 0 & 0 & 1 \end{array}\right]

eg. we can also use some transformations to align F2F_2 on F1F_1.

img

to map p\mathbf{p} from frame F1F_1 to F2F_2, a translation T(dx,0)T(-d_x,0) would suffice.

remark. if we apply a transformation TT to another transformation (eg a reference frame), we use T1T^{-1}.

viewing frames

typically our space SS is a cartesian space

img

right handed coordinate system:

left handed coordinate system:

we can change basis by

  1. specify frame relative to viewer
  2. change coordinates to this frame
V = inverse(mat4{ {x, 0}, {y, 0}, {z, 0}, {origin, 1}, })

once we are in the viewing coordinates,

  1. usually place a clipping box around the scene
  2. box is oriented relative to viewing frame

orthographic projection: made by removing the z-coordinate. relative to FV=(i,j,k,O)F_V=(\mathbf{i},\mathbf{j},\mathbf{k},O), we do

Ortho(Q)=Ortho(q1i+q2j+q3k+O)=q1u+q2v+O\mathrm{Ortho}(Q)=\mathrm{Ortho}(q_1\mathbf{i}+q_2\mathbf{j}+q_3\mathbf{k}+O)=q_1\mathbf{u}+q_2\mathbf{v}+O'

or

[q1q21]=[100001000001][q1q2q31]\begin{aligned} \left[\begin{array}{c} q_{1} \\ q_{2} \\ 1 \end{array}\right]=\left[\begin{array}{llll} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right]\left[\begin{array}{c} q_{1} \\ q_{2} \\ q_{3} \\ 1 \end{array}\right] \end{aligned}

Week 4. Jan 25

transforming normals

truth: we can only apply affine transformations to points

if we just use the same transform matrix, the circle's normal is not transformed correctly under nonuniform scale/shear:

img

to transform normal n\mathbf{n} by transformation MM, we use

n=(M1:3,1:31)n\mathbf{n}'=(M_{1:3,1:3}^{-1})^\intercal\mathbf{n}

where M1:3,1:3M_{1:3,1:3} is the upper 3x3 submatrix (linear part).

proof. suppose we have a point PpP\equiv\mathbf{p} and a tangent Tt\mathbf{T}\equiv\mathbf{t} at PP. after transformation there are

p=Mpt=Mt=M1:3,1:3t\mathbf{p}'=M\mathbf{p}\\ \mathbf{t}'=M\mathbf{t}=M_{1:3,1:3}\mathbf{t}

note normals must be perpendicular to tangents, so we have

0=\mathbf{n}^\intercal\mathbf{t}=\mathbf{n}^\intercal M_{1:3,1:3}^{-1}M_{1:3,1:3}\mathbf{t}=((M_{1:3,1:3}^{-1})^\intercal\mathbf{n})^{\intercal}(M_{1:3,1:3}\mathbf{t})=\mathbf{n}'^\intercal\mathbf{t}'\equiv\mathbf{N}'\cdot\mathbf{T}'

\square

notes:

eg. a plane is defined by its normal dotted with the difference of two points: n(pp0)=0\mathbf{n}^\intercal(p-p_0)=0. points pp on the plane are transformed using MM; find the new normal n\mathbf{n}'.

suppose that matrix is QQ, then we have

(Qn)M(pp0)=0nQM(pp0)=0\begin{aligned} (Q\mathbf{n})^\intercal M(p-p_0)&=0\\ \mathbf{n}^\intercal Q^\intercal M(p-p_0)&=0 \end{aligned}

to make this hold, we must have QM=Q^\intercal M= a multiple of identity matrix. hence we get Q=(M1)Q=(M^{-1})^\intercal, and n=(M1)n\mathbf{n}'=(M^{-1})^\intercal\mathbf{n}.
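a small C++ sketch of the result (helper names are mine; the inverse is computed via the adjugate and assumes the linear part is non-singular):

#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// inverse via the adjugate; assumes det != 0
Mat3 inverse(const Mat3& m) {
    double det =
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1]) -
        m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0]) +
        m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
    double inv = 1.0 / det;
    Mat3 r;
    r[0][0] =  (m[1][1] * m[2][2] - m[1][2] * m[2][1]) * inv;
    r[0][1] = -(m[0][1] * m[2][2] - m[0][2] * m[2][1]) * inv;
    r[0][2] =  (m[0][1] * m[1][2] - m[0][2] * m[1][1]) * inv;
    r[1][0] = -(m[1][0] * m[2][2] - m[1][2] * m[2][0]) * inv;
    r[1][1] =  (m[0][0] * m[2][2] - m[0][2] * m[2][0]) * inv;
    r[1][2] = -(m[0][0] * m[1][2] - m[0][2] * m[1][0]) * inv;
    r[2][0] =  (m[1][0] * m[2][1] - m[1][1] * m[2][0]) * inv;
    r[2][1] = -(m[0][0] * m[2][1] - m[0][1] * m[2][0]) * inv;
    r[2][2] =  (m[0][0] * m[1][1] - m[0][1] * m[1][0]) * inv;
    return r;
}

// n' = (M^-1)^T n, applied to the upper-left 3x3 (linear) part only
Vec3 transformNormal(const Mat3& linearPart, const Vec3& n) {
    Mat3 invM = inverse(linearPart);
    Vec3 out{};
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            out[i] += invM[j][i] * n[j];   // invM[j][i] gives the transpose
    return out;
}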

windows and viewports

we start with a 3D scene, but eventually project to a 2D scene (which is infinite). we map a finite rectangular region of the 2D scene (the window) to the device (the viewport).

given a window point (xw,yw)(x_w,y_w), we want to map to viewport point (xv,yv)(x_v,y_v), given Lw,HwL_w,H_w are length & height of window, and Lv,HvL_v,H_v are length & height of viewport. use:

xv=LvLw(xwxwl)+xvlx_v=\frac{L_v}{L_w}(x_w-x_{wl})+x_{vl}

similarly for yvy_v, where xwl,xvlx_{wl},x_{vl} are window & viewport corners.

if HwLwHvLv\frac{H_w}{L_w}\neq\frac{H_v}{L_v}, the image is distorted. the quantities are called aspect ratios of the window and viewport.

(usually these coords are in [1,1]2[-1, 1]^2 NDC)
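a short C++ sketch of the window-to-viewport mapping above (the Rect struct is my own assumption):

struct Rect { double x, y, w, h; };   // lower-left corner plus length and height

void windowToViewport(const Rect& win, const Rect& view,
                      double xw, double yw, double& xv, double& yv) {
    xv = (view.w / win.w) * (xw - win.x) + view.x;
    yv = (view.h / win.h) * (yw - win.y) + view.y;
    // if view.w/view.h != win.w/win.h, the aspect ratios differ and the image is distorted
}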

normalized device coordinates

where to specify viewport? could specify it in device coordinates...

if we map directly from WCS to a DCS, then changing our device requires writing the mapping. instead, we use normalized device coordinates (NDC) as intermediate system that gets mapped to device layer.

clipping

we want to remove points outside a region of interest (discard parts of primitives outside window).

point clipping: remove points outside window

line clipping: remove portion of line segment outside window

parametric representation of line: L(t)=(1t)A+tB=A+t(BA)L(t)=(1-t)A+tB=A+t(B-A)

implicit representation of line: (Q)=(QP)n\ell(Q)=(Q-P)\cdot\mathbf{n}

clipping point to halfspace:

clipping line segment to halfspace:

given window edge as implicit form (Q)=(QP)n\ell(Q)=(Q-P)\cdot\mathbf{n}, and the line segment as parametric form L(t)=A+t(BA)L(t)=A+t(B-A), we have two trivial cases:

otherwise we need to clip line segment:

img

we want tt such that (L(t))=0\ell(L(t))=0, ie (L(t)P)n=(A+t(BA)P)n=0(L(t)-P)\cdot\mathbf{n}=(A+t(B-A)-P)\cdot n=0, and so

t=(AP)n(AB)n=(AP)n(AP)n(BP)n\begin{aligned} t&=\frac{(A-P)\cdot\mathbf{n}}{(A-B)\cdot\mathbf{n}}\\ &=\frac{(A-P)\cdot\mathbf{n}}{(A-P)\cdot\mathbf{n}-(B-P)\cdot\mathbf{n}}\\ \end{aligned}

note we can reuse the computed values from the trivial tests. finally, just clip to the four halfspaces in turn.

algo (liang-barsky):

clip_line_segment(A, B):
    for each edge (P, n):
        wecA = (A - P) ⋅ n
        wecB = (B - P) ⋅ n
        if wecA < 0 and wecB < 0: return false   // trivially outside this edge
        if wecA >= 0 and wecB >= 0: continue     // trivially inside this edge
        t = wecA / (wecA - wecB)
        if wecA < 0:            // A is outside (see diagram)
            A = A + t * (B - A)
        else:                   // B is outside
            B = A + t * (B - A)
    return A, B

note:

for 3D

projections


perspective projection:

img

projective transformation:

| affine transformation | projective transformation |
| --- | --- |
| image of 2 pts on a line determines image of line | image of 3 pts on a line determines image of line |
| image of 3 pts on a plane determines image of plane | image of 4 pts on a plane determines image of plane |
| in dimension n space, image of n+1 points/vectors defines affine map | in dimension n space, image of n+2 points/vectors defines map |
| vectors map to vectors: \mathbf{v}=Q-R=R-S\implies A(Q)-A(R)=A(R)-A(S) | mapping of vectors is ill-formed: \mathbf{v}=Q-R=R-S\not\implies P(Q)-P(R)=P(R)-P(S) |
| can represent with matrix multiply | can represent with matrix multiply and normalization |

(diagram: equally spaced points Q, R, S on a line stay equally spaced under A, but not under P)

perspective map

perspective map: given point SS, find its projection PP

img

pseudo-OpenGL version of the perspective map:

img

the 'box' in world space is known as 'truncated viewing pyramid' or 'frustum'.

for simplicity, we project to the z=1z=1 plane. we want to map xx to xz\frac{x}{z} and yy to yz\frac{y}{z}; we use a matrix multiplication followed by a normalization

[1000010000ac00bd][xyz1]=[xyaz+cbz+d][xbz+dybz+daz+cbz+d1]\begin{aligned} \left[\begin{array}{llll} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & a & c \\ 0 & 0 & b & d \end{array}\right]\left[\begin{array}{l} x \\ y \\ z \\ 1 \end{array}\right] &=\left[\begin{array}{c} x \\ y \\ a z+c \\ b z+d \end{array}\right] \equiv\left[\begin{array}{c} \frac{x}{b z+d} \\ \frac{y}{b z+d} \\ \frac{a z+c}{b z+d} \\ 1 \end{array}\right] \end{aligned}

we want to solve for a,b,c,da,b,c,d such that z[n,f]z\in[n,f] maps to z[1,1]z'\in[-1,1], and we get

[1000010000f+nfn2fnfn0010][xyz1]=[xyz(f+n)2fnfnz][xzyzz(f+n)2fnz(fn)1]\begin{aligned} \left[\begin{array}{llll} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \frac{f+n}{f-n} & \frac{-2fn}{f-n} \\ 0 & 0 & 1 & 0 \end{array}\right]\left[\begin{array}{l} x \\ y \\ z \\ 1 \end{array}\right] &=\left[\begin{array}{c} x \\ y \\ \frac{z(f+n)-2fn}{f-n} \\ z \end{array}\right] \equiv\left[\begin{array}{c} \frac{x}{z} \\ \frac{y}{z} \\ \frac{z(f+n)-2fn}{z(f-n)} \\ 1 \end{array}\right] \end{aligned}

the OpenGL perspective matrix uses a=f+nfn,b=1a=-\frac{f+n}{f-n},b=-1

the upper left entries are different:

openGL maps yy to [1,1][-1,1]:

img

(we do not want to map both x and y to [-1,1] because we may not have square windows)

but we have placed our projection plane at z=1z=1, so this is really yyczdy\mapsto\frac{yc}{zd} where c=1c=1, and cd=cotθ2\frac{c}{d}=\cot\frac{\theta}{2}, so yyzcotθ2y\mapsto\frac{y}{z}\cot\frac{\theta}{2}.

finally, because x-y aspect ratio may not be 1, we will scale xx to give desired ratio: xxzxzcotθ2xzcotθ/2aspectx\mapsto\frac{x}{z}\mapsto\frac{x}{z}\cot\frac{\theta}{2}\mapsto\frac{x}{z}\frac{\cot{\theta}/{2}}{\text{aspect}}.

the final matrix is:

[cot(θ/2) aspect 0000cotθ20000±f+nfn2fnfn00±10]\left[\begin{array}{cccc} \frac{\cot (\theta / 2)}{\text { aspect }} & 0 & 0 & 0 \\ 0 & \cot \frac{\theta}{2} & 0 & 0 \\ 0 & 0 & \pm \frac{f+n}{f-n} & \frac{-2 f n}{f-n} \\ 0 & 0 & \pm 1 & 0 \end{array}\right]

where ±\plusmn is 1 if we look down the z axis, -1 if look down the -z axis.
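a rough C++ sketch of building this matrix for a camera looking down the -z axis (so the ± entries are -1); this is my own helper, not the actual OpenGL call:

#include <array>
#include <cmath>

using Mat4 = std::array<std::array<double, 4>, 4>;

Mat4 perspective(double fovYRadians, double aspect, double n, double f) {
    double cot = 1.0 / std::tan(fovYRadians / 2.0);
    Mat4 P{};                                  // zero-initialized
    P[0][0] = cot / aspect;                    // x scaled by cot(theta/2)/aspect
    P[1][1] = cot;                             // y scaled by cot(theta/2)
    P[2][2] = -(f + n) / (f - n);              // maps z in [n, f] to [-1, 1]
    P[2][3] = -2.0 * f * n / (f - n);
    P[3][2] = -1.0;                            // copies -z into w for the homogeneous divide
    return P;
}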

note:

eg. why do we map Z?

(x,y,z,1)(xnz,ynz,z,1)(x,y,z,1)\mapsto(\frac{xn}{z},\frac{yn}{z},z,1) can retain depth

img

the map

(x,y,z,1)(xnz,ynz,zf+zn2fnz(fn),1)(x,y,z,1)\mapsto\left(\frac{xn}{z},\frac{yn}{z},\frac{zf+zn-2fn}{z(f-n)},1\right)

can map lines to lines and preserves depth info.

img

eg. behaviors of Z.

we know

what happens if z0z\rightarrow0 or zz\rightarrow\infty?

img

what happens if varying f and n?

Week 5. Feb 1

clipping in 3d

homogeneous point: the point representation (x,y,z,w)(\overline{x},\overline{y},\overline{z},\overline{w}) after applying a perspective transformation to a point.

[nr0000ns0000f+nfn2fnfn0010][xyz1]=[xyzw][xwywzw1]=:[XYZ1]\begin{bmatrix} nr&0&0&0\\ 0&ns&0&0\\ 0&0&\frac{f+n}{f-n}&\frac{-2 f n}{f-n}\\ 0&0&1&0 \end{bmatrix}\begin{bmatrix} x\\y\\z\\1 \end{bmatrix}=\begin{bmatrix} \overline{x}\\\overline{y}\\\overline{z}\\\overline{w} \end{bmatrix}\equiv\begin{bmatrix} \frac{\overline{x}}{\overline{w}}\\\frac{\overline{y}}{\overline{w}}\\\frac{\overline{z}}{\overline{w}}\\1 \end{bmatrix}=:\begin{bmatrix} X\\Y\\Z\\1 \end{bmatrix}

region mapping:

img

clipping is not good after normalization:

in order to graphically represent a homogeneous point, we could draw four axes. however for simplicity, we can draw a pair x,w\overline{x},\overline{w}, similar for other pairs.

img

assume we have NDC window of [1,1]2[-1,1]^2. to clip to X=1X=-1:

the point is visible if w+x>0\overline{w}+\overline{x}>0. assume we have a line (1t)p1+tp2(1-t)p_1+tp_2, then the boundary point is:

((1t)w1+tw2)+((1t)x1+tx2)=0t=w1+x1w1+x1w2x2((1-t)\overline{w}_1+t\overline{w}_2)+((1-t)\overline{x}_1+t\overline{x}_2)=0\\ t=\frac{\overline{w}_1+\overline{x}_1}{\overline{w}_1+\overline{x}_1-\overline{w}_2-\overline{x}_2}

we repeat for remaining boundaries:

A2

expressing pipeline:

PVTVMMTpP\cdot VT\cdot V\cdot M\cdot MT\cdot \mathbf{p}

where VTVT is viewing transformation, MTMT is modelling transformation. the result will be available for clipping, homogenization and projecting to viewport.

polygons

simple polygon:

img

the winding number is the # of leaving-intersections minus the # of entering-intersections when we shoot a ray from the point. if it is > 0, then the point is inside the polygon.

polygon is convex if for any two points inside polygon, the line segment joining them is also inside.

affine transformations may introduce degeneracies, eg orthographic projection may project entire polygon to a line segment.

polygon clipping

requirements:

given polygon v1,...,vnv_1,...,v_n

comparison with line clipping:

algo (Sutherland-Hodgman):

img

polygon scan conversion

img

algo:

// assume triangle has been split and A,B,C are in device coordinates
// and A.x < B.x, A.y = B.y != C.y
scan_triangle(A, B, C):
    // we plot pixels incrementally using the slopes
    auto y = A.y
    d0 = (C.x - A.x) / (C.y - A.y)   // 1 / slope of AC
    d1 = (C.x - B.x) / (C.y - B.y)   // 1 / slope of BC
    x0 = A.x
    x1 = B.x
    while y <= C.y:
        for x = x0, x1:
            plotPixel(x, y)
        x0 += d0
        x1 += d1
        ++y

line segment scan conversion (DDA)

assume we have a line of positive slope:

y=mx+bm=y2y1x2x1b=y1mx1y=mx+b\\ m=\frac{y_2-y_1}{x_2-x_1}\\ b=y_1-mx_1

if the slope is <= 1, then we sample at unit x intervals (Δx=1\Delta x=1) and compute successive y values as

yk+1=yk+m,k=1,...y_{k+1}=y_k+m,k=1,...

if slope > 1, then we reverse the roles of x and y (Δy=1\Delta y=1) and compute successive x value as

xk+1=xk+1mx_{k+1}=x_k+\frac{1}{m}

assuming we draw from left to right. note that for drawing pixels we round (x, y) to the nearest integer, but for the next iteration we keep the fractional values.
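a small C++ sketch of the DDA idea (the plotPixel signature is assumed, not from the notes): sample along the major axis in unit steps and accumulate the fractional slope:

#include <algorithm>
#include <cmath>

void plotPixel(int x, int y);   // assumed to exist (e.g. writes the framebuffer)

void ddaLine(double x1, double y1, double x2, double y2) {
    double dx = x2 - x1, dy = y2 - y1;
    int steps = static_cast<int>(std::max(std::fabs(dx), std::fabs(dy)));
    if (steps == 0) { plotPixel(std::lround(x1), std::lround(y1)); return; }
    double xInc = dx / steps, yInc = dy / steps;   // one of these has magnitude 1
    double x = x1, y = y1;
    for (int k = 0; k <= steps; ++k) {
        plotPixel(static_cast<int>(std::lround(x)), static_cast<int>(std::lround(y)));
        x += xInc;   // keep the fractional part for the next iteration
        y += yInc;
    }
}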

Week 6. Feb 8

hidden surface removal

when drawing lots of polygons, we want to draw only visible to viewer.

kinds:

backface culling

img

painter's algorithm

steps:

  1. sort polygons on farthest zz
  2. resolve ambiguities where z's overlap
  3. scan convert from largest z to smallest z

it is an Ω(n2)\Omega(n^2) algorithm with lots of subtle details.

warnock's algorithm

it is divide-conquer algo.

warnock(polylist, viewport):

  1. if the polygon list is 'simple' in the viewport, then draw the polygons
  2. otherwise:
    1. split the viewport vertically and horizontally into 4 sub-viewports
    2. for each sub-viewport:
      1. warnock(polygon list in sub-viewport, sub-viewport)

runtime: O(pn)O(pn) where p is # of pixels and n # of polygons

this algo can be seen as one of first uses of quadtrees.

img

z-buffer algorithm

// assume triangle has been split and A,B,C are in device coordinates
// and A.x < B.x, A.y = B.y != C.y
scan_triangle(A, B, C):
    auto y = A.y
    d0 = (C.x - A.x) / (C.y - A.y)
    d0z = (C.z - A.z) / (C.y - A.y)
    d1 = (C.x - B.x) / (C.y - B.y)
    d1z = (C.z - B.z) / (C.y - B.y)
    x0 = A.x
    z0 = A.z
    x1 = B.x
    z1 = B.z
    while y <= C.y:
        auto z = z0
        d2z = (z1 - z0) / (x1 - x0)
        for x = x0, x1:
            plotPixel(x, y, z)
            z += d2z
        x0 += d0
        z0 += d0z
        x1 += d1
        z1 += d1z
        ++y

plotPixel(int x, int y, float z, colour):
    if z < zbuf[x][y]:
        zbuf[x][y] = z
        framebuffer[x][y] = colour

runtime: O(pc+n)O(p_c+n) where pcp_c is # of scan converted pixels, n is # of polygons

it is algorithm of choice for hardware implementation

binary space partitioning

the objects are built into a tree of 'front' and 'back's

  1. pick any object as the root
  2. for objects in front of root, recurse and attach the root as the 'front' child
  3. for objects in back of root, recurse and attach the root as the 'back' child
  4. an object may be in both front and back; then splitting is required (1 triangle -> cut into 3)

img

to draw the scene, if viewer is in root's front halfspace, do tree traversal:

  1. first draw back children
  2. then draw roots
  3. finally draw front children

otherwise reverse order of traversal.

this algo is object-precision, viewing-independent and 'geometry-load' (not online).

background of light

practical choice: geometrical optics and radiative transfer theory

we mainly consider 'white' light (flat spectral distribution).

types of process of light emission:

we also assume light interacting with a material comes from a light source without being subject to scattering phenomena (eg indirect skylight).

radiometric quantities:

when a radiometric term is written at a specific wavelength, it is called a spectral radiometric term.

appearance

variations in spectral distribution of light distributed by a material:

when light interacts with material, it can be also propagated (reflected/transmitted) following different spatial distributions depending on the material microstructure.

img

measurements of distribution of propagated light: reflectance and transmittance:

spatial patterns of light distributions:

eg. BRDF special case: perfect diffuse where incident light is reflected equally in all directions. so the BRDF is simply a constant (1π\frac{1}{\pi}).

Week 7. Feb 15

tristimulus values

it is possible to quantify the effect stimulating cones of the eye:

[lms]=VLp(λ)[l(λ)m(λ)s(λ)]dλ\begin{bmatrix} l\\m\\s \end{bmatrix}= \int_VL_p(\lambda)\begin{bmatrix} \overline{l}(\lambda)\\\overline{m}(\lambda)\\\overline{s}(\lambda) \end{bmatrix} d\lambda

it is an inner product over visible spectrum VV of the input stimulus Lp(λ)L_p(\lambda) projecting onto the spectral sensitives of the cones: l\overline{l} for long wavelength region, m\overline{m} for medium wavelength region, s\overline{s} for short wavelength region.

the CIE XYZ color space:

[XYZ]=VLp(λ)[x(λ)y(λ)z(λ)]dλ[XYZ]=[xyz]Lp(discrete)\begin{bmatrix} X\\Y\\Z \end{bmatrix}= \int_VL_p(\lambda)\begin{bmatrix} \overline{x}(\lambda)\\\overline{y}(\lambda)\\\overline{z}(\lambda) \end{bmatrix} d\lambda \\\,\\ \begin{bmatrix} X\\Y\\Z \end{bmatrix}=\begin{bmatrix} \overline{x}\\\overline{y}\\\overline{z} \end{bmatrix} L_p\tag{discrete}

the x,y,z\overline{x},\overline{y},\overline{z} are the color matching functions (CIE 1931).

CIE chromaticity diagram:

img

the colors in the diagram are specified using chromaticity coordinates using the CIE XYZ tristimulus values:

x=\frac{X}{X+Y+Z}\\ y=\frac{Y}{X+Y+Z}\\ z=1-x-y

the result factors out effects of brightness.

the XYZ system is device-agnostic.

one can convert the spectral signal Lp(λ)L_p(\lambda) resulting from rendering applications to RGB values, by employing RGB tristimulus values r,g,b\overline{r},\overline{g},\overline{b} using some device-dependent transformation TT:

[r(λ)g(λ)b(λ)]=T[x(λ)y(λ)z(λ)]\begin{bmatrix} \overline{r}(\lambda)\\\overline{g}(\lambda)\\\overline{b}(\lambda) \end{bmatrix} =T\begin{bmatrix} \overline{x}(\lambda)\\\overline{y}(\lambda)\\\overline{z}(\lambda) \end{bmatrix}

eg. to obtain T for the standard CRT:

|       | x    | y    | z    |
| ----- | ---- | ---- | ---- |
| red   | .63  | .34  | .03  |
| green | .31  | .595 | .095 |
| blue  | .155 | .07  | .775 |
| white | .313 | .329 | .358 |

first we use the entries to write

A=[xrxgxbyrygybzrzgzb]A=\begin{bmatrix}x_r&x_g&x_b\\y_r&y_g&y_b\\z_r&z_g&z_b\end{bmatrix}

using the coordinates of the white point, compute

b=A1[xwyw1zwyw]\mathbf{b}=A^{-1}\begin{bmatrix}\frac{x_w}{y_w}\\1\\\frac{z_w}{y_w}\end{bmatrix}

let CC be the matrix whose diagonal is the entries of b\mathbf{b}, the answer is T=(AC)1T=(AC)^{-1}.

after obtaining r,g,b\overline{r},\overline{g},\overline{b}, the tristimulus color is quantified by sampling spectral signal:

R=380 nm720 nmLp(λ)rˉ(λ)dλG=380 nm720 nmLp(λ)gˉ(λ)dλB=380 nm720 nmLp(λ)bˉ(λ)dλ\begin{aligned} &R=\int_{380 \mathrm{~nm}}^{720 \mathrm{~nm}} L_{p}(\lambda) \bar{r}(\lambda) d \lambda \\ &G=\int_{380 \mathrm{~nm}}^{720 \mathrm{~nm}} L_{p}(\lambda) \bar{g}(\lambda) d \lambda \\ &B=\int_{380 \mathrm{~nm}}^{720 \mathrm{~nm}} L_{p}(\lambda) \bar{b}(\lambda) d \lambda \end{aligned}

where the range of VV can vary based on individuals. in practice, summations are used.

metamerism: light with different spectral distributions can appear the same to the viewer. eg: materials with different reflectances appear the same color under one illuminant. it is both a blessing and a curse

gamma

monitors are not linear with respect to input. as an approximate characterization of this nonlinearity, monitors are commonly characterized by a gamma value (which is a degree of freedom):

\text{displayed intensity}=a^\gamma\cdot\text{maximum intensity}

where a[0,1]a\in[0,1] is the input pixel value.

img

usually, we can use this checkerboard pattern to correct our input so that a value of a=0.5a=0.5 is displayed with intensity halfway between black (0) and white (1). the gamma of a monitor can be inferred by finding a gray value that appears to have the same intensity as the black-and-white pattern. once we have the gamma, applying the correction aa1/γa\mapsto a^{1/\gamma} before display gives displayed intensity = a × maximum intensity, ie a linear response.
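a tiny C++ sketch of that correction for an 8-bit channel (assuming the gamma was measured as above):

#include <cmath>
#include <cstdint>

std::uint8_t gammaCorrect(std::uint8_t value, double gamma) {
    double a = value / 255.0;                       // normalize to [0, 1]
    double corrected = std::pow(a, 1.0 / gamma);    // a -> a^(1/gamma)
    return static_cast<std::uint8_t>(std::lround(corrected * 255.0));
}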

optics

materials are characterized by the complex index of refraction:

N(λ)=μ(λ)+jk(λ)N(\lambda)=\mu(\lambda)+jk(\lambda)

when light hits smooth surface, reflection occurs:

img

the law of reflection says the incident angle θi\theta_i is equal to the reflection angle θr\theta_r, and both rays lie in the same plane as the normal n\mathbf{n}. the angle can be obtained from cosθi=nini\cos\theta_i=\frac{\mathbf{n}\cdot\mathbf{i}}{|\mathbf{n}||\mathbf{i}|}. so we can get the reflection direction:

r=i+2cosθin=i2(in)n\mathbf{r}=\mathbf{i}+2\cos\theta_i\mathbf{n}=\mathbf{i}-2(\mathbf{i\cdot\mathbf{n}})\mathbf{n}

the law of refraction says \mu_i\sin\theta_i=\mu_t\sin\theta_t. then the refraction direction is given by:

t=ncosθt+msinθt\mathbf{t}=-\mathbf{n}\cos\theta_t+\mathbf{m}\sin\theta_t

where m\mathbf{m} is a vector perpendicular to n\mathbf{n} and is in same plane as n,i\mathbf{n},\mathbf{i}.

by [P.S. Heckbert. Writing a ray tracer. In A. Glassner, editor, An Introduction to Ray Tracing, San Diego, CA, 1989. Academic Press] it can be expanded:

t=[ηiηt(in)1(ηiηt)2(1(in)2)]n+ηiηti\mathbf{t}=\left[-\frac{\eta_{i}}{\eta_{t}}(\mathbf{i} \cdot \mathbf{n})-\sqrt{1-\left(\frac{\eta_{i}}{\eta_{t}}\right)^{2}\left(1-(\mathbf{i} \cdot \mathbf{n})^{2}\right)}\right] \mathbf{n}+\frac{\eta_{i}}{\eta_{t}} \mathbf{i}

when θt90°\theta_t\geq90\degree (sqrt becomes negative), we have total internal reflection. the critical angle is θc=arcsin(μtμi)\theta_c=\arcsin(\frac{\mu_t}{\mu_i}).
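a compact C++ sketch of both directions (helper names are mine; reflect follows r = i - 2(i·n)n and refract follows Heckbert's formula above, with i pointing toward the surface and a false return signalling total internal reflection):

#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }

// r = i - 2 (i . n) n
Vec3 reflect(const Vec3& i, const Vec3& n) {
    double d = dot(i, n);
    return {i[0] - 2*d*n[0], i[1] - 2*d*n[1], i[2] - 2*d*n[2]};
}

// t = [-(eta_i/eta_t)(i.n) - sqrt(1 - (eta_i/eta_t)^2 (1 - (i.n)^2))] n + (eta_i/eta_t) i
bool refract(const Vec3& i, const Vec3& n, double etaI, double etaT, Vec3& t) {
    double eta = etaI / etaT;
    double cosI = dot(i, n);
    double k = 1.0 - eta*eta*(1.0 - cosI*cosI);
    if (k < 0.0) return false;                 // total internal reflection
    double coef = -eta*cosI - std::sqrt(k);
    for (int a = 0; a < 3; ++a) t[a] = coef*n[a] + eta*i[a];
    return true;
}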

[E Hecht and A. Zajac. Optics. Addison-Wesley, Reading, Massachusetts, 1974] at an interface between dielectrics, light can also be attenuated, which is given by the Fresnel coefficients for reflection and transmission. it can be split into directions perpendicular and parallel to an interface:

F_{R\perp}=\frac{b_1^2+b_2^2-2b_1\cos\theta_i+\cos^2\theta_i}{b_1^2+b_2^2+2b_1\cos\theta_i+\cos^2\theta_i}\\~\\ F_{R\parallel}=F_{R\perp}\frac{b_1^2+b_2^2-2b_1\sin\theta_i\tan\theta_i+\sin^2\theta_i\tan^2\theta_i}{b_1^2+b_2^2+2b_1\sin\theta_i\tan\theta_i+\sin^2\theta_i\tan^2\theta_i}\\~\\ b_1^2=\frac{1}{2\eta_i^2}\left(\sqrt{(\eta_t^2-k^2-\eta_i^2\sin^2\theta_i)^2+4\eta_t^2k^2}+\eta_t^2-k^2-\eta_i^2\sin^2\theta_i\right)\\~\\ b_2^2=\frac{1}{2\eta_i^2}\left(\sqrt{(\eta_t^2-k^2-\eta_i^2\sin^2\theta_i)^2+4\eta_t^2k^2}-(\eta_t^2-k^2-\eta_i^2\sin^2\theta_i)\right)

for any kk. FR\mathbf{F}_R for polarized light is the weighted sum of the polarized components, in which the weights must sum to one. for unpolarized light, the coefficient is the average of the two components. when k=0k=0, it is:

FR=(ηi2ηt2)2cit2+(cosθi2cosθt2)2nit2(cit(ηi2+ηt2)+nit(cosθi2+cosθt2))2F_{R}=\frac{\left(\eta_{i}^{2}-\eta_{t}^{2}\right)^{2} c_{i t}^{2}+\left(\cos \theta_{i}^{2}-\cos \theta_{t}^{2}\right)^{2} n_{i t}^{2}}{\left(c_{i t}\left(\eta_{i}^{2}+\eta_{t}^{2}\right)+n_{i t}\left(\cos \theta_{i}^{2}+\cos \theta_{t}^{2}\right)\right)^{2}}

where cit=cosθicosθt,nit=ηiηtc_{it}=\cos\theta_i\cos\theta_t,n_{it}=\eta_i\eta_t. these equations apply without regard to the direction of propagation. we also remark there is no absorption at an interface between dielectrics, so the Fresnel coefficient for transmission is simply FT=1FRF_T=1-F_R. once light is transmitted into medium, absorption may occur.

local illumination models

img

Lambertian model for diffuse materials: suppose

img

[A.S. Glassner. Principles of Digital Image Synthesis. Morgan Kaufmann Publishers, Inc, San Francisco, 1995.][F.E. Nicodemus, J.C. Richmond, J.J. Hsia, I.W. Ginsberg, and T. Limperis. Geometrical considerations and nomenclature for reflectance. In L.B. Wolff, S.A. Shafer, and G.E. Healey, editors, Physics-Based Vision Principles and Practice: Radiometry, pages 94–145, Boston, 1992. Jones and Bartlett Publishers.] as consequence, the BRDF of a perfect diffuse model is given by ρπ\frac{\rho}{\pi}, where ρ\rho is the material's reflectance. then we have

img

Lout=ρ(λ)Lin(ψi,λ)cosθi Lout(v)=kdLin(l)(vn)L_{\text{out}}=\rho(\lambda)L_\text{in}(\psi_i,\lambda)\cos\theta_i \\~\\ L_\text{out}(\mathbf{v})=k_dL_\text{in}(\mathbf{l})(\mathbf{v}\cdot\mathbf{n})

where Lin,outL_\text{in,out} are the incident and propagated (outgoing) radiances, and ψi,ψ\psi_i,\psi are the directions of incidence and propagation. in the second equation, reflectance is replaced by the coefficient kdk_d, usually an RGB triple representing the color of the material.

for complete environment, Lambertian lighting model is:

Lout(v)=ΩkdπLin(l)(ln)dσlL_\text{out}(\mathbf{v})=\int_\Omega\frac{k_d}{\pi}L_\text{in}(\mathbf{l})(\mathbf{l}\cdot\mathbf{n})d\sigma\mathbf{l}

where Ω\Omega is the hemisphere of all possible incoming directions and dσd\sigma is the solid angle measure. if kd[0,1]k_d\in[0,1], then the factor of π\pi is required to conserve energy.

normally we do summations. for complete environment, after considering attenuation, multiple lights and ambient light, there is:

Lout(v)=kaIa+iρ(v,li)Iilinc1+c2ri+c3ri2L_\text{out}(\mathbf{v})=k_aI_a+\sum_i\rho(\mathbf{v},\mathbf{l}_i)I_i\frac{\mathbf{l}_i\cdot\mathbf{n}}{c_1+c_2r_i+c_3r_i^2}

where

if we want to simulate the appearance of materials that have perfect specular (mirror) light propagation behavior, we can use the law of reflection to obtain the reflected vector. similarly, if we want coherent transmission (for transparent materials), use Snell's law to get the transmitted vector.

specular reflection

the Lambertian term models matte surfaces but not shiny ones, which have 'highlights' because the energy reflected depends on the viewer's position.

the classic Phong Bui-Tuong lighting model is:

Lout(v)=kaIa+kd(ln)Id+ks(rv)pIsL_\text{out}(\mathbf{v})=k_aI_a+k_d(\mathbf{l}\cdot\mathbf{n})I_d+k_s(\mathbf{r}\cdot\mathbf{v})^pI_s

or using the previous notation:

ρ(v,l)=kd+ks(rv)pnl\rho(\mathbf{v},\mathbf{l})=k_d+k_s\frac{(\mathbf{r}\cdot\mathbf{v})^p}{\mathbf{n}\cdot\mathbf{l}}

where

img

img

Blinn introduced the Blinn-Phong lighting model variation:

Lout(v)=kaIa+kd(ln)Id+ks(hn)pIsL_\text{out}(\mathbf{v})=k_aI_a+k_d(\mathbf{l}\cdot\mathbf{n})I_d+k_s(\mathbf{h}\cdot\mathbf{n})^pI_s

where

OpenGL uses the Blinn-Phong model.
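a C++ sketch of evaluating the Blinn-Phong model for one light, with scalar coefficients and intensities for brevity and the standard half-vector h = normalize(l + v) (my own helper, not OpenGL's fixed pipeline):

#include <algorithm>
#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }

Vec3 normalize(const Vec3& v) {
    double len = std::sqrt(dot(v, v));
    return {v[0]/len, v[1]/len, v[2]/len};
}

// L_out = ka*Ia + kd*(l.n)*Id + ks*(h.n)^p*Is, all vectors assumed unit length
double blinnPhong(const Vec3& n, const Vec3& l, const Vec3& v,
                  double ka, double kd, double ks, double p,
                  double Ia, double Id, double Is) {
    Vec3 h = normalize({l[0]+v[0], l[1]+v[1], l[2]+v[2]});
    double diff = std::max(dot(l, n), 0.0);
    double spec = std::pow(std::max(dot(h, n), 0.0), p);
    return ka*Ia + kd*diff*Id + ks*spec*Is;
}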

shading

shading corresponds to the darkening or coloring of an object.

given Lin,lL_\text{in},\mathbf{l} and surface properties (including surface normal), we want LoutL_\text{out} in direction v\mathbf{v}

flat shading: shade entire polygon one colour

img

gouraud shading

Gouraud shading interpolates colours across a polygon from the vertices.

img

shading by slicing: for polygons with more than 3 vertices, do

  1. sort vertices by y-coordinate
  2. slice polygon into trapezoids with parallel top and bottom
  3. interpolate colours along each edge of the trapezoid
  4. interpolate colours along each scanline

img

problem: not invariant under rotation:

img

triangulate and shade: make triangles then shade triangles

img

mean value coordinates: provides generalization of Barycentric coordinates

wi=ri+1Ai1riBi+ri1AiAi1Aiw_i=\frac{r_{i+1}A_{i-1}-r_iB_i+r_{i-1}A_i}{A_{i-1}A_i}

where ri=PPir_i=||P-P_i||. this is not normalized; to normalize, divide by iwi\sum_i w_i.

img

img

bilinear interpolation: a patch from an object is a polygon described by parameters u,vu,v in its parametric form.

img

the areas can be expressed as:

A00=(1u)(1v)A01=(1u)(v)A10=u(1v)A11=uvA_{00}=(1-u)(1-v)\\ A_{01}=(1-u)(v)\\ A_{10}=u(1-v)\\ A_{11}=uv

then using vertices' colors CijC_{ij}, we can interpolate colors using:

Color(u,v)=C00A00+C01A01+C10A10+C11A11\text{Color}(u,v)=C_{00}A_{00}+C_{01}A_{01}+C_{10}A_{10}+C_{11}A_{11}

bilinear interpolation is subject to the Mach band effect, an illusion that suggests the presence of edges where in fact the radiance values vary smoothly.

img

Gouraud shading is usually used for shading objects characterized by diffuse (Lambertian) light reflection behavior. it cannot handle highlights properly (they will vanish).

eg. how to mitigate mach band effect and disappeared highlights? use non-linear interpolation.

phong shading

phong shading interpolates lighting model parameters, not colours.

steps:

  1. for each pixel:
    1. interpolate the normal
    2. interpolate other shading parameters
    3. compute the view and light vectors
    4. evaluate lighting model

the lighting model does not have to be the Phong lighting model.

phong shading can be simulated with programmable vertex and fragment shaders on modern graphics hardware:

scene graph

we can use a hierarchy to organize multiple objects in a scene. each child is expressed in coordinates relative to its parent via some transformation.

img

a matrix stack can be used to traverse such a hierarchy:

traverse(root):
    push(root.M)   // local transformation
    draw object using composite matrix from stack
    for child in root.children:
        traverse(child)
    pop()

note: for a scaling matrix, you probably do not want it to apply to the children.

Week 9. Mar 1

ray tracing

img

for each pixel:
    ray = (eye, pixel - eye)
    intersect(ray, scene)
    // shade the hit object

the objects whose intersection points are closest to screen are visible, and pixels associated with their projection on screen are shaded accordingly.

need to do:

ray casting/tracing

ray casting: selecting the initial ray.

img

to determine ray direction, we need world coordinates of the pixels traversed by the ray. given a screen coordinate pk(xk,yk)p_k(x_k,y_k), we utilize the pipeline in reverse order using parameters image size (nx,nyn_x,n_y), look-from, look-at, up vectors, fov θ\theta, aspect ratio w/hw/h and a distance dd (focal length) in world coordinates

img

steps:

  1. let zk=0z_k=0 and translate (xk,yk,zk)(x_k,y_k,z_k) by (nx2,ny2,d)(\frac{-n_x}{2},\frac{-n_y}{2},d) with translation matrix T1T_1
  2. scale by (hny,wnx,1)(\frac{-h}{n_y},\frac{w}{n_x},1) where h=2dtanθ2h=2d\tan\frac{\theta}{2} using scaling matrix S2S_2 to preserve aspect ratio and change the x-axis direction. we get viewing coordinates
  3. rotate by R3=[u,v,w,e4]R_3=[\mathbf{u},\mathbf{v},\mathbf{w},\mathbf{e}_4] where
  4. translate by (LookFromx,LookFromy,LookFromz,1)(\text{LookFrom}_x,\text{LookFrom}_y,\text{LookFrom}_z,1) with T4T_4

the result pworldp_\text{world} is T4R3S2T1pkT_4R_3S_2T_1p_k.

the ray can be written as:

puzzle. do we need to use perspective matrix? no. we fix the eye position, so rays have different directions, so perspective is already accounted for. to have orthographic-like projection, put rays through each pixel in same direction.

intersections

triangles:

with vertices P0(x0,y0,z0),P1(x1,y1,z1),P2(x2,y2,z2)P_0(x_0,y_0,z_0),P_1(x_1,y_1,z_1),P_2(x_2,y_2,z_2), the parametric form for a triangle is

P(x,y,z)=P0+β(P1P0)+γ(P2P0)P(x,y,z)=P_0+\beta(P_1-P_0)+\gamma(P_2-P_0)

where 0β,γ10\leq\beta,\gamma\leq1.

the ray can be expressed parametrically as

P(x,y,z)=a+t(ba)P(x,y,z)=\mathbf{a}+t(\mathbf{b}-\mathbf{a})

where a\mathbf{a} is ray origin and 0t10\leq t\leq1. ba\mathbf{b}-\mathbf{a} gives ray direction.

letting two equations equal we can get the intersection point:

axP0x=β(P1xP0x)+γ(P2xP0x)t(bxax)ayP0y=β(P1yP0y)+γ(P2yP0y)t(byay)azP0z=β(P1zP0z)+γ(P2zP0z)t(bzaz)\begin{aligned} &a_{x}-P_{0 x}=\beta\left(P_{1 x}-P_{0 x}\right)+\gamma\left(P_{2 x}-P_{0 x}\right)-t\left(b_{x}-a_{x}\right) \\ &a_{y}-P_{0 y}=\beta\left(P_{1 y}-P_{0 y}\right)+\gamma\left(P_{2 y}-P_{0 y}\right)-t\left(b_{y}-a_{y}\right) \\ &a_{z}-P_{0 z}=\beta\left(P_{1 z}-P_{0 z}\right)+\gamma\left(P_{2 z}-P_{0 z}\right)-t\left(b_{z}-a_{z}\right) \end{aligned}

for performance, using cramer's rule one has

β=P1xP0xP2xP0xbxaxP1yP0yP2yP0ybyayP1zP0zP2zP0zbzaz1axP0xP2xP0xbxaxayP0yP2yP0ybyayazP0zP2zP0zbzaz\beta=\left|\begin{array}{ccc} P_{1x}-P_{0x} & P_{2x}-P_{0x} & b_{x}-a_{x} \\ P_{1y}-P_{0y} & P_{2y}-P_{0y} & b_{y}-a_{y} \\ P_{1z}-P_{0z} & P_{2z}-P_{0z} & b_{z}-a_{z} \end{array}\right|^{-1}\left|\begin{array}{ccc} a_x-P_{0x} & P_{2x}-P_{0x} & b_{x}-a_{x} \\ a_y-P_{0y} & P_{2y}-P_{0y} & b_{y}-a_{y} \\ a_z-P_{0z} & P_{2z}-P_{0z} & b_{z}-a_{z} \end{array}\right|

etc.

spheres:

consider sphere with center c\mathbf{c} and radius rr defined implicitly as:

(Pc)(Pc)=r2(Pxcx)2+(Pycy)2+(Pzcz)2=r2(P-\mathbf{c})\cdot(P-\mathbf{c})=r^2\\ (P_x-c_x)^2+(P_y-c_y)^2+(P_z-c_z)^2=r^2

to get intersection, we substitute ray's equation into this equation to get

(ba)2t2+(ba)(ac)2t+(ac)2r2=0(\mathbf{b}-\mathbf{a})^2t^2+(\mathbf{b}-\mathbf{a})\cdot(\mathbf{a}-\mathbf{c})2t+(\mathbf{a}-\mathbf{c})^2-r^2=0

if we want an intersection point between the ray's origin (a) and a specific destination point (b), then we should test for 0t10\leq t\leq 1. in the case of two intersections, we pick the one closest to the origin.

img
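a compact C++ sketch of the sphere test (helper names are mine): solve the quadratic above and keep the nearest root with 0 ≤ t ≤ 1:

#include <array>
#include <cmath>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& u, const Vec3& v) { return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]; }
Vec3 sub(const Vec3& u, const Vec3& v) { return {u[0]-v[0], u[1]-v[1], u[2]-v[2]}; }

// returns true (and the nearest t in [0, 1]) if the segment a->b hits the sphere (c, r)
bool intersectSphere(const Vec3& a, const Vec3& b, const Vec3& c, double r, double& tHit) {
    Vec3 d = sub(b, a);        // ray direction (b - a)
    Vec3 oc = sub(a, c);       // (a - c)
    double A = dot(d, d);
    double B = 2.0 * dot(d, oc);
    double C = dot(oc, oc) - r*r;
    double disc = B*B - 4*A*C;
    if (disc < 0) return false;                 // no real roots: ray misses the sphere
    double sq = std::sqrt(disc);
    double t0 = (-B - sq) / (2*A);              // nearest root first
    double t1 = (-B + sq) / (2*A);
    if (t0 >= 0.0 && t0 <= 1.0) { tHit = t0; return true; }
    if (t1 >= 0.0 && t1 <= 1.0) { tHit = t1; return true; }
    return false;
}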

basic ray tracing

after obtaining primary rays, intersections with basic primitives, we can trace the rays through the scene to determine the color of a specific pixel on screen.

img

ray_color(ray r, point2D uv, int maxHits):
    auto [ hit,
           kd, ks, ke,      // color (RGB triples)
           n,               // normal at intersection point
           t,               // parameter of ray equation at intersection point
           material_type    // diffuse or specular
         ] = intersect(r, scene)
    if hit:
        Color col = ke*Le + kd*La   // Le: emitted radiance, La: ambient radiance
        p = ray(t)                  // intersection point
        if material_type is diffuse:
            col = col + kd * direct_light(p, uv)
        if material_type is specular and maxHits < /* preset value */:
            auto reflected_ray = reflection(r.direction, n)
            col = col + ks * ray_color(reflected_ray, uv, maxHits+1)
        return col
    else:
        // no object is hit, use background color
        return background_color   // RGB triple

the surface normal is required to obtain the incidence angles needed by the laws of reflection and refraction.

for OO design, we should have some base class Surface with derived classes Triangle, Sphere, Group with intersect virtual methods.

aliasing

raster image sampling is a sampling of a continuous function. if the samples are too far apart, then we do not get a true representation of the scene.

nyquist limit:

resolution:

ray tracing:

sampling improvements

img

oversampling: by increasing number of rays per pixel, one can also mitigate other limitations of the basic ray tracing framework such as sharp shadows, sharp reflections, sharp refractions, but they also add cost.

distributed ray tracing approach by [R.L. Cook, T. Porter, and L. Carpenter. Distributed ray tracing. Computer Graphics (SIGGRAPH Proceedings), 18(4):165–174, July 1984.] consists in stochastically distributing (perturbing) rays rather than simply adding many more of them:

the rendering equation

[J.T. Kajiya. The rendering equation. Computer Graphics (SIGGRAPH Proceedings), 20(4):143–150, 1986]

in simplified form it is written as:

L(x,ψ,λ)=Le(x,ψ,λ)+Lp(x,ψ,λ)L(x,\psi,\lambda)=L_e(x,\psi,\lambda)+L_p(x,\psi,\lambda)

meaning that the radiance of a point xx on a surface in direction ψ\psi and wavelength λ\lambda is given by the sum of the emitted radiance component LeL_e and propagated radiance component LpL_p. usually LeL_e is known, and the computation of LpL_p is the major computational problem.

for ray tracing framework for the simulation of light propagation mechanisms, LpL_p can be computed as

Lp(x,ψ,λ)=incoming ψif(x,ψ,ψi,λ)Li(x,ψi,λ)cosθidωiL_p(x,\psi,\lambda)=\int_{\text{incoming }\psi_i}f(x,\psi,\psi_i,\lambda)L_i(x,\psi_i,\lambda)\cos\theta_id\vec{\omega}_i

where

img

the ray tracing uses deterministic methods (gaussian quadrature) or probabilistic methods to solve the integral. more specifically, new directions followed by a ray are sampled recursively at each intersection point such that

Lr(x,ψ)=Le(x,ψ)+fr(x,ψ,ψi)Le(x,ψ)+fr(x,ψ,ψi)fr(x,ψ,ψi)Le(x,ψ)+\begin{aligned} L_{r}(x, \psi)=& L_{e}(x, \psi)+\\ & f_{r}\left(x, \psi, \psi_{i}\right) L_{e}\left(x^{\prime}, \psi^{\prime}\right)+\\ & f_{r}\left(x, \psi, \psi_{i}\right) f_{r}\left(x^{\prime}, \psi^{\prime}, \psi_{i}^{\prime}\right) L_{e}\left(x^{\prime \prime}, \psi^{\prime \prime}\right)+\ldots \end{aligned}

where LrL_r is reflected radiance. each ray bounce can be seen as a state of a random walk.

Week 10. Mar 8

texture mapping

if we want to render a brick wall, we could geometrically model the details of each brick and the grooves between them... but can also simply map a raster file depicting a brick wall.

we can do texture mapping in different rendering pipelines or ray tracing. if done by ray tracing, we do

  1. find intersection point p(x,y,z)p(x,y,z) on the surface
  2. convert to u, v coordinates between [0, 1] of the (parametric) surface
  3. pick correct pixel/color from the raster file using weighted average

img

eg. suppose we want to map the texture file (right) to the rectangular region (left),

img

after getting the point p(u,v)p(u,v), find the corresponding location in the raster file:

di=(w1)u, dj=(h1)vd_i=(w-1)u,~d_j=(h-1)v

then find up,vpu_p,v_p in [0, 1] for that pixel

i=int(di), j=int(dj), up=dii, vp=djji=\texttt{int}(d_i),~j=\texttt{int}(d_j),~u_p=d_i-i,~v_p=d_j-j

make sure they are within the raster file and clamp if needed (unless we want to repeat). then obtain the color of surrounding pixels:

C00=color(i,j), C01=color(i,j+1), C10=color(i+1,j), C11=color(i+1,j+1)C_{00}=\text{color}(i,j),~C_{01}=\text{color}(i, j+1),~C_{10}=\text{color}(i+1, j),~C_{11}=\text{color}(i+1, j+1)

finally interpolate color using bilinear interpolation

C(u,v)=C00(1up)(1vp)+C01(1up)vp+C10up(1vp)+C11upvpC(u,v)=C_{00}(1-u_p)(1-v_p)+C_{01}(1-u_p)v_p+C_{10}u_p(1-v_p)+C_{11}u_pv_p
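a C++ sketch of this lookup (the raster access color(i, j) is assumed to exist; clamping at the border stands in for the "clamp if needed" step):

#include <algorithm>
#include <array>

using Color = std::array<double, 3>;

Color color(int i, int j);           // assumed: reads pixel (i, j) of the raster file

Color bilinearSample(double u, double v, int w, int h) {
    double di = (w - 1) * u, dj = (h - 1) * v;
    int i = static_cast<int>(di), j = static_cast<int>(dj);
    double up = di - i, vp = dj - j;                                   // fractional offsets
    int i1 = std::min(i + 1, w - 1), j1 = std::min(j + 1, h - 1);      // clamp at the border
    Color c00 = color(i, j), c01 = color(i, j1), c10 = color(i1, j), c11 = color(i1, j1);
    Color out;
    for (int k = 0; k < 3; ++k)
        out[k] = c00[k]*(1-up)*(1-vp) + c01[k]*(1-up)*vp + c10[k]*up*(1-vp) + c11[k]*up*vp;
    return out;
}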

how to find the transformation from p(x,y,z)p(x,y,z) to (u,v)(u,v)? for complicated (curvy) surfaces, can use

[Ch 11 "Fundamentals of Computer Graphics" by S. Marschner and P. Shirley]

the main idea is straightforward, but it requires attention for correct use, eg

if needed, tile the polygon

bump (normal) mapping

img

eg. suppose a surface is represented parametrically by Q(u,v)Q(u,v) where 0u,v10\leq u,v\leq1. its tangents Qu,QvQ_u,Q_v can be computed using partial derivatives.

img

we can then compute

X=N×Qv, Y=N×QuX=N\times Q_v,~Y=N\times Q_u

then the displacement is:

D=BuXBvYD=B_uX-B_vY

where Bu,BvB_u,B_v are partial derivatives (tangents) of the bump map at B(u,v)B(u,v). using user-defined threshold ϵ (eg 1/64), these partial derivatives can be approximated:

Bu=B(u+ϵ,v)B(uϵ,v)2ϵBv=B(u,v+ϵ)B(u,vϵ)2ϵB_u=\frac{B(u+\epsilon,v)-B(u-\epsilon,v)}{2\epsilon}\\ B_v=\frac{B(u,v+\epsilon)-B(u,v-\epsilon)}{2\epsilon}

then we can get N=N+DN'=N+D.

img
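a C++ sketch of the perturbation above (the bump map B(u, v) and the tangents Qu, Qv are assumed inputs; the cross-product helper is mine):

#include <array>

using Vec3 = std::array<double, 3>;

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0]};
}

double bump(double u, double v);     // assumed: the scalar bump map B(u, v)

Vec3 perturbNormal(const Vec3& N, const Vec3& Qu, const Vec3& Qv,
                   double u, double v, double eps = 1.0 / 64.0) {
    // central differences approximate the bump map's partial derivatives
    double Bu = (bump(u + eps, v) - bump(u - eps, v)) / (2 * eps);
    double Bv = (bump(u, v + eps) - bump(u, v - eps)) / (2 * eps);
    Vec3 X = cross(N, Qv), Y = cross(N, Qu);
    Vec3 D = {Bu*X[0] - Bv*Y[0], Bu*X[1] - Bv*Y[1], Bu*X[2] - Bv*Y[2]};   // D = Bu*X - Bv*Y
    return {N[0] + D[0], N[1] + D[1], N[2] + D[2]};                      // N' = N + D
}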

solid textures

it is hard to texture map onto curved surfaces, so we use a 3D texture. given a point p(x,y,z)p(x,y,z) on the surface, the color is T(x,y,z)T(x,y,z) where TT is the texture field.

for example:

// gives a '3D checker board'
if (floor(x) + floor(y) + floor(z)) % 2 == 0 {
    return red;
} else {
    return silver;
}

img

turbulence can also be created, for example in 1D:

turbulance(x)=i=0k[noise(2ix)2i]\texttt{turbulance}(x)=\sum_{i=0}^k\left|\left[\frac{\texttt{noise}(2^ix)}{2^i}\right]\right|

where kk is the smallest integer satisfying

12k+1<pixel size\frac{1}{2^{k+1}}<\text{pixel size}
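a C++ sketch of this sum (the underlying noise() function is assumed, e.g. Perlin-style noise; pixelSize would come from the renderer):

#include <cmath>

double noise(double x);   // assumed: smooth band-limited noise in roughly [-1, 1]

double turbulence(double x, double pixelSize) {
    double sum = 0.0;
    // stop at the smallest k with 1 / 2^(k+1) < pixelSize
    for (int i = 0; ; ++i) {
        sum += std::fabs(noise(std::pow(2.0, i) * x) / std::pow(2.0, i));
        if (1.0 / std::pow(2.0, i + 1) < pixelSize) break;
    }
    return sum;
}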

using solid textures avoids the problem of continuity when painting multiple objects, but requires an analytical formula.

shadow mapping and environment mapping

... read Sections 11.1, 11.2, 11.4 and 11.5 of the e-book "Fundamentals of Computer Graphics" by S. Marschner and P. Shirley

environment map: if the light source is far from the object, then the object's lighting varies little with position, so we can consider the only variable to be direction. we can view the light source as having a raster image that is to be used with spherical coordinates. since the object lies on the light source's sphere, we can give it a texture mimicking the reflection of the sky.

performance

ray intersect object is often expensive and can be improved in two ways:

bounding box:

img

spatial subdivision:

img

path tracing

in basic ray tracing approach, a ray path can have two branches: reflect and transmit. [J.T. Kajiya. The rendering equation. Computer Graphics (SIGGRAPH Proceedings), 20(4):143–150, 1986.] proposes extension that follows most probable path at each intersection and shoots rays toward light source. it can be seen as algorithm that samples most important paths

img

steps:

note this applies to non-diffuse materials, for diffuse materials, no other ray propagates from them except shadow rays.

it can be altered by using weight ww that is initially 1. after generating random number, we still follow reflected and transmitted rays but with weights wFR,(1w)FRwF_R,(1-w)F_R respectively. if weight is lower than some threshold, we stop (similar to counter).

this can be extended to be combined with distributed ray tracing approach

img

when ray hits interface between air and polish material, compute fresnel coefficient and compare with random number. for a ray reflected on the air-polish interface, we can perturb its direction, by polar and azimuthal angular displacements:

α=cos1(1x1)1/(np+1), β=2πx2\alpha=\cos^{-1}(1-x_1)^{1/(n_p+1)},~\beta=2\pi x_2

obtained by [G.V.G. Baranoski and J.G. Rokne. Light Interaction with Plants: A Computer Graphics Perspective. Horwood Publishing, Chichester, UK, 2004. Chapter 2.]. where x1,x2[0,1]x_1,x_2\in[0,1] are random, and npn_p is phong-like exponent. larger npn_p -> glossier surface.

when ray hits diffuse substrate, a reflected ray with a cosine distribution is generated by perturbing material's normal using polar and azimuthal angular displacements:

αd=cos1(1x3)1/2, βd=2πx4\alpha_d=\cos^{-1}(1-x_3)^{1/2},~\beta_d=2\pi x_4

where x3,x4[0,1]x_3,x_4\in[0,1] are random.

Week 11. Mar 15

radiosity method

standard (classic) radiosity method

like the ray tracing method, the radiosity method can also be used to solve the rendering equation. instead of expressing LpL_p in terms of all directions visible to a point as in ray tracing, we consider the term LpL_p written as an integral over all surfaces within the environment:

Lp(x,dψ,λ)=all xjf(x,ψ,ψi,λ)Li(x,ψi,λ)cosθiV(x,xj)cosθjdAjxjx2L_p(x,d\psi,\lambda)=\int_{\text{all }x_j}f(x,\psi,\psi_i,\lambda)L_i(x,\psi_i,\lambda)\cos\theta_iV(x,x_j)\frac{\cos\theta_jdA_j}{||x_j-x||^2}

where

img

in this model, we use radiosity (radiant exitance) MM instead of radiance (we use conversion M=πLM=\pi L later), and assume all surfaces are diffuse reflectors. we also use the concept of form factor.

defn. the form factor/configuration factor indicates how a patch ii 'sees' a patch jj, ie it specifies the ratio between the fraction of radiant flux Φij\Phi_{ij} that leaves patch ii and arrives at jj, to the total radiant flux Φi\Phi_i that leaves patch ii:

Fij=ΦijΦi=1AiAiAjcosθicosθjπxixj2V(xi,xj)dAidAjF_{ij}=\frac{\Phi_{ij}}{\Phi_i}=\frac{1}{A_i}\oiint_{A_i}\oiint_{A_j}\frac{\cos\theta_i\cos\theta_j}{\pi||x_i-x_j||^2}V(x_i,x_j)dA_idA_j

where

see Section 18.5 of the e-book "Principles of Digital Image Synthesis" by A..Glassner for derivation.

computation of form factors is usually the performance bottleneck. there are some identities that can be used to simplify the computation:

  1. reciprocity: AiFij=AjFjiA_iF_{ij}=A_jF_{ji}
  2. summation: j=1nFij1 i=1,2,...,n\sum_{j=1}^nF_{ij}\leq1~\forall i=1,2,...,n
  3. a planar or convex patch cannot see itself, ie Fii=0F_{ii}=0

if we assume a closed environment whose surfaces are divided into nn patches, then the total radiant flux (power) leaving a patch depends on the spectral radiant flux emitted by this patch plus the spectral radiant flux reflected from the patch (recursively). also assuming the environment consists only of diffuse surfaces, we can drop the directional and positional dependencies, and we have

\Phi_j(\lambda)=\Phi_j^E(\lambda)+\rho_j(\lambda)\sum_{i=1}^nF_{ij}\Phi_i(\lambda)~\forall j=1,2,...,n

where

the perfect diffuse BRDF for a patch jj is given by:

fd(λ)=ρj(λ)πf_d(\lambda)=\frac{\rho_j(\lambda)}{\pi}

the π\pi term is absorbed in the definition of form factor FijF_{ij}.

we can write the equation as spectral radiant exitance (spectral radiant flux leaving an element per unit area) and spectral irradiance (spectral radiant flux emitted by element per unit area):

Φj(λ)=πMj(λ)Aj ΦjE(λ)=πEj(λ)Aj\Phi_j(\lambda)=\pi M_j(\lambda)A_j\\~\\ \Phi_j^E(\lambda)=\pi E_j(\lambda)A_j

where

by substitution we get:

πMj(λ)Aj=πEj(λ)Aj+ρj(λ)i=1nFijπMi(λ)Ai\pi M_j(\lambda)A_j=\pi E_j(\lambda)A_j+\rho_j(\lambda)\sum_{i=1}^nF_{ij}\pi M_i(\lambda)A_i

applying the reciprocity relationship and dividing by πAj\pi A_j, we get the classical expression in terms of spectral radiant exitance for patch jj:

Mj(λ)=Ej(λ)+ρj(λ)i=1nFjiMi(λ)M_j(\lambda)=E_j(\lambda)+\rho_j(\lambda)\sum_{i=1}^nF_{ji}M_i(\lambda)

["Modeling the Interaction of Light Between Diffuse Surfaces" by C.M. Goral et al.]

remark. radiosity = radiant exitance = radiant flux density.

now we have a system of linear equations $Gm=e$ whose coefficient matrix is $G$, the unknowns $m$ are the radiant exitances, and $e$ is the vector of irradiances. the elements of $G$ are $G_{ij}=\delta_{ij}-\rho_iF_{ij}$ where $\delta_{ij}$ is the Kronecker delta. the matrix is also called the radiosity coefficient matrix and is commonly represented by:

G=IPFG=I-PF

where PP is a diagonal matrix whose diagonal entries pii=ρip_{ii}=\rho_i, and FF is form factor matrix.

since this matrix is sparse, and we need speed rather than high accuracy, direct methods like Gaussian elimination and LU decomposition are not suitable. given these aspects plus the special properties of $G$, ie nonsingularity and diagonal dominance, iterative methods like the conjugate gradient and Chebyshev methods are more suitable.

as colors are usually represented as RGB triples, we need to solve this system 3 times. then we use interpolation techniques to get a smooth picture.

[M.F. Cohen and J.R. Wallace. Radiosity and Realistic Image Synthesis. Academic Press Professional, Cambridge, 1993.]
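
a minimal sketch of one such iterative solve, using a Gauss-Seidel style sweep on $M_j=E_j+\rho_j\sum_iF_{ji}M_i$ (a simpler relative of the methods named above; the container layout and names are illustrative, not from the notes):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Iteratively solve the radiosity system G m = e with G = I - P F, written as
// M_j = E_j + rho_j * sum_i F_ji * M_i.  F[j][i] is the form factor from patch j
// to patch i, rho[j] the diffuse reflectance, e[j] the irradiance of patch j.
std::vector<double> solveRadiosity(const std::vector<std::vector<double>>& F,
                                   const std::vector<double>& rho,
                                   const std::vector<double>& e,
                                   int maxIters = 100, double tol = 1e-6) {
    const std::size_t n = e.size();
    std::vector<double> m = e;  // start from the emitted exitances
    for (int it = 0; it < maxIters; ++it) {
        double maxChange = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            double gather = 0.0;
            for (std::size_t i = 0; i < n; ++i)
                gather += F[j][i] * m[i];  // light gathered from all patches
            double updated = e[j] + rho[j] * gather;
            maxChange = std::max(maxChange, std::fabs(updated - m[j]));
            m[j] = updated;  // Gauss-Seidel: updated values are reused immediately
        }
        if (maxChange < tol) break;  // diagonal dominance of G makes this converge
    }
    return m;  // in practice, solve once per RGB channel
}
```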

improvements

radiosity via ray tracing approach

as in the path-tracing strategy, consider that a ray carries a certain amount of radiant power/radiant flux $\Phi$. initially we pick a patch from which we shoot (cast) some rays into the environment; normally the selected patch corresponds to a light source. as patches are assumed to be diffuse, rays follow a cosine distribution: we perturb the patch normal and shoot the ray $\mathbf{r}_d$.

once we shoot the ray from light source, we recursively determine patches hit by the ray and record the information.

    shoot_rays(int Nr):
        auto n = new int[patches.size()] {0}  // n[i] counts hits received by patch i, for patch i in patches
        for j = 0, Nr, 1:
            auto rj = create ray with cos distribution from the light source patch
            auto maxHits = 0
            find_patch(rj, maxHits)
        return n

    find_patch(Ray rj, int maxHits):
        auto [i, p] = find patch i hit by ray rj at point p
        ++n[i]
        auto u = random(0, 1)
        if u <= patches[i].R and maxHits < /*predefined constant*/:  // R: patch reflectance (Russian roulette)
            auto rd = generate reflected ray with cos distribution from p
            find_patch(rd, maxHits + 1)

after we trace random walks of the Nr rays shot from the light, we can compute the power incident on patch ii as:

Φi=ΦtotalniNr\Phi_i=\Phi_\text{total}\frac{n_i}{N_r}

where $\Phi_\text{total}$ is a user-specified value for the total power shot from the light source.

since patches are diffuse, the radiance can be calculated as:

Li=ρiΦiπAiL_i=\frac{\rho_i\Phi_i}{\pi A_i}

where ρi\rho_i is the reflectance, AiA_i area of patch ii [P. Shirley. A ray tracing method for illumination calculation in diffuse-specular scenes. In Graphics Interface, pages 205–212, Toronto, 1990. Canadian Information Processing Society.].

if reflectance is represented by an RGB triple, this calculation needs to be performed 3 times.

this method solves the rendering equation by recursively computing the reflected component, and can be represented as a Neumann series:

Lr(x)=E+frd iE+frd i(frd jE)+...L_r(x)=E+f_{rd~i}E+f_{rd~i}(f_{rd~j}E)+...

where EE is light source irradiance and frd i=ρiπf_{rd~i}=\frac{\rho_i}{\pi} is BRDF of diffuse patches.

if we shoot NrN_r rays with cos distribution from patch ii and mm rays arrive at patch jj, then the form factor can be approximated as: Fij=mNrF_{ij}=\frac{m}{N_r}.

one can also have diffuse, translucent surfaces (eg a lampshade). in this case, the algorithm would also consider rays transmitted with a cosine distribution, and the rendering equation will have a transmitted component handled similarly:

Ltotal=Lemitted+(Lreflected+Ltransmitted)L_{\text{total}}=L_{\text{emitted}}+(L_\text{reflected}+L_\text{transmitted})

img

Note the color bleeding effect (when the diffuse reflection of a colored patch influences the color of another patch), typical of diffuse interreflections, on the sphere surface. Note also some degree of noise on the surfaces. Since diffuse surfaces reflect light in all directions, to closely approximate this behavior, one has to use a relatively large number of rays. Usually the presence of noise in images generated using stochastic techniques is associated with the use of an insufficient number of rays for a given scene.

standard multipass method

standard multipass method has two main passes:

  1. radiosity pass: generates results from multiple diffuse interreflections in the environment, which are obtained using a relatively coarse mesh. as radiosity is view-independent, it can be stored and used later to replace the ad hoc ambient component
  2. ray tracing pass: adds view-dependent features (highlights). the relative visibility of objects is normally handled in this stage.

img

note that no single type of light should be included more than once in the computation of the final radiance of a point (patch); we can subtract the direct component from the first pass.

img

the color-bleeding effect can be obtained using the radiosity pass.

in the standard ray tracing method, we follow light backwards. there are view-independent effects like caustics (bright regions elicited by light rays reflected/refracted by a curved surface or an object acting as a lens) that require tracing forward. we can have a third pass to trace rays from light sources through non-diffuse objects, storing their contributions to nearby surfaces. if the surfaces are somewhat regular (polygons), these contributions can be stored in a simple matrix (illumination map); otherwise, more complex data structures like trees can be used.

Week 12. Mar 22

splines

splines: constructing curve segments.

cubic hermite curve

review about hermite cubics: a segment of a cubic Hermite spline allows the positions and first derivatives of both of its endpoints to be specified. A chain of segments can be linked into a C1C^1 spline by using the same values for the position and derivative of the end of one segment and for the beginning of the next.

given a set of nn control points, where every other control point is a derivative value, a cubic Hermite spline contains (n2)/2(n - 2)/2 cubic segments.
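
a minimal sketch of evaluating one Hermite segment; the cubic Hermite basis functions used here are standard, but are not spelled out in the notes above:

```cpp
struct Point2 { double x, y; };

// Evaluate one cubic Hermite segment given endpoint positions p0, p1 and endpoint
// derivatives m0, m1.  A chain of such segments is C^1 if consecutive segments
// share the position and derivative at their common endpoint.
Point2 hermite(Point2 p0, Point2 m0, Point2 p1, Point2 m1, double t) {
    double t2 = t * t, t3 = t2 * t;
    double h00 = 2 * t3 - 3 * t2 + 1;  // weight of p0
    double h10 = t3 - 2 * t2 + t;      // weight of m0
    double h01 = -2 * t3 + 3 * t2;     // weight of p1
    double h11 = t3 - t2;              // weight of m1
    return {h00 * p0.x + h10 * m0.x + h01 * p1.x + h11 * m1.x,
            h00 * p0.y + h10 * m0.y + h01 * p1.y + h11 * m1.y};
}
```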

Bernstein polynomials

linear blend: create segment from an affine combination of points using two control points:

P01(t)=(1t)P0+tP1P_0^1(t)=(1-t)P_0+tP_1

img

quadratic blend: create segment from an affine combination of line segments using three control points:

P01(t)=(1t)P0+tP1P12(t)=(1t)P1+tP2P02(t)=(1t)P01(t)+tP11(t)\begin{aligned} P_0^1(t)&=(1-t)P_0+tP_1\\ P_1^2(t)&=(1-t)P_1+tP_2\\ P_0^2(t)&=(1-t)P_0^1(t)+tP_1^1(t) \end{aligned}

img

cubic blend: create segment from an affine combination of quadratic segments using four control points:

P01(t)=(1t)P0+tP1P11(t)=(1t)P1+tP2P21(t)=(1t)P2+tP3P02(t)=(1t)P01(t)+tP11(t)P12(t)=(1t)P11(t)+tP21(t)P03(t)=(1t)P02(t)+tP12(t)\begin{aligned} P_0^1(t)&=(1-t)P_0+tP_1\\ P_1^1(t)&=(1-t)P_1+tP_2\\ P_2^1(t)&=(1-t)P_2+tP_3\\ P_0^2(t)&=(1-t)P_0^1(t)+tP_1^1(t)\\ P_1^2(t)&=(1-t)P_1^1(t)+tP_2^1(t)\\ P_0^3(t)&=(1-t)P_0^2(t)+tP_1^2(t) \end{aligned}

img

algo (de Casteljau): given control points $P_i$

  1. join the points PiP_i's by line segments
  2. divide each segment at the ratio $t:(1-t)$. join the resulting $t:(1-t)$ points by line segments
  3. repeat for the resultant points if necessary (until there is one segment)
  4. the $t:(1-t)$ point on the final line segment is a point on the curve
  5. the final line segment is tangent to the curve at tt

if we expand the terms, we see the original control points appear weighted by Bernstein polynomials:

P00(t)=1P0P01(t)=(1t)P0+tP1P02(t)=(1t)2P0+2(1t)tP1+t2P2P03(t)=(1t)3P0+3(1t)2tP1+3(1t)t2P2+t3P3...P0n(t)=i=0nPiBin(t) where Bin(t)=n!(ni)!i!(1t)niti=(ni)(1t)niti\begin{aligned} P_{0}^{0}(t) &=1 \cdot P_{0} \\ P_{0}^{1}(t) &=(1-t) P_{0}+t P_{1} \\ P_{0}^{2}(t) &=(1-t)^{2} P_{0}+2(1-t) t P_{1}+t^{2} P_{2} \\ P_{0}^{3}(t) &=(1-t)^{3} P_{0}+3(1-t)^{2} t P_{1}+3(1-t) t^{2} P_{2}+t^{3} P_{3} \\...\\ P_{0}^{n}(t) &=\sum_{i=0}^{n} P_{i} B_{i}^{n}(t) \\ \text { where } & B_{i}^{n}(t)=\frac{n !}{(n-i) ! i !}(1-t)^{n-i} t^{i}=\left(\begin{array}{c} n \\ i \end{array}\right)(1-t)^{n-i} t^{i} \end{aligned}
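
a minimal sketch of the de Casteljau evaluation described above (2D points, repeated affine combination; names are illustrative):

```cpp
#include <vector>

struct Point2 { double x, y; };

// de Casteljau evaluation of a Bezier curve: repeatedly replace each adjacent pair
// of points by its t:(1-t) affine combination until one point remains.  That point
// is P_0^n(t); the last two points before the final step span the tangent at t.
Point2 deCasteljau(std::vector<Point2> pts, double t) {
    for (std::size_t level = pts.size() - 1; level > 0; --level)
        for (std::size_t i = 0; i < level; ++i) {
            pts[i].x = (1 - t) * pts[i].x + t * pts[i + 1].x;
            pts[i].y = (1 - t) * pts[i].y + t * pts[i + 1].y;
        }
    return pts[0];
}
```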

properties:

Bézier splines

defn. a degree nn (order n+1n+1) Bézier curve segment is

P(t)=i=0nPiBin(t)P(t)=\sum_{i=0}^n P_i B_i^n(t)

where PiP_i's are k-dimensional control points.

curve of degree nn is controlled by n+1n+1 control points. the curve interpolates its first and last control points, and the shape is directly influenced by the other points.

properties:

see [Ch 15.6.1 "Fundamentals of Computer Graphics" by S. Marschner and P. Shirley] for algebraic derivation of Bézier curves.

spline continuity

polynomials are inadequate:

piecewise polynomials (splines):

C0C^0 piecewise cubic bezier:

img

C1C^1 piecewise cubic bezier:

img

cubic hermite interpolation:

Catmull-Rom splines:

img

C2C^2 piecewise cubic bezier:

img

img

the grey points are B-spline control points. we can use them to recover the black points.

why not use interpolatory curves?

B-splines

defn. for a given sequence of knots t1,t2,...tnt_1,t_2,...t_n, there is a unique, up to a scaling factor, spline Bi,nB_{i,n} satisfying

Bi,n(x)={0 if x<ti or xti+nnon-zero  otherwise B_{i, n}(x)= \begin{cases}0 & \text { if } x<t_{i} \text { or } x \geq t_{i+n} \\ \text {non-zero } & \text { otherwise }\end{cases}

if we additionally constrain that i=0nBi,n(x)=1\sum_{i=0}^nB_{i,n}(x)=1, then the scaling factor is fixed, and the resulting functions Bi,n(x)B_{i,n}(x) are called B-splines. their linear combination is B-spline curve.

to evaluate B-splines, we can convert to Bézier form and evaluate the resulting Bézier curves.
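
alternatively, the basis functions can be evaluated directly with the Cox-de Boor recursion (not covered in the notes; shown here only as a sketch, with $n$ being the order as in the definition above):

```cpp
#include <vector>

// Cox-de Boor recursion for the B-spline basis B_{i,n} on knot vector t, where n is
// the order (so B_{i,n} is nonzero on [t_i, t_{i+n}), matching the definition above).
double bsplineBasis(int i, int n, double x, const std::vector<double>& t) {
    if (n == 1)
        return (t[i] <= x && x < t[i + 1]) ? 1.0 : 0.0;
    double left = 0.0, right = 0.0;
    if (t[i + n - 1] != t[i])
        left = (x - t[i]) / (t[i + n - 1] - t[i]) * bsplineBasis(i, n - 1, x, t);
    if (t[i + n] != t[i + 1])
        right = (t[i + n] - x) / (t[i + n] - t[i + 1]) * bsplineBasis(i + 1, n - 1, x, t);
    return left + right;
}
```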

nonuniformity:

see [Ch 15.6.2 "Fundamentals of Computer Graphics" by S. Marschner and P. Shirley] for B-splines.

tensor product patches

defn. a control polygon is a polygonal mesh with vertices $P_{i,j}$.

defn. patch blending functions are products of curve basis functions:

P(s,t)=i=0nj=0nPi,jBi,jn(s,t)P(s,t)=\sum_{i=0}^{n} \sum_{j=0}^{n} P_{i,j}B_{i,j}^n(s,t)

where Bi,jn(s,t)=Bin(s)Bjn(t)B_{i,j}^n(s,t)=B_i^n(s)B_j^n(t).

img

for the cubic Bézier case, just like a cubic Bézier curve has 4 control points, a bicubic Bézier patch has 16 control points. traversing along any parametric direction, we traverse a Bézier curve.
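
a minimal sketch of evaluating such a bicubic Bézier patch directly from the tensor product formula (array layout and names are illustrative):

```cpp
#include <array>

struct Point3 { double x, y, z; };

// cubic Bernstein basis B_i^3(t)
double bernstein3(int i, double t) {
    double s = 1 - t;
    switch (i) {
        case 0: return s * s * s;
        case 1: return 3 * s * s * t;
        case 2: return 3 * s * t * t;
        default: return t * t * t;
    }
}

// evaluate P(s,t) = sum_i sum_j P[i][j] B_i^3(s) B_j^3(t) from the 4x4 control grid
Point3 bezierPatch(const std::array<std::array<Point3, 4>, 4>& P, double s, double t) {
    Point3 result{0, 0, 0};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j) {
            double w = bernstein3(i, s) * bernstein3(j, t);
            result.x += w * P[i][j].x;
            result.y += w * P[i][j].y;
            result.z += w * P[i][j].z;
        }
    return result;
}
```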

img

properties:

how to render:

tensor product B-splines:

tensor product patch works well for rectilinear patches, but problems arise when filling non-rectangular holes.

img

Week 13. Mar 29

animation

animation: rapid display of slightly different images creates illusion of motion.

(traditional) 2D cel animation:

automated keyframing:

utility:

functional animation

linear interpolation (lerp): given two parameters p0,p1p_0,p_1 at times t0,t1t_0,t_1, the value of the parameter at time tt is given by:

p(t)=t1tt1t0p0+tt0t1t0p1p(t)=\frac{t_1-t}{t_1-t_0}p_0+\frac{t-t_0}{t_1-t_0}p_1

from a constant rate of change we can create a variable rate of change: given any function $\tau=f(t)$, we can reparameterize with $\tau$:

p(τ)=p(f(t))=f(t1)f(t)f(t1)f(t0)p0+f(t)f(t0)f(t1)f(t0)p1p(\tau)=p(f(t))=\frac{f(t_1)-f(t)}{f(t_1)-f(t_0)}p_0+\frac{f(t)-f(t_0)}{f(t_1)-f(t_0)}p_1
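
a minimal sketch of both formulas; the ease-in/ease-out $f$ used here is only one illustrative choice of reparameterization (any monotone $f$ works):

```cpp
// linear interpolation of a parameter between keyframes (t0, p0) and (t1, p1)
double lerpParam(double t, double t0, double t1, double p0, double p1) {
    return (t1 - t) / (t1 - t0) * p0 + (t - t0) / (t1 - t0) * p1;
}

// reparameterization tau = f(t): here a smoothstep mapping [t0, t1] onto itself,
// giving slow-in/slow-out motion instead of a constant rate of change
double easeInOut(double t, double t0, double t1) {
    double u = (t - t0) / (t1 - t0);
    return t0 + (t1 - t0) * (3 * u * u - 2 * u * u * u);
}

// p(f(t)) as in the reparameterized formula above
double lerpEased(double t, double t0, double t1, double p0, double p1) {
    return lerpParam(easeInOut(t, t0, t1), t0, t1, p0, p1);
}
```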

spline interpolation:

transformation matrix animation:

rigid body animation:

motion path animation

when we translate a point along a path, we want

splines can easily support continuity control, but velocity control is more difficult:

arc length parameterization:

  1. given spline path P(u)=[x(u),y(u),z(u)]P(u)=[x(u),y(u),z(u)], compute arclength of spline as function of uu: s=A(u)s=A(u)
  2. find inverse: u=A1(s)u=A^{-1}(s)
  3. substitute u=A1(s)u=A^{-1}(s) into P(u)P(u) to find motion path parameterized by arc length: P(s)=P(A1(s))P(s)=P(A^{-1}(s))
  4. u,su,s thus should be global parameters, extending across all segments of the original spline

the arclength is given by:

s=A(u)=0u(dx(v)dv)2+(dy(v)dv)2+(dz(v)dv)2dvs=A(u)=\int_{0}^u\sqrt{\left(\frac{dx(v)}{dv}\right)^2+\left(\frac{dy(v)}{dv}\right)^2+\left(\frac{dz(v)}{dv}\right)^2}dv

issue: $s=A(u)$ does not have an analytic solution if the motion path is a cubic spline => $A^{-1}(s)$ has no analytic solution either.
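
a common numerical workaround (not spelled out in the notes) is to tabulate $A(u)$ by summing chord lengths and invert it by table lookup; a minimal sketch, with illustrative names:

```cpp
#include <algorithm>
#include <array>
#include <cmath>
#include <functional>
#include <vector>

using Vec3 = std::array<double, 3>;

// tabulate s = A(u) at evenly spaced u values by accumulating chord lengths,
// then invert A by binary search plus linear interpolation
struct ArcLengthTable {
    std::vector<double> us, ss;

    ArcLengthTable(const std::function<Vec3(double)>& P, int samples) {
        us.resize(samples + 1);
        ss.resize(samples + 1);
        Vec3 prev = P(0.0);
        double acc = 0.0;
        for (int k = 0; k <= samples; ++k) {
            double u = static_cast<double>(k) / samples;
            Vec3 cur = P(u);
            acc += std::sqrt((cur[0] - prev[0]) * (cur[0] - prev[0]) +
                             (cur[1] - prev[1]) * (cur[1] - prev[1]) +
                             (cur[2] - prev[2]) * (cur[2] - prev[2]));
            us[k] = u;
            ss[k] = acc;  // approximate arclength up to u
            prev = cur;
        }
    }

    // u = A^{-1}(s): find the bracketing table entries and interpolate linearly
    double uAt(double s) const {
        auto it = std::lower_bound(ss.begin(), ss.end(), s);
        if (it == ss.begin()) return us.front();
        if (it == ss.end()) return us.back();
        std::size_t k = it - ss.begin();
        double w = (s - ss[k - 1]) / (ss[k] - ss[k - 1]);
        return us[k - 1] + w * (us[k] - us[k - 1]);
    }
};
```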

velocity control:

  1. let s=f(t)s=f(t) specify distance along spline as function of tt
  2. function f(t)f(t), being just scalar value, can be supported with a functional animation technique (eg. another spline)
  3. function f(t)f(t) may be specified as integral of yet another function v(t)=ddtf(t)v(t)=\frac{d}{dt}f(t)
  4. motion path as function of time is P(A1(f(t)))P(A^{-1}(f(t)))

issue in real-time animation:

img

interpolating rotations

we interpolate angles rather than transformation matrices. in 3d, orientation requires 3 degrees of freedom, so interpolation is nastier and harder to visualize.

there are two approaches: Euler angles and quaternions.

Euler angles: x-roll, then y-roll, then z-roll

with ca=cosθa,sa=sinθac_a=\cos\theta_a,s_a=\sin\theta_a, the transformation is:

R(θx,θy,θz)=[cyczcyszsy0sxsyczcxszsxsysz+cxczsxcy0cxsycz+sxszcxsyszsxczcxcy00001]=Rx(θx)Ry(θy)Rz(θz)\begin{aligned} R\left(\theta_{x}, \theta_{y}, \theta_{z}\right) &=\left[\begin{array}{cccc} c_{y} c_{z} & c_{y} s_{z} & -s_{y} & 0 \\ s_{x} s_{y} c_{z}-c_{x} s_{z} & s_{x} s_{y} s_{z}+c_{x} c_{z} & s_{x} c_{y} & 0 \\ c_{x} s_{y} c_{z}+s_{x} s_{z} & c_{x} s_{y} s_{z}-s_{x} c_{z} & c_{x} c_{y} & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \\ &=R_{x}\left(\theta_{x}\right) R_{y}\left(\theta_{y}\right) R_{z}\left(\theta_{z}\right) \end{aligned}

animation between two rotations involves interpolating the three angles independently.

gimbal lock: suppose θy=π2\theta_y=\frac{\pi}{2}, then the matrix R(θx,π2,θz)R(\theta_x,\frac{\pi}{2},\theta_z) becomes:

\begin{aligned} R\left(\theta_{x}, \tfrac{\pi}{2}, \theta_{z}\right) &=\left[\begin{array}{cccc} 0 & 0 & -1 & 0 \\ s_{x} c_{z}-c_{x} s_{z} & s_{x} s_{z}+c_{x} c_{z} & 0 & 0 \\ c_{x} c_{z}+s_{x} s_{z} & c_{x} s_{z}-s_{x} c_{z} & 0 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \\ &=\left[\begin{array}{cccc} 0 & 0 & -1 & 0 \\ \sin \left(\theta_{x}-\theta_{z}\right) & \cos \left(\theta_{x}-\theta_{z}\right) & 0 & 0 \\ \cos \left(\theta_{x}-\theta_{z}\right) & -\sin \left(\theta_{x}-\theta_{z}\right) & 0 & 0 \\ 0 & 0 & 0 & 1 \end{array}\right] \end{aligned}

only the difference $\theta_x-\theta_z$ appears, so one rotational degree of freedom is lost.

quaternions approach:

defn. a quaternion is a scalar plus a vector component q=(s,v),sR,vR3q=(s,\mathbf{v}),s\in\R, \mathbf{v}\in\R^3.

interpolating unit quaternions:
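
the standard tool here is spherical linear interpolation (slerp), which moves along the great arc between two unit quaternions at constant angular speed; a minimal sketch:

```cpp
#include <cmath>

// A unit quaternion q = (s, v) encodes a 3D rotation; slerp blends two unit
// quaternions along the great arc between them at constant angular speed.
struct Quat { double s, x, y, z; };

Quat slerp(Quat a, Quat b, double t) {
    double dot = a.s * b.s + a.x * b.x + a.y * b.y + a.z * b.z;
    if (dot < 0.0) {                 // negate one quaternion to take the shorter arc
        b = {-b.s, -b.x, -b.y, -b.z};
        dot = -dot;
    }
    if (dot > 0.9995) {              // nearly parallel: fall back to normalized lerp
        Quat r{a.s + t * (b.s - a.s), a.x + t * (b.x - a.x),
               a.y + t * (b.y - a.y), a.z + t * (b.z - a.z)};
        double n = std::sqrt(r.s * r.s + r.x * r.x + r.y * r.y + r.z * r.z);
        return {r.s / n, r.x / n, r.y / n, r.z / n};
    }
    double theta = std::acos(dot);   // angle between the two unit quaternions
    double wa = std::sin((1 - t) * theta) / std::sin(theta);
    double wb = std::sin(t * theta) / std::sin(theta);
    return {wa * a.s + wb * b.s, wa * a.x + wb * b.x,
            wa * a.y + wb * b.y, wa * a.z + wb * b.z};
}
```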

animating camera motion

their requirements are different from object motion:

camera cinematic terminology:

camera cinematics:

camera animation specs:

kinematics

kinematics: study of motion independent of the forces that cause the motion, including position, velocity and acceleration

forward kinematics: determination of x, v, a of all the links in an articulated model given the x, v, a of the root of the model and all transformations between links

inverse kinematics: derivation of motion of intermediate links in an articulated body given the motion of some key links

in most situations, the animator just wants the links to move naturally 'on their own', and is much more interested in specifying the behavior of the endpoint of a joint chain
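
a minimal sketch of forward kinematics for a 2-link planar arm (a toy articulated model; names and the planar setup are illustrative):

```cpp
#include <cmath>

struct Point2 { double x, y; };

// forward kinematics: joint angles theta1, theta2 and link lengths l1, l2 determine
// the end-effector position.  inverse kinematics goes the other way: solve for the
// angles that place the end effector at a desired point (in general a nonlinear
// problem with zero, one, or many solutions).
Point2 forwardKinematics2Link(double theta1, double theta2, double l1, double l2) {
    return {l1 * std::cos(theta1) + l2 * std::cos(theta1 + theta2),
            l1 * std::sin(theta1) + l2 * std::sin(theta1 + theta2)};
}
```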

Week 14. Apr 5

realism

["Three Varieties of Realism in Computer Graphics" by J.A. Ferwerda.]

variables of realism:

determining functional realism: