Fall 2014

  • Parts

    • \(f:X \to Y\) is surjective \(\iff \forall y\in Y,~\exists x\in X {~\mathrel{\Big\vert}~}y=f(x)\).

    • \(f:X \to Y\) is not injective \(\iff \exists x_1\neq x_2 \in X {~\mathrel{\Big\vert}~}f(x_1) = f(x_2).\)

    • \(f: (X, d_x) \to (Y, d_y)\) is uniformly continuous \(\iff \forall \varepsilon \exists \delta(\varepsilon) {~\mathrel{\Big\vert}~}\forall x_1,x_2\in X,~ \quad d_x(x_1, x_2) \leq \delta \implies d_y(f(x_1), f(x_2)) \leq \varepsilon.\)

      (Note that \(\delta\) can only depend on \(\varepsilon\) and must work for all \(x_1, x_2\) simultaneously.)

    • \(f\) is not uniformly continuous \(\iff \exists \varepsilon {~\mathrel{\Big\vert}~}\forall \delta, \exists x_1, x_2\in X {~\mathrel{\Big\vert}~}\quad d_x(x_1, x_2) \leq \delta ~\&~ d_y(f(x_1), f(x_2)) > \varepsilon.\)

  • Base case: for \(n=1\), we have \(a_1 = 1 \leq a_2 = \frac{16} 3 \leq 10.\) Inductive step: suppose \(a_k \leq a_{k+1} \leq 10\) for all \(k < n\). Then in particular \begin{align*} a_{n-1} \leq a_n = \frac{a_{n-1}}{3} + 5 \implies 3a_{n-1} \leq a_{n-1} + 15 \implies a_{n-1} \leq \frac{15}{2} \end{align*}

    and thus we have \begin{align*} a_{n+1} = \frac{a_n}{3} + 5 = \frac{1}{3}(a_n + 15) \\ = \frac{1}{3}\left(\left(\frac{a_{n-1}}{3} + 5\right) + 15\right) \\ = \frac{a_{n-1} + 60}{9} \\ \leq \frac{\frac{15}{2} + 60}{9} \\ = \frac{135}{18} \\ < \frac{180}{18} = 10, \end{align*}

and \(a_{n+1} \leq 10\). Moreover, note that the relation \(a_{n+1} = \frac{a_n}{3} + 5\) can be rewritten as \begin{align*} a_n = 3a_{n+1} - 15, \\ a_{n-1} = 3a_n - 15. \end{align*} Using the inductive hypothesis \(a_{n-1} \leq a_n\), we can thus write \begin{align*} 3a_n - 15 = a_{n-1} \leq a_n = 3a_{n+1} - 15, \end{align*}

from which we get \(3a_{n} - 15 \leq 3a_{n+1} - 15\) and thus \(a_{n} \leq a_{n+1}\).

To compute \(\lim_{n\to\infty}a_n\), perhaps there are easier ways, but we can just use generating functions. Note that the limit exists by the Monotone Convergence Theorem. Let \(A(x) = \sum_{n=0}^\infty a_n x^n\) where \(a_0 = 0\). (Taking \(a_0 = 0\) changes the initial terms — the recurrence then gives \(a_1 = 5\) rather than \(a_1 = 1\) — but any sequence satisfying this recurrence converges to the same limit, since the map \(a \mapsto \frac a 3 + 5\) is a contraction, so this is harmless for computing the limit.) Then applying the magic sauce, we have \begin{align*} a_n = \frac{1}{3}a_{n-1} + 5 &\implies \sum_{n=1}^\infty a_nx^n = \frac{1}{3}\sum_{n=1}^\infty a_{n-1}x^n + 5\sum_{n=1}^\infty x^n \\ &\implies A(x) - a_0 = \frac 1 3 xA(x) + 5\left( \frac 1 {1-x} - 1\right) \\ &\implies A(x)\left(1 - \frac x 3\right) = 5\left( \frac x {1-x}\right) \\ &\implies A(x) = 15\left(\frac 1 {3-x} \right)\left(\frac x {1-x} \right) \\ &\implies A(x) = \frac{15x}{(3-x)(1-x)} \\ &\implies A(x) = \frac{-\frac{45}{2}}{3-x} + \frac{\frac{15}{2}}{1-x} \\ &\implies A(x) = \frac 3 2 \left(-5 \left( \frac{1}{1-\frac x 3} \right) + 5\left( \frac 1 {1-x}\right) \right) \\ &\implies A(x) = \frac{15}{2} \sum_{n=0}^\infty \left(1 - \left( \frac 1 3\right)^n\right)x^n \\ &\implies a_n = \frac {15} 2 \left(1 - \left( \frac 1 3\right)^n\right) \end{align*}

and so we find \begin{align*} \lim_{n\to\infty}a_n = \lim_{n\to\infty}\frac {15} 2 \left(1 - \left( \frac 1 3\right)^n\right) = \frac{15}{2}. \hfill\blacksquare \end{align*}
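
As a sanity check on the partial fraction decomposition and the series coefficients above, one can expand \(A(x)\) symbolically; a minimal sketch, assuming sympy is available:

```python
from sympy import symbols, series, apart, Rational, simplify

x = symbols('x')
A = 15*x / ((3 - x)*(1 - x))

# Partial fraction decomposition: should give -45/2/(3-x) + 15/2/(1-x)
print(apart(A, x))

# Compare the Taylor coefficients of A(x) at x = 0 with (15/2)(1 - (1/3)^n)
coeffs = series(A, x, 0, 8).removeO().as_poly(x).all_coeffs()[::-1]
for n, c in enumerate(coeffs):
    closed_form = Rational(15, 2) * (1 - Rational(1, 3)**n)
    assert simplify(c - closed_form) == 0
print("coefficients match (15/2)(1 - (1/3)^n) for n = 0..7")
```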

Alternatively: \begin{align*} a_{n+1} = \frac 1 3 a_n + 5 \implies \lim_{n\to\infty} a_{n+1} = \lim_{n\to\infty} \frac 1 3 a_n + 5 \\ \implies L = \frac 1 3 L + 5 \implies \frac 2 3 L = 5 \implies L = \frac {15} 2 \end{align*}
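
Either way, a quick numerical sanity check of the monotonicity, the bound, and the limit \(\frac{15}{2}\) is to simply iterate the recurrence (a minimal sketch, not part of the proof):

```python
# Iterate a_{n+1} = a_n / 3 + 5 starting from a_1 = 1 and check the claims.
a = 1.0
prev = a
for n in range(1, 100):
    a = a / 3 + 5
    assert prev <= a <= 10, "sequence should be increasing and bounded by 10"
    prev = a

print(a)                       # ~7.5
print(abs(a - 15/2) < 1e-12)   # True: the sequence converges to 15/2
```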

  • Parts

    • Suppose \(\exists M_g > 0 {~\mathrel{\Big\vert}~}\forall x,~ {\left\lvert {g(x)} \right\rvert} \leq M_g\). Then let \(\varepsilon > 0\) be arbitrarily chosen; we want to show that there exists a \(\delta\) such that \({\left\lvert {x} \right\rvert} \leq \delta \implies {\left\lvert {f(x)g(x)} \right\rvert} \leq \varepsilon\). Since \(\lim_{x\to 0} f(x) = 0\), choose a \(\delta_f\) such that \({\left\lvert {x} \right\rvert} \leq \delta_f \implies {\left\lvert {f(x)} \right\rvert} \leq \frac{\varepsilon}{M_g}\). So letting \(\delta = \delta_f\), we have \begin{align*} {\left\lvert {x} \right\rvert} \leq \delta \implies {\left\lvert {f(x)g(x)} \right\rvert} = {\left\lvert {f(x)} \right\rvert} {\left\lvert {g(x)} \right\rvert} \leq {\frac{\varepsilon}{M_g}}{\left\lvert {g(x)} \right\rvert} \leq \frac{\varepsilon}{M_g}M_g = \varepsilon. \hfill\blacksquare \end{align*}
    • Let \(f(x) = x\) and \(g(x) = \frac{1}{x}\). Note that \(g(x)\) is unbounded in any neighborhood of 0, and \(f(x)g(x) = 1 \not\to 0\).
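
A small numerical illustration of both parts (purely illustrative, using the hypothetical bounded choice \(g(x) = \sin(1/x)\) for the first part):

```python
import math

# Bounded case: |sin(1/x)| <= 1, so x*sin(1/x) -> 0 as x -> 0.
for x in [1e-1, 1e-3, 1e-6]:
    print(x, x * math.sin(1 / x))   # magnitudes shrink with x

# Unbounded case: g(x) = 1/x, so f(x)*g(x) = 1 for every x != 0.
for x in [1e-1, 1e-3, 1e-6]:
    print(x, x * (1 / x))           # always 1, so the product does not tend to 0
```
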
  • Let \(\mathbf{w}_i\) be the proposed new basis elements; then \(\left\{{\mathbf{w}_i}\right\}\) will be a basis if it is linearly independent and spans \({\mathbf{R}}^3\). Since \(\dim {\mathbf{R}}^3 = 3\) and there are already three vectors in this set, we only need to check that they are linearly independent. By definition, a set of vectors \(\left\{{\mathbf{u}_i}\right\}\) is linearly independent iff \begin{align*} \sum c_i \mathbf{u}_i = \mathbf{0} \implies \forall i, ~ c_i = 0. \end{align*}

    Furthermore, since \(\left\{{\mathbf{v}_i}\right\}\) is known to be a basis, we have \begin{align*} \sum c_i \mathbf{v}_i = \mathbf{0} \implies \forall i, ~ c_i = 0. \end{align*}

    So suppose \(\sum c_i \mathbf{w}_i = \mathbf{0}\), we want to show that \(c_i = 0\) for each \(i\). (This will mean that \(\left\{{\mathbf{w}_i}\right\}\) is linearly independent.)

    We can expand this in terms of \(\mathbf{v}_i\) as follows: \begin{align*} c_1 \mathbf{w}_1 + c_2 \mathbf{w}_2 + c_3 \mathbf{w}_3 = \mathbf{0}\\ \implies c_1 (\mathbf{v}_1 + \mathbf{v}_2) + c_2(\mathbf{v}_2 - \mathbf{v}_3) + c_3(\mathbf{v}_2 + 2\mathbf{v}_3) = \mathbf{0}\\ \implies c_1 \mathbf{v}_1 + (c_1+c_2+c_3) \mathbf{v}_2 + (-c_2 + 2c_3) \mathbf{v}_3 = \mathbf{0} \end{align*}

    And using the fact that \(\left\{{\mathbf{v}_i}\right\}\) is linearly independent, each coefficient of \(\mathbf{v}_i\) here must be zero, and we arrive at the following system of equations: \begin{align*} \begin{array}{lll} c_1 && && &=& 0 \\ c_1 &+& c_2 &+& c_3 &=& 0 \\ && -c_2 &+& 2c_3 &=& 0 \\ \end{array} \end{align*}

    which can be rewritten as the matrix equation \begin{align*} A\mathbf{c} = \left[ \begin{array} { l l l } { 1 } & { 0 } & { 0 } \\ { 1 } & { 1 } & { 1 } \\ { 0 } & { - 1 } & { 2 } \end{array} \right] \left[\begin{array}{l}c_1 \\ c_2 \\ c_3 \end{array}\right] = \mathbf{0} \end{align*}

    and thus \(\mathbf{w}_i\) will be linearly independent precisely if \(A\mathbf{c} = \mathbf{0}\) has only the trivial solution \(\mathbf{c} = \mathbf{0}\), which is precisely when \(A\) has full rank, which happens iff \(\operatorname{det}A \neq 0\). A quick calculation shows that \(\operatorname{det}A = 3 \neq 0\), and so we are done. \(\hfill\blacksquare\)
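
As a quick check of the determinant and rank claims, one can compute with the coordinate matrix \(A\) above (a sketch assuming numpy is available):

```python
import numpy as np

# Columns of A are the coordinates of w_1, w_2, w_3 in the basis {v_1, v_2, v_3}:
# w_1 = v_1 + v_2, w_2 = v_2 - v_3, w_3 = v_2 + 2 v_3.
A = np.array([[1, 0, 0],
              [1, 1, 1],
              [0, -1, 2]], dtype=float)

print(np.linalg.det(A))          # 3.0, nonzero
print(np.linalg.matrix_rank(A))  # 3, so A c = 0 has only the trivial solution
```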

  • We first note that we can rewrite the equation of the boundary curve to obtain something more familiar: \(x^2 + y^2 = 2x \implies (x-1)^2 + y^2 = 1\), a circle of radius 1 centered at \((1,0)\). Integrating over the enclosed region will be easy compared to the line integral, so we apply Green’s theorem: \begin{align*} \int_C xe^x ~dx + (ye^y +x^2) ~dy = \iint_D \left( \frac{\partial}{\partial x}\left(ye^y + x^2\right) - \frac{\partial}{\partial y}\left(xe^x\right) \right) dA = \iint_D 2x ~dA. \end{align*}

    We can parameterize this region in shifted polar coordinates as \begin{align*} D = \left\{{(x,y) \in {\mathbf{R}}^2 {~\mathrel{\Big\vert}~}x^2+y^2-2x \leq 0,~ y \geq 0}\right\} = \left\{{(1 + r\cos\theta,~ r\sin\theta) {~\mathrel{\Big\vert}~}\theta \in [0, \pi],~ r\in [0, 1]}\right\}. \end{align*}

    Noting that \(dA = r~dr~d\theta\) in these shifted coordinates, we can then integrate \begin{align*} \iint_D 2x ~dA = \int_0^{\pi} \int_0^1 2(1 + r\cos\theta)\, r ~dr ~d\theta \\ = \int_0^{\pi} \int_0^1 \left( 2r + 2r^2\cos \theta \right) ~dr ~d\theta \\ = \int_0^{\pi} \left( r^2 + \frac 2 3 r^3 \cos \theta \right) \bigg\rvert_0^1 ~d\theta\\ = \int_0^{\pi} \left( 1 + \frac 2 3 \cos \theta \right) ~d\theta\\ = \left( \theta + \frac 2 3 \sin \theta \right) \bigg\rvert_0^\pi \\ = (\pi + 0) - (0 + 0) = \pi. \hfill\blacksquare \end{align*}
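
As a sanity check on this value, the same double integral can be evaluated symbolically in the shifted polar coordinates used above (a minimal sketch assuming sympy):

```python
from sympy import symbols, integrate, cos, pi, simplify

r, theta = symbols('r theta', nonnegative=True)

# Shifted polar coordinates: x = 1 + r*cos(theta), y = r*sin(theta), dA = r dr dtheta,
# over the upper half of the unit disk centered at (1, 0).
x = 1 + r*cos(theta)
integrand = 2*x*r

result = integrate(integrand, (r, 0, 1), (theta, 0, pi))
print(result)                      # pi
print(simplify(result - pi) == 0)  # True
```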

  • Parts

    • If \(A\) has two distinct eigenvalues, we will have \(A = PDP^{-1}\) where \(P\) is the matrix of eigenvectors and \(D\) has eigenvalues on the diagonal. We can compute the characteristic polynomial \begin{align*} p_\chi(x) = x^2 - (\operatorname{Tr}A)x + \operatorname{det}A = x^2 - 7x + 6 = (x-6)(x-1), \end{align*}

      and so \(\operatorname{Spec}(A) = \left\{{6,1}\right\}\). Computing the kernel of \(A-\lambda I\) for each of these yields \begin{align*} \mathbf{v}_1 = \left[ \begin{array} { c } { 1 } \\ { - 2 } \end{array} \right], \mathbf{v}_2 = \left[ \begin{array} { c } { 2 } \\ { 1 } \end{array} \right], \end{align*}

      and so we can write and check \begin{align*} P = \left[ \begin{array} { c c } { 1 } & { 2 } \\ { - 2 } & { 1 } \end{array} \right], \qquad D = \left[ \begin{array} { c c } { 6 } & { 0 } \\ { 0 } & { 1 } \end{array} \right]. \end{align*}

      We can compute \(PP^T = \mathrm{diag}(5,5)\), so \(P\) can be made orthogonal by replacing \(P\) with \((1/\sqrt 5) P\). With this replacement, a quick computation shows that \(PDP^T = A\).

    • We will use the fact that \(A = PDP^{-1} = PDP^T\), since \(A\) is symmetric and \(P\) can be chosen orthogonal. We can write \begin{align*} {\left\langle {A\mathbf{x}},~{\mathbf{x}} \right\rangle} = \mathbf{x}^T A^T \mathbf{x} = \mathbf{x}^T (PDP^T)^T \mathbf{x} = (P^T \mathbf{x})^T D (P^T\mathbf{x}) = \mathbf{y}^T D \mathbf{y} = {\left\langle {\mathbf{y}},~{ D\mathbf{y}} \right\rangle}, \end{align*} where \(\mathbf{x}\in S^1 \implies P^T\mathbf{x} \coloneqq\mathbf{y} \in S^1\) since \(P^T\) is orthogonal, hence norm-preserving (and thus a bijection \(S^1 {\circlearrowleft}\)).

      We can now expand \begin{align*} {\left\langle {\mathbf{y}},~{D \mathbf{y}} \right\rangle} = \sum_{i=1}^2 y_i \lambda_i y_i = \sum_{i=1}^2 \lambda_i y_i^2 \end{align*}

      We now note that we can take \(\mathbf{y} = {\left[ {0, 1} \right]}\), in which case \(D\mathbf{y} = {\left[ {0, \lambda_2} \right]} = {\left[ {0, 1} \right]}\) and thus \({\left\langle {\mathbf{y}},~{D \mathbf{y}} \right\rangle} = 1\) is a candidate minimum.

      We can write this as the constrained optimization problem \begin{align*} \text{Minimize } f(y_1, y_2) = 6y_1^2 + 1 y_2^2\\ \text{subject to } g(y_1, y_2) = y_1^2 + y_2^2 = 1 \end{align*}

      where we note that this constraint is equivalent to the original \({\left\lVert {\mathbf{y}} \right\rVert} = \sqrt{y_1^2 + y_2^2} = 1\).

      This can be approached with Lagrange multipliers, i.e. looking at where \(\nabla f = \lambda \nabla g\). This yields \begin{align*} {\left[ {12y_1, 2y_2} \right]} = \lambda {\left[ {2y_1, 2y_2} \right]} \implies \\ 6y_1 = \lambda y_1, ~ y_2 = \lambda y_2. \end{align*}

      The second condition forces either \(y_2 = 0\) or \(\lambda = 1\). If \(y_2 = 0\), the constraint forces \(y_1 = \pm 1\), giving \(f = 6\). If \(\lambda = 1\), the first condition forces \(y_1 = 0\), so the constraint gives \(y_2 = \pm 1\) and \(f = 1\). Comparing the critical values, the minimum value is 1, attained at \(\mathbf{y} = {\left[ {0, \pm 1} \right]}\), which includes the candidate from above. \(\hfill\blacksquare\)
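
Both parts can be sanity-checked numerically; the sketch below (assuming numpy) reconstructs \(A\) from the \(P\) and \(D\) found above rather than from the problem statement, verifies the decomposition, and samples \({\left\langle {A\mathbf{x}},~{\mathbf{x}} \right\rangle}\) over the unit circle:

```python
import numpy as np

# Reconstruct A from the orthogonal P and diagonal D found above.
P = (1 / np.sqrt(5)) * np.array([[1, 2], [-2, 1]], dtype=float)
D = np.diag([6.0, 1.0])
A = P @ D @ P.T

print(np.allclose(P @ P.T, np.eye(2)))   # True: P is orthogonal
print(np.trace(A), np.linalg.det(A))     # 7.0 and 6.0, matching x^2 - 7x + 6
print(np.linalg.eigvalsh(A))             # [1., 6.]

# Part (b): minimize <Ax, x> over the unit circle by sampling directions.
t = np.linspace(0, 2*np.pi, 100000)
xs = np.stack([np.cos(t), np.sin(t)])    # unit vectors
vals = np.einsum('ij,ij->j', xs, A @ xs) # <A x, x> for each sample
print(vals.min())                        # ~1.0, the smallest eigenvalue
```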

  • We need to show that \(R\) is reflexive, transitive, and symmetric.

    • Reflexive: this would say that \(x\sim x \iff x^2-4x = x^2-4x\), which is true.
    • Transitive: suppose \(x\sim y\) and \(y\sim z\); we want to show \(x\sim z\). But we have \begin{align*} x^2 - 4x = y^2-4y ~\&~ y^2-4y = z^2-4z \implies x^2-4x = y^2-4y = z^2-4z, \end{align*} and in particular \(x^2-4x = z^2-4z\), so \(x\sim z\).
    • Symmetric: we want to show \(x\sim y \implies y \sim x\), which follows because \(x^2-4x = y^2-4y \iff y^2-4y = x^2-4x\).
    • The equivalence classes: \begin{align*} x^2-4x &= 0: &\left\{{0, 4}\right\}\\ x^2-4x &= -3: &\left\{{1,3}\right\} \\ x^2-4x &= -4: &\left\{{2}\right\} \\ x^2-4x &= 5: &\left\{{5}\right\} \end{align*}
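
These classes can be recovered by grouping elements according to the value of \(x^2 - 4x\); a short sketch, assuming the underlying set is \(\{0, 1, \dots, 5\}\) as the listed classes suggest:

```python
from collections import defaultdict

# Group x in {0,...,5} by the invariant x^2 - 4x; each group is an equivalence class.
classes = defaultdict(set)
for x in range(6):
    classes[x**2 - 4*x].add(x)

for value, cls in sorted(classes.items()):
    print(value, cls)
# -4 {2}, -3 {1, 3}, 0 {0, 4}, 5 {5}
```
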
  • A function \(f: {\mathbf{R}}^2 \to {\mathbf{R}}\) is totally differentiable at \(\mathbf{x}\) if there exists a linear map \(D: {\mathbf{R}}^2 \to {\mathbf{R}}\) such that \begin{align*}f(\mathbf{x} + \mathbf{h})- f(\mathbf{x}) = D(\mathbf{x})\mathbf{h} + \varepsilon(\mathbf{h}) \text{ where } \frac{\varepsilon(\mathbf{h})}{{\left\lVert {\mathbf{h}} \right\rVert}} \to 0. \end{align*}

    Equivalently, it is when the following limit exists and satisfies \begin{align*} \lim_{\mathbf{h} \to \mathbf{0}} {\left\lvert {\frac{f(\mathbf{x} + \mathbf{h}) - f(\mathbf{x}) - D(\mathbf{x})\mathbf{h}}{{\left\lVert {\mathbf{h}} \right\rVert}}} \right\rvert} = 0, \end{align*}

    in which case we write \(f' = D\), or in this special case, \(f' = \nabla f\).

    Substituting in \(\mathbf{x} = \mathbf{0}\), being differentiable at zero requires that \(\exists D\) such that \begin{align*} f(\mathbf{h}) - f(\mathbf{0}) = D(\mathbf{0})\mathbf{h} + \varepsilon(\mathbf{h}), ~\frac{\varepsilon(\mathbf{h})}{{\left\lVert {\mathbf{h}} \right\rVert}} \to 0, \\ \text{ or } \\ \lim_{\mathbf{h} \to \mathbf{0}} {\left\lvert {\frac{f(\mathbf{h}) - f(\mathbf{0}) - D(\mathbf{0})\mathbf{h}}{{\left\lVert {\mathbf{h}} \right\rVert}}} \right\rvert} = 0. \end{align*}

    We’ll make use of the mean value theorem. The general statement is \begin{align*} f \text{ continuous on } [a,b] \text{ and differentiable on } (a,b) \implies \exists A \in (a,b) {~\mathrel{\Big\vert}~}\\ f(b) - f(a) = f'(A)(b-a) \end{align*} and here we’ll apply it in the first variable: \begin{align*} f_x \text{ exists and is continuous near } 0 \implies \exists A \in (0, t) {~\mathrel{\Big\vert}~}\\ f(t, y) - f(0, y) = f_x(A, y)(t - 0) = tf_x(A, y) \\ \implies \color{purple} f(t, y) - f(0, y) = tf_x(A, y) \end{align*} where moreover \(A \to 0\) as \(t \to 0\), so \begin{align*} \lim_{t\to 0} f_x(A, y) = f_x\left(\lim_{t\to 0} A,~ y\right) = f_x(0, y) \end{align*} since \(f_x\) was assumed to be continuous.

    We’ll also use the “best linear approximation” definition of differentiability. In general, it states \begin{align*} f' \text{ exists at } a \\ \implies f(x) = f(a) + f'(a)(x-a) + o(x-a),~ \frac{o(x-a)}{x-a} \to 0 \\ \implies f(x) - f(a) = f'(a)(x-a) + o(x-a),~ \frac{o(x-a)}{x-a} \to 0 \\ \implies f(b) - f(a) = f'(a)(b-a) + o(b-a),~ \frac{o(b-a)}{b-a} \to 0 \end{align*} and here we’ll take \(a=0, b=t\) in the second variable, using the variant \begin{align*} f_y \text{ exists at } (x, 0) \\ \implies f(x, t) - f(x, 0) = f_y(x, 0)(t - 0) + o(t),~ \frac{o(t)}{t} \to 0 \\ \implies \color{green}f(x, t) - f(x, 0) = tf_y(x,0) + o(t), ~\frac{o(t)}{t} \to 0, \end{align*} which we will only need at \(x = 0\).

    We can then write \begin{align*} f(h_1, h_2) - f(0, 0) &= {\color{purple}f(h_1, h_2) - f(0, h_2)} + {\color{green}f(0, h_2) - f(0, 0)} \\ &= {\color{purple}h_1f_x(A, h_2)} + {\color{green}h_2f_y(0, 0) + o(h_2)} \\ &= (h_1 f_x(0, 0) - {\color{red}h_1 f_x(0, 0)}) + {\color{purple}h_1f_x(A, h_2)} + {\color{green}h_2f_y(0, 0) + o(h_2)} \\ &= h_1 f_x(0, 0) + {\color{green}h_2 f_y(0, 0)} + h_1({\color{purple}f_x(A,h_2)} - {\color{red}f_x(0,0)}) + {\color{green}o(h_2)} \end{align*}

    Now, since \(f_x\) is continuous and \((A, h_2) \to (0, 0)\) as \(\mathbf{h} \to \mathbf{0}\), we have \({\color{purple}f_x(A,h_2)} - {\color{red}f_x(0,0)} \to 0\) and so \(h_1(f_x(A,h_2) - f_x(0,0)) = o(h_1)\). Since \({\left\lvert {h_1} \right\rvert}, {\left\lvert {h_2} \right\rvert} \leq {\left\lVert {\mathbf{h}} \right\rVert}\), both \(o(h_1)\) and \(o(h_2)\) are \(o({\left\lVert {\mathbf{h}} \right\rVert})\), and we thus have \begin{align*} f(h_1, h_2) - f(0, 0) &= h_1 f_x(0, 0) + h_2 f_y(0, 0) + o(h_1) + o(h_2) \\ &= h_1 f_x(0, 0) + h_2 f_y(0, 0) + o({\left\lVert {\mathbf{h}} \right\rVert}) \\ &\implies f(h_1, h_2) - f(0, 0) - h_1 f_x(0, 0) - h_2 f_y(0, 0) = o({\left\lVert {\mathbf{h}} \right\rVert}) \end{align*}

    and so if we define \begin{align*}D(\mathbf{h}) = h_1 f_x(0, 0) + h_2 f_y(0, 0), \end{align*}

    we can take absolute values to find that \begin{align*} {\left\lvert {f(\mathbf{h}) - f(\mathbf{0}) - D(\mathbf{h})} \right\rvert} = o({\left\lVert {\mathbf{h}} \right\rVert}) \\ \text{ and so } \\ \lim_{\mathbf{h} \to \mathbf{0}} {\left\lvert {\frac{f(\mathbf{h}) - f(\mathbf{0}) - D(\mathbf{h})}{{\left\lVert {\mathbf{h}} \right\rVert}}} \right\rvert} = 0 \end{align*}

    which forces \(f\) to be totally differentiable at \(\mathbf{0}\) with derivative \(D\).
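
To illustrate the conclusion, here is a small numerical check of the limit defining total differentiability at \(\mathbf{0}\), using the hypothetical sample function \(f(x,y) = \sin(x)e^y + xy\), whose partials are continuous (purely an illustration, not part of the proof):

```python
import numpy as np

# Hypothetical sample function with continuous partial derivatives.
f  = lambda x, y: np.sin(x) * np.exp(y) + x * y
fx = lambda x, y: np.cos(x) * np.exp(y) + y      # partial in x
fy = lambda x, y: np.sin(x) * np.exp(y) + x      # partial in y

# Candidate derivative at the origin: D(h) = h1*fx(0,0) + h2*fy(0,0).
D = lambda h1, h2: h1 * fx(0, 0) + h2 * fy(0, 0)

rng = np.random.default_rng(0)
for scale in [1e-1, 1e-3, 1e-5]:
    h1, h2 = scale * rng.standard_normal(2)
    err = abs(f(h1, h2) - f(0, 0) - D(h1, h2)) / np.hypot(h1, h2)
    print(scale, err)   # the error quotient shrinks as ||h|| -> 0
```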