Suppose that a curve $\hat{g}$ is computed to smoothly fit a set of $n$ points using the following formula: \begin{align} \hat{g} = \argmin_{g} \bigg( \sum_{i=1}^{n} (y_{i} - g(x_{i}))^{2} + \lambda \int \big[ g^{(m)}(x) \big]^{2} dx \bigg) \end{align} where $g^{(m)}$ represents the $m$th derivative of $g$ (and $g^{(0)} = g$). Provide the functional form of $\hat{g}$ in each of the following scenarios; a small numerical sketch of the limiting cases follows the list.
$\lambda = \infty, m = 0$
$\lambda = \infty, m = 1$
$\lambda = \infty, m = 2$
$\lambda = \infty, m = 3$
$\lambda = 0, m = 3$
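For intuition, here is a minimal numerical sketch of the two extremes of the penalty, assuming NumPy and SciPy are available and using made-up data: with $\lambda = 0$ the roughness term vanishes, so an interpolating curve attains zero training RSS, while for very large $\lambda$ with $m = 2$ the penalty drives the fit toward a function with zero second derivative, i.e. a least-squares straight line. This is only an illustration, not a derivation of the answers.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical data, purely for illustration.
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 10, 15))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

# lambda = 0: no roughness penalty, so an interpolating curve
# (here a natural cubic spline through every point) attains RSS = 0.
interp = CubicSpline(x, y, bc_type="natural")
rss_interp = np.sum((y - interp(x)) ** 2)

# lambda -> infinity with m = 2: the penalty forces g'' = 0 everywhere,
# so the fit collapses to the least-squares straight line.
slope, intercept = np.polyfit(x, y, deg=1)
rss_line = np.sum((y - (slope * x + intercept)) ** 2)

print(f"RSS of interpolating spline (lambda = 0): {rss_interp:.4f}")
print(f"RSS of least-squares line (lambda -> inf, m = 2): {rss_line:.4f}")
```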
Suppose we fit a curve with basis functions $b_{1}(X) = I(0 \leq X \leq 2) - (X-1)I(1 \leq X \leq 2)$ and $b_{2}(X) = (X - 3)I(3 \leq X \leq 4) + I(4 \leq X \leq 5)$. We fit the regression model \begin{align} Y = \beta_{0} + \beta_{1}b_{1}(X) + \beta_{2}b_{2}(X) + \epsilon \end{align} and obtain the coefficient estimates $\hat{\beta}_{0} = 1$, $\hat{\beta}_{1} = 1$, and $\hat{\beta}_{2} = 3$. Plot the estimated curve between $X = -2$ and $X = 2$.
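A minimal plotting sketch, assuming NumPy and Matplotlib are available: it evaluates the two basis functions on a grid over $[-2, 2]$ and plots $\hat{\beta}_{0} + \hat{\beta}_{1} b_{1}(X) + \hat{\beta}_{2} b_{2}(X)$ with the estimates given above.

```python
import numpy as np
import matplotlib.pyplot as plt

def b1(x):
    # I(0 <= X <= 2) - (X - 1) * I(1 <= X <= 2)
    return ((0 <= x) & (x <= 2)).astype(float) - (x - 1) * ((1 <= x) & (x <= 2))

def b2(x):
    # (X - 3) * I(3 <= X <= 4) + I(4 <= X <= 5)
    return (x - 3) * ((3 <= x) & (x <= 4)) + ((4 <= x) & (x <= 5)).astype(float)

beta0, beta1, beta2 = 1, 1, 3      # coefficient estimates from the exercise
x = np.linspace(-2, 2, 401)        # grid over the requested range
y_hat = beta0 + beta1 * b1(x) + beta2 * b2(x)

plt.plot(x, y_hat)
plt.xlabel("X")
plt.ylabel(r"$\hat{Y}$")
plt.title("Estimated curve on [-2, 2]")
plt.show()
```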
Consider the two curves $\hat{g}_{1}$ and $\hat{g}_{2}$ defined by \begin{align} \hat{g}_{1} &= \argmin_{g} \bigg( \sum_{i=1}^{n}(y_{i} - g(x_{i}))^{2} + \lambda \int \big[ g^{(3)}(x) \big]^{2} dx \bigg), \\ \hat{g}_{2} &= \argmin_{g} \bigg( \sum_{i=1}^{n}(y_{i} - g(x_{i}))^{2} + \lambda \int \big[ g^{(4)}(x) \big]^{2} dx \bigg) \end{align} where $g^{(m)}$ represents the $m$th derivative of $g$.
As $\lambda \to \infty$, will $\hat{g}_{1}$ or $\hat{g}_{2}$ have the smaller training RSS?
As $\lambda \to \infty$, will $\hat{g}_{1}$ or $\hat{g}_{2}$ have the smaller test RSS?
For $\lambda = 0$, will $\hat{g}_{1}$ or $\hat{g}_{2}$ have the smaller training and test RSS?
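As a rough numerical illustration (not an answer key): as $\lambda \to \infty$ the penalty forces the $m$th derivative toward zero, so $\hat{g}_{1}$ tends to a polynomial of degree at most 2 and $\hat{g}_{2}$ to a polynomial of degree at most 3. The sketch below, on made-up data and assuming NumPy, simply compares the training RSS of those two limiting least-squares polynomial fits.

```python
import numpy as np

# Hypothetical training data, purely for illustration.
rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 10, 30))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)

def poly_rss(x, y, degree):
    """Training RSS of the least-squares polynomial of the given degree."""
    coefs = np.polyfit(x, y, deg=degree)
    resid = y - np.polyval(coefs, x)
    return np.sum(resid ** 2)

# lambda -> infinity limits: g1 -> quadratic (penalty on the 3rd derivative),
# g2 -> cubic (penalty on the 4th derivative).
print("g1 limit (degree-2 fit), training RSS:", poly_rss(x, y, 2))
print("g2 limit (degree-3 fit), training RSS:", poly_rss(x, y, 3))
```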