
---
title: "Stats 102B - Homework 1"
author: "Put Your Name Here"
date: "2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Modify this file with your answers and responses.

### Reading:

a. A First Course in Machine Learning [FCML]: Chapter 1

## Part 1: Weighted Least Squares Regression

## Task 1A

Solve exercise 1.11 from page 38 of the textbook.

Hint: define $\mathbf{A}$ to be the diagonal matrix of the weights $\alpha_1, \ldots, \alpha_N$ as follows:

$$
\mathbf{A} =
\begin{bmatrix}
\alpha_1 & 0 & \ldots & 0 \\
0 & \alpha_2 & \ldots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \ldots & \alpha_N \\
\end{bmatrix}
$$

Hint: With this matrix defined, the loss function can be written as:

$$\mathcal{L} = \frac{1}{N} (\mathbf{t} - \mathbf{Xw})^T\mathbf{A}(\mathbf{t} - \mathbf{Xw})$$

Suggestion: Do your work with pencil and paper. Write the dimensions underneath each matrix to make sure you have the order of multiplication correct.

As you work to find your solution, typeset the gradient of the loss w.r.t. $\mathbf{w}$

$$\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = $$

Finally, typeset your solution for $\mathbf{\hat{w}}$ here:

$$\mathbf{\hat{w}} = $$

If you're new to LaTeX and typesetting, you can visit: https://en.wikibooks.org/wiki/LaTeX/Mathematics for examples to follow.

## Task 1B

Let's see the effect of altering the values of our weights $\alpha$.

We'll use the following toy data, where x is the values 1 through 4, and t is the values 1, 3, 2, 4.
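The matrix form of the loss from the Task 1A hint can be checked numerically on the toy data before any estimates are derived. The chunk below is only a sketch: the names ending in `_demo` and the candidate vector `w_try` are illustrative choices, not part of the assignment, and `w_try` is an arbitrary guess rather than the least squares solution. It evaluates the loss once with matrix operations and once as a weighted sum of squared errors, and confirms that the two agree.

```{r}
# Toy data, stored under separate names so nothing else in the file is overwritten
x_demo <- c(1, 2, 3, 4)
t_demo <- matrix(c(1, 3, 2, 4), ncol = 1)

# Design matrix: a column of ones (intercept) and the x values
X_demo <- cbind(1, x_demo)

# Illustrative weights placed on the diagonal of A
alpha_demo <- c(0.1, 5, 5, 0.1)
A_demo <- diag(alpha_demo)

# Arbitrary candidate parameters (intercept 0, slope 1); NOT the fitted solution
w_try <- matrix(c(0, 1), ncol = 1)

# Residual vector t - Xw
r_demo <- t_demo - X_demo %*% w_try

# Loss in matrix form: (1/N) (t - Xw)^T A (t - Xw)
loss_matrix <- as.numeric(t(r_demo) %*% A_demo %*% r_demo) / nrow(X_demo)

# Same loss as a weighted sum of squared errors
loss_sum <- sum(alpha_demo * r_demo^2) / nrow(X_demo)

all.equal(loss_matrix, loss_sum)
```

Because $\mathbf{A}$ is diagonal, the quadratic form reduces to $\frac{1}{N}\sum_n \alpha_n (t_n - \mathbf{w}^T\mathbf{x}_n)^2$, which is why the two computations match.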

```{r}
x <- c(1,2,3,4)
t <- c(1,3,2,4)
```

- We begin by setting our vector of weights all equal. I have already fit a model using `lm()` with the argument `weights`. Go ahead and use the matrix operations from your solution above to find and print the parameter estimates. They should equal the parameter estimates found via `lm()`.

```{r}
a1 <- c(1,1,1,1)
model1 <- lm(t ~ x, weights = a1)
plot(x, t, xlim = c(0,5), ylim = c(0,5), asp = 1)
abline(model1)
print(model1$coefficients)

## your code to make the matrices and find the parameter estimates
```

- Now we alter the weights. This vector puts large weight on the two inner points (x = 2, x = 3), and small weight on the outer points (x = 1, x = 4). Again, use the matrix operations to find and print the parameter estimates using the provided weights. Compare them against the estimates found via `lm()`. I have plotted the fitted line; comment on the effect of the weights.

```{r}
a2 <- c(0.1, 5, 5, 0.1)
model2 <- lm(t ~ x, weights = a2)
plot(x, t, xlim = c(0,5), ylim = c(0,5), asp = 1)
abline(model2)
print(model2$coefficients)

## your code
```

- We alter the weights again. This time large weights are on the two outer points (x = 1, x = 4), and small weights on the inner points (x = 2, x = 3).
Again, use the matrix operations to find and print the parameter estimates using the provided weights. Compare them against the estimates found via `lm()`. Look at the fitted line and comment on the effect of the weights.

```{r}
a3 <- c(5, 0.1, 0.1, 5)
model3 <- lm(t ~ x, weights = a3)
plot(x, t, xlim = c(0,5), ylim = c(0,5), asp = 1)
abline(model3)
print(model3$coefficients)
```

- Try to explain weighted least squares regression. What effect would putting a very large weight, say 1000, on a point have on the regression line? What effect would putting a weight of 0 on a point have on the regression line?

## Part 2: OLS Matrix Notation

## Task 2

Review Lecture 1-3.

We'll take a look at the Chirot dataset, which covers the 1907 Romanian Peasant Revolt. General info on the event: https://en.wikipedia.org/wiki/1907_Romanian_peasants%27_revolt

The data covers 32 counties in Romania, and the response variable is the intensity of the rebellion.

The original paper by Daniel Chirot, which provides details of the analysis and variables: https://www.jstor.org/stable/2094430

A basic data dictionary can be found with `help(Chirot)`.

```{r}
library(carData)
data(Chirot)
chirot_mat <- as.matrix(Chirot)
```

We'll do an analysis with matrix operations. We start by extracting the commerce column and creating our $\mathbf{X}$ matrix.

```{r}
t <- chirot_mat[, 1, drop = FALSE] # response, keep as a matrix
X <- chirot_mat[, 2, drop = FALSE] # commerce column, keep as matrix
X <- cbind(1, X)
colnames(X) <- c('1','commerce')
head(X)
```

- Use `lm()` to fit the rebellion intensity to matrix X (which has columns for a constant and commercialization). Make sure you only calculate the coefficient for the intercept once.

```{r}
# ... your code ...
```

- Using only matrix operations, calculate and show the coefficient estimates. Verify that they match the estimates from `lm()`.

```{r}
# ... your code ...
```

- Create another matrix (call it X_ct) with three columns: a constant, variable commerce, variable tradition. Print the head of this matrix.

```{r}
# ... your code ...
```

- Using matrix operations, calculate and show the coefficient estimates of the model with the variables commerce and tradition.

```{r}
# ... your code ...
```

- Create another matrix (call it X_all) with all of the x variables (plus a constant). Print the head of this matrix. Using matrix operations, calculate and show the coefficient estimates of the model with all the variables.

```{r}
# ... your code ...
```

- Using matrix operations, calculate the fitted values for all three models. (No need to print out the fitted values.) Create plots of fitted value vs actual value for each model (there will be three plots). Be sure to provide a title for each plot.

```{r}
# ... your code ...
```

- Now that you have calculated the columns of fitted values, find the Residual Sum of Squares and the R-squared values of the three models. Which model has the smallest RSS? (RSS is the sum of (actual - fitted)^2. R-sq is the correlation between actual and fitted, squared.)

```{r}
# ... your code ...
```

## Part 3: Cross-Validation

We can use cross-validation to evaluate the predictive performance of several competing models.

I will have you manually implement leave-one-out cross-validation from scratch first, and then use the built-in function in R.

We will use the dataset `ironslag` from the package `DAAG` (a companion library for the textbook Data Analysis and Graphics in R).

The description of the data is as follows: The iron content of crushed blast-furnace slag can be determined by a chemical test at a laboratory or estimated by a cheaper, quicker magnetic test. These data were collected to investigate the extent to which the results of a chemical test of iron content can be predicted from a magnetic test of iron content, and the nature of the relationship between these quantities. [Hand, D.J., Daly, F., et al. (1993) __A Handbook of Small Data Sets__]

The `ironslag` data has 53 observations, each with two values - the measurement using the chemical test and the measurement from the magnetic test.

We can start by fitting a linear regression model $y_n = w_0 + w_1 x_n + \epsilon_n$. A quick look at the scatterplot seems to indicate that the data may not be linear.

```{r linear_model}
# install.packages("DAAG") # if necessary
library(DAAG)
x <- seq(10,40, .1) # a sequence used to plot lines

l1 <- lm(magnetic ~ chemical, data = ironslag)
plot(ironslag$chemical, ironslag$magnetic, main = "Linear fit", pch = 16)
yhat1 <- l1$coef[1] + l1$coef[2] * x
lines(x, yhat1, lwd = 2, col = "blue")
```

In addition to the linear model, fit the following models that predict the magnetic measurement (y) from the chemical measurement (x).

Quadratic: $y_n = w_0 + w_1 x_n + w_2 x_n^2 + \epsilon_n$

Exponential: $\log(y_n) = w_0 + w_1 x_n + \epsilon_n$, equivalent to $y_n = \exp(w_0 + w_1 x_n + \epsilon_n)$

Log-Log: $\log(y_n) = w_0 + w_1 \log(x_n) + \epsilon_n$

## Task 3A

```{r other_models}
# I've started each of these for you.
# Your job is to create the plots with fitted lines.

l2 <- lm(magnetic ~ chemical + I(chemical^2), data = ironslag)

l3 <- lm(log(magnetic) ~ chemical, data = ironslag)
# when plotting the fitted line for this one, create estimates of log(y-hat) linearly
# then exponentiate log(y-hat)

l4 <- lm(log(magnetic) ~ log(chemical), data = ironslag)
# for this one, use plot(log(chemical), log(magnetic))
# the y-axis is now on the log-scale, so you can create and plot log(y-hat) directly
# just remember that you'll use log(x) rather than x directly
```

## Task 3B: Leave-one-out Cross validation

You will now code leave-one-out cross validation. In LOOCV, we remove one data point from our data set. We fit the model to the remaining 52 data points. With this model, we make a prediction for the left-out point. We then compare that prediction to the actual value to calculate the squared error. Once we find the squared error for all 53 points, we can take the mean to get a cross-validation error estimate of that model.

To test out our four models, we will build a loop that will remove one point of data at a time. Thus, we will make a `for(i in 1:53)` loop. For each iteration of the loop, we will fit the four models on the remaining 52 data points, and make a prediction for the remaining point.

```{r loocv}
# create vectors to store the validation errors for each model
# error_model1 <- rep(NA, 53)
# ...
error_model1 <- rep(NA, 53)

for(i in 1:53){
  # write a line to select the ith line in the data
  # store this line as the 'validation' case
  # store the remaining as the 'training' data

  # fit the four models and calculate the prediction error for each one
  # hint: it will be in the form
  # model1 <- lm(magnetic ~ chemical, data = training)
  # fitted_value <- predict(model1, test_case)
  # error_model1[i] <- (validation_actual_value - fitted_value)^2
  # ...
  # model2 <-
  # ...
  # ...

  # for models where you are predicting log(magnetic), you'll want to
  # exponentiate the fitted value before you compare it to the validation case
  # error[i] <- (validation_actual_value - exp(fitted_value))^2
}

# once all of the errors have been calculated, find the mean squared error
# ...
# mean(error_model1)
mean(error_model1)
```

Compare the sizes of the cross validation error
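
The leave-one-out procedure described in Task 3B can be sketched on a small made-up data set. This is a minimal illustration rather than the assignment's solution: the data frame `toy` and every name ending in `_demo` are hypothetical, and only a straight-line model is fitted. The last lines cross-check the loop against a standard identity for ordinary least squares, where the leave-one-out residual equals the ordinary residual divided by one minus the leverage; that check goes beyond what the assignment asks for.

```{r}
# Hypothetical toy data (NOT the ironslag data used in the assignment)
toy <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c(1.2, 1.9, 3.2, 3.8, 5.1, 6.3))
n_demo <- nrow(toy)

# One squared prediction error per left-out observation
sq_err_demo <- rep(NA_real_, n_demo)

for (i in seq_len(n_demo)) {
  training_demo   <- toy[-i, ]                 # all rows except the ith
  validation_demo <- toy[i, , drop = FALSE]    # the single left-out row

  fit_demo  <- lm(y ~ x, data = training_demo)
  pred_demo <- predict(fit_demo, newdata = validation_demo)

  sq_err_demo[i] <- (validation_demo$y - pred_demo)^2
}

# Cross-validation estimate of the mean squared prediction error
loocv_mse_demo <- mean(sq_err_demo)

# Cross-check for OLS: leave-one-out residual = residual / (1 - leverage)
full_demo <- lm(y ~ x, data = toy)
loocv_shortcut_demo <- mean((residuals(full_demo) / (1 - hatvalues(full_demo)))^2)

all.equal(loocv_mse_demo, loocv_shortcut_demo)
```

For a model fitted to a transformed response such as `log(y)`, the prediction would need to be exponentiated before computing the squared error, as the comments in the Task 3B chunk point out.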

102b-hw1-np1rpzv3.rmd

Aug 14, 2021
