RStudio programming

Question

University of California, Los Angeles
Department of Statistics

Statistics 12
Instructor: Nicolas Christou

Data analysis with R - Some simple commands

When you are in R, the command line begins with >

To read data from a website, read the file into a data frame a (the read command itself is not preserved in this copy), then display the first rows with head(a). Here is the output:

> head(a)
      x1    y x3     x4    x5   x6   x7   x8    x9  x10  x11  x12  x13  x14  x15
1 1.0853  6.1 22 173.25 72.25 38.5 93.6 83.0  98.7 58.7 37.3 23.4 30.5 28.9 18.2
2 1.0414 25.3 22 154.00 66.25 34.0 95.8 87.9  99.2 59.6 38.9 24.0 28.8 25.2 16.6
3 1.0754 10.3 23 188.15 77.50 38.0 96.6 85.3 102.5 59.1 37.6 23.2 31.8 29.7 18.3
4 1.0722 11.7 23 198.25 73.50 42.1 99.6 88.6 104.1 63.1 41.7 25.0 35.6 30.0 19.2
5 1.0708 12.3 23 154.25 67.75 36.2 93.1 85.2  94.5 59.0 37.3 21.9 32.0 27.4 17.1
6 1.0775  9.4 23 159.75 72.25 35.5 92.1 77.1  93.9 56.1 36.1 22.7 30.5 27.2 18.2

Useful commands:

• Extracting one variable from the data frame (e.g. the second variable):
  > a[,2]
• Another way to extract a variable:
  > a$y
• Similarly, to access a particular row in the data (e.g. the first row):
  > a[1,]
• To list all the data simply type:
  > a
• To compute the mean of all the variables in the data set:
  > colMeans(a)
• To compute the mean of just one variable:
  > mean(a$y)
• To compute the mean of variables 2 and 3:
  > colMeans(a[,c(2,3)])
• To compute the variance of one variable:
  > var(a$y)
• To compute the variance-covariance matrix of all the variables:
  > cov(a)
• To compute the variance-covariance matrix of all the variables except the first variable:
  > cov(a[,-1])
• To compute the variance-covariance matrix of variables 1, 2, and 3:
  > cov(a[,c(1,2,3)])
  or
  > cov(a[,1:3])
• To compute the variance-covariance matrix of variables 1, 2, and 5:
  > cov(a[,c(1,2,5)])
• To compute the correlation matrix, replace cov with cor in the commands above, for example:
  > cor(a[,c(1,2,3)])
• To compute summary statistics for all the variables:
  > summary(a)
• To construct a stem-and-leaf plot, histogram, or boxplot:
  > stem(a$y)
  > boxplot(a$y)
  > hist(a$y)
• To plot variable y against variable x10:
  > plot(a$x10, a$y)
• And you can give names to the axes and to your plot:
  > plot(a$x10, a$y, main="Scatterplot of percent body fat against thigh circumference", ylab="Percent body fat", xlab="Thigh circumference")

And here is the plot:

[Figure: Scatterplot of percent body fat against thigh circumference. x-axis: Thigh circumference; y-axis: Percent body fat.]

• To save a plot as a pdf file in the working directory (e.g. your desktop):
  > pdf("box.pdf")
  > boxplot(a$y)
  > dev.off()

A box plot of the variable y can then be found in your current working directory under the name box.pdf. If you want to read more about a specific command (for example about histograms and boxplots), type the following at the command line:
  > ?hist
  > ?boxplot

• Exercise: Construct the same plots with different variables and save them on your desktop.

Create multiple graphs on one page.
Suppose 9 graphs, 3 × 3:

pdf("plot9.pdf")
par(mfrow=c(3,3))
hist(a$y)
boxplot(a$x10)
plot(a$x10, a$y)
boxplot(a$y)
boxplot(a$x9)
hist(a$x1)
plot(a$x13, a$y)
hist(a$x9)
boxplot(a$x13)
dev.off()

And here is the plot:

[Figure: 3 × 3 grid of the nine plots: histograms of a$y, a$x1, and a$x9; boxplots of a$x10, a$y, a$x9, and a$x13; scatterplots of a$y against a$x10 and against a$x13.]

Create subsets: The following simple commands will create subsets of the original data frame a:
a1  log_lead  log_zinc
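
The summary commands listed above can be tried without the course data. What follows is only a minimal sketch on a small made-up data frame: the name d, its values, and the subset d_sub are hypothetical and are not the body fat data or the handout's own subset commands. It uses colMeans() for column means and logical indexing as one common way to create a subset.

```r
# Hypothetical stand-in for the course data (values are made up)
d <- data.frame(x1 = c(1, 2, 3, 4),
                y  = c(2, 4, 6, 8),
                x3 = c(4, 3, 2, 1))

col_means <- colMeans(d)          # mean of every variable
y_var     <- var(d$y)             # variance of one variable
cov_mat   <- cov(d[, 1:3])        # variance-covariance matrix of variables 1 to 3
cor_mat   <- cor(d[, c(1, 2)])    # correlation matrix of variables 1 and 2

# One common way to create a subset: keep the rows where y exceeds 3
d_sub <- d[d$y > 3, ]

print(col_means)
print(cov_mat)
print(d_sub)
```

Logical indexing keeps all columns and drops only the rows that fail the condition, so d_sub can be passed to the same summary commands as d.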

Sourav · Accepted Answer

Stats 10 Lab 1, Submission 
Name:  Xiangrong Pu 
UID: 705129099 
 
Section 1 
1) a)  
> heights <- c(6.0, 5.8, 5.2)
> print(heights)
[1] 6.0 5.8 5.2
 
b)  
> names <- c("Eric", "Christina", "Xiangrong")
> print(names)
[1] "Eric"      "Christina" "Xiangrong"
c)  
> cbind(names, heights) 
     names       heights
[1,] "Eric"      "6"
[2,] "Christina" "5.8"
[3,] "Xiangrong" "5.2"
 
> class(cbind(names, heights)) 
[1] "matrix" 
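
The result is a matrix because cbind() on two plain vectors builds a matrix, and a matrix holds a single type, so the numeric heights are converted to character strings. The sketch below illustrates this and contrasts it with data.frame(), which keeps each column's own type. The object names m and df are introduced here only for illustration.

```r
names   <- c("Eric", "Christina", "Xiangrong")
heights <- c(6.0, 5.8, 5.2)

# cbind() on two vectors gives a matrix; mixing character and numeric
# forces everything to character
m <- cbind(names, heights)
print(typeof(m))            # storage type of the matrix entries

# data.frame() keeps each column's own type, so heights stay numeric
df <- data.frame(names, heights)
print(class(df$heights))
print(mean(df$heights))     # arithmetic works on the data frame column
```

If the heights are needed for calculations later, the data frame form avoids converting the strings back to numbers.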
 
2) a) 
setwd("~/Desktop/Stats 10 Lab1") 
 
NCbirths <- ...

3) a)
> #install.packages("maps")
> find.package("maps")
[1] "C:/Users/Xiangrong/Documents/R/win-library/3.6/maps" 
 
b)  
> library("maps")
> map("state") 
 
 
4) a)  
> weights <- ...
> weights_in_pounds <- ...
> weights_in_pounds[1:20]
 [1]  7.7500 11.0625  6.6875  9.0000  7.3125  6.1250  9.1875  8.6250  6.5000 
[10]  7.6875  9.5625  8.0625  7.4375  6.7500  6.6250  7.8125  7.1875  8.0000 
[19]  8.2500  5.1875
Section 2 
 
1) 
> mean(weights_in_pounds) 
[1] 7.2532 
 
The mean weight of the babies is about 7.25 pounds.
 
2) 
> library("mosaic") 
> tally(~Habit|Gender,data=NCbirths,format='percent') 
                    Gender 
Habit          Female      Male 
  NonSmoker 90.073145 91.111111 
  Smoker     9.926855  8.888889 
 
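About 9.9% of the mothers of female babies and 8.9% of the mothers of male babies smoke, so roughly 9 to 10% of mothers smoke in either group.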
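
The same kind of percentage table can be built in base R without the mosaic package. This is only a sketch on made-up records: the data frame births and its values are hypothetical and are not the NCbirths data. It uses table() for the counts and prop.table() with margin = 2 so that each Gender column sums to 100%.

```r
# Made-up records (hypothetical; not the NCbirths data)
births <- data.frame(
  Habit  = c("NonSmoker", "NonSmoker", "NonSmoker", "Smoker",
             "NonSmoker", "Smoker", "Smoker", "NonSmoker"),
  Gender = c("Female", "Female", "Female", "Female",
             "Male", "Male", "Male", "Male")
)

# Counts of Habit (rows) by Gender (columns)
counts <- table(births$Habit, births$Gender)

# Percentages within each Gender column
pct_by_gender <- 100 * prop.table(counts, margin = 2)

# Overall percentages of Habit, ignoring Gender
pct_overall <- 100 * prop.table(table(births$Habit))

print(round(pct_by_gender, 1))
print(round(pct_overall, 1))
```

With margin = 2 the percentages are conditional on Gender, which matches the layout of the tally(~Habit|Gender, ...) output; dropping the margin argument would give percentages of the whole table instead.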
 
3) 
> tally(NCbirths$Habit,format='percent') 
X 
NonSmoker    Smoker  
 90.61245   9.38755  
 
The percentage of the smokers is around 12% off from the CDC’s report.
