Problem Set 2



Problem Set 2
Soohyun Cho
Due: 02/28/2022 11:59 PM ET

• Partial credit is given, so attempt all questions. You should type answers to math questions (i.e., no handwritten math).
• For open-ended questions, you are expected to answer in complete sentences. Limit the length of your answers; typically a couple of sentences will suffice.
• You must submit your homework in PDF format along with your R script. If you’re working in R Markdown, you can just submit your homework in PDF format.

Question 1 (2 pt)
Use R to complete the following tasks:
1. Create an object, called a, that takes the values of 0.824, 1.81, 0.2, -2.2, 4.7, 0.003, 1.44, and NA, where NA is a missing value. Also, create a vector b that takes the values of 1, 2, 3, 10, 20, 53, 56, and 103.
2. Compute the arithmetic mean of a.
3. Calculate the length of a and b (use an R argument/function).
4. Using the ggplot2 package (the ggplot() function), present a scatterplot of a and b.

Question 2 (3 pt)
Use R to complete the following tasks:
1. Create an object, called mean_x, that takes the mean value of the vector x that takes the values of 1, 0.411, 0.234, 0,33, and 10. Don’t store the mean value itself. Use an argument/function.
2. Compute the number of elements of the vector x that are above the mean of x. Report the answer in a comment.
3. Assess whether or not the value 2 appears in the vector x.
4. Round the vector x to two digits.
5. Create an object, called y, that takes the value 1 if the corresponding element of x is above its mean, and 0 otherwise. Report the resulting vector in a comment.
6. Create an object, called y, that contains the values of A, B, C, D, E, and F. Then, combine this vector y with the vector x to make a data frame. Name this data frame rating and show the first five observations.
Question 3 (2 pt)
In designing a new test for COVID-19, a researcher claims a 95 percent true positive rate, meaning that the test correctly identifies the virus in 95 percent of patients carrying it. The test has a false positive rate of 15 percent, meaning that among 100 patients not carrying the virus, 15 will receive a positive result. A patient came into a clinic for testing. In her demographic group, about 3 percent of people carry the virus.
1. If the test returns a positive result, what is the probability that the patient actually carries COVID-19?
2. If the test returns a negative result, what is the probability that the patient actually carries COVID-19?

Question 4 (2 pt)
We studied covariance and the correlation coefficient as two measures of a bivariate relationship.
1. What can we learn from the sign and absolute value of covariance? Why does it matter to know the sign of covariance?
2. Are Cor(X, Y) and Cor(Y, X) the same or different?

Question 5 (1 pt)
In your own words, explain the intuition of statistical inference based on the central limit theorem. What is a p-value? Does a p-value tell us the probability that an observed relationship is real?

Question 6 (2 pt)
Say that, in California, a proposition to re-establish bilingual education in public schools will be on the ballot. You conduct a poll with a random sample of Californians and find that 220 residents plan to vote “Yea” and 180 plan to vote “Nay” (assume that you know Ȳ = 0.55 and s_Y^2 = 0.25).
1. We do not know the true proportion of Californians supporting the proposition, which we might call µ. We can use the CLT to approximate the sampling distribution for µ. To do so, we need Ȳ and the standard error. What is the standard error in this case?
2. Given what you computed in (1), what is the 95% confidence interval around Ȳ? (Using the rule-of-thumb ±2 standard errors is sufficient here.)
3. How do you interpret the confidence interval?
Answered 2 days after Feb 26, 2022

Answer to: Problem Set 2

Suraj answered on Mar 01 2022
Solution 3:
Let E denote the event that the virus is present. Let TP denote the event that the test is positive, and TN the event that the test is negative. A false positive (FP) is a positive result for a patient who does not carry the virus.
The given probabilities, taken from the problem statement, are:
P(E) = 0.03, P(TP | E) = 0.95, P(TP | not E) = 0.15.
Hence, if the test returns a positive result, the probability that the patient actually carries COVID-19 is 0.1638.
Hence, if the test returns a negative result, the probability that the patient actually carries COVID-19 is 0.0018.
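The intermediate equations of Solution 3 did not survive on this page. The following is a sketch of how both figures can be reproduced with Bayes' rule from the numbers in Question 3. The complement notation E^c (patient does not carry the virus) is added here for illustration and is not from the original answer.

```latex
% Requires amsmath. E = virus present, E^c = virus absent (added notation),
% TP = test positive, TN = test negative.
\begin{align*}
P(E) &= 0.03, \qquad P(\mathrm{TP} \mid E) = 0.95, \qquad P(\mathrm{TP} \mid E^c) = 0.15 \\[4pt]
% Law of total probability for a positive result
P(\mathrm{TP}) &= P(\mathrm{TP} \mid E)\,P(E) + P(\mathrm{TP} \mid E^c)\,P(E^c) \\
               &= 0.95 \times 0.03 + 0.15 \times 0.97 = 0.0285 + 0.1455 = 0.1740 \\[4pt]
% Part 1: posterior probability after a positive result
P(E \mid \mathrm{TP}) &= \frac{P(\mathrm{TP} \mid E)\,P(E)}{P(\mathrm{TP})}
                       = \frac{0.0285}{0.1740} \approx 0.1638 \\[4pt]
% Part 2: posterior probability after a negative result
P(\mathrm{TN} \mid E) &= 1 - 0.95 = 0.05, \qquad P(\mathrm{TN}) = 1 - 0.1740 = 0.8260 \\
P(E \mid \mathrm{TN}) &= \frac{P(\mathrm{TN} \mid E)\,P(E)}{P(\mathrm{TN})}
                       = \frac{0.05 \times 0.03}{0.8260} = \frac{0.0015}{0.8260} \approx 0.0018
\end{align*}
```

Both results agree with the stated answers of 0.1638 and 0.0018. The positive-test probability is low despite the 95 percent true positive rate because only 3 percent of the group carries the virus, so false positives outnumber true positives.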
Solution 4:
1.
The sign of the covariance matters in correlation analysis because the correlation coefficient always has the same sign as the covariance. If the covariance is negative, the correlation is negative; if the covariance is positive, the correlation is positive.
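The link between the two signs in part 1 can be made explicit with the standard definition of the correlation coefficient. This is a sketch assuming both variables have nonzero variance; the symbols σ_X and σ_Y for the standard deviations are added here and do not appear in the original answer.

```latex
% Requires amsmath. sigma_X and sigma_Y are the standard deviations of X and Y
% (added notation), assumed strictly positive.
\begin{align*}
\mathrm{Cor}(X, Y) &= \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y},
  \qquad \sigma_X > 0, \; \sigma_Y > 0 \\[4pt]
% Dividing by a positive number never changes the sign
\mathrm{Cov}(X, Y) > 0 &\iff \mathrm{Cor}(X, Y) > 0, \qquad
\mathrm{Cov}(X, Y) < 0 \iff \mathrm{Cor}(X, Y) < 0
\end{align*}
```

Because the denominator is positive, only the magnitude is rescaled. The absolute value of the covariance depends on the units of X and Y, whereas the correlation always lies between -1 and 1.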
2.
The...