Answer To: Subject Exercise A We consider n observations y1, . . . , yn of a variable and n vectors xi (t(xi) =...
Himanshu answered on Aug 23 2021
Exercise A
1.
The likelihood function is the density function regarded as a function of θ.
L(θ|x) = f(x|θ), θ ∈ Θ.
The maximum likelihood estimator (MLE) is
θ̂(x) = arg max_{θ ∈ Θ} L(θ|x).
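As a minimal sketch of maximizing L(θ|x) numerically, the snippet below (my own illustration, not part of the exercise) takes Bernoulli data, evaluates the log-likelihood over a grid of candidate parameters, and picks the argmax; for the Bernoulli model the closed-form MLE is the sample proportion, so the two agree.

```python
import math

def log_likelihood(p, data):
    # Bernoulli log-likelihood: L(p|x) = prod p^xi (1 - p)^(1 - xi),
    # so log L = sum xi*log(p) + (1 - xi)*log(1 - p)
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in data)

data = [1, 0, 1, 1, 0, 1, 1, 0]  # assumed example data

# Maximize over a fine grid of candidate parameter values in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: log_likelihood(p, data))

print(p_hat)                   # → 0.625 (grid argmax)
print(sum(data) / len(data))   # → 0.625 (closed-form MLE: sample proportion)
```

The grid search is only for illustration; in practice one maximizes the log-likelihood analytically or with a numerical optimizer.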
2.
A random sample x1, x2, …, xn from a distribution f(x) is a set of independent and identically distributed random variables with xi ∼ f(x) for all i.
Their joint p.d.f. is
f(x1, x2, …, xn) = f(x1) f(x2) ⋯ f(xn) = ∏_{i=1}^{n} f(xi).
The sample moments provide estimates of the moments of f(x). We need to know how they are distributed.
The mean x̄ of a random sample is an unbiased estimate of the population moment µ = E(x), since
E(x̄) = E((1/n) ∑ xi) = (1/n) ∑ E(xi) = nµ/n = µ.
The variance of a sum of independent variables is the sum of their variances, since covariances are zero.
Therefore
V(x̄) = V((1/n) ∑ xi) = (1/n²) ∑ V(xi) = nσ²/n² = σ²/n.
Observe that V(x̄) → 0 as n → ∞. Since E(x̄) = µ, the estimates become increasingly concentrated around the true population parameter.
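A quick simulation sketch (my illustration, not part of the answer) can check both facts: drawing many samples of size n from a population with mean µ and variance σ², the sample means should average to µ and have variance close to σ²/n.

```python
import random
import statistics

random.seed(0)  # reproducible draws
mu, sigma, n, reps = 3.0, 1.0, 10, 2000  # assumed population and sample sizes

# Compute the mean of each of `reps` samples of size n.
sample_means = [statistics.mean(random.gauss(mu, sigma) for _ in range(n))
                for _ in range(reps)]

print(statistics.mean(sample_means))       # close to mu = 3.0
print(statistics.pvariance(sample_means))  # close to sigma^2 / n = 0.1
```

The agreement is only approximate for finite `reps`, which is exactly the consistency behaviour described above.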
Such an estimate is said to be consistent. The sample variance s² = (1/n) ∑ (xi − x̄)² is not an unbiased estimate of σ² = V(x), since
E(s²) = E[(1/n) ∑ (xi − x̄)²]
= E[(1/n) ∑ ((xi − µ) + (µ − x̄))²]
= E[(1/n) ∑ {(xi − µ)² + 2(xi − µ)(µ − x̄) + (µ − x̄)²}]
= V(x) − 2 E[(x̄ − µ)²] + E[(x̄ − µ)²]
= V(x) − V(x̄).
Here we have used the result that
E[(1/n) ∑ (xi − µ)(µ − x̄)] = −E[(µ − x̄)²] = −V(x̄).
It follows that
E(s²) = V(x) − V(x̄) = σ² − σ²/n = σ²(n − 1)/n.
Therefore s² is a biased estimator of the population variance. For an unbiased estimate, we should use
σ̂² = s² · n/(n − 1) = ∑ (xi − x̄)²/(n − 1).
However, s² is still a consistent estimator, since E(s²) → σ² as n → ∞ and also V(s²) → 0. The value of V(s²) depends on the distribution of the underlying population, which is often assumed to be normal.
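The two estimators correspond directly to Python's standard-library `statistics.pvariance` (divisor n) and `statistics.variance` (divisor n − 1); the sketch below, on assumed example data, confirms the σ̂² = s² · n/(n − 1) relationship.

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # assumed example data
n = len(data)

s2 = statistics.pvariance(data)         # divides by n: the biased estimator s^2
sigma2_hat = statistics.variance(data)  # divides by n - 1: the unbiased estimator

print(s2)          # → 4.0
print(sigma2_hat)  # → 32/7 ≈ 4.5714, which equals s2 * n / (n - 1)
```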
3.
Let B be the set of all possible vectors β. If there is no further information, B is the k-dimensional real Euclidean space ℝ^k.
The objective is to find a vector b = (b1, b2, …, bk)' from B that minimizes the sum of squared deviations of the εi's, i.e.,
S(β) = ∑_{i=1}^{n} εi² = (y − Xβ)'(y − Xβ)
for given y and X.
A minimum will always exist, as S(β) is a real-valued, convex and differentiable function. Write
S(β) = y'y + β'X'Xβ − 2β'X'y.
Differentiating S(β) with respect to β gives
∂S(β)/∂β = 2X'Xβ − 2X'y,
∂²S(β)/∂β∂β' = 2X'X (at least non-negative definite).
Setting ∂S(β)/∂β = 0 gives the normal equations
X'Xb = X'y,
where the following result is used: if f(z) = z'Az is a quadratic form, z an m × 1 vector and A any m × m symmetric matrix, then ∂f(z)/∂z = 2Az.
Since it is assumed that rank(X) = k (full rank), X'X is positive definite and the unique solution of the normal equations is
b = (X'X)⁻¹X'y,
which is termed the ordinary least squares estimator (OLSE) of β. Since ∂²S(β)/∂β∂β' = 2X'X is at least non-negative definite, b minimizes S(β).
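As a minimal sketch of b = (X'X)⁻¹X'y, the snippet below (my own illustration, with assumed data) solves the normal equations by hand for the two-parameter case of an intercept plus one regressor, where X'X is 2 × 2 and can be inverted explicitly.

```python
def ols_2param(x, y):
    # Normal equations X'Xb = X'y for the design matrix X = [1, x],
    # solved via the explicit 2x2 inverse of X'X.
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(u * v for u, v in zip(x, y))
    det = n * sxx - sx * sx           # det(X'X); nonzero when x is not constant
    b0 = (sxx * sy - sx * sxy) / det  # intercept
    b1 = (n * sxy - sx * sy) / det    # slope
    return b0, b1

x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]   # exactly y = 1 + 2x, so the fit is exact
print(ols_2param(x, y))    # → (1.0, 2.0)
```

For general k one would solve X'Xb = X'y with a linear-algebra routine rather than inverting by hand.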
4.
Let B ∈ Mn×n(F) be invertible and define Φ : Mn×n(F) → Mn×n(F) by Φ(A) = B⁻¹AB. To show that Φ is an isomorphism, it suffices to show that Φ is linear and that Φ has an inverse function.
To see that Φ is linear, let A1, A2 ∈ Mn×n(F) and c1, c2 ∈ F be arbitrary. Then
Φ(c1A1 + c2A2) = B⁻¹(c1A1 + c2A2)B = (B⁻¹(c1A1) + B⁻¹(c2A2))B = (c1B⁻¹A1 + c2B⁻¹A2)B = c1B⁻¹A1B + c2B⁻¹A2B = c1Φ(A1) + c2Φ(A2).
To see that Φ has an inverse, define Ψ : Mn×n(F) → Mn×n(F) by Ψ(A) = BAB⁻¹.
Then
(Φ ∘ Ψ)(A) = Φ(Ψ(A)) = Φ(BAB⁻¹) = B⁻¹(BAB⁻¹)B = A,
(Ψ ∘ Φ)(A) = Ψ(Φ(A)) = Ψ(B⁻¹AB) = B(B⁻¹AB)B⁻¹ = A,
so Ψ is the inverse of Φ and Φ is an isomorphism.
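A small numeric sketch (my own check, with an assumed invertible 2 × 2 matrix B) makes the two compositions concrete: conjugating by B⁻¹ and then by B recovers the original matrix exactly.

```python
def matmul(P, Q):
    # Plain nested-loop matrix product for small integer matrices.
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

B     = [[1, 1], [0, 1]]   # assumed invertible B
B_inv = [[1, -1], [0, 1]]  # its inverse, computed by hand

def phi(A):  # Phi(A) = B^{-1} A B
    return matmul(matmul(B_inv, A), B)

def psi(A):  # Psi(A) = B A B^{-1}
    return matmul(matmul(B, A), B_inv)

A = [[2, 3], [5, 7]]
print(psi(phi(A)) == A)  # → True  (Psi ∘ Phi is the identity)
print(phi(psi(A)) == A)  # → True  (Phi ∘ Psi is the identity)
```

Integer arithmetic keeps the check exact; this verifies the identities for one matrix, whereas the proof above establishes them for all A.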
5.
The centered Gaussian random variable X on ℝ with variance σ² > 0 has density given by
p(x) = (1/√(2πσ²)) exp(−x²/(2σ²)).
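The density transcribes directly into code; as a sanity check (my own addition), at x = 0 the exponential factor is 1, so p(0) = 1/√(2πσ²).

```python
import math

def gaussian_pdf(x, sigma2):
    # Density of the centered Gaussian with variance sigma2 > 0.
    return math.exp(-x * x / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

print(gaussian_pdf(0.0, 1.0))      # ≈ 0.39894, the standard normal peak
print(1 / math.sqrt(2 * math.pi))  # same value: p(0) = 1/sqrt(2*pi*sigma^2)
```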
It plays a central role in statistics due to the central limit theorem. It also has a central position in statistical and signal processing estimation problems. Important properties of this...