Questions 1-6 whole sheet Question 1 The lifetimes (in units of 106 seconds) of certain satellite...

Question

Questions 1-6 whole sheet

Question 1
The lifetimes (in units of 10^6 seconds) of certain satellite components are shown in the frequency distribution given in ‘Dataset1’.
1. Draw a frequency polygon, histogram and cumulative frequency polygon for the data.
2. Calculate the frequency mean, the frequency standard deviation, the median and the first and third quartiles for this grouped data.
3. Compare the median and the mean and state what this indicates about the distribution. Comment on how the answer to this question relates to your frequency polygon and histogram.
4. Explain the logic behind the equations for the mean and standard deviation for grouped data, starting from the original equations for a simple list of data values. (This does not just mean ‘explain how the equations are used’.)
5. Carry out an appropriate statistical test to determine whether the data is normally distributed.

Question 2
A manufacturer of metal plates makes two claims concerning the thickness of the plates they produce. They are stated here:
• Statement A: The mean is 200 mm.
• Statement B: The variance is 1.5 mm².
To investigate Statement A, the thickness of a sample of metal plates produced in a given shift was measured. The values found are listed in Part (a) of worksheet ‘Dataset2’, with millimetres (mm) as unit.
1. Calculate the sample mean and sample standard deviation for the data in Part (a) of ‘Dataset2’. Explain why we are using the phrase ‘sample’ mean or ‘sample’ standard deviation.
2. Set up the framework of an appropriate statistical test on Statement A. Explain how knowing the sample mean before carrying out the test will influence the structure of your test.
3. Carry out the statistical test and state your conclusions.
To investigate the second claim, the thickness of a second sample of metal sheets was measured. The values found are listed in Part (b) of worksheet ‘Dataset2’, with millimetres (mm) as unit.
1. Calculate the sample mean and then the sample variance and standard deviation for the data in Part (b).
2. Set up the framework of an appropriate statistical test on Statement B. Explain how knowing the sample variance before carrying out the test would influence the structure of your test.
3. Carry out the statistical test and state your conclusions.

Question 3
A manager of an inter-county hurling team is concerned that his team lose matches because they ‘fade away’ in the last ten minutes. He has measured GPS data showing how much ground particular players cover within a given time period; this is the data in list (a) in worksheet ‘Dataset3’. He has acquired the corresponding data from an opposing, more successful team, which is given in list (b).
1. Calculate the sample mean and sample standard deviation for the two sets of data.
2. Set up the framework of an appropriate statistical test to determine whether there is a difference in the distances covered by the two groups of players.
3. Explain how having the results of the calculations above in advance of doing your statistical test will influence the structure of that test.
4. Carry out the statistical test and state your conclusions.

Question 4
A study was carried out to determine whether the resistance of the control circuits in a machine are lower when the machine motor is running. To investigate this question, a set of the control circuits was tested as follows. Their resistance was measured while the machine motor was not running for a certain period of time and then again while the motor was running. The values found are listed in worksheet ‘Dataset4’, with kilo-Ohms as the unit of measurement.
1. Set up the structure of an appropriate statistical test to determine whether the resistance of the control circuit in a machine are lower when the machine motor is running.
2. Explain how the order of subtraction chosen to calculate the differences will influence the structure of the test.
3. Give a reason why the data is measured with the engine not running first and then with the engine running.
4. Explain how knowing the mean of the differences in advance will influence the structure of your statistical test.
5. Carry out the statistical test and state your conclusions.

Question 5
A study was carried out to determine the influence of a trace element found in soil on the yield of potato plants grown in that soil, defined as the weight of potatoes produced at the end of the season. A large field was divided up into 14 smaller sections for this experiment. For each section, the experimenter recorded the amount of the trace element found (in milligrams per metre squared) and the corresponding weight of the potatoes produced (in kilograms). This information is presented in the worksheet ‘Dataset5’ in the Excel document. Define X as the trace element amount and Y as the yield.
1. Draw a scatterplot of your data set.
2. Calculate the coefficients of a linear equation to predict the yield Y as a function of X.
3. Calculate the correlation coefficient for the paired data values.
4. Set up the framework for an appropriate statistical test to establish if there is a correlation between the amount of the trace element and the yield. Explain how having the scatterplot referred to above and having the value of r in advance will influence the structure of your statistical test.
5. Carry out and state the conclusion of your test on the correlation.
6. Comment on how well the regression equation will perform based on the results above.

Question 6
A multinational corporation is conducting a study to see how its employees in five different countries respond to three gifts in an incentive scheme. The numbers of employees who choose each of the three gifts (G1 to G3) in each of the five countries (A to E) are given in the table in ‘Dataset6’ in the Excel document.
1. Set up the structure of an appropriate statistical test to determine whether the data supports a link between choice of gift and country, including the statistic to be used.
2. Carry out this test, showing clearly in your work how the expected values are calculated for your test statistic.

assignment-questions-7-t2drk5qv-p5pnt4rv.pdf assignment-datasets-4-zd52d1ry-4wuf54uf.xlsx

Baljit · Accepted Answer

Statistics and Probability
Assignment on Statistical Testing
Question 1:
1. A frequency polygon is a graphical representation of the distribution of a dataset. It plots the frequency of each interval on the y-axis against the midpoint of that interval on the x-axis and connects the points with straight lines. It can be drawn in Excel using a scatter plot with straight connecting lines.
A histogram also represents the distribution of a dataset. The intervals, called bins or classes, lie along the x-axis, and the frequency of observations within each interval is shown on the y-axis as the height of a bar.
A cumulative frequency polygon plots cumulative frequencies against their corresponding data values (the upper boundary of each interval). It is used to visualize the cumulative distribution of a dataset.
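As a rough illustration of how the plotted points are obtained, the sketch below builds the coordinates for the frequency polygon and the cumulative frequency polygon. The class table is partly an assumption: the three middle classes use the frequencies quoted in this answer, while the outer classes are made-up placeholders, because 'Dataset1' is not reproduced here.

```python
from itertools import accumulate

# (lower boundary, upper boundary, frequency). The three middle rows use the
# frequencies quoted in this answer; the outer rows are hypothetical
# placeholders, because 'Dataset1' is not reproduced here.
classes = [(300, 307, 8), (307, 314, 22), (314, 321, 44), (321, 328, 88),
           (328, 335, 86), (335, 342, 45), (342, 349, 20)]

# Frequency polygon: frequency plotted against each class midpoint.
polygon_points = [((lower + upper) / 2, f) for lower, upper, f in classes]

# Cumulative frequency polygon: running total plotted against the upper
# boundary of each class, starting from zero at the first lower boundary.
running_totals = list(accumulate(f for _, _, f in classes))
cumulative_points = [(classes[0][0], 0)] + [
    (upper, cf) for (_, upper, _), cf in zip(classes, running_totals)
]

print(polygon_points)
print(cumulative_points)
```

The histogram uses the same frequencies as bar heights over each class interval, so no extra calculation is needed for it.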
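2. The calculations use the frequency distribution in ‘Dataset1’.
The frequency mean is
$$\bar{x} = \frac{\sum_i f_i x_i}{\sum_i f_i}$$
Here $f_i$ is the frequency of the ith interval and $x_i$ is the midpoint (or value) of the ith interval.
Similarly, for the frequency standard deviation,
s = 10.44196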
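The median of grouped data is
$$\text{Median} = L + \frac{N/2 - CF_{previous}}{f} \times w$$
Here L is the lower boundary of the interval containing the median, CF_previous is the cumulative frequency of the previous interval, f is the frequency of the interval containing the median, w is the width of the interval and N is the total frequency.
The median lies in the first interval whose cumulative frequency reaches N/2.
Here N = 313, so N/2 = 156.5, which falls in the interval 321-328.
So L = 321, f = 88, CF_previous = 74 and w = 7, which gives Median = 321 + (156.5 - 74)/88 × 7 ≈ 327.56.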
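The first quartile Q1 lies in the first interval whose cumulative frequency reaches N/4.
N/4 = 78.25 is greater than the cumulative frequency of 74 reached at the end of the interval 314-321, so Q1 lies in the next interval, 321-328, with L = 321, CF_previous = 74 and f = 88. This gives Q1 = 321 + (78.25 - 74)/88 × 7 ≈ 321.34.
Similarly, the third quartile Q3 corresponds to 3N/4.
3N/4 = 234.75 lies between the cumulative frequencies 162 and 248, so Q3 lies in the interval 328-335, with L = 328, CF_previous = 162 and f = 86. This gives Q3 = 328 + (234.75 - 162)/86 × 7 ≈ 333.92.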
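The grouped-data calculations in part 2 can be sketched as follows. This is only an illustration under stated assumptions: the three middle classes use the frequencies quoted in this answer, but the outer classes are hypothetical placeholders (chosen so that 30 observations fall below 314 and the total is 313), so the mean and standard deviation printed here will not reproduce the value 10.44196. The divisor n - 1 in the standard deviation is also an assumption. The helper `grouped_quantile` is an illustrative name; it selects the first class whose cumulative frequency reaches the target position and interpolates inside it.

```python
from math import sqrt

# (lower boundary, upper boundary, frequency). The three middle rows use the
# frequencies quoted in this answer; the outer rows are hypothetical
# placeholders chosen only so that the total is N = 313.
classes = [(300, 307, 8), (307, 314, 22), (314, 321, 44), (321, 328, 88),
           (328, 335, 86), (335, 342, 45), (342, 349, 20)]

n = sum(f for _, _, f in classes)

# Grouped mean: midpoints weighted by class frequency.
grouped_mean = sum(f * (lower + upper) / 2 for lower, upper, f in classes) / n

# Grouped standard deviation, sample form (divisor n - 1 is an assumption).
squared_dev = sum(
    f * ((lower + upper) / 2 - grouped_mean) ** 2 for lower, upper, f in classes
)
grouped_sd = sqrt(squared_dev / (n - 1))


def grouped_quantile(rows, fraction):
    """Interpolate a quantile (0 < fraction < 1) inside its class."""
    total = sum(f for _, _, f in rows)
    target = fraction * total
    cf_previous = 0
    for lower, upper, f in rows:
        # The quantile class is the first one whose cumulative frequency
        # reaches the target position.
        if f > 0 and cf_previous + f >= target:
            return lower + (target - cf_previous) / f * (upper - lower)
        cf_previous += f
    raise ValueError("fraction must lie between 0 and 1")


q1 = grouped_quantile(classes, 0.25)
median = grouped_quantile(classes, 0.50)
q3 = grouped_quantile(classes, 0.75)
print(grouped_mean, grouped_sd, q1, median, q3)
```

Because the quartiles and the median all fall in the classes whose frequencies are quoted in the answer, those three results do not depend on the placeholder rows.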
3. The mean and median are close to each other, which suggests that the distribution is approximately symmetric, as a normal distribution is. A bell-shaped frequency polygon and histogram typically indicate a symmetric distribution in which the majority of the data points cluster around the mean. The agreement between the mean, the median and the bell-shaped graphs is consistent with the dataset being approximately normally distributed.
4. For a simple list of n data values, the mean is the sum of the values divided by n, and the standard deviation is the square root of the averaged squared deviations of the values from the mean.
In grouped data, instead of individual data points, we have intervals or classes, and the midpoint of each interval stands in for every value that falls inside it. The sum of the values in an interval is therefore approximated by its frequency times its midpoint, so the mean for grouped data is the weighted average of the midpoints, with each midpoint weighted by its frequency. This accounts for the fact that some intervals contain more data points than others. For the standard deviation, we likewise calculate the deviation of each midpoint from the mean, and each squared deviation is multiplied by the frequency of its interval before summing, so intervals with higher frequencies contribute more to the overall spread of the data.
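The step from the simple-list equations to the grouped-data equations can be written out as below. This is a sketch of the standard derivation, assuming the sample form of the standard deviation (divisor n - 1); for the population form, replace n - 1 by n throughout.

```latex
% Simple list of n values x_1, ..., x_n (sample form, divisor n - 1 assumed):
\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j ,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})^2}

% Grouped data: every value in class i is replaced by the midpoint x_i,
% so x_i occurs f_i times and n = \sum_i f_i:
\sum_{j=1}^{n} x_j \approx \sum_i f_i x_i
\quad\Longrightarrow\quad
\bar{x} \approx \frac{\sum_i f_i x_i}{\sum_i f_i}

\sum_{j=1}^{n} (x_j - \bar{x})^2 \approx \sum_i f_i (x_i - \bar{x})^2
\quad\Longrightarrow\quad
s \approx \sqrt{\frac{\sum_i f_i (x_i - \bar{x})^2}{\sum_i f_i - 1}}
```

The approximation signs reflect that the true values inside each class are unknown and are all treated as equal to the class midpoint.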
Question 2:
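Part (a):
1. The sample mean and sample standard deviation are calculated from the data in Part (a) of ‘Dataset2’.
The phrases "sample mean" and "sample standard deviation" are used to emphasize that these statistics are calculated from a subset of the data, known as a sample, rather than from the entire population. They are estimates of the unknown population mean and population standard deviation, and the sample standard deviation uses the divisor n - 1 rather than n.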
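The difference between the sample and population forms can be seen in a small sketch using Python's standard `statistics` module. The thickness values are hypothetical and are not taken from 'Dataset2'.

```python
from statistics import mean, pstdev, stdev

# Hypothetical thickness readings in mm (not the values from 'Dataset2').
thickness = [199.2, 200.5, 201.1, 198.7]

sample_mean = mean(thickness)
sample_sd = stdev(thickness)       # sample form: divisor n - 1
population_sd = pstdev(thickness)  # population form: divisor n

print(sample_mean, sample_sd, population_sd)
```

The sample form is always slightly larger, which compensates for measuring the deviations from the sample mean rather than from the true population mean.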
2. Since the population standard deviation is unknown, we use a one-sample t-test.
Hypothesis statements for the test:
Null Hypothesis (H0): The population mean thickness is 200 mm.
Alternative Hypothesis (H1): The population mean thickness is not 200 mm.
The t-value is calculated using the following formula:
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
Here $\bar{x}$ and $s$ are the sample mean and sample standard deviation from step 1, $\mu_0 = 200$ mm is the claimed mean, and the sample size is n = 42.
We assume a significance level of 0.05.
The degrees of freedom are
df = n - 1 = 42 - 1 = 41
Because H1 is two-sided, the critical t-value for alpha = 0.05 and df = 41 is approximately ±2.020.
Knowing the sample mean allows the test statistic to be calculated.
3. Since the magnitude of our t-value is less than the critical t-value, we fail to reject the null hypothesis. We do not have enough evidence to conclude that the population mean thickness is different from the claimed value of 200 mm.
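The test in Part (a) can be sketched in a few lines. The sample below is hypothetical (eight made-up readings rather than the 42 values in 'Dataset2'), the function name `one_sample_t` is illustrative, and the critical value is taken from a standard t-table for the smaller sample, so the numbers are for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, mu0):
    """t statistic for H0: population mean equals mu0."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


# Hypothetical thickness readings in mm (not the values from 'Dataset2').
thickness = [199.2, 200.5, 201.1, 198.7, 200.3, 199.9, 200.8, 199.9]

t_value = one_sample_t(thickness, 200.0)

# Two-tailed critical value for alpha = 0.05 and df = 7, from a t-table.
t_critical = 2.365

reject_h0 = abs(t_value) > t_critical
print(t_value, reject_h0)
```

For the actual data the same function applies, with df = 41 and the corresponding critical value.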
Part (b):
1. The sample mean, sample variance and sample standard deviation are calculated from the data in Part (b) of ‘Dataset2’.
2. We can use a chi-square test for a single variance in this case.
Hypothesis statements for the test:
Null Hypothesis (H0): The population variance of the metal sheets is 1.5 mm².
Alternative Hypothesis (H1): The population variance of the metal sheets is not 1.5 mm².
The chi-square statistic is
$$\chi^2 = \frac{(n - 1)\, s^2}{\sigma_0^2}$$
Here $s^2$ is the sample variance, $\sigma_0^2 = 1.5$ mm² is the claimed variance, and the sample size n is 18.
The significance level is alpha = 0.05 and the degrees of freedom are df = n - 1 = 17.
For a two-sided test with alpha = 0.05 and df = 17, the critical chi-square values are approximately 7.564 (lower) and 30.191 (upper).
Knowing the sample variance before conducting the test helps in determining the appropriate statistical test and calculating the test statistic. In this case, it allows for the selection of the chi-square test and the incorporation of the sample variance into the computation of the chi-square statistic.
3. Since the calculated chi-square statistic falls within the non-rejection region defined by the critical chi-square values for alpha = 0.05 and df = 17, we fail to reject the null hypothesis. We do not have enough evidence to conclude that the population variance differs from 1.5 mm².
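The variance test in Part (b) can be sketched in the same way. The sample is hypothetical (six made-up readings rather than the 18 values in 'Dataset2'), the function name `chi_square_variance_stat` is illustrative, and the critical values are taken from a standard chi-square table for the smaller sample.

```python
from statistics import variance


def chi_square_variance_stat(sample, sigma0_sq):
    """Chi-square statistic for H0: population variance equals sigma0_sq."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq


# Hypothetical thickness readings in mm (not the values from 'Dataset2').
thickness = [199.0, 201.5, 200.2, 198.4, 200.9, 200.0]

chi_sq = chi_square_variance_stat(thickness, 1.5)

# Two-sided critical values for alpha = 0.05 and df = 5, from a chi-square table.
lower_critical, upper_critical = 0.831, 12.833

# H0 is rejected only if the statistic falls outside the two critical values.
reject_h0 = chi_sq < lower_critical or chi_sq > upper_critical
print(chi_sq, reject_h0)
```

For the actual data the same function applies, with df = 17 and the critical values for that many degrees of freedom.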
