Statistics Project
Temperatures in January, 2006 in Purcellville, Virginia
Submitted by Suzanne Sands
Purpose: Analyze temperatures for January, 2006 in my local region, Purcellville, Virginia.
Most people are interested in the local weather, including me! I focused on the weather in January, 2006. News
reports indicated a much warmer January than usual. I was interested in compiling descriptive summaries in the
form of charts and numerical measures to get a sense of the typical temperature for January, 2006, and how the
temperatures have varied over the course of the month. (This particular project example is an adaptation of similar
project examples I have used in statistics classes I have taught in the past.)
Data: Random Sample of 30 Temperatures in January, 2006 in Purcellville, Virginia
Data Collection: An excellent website, www.weatherunderground.com, provides temperature readings from
thousands of weather stations. Toward the middle of the screen, I typed “Purcellville” in the “Location” box and
arrived at the Purcellville forecast. At the bottom of that page, there are links for personal weather stations. I
clicked on the “Top of Tranquility, Purcellville, VA” link and arrived at
http://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=KVAPURCE1
You can search for weather readings any day you like in the recent past. This particular weather station
recorded 12 temperatures every hour back in 2006, so there were 12 readings/hr x 24 hours x 31 days = 8,928
temperature readings for January, 2006! I decided to select a simple random sample of 30 temperatures from
this large collection of data.
I collected 30 temperatures at random times in January, 2006. (Random sampling is NOT a
requirement for your project. For instance, you could record the high temperature for each day.)
FYI: Here is how I chose the random sample: Since there are 31 days in January, I generated 30 random numbers
between 1 and 31 (with possible repetition). (You will see below that many days are repeated.) Next I determined
the sampling times. Since there were 288 temperature readings each day, I generated 30 random numbers between 1
and 288, representing the reading numbers. Since there were 12 readings per hour, I divided the reading random
number by 12 to get the hour and used the remainder to figure out which reading to choose during that hour. I
looked up the temperatures for each randomly selected day and time, and recorded the appropriate temperature.
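The selection procedure above can be sketched in Python. This is only an illustration, not the tool I actually used: the function name `pick_sample_slots` and the `seed` parameter are made up for the example, and subtracting 1 before dividing by 12 is an assumption that makes readings 1 to 12 fall in hour 0, readings 13 to 24 in hour 1, and so on.

```python
import random

READINGS_PER_HOUR = 12   # from the text: 12 readings every hour
READINGS_PER_DAY = 288   # 12 readings/hr x 24 hours
DAYS_IN_JANUARY = 31

def pick_sample_slots(n=30, seed=None):
    """Return n (day, hour, reading_in_hour) tuples chosen at random, repeats allowed."""
    rng = random.Random(seed)
    slots = []
    for _ in range(n):
        day = rng.randint(1, DAYS_IN_JANUARY)        # random day, 1..31
        reading = rng.randint(1, READINGS_PER_DAY)   # random reading number, 1..288
        # Assumption: shift to zero-based so the quotient is the hour (0..23)
        # and the remainder is the position within that hour (0..11).
        hour, position = divmod(reading - 1, READINGS_PER_HOUR)
        slots.append((day, hour, position + 1))      # report position as 1..12
    return sorted(slots)

for day, hour, position in pick_sample_slots(seed=2006):
    print(f"January {day:2d}, hour {hour:2d}, reading {position:2d} of that hour")
```

Each printed line identifies one reading to look up on the weather station page; the same day can appear more than once, as in my sample.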
Count  Date             Time   Temperature    Count  Date             Time   Temperature
       (January, 2006)         (degrees)             (January, 2006)         (degrees)
1      1                2:41   44.1           16     13               11:55  48.4
2      2                6:35   31.3           17     13               16:01  57.9
3      2                16:21  39.7           18     15               17:35  34.9
4      2                18:11  40.6           19     16               4:55   32.4
5      3                11:45  39.7           20     16               11:11  37.6
6      4                8:05   35.2           21     16               18:35  36.3
7      4                11:55  42.4           22     18               9:01   44.6
8      4                20:35  39.9           23     18               13:45  40.5
9      4                23:52  40.3           24     23               5:41   34.2
10     5                5:21   39.2           25     25               20:31  32.2
11     9                11:31  52.5           26     25               21:31  31.9
12     10               2:45   43.7           27     27               2:11   27.7
13     10               4:21   43.0           28     28               14:35  61.7
14     12               2:31   37.9           29     28               19:21  48.7
15     12               16:55  54.9           30     31               1:01   48.7
Temperature Data, in ascending order:
27.7 31.3 31.9 32.2 32.4 34.2 34.9 35.2 36.3 37.6 37.9 39.2 39.7 39.7 39.9
40.3 40.5 40.6 42.4 43.0 43.7 44.1 44.6 48.4 48.7 48.7 52.5 54.9 57.9 61.7
Notes: To construct a frequency distribution, typically we need to group the data into about four to eight intervals. In
looking over the sorted data, ranging from 27.7 to 61.7, it seems reasonable to use intervals of width 5 or 10 degrees.
Frequency Distribution:

Grouped in intervals of 10 degrees

30 Random Temperatures in January, 2006, Purcellville, VA
Temperature (degrees)   Frequency   Relative Frequency
19.95 - 29.95            1          .033
29.95 - 39.95           14          .467
39.95 - 49.95           11          .367
49.95 - 59.95            3          .100
59.95 - 69.95            1          .033
Total                   30          1.000

Grouped in intervals of 5 degrees

30 Random Temperatures in January, 2006, Purcellville, VA
Temperature (degrees)   Frequency   Relative Frequency
24.95 - 29.95            1          .033
29.95 - 34.95            6          .200
34.95 - 39.95            8          .267
39.95 - 44.95            8          .267
44.95 - 49.95            3          .100
49.95 - 54.95            2          .067
54.95 - 59.95            1          .033
59.95 - 64.95            1          .033
Total                   30          1.000

REMARKS: Both tables show that the temperatures are principally clustered in the 30’s and 40’s. Which
table is better? It’s really a toss-up; either one is fine. It’s not necessary to make more than one table. I am
showing two tables, just for illustration purposes.
If a table has very low frequencies for all of the intervals (say a frequency of 1-2 for each interval), or if
there are more than 10 intervals, that would be an indication that the interval width is too small. For
example, if each interval consisted of just one degree, then the frequency table for this temperature data
would have over 30 rows and that table would not be very informative, in terms of helping to see where
the data are clustered.
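For anyone who wants to check the counts, here is a small Python sketch that rebuilds both frequency tables from the sorted data. The function name `frequency_table` is made up for this example. Because the interval boundaries end in .95 and the data have one decimal place, no temperature can land exactly on a boundary.

```python
temps = [27.7, 31.3, 31.9, 32.2, 32.4, 34.2, 34.9, 35.2, 36.3, 37.6,
         37.9, 39.2, 39.7, 39.7, 39.9, 40.3, 40.5, 40.6, 42.4, 43.0,
         43.7, 44.1, 44.6, 48.4, 48.7, 48.7, 52.5, 54.9, 57.9, 61.7]

def frequency_table(data, start, width, n_bins):
    """Return (lower, upper, frequency, relative frequency) for each interval."""
    rows = []
    for i in range(n_bins):
        lower = start + i * width
        upper = lower + width
        freq = sum(1 for x in data if lower <= x < upper)   # count values in the interval
        rows.append((round(lower, 2), round(upper, 2), freq, round(freq / len(data), 3)))
    return rows

# Intervals of 5 degrees, starting at 24.95 (8 intervals)
for lower, upper, freq, rel in frequency_table(temps, 24.95, 5, 8):
    print(f"{lower:5.2f} - {upper:5.2f}   {freq:2d}   {rel:.3f}")
```

Calling `frequency_table(temps, 19.95, 10, 5)` gives the 10-degree grouping in the same way.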
Histogram
The histogram is a visual representation of the frequency distribution on the previous page, with the
temperatures grouped in intervals of 5 degrees.
The majority of temperatures fall between 34.95 and 44.95 degrees.
The histogram was generated with spreadsheet software. Your histogram does not have to be fancy. It can be hand-drawn or typed in plain text form. It is important that the scales and the labeling are clear and accurate.
Plain text histogram:
Temperatures in January, 2006 in Purcellville, Virginia
Frequency
9---|
    |                        8       8
8---|                    |XXXXXXX|XXXXXXX|
    |                    |XXXXXXX|XXXXXXX|
7---|                    |XXXXXXX|XXXXXXX|
    |                6   |XXXXXXX|XXXXXXX|
6---|            |XXXXXXX|XXXXXXX|XXXXXXX|
    |            |XXXXXXX|XXXXXXX|XXXXXXX|
5---|            |XXXXXXX|XXXXXXX|XXXXXXX|
    |            |XXXXXXX|XXXXXXX|XXXXXXX|
4---|            |XXXXXXX|XXXXXXX|XXXXXXX|
    |            |XXXXXXX|XXXXXXX|XXXXXXX|   3
3---|            |XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|
    |            |XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|   2
2---|            |XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|
    |        1   |XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|   1       1
1---|    |XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|
    |    |XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|XXXXXXX|
0---+----|-------|-------|-------|-------|-------|-------|-------|-------|
       24.95   29.95   34.95   39.95   44.95   49.95   54.95   59.95   64.95
                      Temperatures (Degrees Fahrenheit)
(NOTE: If typing in plain text, use a fixed width font, such as Courier New)
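If you would rather not type a plain text histogram by hand, a few lines of Python can produce one. This is a simplified sketch with one text row per unit of frequency, not a copy of the chart above; the function name `text_histogram` and the bar width are choices made for the example.

```python
freqs = [1, 6, 8, 8, 3, 2, 1, 1]                     # frequencies from the 5-degree table
edges = [round(24.95 + 5 * i, 2) for i in range(9)]  # 24.95, 29.95, ..., 64.95

def text_histogram(freqs, bar="XXXXXXX"):
    """Return the rows of a vertical bar chart, one text row per unit of frequency."""
    rows = []
    for level in range(max(freqs), 0, -1):
        # A bar is drawn on this row only if its frequency reaches the level.
        cells = [bar if f >= level else " " * len(bar) for f in freqs]
        rows.append(f"{level:>2} |" + "|".join(cells) + "|")
    # Horizontal axis with a tick mark at every interval boundary.
    rows.append("   +" + "+".join("-" * len(bar) for _ in freqs) + "+")
    return rows

print("\n".join(text_histogram(freqs)))
print(" " + "   ".join(f"{e:.2f}" for e in edges))    # boundary labels under the ticks
```

As with the typed version, the output only lines up in a fixed width font.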
[Figure: spreadsheet histogram, "30 Random Temperatures in January, 2006, Purcellville, VA". Horizontal axis: Temperature (degrees Fahrenheit), in 5-degree intervals from 24.95 to 64.95. Vertical axis: Frequency, 0 to 9. Bar heights: 1, 6, 8, 8, 3, 2, 1, 1.]
MEDIAN:
When the 30 data values are sorted, since 30 is even, the median is the average of the observations in the
middle, the average of the values in positions 15 and 16 in the sorted list.
27.7 31.3 31.9 32.2 32.4 34.2 34.9 35.2 36.3 37.6 37.9 39.2 39.7 39.7 39.9
40.3 40.5 40.6 42.4 43.0 43.7 44.1 44.6 48.4 48.7 48.7 52.5 54.9 57.9 61.7
Median = (39.9 + 40.3)/2 = 40.1 degrees.
SAMPLE MEAN = $\bar{x}$ = 1242.1/30 = 41.40 degrees = the sum of the temperatures, divided by the
sample size
Note that the mean is larger than the median. The histogram has a longer right "tail" compared to the left
end, due to a few relatively high temperatures. The mean is affected by the size of the highest
temperatures, but the median is not, so the mean is larger than the median.
RANGE = 61.7 - 27.7 = 34.0 degrees = the difference between the maximum and minimum
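The median, mean, and range above can be double-checked with Python's standard `statistics` module. This is just a verification sketch; the variable names are my own.

```python
import statistics

temps = [27.7, 31.3, 31.9, 32.2, 32.4, 34.2, 34.9, 35.2, 36.3, 37.6,
         37.9, 39.2, 39.7, 39.7, 39.9, 40.3, 40.5, 40.6, 42.4, 43.0,
         43.7, 44.1, 44.6, 48.4, 48.7, 48.7, 52.5, 54.9, 57.9, 61.7]

# With 30 values, the median is the average of the 15th and 16th sorted values.
median = statistics.median(temps)
mean = statistics.mean(temps)            # sum of the temperatures / sample size
data_range = max(temps) - min(temps)     # maximum minus minimum

print(f"Median = {median:.1f}, Mean = {mean:.2f}, Range = {data_range:.1f}")
```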
SAMPLE VARIANCE = 66.1417 (calculations shown on the next page; used a spreadsheet & pasted it in
the document)
SAMPLE STANDARD DEVIATION = s = 8.13 degrees (calculation shown on the next page)
Data within one standard deviation of the mean must fall in the interval
$(\bar{x} - s,\ \bar{x} + s) = (41.40 - 8.13,\ 41.40 + 8.13) = (33.27,\ 49.53)$
Data within two standard deviations of the mean must fall in the interval
$(\bar{x} - 2s,\ \bar{x} + 2s) = (41.40 - 2(8.13),\ 41.40 + 2(8.13)) = (25.14,\ 57.66)$
Data within three standard deviations of the mean must fall in the interval
$(\bar{x} - 3s,\ \bar{x} + 3s) = (41.40 - 3(8.13),\ 41.40 + 3(8.13)) = (17.01,\ 65.79)$
27.7 31.3 31.9 32.2 32.4 34.2 34.9 35.2 36.3 37.6 37.9 39.2 39.7 39.7 39.9
40.3 40.5 40.6 42.4 43.0 43.7 44.1 44.6 48.4 48.7 48.7 52.5 54.9 57.9 61.7
In the interval (33.27, 49.53), there are 21 temperatures, and 21/30 = 70.0%
In the interval (25.14, 57.66), there are 28 temperatures, and 28/30 = 93.3%
In the interval (17.01, 65.79), there are 30 temperatures, and 30/30 = 100.0%
So, 70.0% of the temperatures fall within one standard deviation of the mean, 93.3% of the temperatures
fall within two standard deviations of the mean, and 100% of the temperatures fall within three standard
deviations of the mean. For a bell-shaped distribution, the respective percentages are approximately 68%,
95%, and 99.7%. For the temperature data, the percentages are reasonably close to the bell-shaped model,
so yes, the data distribution is approximately bell-shaped.
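The counts within one, two, and three standard deviations can be verified with a short Python sketch. The helper name `share_within` is made up for this example, and it uses the unrounded mean and standard deviation rather than 41.40 and 8.13, which does not change the counts for this data.

```python
import statistics

temps = [27.7, 31.3, 31.9, 32.2, 32.4, 34.2, 34.9, 35.2, 36.3, 37.6,
         37.9, 39.2, 39.7, 39.7, 39.9, 40.3, 40.5, 40.6, 42.4, 43.0,
         43.7, 44.1, 44.6, 48.4, 48.7, 48.7, 52.5, 54.9, 57.9, 61.7]

mean = statistics.mean(temps)
s = statistics.stdev(temps)   # sample standard deviation (divides by n - 1)

def share_within(data, center, spread, k):
    """Count the values within k spreads of the center; return (count, fraction)."""
    lower, upper = center - k * spread, center + k * spread
    inside = sum(1 for x in data if lower <= x <= upper)
    return inside, inside / len(data)

for k in (1, 2, 3):
    count, fraction = share_within(temps, mean, s, k)
    print(f"Within {k} standard deviation(s): {count} of {len(temps)} = {fraction:.1%}")
```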
Calculation of sample variance and sample standard deviation:
Col 1    Col 2            Col 3       Col 4 = [Col 3]^2
Count    Temperature x    x - Mean    (x - Mean)^2
1 44.1 2.6967 7.2720
2 31.3 -10.1033 102.0773
3 39.7 -1.7033 2.9013
4 40.6 -0.8033 0.6453
5 39.7 -1.7033 2.9013
6 35.2 -6.2033 38.4813
7 42.4 0.9967 0.9933
8 39.9 -1.5033 2.2600
9 40.3 -1.1033 1.2173
10 39.2 -2.2033 4.8547
11 52.5 11.0967 123.1360
12 43.7 2.2967 5.2747
13 43.0 1.5967 2.5493
14 37.9 -3.5033 12.2733
15 54.9 13.4967 182.1600
16 48.4 6.9967 48.9533
17 57.9 16.4967 272.1400
18 34.9 -6.5033 42.2933
19 32.4 -9.0033 81.0600
20 37.6 -3.8033 14.4653
21 36.3 -5.1033 26.0440
22 44.6 3.1967 10.2187
23 40.5 -0.9033 0.8160
24 34.2 -7.2033 51.8880
25 32.2 -9.2033 84.7013
26 31.9 -9.5033 90.3133
27 27.7 -13.7033 187.7813
28 61.7 20.2967 411.9547
29 48.7 7.2967 53.2413
30 48.7 7.2967 53.2413
Sum of Col 2 = 1242.1; Sum of Col 4 = 1918.1097
Mean = 41.40333333 (divide Col 2 sum by 30)
Sample Variance = 66.14171264 (divide Col 4 sum by 29, one less than the sample size)
Sample Standard Deviation = 8.132755538 (sqrt of variance)
Note: The results of the calculations can be checked by using the spreadsheet functions var( ) and
stdev( ) in Excel. However, for the purposes of demonstrating understanding of the calculations,
you must show work similar to the table above.
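The same column-by-column calculation can be written out in Python, with the `statistics` module playing the role of the spreadsheet check. This is a sketch for verification only; the variable names are my own, and the temperatures are listed in the order they were collected.

```python
import math
import statistics

temps = [44.1, 31.3, 39.7, 40.6, 39.7, 35.2, 42.4, 39.9, 40.3, 39.2,
         52.5, 43.7, 43.0, 37.9, 54.9, 48.4, 57.9, 34.9, 32.4, 37.6,
         36.3, 44.6, 40.5, 34.2, 32.2, 31.9, 27.7, 61.7, 48.7, 48.7]

n = len(temps)
mean = sum(temps) / n                              # Col 2 sum divided by 30
deviations = [x - mean for x in temps]             # Col 3: x - Mean
squared_devs = [d ** 2 for d in deviations]        # Col 4: (x - Mean)^2
variance = sum(squared_devs) / (n - 1)             # divide by one less than the sample size
std_dev = math.sqrt(variance)                      # square root of the variance

print(f"Sample variance = {variance:.4f}, sample standard deviation = {std_dev:.2f}")

# Cross-check against the library functions, which also divide by n - 1.
assert math.isclose(variance, statistics.variance(temps))
assert math.isclose(std_dev, statistics.stdev(temps))
```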
CONCLUSION
In January, 2006 in Purcellville, Virginia, the 30 sampled temperatures fell between 27.7 and
61.7 degrees, for a range of 34 degrees. Temperatures tended to be concentrated in the upper
30’s and low 40’s, as shown in the histogram.
The median temperature is 40.1° and the mean temperature is 41.4°, with standard deviation
8.13°. The temperature data distribution is approximately bell-shaped.
As mentioned at the beginning of this report, January of 2006 seemed to be unusually warm. The
analysis in this project agrees with this conjecture. In looking at the website www.weather.com, I
found that the average daily HIGH temperature for January (in any year) in Purcellville is 42
degrees. My analysis found an average of ALL sampled temperatures (not merely the daily
highs) to be 41.4, not much below the typical daily high.
[Remark: The average of the data, 41.4, is a statistic – it is the average temperature for the sample. It is possible that
the average of all January temperature readings is somewhat different. If we were familiar with the techniques of
inferential statistics, we could assess whether we can take this statistic and use it in making a statistical inference.]
FINAL REMARKS: This sample project could have been done without the use of a spreadsheet or fancy
software, if the frequency distribution and histogram were carefully hand-drawn or typed. I have added
considerable commentary to the project items, to indicate what I was thinking about when completing the tasks. You
can be less “wordy,” but be sure that your work and summary are detailed and informative, and you show
calculations as requested.
Robert answered on Dec 23 2021
INTRODUCTION
In this paper, I will analyze the job satisfaction of employees working at Company XYZ Ltd. The
data set has been collected from the American Intellectual Union to study job satisfaction. The
main purpose of this study is to see how much satisfaction an employee has working with this
company by way of intrinsic and extrinsic factors. For the purpose of analysis, we will analyze
several variables related to overall job satisfaction, such as intrinsic factors, extrinsic
factors, benefits to employees and the salaries given to them. Based upon these variables, I will
study the descriptive statistics and analyze them, along with a visual representation of
the most significant ones.
DATA AVAILABILITY
In this section, I will give a brief description of the variables that I am going to study and
analyze. The data set consists of 30 observations from XYZ Ltd. The variables under
consideration are Gender (1 = Male, 2 = Female), Age (1 = 21 and under, 2 = 22-49 years and 3
= 50 and over), Department (1 = Human resources, 2 = Information technology and 3 =
Administration), Position (1 = Hourly employee and 2 = Salaried employee), Tenure with
Company (1 = less than 2 years, 2 = 2 to 5 years and 3 = Over 5 years), Overall Job Satisfaction
(1 = least satisfied and 7 = most satisfied), Intrinsic (1 = least satisfied and 7 = most satisfied),
Extrinsic (1 = least satisfied and 7 = most satisfied), Benefits (1 = least satisfied and 7 = most
satisfied) and Salary ($ ‘000).
Data Analysis
With the variables defined, I will analyze them one by one and see how they
help in supplementing company performance. For this purpose, I will use both techniques:
tabular and graphical representation. The most distinguishing aspect of my study is that
I will discuss all the significant variables for males and females separately.
Among the listed variables, we will first look at the Gender distribution and its impact on
job satisfaction. Since gender is a qualitative variable, a statistical measure of
central tendency is not appropriate. Gender is coded as 1 (male) or 2 (female). From the table...