Winter 2023 MATH 341/345 Project: Deployments of Safety Cars in Formula One in 2010-2019 (Version 1. February 5, 2023)

Submit two things (can be submitted as a single file):

  • Your informative tentative project title.
  • Your answers to the Probability Questions (they are in the file provided).

Introduction: This project aims at modeling the frequencies of safety car deployments per race in Formula One and the time intervals between safety car deployments in 2010-2019.

A safety car in Formula One is deployed when the “yellow flags” are waved by the marshals and the Race Director decides that it is necessary to remove any hazards on the race track or that the racing cars need to slow down due to unfavorable track conditions (e.g., heavy rain). When a safety car is deployed, in addition to the yellow flags, each driver sees “SC” boards on the sides of the track. Moreover, the same information is displayed on the steering wheel of each racing car. Safety cars and yellow flags are important components of Formula One racing to protect drivers’ and marshals’ lives. When the safety car is leading the race, each racing car needs to bunch up and follow the safety car without overtaking any other cars, unless they are allowed to unlap themselves. As the safety car goes around the track at a much slower speed than the normal racing pace, marshals can quickly remove any hazards on the track and improve the track condition without worrying about fast-moving racing cars.

However, even with strict regulations under the yellow flag condition, accidents happen, especially during wet weather races. A notable recent incident happened at the 2014 Japanese Grand Prix, when a very promising young French driver, Jules Bianchi of Marussia, collided with a tractor crane under the “double yellow flag” condition. A “double yellow flag” condition indicates that marshals may be present on the track and the driver needs to prepare to stop, if necessary. Bianchi lost control of the car due to aquaplaning on the wet surface and suffered a fatal injury as a result of the collision with the tractor crane.
The FIA (governing body of the Formula One races) took the incident very seriously and implemented a number of safety measures. One of them is the introduction of the “virtual safety car (VSC)”. Under the VSC condition, each driver needs to slow down their car to the posted speed limit, usually resulting in a 35 to 40% speed reduction. Because it is a “virtual” safety car, under VSC, the actual safety car is not deployed; rather, each racing car is equipped with a device which automatically slows it down to the posted speed limit under VSC. Even with the introduction of VSC in 2015, under severe conditions, safety cars are deployed once in a while.

Here, an interesting question arises: Did the introduction of VSC change the frequency of safety car deployments? This is an important question to answer for race strategists, as the deployment of a safety car means that each team needs to react quickly to adjust their tire strategies. Each driver is required to make at least one pit stop to change their tires during the race, and a pit stop under the safety car condition implies that they can save about 20 seconds, possibly gaining several precious positions in the race without overtaking. At the same time, fresh tires typically make the racing car more drivable, increasing the chances of catching and overtaking the other racing cars in front after the pit stop.

Related article: https://www.mclaren.com/racing/2019/canadian-grand-prix/how-make-right-call-safety-car/
Note that the importance of understanding probability is emphasized in this article.

Your Tasks in This Project: Your main task in this project is to analyze the safety car deployment data in Formula One to determine whether there are any changes in the frequency of safety car deployments between the pre-VSC era (2010-2014) and post-VSC era (2015-2019).
That involves fitting reasonable distribution(s) to the data for the number of safety car deployments per race and time intervals between the safety car deployments in these two time periods. Then, by comparing these two distributions, you are asked to conclude whether strategic adjustments were necessary to account for increased/decreased safety car deployments after VSC was introduced in 2015.

The dataset was originally retrieved from Kaggle (https://www.kaggle.com/datasets/jtrotman/formula-1-race-events), but it was further augmented by adding Type, Round, TotalRounds, TotalLaps, and Condition. These additional pieces of information were taken from the Wikipedia entries for the Formula One races.

A thorough and complete analysis of the main task above is sufficient to receive full credit for this project. That is, you are not required to do any additional programming beyond what is given unless you choose to do so. However, you are probably interested in doing a more detailed analysis of the dataset to make your analysis useful and interesting for the participating Formula One teams. To help you analyze the dataset in more detail, the dataset provided (augmented_safety_cars.csv) contains additional information such as type of the circuit (permanent or street) and track condition (dry, mixed, or wet). In addition, you will be asked to watch an interesting video titled “What Does An F1 Strategist Do?” (https://youtu.be/4CFkltWIc8o) so that you can see what Formula One strategists actually do before, during, and after each race. At the same time, you will see how they interact with racers, mechanics, race engineers, data analysts, and team principals.

How This Project Works: This project consists of three parts: Probability Questions, Statistics Questions, and project write-up. For the Probability and Statistics Questions, you need to answer the questions given below. For the project write-up, you may choose to summarize the results based on the R code given.
However, to make the project more interesting, you are encouraged to carry out additional analysis. If you find anything interesting, you may choose to write about your interesting finding(s) instead. To make sure that what you decide to write in your write-up is appropriate, please talk to the instructor before you do anything. The instructor will be happy to assist you with additional programming if necessary.

Probability Questions (12 Points in Total):
1. Watch “What Does An F1 Strategist Do?” (https://youtu.be/4CFkltWIc8o) and describe how the Formula One strategist position is related to your major(s) in a paragraph or two. Note: Everyone on your team needs to write a separate paragraph or two. (2pts)
2. Suppose that you look at each of $n$ different laps in Formula One races. Why is checking whether or not each of these laps was led by a safety car a binomial experiment? (2pts)
3. Why is it reasonable to assume that the number of safety car deployments in a fixed period of time (e.g., five seasons) follows the Poisson distribution (approximately)? Recall the relationship between the binomial and Poisson distribution, and state what happens to $n$ (the number of laps) and $p$ (the probability that each lap is led by a safety car). (2pts)
4. Why is it reasonable to assume that the time intervals between safety car deployments are (approximately) exponentially distributed? (2pts)
5. Suppose that we consider two time periods of Formula One racing (2010 – 2014 and 2015 – 2019). Is it safe to assume that the number of safety car deployments in each of these two time periods is independent of each other? In other words, is it reasonable to say that the number of safety car deployments in 2010 – 2014 does not significantly influence the number of safety car deployments in 2015 – 2019? Justify. (2pts)
6. Recall the memoryless property of the exponential distribution, which says that $P(X \ge t_1 + t_2 \mid X \ge t_1) = P(X \ge t_2)$, $t_1 \ge 0$, $t_2 \ge 0$, if and only if $X$ is exponentially distributed. What does this imply regarding the probability that the next safety car deployment is 5 races from now given that it has been 3 races since the last safety car deployment? Comment. (2pts) Note: The above phenomenon is known as “the waiting time paradox”.
7. (Optional) Any questions you have about this project.

Statistics Questions (Read W23MATH341Project.R and run the program to answer these questions. Look for “SQ” in the comments in the R code to identify which part of the code is referring to which question.) (20 Points in Total):
Note: The length of each race is set to 1, which is reasonable given that each race has approximately the same race distance. According to the data, the first safety car deployment in 2010 occurred at lap 2 of Round 2 (which was a 58-lap race) and the second deployment occurred at lap 1 of Round 4 (which was a 56-lap race). Thus, the first duration is simply (Round 1) + (Deployment in Round 2) = 1 + 2/58 = 1.034483. Then, the second duration, which is the time between these two deployments, is given by (Remaining laps in Round 2) + Round 3 + (Deployment in Round 4) = (1 - 2/58) + 1 + 1/56 = 1.983374.
1. Look at the histograms of the number of safety car deployments per race. Do these histograms suggest that the data are Poisson distributed (approximately)? Or, is there any clear evidence against that? Comment. (2pts)
2. The best-fit Poisson pmfs, as represented by the blue dotted lines, use lambda=mean(first_half) and lambda=mean(second_half) for the first and second half of the 2010’s, respectively. Explain why it makes sense to use these values. (2pts)
3. Look at the histograms of the time intervals between two safety car deployments. Do these histograms suggest that the data are exponentially distributed (approximately)? Or, is there any clear evidence against that? Comment. (2pts)
4. The best-fit exponential pdfs, as represented by the blue dotted lines, use rate=1/mean(interval1) and rate=1/mean(interval2) for the first and second half of the 2010’s, respectively. Explain why it makes sense to use these values. (2pts)
5. Report mean(first_half), mean(interval1), mean(second_half), and mean(interval2). Then, describe how mean(first_half) and mean(interval1), as well as mean(second_half) and mean(interval2), are approximately related to each other. After that, explain why that happens by recalling the distributions you identified for the number of safety car deployments and the time interval between two safety car deployments. (2pts)
6. Running a two-sample t-test for comparing means or constructing a confidence interval for the difference in means using the time interval data may potentially lead to wrong results. Explain why in terms of normality and independence. (2pts)
7. Explain why the concerns you mentioned in the previous question are actually not concerning for this dataset. (2pts)
8. The t.test() function in R gives the one- and two-sample t-test results for the mean or difference in means, including the confidence intervals and p-values. The parameter var.equal in the t.test() function specifies whether or not the common variance can be assumed (if yes, TRUE, and otherwise, FALSE). For comparing the time intervals, can we assume common variance? Comment. Recall that the mean and standard deviation are equal to each other in the case of exponential distribution. (2pts)
9. Report the results of the t.test() function (95% confidence interval, degrees of freedom used, and p-value) for the var.equal=TRUE and var.equal=FALSE cases. (2pts)
10. Based on the results above, discuss whether or not there is any statistically significant change in the distribution of the safety car deployments between these two time periods. (2pts)
11. (Optional) The Kolmogorov-Smirnov test is a one- and two-sample test that directly compares the cumulative distribution function(s) of the data. In the one-sample case, a researcher hypothesizes the underlying distribution and sees how well the cumulative distribution function (cdf) estimated from the data (known as the empirical cdf) matches that of the hypothesized distribution. In the two-sample case, the two empirical cdf’s are directly compared. Do the test results show any evidence of deviation from the exponential distribution for the time interval data? Also, are these two datasets significantly different from each other? Justify your conclusion by reporting the p-values and interpreting these p-values. (Extra credit: 1pt)
12. (Optional) The quantile-quantile (Q-Q) plot is a visual tool to see if the dataset of interest follows a certain distribution. Although the Q-Q plot is typically used for the normal distribution, for this project, we use the Q-Q plot for the exponential distribution. If the points on the Q-Q plot follow a straight line, that is an indication that the dataset follows the exponential distribution well. Present the Q-Q plots for the time interval datasets (pre- and post-VSC) and comment. (Extra credit: 1pt)
13. (Optional) Another important aspect of the dataset is the independence of the observations. A common assumption
Answered 2 days after Feb 28, 2023


Banasree answered on Mar 02 2023
Probability Question:
1.Ans.
As a mechanical engineering major, I find the role of an F1 strategist particularly intriguing. The position requires a strong understanding of the technical aspects of racing as well as strategic thinking and quick decision-making skills. My coursework has covered a variety of topics related to the design and operation of racing vehicles, which is essential to understanding the data that the strategist must analyze and interpret during a race. Additionally, the ability to think critically and make quick decisions is a skill that I have honed throughout my coursework and co-curricular activities. The role of an F1 strategist requires a unique combination of technical knowledge and strategic thinking, making it a fascinating career option for those with a background in mechanical engineering.
As a statistics major, I find the role of an F1 strategist particularly interesting because of the importance of data analysis in the decision-making process. The ability to analyze and interpret large amounts of data is crucial to making informed decisions as a strategist. Additionally, statistical modeling can help predict the frequency of safety car deployments and the time intervals between deployments, which can be used to inform strategic decisions during the race. The role of an F1 strategist requires a unique combination of technical knowledge, strategic thinking, and statistical analysis, making it a fascinating career option for those with a background in statistics.
2.Ans.
Checking whether or not each of the n laps was led by a safety car can be considered a binomial experiment because it has the following characteristics:
1. The experiment consists of a fixed number of trials, n, which is the number of laps examined.
2. Each trial has only two possible outcomes: either the lap was led by a safety car or it was not.
3. The outcomes of the trials are assumed to be independent of each other: whether one lap was led by a safety car does not affect the likelihood that another examined lap was. This is most plausible when the laps are drawn from different races, since a safety car usually stays out for several consecutive laps.
4. The probability of success (i.e., the probability of a lap being led by a safety car), p, is the same for each trial.
Under these conditions, the binomial distribution gives the probability that a certain number of the n laps were led by a safety car.
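As a rough illustration of the last point, the sketch below evaluates the binomial pmf in Python (the project itself uses R). The values of `n` and `p` are hypothetical and chosen only for illustration; they are not estimated from the dataset.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical values for illustration only (not taken from the dataset).
n = 20    # number of laps examined
p = 0.05  # assumed probability that a given lap is led by a safety car

prob_none = binomial_pmf(0, n, p)          # no examined lap led by a safety car
prob_at_least_one = 1 - prob_none          # at least one such lap
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))  # should sum to 1

print(f"P(no safety-car laps)       = {prob_none:.4f}")
print(f"P(at least one such lap)    = {prob_at_least_one:.4f}")
print(f"Sum of pmf over k = 0..{n}  = {total:.4f}")
```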
3.Ans.
The Poisson distribution is related to the binomial distribution in the following way: when the number of trials n in a binomial experiment (here, the number of laps examined) becomes very large and the probability of success p in each trial (here, the probability that a lap is led by a safety car) becomes very small, with the product np held at a fixed value λ, the binomial distribution converges to the Poisson distribution with mean λ = np.
In the case of safety car deployments in Formula One, the number of laps run over a fixed period such as five seasons (i.e., the number of trials n) is very large, and the probability p that any particular lap is led by a safety car is small. Therefore, if we consider the number of safety car deployments over a fixed period of time (e.g., five seasons), it is reasonable to approximate the distribution of the number of safety car deployments by the Poisson distribution with the mean parameter equal to the product of the total number of laps in the five seasons and the probability of a lap being led by a safety car.
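The convergence described above can be checked numerically. The following is a minimal Python sketch (not part of the provided R script) that holds λ = np fixed at a hypothetical value and compares the binomial and Poisson pmfs as n grows; the value of `lam` and the helper `max_abs_gap` are assumptions made for illustration.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events when the mean is lam."""
    return exp(-lam) * lam**k / factorial(k)


def max_abs_gap(n: int, lam: float, k_max: int = 10) -> float:
    """Largest pmf difference over k = 0..k_max, with p = lam / n."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )


lam = 2.0  # hypothetical mean count, held fixed while n grows and p shrinks
gaps = {n: max_abs_gap(n, lam) for n in (20, 200, 2000)}

for n, gap in gaps.items():
    print(f"n = {n:5d}, p = {lam / n:.4f}, max |binomial - Poisson| = {gap:.6f}")
```

The gap shrinks as n increases and p = λ/n decreases, which is the sense in which the Poisson distribution approximates the binomial.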
4.Ans.
It is reasonable to assume that the time intervals between safety car deployments are (approximately) exponentially distributed for several reasons. Firstly, the occurrence of safety car deployments is a random and unpredictable event, and the exponential distribution is often used to model the time between occurrences of rare events. Secondly, the exponential distribution has a memoryless property, which means that the probability of a safety car being deployed in a given time interval is independent of the time since the last deployment. This property is often observed in situations where events occur randomly and independently over time. Thirdly, if the number of deployments in a fixed period follows a Poisson distribution with a constant rate, then the waiting times between consecutive deployments are exponentially distributed with that same rate. Together, these points make the exponential distribution a natural choice for modeling the time between safety car deployments in Formula One races.
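The link between Poisson counts and exponential waiting times can be written out in a few lines. The derivation below is a sketch that assumes deployments follow a Poisson process with a constant rate $\lambda$ per race; the symbols $N(t)$ (number of deployments in a window of length $t$) and $T$ (waiting time until the next deployment) are introduced here for illustration.

```latex
% Assumption: N(t) ~ Poisson(\lambda t) for every window of length t >= 0.
\begin{align*}
P(T > t) &= P\bigl(N(t) = 0\bigr)
          = \frac{(\lambda t)^0 e^{-\lambda t}}{0!}
          = e^{-\lambda t}, \\
F_T(t)   &= 1 - e^{-\lambda t}, \qquad
f_T(t)    = \lambda e^{-\lambda t}, \qquad t \ge 0, \\
E[T]     &= \frac{1}{\lambda}.
\end{align*}
```

Waiting longer than $t$ for the next deployment is the same event as seeing no deployment in a window of length $t$, so $T$ has the exponential cdf, and its mean is the reciprocal of the Poisson rate.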
5.Ans.
It is not safe to assume that the number of safety car deployments in each of the two time periods (2010-2014 and 2015-2019) is independent of each other. There are many factors that can affect the number of safety car deployments in a given time period, including changes in track conditions, changes in the rules and regulations, and changes in the behavior of the drivers. Some of these factors may have carried over from one time period to another, making it difficult to assume independence between them.
For example, the introduction of the virtual safety car (VSC) in 2015 could have affected the number of safety car deployments in the 2015-2019 time period. Incidents that previously required a full safety car may now be handled under VSC, leading to fewer safety car deployments overall. Alternatively, changes in driving behavior after 2015 could have led to more incidents and therefore more safety car deployments. Strictly speaking, effects like these change the deployment rate from one period to the next rather than make the two counts statistically dependent, so independence is best treated as a modeling assumption to be checked. Either way, it is important to analyze the two periods separately to understand the underlying trends and factors that influence safety car deployments.
6.Ans.
For illustration, suppose that the average time between safety car deployments is 10 races, which means that the rate parameter of the exponential distribution is λ = 1/10.
Using the memoryless property of the exponential distribution, we can calculate the probability that the next safety car deployment is at least 5 races from now given that it has been 3 races since the last safety car deployment:
P(X > 8 | X > 3) = P(X > 5)
where X is the time between safety car deployments.
Since X follows an exponential distribution with λ = 1/10, we can find P(X > 5) as follows:
P(X > 5) = e^(-5λ) = e^(-1/2) ≈ 0.6065
Therefore, the probability that the next safety car deployment is at least 5 races from now given that it has been 3 races since the last safety car deployment is approximately 0.6065. This is the same probability we would obtain immediately after a deployment: the 3 races that have already passed without a safety car do not make the next deployment any more or less imminent.
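The calculation above can be verified with a short Python sketch. It reuses the answer's illustrative rate of 1/10 per race, which is an assumption rather than an estimate from the data; the helper name `survival` is also introduced here for illustration.

```python
from math import exp


def survival(t: float, rate: float) -> float:
    """P(X > t) for an exponential distribution with the given rate."""
    return exp(-rate * t)


rate = 1 / 10  # assumed rate: a mean gap of 10 races between deployments
elapsed = 3    # races since the last deployment
ahead = 5      # additional races we ask about

# Conditional probability from its definition: P(X > 8) / P(X > 3).
conditional = survival(elapsed + ahead, rate) / survival(elapsed, rate)
# Unconditional probability of waiting more than 5 races from scratch.
unconditional = survival(ahead, rate)

print(f"P(X > 8 | X > 3) = {conditional:.4f}")
print(f"P(X > 5)         = {unconditional:.4f}")
```

Both printed values agree, which is the memoryless property: conditioning on the 3 races already elapsed leaves the probability unchanged.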
7.Ans.
No.
Statistical Questions:
SQ - This code is an R script that analyzes safety car deployments in Formula 1 races during the 2010s. The script loads a dataset of augmented safety car deployments (including additional information such as type of race track, lap numbers, and conditions), extracts data for the 2010s, and...