Report (submitted as a group if working with collaborators): written presentation of your work, with supporting figures. The report should include six sections, associated with I-VI above, with full descriptions for each distribution definition, plot, and calculated values. Use an easy-to-follow format with proper grammar. Make sure figures are properly labeled, captioned, and referenced in the text. Practice conciseness by finding an optimal balance between rigorousness and brevity. Submit your MATLAB (or other visualization tool) code with your submission.


Probability and Statistics for Engineers
Project #1 Description

This project is the primary assessment mechanism in IE342 for the Probability portion of the course. It is the culmination of our study of Probability Theory over Modules 2-9, which included the following topics:
· sample spaces,
· events,
· additive rules of probability,
· conditional probability,
· Bayes' Theorem,
· random variables,
· general properties of probability distributions,
· expected value and other mathematical expectation definitions,
· joint probability distributions,
· covariance,
· correlation,
· linear combinations of random variables,
· Chebyshev's Theorem,
· parameterized discrete probability distributions (specifically the binomial distribution),
· the normal distribution, and
· parameterized asymmetric continuous probability distributions.

After a half-semester focused on these topics, you are now tasked with applying the concepts of Probability Theory to real-world populations.

Task #1: Probability Distributions for Real-World Populations

I. Define Random Variables

Pick one or more real-world systems or scenarios where a population can be modeled with four random variables that have the following characteristics:
· X: a random variable with a normal distribution
· Y: a continuous random variable with an asymmetric distribution (e.g., exponential or lognormal)
· V: a random variable with a uniform distribution (could be discrete or continuous)
· W: a binomial random variable with large enough n such that it can be accurately modeled using a normal approximation

Keep in mind that this is an exercise in modeling, and no model is perfect. Thus, you will need to make assumptions about the nature of the population and how you are quantifying observations to define the random variables. This project does NOT require any real data associated with the population of choice. Instead, assign distribution types and parameters based on your understanding of the system.
Data may be used to justify parameter choices, but it is not required. Discuss how probabilistic modeling is useful for the system of choice.

Here are example definitions related to winter weather in Chicago (DO NOT USE THESE DEFINITIONS FOR YOUR PROJECT, THE BLUE FONT IS USED FOR EXAMPLES):
· X: daily high temperature during winter in Chicago
· Y: total daily snowfall, in inches, during winter in Chicago
· V: the day, indexed from the start of winter, with the largest snowfall amount
· W: number of winter days with nonzero snowfall

Within these sample random variable definitions, modeling V with a uniform distribution likely has some inconsistencies with reality, in that there are systematic date preferences for higher/lower snowfall amounts. However, it is okay to assume that a uniform distribution model captures the overall tendencies well.

Some other ideas for real-world systems that would work nicely for the probabilistic modeling for this project include:
· COVID-19: number of confirmed cases, hospitalizations, deaths, etc. across different states, countries, or worldwide and potentially through time; efficacy of COVID tests, antibody tests, or vaccines; adherence rates for public health guidance, etc.
· Federal, State, and/or Local Elections: voting rates and preferences for various demographic groups or through time
· Public Health, Poverty, and Social Justice: use U.S. Census or WHO data to study matters related to housing, economic conditions, public health, racial inequalities, etc.
· NASA Exoplanet Exploration: exoplanet radius, star radius, distance to its star, equilibrium temperature, orbit period, habitability for life, number of exoplanets per stellar system, etc.
· Sports: world record marathon times, time between goals in soccer, most common score in basketball, etc.
· Etc.: find something that you are passionate about; there is a great deal of flexibility here, so you should be able to make it work with just about any real-world system or scenario of interest to you.

II. Set Population Parameters

With the four random variables set, choose realistic values for the population parameters. Provide references and justification for all parameter choices. For the example random variables above, a quick internet search leads to some reasonable choices:
· μ_X = 35℉ (https://www.currentresults.com/Weather/Illinois/Places/chicago-temperatures-by-month-average.php),
· σ_X = 10℉ (a reasonable choice based on my experience living in Chicagoland),
· ?? = 0.12 (https://www.currentresults.com/Weather/Illinois/Places/chicago-snowfall-totals-snow-accumulation-averages.php),
· n_W = 91 (number of winter days ≈ 365/4),
· etc.

III. Produce Distribution Plots

Define the probability distributions and generate distribution plots for all four random variables. The plot for W should include both the binomial distribution AND the normal curve for the approximation. It is strongly recommended that you use MATLAB for this task, using the tools developed on the HW assignments.

IV. Define a Joint Probability Distribution

Choose two of the random variables from the list of four to produce a joint probability distribution. Consider the dependency of the two random variables. If the two random variables can be reasonably assumed to be independent, then provide justification for the assumption, and quickly get to a joint probability distribution (this is the straightforward approach). Otherwise, build a joint distribution that reasonably models the joint nature of the two random variables (this is more complicated, but is likely to produce a more accurate model). Again, provide justification for the distribution form applied.

V. Calculate Meaningful Probabilities and Expected Values

Use the probability distribution definitions to calculate at least two meaningful probabilities for each random variable AND two meaningful joint probabilities. Thus, at least 10 probability calculations are required. Use your plots to help visualize the probability values (i.e., show the areas under the curves). For example, using the Chicago weather random variable definitions from above, it would be interesting to know the following:
· P(26 < X < 35), as that relates to the likelihood of having icy road conditions
· P(Y > 3), as at least 3 inches of snow are needed for sledding
· P(X < 25, Y > 5), as that gives the likelihood of a cold, snowy day where the snow may stay awhile
· P(W > 20), as 20 snow days was regular in the 1980s, and winter 2021 had that many

Additionally, calculate and report the mean and variance for each random variable.

Lastly, define a fifth random variable as a meaningful linear combination of two (or more) of the original random variables. A convenient choice would be the two random variables from part IV. Then, calculate and report the mean and variance of the new random variable. For example, using the Chicago weather random variable definitions, define U = 32 − X + 5Y as a measure of the intensity of winter weather, where the colder and snowier days produce larger values of U, with an inch of snowfall having equal impact to a 5 degree Fahrenheit drop in temperature.

VI. Discussion of Results and Conclusions

Summarize your calculations and discuss the overall validity of the models. Do the probabilities and expected values match what would be expected for the real-world system? Where do the models work well and where is their accuracy limited, for each random variable? Are the potential issues related to the parameter values, the assumed form of the distribution, or both? Suggest potential improvements to the models.
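The Part V calculations for the Chicago example can be sketched in a few lines. This is only an illustration, written in Python's standard library rather than MATLAB. The values for X (mean 35℉, standard deviation 10℉) come from the example parameters in Part II; the exponential model for Y, its mean `MEAN_Y`, and the independence of X and Y are assumptions made here for illustration and are not given in the project description.

```python
from math import exp
from statistics import NormalDist

# From the Chicago example: X ~ Normal(mean 35 F, standard deviation 10 F).
MU_X, SIGMA_X = 35.0, 10.0
# ASSUMPTION (hypothetical, not from the source): Y ~ exponential, mean in inches.
MEAN_Y = 0.5


def prob_x_between(lo, hi):
    """P(lo < X < hi) for the normal model of daily high temperature."""
    x = NormalDist(MU_X, SIGMA_X)
    return x.cdf(hi) - x.cdf(lo)


def prob_y_above(y):
    """P(Y > y) for an exponential model: exp(-y / mean)."""
    return exp(-y / MEAN_Y)


def joint_cold_and_snowy(x_max, y_min):
    """P(X < x_max, Y > y_min), ASSUMING X and Y are independent."""
    return NormalDist(MU_X, SIGMA_X).cdf(x_max) * prob_y_above(y_min)


def linear_combination_moments():
    """Mean and variance of U = 32 - X + 5Y, ASSUMING X and Y are independent."""
    mean_u = 32 - MU_X + 5 * MEAN_Y
    # Var(aX + bY) = a^2 Var(X) + b^2 Var(Y); an exponential has Var(Y) = mean^2.
    var_u = (-1) ** 2 * SIGMA_X ** 2 + 5 ** 2 * MEAN_Y ** 2
    return mean_u, var_u


if __name__ == "__main__":
    print("P(26 < X < 35) =", round(prob_x_between(26, 35), 4))
    print("P(Y > 3)       =", prob_y_above(3))
    print("P(X<25, Y>5)   =", joint_cold_and_snowy(25, 5))
    print("E[U], Var(U)   =", linear_combination_moments())
```

If X and Y are not independent, the product rule in `joint_cold_and_snowy` no longer holds and the variance of U needs an additional covariance term, 2(−1)(5)Cov(X, Y).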
Lastly, discuss how to design a statistical experiment that could be implemented to test the population parameters. How can you ensure a random sample?

Deliverables

I. Report (submitted as a group if working with collaborators): written presentation of your work, with supporting figures. The report should include six sections, associated with I-VI above, with full descriptions for each distribution definition, plot, and calculated values. Use an easy-to-follow format with proper grammar. Make sure figures are properly labeled, captioned, and referenced in the text. Practice conciseness by finding an optimal balance between rigorousness and brevity. Submit your MATLAB (or other visualization tool) code with your submission.

II. Video Highlight (submitted individually): 2-3 minute video presentation of ONE probabilistic model and ONE associated probability calculation. Approach this as a summary highlight of ONE aspect of your work presented to your boss or a client in a limited timeframe. Your highlight should include the following:
· Introduce the real-world system being studied in your overall work and how probabilistic modeling is useful for that system.
· Define the ONE random variable you have chosen to highlight and justify the distribution type chosen.
· Present the distribution using a visualization (i.e., a distribution plot) and relevant parameters for the chosen random variable, with justification for each.
· Explain all steps of the ONE probability calculation you have chosen to highlight.
· Discuss the significance of the probability value calculated and the relevance of the probabilistic model chosen for the random variable and the real-world system.

Prepare a short (1-3 slides) PowerPoint presentation to organize the summary highlight discussion. Then, record yourself presenting the slides as an individual. Record the video using Panopto, Zoom, or any other method used for Quizzes all semester.
If working with a partner or group on this project, each member must choose a different random variable to highlight, so some coordination is required here. Submit both the presentation slides and a link to the video recording.

Submitting Work to Gradescope

There are two separate Gradescope submissions for this project, one for each of the deliverables above. The Report is submitted as a group, so there is only one Gradescope submission per group. One group member should submit all documents on the submission item. Only after the submission is complete can you add the other group members to your submission. The Video Highlight must be submitted individually.

Project #1 Deadline and Grading Policies

All Project #1 Deliverables are due to the Project #1 submission item on Gradescope by Monday, 11/14, 11:59 pm. You are encouraged to get an early start on Project #1 during Week 10 and Week 11, and then use Week 12 to complete your work. Note that there is no HW or Quiz for Week 12, and there is no lecture that Tuesday due to the Election Day holiday. Late submissions are penalized 20 points per day late, with no submissions accepted after Friday, 11/18, 11:59 pm. The grading rubric for Project #1 is provided on the next page.
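Part III asks for the binomial distribution of W to be shown together with its normal approximation. The numbers behind such a plot can be sketched as follows, again in Python's standard library rather than MATLAB. The value n = 91 comes from the Chicago example; the success probability `P_SNOW` is a hypothetical value chosen only for illustration, and the continuity correction of 0.5 is one common convention rather than something the project description prescribes.

```python
from math import comb, sqrt
from statistics import NormalDist

N_DAYS = 91    # number of winter days, from the Chicago example
P_SNOW = 0.3   # ASSUMPTION (hypothetical): probability of nonzero snowfall on a day


def binom_pmf(k, n, p):
    """Exact binomial probability P(W = k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def exact_tail(k, n, p):
    """Exact P(W > k), summing the binomial probabilities above k."""
    return sum(binom_pmf(j, n, p) for j in range(k + 1, n + 1))


def normal_approx_tail(k, n, p):
    """Approximate P(W > k) with Normal(np, np(1-p)) and a continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mu, sigma).cdf(k + 0.5)


if __name__ == "__main__":
    # Rule of thumb for the approximation: np and n(1-p) both reasonably large.
    print("np =", N_DAYS * P_SNOW, " n(1-p) =", N_DAYS * (1 - P_SNOW))
    print("exact  P(W > 20) =", exact_tail(20, N_DAYS, P_SNOW))
    print("approx P(W > 20) =", normal_approx_tail(20, N_DAYS, P_SNOW))
```

Evaluating `binom_pmf` for k = 0, ..., n gives the bar heights for the binomial plot, and the density of `NormalDist(mu, sigma)` gives the overlaid curve.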
Project #1 GRADING RUBRIC

All criteria are graded using this scale:
· Excellent [5 points]: Criterion is fully satisfied and/or demonstrates top-level critical thinking
· Very Good [4 points]: Criterion is mostly satisfied and/or demonstrates a high level of critical thinking
· Acceptable [3 points]: Criterion is partially satisfied and/or demonstrates some critical thinking
· Minimal [2 points]: Criterion is not satisfied and/or demonstrates a low level of critical thinking
· Missing [0 points]: Criterion is not addressed or is missing completely

Criteria / Score
1. Report – Define RVs: introduction to real-world system or scenario, four RVs clearly defined
2. Report – Population Parameters: realistic values, justification and/or references provided for each
3. Report – Distribution Plots: mathematical definitions for all RV distributions are presented
4. Report – Distribution Plots: visualize all RV distributions, corresponding descriptions highlight key features
5
Answered 3 days after Oct 28, 2022


Banasree answered on Oct 31 2022
45 Votes
1. Ans.
I.)
A random variable is a numerical variable that takes different values with given probabilities. In this assignment, the project dataset covers Federal, State, and/or Local elections. Therefore,
X = a random variable with a normal distribution: the For, Against, and Total Cast vote counts and the votes of older people in the selected dataset.
Y = a continuous random variable with an asymmetric distribution (e.g., exponential or lognormal): the county population data in the selected dataset.
V = a random variable with a uniform distribution (could be discrete or continuous): the county population density, number of churches, and number of church members in the selected dataset.
W = a binomial random variable with large enough n such that it can be accurately modeled using a normal approximation: state, county size, total cast, percent white, percent black, percent other, percent male, and percent female in the selected dataset.
Hence, the selected dataset has the sample of random variable...