Dataset 6 : COVID data (cases, deaths, and vaccination by states in US) Source : https://covid.cdc.gov/covid-data-tracker/#datatracker-home No default problem is defined. Define your own. Primary Data...

This is to be coded in a Jupyter notebook, so I need the notebook code with markdown cells saying what each section of code is for. I have attached a notebook file for reference. I also need a simple PowerPoint presentation that covers the requirements in the assignment document.


Dataset 6 : COVID data (cases, deaths, and vaccination by states in US)
Source : https://covid.cdc.gov/covid-data-tracker/#datatracker-home
No default problem is defined. Define your own.

Primary Data : The dataset is posted on the CDC website. Download the data of your choice in plain CSV format. Read the data description on the website/portal carefully. There is an amazing amount of weekly data available on cases, deaths, and vaccinations by state.

First 3 to 4 minutes of your presentation
· You MUST mention which dataset you worked on, and what EXACTLY your objective was.
· You may then want to BRIEFLY present your Exploratory Data Analysis and observations.
· If it is an ML project (regression/classification/clustering/anomaly), mention which one(s).
· You may now briefly clarify why and how the ML problem(s) aim(s) to solve your objective.

Next 4 to 5 minutes of your presentation
· Is the data already clean/structured for your ML problem? If not, how did you prepare the data?
· How did you apply the ML technique to solve your problem? Which models did you use? Why?
· Did you only use tools and techniques learned within the course? What else did you learn/try?

Last 1 to 2 minutes of your presentation
· What is the outcome of your project? Did you meet your initial objective? Anything interesting?
· You MUST conclude by clearly mentioning who in your team worked on which portion of the work.

Everything mentioned above should be presented within the 10 minutes. You will be timed, so practice it! The Professor/TA may ask questions after your presentation is over; this will take another 2 to 3 minutes per team.

FAQs for the Mini-Project

What is the Grading Scheme for the Mini-Project?
· 10% for coming up with an interesting problem based on the dataset
· 10% for exploratory data analysis / visualization to understand the data
· 10% for preparing the dataset to suit your specific problem definition
· 20% for the use of data science / machine learning to solve the problem
· 10% for learning something new and/or doing something beyond the course
· 20% for the presentation of your project, and overall impression
· 20% for your individual contribution, evaluated through peer assessment

If you are attempting something different, especially something that you think does not fit into the grading scheme, feel free to discuss it with your Lab Instructors and the Course Coordinator.

What is an "interesting problem" based on the dataset?
Your Lab Instructors will be able to help you choose something interesting. It should not be something that you can solve by copy-pasting the LinearRegression or DecisionTreeClassifier code from the regular course material. There should be something beyond that, for which you will have to learn something new or apply some new technique. If you are unsure whether your problem is interesting, ask for the Lab Instructor's advice.
Warning : In the quest for an "interesting problem", please do not attempt something that you can't finish in time.

How much of Visualization should be presented?
It's worth only 10%, so do not spend the bulk of your time on cool visualization tools. Do standard exploration of the data and standard statistical visualizations, as done during the course, just to understand your data well enough. You DO NOT need to produce data dashboards and cool web interfaces to do an impressive project.
Warning : In the quest for a "cool presentation", please do not try something that takes too much time to learn.

What do you mean by "preparing the dataset"?
The dataset given to you may not be in the proper format to solve the problem you targeted.
Preparing means cleaning the data, resizing/reshaping the data, removing outliers (if necessary), balancing imbalanced classes (if necessary), grouping the rows/columns as necessary, etc. This is an important part of any DS/ML project.

How much of DS / ML tools should I use for the project?
This is one of the main parts of your project. You may use any tool and technique that you have seen during the course for Regression, Classification, Clustering, or Anomaly Detection. If you want something simple, stick to Scikit-Learn as your DS/ML toolbox. You may also choose to use new models that you have not seen in the course, like Random Forest for regression, Naive Bayes for classification, DBSCAN for clustering, etc.
Warning : In the quest for a "quick impression", please do not try complex tools that take too much time to learn.

What do you mean by "learning something new" beyond the course?
The goal of the mini-project is to make you learn something new. Try to use a new DS/ML model for regression, classification, clustering, or anomaly detection, beyond what we have already covered in the course. That's the quickest way to prove you learned something new. You may also want to explore a new visualization tool (like Plotly), a new technique for data preparation (like resampling), or additional/extra data.
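
To make the "preparing the dataset" steps concrete, here is a minimal sketch of cleaning rows, grouping them by state, and flagging outliers with the 1.5 x IQR rule. It is only an illustration under stated assumptions: the column names (state, week_end, new_cases, new_deaths) and the sample numbers are hypothetical and made up, not taken from the CDC export, and the function names are invented for this example. In a notebook you would more likely do the same with pandas; the standard library is used here so the sketch runs without any installs.

```python
import csv
import io
import statistics
from collections import defaultdict

# Toy rows with HYPOTHETICAL column names and made-up numbers; a real CDC
# export will use different headers and values.
SAMPLE_CSV = """state,week_end,new_cases,new_deaths
CA,2022-01-08,1000,10
CA,2022-01-15,1200,12
CA,2022-01-22,,11
NY,2022-01-08,800,8
NY,2022-01-15,900,9
NY,2022-01-22,950,7
"""


def load_clean_rows(text):
    """Parse CSV text, dropping rows with a missing or non-numeric count."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        try:
            rows.append({
                "state": raw["state"].strip().upper(),
                "week_end": raw["week_end"],
                "new_cases": int(raw["new_cases"]),
                "new_deaths": int(raw["new_deaths"]),
            })
        except (ValueError, AttributeError, TypeError):
            continue  # cleaning step: skip incomplete rows
    return rows


def totals_by_state(rows):
    """Group weekly rows by state and sum cases and deaths."""
    totals = defaultdict(lambda: {"cases": 0, "deaths": 0})
    for row in rows:
        totals[row["state"]]["cases"] += row["new_cases"]
        totals[row["state"]]["deaths"] += row["new_deaths"]
    return dict(totals)


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    rows = load_clean_rows(SAMPLE_CSV)
    print(totals_by_state(rows))
    print(iqr_outliers([r["new_cases"] for r in rows]))
```

Whether to drop incomplete rows or fill them in, and whether flagged outliers should be removed at all, depends on the problem you define; the sketch only shows where those decisions sit in the pipeline.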
Answered 9 days after Mar 14, 2022

Sharda answered on Mar 24 2022