--- Authors: Sinan Ozdemir, Jacob Deming, Hobert Bush Format: Jupyter Notebook Created: Q3 2019 Revised: Q4 2020 --- 1 Airbnb Data - Marketing Assignment Requirements for Assignment 1 1. Directions 2....

Hi, can you please let me know if you could assist with this assignment? Data and document are attached. Thanks, Lamiley


---
Authors: Sinan Ozdemir, Jacob Deming, Hobert Bush
Format: Jupyter Notebook
Created: Q3 2019
Revised: Q4 2020
---

Airbnb Data - Marketing Assignment

Requirements for Assignment 1
1. Directions
2. Requirements
3. Due Date
4. Evaluation

Directions

In this assignment, you will work with sample Airbnb data looking at listings in San Francisco, CA from 2020. You'll be analyzing the data in order to glean insights into the short-term rental market and inform real estate market trends.

Expected time to complete: 4 hours

Instructions
1. Open the assignment notebook.
2. Save a copy of your notebook and retitle it “yourname_assignment.ipynb”.
3. Attempt answers for all required questions. Include comments to explain your logic.
4. Submit your notebook to your instructional team by the due date. Include a public link to your file and add a brief description.

---

Requirements

Objectives

This assignment will ask you to:
1. Read/write CSV files using Python's built-in csv module.
2. Clean and transform raw data from a CSV into lists and dicts.
3. Use Python to filter out entries and match specific criteria.

Problem

Your goal is to perform some basic summary statistics on the data, looking to glean market insights and answer questions, such as:
· What is the most frequently offered amenity?
· What is the average cost of listings that match specific consumer preferences?

As you navigate the notebook, you will see clearly labeled sections, marked required, that set up questions for you to solve. You will need to attempt answers for all of these required questions. Please include all work within your Jupyter notebook.

Question 1

Part 1: First, you'll need to load the sanfran_airbnb CSV from your local files. Alternatively, you can also click here to access the data online.

Data: Our data is a truncated subset of data taken from Inside Airbnb. The original set contains extra columns which have been removed for this assignment.
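As a rough illustration of the loading step in Question 1, the sketch below reads a tab-delimited file with the built-in csv module and separates the header row from the listing rows. The sample text and the helper name `load_listings` are made up for this example and are not part of the assignment data.

```python
import csv
import io

# Hypothetical two-row sample standing in for the real tab-delimited file.
SAMPLE_TSV = (
    "id\tlisting_url\tname\tprice\n"
    "1\thttps://example.com/1\tCozy studio\t$120.00\n"
    "2\thttps://example.com/2\tSunny loft\t$1,250.00\n"
)


def load_listings(csvfile):
    """Return (column_names, listings) from a tab-delimited file object."""
    reader = csv.reader(csvfile, delimiter="\t")
    column_names = next(reader)          # the first row holds the headers
    listings = [row for row in reader]   # every remaining row is one listing
    return column_names, listings


column_names, listings = load_listings(io.StringIO(SAMPLE_TSV))
print(column_names)
print(len(listings))
```

With a real file, the same helper could be called inside `with open(path, newline="") as csvfile:`, where `path` is wherever you saved the data.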
Hint: The delimiter for this file is a tab character, which can be passed into csv.reader as csv.reader(csvfile, delimiter='\t').

Part 2: Next, create a list called column_names that holds the column names from the CSV.
Hint: There should be 10 columns, total. For example: column_names == ['id', 'listing_url', ....]

Part 3: Now create a list called listings that holds each listing as its own list. There should be 6,346 total.

Question 2

Next, answer the following questions using the listings variable:
Part 1. Print the first listing.
Part 2. Print the 100th listing.
Part 3. Print the price of the 100th listing without printing the rest of the listing information!

Question 3

Create a list called parsed_listings that contains the original listings as its elements, but with the following changes:
- First, change the 4th item (amenities) to be a list of strings (this one is a bit tricky). Hint: you may have to remove the ", }, and { characters and then split the string by the comma.
- Second, change the 5th item (price) to be a float. Try using .replace to remove a few bad characters from your floats.
- Third, change the 6th item (bedrooms) to be a float.
- Fourth, change the 7th item (bathrooms) to be a float.
- Fifth and finally, try using a `for` loop to accomplish this.

When you're done, the first element (`parsed_listing[0]`)

Question 4

Next, let's dig into price differences between listings with different criteria.
Part 1. Begin by creating two lists called one_bathroom and two_bathroom where the elements fit the following criteria:
· small_homes_one should only have listings with less than two bathrooms
· small_homes_two should only have listings with more than two bathrooms but less than three
Part 2. What is the average price for each set of listings?
Part 3. Finish by printing the number of elements in each list.
Part 4. Then create a new list called small_homes that only contains listings that have either:
Exactly 1 bathroom
OR
Less than 2 bathrooms AND exactly 1 bedroom
Part 5. Wrap up by printing the number of elements in the list small_homes.

Question 5

Part 1. Now let's create a dictionary called amenities_count.
Hint: A dictionary uses key/value pairs. For more info on Python dictionaries, check out this link.
For your new amenities_count dictionary, make the keys of the dictionary equal the amenities listed and the values indicate the number of times that amenity appears across every listing. Examples:
- amenities_count['Day bed'] == 7
- amenities_count['Coffee maker'] == 1230
Part 2. Now iterate over your new amenities_count dictionary to surface the amenity that appears the most often across all listings!

Question 6

This dataset has a bunch of properties in it that are ABSURDLY priced ($10000 per night seems a bit high) and are probably priced this way to deter rentals whilst still keeping the property up. This makes them severe outliers in the dataset and could throw off any analysis we want to make in the future. Let's try to clear this up.

Part 1. Create a loop that goes through the original list of properties and places them into a new list from least to most expensive. Then take some time to look through a few of the higher priced properties. This will reveal some strange values.
Note: There are many ways to accomplish this task, but we recommend the sorted function combined with a new library function, itemgetter, which was made specifically for this purpose.
Part 2. Calculate the median price of the sorted dataset. This will be used to determine the quartiles of our dataset.
Part 3. Calculate the lower quartile (the data point below which 25% of the observations fall).
Part 4. Calculate the upper quartile (the data point above which 25% of the observations fall).
Part 5. Find the interquartile range by subtracting the value of the lower quartile from the value of the upper quartile.
Part 6. Find the "inner fences" of the data set. To find the inner fences, first multiply the interquartile range by 1.5. Then add the result to the upper quartile and subtract it from the lower quartile. The two values you receive are the boundaries of the dataset's inner fences.
Note: A point that falls outside of this numeric boundary is classified as a minor outlier.
Part 7. Find the "outer fences" of the data set. This is done in the same way as uncovering the inner fences, except that the interquartile range is multiplied by 3 instead of 1.5. The result is then added to the upper quartile and subtracted from the lower quartile to find the upper and lower boundaries of the outer fence.
Note: A point that falls outside of this numeric boundary is classified as a major outlier.
Part 8. Now it is time to finally clean the dataset! Remove any values from the listings whose prices are outside of the outer fences.
Part 9. Finally, let's add a new value to each listing that tells the viewer whether or not the listing is a minor outlier.

Note: Some questions can be solved in multiple ways. Use comments to explain your code or logic!

Data

Our data is a truncated subset of data taken from Inside Airbnb, including required and optional CSV data files: listings.csv, calendar.csv, and airbnb_truncated.csv.
Within our required file, listings.csv, you'll see seven columns:
· id - A unique identifier of the Airbnb
· listing_url - The URL to the Airbnb
· name - The name of the listing
· amenities - A list of the amenities that the listing offers
· price - The nightly fee of the listing (before cleaning fees)
· bedrooms - The reported number of bedrooms
· bathrooms - The reported number of bathrooms
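The quartile and fence arithmetic described in Question 6 can be sketched on a small made-up price list. The rows below are hypothetical, and the quartiles use the median-of-halves convention (the middle value is left out of both halves when the count is odd). That convention is an assumption, since the assignment does not name one, and other conventions give slightly different numbers.

```python
from operator import itemgetter
from statistics import median

# Hypothetical [name, price] rows; the real listings carry more columns.
listings = [
    ["a", 120.0], ["b", 80.0], ["c", 10000.0], ["d", 150.0],
    ["e", 100.0], ["f", 300.0], ["g", 90.0], ["h", 160.0],
    ["i", 110.0], ["j", 140.0], ["k", 130.0],
]
PRICE = 1  # index of the price column in this toy layout

# Sort from least to most expensive.
by_price = sorted(listings, key=itemgetter(PRICE))
prices = [row[PRICE] for row in by_price]


def quartiles(sorted_values):
    """Return (lower quartile, median, upper quartile) via median-of-halves."""
    n = len(sorted_values)
    half = n // 2
    lower_half = sorted_values[:half]
    upper_half = sorted_values[half + (n % 2):]  # skip the middle value if n is odd
    return median(lower_half), median(sorted_values), median(upper_half)


def fences(q1, q3, multiplier):
    """Return (low, high) fences at `multiplier` times the interquartile range."""
    iqr = q3 - q1
    return q1 - multiplier * iqr, q3 + multiplier * iqr


q1, med, q3 = quartiles(prices)
inner_low, inner_high = fences(q1, q3, 1.5)   # beyond these: minor outlier
outer_low, outer_high = fences(q1, q3, 3.0)   # beyond these: major outlier

# Drop major outliers, then tag each remaining row with a minor-outlier flag.
cleaned = [
    row + [not (inner_low <= row[PRICE] <= inner_high)]
    for row in by_price
    if outer_low <= row[PRICE] <= outer_high
]

print(q1, med, q3)
print((inner_low, inner_high), (outer_low, outer_high))
print(cleaned[-1])
```

Building a new `cleaned` list, rather than deleting rows from `listings` in place, leaves the original data available for comparison.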
Answered Same Day - Oct 25, 2021


Sandeep Kumar answered on Oct 26 2021
{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import pprint"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "san = pd.read_csv('sanfranairbnb.csv', delimiter='\\t')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['id',\n",
       " 'listing_url',\n",
       " 'name',\n",
       " 'host_id',\n",
       " 'host_name',\n",
       " 'host_is_superhost',\n",
       " 'neighbourhood_cleansed',\n",
       " 'accommodates',\n",
       " 'bathrooms',\n",
       " 'bedrooms',\n",
       " 'amenities',\n",
       " 'price']"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "column_names = list(san.columns) \n",
    "column_names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "listing =...