[Q3 ~ Q7] Exploring and analyzing coffee shop transaction data. 1. (7 points) Create the following panel data set that consists of the unique customer ID and customers’ weekly total spending (in USD)...



[Q3 ~ Q7]
Exploring and analyzing coffee shop transaction data.




1.
(7 points) Create the following panel data set that consists of the unique customer ID and customers’ weekly total spending (in USD) by coffee type.













R Code:
























2.
(8 points) Draw a boxplot for spending by coffee type and interpret your boxplot.











R Code:









Answer:














3.
(10 points) The following table shows the unit price (in USD) for each coffee type. Please add a new variable,
quantity, to the “coffee_weekly_trans” data set.





















americano



espresso



latte



cappuccino



drip



1.5



1.25



2.95



2.45



1




[Hint: You may first create a new variable that represents the unit price for each coffee type and add it to the data set. Then you may calculate the quantity sold by using the following equation:
Quantity sold = Total spending / Unit price.]











R Code:





















4.
(12 points) To increase sales of cappuccino, the owner of the coffee shop plans to initiate a promotion that provides a free drink with customers who buy more than three cappuccinos in a certain week.



To identify the customers on a weekly basis, you are asked to add a dummy variable “promo” that is equal to 1 if a customer bought
more than three cappuccinos
in a certain week and 0 otherwise to the “coffee_weekly_trans” data set.



Please also identify the customers who are eligible for a free drink (i.e., customers whose promo = 1) in each week. Specify the customers’ IDs and Weeks. For example, “ID in Week #.”











R Code:










Answer:












5.
(12 points) Create a new data set “coffee_weekly_trans_sp” (from “coffee_weekly_trans” data set) that presents the total spending over the four weeks for each customer. Who is the most profitable customer? Specify the customer’s ID.











R Code:






















Answer:













[Q8 ~ Q12]
LJ is a multichannel clothing retailer in Bloomington-Normal. The owner of LJ has collected sales data by city and detailed customers’ transaction data for the last two years. To understand the shopping behavior of its customers, he pulled up the transaction data for a month and requested you to analyze it. Please access one of his datasets,
LJ.trans.csv
(posted in “Assignments” → “Exam I” @ ReggieNet), explore it and analyze it through the following steps.



Variable description



cust.id: unique customer ID



age: customer age



email: “yes” if LJ has the customer’s email address; “no” otherwise



city: living in Bloomington or Normal



sex: female or male



distance.to.store: distance between home to the closest offline store (mile)



online.visits: total number of visits to online store for a month



online.trans: total number of transactions via online store for a month



online.spending: total spending ($) in online store for a month



online.return: total product return amount ($) via online store for a month



offline.trans: total number of transactions via offline store for a month



offline.spending: total spending ($) in offline store for a month



offline.return: total product return amount ($) via offline store for a month



review: “yes” if the customer posted her/his reviews; “no” otherwise



income: income ($)



loyalty: customer segments (high, med, low) based on customer’s level of loyalty




6.
(5 points) How many variables and observations in the
LJ.trans
dataset?











R Code:










Answer:








7.
(8 points) What are the average, median, standard deviation, and variance of the following variables?





-
age


-
distance.to.store


-
offline.spending


-
online.return











R Code:



















Answer:





















8.
(13 points) Suppose we have the two customer groups’ profiles as below. What is the
average offline spending
for the customer group A who match all of the following conditions? What is the
average offline spending
for the customer group B who match all of the following conditions? Which group of customers spend more in the offline store on average?




Customer group A’s profile


a. Age >= 20


b. Living in Normal


c. Female


d. Customers who posted reviews


e. Customers who live more than 10 miles (> 10) away from the store


f. Customers who never returned products through offline store




Customer group B’s profile


a. Age


b. Income


c. Loyalty level = low


d. Living in Bloomington


e. Customers who live within 40 miles (


f. Customers who never visited online store


g. Customers who purchased products through offline store at least once












R Code:






















Answer:











9.
(10 points) Create a new variable,
generation_group, that represents the generation of the LJ customers and add it to the
LJ.trans
dataset. More specifically, generation_group variable takes a value “Post-Millennials” for ages between 1 and 21 (including 1 and 21), “Millennials” for ages between 22 and 37 (including 22 and 37), “Generation X” for ages between 38 and 53 (including 38 and 53), and “Baby Boomers” for ages between 54 and 72 (including 54 and 72). Please use for() and if else() functions or ifelse() function to create the
generation_group
variable.











R Code:




















10.
(5 points) Boxplots are a compact way to represent a distribution of variable and good for initial data visualization. Boxplots are not appropriate for:




  1. age

  2. loyalty

  3. online.return

  4. offline.spending

  5. income











Answer:







11.
(7 points) Write a R function that converts “kcal” to “kJ” and show how many kilojoules (kJ) of energy are found in a 700 kilocalorie (kcal) meal?













R Code:
























Answer:











12.
(12 points) When you multiply the square of the cumulative sum of natural numbers 1, 2, …,
n, i.e., 12 ,
(1+2)2 ,
(1+2+3)2, …
(1+2+…+n)2
find
n
that produces the product which is more than 10,000 and the closest to 10,000”
Hint: Use either while() or repeat() loop.











R Code:
















Answer:


























[Q15 ~ Q18]
LJ is a multichannel clothing retailer in Bloomington-Normal. The owner of LJ has collected sales data by city and detailed customers’ transaction data for a year. Specifically, he has collected net sales information by city and sales channel. After taking Marketing Analytics course at ISU, he realized that his way of collecting and storing data was wrong. That is, he has created “untidy” datasets. Please access one of his datasets,
LJ.sales.csv
(posted in “Assignments” → “Exam I” @ ReggieNet), and transform it into “tidy” dataset through the following steps.



Variable description



channel: offline or online



normal: net sales (= total sales ($) - total product return amount ($)) in Normal



bloomington: net sales in Bloomington



peoria: net sales in Peoria



champaign: net sales in Champaign



springfield: net sales in Springfield



joliet: net sales in Joliet








13.
(8 points) Step 1: Import “LJ.sales.csv” file into R first. Transform the original untidy data (LJ.sales) into “step1.data” as follows:



13.[Hint: You can load “tidyr” package by using library(tidyr)]



original


































channel



bloomington



champaign



joliet



normal



peoria



springfield



offline



1130 - 313



883 - 413



1143 - 201



1978 - 225



2236 - 296



2004 - 123



online



1122 - 285



950 - 204



1098 - 187



2534 - 145



3245 - 300



2542 - 135





→ step1.data







































































channel



city



netsales



offline



bloomington



1130 - 313



online



bloomington



1122 - 285



offline



champaign



883 - 413



online



champaign



950 - 204



offline



joliet



1143 - 201



online



joliet



1098 - 187



offline



normal



1978 - 225



online



normal



2534 - 145



offline



peoria



2236 - 296



online



peoria



3245 - 300



offline



springfield



2004 - 123



online



springfield



2542 - 135




























R Code:


























14.
(10 points) Step 2: Transform the “step1.data” you created in Step 1 into the “step2.data” as follows:



step1.data








































































channel



city



netsales



offline



bloomington



1130 - 313



online



bloomington



1122 - 285



offline



champaign



883 - 413



online



champaign



950 - 204



offline



joliet



1143 - 201



online



joliet



1098 - 187



offline



normal



1978 - 225



online



normal



2534 - 145



offline



peoria



2236 - 296



online



peoria



3245 - 300



offline



springfield



2004 - 123



online



springfield



2542 - 135



































→ step2.data









































city



offline.netsales



online.netsales



bloomington



1130 - 313



1122 - 285



champaign



883 - 413



950 - 204



joliet



1143 - 201



1098 - 187



normal



1978 - 225



2534 - 145



peoria



2236 - 296



3245 - 300



springfield



2004 - 123



2542 - 135




















R Code:
























15.
(10 points) Step 3: Transform the “step2.data” you created in Step 2 into the “tidy.data” as follows:



step2.data










































city



offline.netsales



online.netsales



bloomington



1130 - 313



1122 - 285



champaign



883 - 413



950 - 204



joliet



1143 - 201



1098 - 187



normal



1978 - 225



2534 - 145



peoria



2236 - 296



3245 - 300



springfield



2004 - 123



2542 - 135
















→ tidy.data























































city



offline.sales



offline.return



online.sales



online.return



bloomington



1130



313



1122



285



champaign



883



413



950



204



joliet



1143



201



1098



187



normal



1978



225



2534



145



peoria



2236



296



3245



300



springfield



2004



123



2542



135




[Hint: You may take two steps to get the
tidy.data. At first, try to create “offline.sales” and “offline.return” columns, and then create “online.sales” and “online.return” columns.]











R Code:



Oct 08, 2021
SOLUTION.PDF

Get Answer To This Question

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here