From the attached PDF, complete the questions under "GROUPWORK" on page 3. It is a single connected question and should be simple.
Curve Fitting

Film production companies use different techniques to predict movie sales. We’ll consider predicting the final worldwide box office sales of movies based on the first weekend box office sales and the rating of the film (by IMDb).

• Why would first weekend sales as the only variable predicting final box office sales?
• Why do you think I’ve used movies from 2017 and not movies from different years?

Let’s assume a model of the form

    F = a0 + a1 S + a2 R

where F is the final box office sales, S is the first weekend sales, R is the IMDb rating, and a0, a1, a2 are coefficients we’ll choose to minimize some measure of error.

• What would a good measure of the error be? What would the error be a function of? What would a good set of equations be to solve for the coefficients a0, a1, a2?

Movies from 2017 (a simpler time)

Movie                          First weekend       Final worldwide     Rating
                               box office ($USD)   box office ($USD)   (IMDb)
Baby Driver                    20,553,320          227,250,102         7.6
Baywatch                       18,503,871          175,863,783         5.5
Beauty and the Beast           174,750,616         1,273,109,220       7.1
Blade Runner 2049              32,753,122          258,829,058         8.0
Despicable Me 3                72,434,025          1,032,596,894       6.3
Emoji Movie                    24,531,923          216,564,840         3.3
It                             123,403,419         701,083,042         7.3
Murder on the Orient Express   28,681,472          351,767,147         6.5
Spiderman: Homecoming          117,027,053         878,346,440         7.4
Star Wars: Episode VIII        220,009,584         1,331,635,141       6.9

Day 3 lecture: More Curve Fitting (page 2)

This set of equations is over-determined; that is, we have more equations than variables to solve for. In practice, an over-determined system generally has no exact solution, but we can find the coefficients that best approximate a solution by linear least squares.

Listing 1. Movie Sales

    % Data from the table above, in millions of USD (ratings are unitless).
    F = [227.25; 175.86; 1273.11; 258.83; 1032.60; ...
         216.56; 701.08; 351.77; 878.35; 1331.64];   % final worldwide box office

    S = [20.55; 18.50; 174.75; 32.75; 72.43; ...
         24.53; 123.40; 28.68; 117.03; 220.01];      % first weekend box office

    R = [7.6; 5.5; 7.1; 8.0; 6.3; ...
         3.3; 7.3; 6.5; 7.4; 6.9];                   % IMDb rating

    % Design matrix for F = a0 + a1*S + a2*R; backslash gives the least-squares solution.
    M = [ones(size(S)) S R];
    a = M\F;

    % Plot the data points and the fitted plane.
    scatter3(S,R,F,100,[0 0 0],'filled','o')
    hold on
    x = 0:25:250; y = 2:.5:9;
    [X,Y] = meshgrid(x,y);
    Z = a(1) + a(2)*X + a(3)*Y;
    surf(X,Y,Z)

    % Predictions of the linear model and its sum of squared errors.
    FPredLin = @(S,R) a(1) + a(2)*S + a(3)*R;
    predLin = FPredLin(S,R);        % column vector, same shape as F
    diffLin = abs(predLin - F);
    errorLin = sum(diffLin.^2);

Day 3 lecture: More Curve Fitting (page 3)

How could we set up a second-order polynomial (in two variables) to fit the data?

• What would the general form of the fit look like?
• What would a system look like to solve for the coefficients in the general form?
• GROUPWORK: Set up a system and find the second-order fit in Matlab.
• Find the error of the second-order fit.
• Use each model to predict the final box office sales for Wonder Woman, a movie with an IMDb rating of 7.4 and first weekend box office sales of $103,251,471. (You can compare your solution to the actual final box office sales: $818,058,22.)
• Overall, is your model good? Comment on whether the predicted relationships make sense.
• Using some of the additional data provided (production budget, opening weekend theaters, etc.), create a new second-order model in two variables. (You can use any two of the variables from the box office information or the IMDb ratings, but you should spend a little time justifying your choices. That is, using ‘domestic box office’ and ‘inflation adjusted domestic box office’ would not be a good choice since these two variables are essentially the same thing.)
• Determine if your new model is better than the second-order fit you found above.
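
One possible way to approach the GROUPWORK items is sketched below. It assumes the general second-order form F = b0 + b1 S + b2 R + b3 S^2 + b4 S R + b5 R^2 and measures error the same way Listing 1 does (sum of squared differences). The names M1, M2, b, errorQuad, FPredQuad, SWW, RWW, wwLin, and wwQuad are illustrative and do not come from the lecture; Wonder Woman's first weekend sales are rounded to 103.25 (millions of USD) to match the units of the listing.

```matlab
% Data from Listing 1, in millions of USD (ratings are unitless).
F = [227.25; 175.86; 1273.11; 258.83; 1032.60; ...
     216.56; 701.08; 351.77; 878.35; 1331.64];
S = [20.55; 18.50; 174.75; 32.75; 72.43; ...
     24.53; 123.40; 28.68; 117.03; 220.01];
R = [7.6; 5.5; 7.1; 8.0; 6.3; ...
     3.3; 7.3; 6.5; 7.4; 6.9];

% Linear model from Listing 1, repeated here for comparison.
M1 = [ones(size(S)) S R];
a  = M1 \ F;
errorLin = sum((M1*a - F).^2);

% Second-order model: one column per term 1, S, R, S^2, S*R, R^2.
% Ten equations, six unknowns, so the system is still over-determined
% and backslash returns the least-squares coefficients.
M2 = [ones(size(S)) S R S.^2 S.*R R.^2];
b  = M2 \ F;
errorQuad = sum((M2*b - F).^2);

% Prediction functions for each model.
FPredLin  = @(s,r) a(1) + a(2)*s + a(3)*r;
FPredQuad = @(s,r) b(1) + b(2)*s + b(3)*r ...
                 + b(4)*s.^2 + b(5)*s.*r + b(6)*r.^2;

% Wonder Woman: first weekend of $103,251,471 (about 103.25 million), IMDb 7.4.
SWW = 103.25;
RWW = 7.4;
wwLin  = FPredLin(SWW, RWW);
wwQuad = FPredQuad(SWW, RWW);

fprintf('Sum of squared errors, linear:       %.2f\n', errorLin);
fprintf('Sum of squared errors, second-order: %.2f\n', errorQuad);
fprintf('Wonder Woman prediction, linear:       %.2f million USD\n', wwLin);
fprintf('Wonder Woman prediction, second-order: %.2f million USD\n', wwQuad);
```

Because the linear model is the special case b3 = b4 = b5 = 0 of the second-order model, errorQuad cannot exceed errorLin on these ten movies. A smaller fitting error therefore does not by itself show that the second-order model predicts better; with six coefficients and only ten data points, over-fitting is a real possibility, so the Wonder Woman comparison is the more informative check. The same pattern applies to the last two GROUPWORK items: swap S or R for another pair of variables, rebuild M2, and compare the errors.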