Explanation: Stat2Data, Java/C++, p-value, SPSS | Web

Logistic Regression project

For this project, you will need the dataset Film in the Stat2Data package. This dataset is based on 100 movie reviews by the movie critic Leonard Maltin. The variables are described on page 501 of your textbook (problem 9.31). The response variable is Good (1 = a rating of 3 stars or better, 0 = any lower rating).

In the dataset, there is a variable called Origin, which specifies the country of origin. Although the textbook lists 5 values for this variable (0 = USA, 1 = Great Britain, 2 = France, 3 = Italy, 4 = Canada), there are actually 7 (additionally, 5 = Sweden and 6 = Hungary). Before you analyze the data, however, you will need to create a new variable that consolidates the 7 levels of Origin into 2:

(2 points) Create the variable English, which takes the values
o 1 for the movies in English (Origin = 0 or 1 or 4)
o 0 for all other movies

In the following, NewVariable refers to the new variable you created above.

Model 1: Do a Chi-square test of independence to see if there is any relationship between NewVariable and Good (whether or not the movie was rated as “good” by Leonard Maltin). If there is a relationship, use residuals or odds ratios to explain the association.
(4 points:
1 point for performing the test
1 point for checking the assumptions
1 point for the correct decision based on the p-value
1 point for the correct conclusion based on the decision)

Model 2: Create a logistic regression with Good as the response and NewVariable as the predictor, and test whether the slope for NewVariable is significantly different from 0. Is your conclusion consistent with that of Model 1?
(4 points:
1 point for setting up the Null Hypothesis and Alternative Hypothesis correctly
1 point for getting the correct p-value
1 point for the correct decision and conclusion based on the p-value
1 point for comparing the conclusions of Models 1 and 2 and pointing out whether they are consistent.)

Model 3: Find the simplest, best logistic regression model for predicting whether a movie is good from the following variables: NewVariable, Year, Time, Cast, and Description. Perform stepwise selection starting with the additive model. Your simplest model should include NewVariable, and your most complex model should include a 5-way interaction (if a 5-way interaction is not possible, then include all the 4-way interactions).
(2 points:
1 for doing model selection where the starting model is different from the ones in the scope
1 for using the “both” direction.)

After you select your final model, write it down and use it to predict the probability of a good rating for the following film:
For Lab Sec 002: A film made in Europe in 1971 that is 90 minutes long, has 5 cast members listed, and a description that is 10 lines of text long.
For Lab Sec 003: A film made in the US in 1950 that is 120 minutes long, has 8 cast members listed, and a description that is 16 lines of text long.
For Lab Sec 005: A film made in Britain in 1982 that is 75 minutes long, has 13 cast members listed, and a description that is 5 lines of text long.
For Lab Sec 006: A film where the actors do not speak English, was made in 1933, is 140 minutes long, has 3 cast members listed, and a description that is 20 lines of text long.
(4 points:
2 points for getting the formula correct
2 points for getting the prediction correct)

Source: http://ass.3daixie.com/2018120110863825.html
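The recoding step and the Model 1 / Model 2 comparison can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the `records` list is made-up data, not the Film dataset, and the function names are hypothetical. The chi-square statistic is the uncorrected Pearson statistic (no continuity correction), and the slope uses the fact that a logistic regression with one binary predictor has a closed-form estimate equal to the log odds ratio.

```python
import math

# Hypothetical (Origin, Good) pairs for illustration only; NOT the Film data.
records = (
    [(0, 1)] * 30 + [(0, 0)] * 20 + [(1, 1)] * 10 + [(4, 0)] * 5
    + [(2, 1)] * 12 + [(3, 0)] * 8 + [(5, 1)] * 10 + [(6, 0)] * 5
)

def to_english(origin):
    """Return 1 for USA (0), Great Britain (1), Canada (4); 0 otherwise."""
    return 1 if origin in (0, 1, 4) else 0

def two_by_two(rows):
    """Build table[e][g] = number of films with English == e and Good == g."""
    table = [[0, 0], [0, 0]]
    for origin, good in rows:
        table[to_english(origin)][good] += 1
    return table

def chi_square_test(table):
    """Pearson chi-square test of independence for a 2x2 table (1 df)."""
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row)
    expected = [[row[i] * col[j] / n for j in range(2)] for i in range(2)]
    stat = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(2) for j in range(2)
    )
    # Upper tail of a chi-square variable with 1 df: P(X > x) = erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value, expected

def logistic_slope_wald(table):
    """Slope, standard error, Wald z, and two-sided p for Good ~ English."""
    (a, b), (c, d) = table  # a: E=0,G=0  b: E=0,G=1  c: E=1,G=0  d: E=1,G=1
    slope = math.log((d / c) / (b / a))       # log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = slope / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return slope, se, z, p_value

table = two_by_two(records)
stat, p_chi, expected = chi_square_test(table)
slope, se, z, p_wald = logistic_slope_wald(table)
print("expected counts (assumption check: all should be at least 5):", expected)
print("chi-square:", stat, "p-value:", p_chi)
print("slope:", slope, "odds ratio:", math.exp(slope), "Wald p-value:", p_wald)
```

With a single binary predictor, the chi-square test and the Wald test of the slope address the same null hypothesis (no association), so their p-values are usually close, which is what the consistency question in Model 2 asks you to check on the real data.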

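For the prediction step, the fitted logistic model gives the log-odds as a linear combination of the predictors, and the probability is its inverse logit. A minimal sketch, assuming a purely hypothetical set of coefficients (these are not estimates from the Film data, and the selected model may contain different terms):

```python
import math

def predict_probability(coefficients, film):
    """Inverse logit of b0 + sum(b_k * x_k) over the terms in the model."""
    log_odds = coefficients["Intercept"] + sum(
        beta * film[name]
        for name, beta in coefficients.items()
        if name != "Intercept"
    )
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical coefficients for illustration; NOT fitted to the Film data.
example_coefficients = {"Intercept": -1.0, "English": 0.5, "Time": 0.01}

# Predictor values matching the Lab Sec 003 description (US film, so English = 1).
example_film = {"English": 1, "Year": 1950, "Time": 120, "Cast": 8, "Description": 16}

probability = predict_probability(example_coefficients, example_film)
print("predicted probability of a good rating:", probability)
```

If the selected model contains an interaction, one way to reuse this function is to add the product of the interacting variables to the film dictionary under the same name as the interaction coefficient.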