Walkthrough: STAT 3312, R, SAS code

STAT 3312 (Fall, 2019)
Final exam (take-home)

Name (ID):

Instructions
• This take-home exam is due 3:00 PM, December 17, 2019.
• All of your answers and work must be your own.
• You are NOT allowed to discuss any part of this exam with anyone. If you have any questions, ask me.
• For question #2, R or SAS code along with output must be submitted to support your answer. It would be good if you underline results on the output relevant to your answer.

1. True/False questions (1.5 points each)

(1) The diagnosis of a mental illness (e.g., schizophrenia, neurosis, depression) is an ordinal categorical variable.
True ( ) False ( )

(2) If the odds of success equal 0.5 in a binary response, then the probability of success is 0.25.
True ( ) False ( )

(3) In a logistic regression model, $\text{logit}[\pi(x)] = \alpha + \beta x$, $e^{\alpha}$ equals the odds of success when $x = 1$.
True ( ) False ( )

(4) In a logit model $\text{logit}[\pi(x)] = \alpha + \beta x$, the probability increases at a rate of $0.16\beta$ when $\pi(x) = 0.4$.
True ( ) False ( )

(5) Fisher's exact test can be used to test whether the odds ratio of a 2 × 2 table equals 1 when the frequency counts are small.
True ( ) False ( )

(6) A classical linear regression model with errors having a normal distribution is a special case of a generalized linear model with the probit link.
True ( ) False ( )

(7) In testing for independence in two-way contingency tables, likelihood ratio tests and Pearson's $\chi^2$ tests are equivalent for small sample sizes.
True ( ) False ( )

(8) In a generalized linear model, the link function is used to connect the values of the random component and the systematic component.
True ( ) False ( )

(9) When x1 or x2 is the sole predictor for a binary response y, the likelihood ratio test of the effect has P-value [...] tests for $H_0: \beta_1 = 0$ and for $H_0: \beta_2 = 0$ could both have P-values larger than 0.05.
True ( ) False ( )

(10) For the logistic regression model with the identity link, the estimated probability of any value for predictor x could exceed one.
True ( ) False ( )

2.
The following table is based on an epidemiological survey of 3,000 subjects to investigate snoring as a possible factor for heart disease. We use scores (0, 2, 3, 5, 6) for x = snoring level.

                         Heart Disease
Snoring                  Yes      No
Never                    24       1355
Sometimes                35       603
More often than not      21       215
Almost always            30       224
Every night              27       230

(a) Use R or SAS to fit the model with three link functions: the logit, probit, and complementary log-log. Write down the estimated equations for all three models. (12 points)

(b) Find the estimated proportion for the logistic model when the snoring level is 2 and interpret it in terms of the odds. (4 points)

(c) Use the fitted logistic model to calculate an approximate 97% confidence interval for the odds ratio comparing a person in the “sometimes” category with a person in the “every night” category. (5 points)

(d) Find the estimated proportion for the probit model when the snoring level is 3. (4 points)

(e) Find the estimated proportions for the complementary log-log model when the snoring levels are “sometimes” and “almost always”, respectively. Which value is larger? (5 points)

3. Consider the following logistic regression model based on the horseshoe data with color and width predictors:

$\text{logit}[P(Y = 1)] = \alpha + \beta_1 c_1 + \beta_2 c_2 + \beta_3 c_3 + \beta_4 x,$

where x denotes width and
c1 = 1 for color = medium light, 0 otherwise
c2 = 1 for color = medium, 0 otherwise
c3 = 1 for color = medium dark, 0 otherwise.

Fitting the model yields the following estimated equation:

$\text{logit}[\hat{P}(Y = 1)] = -13.015 + 1.097 c_1 + 1.302 c_2 + 1.254 c_3 + 0.458 x. \quad (1)$

Consider this fit for crabs of width x = 21 cm.

(a) Estimate the two probabilities, one for medium-light crabs and one for dark crabs, and then calculate the ratio of these two probabilities. (7 points)

(b) Estimate the odds ratio of a satellite, comparing medium-light crabs with dark crabs. Interpret it in terms of the context.
(7 points)

(c) Is there a big difference between the ratio of probabilities in (a) and the odds ratio in (b)? If not, why does this happen? (5 points)

(d) Verify the value of the odds ratio in part (b) using the parameter estimates in Equation (1). (5 points)

4. In order to investigate the effects of AZT in slowing the development of AIDS symptoms, a total of 343 veterans whose immune systems were beginning to falter after infection with the AIDS virus were randomly assigned either to receive AZT immediately or to wait until their T cells showed severe immune weakness. The following table is a 2 × 2 × 2 cross-classification of the veterans' race, whether AZT was given immediately, and whether AIDS symptoms developed during the 3-year study.

                    Symptoms
Race    AZT use     Yes (Fitted)    No (Fitted)    Row total
Black   Yes         14 (A)          90 (B)         104
Black   No          28 (C)          85 (D)         113
White   Yes         10 (E)          55 (F)         65
White   No          14 (G)          47 (H)         61

Let X = AZT treatment (1 for AZT taken, 0 otherwise), Z = race (1 for blacks, 0 for whites), and Y = whether AIDS symptoms developed (1 = yes, 0 = no). The ML fit turned out to be

$\text{logit}(\hat{\pi}) = -1.1427 - 0.6537x - 0.0037z. \quad (2)$

(a) Use Equation (2) to find the fitted values (A) - (H). (8 points)

(b) Perform a goodness-of-fit test by calculating the Pearson statistic $X^2$ based on the observed and fitted values in the table above. Does the model fit decently well? Justify your answer with the P-value. (8 points)

5. Does job satisfaction depend on one's income? The 1991 General Social Survey shows the following results. Note that there are four levels in the job satisfaction categories (dissatisfied, little, moderate, very) and four levels in the income categories (0-5K, 5K-15K, 15K-25K, >25K). The income values are in dollars.

            Job satisfaction
Income      Dissatisfied    Little    Moderate    Very
0-5K        2               4         13          3
5K-15K      2               6         22          4
15K-25K     0               1         15          8
>25K        0               3         13          8

Let Y = job satisfaction and let X = income scores (3K, 10K, 20K, 25K). Consider the baseline-category logit model with “very” as the baseline category:

$\log(\pi_j / \pi_4) = \alpha_j + \beta_j x, \quad j = 1, 2, 3.$

The following table shows a part of the output regarding the estimated coefficients for a baseline-category logit model.

(Intercept):1    (Intercept):2    (Intercept):3
0.430            0.456            1.704
Income:1         Income:2         Income:3
-0.185           -0.054           -0.037

(a) Write down the three predicted equations, $\log(\hat{\pi}_j / \hat{\pi}_4)$ for j = 1, 2, 3. (6 points)

(b) Notice that $\hat{\beta}_j$ [...]

(c) What is the meaning of $e^{-0.185} = 0.83$? Explain it rigorously in terms of the context. (4 points)

(d) Find the estimated probability that a person falls in the “Moderate” category when his/her income is 20K. (4 points)

Source: http://www.daixie0.com/contents/18/4471.html
