cdanote087.txt

-------------
Section 4.2.3
-------------

. infile LI n r using cda088.dat
(14 observations read)

. blogit r n LI

Logit Estimates                                         Number of obs =     27
                                                        chi2(1)       =   8.30
                                                        Prob > chi2   = 0.0040
Log Likelihood = -13.036482                             Pseudo R2     = 0.2414

------------------------------------------------------------------------------
_outcome |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
      LI |   .1448632   .0593412      2.441   0.015       .0285567    .2611697
   _cons |   -3.77714   1.378628     -2.740   0.006      -6.479202   -1.075078
------------------------------------------------------------------------------

. blogit, or

Logit Estimates                                         Number of obs =     27
                                                        chi2(1)       =   8.30
                                                        Prob > chi2   = 0.0040
Log Likelihood = -13.036482                             Pseudo R2     = 0.2414

------------------------------------------------------------------------------
_outcome | Odds Ratio   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
      LI |   1.155881   .0685913      2.441   0.015       1.028968    1.298448
------------------------------------------------------------------------------

Estimating the dose that produces 50% remission
   = finding the value of dose that makes logit = 0 (that is, prob = 0.5)
   = -alpha/beta = -(-3.777)/0.145 = 3.777/0.145

. display 3.777/0.145
26.048276

Constructing Table 4.2 requires us to calculate fitted probabilities for
the logit, probit, and linear probability models. We have just fitted the
logistic regression model, so the predict command will do exactly the
calculation we need for the logit model. Following that, we fit the probit
model and have Stata create the fitted probabilities from that model.
Finally, we fit the linear probability model and compute the fitted values
corresponding to that model. To accomplish that, we must first estimate
the probabilities from each row, and then fit these probabilities using
linear regression.
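
As a cross-check outside Stata, the two-step linear probability calculation just described (row proportions r/n, then a regression of those proportions on LI with n as frequency weights) can be sketched in plain Python. The (LI, n, r) triples are copied from the listing in this note. The function name weighted_ols is hypothetical, and the closed-form weighted least-squares fit is an assumption about what the frequency-weighted regression computes for the coefficients; it does not reproduce Stata's standard errors.

```python
# Grouped data (LI, n, r) as listed in this note.
DATA = [
    (8, 2, 0), (10, 2, 0), (12, 3, 0), (14, 3, 0), (16, 3, 0),
    (18, 1, 1), (20, 3, 2), (22, 2, 1), (24, 1, 0), (26, 1, 1),
    (28, 1, 1), (32, 1, 0), (34, 1, 1), (38, 3, 2),
]


def weighted_ols(x, y, w):
    """Return (intercept, slope) of a least-squares line, weighting each point by w.

    Hypothetical helper: with integer weights this matches treating each
    row as w identical observations.
    """
    total = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / total
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


li = [row[0] for row in DATA]
n = [row[1] for row in DATA]
phat = [row[2] / row[1] for row in DATA]   # observed proportion in each row

intercept, slope = weighted_ols(li, phat, n)
fitted = [intercept + slope * x for x in li]  # linear probability fitted values

print(f"slope = {slope:.7f}, intercept = {intercept:.7f}")
```

The first fitted value is negative, which illustrates the usual weakness of the linear probability model: its fitted "probabilities" are not constrained to lie between 0 and 1.
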
Note that the number of cases at each value of the Labelling Index is not
the same; the [fweight=n] portion of the regress command tells Stata that
the variable n contains the frequency corresponding to that particular
observation.

. predict logithat
. bprobit r n LI
. predict prbithat
. generate phat = r/n
. regress phat LI [fweight=n]

  Source |       SS       df       MS                  Number of obs =      27
---------+------------------------------               F(  1,    25) =   18.33
   Model |  1.76246588     1  1.76246588               Prob > F      =  0.0002
Residual |  2.40420086    25  .096168034               R-squared     =  0.4230
---------+------------------------------               Adj R-squared =  0.3999
   Total |  4.16666675    26  .160256413               Root MSE      =  .31011

------------------------------------------------------------------------------
    phat |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
      LI |   .0278284   .0065004      4.281   0.000       .0144405    .0412163
   _cons |  -.2252962   .1434906     -1.570   0.129      -.5208206    .0702282
------------------------------------------------------------------------------

. predict wlinhat
. list LI n r logithat prbithat wlinhat

            LI          n          r   logithat   prbithat    wlinhat
  1.         8          2          0   .0679741   .0531573  -.0026689
  2.        10          2          0   .0887893   .0750353   .0529879
  3.        12          3          0   .1151908   .1031899   .1086447
  4.        14          3          0   .1481664   .1383233   .1643015
  5.        16          3          0     .18857   .1808359   .2199583
  6.        18          1          1   .2369268   .2307179   .2756152
  7.        20          3          2   .2932034    .287472    .331272
  8.        22          2          1   .3566005   .3500869   .3869288
  9.        24          1          0    .425454   .4170732   .4425856
 10.        26          1          1   .4973257   .4865633   .4982424
 11.        28          1          1   .5693082   .5564648   .5538992
 12.        32          1          0   .7023434   .6891388   .6652129
 13.        34          1          1   .7591835   .7482874   .7208697
 14.        38          3          2    .849113   .8462564   .8321833

For Table 4.3, the following illustration shows how to calculate the
empirical logits and the predicted number of remissions.

. input LI

            LI
  1. 10
  2. 16
  3. 22
  4. 28
  5. 36

. regress emplogit LI [fw=cases]

  Source |       SS       df       MS                  Number of obs =      27
---------+------------------------------               F(  1,    25) =  235.22
   Model |  43.5440066     1  43.5440066               Prob > F      =  0.0000
Residual |  4.62796322    25  .185118529               R-squared     =  0.9039
---------+------------------------------               Adj R-squared =  0.9001
   Total |  48.1719698    26  1.85276807               Root MSE      =  .43025

------------------------------------------------------------------------------
emplogit |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
      LI |   .1456771   .0094984     15.337   0.000       .1261147    .1652395
   _cons |  -3.824297   .2078753    -18.397   0.000      -4.252424    -3.39617
------------------------------------------------------------------------------

. predict fitlogit
. gen predrem = cases * exp(fitlogit) / (1+ exp(fitlogit))
. format emplogit fitlogit predrem %6.2f
. list LI cases remits emplogit fitlogit predrem

            LI      cases     remits   emplogit   fitlogit    predrem
  1.        10          7          0      -2.71      -2.37       0.60
  2.        16          7          1      -1.47      -1.49       1.28
  3.        22          6          3       0.00      -0.62       2.10
  4.        28          3          2       0.51       0.25       1.69
  5.        36          4          3       0.85       1.42       3.22

Note that the predicted numbers of remissions disagree slightly with those
of Table 4.3. The predicted numbers in the last column of Table 4.3
correspond to an estimated beta of 0.1473 instead of the 0.1457 value
obtained above. For practical purposes, the differences are of no
importance, but it would be nice to know just how the last column of the
table was calculated.

-------------
Section 4.2.4  The Wald statistic
-------------

Returning to the top of the page, here is an excerpt from the logistic
regression calculation based on the full table:

Logit Estimates                                         Number of obs =     27
                                                        chi2(1)       =   8.30

_outcome |      Coef.   Std. Err.       z     P>|z|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
      LI |   .1448632   .0593412      2.441   0.015       .0285567    .2611697
   _cons |   -3.77714   1.378628     -2.740   0.006      -6.479202   -1.075078
------------------------------------------------------------------------------

The z-statistic for LI (2.441) is exactly what Agresti refers to in the
last paragraph of the section. Note that the chi-squared statistic at the
upper right of the display (8.30) is the likelihood-ratio statistic, not
the Wald statistic.
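
The relationship between the reported z, its p-value, and the confidence interval for LI can be checked by hand. The following is a minimal sketch in Python that uses only the coefficient and standard error printed in the display; the variable names are illustrative, and the 1.959964 multiplier assumes a normal-based 95% interval.

```python
import math

coef = 0.1448632   # estimated coefficient for LI, copied from the display
se = 0.0593412     # its standard error, copied from the display

z = coef / se                                # Wald z statistic
wald_chi2 = z ** 2                           # Wald chi-squared on 1 df
p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value

z_crit = 1.959964                            # 97.5th percentile of N(0,1)
ci = (coef - z_crit * se, coef + z_crit * se)

print(f"z = {z:.3f}, Wald chi2 = {wald_chi2:.2f}, p = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.7f}, {ci[1]:.7f})")
```

The squared z is about 5.96, noticeably smaller than the likelihood-ratio statistic of 8.30 in the same display, so the two tests need not agree closely in a sample of 27.
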