Loglinear analysis of cross-classifications                     [STB-8: smv5.1]
-------------------------------------------
     ^loglin^ count varlist [^in^ range] [^if^ exp] [^weight^]
            ,fit(^margins to be fit^) [ltol(^#^) iter(^#^) offset(^variable^)
                                       level(^#^) irr anova keep resid collapse]

estimates a Poisson maximum-likelihood loglinear model.  There are two
cases:  1) you have only a summary table, and count gives the number of
cases that fall in each cell of the cross-classification of varlist; or
2) you have full information on all cases, so that each case counts once.
In case 2, you would be better served by the ^poisson^ command.

For ^loglin^, the ^count^ variable should be a non-negative integer giving the
number of cases that fall in each cell of the cross-classification of varlist.
Counts must be non-negative for all combinations of the independent variables
specified in varlist, and ideally all are positive.  If a count exactly equals
zero, you have three choices: 1) you may assume that it is a ^structural zero^
and replace it with a missing value or give it a zero cell weight; 2) you may
add a small positive constant, for example, .5, to zero cells; or, best of
all, 3) you may get more data.



Cell weights
------------

If you specify a ^weight^, ^loglin^ will assume that the numbers represent
cell weights.  Only frequency weights are allowed.  If you wish to specify
that a particular cell is a ^structural zero^, an appropriate method is to
give that cell a weight of zero or missing.  In most instances you will want
to use only cell weights of zero or one.

Functional Form
---------------
This model falls in the class of generalized linear models, with a categorical
design matrix, a log link, and a Poisson-distributed response.  Thus, the
program generates a design matrix similar to that of the ^anova^ command,
which is then passed to ^poisson^.
The functional form of the model is log-linear:

                 (predicted value) + (offset, if present)
      E(count) = e

or

      ln E(count) = (predicted value) + (offset, if present)

where the predicted value is a linear combination of the columns of the design
matrix for the categorical independent variables in varlist.  If you wish to
see estimated expected cell frequencies, residuals, and standardized
residuals, specify the ^resid^ option.  If an offset is present, it is added
to the predicted value for the purposes of estimation, so that the predicted
value itself describes a rate rather than a count.



^Anova^ option and Constraints
--------------------------------
Like ^anova^, ^loglin^ generates a design matrix that is not of full rank, so
the parameters are not identified; constraints must be imposed on the
estimated parameters in order to obtain a unique solution.  Two kinds of
constraints are available in this command:  anova-like and regression-like.
In regression-like constraints, redundant levels of independent variables
are simply dropped (the ^first^ level is dropped, along with any interaction
involving it).  In anova-like constraints, the ^last^ level is dropped, and
its parameter is set equal to -1 times the sum of the parameters for all the
other levels.  Interpret regression-like parameter estimates as deviations
from the baseline level, and interpret anova-like parameter estimates as
deviations from the grand mean.  To activate anova-like constraints, specify
the ^anova^ option.  Otherwise, regression-like constraints will be used.
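
The two kinds of constraints correspond to two ways of coding a categorical
variable.  The Python sketch below shows the design-matrix columns for a
single three-level variable under each scheme.  It is only an illustration:
the function names are hypothetical, and the columns ^loglin^ actually
generates may be named and ordered differently.

```python
def regression_like(level, n_levels):
    """Indicator coding with the first level dropped (levels are 1..n_levels)."""
    return [1 if level == k else 0 for k in range(2, n_levels + 1)]

def anova_like(level, n_levels):
    """Deviation coding with the last level dropped.

    The last level gets -1 in every column, which makes its implied
    parameter equal to -1 times the sum of the other parameters.
    """
    if level == n_levels:
        return [-1] * (n_levels - 1)
    return [1 if level == k else 0 for k in range(1, n_levels)]

for level in (1, 2, 3):
    print(level, regression_like(level, 3), anova_like(level, 3))
```

In the regression-like columns, level 1 is the baseline (all zeros); in the
anova-like columns, no level is all zeros, and the columns sum to zero over
the three levels.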

^Resid^ option
------------
If you specify the ^resid^ option, estimated expected cell frequencies, 
residuals and standardized residuals will be calculated and displayed as
the variables  ^cellhat^, ^resid^ and ^stdres^.
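
To make the three quantities concrete, the Python sketch below computes them
for a hypothetical 2 x 2 table under the independence model, for which the
expected frequencies have the closed form (row total x column total)/N.  The
data are invented, and the standardized residual is assumed to be the
Pearson-type residual, (observed - expected)/sqrt(expected); this help file
does not state which formula ^loglin^ uses for ^stdres^.

```python
import math

# Hypothetical 2 x 2 table of observed counts, keyed by (level of A, level of B).
observed = {("a1", "b1"): 30, ("a1", "b2"): 10,
            ("a2", "b1"): 20, ("a2", "b2"): 40}

total = sum(observed.values())
row_totals, col_totals = {}, {}
for (a, b), n in observed.items():
    row_totals[a] = row_totals.get(a, 0) + n
    col_totals[b] = col_totals.get(b, 0) + n

# Expected cell frequencies under independence of A and B.
cellhat = {(a, b): row_totals[a] * col_totals[b] / total for (a, b) in observed}

# Raw residuals and (assumed) Pearson-type standardized residuals.
resid = {cell: observed[cell] - cellhat[cell] for cell in observed}
stdres = {cell: resid[cell] / math.sqrt(cellhat[cell]) for cell in observed}

for cell in observed:
    print(cell, observed[cell], cellhat[cell], resid[cell], round(stdres[cell], 3))
```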




^Keep^ option
-----------
Normally, the ^loglin^ program ^drop^s all the variables it generates for
estimation.  If you specify the ^keep^ option, these variables, along with the
estimated expected cell frequencies, residuals, and standardized residuals,
will remain in the data set for future use.  Only the first-order variables
(i.e., A1...m, B1...n, C1...o, etc.) will be labeled.  Keeping the variables
allows the user to create a new design matrix from the already existing
variables.  It does add substantially to the size of the data set, however.
^Keep^ does not work when ^collapse^ is specified.

^Collapse^ option
---------------
Specify the ^collapse^ option ONLY if:
      1) your data set contains more variables than you wish to work with in
         the specific model fit, AND
      2) you wish to analyze the subset specified in ^varlist^ AS IF they were
         the complete table.
The ^collapse^ option calculates cell counts for the variables in ^varlist^,
summing the counts over the levels of all variables not in ^varlist^ and
placing the sums in the appropriate cells (i.e., it collapses the table).  It
then generates a temporary data set on which it performs the analysis.  After
the calculations are completed, it restores the original data set.  Note that
if you specify both the ^keep^ and ^collapse^ options, your estimated
expected cell frequencies, residuals, and standardized residuals will be
displayed, but not saved with your original data set.
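
The collapsing step described above can be sketched in Python as follows.
This is an illustration of the idea only, not the code ^loglin^ runs; the
function name and the data are hypothetical.

```python
from collections import defaultdict

def collapse(rows, count, varlist):
    """Sum `count` over every variable that is not listed in `varlist`."""
    totals = defaultdict(int)
    for row in rows:
        cell = tuple(row[v] for v in varlist)
        totals[cell] += row[count]
    # One output row per cell of the collapsed table.
    return [dict(zip(varlist, cell), **{count: n})
            for cell, n in sorted(totals.items())]

# Hypothetical 2 x 2 x 2 summary table with counts in dv.
table = [
    {"iv1": 1, "iv2": 1, "iv3": 1, "dv": 5}, {"iv1": 1, "iv2": 1, "iv3": 2, "dv": 7},
    {"iv1": 1, "iv2": 2, "iv3": 1, "dv": 3}, {"iv1": 1, "iv2": 2, "iv3": 2, "dv": 9},
    {"iv1": 2, "iv2": 1, "iv3": 1, "dv": 4}, {"iv1": 2, "iv2": 1, "iv3": 2, "dv": 6},
    {"iv1": 2, "iv2": 2, "iv3": 1, "dv": 8}, {"iv1": 2, "iv2": 2, "iv3": 2, "dv": 2},
]

collapsed = collapse(table, "dv", ["iv1", "iv2"])   # iv3 is summed out
for row in collapsed:
    print(row)
```

The original list is left unchanged, which mirrors the way the original data
set is restored after the analysis.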

Fit(^margins to be fit^)
----------------------
To specify a loglinear model, the ^fit()^ option must be specified.  This
program fits hierarchical models, so only the highest-order interactions need
to be specified; all lower-order interactions and main effects are included
automatically.  Separate the margins by commas, and specify interactions with
a ^blank^.  The fit notation follows that developed by S. Fienberg, 1981,
^The Analysis of^ ^Cross-classified Categorical Data^, Cambridge, MA: MIT Press.

For example, suppose we have summary data with three independent variables,
^iv1^, ^iv2^, and ^iv3^, with counts coded in a variable called ^dv^.  If we
wish to fit an independence model, we type:

^loglin dv iv1 iv2 iv3, fit(iv1,iv2,iv3)^

If we wish to fit a saturated model, we type:

^loglin dv iv1 iv2 iv3, fit(iv1 iv2 iv3)^

An alternative model, in which ^iv1^ and ^iv3^ are conditionally independent
given ^iv2^, might be:

^loglin dv iv1 iv2 iv3, fit(iv1 iv2,iv2 iv3)^
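
To show what the hierarchy principle implies for the three examples above,
the Python sketch below expands a fit specification into the full list of
terms it includes.  The function names are hypothetical, and the parsing is
only an approximation of the notation described here, not the parser used by
^loglin^.

```python
from itertools import combinations

def parse_fit(spec):
    """Commas separate margins; blanks join the variables of an interaction."""
    return [tuple(margin.split()) for margin in spec.split(",") if margin.strip()]

def hierarchical_terms(spec):
    """Every margin brings along all of its lower-order terms."""
    terms = set()
    for margin in parse_fit(spec):
        for order in range(1, len(margin) + 1):
            terms.update(combinations(margin, order))
    # Main effects first, then two-way interactions, and so on.
    return sorted(terms, key=lambda term: (len(term), term))

for spec in ("iv1,iv2,iv3", "iv1 iv2 iv3", "iv1 iv2,iv2 iv3"):
    print(spec, "->", hierarchical_terms(spec))
```

The independence model expands to the three main effects, the saturated model
to all seven terms, and the third model to the three main effects plus the
iv1 x iv2 and iv2 x iv3 interactions.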


Estimation
----------
^Loglin^ generates the appropriate design matrix and passes that matrix to
the ^poisson^ command for estimation.  ^Poisson^ uses iteratively reweighted
least squares, which yields the maximum-likelihood estimates.

Convergence
-----------
The parameters ^ltol()^ and ^iter()^ may be used to control the maximization
process.  ^ltol()^ specifies the maximum change in the log likelihood that
will be accepted as indicating convergence (default 1e-7), and ^iter()^
specifies the maximum number of iterations (default 100).
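
The estimation and stopping rule described above can be sketched in Python as
follows.  This is a minimal textbook version of iteratively reweighted least
squares for a Poisson model with a log link, not the code used by ^poisson^.
The function names, the starting values, and the use of the absolute change
in the log likelihood as the convergence test are all assumptions; ^ltol^ and
^iters^ merely mirror the ^ltol()^ and ^iter()^ options.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def loglik(y, mu):
    """Poisson log likelihood, omitting the constant ln(y!) term."""
    return sum(yi * math.log(mi) - mi for yi, mi in zip(y, mu))

def poisson_irls(X, y, ltol=1e-7, iters=100):
    p = len(X[0])
    mu = [yi + 0.5 for yi in y]            # assumed starting values
    eta = [math.log(mi) for mi in mu]
    ll_old = loglik(y, mu)
    beta = [0.0] * p
    for it in range(1, iters + 1):
        # Working response; the working weights are the current fitted values.
        z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]
        xtwx = [[sum(mi * xr[i] * xr[j] for xr, mi in zip(X, mu))
                 for j in range(p)] for i in range(p)]
        xtwz = [sum(mi * xr[i] * zi for xr, mi, zi in zip(X, mu, z))
                for i in range(p)]
        beta = solve(xtwx, xtwz)           # weighted least-squares step
        eta = [sum(bj * xj for bj, xj in zip(beta, xr)) for xr in X]
        mu = [math.exp(e) for e in eta]
        ll = loglik(y, mu)
        if abs(ll - ll_old) < ltol:        # change in log likelihood below ltol
            return beta, mu, it
        ll_old = ll
    return beta, mu, iters

# Hypothetical 2 x 2 table, independence model, regression-like coding:
# columns are the constant, A = level 2, and B = level 2.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
y = [30, 10, 20, 40]
beta, mu, n_iter = poisson_irls(X, y)
print(beta, mu, n_iter)
```

For this table the fitted values should agree with the closed-form
independence estimates, (row total x column total)/N, which gives a simple
check on the iterations.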

Other options
-------------
The ^level()^ option sets the confidence level, in percent, for the confidence
intervals of the estimates.
The ^irr^ option presents estimates in their exponentiated form, i.e., as
incidence rate ratios.
In either case, see ^help^ for ^poisson^ for details.
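
As a rough illustration of what these two options report, the Python sketch
below exponentiates a coefficient and the endpoints of its confidence
interval.  The function name and the numbers are hypothetical, and the
interval is assumed to be a normal-based interval computed on the log scale
and then exponentiated; see ^help^ for ^poisson^ for what is actually done.

```python
import math
from statistics import NormalDist

def irr_with_ci(b, se, level=95):
    """Exponentiate a coefficient and its normal-based confidence limits."""
    z = NormalDist().inv_cdf(0.5 + level / 200)   # about 1.96 for level = 95
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)

# Hypothetical coefficient ln(1.5) with standard error 0.2.
irr, lower, upper = irr_with_ci(math.log(1.5), 0.2)
print(irr, lower, upper)

# A higher confidence level gives a wider interval.
_, lower99, upper99 = irr_with_ci(math.log(1.5), 0.2, level=99)
print(lower99, upper99)
```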

Also see
--------
Manual: [4] Estimate, [5s] Poisson
On-line: ^help^ for ^correlate^, ^epitab^, ^linktest^, ^lrtest^, ^predict^, ^test^