Overview of MQLS

MQLS is a program, written in C, for case-control association testing of a binary trait in samples that contain related individuals. The program tests association of the trait with any number of binary or multiallelic markers (e.g., from a genome-wide screen), with a separate test performed at each marker. It is applicable to association studies with completely general combinations of related and unrelated individuals, where the relationships among the sampled individuals are assumed to be known. For instance, the program allows cases to be related to controls, and it is equally applicable to complex inbred pedigrees and to simpler study designs consisting of unrelated individuals and small outbred families.

The main reference for this program is

Thornton T., McPeek M. S. "Case-Control Association Testing with Related Individuals: A More Powerful Quasi-Likelihood Score Test" (2007) American Journal of Human Genetics, vol. 81, pp. 321-337.

The MQLS program can be considered a significantly enhanced version of the CC-QLS program of

Bourgain C., Hoffjan S., Nicolae R., Newman D., Steiner L., Walker K., Reynolds R., Ober C., McPeek M. S. "Novel case-control test in a founder population identifies P-selectin as an atopy susceptibility locus" (2003) American Journal of Human Genetics, vol. 73, pp. 612-626.

For each marker, the MQLS program computes 3 different test statistics for association: the MQLS test statistic of Thornton and McPeek (2007), the WQLS test statistic of Bourgain et al. (2003), and the corrected chi-squared test statistic of Bourgain et al. (2003). As a default, we recommend using the MQLS test. The MQLS is a quasi-likelihood score test that was developed to improve the power of the WQLS test. (The "M" in MQLS stands for "more powerful" or "modified.")
The MQLS test improves power over the WQLS by taking advantage of the principle that there is enrichment for predisposing variants in affected individuals with affected relatives. For a more detailed comparison of the 3 statistics, see Thornton and McPeek (2007).

The current release of the MQLS software, version 1.5, calculates all three association statistics using the variance estimator of Equation (3) of Thornton and McPeek (2010), a robust variance estimator that relaxes the Hardy-Weinberg Equilibrium (HWE) assumption under the null hypothesis. For each test, a p-value is calculated based on the chi-square asymptotic null distribution.

To calculate the MQLS statistic, the user must specify an estimate of the population prevalence of the trait. We emphasize that the test is valid regardless of the value supplied. We recommend using an estimate from previous studies or from registry data for the population. We have demonstrated through simulation (see Thornton and McPeek (2007)) that the power of the MQLS statistic is quite robust to misspecification of the population prevalence.

Additional features of the MQLS test include:

(1) The MQLS test for a given marker incorporates information on phenotyped individuals who have missing genotype data at that marker. This information is used to optimize the weights given to relatives with non-missing genotype data at the marker being tested, following the principle that there is enrichment for predisposing variants in individuals with affected relatives. This enrichment principle implies, for example, that an affected individual with no phenotyped relatives should be weighted differently from an affected individual with an affected sibling, and that this should still hold when the affected sibling happens to have missing genotype data at the marker being tested.
At the same time, the genotypes of the two sibs are dependent, so when both sibs are genotyped they should be downweighted, whereas no such downweighting is needed when only one of them is genotyped. The MQLS test takes into account both the enrichment principle and the effects of dependence in setting the weights. In contrast, the WQLS and corrected chi-squared test statistics exclude individuals with missing genotype data at the given marker.

(2) Another useful feature of the MQLS test is that it allows individuals' phenotypes to be coded as "affected", "unaffected", or "unknown". An individual's phenotype is appropriately coded as "unknown" if no direct phenotype information was measured on the individual. One situation in which the "unknown" designation is appropriate is for general population controls, that is, a set of control individuals, sampled from some population, who have not been screened for the phenotype. Another is when the trait of interest is a late-onset disease (e.g., Alzheimer's). Some individuals may be unaffected only because they are too young at the time of screening, and they may develop the trait later in life. These individuals could appropriately be considered to have unknown phenotype.

Individuals with unknown phenotype and unknown genotype for a given marker are not included in any test for that marker. Genotyped individuals with unknown phenotype are included in the MQLS test (using Option 1 of the software), and their weight in the analysis is determined by a combination of the population prevalence of the trait (as input by the user) and the phenotypes of any relatives they have in the study. In contrast, the WQLS and corrected chi-squared statistics do not distinguish between unaffected individuals and individuals of unknown phenotype.
The MQLS software gives the user two options for handling individuals of unknown phenotype.

OPTION 1: This should be considered the default for the MQLS test. Under this option, the MQLS test is performed with 3 phenotype categories allowed: affected, unaffected, and unknown. Furthermore, phenotyped individuals with missing genotype data are allowed to contribute to the MQLS test (if they have genotyped relatives in the sample). The WQLS and corrected chi-squared statistics are computed with the cases taken to be the affected individuals and the controls taken to be the unknown and unaffected individuals combined. These two statistics do not make use of individuals with missing genotype data at the tested marker.

OPTION 2: This option is provided for backward compatibility with the CC-QLS software for calculating WQLS and corrected chi-squared. Under this option, individuals with unknown phenotype are excluded from all tests, and individuals with missing genotype data at a given marker are excluded from the test at that marker. If this option is run, results for WQLS and corrected chi-squared will be consistent with the output of the CC-QLS software (provided that there are no MZ twin pairs in the sample; see item (3) below). Under Option 2, the MQLS test is also performed with these individuals removed from the analysis, which could reduce its power.

(3) The original versions of the WQLS and corrected chi-squared tests and their implementations in the CC-QLS software (Bourgain et al. 2003) did not allow both members of an MZ twin pair to be included in the analysis. We have made changes that allow both members of one or more MZ twin pairs to be included in all 3 tests: MQLS, WQLS, and corrected chi-squared. These changes are described in Thornton and McPeek (2007) and are implemented in the MQLS program.
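
The inclusion rules described under Options 1 and 2 can be summarized in a short Python sketch. This is an illustration only, not code from the MQLS program (which is written in C). The Person record, the role function, and the role labels are hypothetical names introduced here, and the has_genotyped_relative flag is an assumed stand-in for a lookup that a real implementation would derive from the known pedigree relationships.

```python
from dataclasses import dataclass

AFFECTED, UNAFFECTED, UNKNOWN = "affected", "unaffected", "unknown"
TESTS = ("MQLS", "WQLS", "corrected chi-squared")


@dataclass
class Person:
    """Hypothetical per-marker record for one sampled individual."""
    ident: str
    phenotype: str                        # AFFECTED, UNAFFECTED, or UNKNOWN
    genotyped: bool                       # genotype non-missing at the tested marker
    has_genotyped_relative: bool = False  # assumed to come from the pedigree


def role(person: Person, test: str, option: int):
    """Return how `person` enters `test` at one marker, or None if excluded."""
    if test not in TESTS:
        raise ValueError(f"unknown test: {test}")
    if option not in (1, 2):
        raise ValueError("option must be 1 or 2")

    phenotyped = person.phenotype != UNKNOWN

    if option == 2:
        # Option 2: unknown phenotype or missing genotype excludes the
        # individual from every test at this marker, including MQLS.
        if not phenotyped or not person.genotyped:
            return None
        return "case" if person.phenotype == AFFECTED else "control"

    if test == "MQLS":
        # Option 1, MQLS: three phenotype categories are kept distinct.
        if person.genotyped:
            return person.phenotype
        # A phenotyped individual with missing genotype still contributes,
        # through the weights, if a relative is genotyped.
        if phenotyped and person.has_genotyped_relative:
            return "phenotype-only"
        return None

    # Option 1, WQLS and corrected chi-squared: genotype required;
    # unknown and unaffected individuals are pooled as controls.
    if not person.genotyped:
        return None
    return "case" if person.phenotype == AFFECTED else "control"
```

For example, under Option 1 a genotyped general-population control with unknown phenotype keeps the label "unknown" in the MQLS test but is treated as a "control" by WQLS, while under Option 2 the same individual is excluded from all three tests.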