Overview

CC-QLS is a program, written in C, that tests for association in case-control samples that include related individuals with known genealogy. The tests it implements were developed for large inbred pedigrees but are also suitable for studies in outbred populations in which some individuals are relatives.

Two test statistics are computed: a Quasi-Likelihood Score test for Case-Control association (CC-QLS test) and a Case-Control corrected Chi2 test. Both tests require the exact relationship between any two related individuals in the sample, whatever their status (case or control). These relationships must be expressed as kinship coefficients, and inbreeding must be taken into account through individual inbreeding coefficients. CC-QLS does not compute kinship or inbreeding coefficients; they must be provided for the test statistics to be computed.

The Quasi-Likelihood Score test for Case-Control association
----------------------------------------------------------

Because the method has to remain applicable when the pedigree structure is too complex for an exact likelihood computation, inference is based on the quasi-likelihood score function proposed by McPeek, Wu and Ober (2003) for the estimation of allele frequencies in large inbred pedigrees. Their approach is extended to construct a Quasi-Likelihood Score test for allelic association.

-------------- allele frequency estimation

Using a sample of size N, the frequency p_j of allele j is estimated by solving U(p_j)=0, where

    U(p_j)=Dp^T.\Sigma^{-1}.(Y_j-2*P_j)

with Dp an N-vector of 2's, Y_j an N-vector whose element Y_j(i) is the number of alleles of type j carried by individual i, P_j an N-vector with every element equal to p_j, and \Sigma the N*N covariance matrix of Y_j.
\Sigma is a simple function of the N*N correlation matrix K, whose diagonal element i equals 1+h_i (where h_i is the inbreeding coefficient of individual i) and whose off-diagonal element (i,j) equals 2*phi_ij (where phi_ij is the kinship coefficient between individuals i and j). The estimate of p_j is then

    \hat{p}_j=[Dp^T.\Sigma^{-1}.Dp]^{-1}.[Dp^T.\Sigma^{-1}.Y_j]

When the N individuals belong to F independent families, \hat{p}_j is obtained by summing the numerators [Dp^T.\Sigma^{-1}.Y_j] over the F families and dividing by the sum of the F denominators [Dp^T.\Sigma^{-1}.Dp].

-------------- score test computation

Bi-allelic case

To test for association, consider the model in which mu_i=E(Y_i) is

    mu_i=2*(p+r)   if individual i is a case
    mu_i=2*p       if individual i is a control

where p is the frequency of, say, allele 1. Absence of association is tested through the null hypothesis r=0. This is a composite null hypothesis, in which r=0 and p is a nuisance parameter. The test therefore has two steps:

  1. estimation of \hat{p}_0, the value of p at r=0
  2. computation of the score statistic W=Ur^T.[var(Ur)^{-1}].Ur at r=0, plugging in \hat{p}_0 for p

At r=0,

    Ur=Dr^T.\Sigma^{-1}.(Y-2*P)

where Dr is an N-vector whose element i equals 2 if individual i is a case and 0 if individual i is a control, P is an N-vector with every element equal to p, and Y is an N-vector whose element Y(i) is the number of alleles of type 1 carried by individual i. Under the null hypothesis, W is expected to follow a chi2 distribution with 1 df.

  Ur is positive when r>0: the frequency of the allele whose frequency is p is increased in the cases.
  Ur is negative when r<0: the frequency of the allele whose frequency is p is decreased in the cases.

When the N individuals belong to F independent families, Ur is the sum of the Ur's over the F families, and each component of the variance var(Ur) is likewise the sum of the corresponding components in each family.
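
The two steps of the bi-allelic test can be sketched in a few lines of Python for a single family. This is only an illustration under stated assumptions, not the code of the CC-QLS program: the function names (solve, dot, correlation_matrix, cc_qls) are hypothetical, \Sigma is assumed to be 2*p*(1-p)*K (the usual covariance of allele counts among relatives), and var(Ur) is assumed to be the standard score-test information for r after adjusting for the nuisance parameter p, since the text above does not spell it out.

```python
def solve(a, b):
    """Solve a.x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))


def correlation_matrix(inbreeding, kinship):
    """K: diagonal element 1 + h_i, off-diagonal element 2 * phi_ij."""
    n = len(inbreeding)
    return [[1.0 + inbreeding[i] if i == j else 2.0 * kinship[i][j]
             for j in range(n)] for i in range(n)]


def cc_qls(y, is_case, k):
    """Return (p0_hat, Ur, W) for allele counts y and correlation matrix k.

    Assumes Sigma = 2*p*(1-p)*K and a polymorphic sample (0 < p0_hat < 1).
    """
    n = len(y)
    dp = [2.0] * n
    dr = [2.0 if c else 0.0 for c in is_case]
    kinv_dp = solve(k, dp)  # K^{-1}.Dp
    kinv_dr = solve(k, dr)  # K^{-1}.Dr
    # Step 1: estimate p at r=0; the scalar 2*p*(1-p) cancels out here.
    p0 = dot(kinv_dp, y) / dot(kinv_dp, dp)
    sigma2 = 2.0 * p0 * (1.0 - p0)
    # Step 2: score Ur = Dr^T.Sigma^{-1}.(Y - 2*P) evaluated at p0.
    resid = [yi - 2.0 * p0 for yi in y]
    ur = dot(kinv_dr, resid) / sigma2
    # Assumed variance: information for r adjusted for the nuisance p.
    info = (dot(dr, kinv_dr) - dot(dr, kinv_dp) ** 2 / dot(dp, kinv_dp)) / sigma2
    return p0, ur, ur * ur / info


# Four unrelated, non-inbred individuals: K is the identity matrix.
k = correlation_matrix([0.0] * 4, [[0.0] * 4 for _ in range(4)])
p0, ur, w = cc_qls([2, 1, 1, 0], [True, True, False, False], k)
```

In this toy sample the estimate is p0 = 0.5, the score Ur = 4 is positive (allele 1 is more frequent in the cases), and W = 2. With several independent families, the same quantities would be accumulated family by family as described above.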
Multi-allelic case

In the multi-allelic case, r is a vector of length (a-1): (r_1,...,r_(a-1)), where a is the number of alleles. The computation of the W statistic then involves (a-1) "allele-specific scores" Uj, one for each allele j, where

    Uj=Dr^T.\Sigma^{-1}.(Y_j-2*P_j)

and Y_j and P_j are as described in the allele frequency estimation section. Under the null hypothesis, W follows a chi2 distribution with (a-1) df.

  Uj is positive when r_j>0: the frequency of allele j is increased in the cases.
  Uj is negative when r_j<0: the frequency of allele j is decreased in the cases.

As in the bi-allelic case, when the N individuals belong to F independent families, Uj is the sum of the Uj's over the F families, and the different components of the variance are also sums of the corresponding components in each family.

The Case-Control corrected Chi2 test
----------------------------------------

The other test statistic proposed is a classical chi2 test for Case-Control association, corrected for the presence of related individuals in the sample.

Bi-allelic case

When all the individuals in a Case-Control sample are independent, the classical chi2 test can be shown to be equal to

    Wchi2=S^T.[var(S)^{-1}].S

where S=V^T.Y, Y is the N-vector counting the number of alleles of type 1 carried by each individual, and

    V=Dr-Dp.(Dp^T.Dp)^{-1}.(Dp^T.Dr)

with Dp and Dr being the N-vectors described in the previous sections. Since every element of Dp equals 2, this is simply V=Dr-(Nc/N)*Dp, where Nc is the number of cases and N the total number of individuals.

One way to extend the classical chi2 test so that it remains valid when some individuals are related is to use the same S but recompute var(S) to take the correlations among the individuals into account. The corrected test is then

    W_{corrected chi2}=rho_corrected * Wchi2

where

    rho_corrected=4*(Nc-Nc^2/N)*[Dr^T.K.Dr-2*(Nc/N)*Dp^T.K.Dr+(Nc/N)^2*Dp^T.K.Dp]^{-1}

and K is the correlation matrix previously described. The factor 4 comes from the elements of Dr and Dp being 2's rather than 1's; with it, rho_corrected equals 1 when K is the identity matrix, that is, when all individuals are unrelated and non-inbred.
Allele frequencies obtained by naive counting are required to compute the Wchi2 part of W_{corrected chi2}. Under the null hypothesis of no association, W_{corrected chi2} follows a chi2 distribution with 1 df.

When the N individuals belong to F independent families, the different components of the denominator of the rho_corrected factor are computed as the sums of the corresponding components in each family.

Multi-allelic case

The same correction (rho_corrected) applies in the multi-allelic case. Under the null hypothesis r=(0,...,0), W_{corrected chi2} follows a chi2 distribution with (a-1) df.
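
The correction factor rho_corrected can be illustrated with a short Python sketch for a single family. It is not taken from the CC-QLS program, and the function names (quad, rho_corrected) are hypothetical. One assumption is worth stating: with Dr and Dp holding 2's and 0's as defined earlier, the bracketed quadratic forms carry a factor of 4, so the sketch scales the numerator (Nc-Nc^2/N) by 4 as well. This makes the factor equal to 1 when K is the identity matrix, as it must be for unrelated, non-inbred individuals.

```python
def quad(u, k, v):
    """Quadratic form u^T.K.v."""
    n = len(u)
    return sum(u[i] * k[i][j] * v[j] for i in range(n) for j in range(n))


def rho_corrected(is_case, k):
    """Factor that multiplies the classical chi2 statistic.

    is_case: list of booleans (True for a case, False for a control).
    k: correlation matrix, diagonal 1 + h_i, off-diagonal 2 * phi_ij.
    """
    n = len(is_case)
    nc = sum(1 for c in is_case if c)
    dp = [2.0] * n
    dr = [2.0 if c else 0.0 for c in is_case]
    frac = nc / n
    # Denominator: V^T.K.V with V = Dr - (Nc/N)*Dp, written out term by term.
    denom = (quad(dr, k, dr)
             - 2.0 * frac * quad(dp, k, dr)
             + frac ** 2 * quad(dp, k, dp))
    # Numerator scaled by 4 to match the 2's in Dr and Dp (assumption above).
    return 4.0 * (nc - nc * nc / n) / denom


# Two cases who are full sibs (kinship 0.25) and two unrelated controls.
k_sibs = [[1.0, 0.5, 0.0, 0.0],
          [0.5, 1.0, 0.0, 0.0],
          [0.0, 0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]
rho = rho_corrected([True, True, False, False], k_sibs)
```

Here rho is 0.8: because the two cases are related, the variance of S is larger than under independence and the classical statistic is scaled down. The corrected statistic would then be obtained as rho * Wchi2, with Wchi2 computed from allele frequencies obtained by naive counting.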