Output

DHSMAP output consists of five files, including "dhsmap_errors" and four files named by the user in the datafile. They are introduced here using the output from the sample input files given in "input.txt":

1. "dhsmap_errors" file

This file contains all error and warning messages (if there are any). It includes errors detected in the formats of the input files and errors and warnings triggered while running the software. The program will stop immediately after an error is detected but will continue after a warning. In the given example, there are no errors or warnings; the file dhsmap_errors is empty.

2. "resout_ex" file

This file contains the point estimates and 95% confidence intervals for the location of the trait-associated variant for the cases in which (1) a star-shaped genealogy is assumed and (2) a conditional-coalescent genealogy is assumed. The results are given in terms of a genetic map (cM) where the first marker (as listed in datafile) is assigned location 0.

The following is resout_ex:

    Point Estimate for location of trait-associated variant: 0.49524
    95% CI [Star-shaped genealogy]: (0.35238, 0.58571)
    95% CI [Cond. Coalescent]: (0.02381, 0.73810)

3. "ancout_ex" file

As described in "Search Procedures", DHSMAP uses a three-stage method to search over ancestral haplotype and variant location. Each stage produces or uses a list of ancestral haplotypes. This file contains these three haplotype lists.

The first and second lists give the ancestral haplotypes estimated in the first two stages of the procedure. Each row corresponds to an estimate of the ancestral haplotype for a putative location of the trait-associated variant; the ancestral haplotype may be estimated for several variant locations between each adjacent pair of markers.
The first entry x of each row, enclosed by parentheses to distinguish it from the haplotype that follows, reports the interval in which the trait-associated variant is assumed to reside; the variant lies between markers x and x+1, where the markers are in map order and the marker at map position 0 is labeled 1. Consecutive rows listing identical intervals correspond to estimates assuming different positions of the variant within the interval. The label 9 is assigned to the variant and is inserted in its assumed position. Each list is preceded by the value of the parameter E_int, the interval around which the ancestral haplotypes are grown.

The third list gives the haplotypes in the set S. The first entry in each row is an integer label for this haplotype. This entry is enclosed in parentheses to set it apart from the haplotype it denotes. (Note that these haplotypes do not list the variant.)

The first, or first and second, stages of the search procedure may be bypassed. In that event, the corresponding lists do not appear in this file.

The following are several lines from ancout_ex (original may be downloaded):

    [Stage 1] E_int= 0
    ( 1) 1 9 2 1 1 2 1
    ( 1) 1 9 2 1 1 2 1
    ( 2) 1 2 9 1 1 2 1
    ( 2) 1 2 9 1 1 2 1
    ( 3) 1 2 1 9 1 2 1
    ...
    [Stage 2] E_int= 3
    ( 1) 1 9 2 1 1 2 1
    ( 1) 1 9 2 1 1 2 1
    ( 2) 1 2 9 1 1 2 1
    ( 2) 1 2 9 1 1 2 1
    ( 3) 1 2 1 9 1 2 1
    ...
    [Stage 3] Set S of ancestral haplotypes
    ( 0) 1 2 1 1 2 1

4. "maxout_ex" file

This file contains the parameter estimates and diagnostic statistics corresponding to each estimate of the ancestral haplotype reported in the previous file (from first and second stages, as described in "Search Procedures").

The following is a line from maxout_ex (original may be downloaded):

    ind  C          1/Tau    p        s-m.lik     s-n.lik     lloc    rloc \
         cloc    iter
    ...
    1    0.0666667  1.29365  0.37254  -122.84735  -165.67497  23.307  12.152 \
         25.099  17
    ...

"ind" identifies the interval in which the trait-associated variant is assumed to lie; the variant lies between markers ind and ind+1, where the markers are in map order and the marker at map position 0 is labeled 1 (in this case, the variant is assumed to lie between the first two markers) "C" gives the assumed location of the trait-associated variant on a genetic map (cM) in which the first marker is assigned location 0. "1/Tau" gives the estimate of 1/tau given the estimated ancestral haplotype and the assumed location of the trait-associated variant. "p" gives the estimate of the heterogeneity parameter p given the estimated ancestral haplotype and the assumed location of the trait-associated variant. "s-m.lik" is the log-likelihood evaluated at the given parameter values, assuming independence of the recombinational histories, i.e., a star-shaped genealogy. Note that we recommend using the more conservative quasi-likelihood assuming a conditional coalescent model, as given in "oneout_ex" "s-n.lik" is the log-likelihood evaluated under the null model, i.e. the model with p=1 "lloc" is the expected value of the number of affected haplotypes still sharing from the ancestral haplotype, conditional on the model and the data, at location 0 "rloc" is the expected value of the number of affected haplotypes still sharing from the ancestral haplotype, conditional on the model and the data, at the marker farthest from from location 0 "cloc" is the expected value of the number of affected haplotypes sharing the variant by descent from the ancestral haplotype, conditional on the model and the data; equal to (1-p) * (number of haps) "iter" is the number of the iterations of the HMM/EM that were performed before the algorithm was determined to have converged. (The maximum number of iterations is arbitrarily set to 200 for haplotype data and 100 for genotype data but may easily be changed by the user. 
If the maximum value is reported, this may indicate that the algorithm has not converged.)

5. "oneout_ex" file

This file contains the results and diagnostics from the estimation of the location of the trait-associated variant on a fine grid (third stage, as described in "Search Procedures"), in which the likelihood is maximized over 1/tau, p, and the set S of ancestral haplotypes given in "ancout_ex".

The following is a line from oneout_ex (original may be downloaded):

    ind  C          1/Tau    p        s-m.lik     s-n.lik \
         c-m.lik    c-n.lik    lloc    rloc    cloc    iter  anchap
    1    0.0047619  1.44979  0.39786  -123.80748  -165.67497 \
         -35.41986  -47.39766  23.939  11.731  24.085  14    0
    ...

"c-m.lik" is the log-quasi-likelihood evaluated at the given parameter values assuming dependence between the haplotypes' recombinational histories, i.e., a conditional coalescent genealogy.

"c-n.lik" is the log-quasi-likelihood evaluated under the null model, i.e., the model with p=1.

"anchap" identifies the ancestral haplotype assumed by the model; this ID matches the first column in the list of candidates at the end of "ancout_ex".

All other entries are as defined previously. The likelihood surface can be viewed by plotting "c-m.lik" vs "C".
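
As an illustration of how the oneout_ex columns might be read for plotting, the following is a minimal Python sketch and not part of DHSMAP. It assumes that the file is whitespace-delimited, that the columns appear in the order shown in the header above, and that the backslashes and "..." are layout only; the function names parse_oneout and best_grid_point are hypothetical. The only data used is the single row quoted above.

```python
# Column names in the order documented for oneout_ex (assumed fixed).
COLUMNS = ["ind", "C", "1/Tau", "p", "s-m.lik", "s-n.lik",
           "c-m.lik", "c-n.lik", "lloc", "rloc", "cloc", "iter", "anchap"]

# The header and the one row quoted in this document, copied as shown.
SAMPLE = r"""
ind C 1/Tau p s-m.lik s-n.lik \
    c-m.lik c-n.lik lloc rloc cloc iter anchap
1 0.0047619 1.44979 0.39786 -123.80748 -165.67497 \
    -35.41986 -47.39766 23.939 11.731 24.085 14 0
...
"""


def parse_oneout(text):
    """Return one dict per grid point, keyed by column name (hypothetical helper)."""
    # Assumption: header names, continuation backslashes and "..." carry no data.
    skip = set(COLUMNS) | {"\\", "..."}
    values = [tok for tok in text.split() if tok not in skip]
    if len(values) % len(COLUMNS) != 0:
        raise ValueError("number of fields is not a multiple of the column count")
    rows = []
    for start in range(0, len(values), len(COLUMNS)):
        chunk = values[start:start + len(COLUMNS)]
        rows.append({name: float(v) for name, v in zip(COLUMNS, chunk)})
    return rows


def best_grid_point(rows):
    """Return the row with the largest "c-m.lik" (hypothetical helper)."""
    return max(rows, key=lambda row: row["c-m.lik"])


rows = parse_oneout(SAMPLE)
best = best_grid_point(rows)

# Points of the likelihood surface: "c-m.lik" against "C".
surface = [(row["C"], row["c-m.lik"]) for row in rows]

# Difference between the model and null log-quasi-likelihoods at the best point.
log_ratio = best["c-m.lik"] - best["c-n.lik"]

# Consistency check on the definition cloc = (1-p) * (number of haplotypes).
implied_haps = best["cloc"] / (1.0 - best["p"])

print(surface)
print(log_ratio, implied_haps)
```

Applied to the full oneout_ex file read as text, the list named surface would hold one (C, c-m.lik) pair per grid point, which can be passed to any plotting tool to view the likelihood surface.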