Tips

1. READ "Input" DOCUMENT CAREFULLY. The program will stop if any errors are detected in the format of the datafile or the pedfile. So read the "Input" document closely and make sure the input files are formatted correctly.

2. The software will allow the user to input allele frequencies rather than a set of control haplotypes or genotypes. However, this is almost never recommended for analysis of real data. The reason is that with only allele frequencies available, the software is forced to assume that there is no background linkage disequilibrium (LD). In our experience, this assumption rarely holds in practice and can lead to misleading results when background LD is present. This feature is preserved for analyzing simulated or real data in which there is believed to be no background LD.

3. We now allow the user to specify the order of the Markov chain used to model background LD. For microsatellite data, a 1st order chain is often satisfactory. For biallelic markers, provided that genotypes are available for a sufficient number of control individuals (>100), a 2nd order chain may be more appropriate.

4. Set the "Bayesian adjustment" parameter to 1. This ensures that all estimated control haplotype frequencies are positive, a condition necessary for the software to run properly. (See Strahs 2001 for details.)

5. Set the parameter called "max_cand" (see "Search Procedures") to be at least 20. This integer is the number of candidate ancestral haplotypes allowed to continue to grow as each additional marker is added in the search procedure described in "Search Procedures". The larger this number, the more confident you can be that the maximum likelihood ancestral haplotype has been found. Be advised that running time is linear in this parameter, i.e., using max_cand=40 takes twice as long as max_cand=20.

6. Set "max_res" such that the ancestral haplotype is estimated 2-3 times in the largest interval. It is not generally necessary to estimate the ancestral haplotype more frequently, because the maximum likelihood ancestral haplotype is not likely to change more than once between each pair of markers. Note that computation time is linear in the number of times the ancestral haplotype is estimated.

7. Set "map_res" to be at least 10. For higher values of map_res, the point estimate and confidence intervals are more accurate and the plot of log-likelihood vs. location is smoother.

Also Note

8. Use of Physical Maps
DHSMAP assumes a genetic marker map. However, marker distances are often available in the form of a physical map. This suggests two possible approaches: (1) input physical distances instead of genetic distances, or (2) first convert physical distances to genetic distances, input the genetic distances, then convert the results back to physical distances. As long as there is a constant conversion between cM and Mb in the region and mutation rates are set to 0, the two approaches will yield identical results, i.e., it is not necessary to know the conversion factor between genetic and physical distance in that case. However, when the model includes mutation (or if genetic distance is not assumed to be a fixed multiple of physical distance), approach (1) is incorrect and only approach (2) should be used. In that case an appropriate conversion between genetic and physical distances is needed.

9. Specification of E_int
If you know the approximate location of the trait-associated variant, you may specify E_int, skipping the first stage of the search procedure. If DHSMAP estimates the location to be outside this interval, we recommend rerunning the program without specifying E_int, i.e., set E_int=0. (See "Search Procedures" for more on E_int.)

10. Ancestral Haplotype Known
DHSMAP generally spends a large percentage of its running time estimating the ancestral haplotypes. If you know the ancestral haplotype, if there are multiple ancestral haplotypes and you know all of them, or if you have a set of candidate ancestral haplotypes over which you wish to maximize (e.g., when you have already performed a similar analysis), you can save time by setting "anc_hap_known"=1 and specifying the set S of ancestral haplotypes over which DHSMAP will maximize the likelihood in the third stage of the search procedure (see "Search Procedures" for details). This option should be used with caution; incorrectly specifying the MLE ancestral haplotypes can bias the other parameter estimates and CIs.
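To make the ideas behind the Markov chain order and the positivity requirement on control haplotype frequencies more concrete, here is a generic Python sketch of a 1st order Markov chain fitted to control haplotypes. It is not DHSMAP's code. The function names, the toy data, and the use of a pseudocount to keep every frequency positive are all assumptions made for illustration; the adjustment DHSMAP actually applies is the one described in Strahs 2001 and may differ.

```python
from collections import Counter
from itertools import product


def fit_first_order_chain(haplotypes, alleles_per_marker, pseudocount=1.0):
    """Fit a 1st order Markov chain over markers (hypothetical helper).

    haplotypes: list of equal-length tuples of allele labels (controls).
    alleles_per_marker: list giving the possible alleles at each marker.
    pseudocount: assumed smoothing constant added to every count so that
                 all estimated probabilities are strictly positive.
    """
    # Distribution of alleles at the first marker.
    first = Counter(h[0] for h in haplotypes)
    total = len(haplotypes) + pseudocount * len(alleles_per_marker[0])
    init = {a: (first[a] + pseudocount) / total for a in alleles_per_marker[0]}

    # Transition tables: P(allele at marker k | allele at marker k-1).
    trans = []
    for k in range(1, len(alleles_per_marker)):
        pair = Counter((h[k - 1], h[k]) for h in haplotypes)
        prev = Counter(h[k - 1] for h in haplotypes)
        table = {}
        for a in alleles_per_marker[k - 1]:
            denom = prev[a] + pseudocount * len(alleles_per_marker[k])
            for b in alleles_per_marker[k]:
                table[(a, b)] = (pair[(a, b)] + pseudocount) / denom
        trans.append(table)
    return init, trans


def haplotype_frequency(hap, init, trans):
    """Frequency of a haplotype under the fitted chain."""
    p = init[hap[0]]
    for k in range(1, len(hap)):
        p *= trans[k - 1][(hap[k - 1], hap[k])]
    return p


# Toy control haplotypes at three biallelic markers (made-up values).
controls = [(1, 1, 2), (1, 1, 2), (1, 2, 1), (2, 2, 1)]
alleles = [[1, 2], [1, 2], [1, 2]]

init, trans = fit_first_order_chain(controls, alleles)

# A haplotype never observed among the controls still gets positive frequency.
freq_unseen = haplotype_frequency((2, 1, 1), init, trans)

# The frequencies of all possible haplotypes sum to 1.
total_freq = sum(haplotype_frequency(h, init, trans) for h in product(*alleles))
```

A 2nd order chain would condition each marker on the two preceding markers instead of one, which requires estimating more transition probabilities and therefore more control data.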
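Approach (2) from the note on physical maps can be sketched in a few lines of Python. This is an illustration only, assuming a constant conversion rate across the region. The helper names, the marker positions, the rate of 1.2 cM per Mb, and the reported estimate are all hypothetical; none of them come from DHSMAP.

```python
def mb_to_cm(positions_mb, cm_per_mb):
    """Convert physical positions (Mb) to genetic positions (cM),
    assuming a constant rate of cm_per_mb over the whole region."""
    if cm_per_mb <= 0:
        raise ValueError("cm_per_mb must be positive")
    return [p * cm_per_mb for p in positions_mb]


def cm_to_mb(positions_cm, cm_per_mb):
    """Convert genetic positions (cM) back to physical positions (Mb)."""
    if cm_per_mb <= 0:
        raise ValueError("cm_per_mb must be positive")
    return [p / cm_per_mb for p in positions_cm]


# Made-up physical marker map in Mb.
markers_mb = [0.0, 0.5, 1.25, 2.0]

# Assumed regional conversion rate; replace with an estimate for your region.
CM_PER_MB = 1.2

# Step 1: convert the map to cM and use these values as the input map.
markers_cm = mb_to_cm(markers_mb, CM_PER_MB)

# Step 2: convert reported results (made-up numbers here) back to Mb.
estimate_cm = 0.9
ci_cm = (0.6, 1.5)
estimate_mb = cm_to_mb([estimate_cm], CM_PER_MB)[0]
ci_mb = tuple(cm_to_mb(ci_cm, CM_PER_MB))
```

If genetic distance is not a fixed multiple of physical distance in the region, a single rate like this is not adequate and a position-dependent conversion would be needed instead.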