Single nucleotide polymorphisms (SNPs), or biallelic markers, are popular in genetic linkage studies because of their abundance in the genome, stability, and ease of scoring. We determined the 'information ratio' (IR) of closely spaced SNPs in simulated nuclear families and affected sib pairs (ASPs). (The IR is the ratio of the actual average maximum lod score to the maximum lod score attainable if the marker were fully informative.) The nuclear families included parental information, whereas the ASPs did not. We analyzed these SNPs in two ways: (1) using multipoint analysis, and (2) treating the SNPs as 'composite markers' (i.e., haplotypes, as assigned by GENEHUNTER). We also calculated the IR of a single microsatellite marker with multiple alleles and compared it with the IR of the SNPs. For each set of input conditions, we simulated 1000 nuclear families of 2, 3, 4, or 5 children each, as well as 1000 ASPs. We generated SNP marker data for strings of k = 1, 2, 3, 5, 7, and 10 SNP loci, with no recombination (theta = 0) and no linkage disequilibrium among the SNPs. The minor allele frequency (MAF) was either 0.5 or 0.25, and allele frequencies were the same for all k loci in any analysis. We also generated marker data for a single-locus microsatellite marker with m = 3, 4, 5, 6, 7, or 9 equally frequent alleles. In all simulations, the disease was fully penetrant dominant, and there was no recombination or linkage disequilibrium among markers or between marker and disease. With multipoint analysis, 5-7 closely spaced SNPs were usually enough to yield an IR of approximately 100% for nuclear families of any size. For the ASPs, however, even 7-10 SNPs yielded an IR of only 70-80%. In nuclear families, a microsatellite with 9 equally frequent alleles yielded about the same IR (86-88%) as a string of 4-5 SNPs. SNPs analyzed as 'composite markers' performed worse, due to the inherent ambiguity of SNP haplotyping.
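The IR defined above can be illustrated with a minimal Python sketch. The function name `information_ratio`, its parameter names, and the lod values in the usage example are hypothetical and chosen only for illustration; they are not taken from the study or from GENEHUNTER.

```python
from statistics import mean


def information_ratio(max_lods, max_attainable_lod):
    """Return the IR: the average observed maximum lod score divided by
    the maximum lod score attainable with a fully informative marker.

    max_lods: observed maximum lod scores, one per simulated replicate
              (hypothetical input; how replicates are formed is an assumption).
    max_attainable_lod: maximum lod score for the same data structure if the
                        marker were fully informative.
    """
    if not max_lods:
        raise ValueError("max_lods must contain at least one value")
    if max_attainable_lod <= 0:
        raise ValueError("max_attainable_lod must be positive")
    return mean(max_lods) / max_attainable_lod


if __name__ == "__main__":
    # Made-up lod scores, for illustration only.
    observed = [2.0, 3.0]
    ir = information_ratio(observed, max_attainable_lod=5.0)
    print(f"IR = {ir:.0%}")
```

An IR near 100% would mean the marker set extracts nearly all of the linkage information available in that family structure, which is how the percentages reported above can be read.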
Copyright 2004 S. Karger AG, Basel.