Estimating the effect of SNP genotype on quantitative traits from pooled DNA samples
1 CSIRO Livestock Industries, FD McMaster Laboratory Chiswick, Armidale 2350, NSW, Australia
2 Cooperative Research Centre for Beef Genetic Technologies, University of New England, Armidale NSW 2351, Australia
3 CSIRO Livestock Industries, Queensland Bioscience Precinct, Brisbane QLD 4067, Australia
4 Present Address: Cobb-Vantress, Siloam Springs, Arkansas, USA
Genetics Selection Evolution 2012, 44:12 doi:10.1186/1297-9686-44-12
Published: 17 April 2012
Studies to detect associations between DNA markers and traits of interest in humans and livestock benefit from increasing the number of individuals genotyped. Performing association studies on pooled DNA samples can provide greater power for a given cost. For quantitative traits, the effect of an SNP is measured in the units of the trait. Here we propose and demonstrate a method to estimate SNP effects on quantitative traits from pooled DNA data.
To obtain estimates of SNP effects from pooled DNA samples, we used logistic regression of estimated allele frequencies in pools on phenotype. The method was tested on a simulated dataset and on a beef cattle dataset using a model that included principal components from a genomic correlation matrix derived from the allele frequencies estimated from the pooled samples. The estimates were evaluated by comparing them with estimates obtained by regressing phenotype on genotype from individual DNA samples.
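The regression of pool allele frequency on phenotype can be illustrated with a minimal sketch. It assumes one SNP, one mean phenotype per pool, and a binomial-type model fitted by iteratively reweighted least squares; the function name `fit_logistic_pool`, the use of the number of alleles per pool as a weight, and the toy data are illustrative assumptions rather than details taken from the study. The slope is returned on the log-odds scale per unit of phenotype; converting it to trait units is not shown here.

```python
import numpy as np

def fit_logistic_pool(freq, phenotype, n_alleles, n_iter=50, tol=1e-10):
    """Fit logit(freq) = b0 + b1 * phenotype by iteratively reweighted
    least squares. `n_alleles` weights each pool (an assumed choice)."""
    freq = np.asarray(freq, dtype=float)
    X = np.column_stack([np.ones_like(freq), np.asarray(phenotype, dtype=float)])
    pool_weight = np.asarray(n_alleles, dtype=float)
    beta = np.zeros(2)
    for _ in range(n_iter):
        eta = X @ beta                                # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))               # fitted allele frequency
        var = np.clip(mu * (1.0 - mu), 1e-12, None)   # binomial variance function
        z = eta + (freq - mu) / var                   # working response
        w = pool_weight * var                         # working weights
        xtw = X.T * w
        beta_new = np.linalg.solve(xtw @ X, xtw @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta  # [intercept, slope in log-odds per phenotype unit]

# Toy data (hypothetical): 12 pools of 50 animals, i.e. 100 alleles per pool.
rng = np.random.default_rng(1)
pool_phenotype = np.linspace(-1.5, 1.5, 12)
true_freq = 1.0 / (1.0 + np.exp(-(0.2 + 0.4 * pool_phenotype)))
observed_freq = np.clip(true_freq + rng.normal(0.0, 0.02, size=12), 0.0, 1.0)
intercept, slope = fit_logistic_pool(observed_freq, pool_phenotype, np.full(12, 100))
print(intercept, slope)
```

The noise added to `observed_freq` stands in for error in estimating allele frequencies from pooled assays; increasing its standard deviation shows how that error feeds through to the estimated slope.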
For the simulated data, the estimates of SNP effects from pooled DNA are similar to, but asymptotically different from, those from individual DNA data. Error in estimating allele frequencies had a large effect on the accuracy of estimated SNP effects. For the beef cattle dataset, the principal components of the genomic correlation matrix from pooled DNA were consistent with known breed groups, and could be used to account for population stratification. Correctly modeling the contemporary group structure was essential to achieve estimates similar to those from individual DNA data, and pooling DNA from individuals within groups was superior to pooling DNA across groups. For a fixed number of assays, results from pooled DNA samples were more strongly correlated with results from individual genotyping data than were results from one random individual assayed from each pool.
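As an illustration of how principal components might be derived from pooled allele frequencies, the sketch below builds a pool-by-pool correlation matrix from standardized frequencies and extracts its leading eigenvectors. The exact construction of the genomic correlation matrix is not specified in this abstract, so the standardization used here, the function name `pool_principal_components`, and the toy data are assumptions.

```python
import numpy as np

def pool_principal_components(freqs, n_components=2):
    """freqs: (n_pools, n_snps) array of estimated allele frequencies.
    Returns the leading eigenvalues and eigenvectors of a pool-by-pool
    correlation matrix (one assumed construction among several)."""
    F = np.asarray(freqs, dtype=float)
    centered = F - F.mean(axis=0)            # center each SNP across pools
    sd = centered.std(axis=0)
    keep = sd > 0                            # drop SNPs with no variation
    Z = centered[:, keep] / sd[keep]         # standardize each SNP
    G = Z @ Z.T / Z.shape[1]                 # pool-by-pool relationship matrix
    d = np.sqrt(np.diag(G))
    C = G / np.outer(d, d)                   # rescale to unit diagonal
    evals, evecs = np.linalg.eigh(C)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]
    return evals[order], evecs[:, order]

# Toy data (hypothetical): 3 pools from each of two groups, 200 SNPs.
rng = np.random.default_rng(0)
group_a = rng.uniform(0.1, 0.9, size=200)
group_b = rng.uniform(0.1, 0.9, size=200)
pools = np.vstack([group_a + rng.normal(0, 0.02, 200) for _ in range(3)] +
                  [group_b + rng.normal(0, 0.02, 200) for _ in range(3)])
pools = np.clip(pools, 0.0, 1.0)
evals, pcs = pool_principal_components(pools)
print(pcs[:, 0])  # first component; pools from the same group share a sign
```

Columns of `pcs` could then be added as covariates in the regression model to absorb differences between groups of pools.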
Use of logistic regression of allele frequency on phenotype makes it possible to estimate SNP effects on quantitative traits from pooled DNA samples. With pooled DNA samples, genotyping costs are reduced, and in cases where trait records are abundant this approach is a promising way to obtain SNP associations for marker-assisted selection.